01/02/2018
Technical blog
DIMCT
Adrien Chevalier
During our evaluation tests, we often need to quickly analyze large Windows products and pinpoint how their different building blocks, especially their modules, work together. In most cases, a module imports another module’s functions, which is easily retrieved statically.
However, in a few other cases, a module may export class constructors, which return objects containing references to their virtual methods. In a few other cases, callbacks may be registered and called by other modules. In these cases, it is not trivial to pinpoint which method will be called by another module (and especially by which function).
We developed a small tool, « DIMCT » (for Dirty Inter-Module Calls Tracer), which traces inter-module calls without too much overhead.
The tool may be found at https://github.com/AMOSSYS/DIMCT.
Usage
The usage is relatively straightforward:
- Run the provided IDAPython script in order to generate a configuration file;
- Start the monitored process;
- Run the provided executable with the process PID, the configuration file, and the delay before killing the process;
- Load the output with the IDAPython script in order to pinpoint which functions have been called;
- Manually parse the output file if you want more information, e.g. ‘who called whom’.
Internals
The inner concepts are also quite simple: an inline hook is placed at the start of every identified function. The hook points to a logging function, which only logs inter-module calls. Logs are written to a dedicated memory area, which is periodically read and dumped by the remote (monitoring) process.
It follows this scheme:
Figure 1: DIMCT flow
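The filtering step of the logging function can be sketched as follows. This is only a minimal Python model, not the tool's actual code (which is injected assembly): the names make_logger and on_call, the module base 0x10000000 and the module size 0x200000 are assumptions made for illustration. The idea is that a call counts as inter-module when its return address lies outside the hooked module.

```python
def make_logger(module_base, module_size, log):
    """Return a hook callback that records only inter-module calls.

    module_base and module_size describe the hooked module (hypothetical
    values in the usage below); log is a list standing in for the
    dedicated memory area.
    """
    module_end = module_base + module_size

    def on_call(function_address, return_address):
        # The caller lives inside the hooked module: intra-module call, skip.
        if module_base <= return_address < module_end:
            return False
        # The caller lives elsewhere: record who called which function.
        log.append((return_address, function_address))
        return True

    return on_call


# Hypothetical layout: module mapped at 0x10000000, 0x200000 bytes long.
log = []
on_call = make_logger(0x10000000, 0x200000, log)
on_call(0x100F77B0, 0x77C597C7)   # caller outside the module: logged
on_call(0x100F77B0, 0x100FB300)   # caller inside the module: ignored
```

Keeping the test down to two comparisons is what makes it cheap enough to run on every hooked function entry.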
The reasons why we call this tool « dirty » are the following ones:
- we do NOT use a shared memory section: the monitoring process keeps reading the remote memory area and wipes it when full (two WriteProcessMemory calls are made, one to wipe the area, the other to « release » the mutex). We simply gave the monitoring process a higher priority than the target process in order to minimize the impact;
- we do NOT use any Windows API in the logging function, so mutexes are implemented with a lock cmpxchg instruction (i.e. no OS benefits such as thread priority boosts).
Yeah, that’s really dirty, but it actually worked without too many bugs, too much overhead or too many drops, so… we kept it as is. We have not needed x64 support so far, so currently only x86 processes are handled (the concept remains the same, we will implement it soon, I guess).
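To make the locking scheme more concrete, here is a toy Python model of a lock built on compare-and-exchange, with several writers filling a log area and a monitor draining and wiping it. Everything here is an assumption made for illustration: the SpinLock class and run_demo function are hypothetical names, a threading.Lock stands in for the atomicity that the lock prefix provides on x86, and Python lists replace the remote memory area and the WriteProcessMemory calls.

```python
import threading
import time


class SpinLock:
    """Toy model of a mutex built on compare-and-exchange."""

    def __init__(self):
        self._value = 0                  # 0 = free, 1 = held
        self._atomic = threading.Lock()  # stands in for the LOCK prefix

    def _cmpxchg(self, expected, new):
        # Atomically store `new` if the value equals `expected`;
        # always return the value that was observed.
        with self._atomic:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def acquire(self):
        while self._cmpxchg(0, 1) != 0:
            time.sleep(0)                # yield; real code would just spin

    def release(self):
        self._value = 0


def run_demo(writers=4, entries_per_writer=200):
    lock = SpinLock()
    area = []      # models the dedicated log area in the target process
    dumped = []    # models what the monitoring process saved
    done = threading.Event()

    def writer(tid):
        for i in range(entries_per_writer):
            lock.acquire()
            area.append((tid, i))
            lock.release()

    def monitor():
        while True:
            finished = done.is_set()     # sampled before the drain
            lock.acquire()
            dumped.extend(area)          # read the area...
            del area[:]                  # ...then wipe it
            lock.release()
            if finished:
                return
            time.sleep(0.001)

    threads = [threading.Thread(target=writer, args=(t,)) for t in range(writers)]
    mon = threading.Thread(target=monitor)
    mon.start()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    done.set()
    mon.join()
    return dumped
```

Spinning costs CPU time whenever the lock is contended, which is presumably why giving the monitoring process a higher priority helps: it holds the lock for a shorter time.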
The main problem we faced was handling relative instructions while moving our saved instructions. Moving a SHORT JMP or a CALL, whose operands are relative to the current instruction position, is not that straightforward, and that’s the main reason why we used an IDAPython script.
In order to address this problem and use absolute addresses, we replaced CALLs and JMPs with PUSH/RET instructions, and conditional jumps with their inverted counterparts followed by PUSH/RET instructions. For instance, a JNZ SHORT <addr> will be replaced by a JZ SHORT $+6 / PUSH <addr> / RET sequence. Absolute addresses that belong to the module itself are stored relative to the module base address, and then « relocated » when the hook is installed. Absolute addresses are also logged in order to be relocated by the program.
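The rewriting of short jumps can be illustrated with a few lines of Python 3. This is a sketch under assumptions, not the tool's IDAPython script: the helper names push_ret and rewrite_short_jump are hypothetical, only rel8 jumps are handled, and the displacement of 6 is read as counted from the end of the inverted jump, so that it skips exactly the 5-byte PUSH imm32 and the 1-byte RET. It relies on the x86 encoding in which short conditional jumps use opcodes 0x70 to 0x7F and flipping the lowest opcode bit inverts the condition.

```python
import struct

PUSH_IMM32 = 0x68   # PUSH imm32, 5 bytes
RET = 0xC3          # RET, 1 byte
JMP_SHORT = 0xEB    # JMP rel8


def push_ret(target):
    """Encode PUSH <target> / RET: an absolute jump in 6 bytes."""
    return struct.pack("<BIB", PUSH_IMM32, target, RET)


def rewrite_short_jump(code, address):
    """Rewrite a 2-byte short jump located at `address` in absolute form.

    `code` holds the two original bytes (opcode, signed rel8).
    """
    opcode = code[0]
    rel8 = struct.unpack("b", code[1:2])[0]
    # A rel8 displacement is counted from the end of the 2-byte instruction.
    target = (address + 2 + rel8) & 0xFFFFFFFF

    if opcode == JMP_SHORT:
        return push_ret(target)
    if 0x70 <= opcode <= 0x7F:
        inverted = opcode ^ 1            # e.g. JNZ (0x75) becomes JZ (0x74)
        # Inverted jump hops over the 6 bytes of PUSH/RET when the
        # original condition is false.
        return bytes([inverted, 6]) + push_ret(target)
    raise ValueError("not a short jump: opcode 0x%02x" % opcode)


# JNZ SHORT +0x10 at 0x401000 targets 0x401012.
print(rewrite_short_jump(bytes.fromhex("7510"), 0x401000).hex())
# → 74066812104000c3
```

A CALL would additionally need its return address pushed before the PUSH/RET pair. That case is left out here because the post does not detail how it is handled.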
As an example, here are the original function, the configuration file and the final result:
Figure 2: DIMCT trampolines
Example
As an example, let’s test it on KernelBase.dll and the 32-bit version of notepad.exe. First, load KernelBase.dll (the SysWOW64 version) in IDA Pro, load the script and run create_config("config.bin", True).
Python>create_config("C:\Users\user\Desktop\config.bin", True)
4374 subs will be monitored
On Windows 10 1709, this covers 4374 out of 4458 subs.
Now let’s start a notepad.exe instance and then the DIMCT tool, with notepad’s PID and a 120-second delay. The interface remains quite responsive but may be slowed down, especially when opening the File/Open dialog. In the end, we get a log.bin file of approximately 6 MB.
Figure 3: DIMCT running
In order to show the results in IDA, we use the parse_output function. Here are the called functions:
Python>parse_output("C:\Users\User\Desktop\log.bin")
Modules list:
notepad.exe : 00380000 - 003be000
ntdll.dll : 77c20000 - 77dae000
[...]
Unique callers:
COMDLG32.dll
urlmon.dll
gdi32full.dll
TextInputFramework.dll
msvcrt.dll
CoreUIComponents.dll
dwmapi.dll
ntdll.dll
sechost.dll
PROPSYS.dll
cfgmgr32.dll
KERNEL32.DLL
IMM32.DLL
SHLWAPI.dll
USER32.dll
MPR.dll
combase.dll
notepad.exe
uxtheme.dll
OLEAUT32.dll
profapi.dll
SHELL32.dll
RPCRT4.dll
shcore.dll
clbcatq.dll
COMCTL32.dll
windows.storage.dll
MSCTF.dll
ucrtbase.dll
CoreMessaging.dll
twinapi.appcore.dll
ADVAPI32.dll
oleacc.dll
Unique called subs:
0x100f2800 GetProcessHeap
0x100fb300 DeactivateActCtx
0x100f77b0 sub_100F77B0
[...]
Sorting the called functions:
AccessCheck
ActivateActCtx
AddAccessAllowedAce
AddRefActCtx
[...]
Wow64DisableWow64FsRedirection
Wow64RevertWow64FsRedirection
lstrcmpW
lstrcmpiW
lstrlenW
sub_100D991D
sub_100EE882
sub_100F77B0
sub_100FD090
sub_10103281
sub_1010331F
Interestingly, 6 non-exported subs have been called. For instance, sub_100F77B0 and sub_100FD090 are only referenced by CreateThreadpoolIo. Let’s see who called them:
Python>whocalled("sub_100D991D", "C:\Users\user\Desktop\log.bin")
Unique callers:
0x32c1f4d
Python>whocalled("sub_100EE882", "C:\Users\user\Desktop\log.bin")
Unique callers:
0x32c2e2a
Python>whocalled("sub_100F77B0", "C:\Users\user\Desktop\log.bin")
Unique callers:
ntdll.dll : 0x77c597c7
Python>whocalled("sub_100FD090", "C:\Users\user\Desktop\log.bin")
Unique callers:
ntdll.dll : 0x77c5d087
Python>whocalled("sub_10103281", "C:\Users\user\Desktop\log.bin")
Unique callers:
0x32c1c11
Python>whocalled("sub_1010331F", "C:\Users\user\Desktop\log.bin")
Unique callers:
0x32c42e0
Python>
Ntdll called the 2 thread pool callbacks. The other ones seem to have been called by JIT-compiled code, which is in fact… our own « trampolined » code (which moved several CALL instructions), and which we really should add to the whitelist.
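Such a whitelist could be applied while resolving callers, as in the sketch below. It is not part of the published script: build_index, resolve and filter_callers are hypothetical names, and the trampoline range 0x032c0000 - 0x032d0000 is an assumed value chosen only because the unresolved callers above fall inside it. The two module ranges are the ones printed by parse_output.

```python
from bisect import bisect_right


def build_index(modules):
    """Sort module ranges by start address for binary search."""
    ranges = sorted((start, end, name) for name, (start, end) in modules.items())
    starts = [r[0] for r in ranges]
    return starts, ranges


def resolve(address, index):
    """Return the name of the module containing `address`, or None."""
    starts, ranges = index
    pos = bisect_right(starts, address) - 1
    if pos >= 0:
        start, end, name = ranges[pos]
        if address < end:
            return name
    return None


def filter_callers(callers, index, whitelist):
    """Drop callers located in whitelisted ranges, resolve the others."""
    kept = []
    for address in callers:
        if any(low <= address < high for low, high in whitelist):
            continue                     # our own trampolined code
        kept.append((address, resolve(address, index)))
    return kept


# Module ranges taken from the parse_output listing above.
modules = {
    "notepad.exe": (0x00380000, 0x003BE000),
    "ntdll.dll": (0x77C20000, 0x77DAE000),
}
# Hypothetical range for the memory holding the trampolines.
whitelist = [(0x032C0000, 0x032D0000)]

index = build_index(modules)
callers = [0x32C1F4D, 0x77C597C7, 0x32C42E0]
for address, name in filter_callers(callers, index, whitelist):
    print("%s : 0x%x" % (name, address))
# → ntdll.dll : 0x77c597c7
```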
Conclusion
We hope this basic tool and its source code will be useful to others. We want it to remain simple, so the biggest improvements will probably be removing the « dirty » parts (i.e. using shared memory and Windows mutexes, and tuning the assembly code) and adding x64 support. We may also test it against intra-modular calls in the future, but we’re not really confident about the performance. We’ll see. Feel free to contribute!