Intercepting .NET: the theory and practice of intercepting .NET function calls
…I found a bug. The first call to the database timed out, but the next one went through fine. It turned out that the Indian developers had written a single method 75,000 lines long, and the DB connection dropped while that method was being JIT-compiled… In my opinion, a 75K-line method in C# is more likely to summon the devil than to work.
I remember how, a few years ago, my friends and I joked that mobile phones would soon catch up with computers in processor power. And here we are: where I work, 20% of the computers are ALREADY less powerful than a modern Galaxy S.
Software development is in a similar situation. A couple of years ago, progressive humanity was sure that intercepting .NET function calls was a job for perverts. And I would have agreed with them.
A sea of books and articles has been written about intercepting API functions, and a pile of video tutorials has been filmed. The topic used to seem reserved for gurus, but today writing code that intercepts system calls is no problem even for a beginner; there is really nothing to it. That holds, of course, only for the standard WinAPI. The idea of intercepting .NET function calls at first made programmers smile. They smiled the same way at attempts to write viruses for the .NET Framework.
Times have changed, and so have the requirements a potential customer puts to the developer. Intercepting calls in a .NET program is no longer the worthless, insane task it once seemed. After reading this article, you will be able to take control of your .NET apps with ease!
We can now boldly assert that .NET, Microsoft's managed answer to the ubiquitous Java, has succeeded: a .NET developer is as much in demand today as a Java programmer. With the arrival of the .NET Framework, the "programming language / OS" scheme changed fundamentally. Earlier we talked about adapting one language to different platforms; now we talk about adapting different languages to one platform. Let's briefly recall what .NET is and what we need to know to learn the basics of interception in its environment. In our case it is not enough to know a .NET programming language; you also need a clear idea of how the runtime works.
The .NET platform is built around the Common Language Runtime (CLR). The CLR provides managed execution, which has a number of advantages. Together with the common type system, the CLR makes the .NET languages interoperable. On top of that, the platform offers the full-featured .NET Framework class library. And, of course, there is metadata: information about the assemblies, modules and types that make up a .NET program.
The compiler generates the metadata, and both the CLR and our programs use it. When an assembly and its related modules and types are loaded, the metadata is loaded with them.
Metadata is one of the most important and fundamental pieces here, so we will dwell on it now.
Metadata stores all the classes, types, constants and strings used by a .NET app. It is divided into several heaps, also called streams. In Microsoft .NET there are five of them: #US, #Strings, #Blob, #GUID and #~.
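To make the heap layout more concrete, here is a minimal sketch that lists the streams declared in a metadata root. It assumes the layout described in ECMA-335 (the "BSJB" signature, a padded version string, then one header per stream); the class name MetadataStreams and the method Read are hypothetical and are not part of the article's source code.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

public static class MetadataStreams
{
    // Returns (name, offset, size) for every stream header in a metadata root.
    // Assumption: 'root' starts exactly at the metadata root ("BSJB" signature).
    public static List<(string Name, uint Offset, uint Size)> Read(byte[] root)
    {
        var span = root.AsSpan();
        if (BinaryPrimitives.ReadUInt32LittleEndian(span) != 0x424A5342) // "BSJB"
            throw new InvalidOperationException("not a metadata root");

        // signature (4) + major (2) + minor (2) + reserved (4) = 12 bytes,
        // followed by the length of the padded version string.
        int versionLength = (int)BinaryPrimitives.ReadUInt32LittleEndian(span.Slice(12));
        int pos = 16 + versionLength;   // skip the version string
        pos += 2;                       // skip the flags field
        int count = BinaryPrimitives.ReadUInt16LittleEndian(span.Slice(pos));
        pos += 2;

        var result = new List<(string Name, uint Offset, uint Size)>();
        for (int i = 0; i < count; i++)
        {
            uint offset = BinaryPrimitives.ReadUInt32LittleEndian(span.Slice(pos));
            uint size = BinaryPrimitives.ReadUInt32LittleEndian(span.Slice(pos + 4));
            pos += 8;

            int start = pos;
            while (root[pos] != 0) pos++;   // the name is a null-terminated ASCII string
            string name = Encoding.ASCII.GetString(root, start, pos - start);
            pos = (pos + 1 + 3) & ~3;       // names are padded to a 4-byte boundary

            result.Add((name, offset, size));
        }
        return result;
    }
}
```

The offsets returned are relative to the start of the metadata root, so adding one of them to the root's position in the file gives the position of that heap.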
The #US heap stores all the string literals the programmer has written in the code. For example, if the program prints a string with Print("hello"), then hello is stored in the #US heap.
The #Strings heap keeps things like method names and file names.
The #Blob heap contains binary data referenced by the assembly, such as method signatures.
The #~ heap contains a set of tables that define the important content of a .NET assembly. Among them are the AssemblyRef, MethodRef, MethodDef and Param tables. The AssemblyRef table lists the external assemblies that this assembly depends on.
The MethodRef table lists the external methods used by the assembly. The MethodDef table contains all the methods defined in the assembly.
Param, in turn, contains all the parameters used by the methods defined in the MethodDef table. "Why all this boring stuff?", you may ask. Patience! Roll up the carpet of impatience and put it in the chest of waiting, because without understanding "how this stuff works" you will not grasp the point of this article :).
Let's talk more about the MethodDef table, because it is essential for intercepting the methods of .NET applications. Each entry in the method table contains the RVA (relative virtual address) of the method, the method flags, the method name, an offset into the #Blob heap that points to the method signature, and an index into the Param table that points to the first parameter passed to the function. The RVA of the method points to the method body (which contains the IL code) in the .text section.
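The fields of a MethodDef row can be inspected without any manual parsing. The sketch below swaps in the System.Reflection.Metadata library, which is not the technique used in this article's own code; it only reads the table and does not patch anything. The class name MethodDefDump and the method Describe are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;
using System.Reflection.PortableExecutable;

public static class MethodDefDump
{
    // Returns one text line per MethodDef row: token, name, RVA, flags, parameter count.
    public static List<string> Describe(string assemblyPath)
    {
        var lines = new List<string>();
        using var stream = File.OpenRead(assemblyPath);
        using var pe = new PEReader(stream);
        MetadataReader md = pe.GetMetadataReader();

        foreach (MethodDefinitionHandle handle in md.MethodDefinitions)
        {
            MethodDefinition method = md.GetMethodDefinition(handle);
            int token = MetadataTokens.GetToken(handle);   // 0x06xxxxxx for MethodDef rows
            string name = md.GetString(method.Name);       // the name lives in the #Strings heap
            int rva = method.RelativeVirtualAddress;       // 0 for abstract or extern methods
            int paramCount = method.GetParameters().Count; // rows in the Param table

            lines.Add($"0x{token:X8} {name} RVA=0x{rva:X} flags={method.Attributes} params={paramCount}");
        }
        return lines;
    }
}
```

Calling MethodDefDump.Describe on any managed assembly gives a quick view of the RVAs that an interceptor would later have to overwrite.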
The method signature specifies how parameters are passed (the calling convention), which type the method returns, and so on. So you can understand the topic at the level of
How far can you go into the forest?
To the middle; after that the forest ends. We are happy to announce that we are halfway through the forest and are gradually approaching our main goal: learning to catch .NET calls.
Let's look at how a .NET application is executed from a purely practical point of view. The following libraries are involved:
Mscoree.dll (the .NET execution engine)
Mscorwks.dll (where most of the work happens)
Mscorjit.dll (the JIT compiler itself)
Mscorsn.dll (handles strong-name verification)
Mscorlib.dll (the Base Class Library)
Fusion.dll (assembly binding)
The entry point of any .NET application consists of a single instruction: a jump to the _CorExeMain function listed in the import table. _CorExeMain, in turn, lives in mscoree.dll, which starts loading and executing the .NET application.
Mscoree.dll calls _CorExeMain from mscorwks.dll. Mscorwks.dll is a fairly big library that controls and handles the loading process. It loads the base class library (BCL) and only then calls the Main() entry point of your .NET application. Since Main() is not yet compiled at this point, its code is handed back to mscorwks.dll for compilation. Mscorwks.dll calls JITFunction, which loads the JIT environment from mscorjit.dll.
Once the IL code has been compiled into native code, control is passed back to Main(), which starts executing immediately.
Well, finally! Let's talk about intercepting .NET calls directly. We are used to doing interception the classic way: either by writing a proxy wrapper or by simply splicing the function. With .NET things are different.
The first thing to understand is that the code of the methods we intercept is placed at the end of the .text section. This is because the native .text section of a .NET assembly is quite compact: it has no spare room to hold all the intercepted functions.
Someone may ask: why not use the standard, well-known method (patching in a "CALL" or "JUMP" to the RVA of the method) to redirect execution to the intercepting code, and then simply call the original function from there? The reason is simple: in MSIL code, "CALL" and "JUMP" refer to methods by token (signature) rather than by offset. So if I want a reference to the code that has to be intercepted, I have to find it by searching for the method token. To solve our interception problem, then, we need to expand the .text section.
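The following sketch shows what "by token" means at the byte level. In IL, the call, callvirt and jmp opcodes are each followed by a 4-byte metadata token whose top byte selects a table (0x06 is MethodDef, 0x0A is MemberRef) and whose lower three bytes hold the row number. The class name IlTokens and its members are hypothetical names used only for illustration.

```csharp
using System;
using System.Buffers.Binary;

public static class IlTokens
{
    public const byte Jmp = 0x27, Call = 0x28, Callvirt = 0x6F;

    // Splits a metadata token into its table id (top byte) and row number (low 24 bits).
    public static (byte Table, int Row) Split(uint token) =>
        ((byte)(token >> 24), (int)(token & 0x00FFFFFF));

    // Reads the token operand of a call, callvirt or jmp instruction at il[offset].
    public static uint ReadCallToken(byte[] il, int offset)
    {
        byte op = il[offset];
        if (op != Call && op != Callvirt && op != Jmp)
            throw new ArgumentException("not a call, callvirt or jmp opcode");
        // The operand follows the one-byte opcode and is stored little-endian.
        return BinaryPrimitives.ReadUInt32LittleEndian(il.AsSpan(offset + 1));
    }
}
```

For the byte sequence 28 05 00 00 06, ReadCallToken returns 0x06000005, and Split turns that into table 0x06 (MethodDef), row 5. Nothing in the instruction says where the method body lives; that information sits only in the RVA column of the MethodDef row.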
It seems that the only way to call the original code is to create another method. There are two reasons why this is difficult, though still feasible. First, it requires adding a new entry to the method table. Second, the method table has no spare room for such an entry.
Next, we need to find the RVA of the method in the MethodDef table and overwrite it so that it points to the new location of the intercepted method. To do this, the .text section has to grow so that it can hold all of this. Both the virtual size and the raw size of the section must be taken into account. The virtual size is the actual size of the section; the raw size is that size rounded up to the file alignment.
Virtual addresses and sizes describe how the executable file is laid out once it is loaded into memory. If, for example, the .text section has a virtual address of 0x1000, then at offset 0x1000 from the image base in the memory of the running process we find the mapped .text section. The raw address of the section, however, may be 0x200, which means that in the file the .text section is located at offset 0x200.
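The relationship between virtual and raw values comes down to two small calculations, sketched here under the assumption that each section is described by its virtual address, virtual size and raw file pointer. The names SectionInfo, PeLayout, AlignUp and RvaToFileOffset are hypothetical and do not come from the article's source code.

```csharp
using System;
using System.Collections.Generic;

public readonly struct SectionInfo
{
    public readonly uint VirtualAddress, VirtualSize, RawPointer;

    public SectionInfo(uint virtualAddress, uint virtualSize, uint rawPointer)
    {
        VirtualAddress = virtualAddress;
        VirtualSize = virtualSize;
        RawPointer = rawPointer;
    }
}

public static class PeLayout
{
    // Rounds 'value' up to the next multiple of 'alignment' (raw size from virtual size).
    public static uint AlignUp(uint value, uint alignment) =>
        (value + alignment - 1) / alignment * alignment;

    // Converts an RVA (offset in the loaded image) into an offset in the file on disk.
    public static uint RvaToFileOffset(uint rva, IEnumerable<SectionInfo> sections)
    {
        foreach (SectionInfo s in sections)
        {
            if (rva >= s.VirtualAddress && rva < s.VirtualAddress + s.VirtualSize)
                return rva - s.VirtualAddress + s.RawPointer;
        }
        throw new ArgumentOutOfRangeException(nameof(rva), "RVA is not inside any section");
    }
}
```

With a .text section at virtual address 0x1000 and raw address 0x200, as in the example above, RVA 0x1234 maps to file offset 0x434. A virtual size of 0x1801 with a file alignment of 0x200 gives a raw size of 0x1A00.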
The sections that follow .text (the data and relocation sections) also have to be realigned, because the expanded .text section would otherwise run into the start of the next section and the file simply would not start. At the end of all this, the PE header is updated. All our interception code is now sewn directly into the assembly, while the other methods remain intact.
As you may have guessed, one of the key steps that lets us take control of a .NET program is obtaining a pointer to the CLIHeader, which, in turn, contains the MetaData field. That is exactly what we need:
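Getting a pointer to the CLI header (C#)
// FileReader is not a framework class; it appears to be a helper from the full source
FileReader Input = new FileReader(AssemblyPath);
byte[] Buffer = Input.Read();
// Twice the file size is allocated, presumably to leave room for the enlarged .text section
ImageBase = Marshal.AllocHGlobal(Buffer.Length * 2);
Marshal.Copy(Buffer, 0, ImageBase, Buffer.Length);
// Offset 60 (0x3C) of the DOS header holds the file offset of the PE header
HeaderOffset = *((UInt32 *)(ImageBase + 60));
PE = (PEHeader *)(ImageBase + HeaderOffset);
HeaderOffset += (UInt32)sizeof(PEHeader);
StandardHeader = (PEStandardHeader *)(ImageBase + HeaderOffset);
// Offset 208 of the PE32 optional header is the data directory entry of the CLI header
RVA *CLIHeaderRVA = (RVA *)((byte *)StandardHeader + 208);
SectionOffset = GetSectionOffset(CLIHeaderRVA->Address);
CLI = (CLIHeader *)(ImageBase + CLIHeaderRVA->Address - SectionOffset);
MetaDataHeader = (MetaDataHeader *)(ImageBase + CLI->MetaData.Address - SectionOffset);
metadata = new MetaData(Function, ImageBase, (Int32)CLI->MetaData.Address, MetaDataHeader, CLI->MetaData.Size);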
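The constants 60 and 208 in the listing above are not arbitrary, and a safe-code sketch can show where they come from. Offset 60 (0x3C) of the DOS header stores the position of the PE signature; the optional header follows the 4-byte signature and the 20-byte COFF header; in a PE32 optional header the data directories start at offset 96, and the CLI header is directory number 14, so 96 + 14 * 8 = 208. For PE32+ images the directories start at offset 112 instead. The class name CliDirectory and the method EntryOffset are hypothetical.

```csharp
using System;
using System.Buffers.Binary;

public static class CliDirectory
{
    // Returns the file offset of the CLI header data-directory entry
    // (8 bytes: a 4-byte RVA followed by a 4-byte size).
    public static int EntryOffset(byte[] image)
    {
        // e_lfanew: the file offset of the "PE\0\0" signature
        int peOffset = BinaryPrimitives.ReadInt32LittleEndian(image.AsSpan(0x3C));
        if (BinaryPrimitives.ReadUInt32LittleEndian(image.AsSpan(peOffset)) != 0x00004550)
            throw new BadImageFormatException("missing PE signature");

        int optionalHeader = peOffset + 4 + 20;   // signature + COFF file header
        ushort magic = BinaryPrimitives.ReadUInt16LittleEndian(image.AsSpan(optionalHeader));

        int directories = magic switch
        {
            0x10B => 96,    // PE32
            0x20B => 112,   // PE32+
            _ => throw new BadImageFormatException("unknown optional header magic")
        };

        return optionalHeader + directories + 14 * 8;   // directory 14 is the CLI header
    }
}
```

Reading the two 32-bit values at the returned offset gives the RVA and size of the CLI header; the RVA still has to be converted to a file offset before the header itself can be read from the file.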
What comes next is slightly more complicated, especially for those who are not familiar with injecting code into a PE file by extending its sections.
We need to write the intercepting function into the .text section, recalculate all the fields associated with the section so that the file can still run, and then update the PE header accordingly:
// Grow .text by the size of the hook code
VirtualSize = TextSectionHeader->VirtualSize + HookSize;
// The raw size is the virtual size rounded up to the file alignment
RawDataSize = VirtualSize;
if ((RawDataSize % FileAlignment) != 0)
    RawDataSize += (FileAlignment - (RawDataSize % FileAlignment));
StandardHeader->CodeSize = RawDataSize;
// The hook code is placed right after the original end of .text
HookAddress = TextSectionHeader->VirtualAddress + TextSectionHeader->VirtualSize;
TextSectionHeader->VirtualSize = VirtualSize;
TextSectionHeader->RawDataSize = RawDataSize;
StandardHeader->DataBase = DataSectionHeader->
// SectionHeader is presumably the last section: the image ends where it ends,
// rounded up to the section alignment
StandardHeader->ImageSize = SectionHeader->VirtualAddress + SectionHeader->VirtualSize;
if ((StandardHeader->ImageSize % SectionAlignment) != 0)
    StandardHeader->ImageSize += (SectionAlignment - (StandardHeader->ImageSize % SectionAlignment));
That's all. Unfortunately, the 75 thousand lines of code that I would like to share simply will not fit on the pages of the magazine. Just kidding :). You can find the fully working code on the disk.
It just so happens that most of this article describes how the .NET environment works, much of which you have probably heard before. But, as they say, RTFM and happiness will follow. To master this technique fully and earn the title of shaman, you will have to work hard. That is nothing to be afraid of, because being able to intercept .NET applications yourself is worth it.