
Reverse Engineering Team Blog

03.07.05

The Flawed Design of GetProcAddress

Posted in General Posts at 10:52 am by sna

On the now ancient Windows 95 and Windows 98 platforms, you can observe some rather strange behavior from GetProcAddress. The API behaves differently if the process invoking it is run under a debugger. This effectively breaks one of the cardinal rules of debugging: rule number two states “Don’t change the behavior of the target app”.

Robbins mentions the problem both in an MSJ column of his and in his “Debugging Applications” book, but I am not completely content with his description of it:

“…under Windows 95, calling GetProcAddress in your program while it is running under a debugger returns a different address than when it runs outside a debugger. What is actually returned when running under the debugger is a ‘debug thunk’ – a special wrapper around the real call.”

And he goes on to say:

“Windows 95 doesn’t implement copy-on-write in the operating system. With copy-on-write, the operating system will share a common code page in memory, but when a process writes to that memory, the memory is copied so that the individual process gets its own copy that will not interfere with any other process. In the Windows 95 architecture, any memory that is above the 2GB line is shared among all processes. If one process were to write a breakpoint to this shared memory area without the copy-on-write, the breakpoint would apply to all processes, not just the one being debugged.”

So when you call GetProcAddress on a symbol under a debugger, it returns the address of a debug thunk. Note that with this design you cannot export anything other than pure functions from the system DLLs, since the returned address points at thunk code rather than at the exported item itself. The thunk consists of a mere two instructions: a push of the real API address and a jump to some function inside the kernel. Unfortunately I do not own a copy of Windows 9x (I assume Windows ME is also affected by this problem), so I have not been able to debug the kernel function. However, I know someone who does own a copy, and he was kind enough to let me run some tests on his Windows 95.
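To make the two-instruction layout concrete, here is a minimal Python sketch that assembles such a thunk as raw bytes. It assumes the opcodes given later in the post (0x68 for the push, 0xE9 for the jump) and the standard x86 rule that an E9 jump is relative to the end of the instruction. The function name and all addresses are hypothetical, chosen only for illustration.

```python
import struct

THUNK_SIZE = 10  # push imm32 (5 bytes) followed by jmp rel32 (5 bytes)

def build_thunk(thunk_address, api_address, handler_address):
    """Assemble `push api_address; jmp handler_address` as placed at thunk_address."""
    # The jmp displacement is counted from the byte after the jmp instruction,
    # which is also the end of the thunk.
    rel32 = (handler_address - (thunk_address + THUNK_SIZE)) & 0xFFFFFFFF
    # "<" selects little-endian with no padding: opcode, operand, opcode, operand.
    return struct.pack("<BIBI", 0x68, api_address, 0xE9, rel32)

# Hypothetical addresses: a thunk at 0x80001000 wrapping an API at 0xBFF776D0,
# with the kernel handler assumed to live at 0xBFF70000.
thunk = build_thunk(0x80001000, 0xBFF776D0, 0xBFF70000)
print(thunk.hex())
```

Because the displacement depends on where the thunk sits, two thunks for the same API hold different bytes in the last four positions even though they jump to the same place.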

I prepared a small test application that resolves the LoadLibraryA and MessageBoxA API functions using GetProcAddress. The application also imports these functions through its Import Address Table (IAT) and upon running it the return values from GetProcAddress and the function addresses pointed to by the IAT are displayed in a message box.

There are three different cases that are of interest to us. The test application can be run normally, outside a debugger (1). It can also be started under a debugger (2), or a debugger can be attached to it halfway through (3). Here’s a figure showing the result of each of these cases:

[Figure: GetProcAddress tests]

The leftmost box looks just like what we expected, nothing to add there. The middle box, however, is a bit more concerning. You can see that the addresses do not point to the real functions, but what’s really annoying is that the address in the IAT and the address returned from GetProcAddress are not the same! As confirmed by the third box, if you attach a debugger halfway through, the IAT will be patched to reflect the fact that the process is now being debugged.

I prepared another test that would call GetProcAddress and store all unique return values in a list so they could be counted. We ran the test for a while and the list held nearly 300 000 different addresses before the test was aborted. What I do not understand is this: Why are debug thunk addresses not cached? If they were, and the same thunk was recycled, breakpoints would actually work and trigger like they should!

Of course, it would have been better if they hadn’t used debug thunks to begin with. It would then be up to the debugger to know not to write anything above the 2GB line. Oh well…

But even with the current design it’s usually a non-issue. The problem only arises when we need to know what a particular IAT entry points to. With the IAT being filled with pointers to debug thunks, there’s no telling which entry points to what. A workaround proposed by Robbins is to traverse the original first thunk instead; because it essentially mirrors the IAT, you will know what the real IAT entries are. This will usually work, unless, of course, there is no original first thunk. And as you may know, Borland compilers do not emit an original first thunk.
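As a rough sketch of the check this workaround depends on, the Python below unpacks one import descriptor and reports whether it carries an original first thunk at all. The five-DWORD layout follows the PE format's IMAGE_IMPORT_DESCRIPTOR; the function name and the RVA values are made up for the example, and reading the descriptor bytes out of the module is assumed to happen elsewhere.

```python
import struct

def name_table_rva(descriptor):
    """Return the OriginalFirstThunk RVA of a 20-byte import descriptor, or None."""
    # IMAGE_IMPORT_DESCRIPTOR: OriginalFirstThunk, TimeDateStamp,
    # ForwarderChain, Name, FirstThunk (five little-endian DWORDs).
    original_first_thunk, _stamp, _chain, _name, _first_thunk = struct.unpack_from(
        "<5I", descriptor
    )
    # A zero field means the linker emitted no original first thunk,
    # so there is nothing to walk in parallel with the IAT.
    return original_first_thunk or None

# Hypothetical descriptors: one with the table present, one without.
with_table = struct.pack("<5I", 0x2000, 0, 0, 0x2100, 0x3000)
without_table = struct.pack("<5I", 0, 0, 0, 0x2100, 0x3000)
print(name_table_rva(with_table), name_table_rva(without_table))
```

When the function returns None, the workaround has nothing to traverse and some other way of identifying the IAT entries is needed.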

I have another idea that might work. Debug thunks seem to always be constructed above the 2GB line, so if we invoke GetProcAddress a couple of times with the same symbol and get back different addresses all above 2GB, it would seem to indicate that we are being set up. We can then read the address being pushed in one of the thunks and save it for later. Also, don’t forget to take note of where the jump in the thunk goes.
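The detection step could be sketched roughly as follows. This is only an outline of the idea in Python: `resolve` is a hypothetical stand-in for whatever wraps GetProcAddress, and the choice of three attempts is an arbitrary assumption rather than a tested threshold.

```python
TWO_GB = 0x80000000

def looks_like_debug_thunks(resolve, symbol, attempts=3):
    """Guess whether repeated lookups of one symbol are yielding debug thunks."""
    addresses = [resolve(symbol) for _ in range(attempts)]
    # A real export resolves to the same address every time.
    all_distinct = len(set(addresses)) == len(addresses)
    # Debug thunks seem to always be constructed above the 2GB line.
    all_above_2gb = all(address >= TWO_GB for address in addresses)
    return all_distinct and all_above_2gb

# Fake resolver that hands out a fresh address per call, as observed in the test.
next_address = iter(range(0x80001000, 0x80002000, 10))
print(looks_like_debug_thunks(lambda name: next(next_address), "MessageBoxA"))
```

If the check fires, any one of the returned addresses can be read to recover the pushed API address and the jump target.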

When later looking through the IAT, if an address is above 2GB, we can see if what it points to matches the pattern of a thunk (0x68, ??, ??, ??, ??, 0xE9, ??, ??, ??, ??). If it does we read out the supposed jump target and compare it with the previously saved one. If the jump targets match we know that we’re looking at a debug thunk in the IAT.
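A minimal Python sketch of that IAT check might look like this. It is untested against a real Windows 9x system: `read_memory` is a hypothetical callable returning bytes from the target process, the addresses are invented, and the jump target is computed with the usual x86 rule that an E9 displacement is relative to the end of the instruction.

```python
import struct

TWO_GB = 0x80000000
THUNK_SIZE = 10  # 0x68 imm32, 0xE9 rel32

def decode_thunk(code, address):
    """Return (pushed_address, jump_target) if code matches the thunk pattern, else None."""
    if len(code) < THUNK_SIZE or code[0] != 0x68 or code[5] != 0xE9:
        return None
    (pushed,) = struct.unpack_from("<I", code, 1)
    (rel32,) = struct.unpack_from("<I", code, 6)
    # Wrap to 32 bits so that backward jumps come out right.
    target = (address + THUNK_SIZE + rel32) & 0xFFFFFFFF
    return pushed, target

def real_iat_target(entry, read_memory, saved_jump_target):
    """Return the pushed API address if entry is a debug thunk, else entry unchanged."""
    if entry < TWO_GB:
        return entry
    decoded = decode_thunk(read_memory(entry, THUNK_SIZE), entry)
    if decoded is None or decoded[1] != saved_jump_target:
        return entry
    return decoded[0]

# Fake memory holding one thunk at a hypothetical address.
memory = {0x80001000: bytes.fromhex("68d076f7bfe9f6eff63f")}
read = lambda address, size: memory.get(address, bytes(size))[:size]
print(hex(real_iat_target(0x80001000, read, 0xBFF70000)))
```

Comparing the jump target against the saved one, rather than trusting the opcode pattern alone, guards against ordinary code that merely happens to start with a push followed by a jump.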

I could write and post some code later to try the idea out.
