Flag: Tornado! Hurricane!

Blogs >> RolfRolles's Blog

Created: Tuesday, January 22 2008 16:20.30 CST  
Printer Friendly ...
Compiler Optimizations Regarding Structures
Author: RolfRolles # Views: 4616

Here are some optimizations that I have seen MSVC apply to structure references.  I wish I could give you the real names for these optimizations, but I can't find them in any of my compilers textbooks.  I have a feeling that they're buried away somewhere inside of Randy Allen and Ken Kennedy's incredibly dense tome, "Optimizing Compilers for Modern Architectures".  If anybody knows the real names for these transformations, please speak up.

#1:  Let's say we are accessing multiple entries in a structure that's larger than 80h.  Now as stated in the previous entry, each access to the members situated at >= 0x80 is going to require a dword in the instruction encoding if we generate the "naive" code.  If we instead do:

lea esi, [esi+middle_of_structure_somewhere]

mov eax, [esi-(middle_of_structure_somewhere - member_offset1)]
mov ebx, [esi+(member_offset2 - middle_of_structure_somewhere)]

We can access more of the structure with the one-byte instruction encoding, if those subtracted quantities are bytes.  The compiler chooses middle_of_structure_somewhere specifically to maximize the number of one-byte references.  This is the same idea behind the "frame pointer delta" stack-frame optimization.

#2:  Let's say we have a loop that accesses two arrays of structures inside of another structure, one array beginning at +1234h, the other beginning at +2234h.  If we emit the "naive" code:

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+1234h+ebx+offset_of_member1]
mov edi, [esi+2234h+edx+offset_of_member2]

Then obviously both of these structure displacements are going to require a separate dword in the instruction encoding for 1234h+offset_of_member1 and 2234h+offset_of_member2.  If we instead do:

lea esi, [esi+1234h]

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+ebx+offset_of_member1]
mov edi, [esi+1000h+edx+offset_of_member2]

Then if offset_of_member1 is a byte, it's only going to require a byte in the instruction encoding, thus saving three bytes per reference to the first structure (we can combine the previous optimization to place esi such that the number of one-byte references is maximized).  Alternatively, if more members in the second structure are accessed than those in the first, we'll see:

lea esi, [esi+2234h]

; ecx = loop induction variable
imul ebx, ecx, sizeof(structure1)
imul edx, ecx, sizeof(structure2)

mov eax, [esi+ebx+offset_of_member1-1000h]
mov edi, [esi+edx+offset_of_member2]

Once again, the first optimization can also be applied here to choose the optimal placement for esi that maximizes the number of single-byte references.  The multiplications given in the second optimization can also be optimized away into additions.


Blog Comments
c1de0x Posted: Wednesday, January 23 2008 15:11.31 CST
Rolf: Now we're talking!

Just one thing... you say "one byte instruction" when I think you mean "one byte operand". Not a big deal but it can get confusing.

Another place you see negative 'structure' offsets, although not strictly speaking an optimization, is in classes with multiple inheritance. In this case, you'll sometimes have virtual functions receiving pointers to 'this' and using a negative offset to cast up the inheritance hierarchy.

Again, this kind of thing can probably be detected and 'noted' automatically.



Add New Comment
Comment:









There are 31,320 total registered users.


Recently Created Topics
[help] Unpacking VMP...
Mar/12
Reverse Engineering ...
Jul/06
hi!
Jul/01
let 'IDAPython' impo...
Sep/24
set 'IDAPython' as t...
Sep/24
GuessType return une...
Sep/20
About retrieving the...
Sep/07
How to find specific...
Aug/15
How to get data depe...
Jul/07
Identify RVA data in...
May/06


Recent Forum Posts
Finding the procedur...
rolEYder
Question about debbu...
rolEYder
Identify RVA data in...
sohlow
let 'IDAPython' impo...
sohlow
How to find specific...
hackgreti
Problem with ollydbg
sh3dow
How can I write olly...
sh3dow
New LoadMAP plugin v...
mefisto...
Intel pin in loaded ...
djnemo
OOP_RE tool available?
Bl4ckm4n


Recent Blog Entries
halsten
Mar/14
Breaking IonCUBE VM

oleavr
Oct/24
Anatomy of a code tracer

hasherezade
Sep/24
IAT Patcher - new tool for ...

oleavr
Aug/27
CryptoShark: code tracer ba...

oleavr
Jun/25
Build a debugger in 5 minutes

More ...


Recent Blog Comments
nieo on:
Mar/22
IAT Patcher - new tool for ...

djnemo on:
Nov/17
Kernel debugger vs user mod...

acel on:
Nov/14
Kernel debugger vs user mod...

pedram on:
Dec/21
frida.github.io: scriptable...

capadleman on:
Jun/19
Using NtCreateThreadEx for ...

More ...


Imagery
SoySauce Blueprint
Jun 6, 2008

[+] expand

View Gallery (11) / Submit