Flag: Tornado! Hurricane!

The Viral Darwinism of W32.Evol

Tuesday, February 6 2007 14:26.08 CST
Author: Orr # Views: 70154 Printer Friendly ...

Introduction

The W32.Evol virus was discovered around July 2000. Its name is derived from a string found in the virus, but much more can be implied from the name. Up until then, most of the viruses were using Polymorphic engines in order to hide themselves from Anti-Virus scanners. The engine would encrypt the virus with a different key on every generation, and would generate a small, variant decryptor that would consist of different operations but remain functionally equivalent. This technique was beginning to wear out as AV scanners would trace virus-decryption until it was decrypted in memory, visible and clear.

Although Metamorphism, as a technique, appeared in several viruses in the DOS age, it got full attention from virus writers in the 32-bit environment. The idea is simple; Transformation instead of Encryption. Not just a small decryptor would be transformed, but the entire virus body.

A Metamorphic engine is used in order to transform executable (binary) code. The behavior of such an engine varies from virus to virus, but many elements remain the same. A metamorphic engine has to implement some sort of an internal disassembler in order to parse the input code. After disassembly, the engine will transform the program code and will produce new code that will retain its functionality and yet will look different from the original code.

According to Symantec, Evol was the first virus to utilize a 'true' 32-bit Metamorphic Engine, and so it represents another step in the evolution of Anti-AV techniques.

Virus Author's Note: "The only particularity of Evol is its evolution engine - meaning that the virus will mutate every 4 copy of itself. The engine is not an usual polymorphic engine, but rather a metamorphic engine (see Benny description in 29A #4), which means that there is no encrypted code: the whole code of the virus, engine included, is variable. Furthermore, the engine inserts random code, so as to make detection by antivirus more difficult. The virus contains no fixed data : it is only a massive piece of code."

The research I performed over the Metamorphic engine includes a heavily-commented disassembly of the engine available here. Information regarding the behavior of the virus itself is not included in this paper and is available on many Antivirus websites on the Internet, namely the Symantec website.

Legend

Examples in this paper include shortened naming of assembly language expressions:

    Reg � Register (i.e. EAX, EBX)
    Mem � Memory address (i.e. [EAX])
    r/m � Register or Memory
    imm � Immediate Value (i.e. OP Reg, ACABh)
    OP  = {ADC, ADD, AND, CMP, OR, SBB, SUB, XOR}
    OP1 = {DIV, IDIV, IMUL, MUL, NEG, NOT, TEST}
    OP2 = {RCL, RCR, ROL, ROR, SAL, SAR, SHL, SHR}

Calling

The engine is called in the following standard-issue fashion:

    push [ebp+var_14]       ; *outBuf (EDI)
    call GetSizeOfCode
    push eax                ; sizeOfCode
    call SeekStartOfVirus
    push eax                ; *inBuf (ESI)
    call MetaEngine
    cmp  eax, 0
    jz   short EngineFailed

Or, in psuedo-C:

    MetaEngine(*inBuf, sizeOfCode, *outBuf);

Where, *inBuf is a pointer to the code, sizeOfCode is the size, *outBuf is the output to the destination buffer where the mutated code will be stored.

Code Analysis

The engine will perform a analysis over the given code. Aside from on-the-fly disassembly, the virus will allocate 4 table entries for each instruction it analyzes. Each of the entries is a double-word. The structure is accessed in the following way:

+00InputIPPointer to Instruction in the Input Buffer
+04OutputIPPointer to Mutated Instruction in Output Buffer
+08OffsetNewRelativePointer to offset of New Relative Branch Value
+0CNewRelativeNew Relative Value for a Branch Instruction


When the engine first loads, it will use allocate SizeOfCode*16 bytes using VirtualAlloc for the purpose of the above-mentioned table. In the end, theses bytes will be freed using VirtualFree. The virus itself uses internal 'caller' functions (callVirtualAlloc / callVirtualFree), and doesn't call the API's directly.

Every time the engine loads a new instruction for analysis, the first two members of the structure are filled, and the 3rd member is zeroed for later use. The 3rd and 4th fields will only be filled in case the engine analyzes a branch instruction (JMP/Jcc/CALL), to be used when the relocations will be fixed, after the mutation process is complete.

The engine will disassemble only instructions that the author had included, meaning it would fail with unrecognized / unsupported instructions.

Sample Disassembly:

    cmp al, 8Ah         ; MOV r8, r/m8?
    jz short _Mutate?
    cmp al, 8Bh         ; MOV r32, r/m32?
    jz short _Mutate?
    cmp al, 8Dh         ; LEA r32, mem?
    jz short _Mutate?
As you can see, the engine simply checks for the current opcode, and if it is recognized by the engine, it will take an action accordingly.

Code Transformations

I. Instruction Transformation

The engine supports several kinds of instruction mutations, meaning it will write different code with the same functionality. The defined transformations are divided into two parts:
  • Inter-Engine Transformations: These transformations are inlined inside the engine, and are a part of the engine's core.

    OriginalTransformed
    
    - PUSH r/m8
    - PUSH r/m32
    
    
      MOV  EAX, r/m
      PUSH EAX
    
    - MOV reg, imm
    
    
    a. MOV reg, Random
       ADD reg, imm-Random
    
    b. MOV reg, Random SUB reg, -(imm-Random)
    c. MOV reg, Random XOR reg, Random^imm


  • External Transformations: These transformations are called like that due to the fact that their 'physical' location is outside the main engine function. Despite this fact, these routines act as if they are inside the engine itself, and when they are finished they jump back to the engine. There is no visible point behind this, and my guess is that they were added after the initial coding process.

    OriginalTransformed
    
    - MOV r/m, reg
    - MOV reg, r/m
    - TEST r/m, reg
    - LEA r32, mem
    - OP r/m, reg
    - OP reg, r/m
    
    
     PUSH RandomReg
     MOV RandomReg, OriginalReg
     ADD RadnomReg, RandomImm8
     OP r/m - RandomReg, OriginalReg 
     POP RandomReg
    
    
    - MOV r/m, reg
    - TEST r/m, reg
    - OP r/m, reg
    
    
     PUSH RandomReg
     MOV RandomReg, OriginalReg
     OP OriginalR/M, RandomReg
     POP RandomReg
    
    
    - MOV reg, r/m
    - LEA reg, mem
    - OP reg, r/m
    
    
     PUSH RandomReg
     MOV RandomReg, OriginalReg
     OP RandomReg, OriginalR/M
     MOV OriginalReg, RandomReg
     POP RandomReg
    
    
    - OP r/m8, imm8
    - MOV r/m8, imm8
    - TEST r/m8
    
    
     PUSH RandomReg
     MOV RandomReg8, Imm8
     OP OriginalR/M8, RandomReg8
     POP RandomReg
    


    The engine's decision whether to transform a given instruction or not is based upon a random factor. The engine asks for a random number between 0 and 7, and the transformation will be applied only if it is 0 � meaning a probability of 1/8.
II. Alternative Instruction Encoding

The Intel instruction format allows different binary encoding for the same action. The engine supports the following alternative encodings:

Original EncodingModified EncodingMnemonics
7x imm8 0F 8x imm32Jcc short Jcc near
EB imm8 E9 imm32 CALL shortCALL near
A8 imm8 F6 C0 imm8 TEST AL, imm8 TEST AL, imm8
A9 imm32 F7 C0 imm32 TEST EAX, imm32 TEST EAX, imm32
3F imm8 80 ModRM imm8 OP AL, imm8 OP AL, imm8
3F imm32 81 ModRM imm32 OP EAX, imm32 OP EAX, imm32
83 ModRM imm881 ModRM imm32 OP r/m32, imm8OP r/m32, imm32


III. Fixed Transformations

The engine will replace the following bytes with the corresponding sequences:

Original EncodingModified EncodingMnemonics
  A4

50
8A 06
83 C6 01
88 07
83 C7 01
58
  MOVSB  

  PUSH EAX
  MOV AL, [ESI]  
  ADD ESI, 1
  MOV [EDI], AL  
  ADD EDI, 1
  POP EAX
	
 A5

50
8B 06
83 C6 04
89 07
83 C7 04
58
  MOVSD
  PUSH EAX
  MOV [EAX], ESI  
  ADD ESI, 4
  MOV [EDI], EAX  
  ADD EDI, 4
  POP EAX
 AA

88 07
83 C7 01
  STOSB
  MOV EDI, [AL]  
  ADD EDI, 1
 AB

88 07
83 C7 04
  STOSD
  MOV EDI, [EAX]  
  ADD EDI, 4
 AC

8A 06
83 C6 01
  LODSB 
  MOV AL, [ESI]  
  ADD ESI, 1
 AD

8A 06
83 C6 04
  LODSD
  MOV EAX, [ESI]  
  ADD ESI, 4


As you can see, these instructions do not have any parameters passed onto them, thus simply being replaced with their corresponding functionality.

IV. Junk-Code Insertion

The engine will generate instructions that are not reliant upon the original code, and their functionality is essentially "do-nothing". The junk instructions will only be added if the last written byte is between 50h to 52h (PUSH EAX/ECX/EDX).

    - MOV r32, [ebp+Random8]
    - MOV r32, Random32
    - OP r32, Random32            ;ADC/ADD/AND/OR/SBB/SUB/XOR
    - MOV RandomReg8, Random8

It may be noted, however, that these instructions actually do alter the original code flow as they are random and inserted in places in which they will be executed, but these instructions are inserted after PUSH instructions, so we assume the registers will be modified later on.

V. General Instructions

In any other case the engine will store the instruction as is, aside from exceptional opcodes:

OpcodeMnemonicsAction
90
NOP
Don't store
0F xx
(80 > xx > 90)
Special Opcode 0F
Not supported by engine
Abort Engine
CC
INT 3 (Debugger Breakpoint)
Anti Debug
81 C4
ADD ESP, imm32
Store
81 EC
SUB ESP, imm32
Store
83 C4
ADD ESP, imm8
Store
83 EC
ADD ESP, imm8
Store
C0
OP2 r/m8, imm8
Store
D0
OP2 r/m8, imm8
Store
CD
INT
Store
8B EC
MOV EBP, ESP
Store
F3
REP Prefix
Store
C3
RET
Store
50 � 5F
PUSH r32 / POP r32
Store


Relocation Fixups

After the mutation process is completed, the engine fixes instruction relocations. Due to the fact that many times the transformation process results in growth of code, most (if not all) of the branch instructions will lead to an incorrect place in the destination buffer. The engine will utilize the relocation-table it created during the mutation process, and it will patch the new address into place. First, it will loop through the table. For every instruction it will add the 1st and 4th fields (InputIP + NewRelative), thus calculating a virtual original destination. It will then set a second loop that will search for that destination, and patch the entry using the 3rd and 4th fields.

Other Features

I. Anti Debugging

If the engine will detect a breakpoint over the code it mutates, it will jump to the following routine:

    AntiDebug:
        cmp byte ptr [ebx+7], 0BFh      ; are we in kernel mode?
        jnz short ret_AntiDebug
        mov ecx, 1000h                  ; counter = 1000h
        mov edi, 40000000h
        or edi, 80000000h
        add edi, ecx                    ; edi = C0001000h
        rep stosd                       ; copy bytes to [edi]
    ret_AntiDebug:
        retn                            ; this will result in a crash

The above routine can also be considered as 'external', as it is called from the main virus body as well as from the engine.

II. Internal Functions

The engine contains several functions that it uses for many actions:
  • Random: Returns a random number in EAX.
  • Rnd7: Returns a random number between 0-7 and checks if it's 0.
  • CheckDisplacement: Returns the displacement of a given opcode in CL.
  • InvertSign: If necessary, the function inverts the sign (- / +) of AL.
  • GetRandomDword: Returns a random dword in EAX.
  • GetWBit: Extracts the W bit from the opcode into DL.
  • ModifyDH: Modifies the DH according to the W bit.
  • GetRandomReg8: Returns a random 8 bit register in BL.
  • GetRandomReg32: Returns a random 32 bit register in BL.
  • MakePushRandomReg: Generates a PUSH RandomReg instruction.
  • MakePopRandomReg: Generates a POP RandomReg instruction.
Sample Transformations

Presented below are the actual transformations performed by the engine on itself.

    B9 00 10 00 00      mov ecx, 1000h

Transformed:    
    B9 10 B2 00 3C      mov ecx, 3C00B210h
    81 C1 F0 5D FF C3   add ecx, 0C3FF5DF0h ; ecx = 1000h

Some of the "external" transformations:

BeforeAfter
8B 45 0C   mov eax, [ebp+0Ch]
56         push esi
89 EE      mov esi, ebp
83 C6 56   add esi, 56h
8B 46 B6   mov eax, [esi-4Ah]
5E         pop esi
89 43 08   mov [ebx+8], eax
51         push ecx
8B C8      mov ecx, eax
89 4B 08   mov [ebx+8], ecx
59         pop ecx
33 C0      xor eax, eax
51         push ecx
89 C1      mov ecx, eax
33 C8      xor ecx, eax
8B C1      mov eax, ecx
59         pop ecx
80 F9 50   cmp cl, 50h
52         push edx
B2 50      mov dl, 50h
38 D1      cmp cl, dl
5A         pop edx


As you can see in the above examples, the mutated byte-sequences are entirely different then the original ones.

Conclusion

The analysis of the virus engine took me a lot of time, mainly due to the fact that it was done statically, without running the code. I hope this paper helps to shed more light on the idea of how metamorphism is done, as well as the aspects involved in the design of such an engine. Further, I'd like to thank the author of this engine, for creating this piece of code that enhanced my interest in this particular field. I encourage you to look over the reversed source-code of the engine, as it will probably make all things written above a little bit more clear.

Thanks for reading this. As always, any feedback is always welcome.

Orr
www.antilife.org/

Article Comments Write Comment / View Complete Comments

    Username Comment Excerpt Date
nEINEI good work~~ Friday, November 27 2009 04:04.08 CST
adityaks nice one man Wednesday, June 20 2007 06:07.58 CDT
Orr The mistake is in the paper and not in the engi... Friday, April 6 2007 16:02.00 CDT
eraser You are right MazeGen. [code] EB cb JMP cb ... Wednesday, April 4 2007 13:33.07 CDT
MazeGen Very interesting article, thanks. There's on... Friday, March 30 2007 02:57.38 CDT
eraser Many thanks for your valuable contribution. Monday, March 12 2007 11:13.59 CDT
nico Good work, interesting disassembly :) Wednesday, February 21 2007 15:35.17 CST

There are 31,320 total registered users.


Recently Created Topics
[help] Unpacking VMP...
Mar/12
Reverse Engineering ...
Jul/06
hi!
Jul/01
let 'IDAPython' impo...
Sep/24
set 'IDAPython' as t...
Sep/24
GuessType return une...
Sep/20
About retrieving the...
Sep/07
How to find specific...
Aug/15
How to get data depe...
Jul/07
Identify RVA data in...
May/06


Recent Forum Posts
Finding the procedur...
rolEYder
Question about debbu...
rolEYder
Identify RVA data in...
sohlow
let 'IDAPython' impo...
sohlow
How to find specific...
hackgreti
Problem with ollydbg
sh3dow
How can I write olly...
sh3dow
New LoadMAP plugin v...
mefisto...
Intel pin in loaded ...
djnemo
OOP_RE tool available?
Bl4ckm4n


Recent Blog Entries
halsten
Mar/14
Breaking IonCUBE VM

oleavr
Oct/24
Anatomy of a code tracer

hasherezade
Sep/24
IAT Patcher - new tool for ...

oleavr
Aug/27
CryptoShark: code tracer ba...

oleavr
Jun/25
Build a debugger in 5 minutes

More ...


Recent Blog Comments
nieo on:
Mar/22
IAT Patcher - new tool for ...

djnemo on:
Nov/17
Kernel debugger vs user mod...

acel on:
Nov/14
Kernel debugger vs user mod...

pedram on:
Dec/21
frida.github.io: scriptable...

capadleman on:
Jun/19
Using NtCreateThreadEx for ...

More ...


Imagery
SoySauce Blueprint
Jun 6, 2008

[+] expand

View Gallery (11) / Submit