re
def: analyzing software, understanding its structure, functionality, behavior, etc. Important for malware analysis, software security auditing, understanding proprietary protocols, cyber forensics, etc.
Note that there exists different reverse engineering techniques for different langagues. Some languages include C/C++, Java, .NET, Assembler, etc. Some platforms inlcude DOS, MacOS X, Unix/Linux, Windows, etc.
• https://crackmes.one
• x86-asm.md
• os.md
• github.com/Dump-GUY/Malware-analysis-and-Reverse-engineering
• Assembler: Turn assembly code into machine language
• Disassembler: Taking a machine language program and turning it into an assembly language (human-readable) code
- examples include IDA Pro, Ghidra, Objdump, Radare2, GDB
• compilation: taking high-level code and turning it into low-level code, usually assembly
•
•
- e.g.,
- also, see ELF
• "patching a jump" - if you want to change the flow of execution of a binary, for example if there is a license check, you can go in with a hex editor (or, disassembler) and literally change the "je" to a "jne" lol
c->asm: gcc -S [file]
c->asm w/o fluff: gcc -S -O2 -fno-asynchronous-unwind-tables [file]
c->executable: gcc [file] -no-pie -o [filename]
once we have the assembly file, assemble to an executable:
asm->executable: as [file].s -o [file].o
and link it:
gcc [file].o -o [file]
and execute:
./[file]
• GDB
• Objdump
• strace - track system calls
• ltrace - track ibrary calls
• hopper
-
•
•
•
•
•
•
-
•
- example ->
Note because linux has Address space layout randomization, if you want to set a specific memory address you'll prob have to disable ASLR via
•
•
•
•
(note, this section is generally in intel assembly)
// do something
}
will have the structure:
// add items to stack
mov addr1, i (1)
mov addr2, 10
// compare operation
cmp addr2, 0xa # note how this is one more than 10 in hexadecimal
// conditional jump after the loop
jg some addr after loop
// "do something"
// do the increment condition
add addr1, 0x1
// jump back to compare operation
jmp to compare
If you're trying to emulate some for loop in python, they have automatic hexadecimal conversion. That is, if you type "0xe" python will spit out 14:
So you can do something like:
Note that there exists different reverse engineering techniques for different langagues. Some languages include C/C++, Java, .NET, Assembler, etc. Some platforms inlcude DOS, MacOS X, Unix/Linux, Windows, etc.
Resources
• Reverse engineering for beginners• https://crackmes.one
• x86-asm.md
• os.md
• github.com/Dump-GUY/Malware-analysis-and-Reverse-engineering
Definitions
• Machine language: legit 1s and 0s that are pumped into the CPU• Assembler: Turn assembly code into machine language
• Disassembler: Taking a machine language program and turning it into an assembly language (human-readable) code
- examples include IDA Pro, Ghidra, Objdump, Radare2, GDB
• compilation: taking high-level code and turning it into low-level code, usually assembly
•
strings [file]
-> legit just shows the strings
•
file [file]
-> shows the filetype
- e.g.,
NT_Crack.exe: PE32 executable (console) Intel 80386, for MS Windows, 13 sections
- also, see ELF
• "patching a jump" - if you want to change the flow of execution of a binary, for example if there is a license check, you can go in with a hex editor (or, disassembler) and literally change the "je" to a "jne" lol
Making ASM
making the assembly file/executable:c->asm: gcc -S [file]
c->asm w/o fluff: gcc -S -O2 -fno-asynchronous-unwind-tables [file]
c->executable: gcc [file] -no-pie -o [filename]
once we have the assembly file, assemble to an executable:
asm->executable: as [file].s -o [file].o
and link it:
gcc [file].o -o [file]
and execute:
./[file]
Tools
• IDA Pro• GDB
• Objdump
• strace - track system calls
• ltrace - track ibrary calls
• hopper
GDB
•set dis
-> dis options
-
set disassembly-flavor
(intel, etc)
•
info functions
-> show all defined functions
•
disassemble [function]
-> show the asm code for [function]
•
run
-> run the code
Breakpoints
Apple (of app people, wtf) has a nice guide: developer.apple.com/library/archive/documentation/DeveloperTools/gdb/gdb/gdb_6.html•
info breakpoints
-> shows the current breakpoints
•
delete [breakpoint]
-> delete whatever breakpoint
•
break [function]
-> set breakpoint at function
-
break *main
•
break +offset
, break [linenum]
, break *address
- example ->
break *main+88
(will break at +88) in the lil disassembler thing)
Note because linux has Address space layout randomization, if you want to set a specific memory address you'll prob have to disable ASLR via
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
Observing Stuff
Because GDB is the goat, we can step through the program with it, check out memory addresses, etc.•
continue
, c
-> continue execution till breakpoint
•
info registers
-> shows the content of all registers
•
x/d [address]
-> inspect the value at a memory location
•
stepi
, si
-> execute one assembly instruction at a time
C->ASM
Each structure in C has a certain "mirror" structure in assembly.(note, this section is generally in intel assembly)
For loop
for (int i = 1, i < 10, i++){// do something
}
will have the structure:
// add items to stack
mov addr1, i (1)
mov addr2, 10
// compare operation
cmp addr2, 0xa # note how this is one more than 10 in hexadecimal
// conditional jump after the loop
jg some addr after loop
// "do something"
// do the increment condition
add addr1, 0x1
// jump back to compare operation
jmp to compare
If you're trying to emulate some for loop in python, they have automatic hexadecimal conversion. That is, if you type "0xe" python will spit out 14:
>>> 0xe
14
So you can do something like:
for i in range(0x1, 0xa) : # do something