Introduction
Shellcode is a sequence of machine instructions used as the payload when exploiting a vulnerability. An exploit is a small piece of code that targets a vulnerability. Shellcodes are typically written in assembly. We generally refer to sites like shell-storm.org to get shellcodes and attach them to our exploits. But how can we make our own shellcodes?
This series of articles focuses on creating our own shellcodes. In Part 1, we'll understand basic assembly instructions, write our very first assembly code, and turn that into shellcode.
Table of Contents
• Understanding CPU Registers
• First Assembly Program
• Assembling and Linking
• Extracting Shellcode
• Removing NULLs
• Sample Shellcode Execution
• Conclusion
Understanding CPU registers
"Assembly is the language of
OS." We have all read this in our computer science textbooks in high
school. But how is assembly written? How is the assembly language able to
control our CPU? How do we make our assembly program?
Before going into assembly, let's understand our CPU registers. An x86-64 CPU has various 8-byte (64-bit) registers that can be used to store data, do computation, and other tasks. These registers are physical storage built into the chip. They are the fastest storage the CPU can access, orders of magnitude faster than the hard disk and noticeably faster than main memory. If we could write a program using only registers, it would run with virtually no memory-access delay.
A CPU contains a Control Unit and an Execution Unit, among other things. The Execution Unit talks to the registers and flags.
There are many registers on the CPU. But for this part, we only need to know about the general-purpose registers.
[Figure: 64-bit registers (ref: researchgate.net)]
In the image above we can see the 8 legacy registers (RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP) followed by R8 to R15. These are the general-purpose registers. A CPU may also have others, like the MMX registers, which we'll encounter later on.
Of these, the 4 data registers are:
RAX - Accumulator. Used for input/output and most arithmetic operations.
RBX - Base Register. Used as a base for indexed addressing.
RCX - Count Register. Used for counting, like a loop counter.
RDX - Data Register. Used in I/O operations, and along with RAX for multiply/divide operations involving large values.
Again, these are just the conventional roles. We can modify and use these registers in other ways we like.
Next, the 3 pointer registers are:
RIP - Instruction Pointer. Stores the address of the next instruction to be executed.
RSP - Stack Pointer. Stores the memory address of the top of the stack.
RBP - Base Pointer. Marks the base of the stack frame for the current function. This makes it easier to access function parameters and local variables at fixed offsets from the RBP register. E.g., RBP-4 could hold the first integer variable defined in the function.
Finally, there are 2 index registers:
RSI - Source Index. Mainly used as the source index for string operations.
RDI - Destination Index. Mainly used as the destination index for string operations.
Apart from these we have a status register as well, whose individual bits are known as flags. Each flag holds 1 when set and 0 when unset. Some of these are:
CF - Carry Flag. Set when an arithmetic operation produces a carry or a borrow.
PF - Parity Flag. Reflects the parity of the result of an arithmetic operation. If the number of "1" bits in the lowest byte of the result is even then PF=1, else it is 0.
ZF - Zero Flag. Set when the result of the previous operation is zero. Conditional instructions like JZ, JNZ etc. use it as their input.
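To make the three flags concrete, here is a minimal C sketch that models what CF, PF and ZF would look like after an 8-bit addition. The struct and function names are made up for illustration, and this only imitates the x86 rules (PF is set when the low byte of the result has an even number of 1 bits); it does not read the real flags.

```c
#include <stdint.h>

/* The three flags discussed above, as they would look after an 8-bit add. */
struct add_flags {
    uint8_t result; /* low 8 bits of a + b */
    int cf;         /* 1 if the add carried out of bit 7 */
    int pf;         /* 1 if result has an even number of 1 bits */
    int zf;         /* 1 if result is zero */
};

/* Hypothetical helper: model CF, PF and ZF for an 8-bit "add a, b". */
static struct add_flags add8_flags(uint8_t a, uint8_t b)
{
    struct add_flags f;
    unsigned int wide = (unsigned int)a + (unsigned int)b;
    int ones = 0;

    f.result = (uint8_t)wide;
    f.cf = wide > 0xFF;
    f.zf = f.result == 0;

    /* Count the 1 bits in the 8-bit result to derive the parity flag. */
    for (int bit = 0; bit < 8; bit++)
        ones += (f.result >> bit) & 1;
    f.pf = (ones % 2) == 0;

    return f;
}
```

For example, add8_flags(0xFF, 0x01) wraps around to 0, so the carry and zero flags both come out as 1, which is exactly the situation a following JZ would react to.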
Now we are ready to write our first program in
assembly.
First Assembly Program
An assembly program is usually written with 3 main sections:
1. Text section - Program instructions are stored here.
2. Data section - Initialized data is stored here.
3. BSS section - Uninitialized data is stored here.
Also note that there are 2 main assembly syntaxes used in 64-bit Linux assembly: AT&T syntax and Intel syntax.
If you have used GDB before, you'll have noticed that it displays assembly in AT&T syntax by default. Which one to use is a personal preference. Some people like seeing their assembly in AT&T syntax, but we will be using the Intel syntax because it seems a lot clearer.
Let's write our first "Hello World" program.
We always start by defining our skeleton code. I'll create a file with the extension ".asm".
The skeleton begins with a global directive. Unlike C, we don't have a main function here to tell the toolchain where the program starts, so in assembly we use the symbol "_start" to define the start of the program. The directive "global _start" exports that symbol, and in section .text we define the "_start:" label to mark the point where execution begins.
For full details about global directives, refer to this post.
Now, we have to define the message "Hello World". Since this is a piece of data, it must go in the .data section.
This is how variables are declared:
<variable>: <data type> <value>
The name of our variable is "message". It is defined as a sequence of bytes (db = define bytes) and ends with a newline (0xa is the hex value for "\n"). So the declaration reads: message: db "Hello World", 0xa
For full details about data types in assembly, refer to this post.
Now that we have declared a message, we need instructions to print it.
It is important to know that assembly programs also rely on the underlying system calls of the OS. In Linux, there are currently 456 system calls (the exact count depends on the kernel version), whose numbers are listed in /usr/include/x86_64-linux-gnu/asm/unistd_64.h
You can also find an online searchable table here: https://filippo.io/linux-syscall-table/
The syscall used to print a message is "write". It takes three arguments: a file descriptor, the address of the buffer to write, and the number of bytes to write.
Syscalls take their arguments in specific registers. Once we know what a syscall expects in each register, we can perform any syscall. To perform write, we need these values in these registers:
rax -> 1 (the syscall number of write)
rdi -> 1 (stdout in Linux is defined by fd=1)
rsi -> address of the message to display
rdx -> length of the message (which is 12 including the newline)
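The same register mapping can be seen from C by going through the generic syscall() wrapper instead of the usual write() function. This is only a sketch for comparison, assuming x86-64 Linux with glibc; raw_write is a made-up helper name, and the comment describes what the wrapper is expected to load before it executes the "syscall" instruction.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: perform write through the generic syscall() wrapper.
 * On x86-64 Linux the wrapper ends up loading:
 *   rax = SYS_write (1), rdi = fd, rsi = buf, rdx = len
 */
static long raw_write(int fd, const char *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "Hello World\n", 12) prints the message on stdout and, on success, returns the number of bytes written.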
But how do we put these values in these registers? For this, assembly has many instructions. The most common instruction is "mov". It can move values:
• Between registers
• From memory to registers and from registers to memory
• From immediate data to registers
• From immediate data to memory
So, we will just move these values into the dedicated registers, using one "mov" per register: rax and rdi get 1, rsi gets message, and rdx gets 12.
However, manually calculating the length of messages may not be feasible. So, we'll use a little trick. We'll define a new variable for the length using "equ", which means equals, followed by "$", which denotes the current offset, and subtract our message's beginning offset from it to find the length of the message.
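C has a close analogue of this trick: let the toolchain compute the length from where the data begins and ends, instead of counting by hand. The sketch below is only an analogy to "equ" with "$", not the assembly itself, and MESSAGE_LENGTH is a name chosen for illustration.

```c
/* The message, including its trailing newline (0xa). */
static const char message[] = "Hello World\n";

/*
 * Hypothetical name. sizeof counts the terminating '\0' that C appends,
 * so subtract 1 to get the 12 bytes the assembly version would print.
 */
enum { MESSAGE_LENGTH = sizeof(message) - 1 };
```

If the text of the message changes, the length follows automatically, which is the whole point of the trick.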
We also need the "syscall" instruction to actually invoke the "write" syscall we just set up. Without it, the values simply sit in the registers and write is never performed.
Finally, we also need to exit from the program. The sys_exit syscall in Linux performs this operation.
So, rax -> 60
And rdi -> any value we want for the exit code. Let's give this 0 for now.
Assembling and Linking
Now this code is ready to run. We always need to do these steps to run assembly code:
1. Assemble using nasm
2. Link using ld
An assembler produces object files as output. The linker then combines the object file with any libraries it needs, resolves symbols such as _start, and creates an executable. We will use "nasm" to do the assembling and "ld" to link.
Since it is a 64-bit ELF that we want, the commands become:
nasm -f elf64 1.asm -o 1.o
ld 1.o -o 1
./1
As we see, we have now generated an executable file that prints "Hello World". Perfect. We can now proceed to create our shellcode using this binary.
Extracting shellcode
We created our assembly code and made an executable out of it that prints something. Let's say a poor exploit (not a good one, haha) wants to exploit something with a payload that prints "Hello World". How would one do this?
For this, we need to extract the instruction bytes from our executable. We can use objdump to do this.
Looking at the binary with objdump, we can see our assembly code and the instruction bytes in hex written alongside it. We are providing -M intel because we want the output in Intel assembly format.
objdump -d 1 -M intel
We all know computers only understand binary. However, displaying binary on screen is not practical, so tools show the same bytes in hex. These bytes are the machine code that the CPU decodes and executes.
Removing NULLs
We need to extract these bytes and use them in our C code! Simple? BUT WAIT!
Another fundamental we know is that a null byte can terminate an operation early, for example when the payload is handled as a C string. So we must remove these null bytes from our shellcode to prevent any mishap. Knowing exactly which instructions won't generate null bytes comes with practice. But certain tricks can be used in simple programs to achieve this.
For example, using "xor rax,rax" would assign rax=0 since XORing anything with itself gives 0.
So, we can do "xor rax,rax" and then "add rax,1" to make RAX equal to 1.
In our code, you'll observe that every mov instruction creates 0s. So, if we have to assign a value of "1", we can xor to make it 0 and then "add" 1. The "add" instruction simply adds the given value to the register mentioned.
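To see why the trick helps, we can count the null bytes in the two encodings. The sketch below is in C so it can be compiled anywhere; count_null_bytes is a made-up helper. The "mov" bytes are one encoding an assembler may choose for loading 1 (it stores the immediate as a full 32-bit value), so the exact bytes in your own objdump output may differ.

```c
#include <stddef.h>

/* Hypothetical helper: count the 0x00 bytes in a run of machine code. */
static size_t count_null_bytes(const unsigned char *code, size_t len)
{
    size_t nulls = 0;
    for (size_t i = 0; i < len; i++)
        if (code[i] == 0x00)
            nulls++;
    return nulls;
}

/* mov eax, 1 : the 32-bit immediate 0x00000001 is stored in full */
static const unsigned char mov_one[] = { 0xb8, 0x01, 0x00, 0x00, 0x00 };

/* xor rax, rax ; add rax, 1 : the immediate fits in a single byte */
static const unsigned char xor_add_one[] = {
    0x48, 0x31, 0xc0,       /* xor rax, rax */
    0x48, 0x83, 0xc0, 0x01  /* add rax, 1   */
};
```

The xor/add pair is two bytes longer, but it carries no null bytes, while the padding of the small immediate in the mov form is where the zeros come from.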
Following this trick we can rewrite our code so that each "mov" of a small constant becomes an "xor" followed by an "add".
Let's see if we still have 0s or not.
We can still observe some 0s in the movabs and mov instructions. We can use some tricks to reduce these 0s further.
This would still produce 0s near "mov rsi, message". We can reduce this by using "lea". The "lea" instruction loads the effective address of a memory operand into a register. This is also known as "memory referencing". We'll see the details in a future article on rel and memory referencing.
We can still see 2 null bytes there but for now, this is workable. We can use the "jmp call pop" technique to remove these as well. Let's talk about that in further articles.
This binary also works. Let's extract these bytes and make a shellcode out of them. We can copy them manually too (tiring!) but let's use some command-line fu for this:
objdump -d ./PROGRAM | grep -Po '\s\K[a-f0-9]{2}(?=\s)' | sed 's/^/\\x/g' | perl -pe 's/\r?\n//' | sed 's/$/\n/'
Shellcode: \x48\x31\xc0\x48\x83\xc0\x01\x48\x31\xff\x48\x83\xc7\x01\x48\x8d\x35\xeb\x0f\x00\x00\x48\x31\xd2\x48\x83\xc2\x0c\x0f\x05\x48\x31\xc0\x48\x83\xc0\x3c\x48\x31\xff\x0f\x05
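For reference, here are the same bytes written as a C array, with each group annotated with the instruction it encodes (decoded by hand from the bytes above). The helper first_null_offset is a made-up name; it shows where the first null byte sits, which is the point where anything that treats the payload as a C string would stop.

```c
#include <stddef.h>

/* The extracted bytes from above, annotated with the instructions they encode. */
static const unsigned char hello_shellcode[] = {
    0x48, 0x31, 0xc0,                         /* xor rax, rax          */
    0x48, 0x83, 0xc0, 0x01,                   /* add rax, 1   (write)  */
    0x48, 0x31, 0xff,                         /* xor rdi, rdi          */
    0x48, 0x83, 0xc7, 0x01,                   /* add rdi, 1   (stdout) */
    0x48, 0x8d, 0x35, 0xeb, 0x0f, 0x00, 0x00, /* lea rsi, [rip+0xfeb]  */
    0x48, 0x31, 0xd2,                         /* xor rdx, rdx          */
    0x48, 0x83, 0xc2, 0x0c,                   /* add rdx, 12  (length) */
    0x0f, 0x05,                               /* syscall               */
    0x48, 0x31, 0xc0,                         /* xor rax, rax          */
    0x48, 0x83, 0xc0, 0x3c,                   /* add rax, 60  (exit)   */
    0x48, 0x31, 0xff,                         /* xor rdi, rdi          */
    0x0f, 0x05                                /* syscall               */
};

/* Hypothetical helper: offset of the first 0x00 byte, or len if there is none. */
static size_t first_null_offset(const unsigned char *code, size_t len)
{
    size_t i = 0;
    while (i < len && code[i] != 0x00)
        i++;
    return i;
}
```

The array is 42 bytes long, but the first null byte is at offset 19, inside the displacement of the "lea". A function like strlen or strcpy would therefore see only the first 19 bytes, which is why the two remaining nulls still matter.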
Sample shellcode execution
The shellcode we just created cannot be executed from a C program as it is, because "Hello World" is fetched as static data from the original binary and is not part of the extracted bytes. To fix this, we will utilize another technique called JMP, CALL, POP. We will cover this in the next article. For this part, let's focus on executing a ready-made shellcode.
On sites like shell-storm.org, you will see that the assembly of a program is given along with the related shellcode. For example, here we see an assembly program written to execute "execve(/bin/sh)", which spawns a new shell using the Linux system call "execve".
The shellcode observed is: \x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05
To execute this shellcode, we need
to write a small C program. Here is a skeleton:
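#include <stdio.h>
#include <string.h>

/* Paste the shellcode bytes between the quotes. */
char code[] = "<shellcode>";

int main(void)
{
    /* strlen stops at the first null byte, so this is the usable length. */
    printf("len:%zu bytes\n", strlen(code));
    /* Treat the array as a function and jump into it. */
    (*(void(*)()) code)();
    return 0;
}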
So, the code becomes like so, and we have to compile it with the modern compiler protections turned off (with gcc, this typically means flags such as -fno-stack-protector and -z execstack). Also, note that we are using Ubuntu 14 to test our shellcode, since even with these protections disabled, modern systems may still block the execution of such shellcodes (due to memory permissions or ASLR issues), which we will tackle in future articles.
Now, we can run this binary and observe how it spawns a new shell!
Conclusion
In this article, we saw how we can write our own assembly programs using registers and Linux syscalls, make an executable, and then extract the instruction bytes using objdump. These instruction bytes can then be used as a payload in exploits. Such payloads traditionally spawn a shell, which is why they are called shellcode. We created our own shellcode which prints "Hello World" but we didn't execute it in the C program. The reason was that "Hello World" was static data in the program that couldn't be properly loaded into registers using the assembly we created. For this, we have to use a technique called JMP, CALL, POP and utilize the stack for it. We shall see this in the next article. Thanks for reading this part of the series.