2. Windows x64 Shellcode Development intro
#x64 #shellcode #golang #asm
Shellcode is a small piece of code written in assembly language that is used to perform a specific function in the context of a software exploit. The term "shellcode" comes from the idea that the code often opens a shell, providing an attacker with command-line access to a compromised system.
Shellcode is commonly associated with security exploits, especially in the field of cybersecurity and penetration testing. It is often injected into a vulnerable program's memory through various means, such as buffer overflows or other vulnerabilities, to take control of the program's execution flow.
The functionality of shellcode can vary widely, depending on the goals of the attacker. It might include actions like spawning a shell, downloading and executing malicious payloads, or performing other malicious activities.
It's important to note that while shellcode itself is not inherently malicious, it is commonly used as a component of exploits and attacks.
A common tool for generating shellcode is . Any payload generated from this tool is heavily signatured by AV/EDR vendors. Being able to write custom shellcode is a great addition to the arsenal of any offensive security professional.
Throughout this blog post I will be using Intel Syntax. I think it's much easier to read and write. It's also the default syntax for Windbg which is the debugger I will be using for testing my assembly code.
Intel syntax follows this convention:
Let's take a real example. The add instruction adds the value on the right to the value on the left.
With add rax, 1, if rax held the value 2 before the instruction ran, it will hold the value 3 after execution.
| Instruction | Description |
| --- | --- |
| mov rax,1; | Moves the value 1 (decimal) into rax. Add 0x in front of the number for hex values |
| mov rax, qword ptr [r8]; | Moves the qword stored at the address held in r8 into rax |
| add rax,1; | Adds 1 to rax |
| sub rax,1; | Subtracts 1 from rax |
| push rax; | Pushes the value of rax onto the stack |
| pop rax; | Pops the value at the top of the stack into rax |
| call rax; | Calls the function at the address stored in rax |
| jmp rax; | Jumps to the address stored in rax |
| xor rax,rax; | Logical xor of rax with itself, which zeros the contents of rax |
| int3; | Breakpoint |
The above instructions are the most commonly used when we are writing shellcode. A few others will be used but they will be explained as we walk through the actual code.
Let's take rax as an example.
| Register | Description |
| --- | --- |
| rax | 64-bit register |
| eax | 32-bit register (lower 32 bits of rax) |
| ax | 16-bit register (lower 16 bits of eax) |
| ah | 8-bit register (higher 8 bits of ax) |
| al | 8-bit register (lower 8 bits of ax) |
Let's take r8 as another example.
| Register | Description |
| --- | --- |
| r8 | 64-bit register |
| r8d | 32-bit register (lower 32 bits of r8) |
| r8w | 16-bit register (lower 16 bits of r8d) |
| r8b | 8-bit register (lower 8 bits of r8w) |
d - double word
w - word
b - byte
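To make the aliasing concrete, here is a minimal Go sketch that models the sub-registers as masks and shifts over a 64-bit value. The function name subRegisters and the demo value are made up for illustration; real registers are of course not Go variables.

```go
package main

import "fmt"

// subRegisters returns the narrower views of a 64-bit register value,
// mirroring how eax, ax, ah and al alias parts of rax.
// (Hypothetical helper, for illustration only.)
func subRegisters(rax uint64) (eax uint32, ax uint16, ah, al uint8) {
	eax = uint32(rax)    // lower 32 bits of rax
	ax = uint16(rax)     // lower 16 bits of eax
	ah = uint8(rax >> 8) // bits 8-15, the higher 8 bits of ax
	al = uint8(rax)      // lower 8 bits of ax
	return
}

func main() {
	rax := uint64(0x1122334455667788) // arbitrary demo value
	eax, ax, ah, al := subRegisters(rax)
	// prints: rax=0x1122334455667788 eax=0x55667788 ax=0x7788 ah=0x77 al=0x88
	fmt.Printf("rax=%#x eax=%#x ax=%#x ah=%#x al=%#x\n", rax, eax, ax, ah, al)
}
```

The same pattern applies to r8, r8d, r8w and r8b, except that r8 has no counterpart to ah.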
With everything now in place, let's take a quick look at the code we are trying to execute, written in a higher-level language.
The code above essentially has 4 lines that we would like to turn into assembly.
Line 12: Import kernel32.dll
Line 14: Get the process address of WinExec
Line 16: Get a pointer to the null-terminated string "calc.exe"
Line 18: Call the WinExec function, passing the pointer from line 16 as the first argument and 1 as the second.
As mentioned previously the first step of developing our shellcode is to find the base address of kernel32.dll in memory. Kernel32.dll is always loaded in the process memory on creation.
To find the address we have to perform the following tasks.
Let's walk through the assembly code in windbg to ensure we get the expected results. We start by adding a breakpoint (int3;) at the top of our shellcode.
Line 3: On 64-bit Windows, the gs register points to the Thread Environment Block (TEB), which contains information about the current thread.
In Windbg we can view the structure using the following command
We can see that the PEB is located at offset 0x60 from the beginning of the TEB. Once we step over the following instruction we should get the address of PEB in rax
A quick sanity check confirms that the value in the rax register matches the one from the TEB.
Line 5: We have the value of PEB in RAX and we now try to get the address of the PEB_LDR_DATA.
We can then use the following command to view the PEB Structure and identify the offset for LDR
As we can see from above the offset to Ldr is 0x18. Let's step over line 5 that has the following instruction to see if we get the address of PEB_LDR_DATA in the RSI register
Let's do a quick sanity check using windbg
Great, we now have the address of the struct in rsi
Line 6: We now have the PEB_LDR_DATA address in rsi and we want to load the value of InMemoryOrderModuleList into rsi. Let's view the struct in windbg once again to make sure we have the correct offset in our shellcode.
The offset seems to be correct in our code
Let's step over in our code to see if we get the right value in rsi.
Kernel32 comes after the current executable and ntdll. So moving forward twice should give us the _LDR_DATA_TABLE_ENTRY of kernel32. Let's confirm this.
First entry: ks.exe
Second entry: ntdll.dll
Third entry: kernel32.dll
Measured from the beginning of the structure, DllBase is at offset 0x30. The list pointer we hold does not point to the beginning of the structure but 0x10 bytes into it, so we only need to add 0x20 (0x30 - 0x10) to reach the DllBase value.
Let's confirm that after stepping over line 9 in our code the register r9 will hold the kernel32.dll base address.
Awesome. With the address of kernel32 in r9, we can now proceed to get the address of WinExec.
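The pointer chase described in this section can be modelled in a few lines of Go. This is only a simulation under stated assumptions: fakeMemory, findThirdModuleBase and every address in main are invented for the demo. The 0x60, 0x18, 0x30 and 0x10 offsets come from the text above. The 0x20 offset of InMemoryOrderModuleList inside PEB_LDR_DATA is the standard x64 value rather than a number quoted in the text.

```go
package main

import "fmt"

// Structure offsets for x64 Windows.
const (
	tebPebOffset        = 0x60 // TEB -> pointer to PEB
	pebLdrOffset        = 0x18 // PEB -> pointer to PEB_LDR_DATA
	ldrInMemOrderOffset = 0x20 // PEB_LDR_DATA -> InMemoryOrderModuleList (assumed standard value)
	entryLinksOffset    = 0x10 // where the list links sit inside _LDR_DATA_TABLE_ENTRY
	entryDllBaseOffset  = 0x30 // where DllBase sits inside _LDR_DATA_TABLE_ENTRY
)

// fakeMemory maps an address to the 8-byte value stored there (hypothetical model).
type fakeMemory map[uint64]uint64

// findThirdModuleBase follows TEB -> PEB -> PEB_LDR_DATA -> module list and
// returns the DllBase of the third list entry.
func findThirdModuleBase(mem fakeMemory, teb uint64) uint64 {
	peb := mem[teb+tebPebOffset]
	ldr := mem[peb+pebLdrOffset]
	link := mem[ldr+ldrInMemOrderOffset] // first entry (the executable)
	link = mem[link]                     // second entry (ntdll)
	link = mem[link]                     // third entry (kernel32)
	// link points 0x10 into the entry, so DllBase is 0x20 bytes further on.
	return mem[link+(entryDllBaseOffset-entryLinksOffset)]
}

func main() {
	// All addresses below are made up for the simulation.
	mem := fakeMemory{
		0x1000 + tebPebOffset:        0x2000, // TEB.ProcessEnvironmentBlock
		0x2000 + pebLdrOffset:        0x3000, // PEB.Ldr
		0x3000 + ldrInMemOrderOffset: 0x4010, // links of the first entry
		0x4010:                       0x5010, // first entry -> second entry
		0x5010:                       0x6010, // second entry -> third entry
		0x6010 + 0x20:                0x7FF800000000, // third entry DllBase
	}
	fmt.Printf("third module base: %#x\n", findThirdModuleBase(mem, 0x1000))
}
```

In the shellcode itself each of these map lookups corresponds to a single mov from memory.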
The next step in our shellcode is to create a function that walks through the export directory of any given dll (base address) and returns the absolute address of the requested function. Although in this example we will only call it once, in larger and more real world scenarios we will most likely have to call this function multiple times.
Let's break the code down into smaller pieces to understand exactly what's happening.
The parse_module function expects 2 values from the caller:
R9 -> should hold the base address of the dll
Line 2: The offset value to the beginning of the nt header is moved to ecx
PE-bear is an excellent tool that can be used to cross check if the values we see in windbg are indeed the right ones.
Let's step over the code in line 2 to make sure we are getting the correct result. The value we expect to see in ecx is E8.
Lines 3-7: What happens on these lines is basically the following calculation
NtHeader = DllBase + 0xE8
Export Directory = NtHeader + 0x88
A quick calculation (0xE8 + 0x88) shows that the export directory entry is at offset 0x170 from the base address. Let's check in PE-bear whether that offset points to the export directory.
We can see that the offset 0x170 points to the RVA of the export directory.
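The same arithmetic can be sketched in Go against a synthetic buffer. This is an illustration, not a PE parser: the buffer, the function name exportDirectoryRVA and the RVA value 0x99080 are invented. The 0x3C location of the NT header offset field is the standard DOS header layout rather than a number quoted in the text, and the 0x88 offset assumes a 64-bit (PE32+) image.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	eLfanewOffset      = 0x3C // DOS header field holding the NT header offset (assumed standard value)
	exportDirRVAOffset = 0x88 // export directory RVA, relative to the NT header (PE32+)
)

// exportDirectoryRVA returns the offset of the export directory RVA field
// and the RVA stored there, reading both the way the shellcode does.
func exportDirectoryRVA(image []byte) (fieldOffset, rva uint32) {
	ntHeader := binary.LittleEndian.Uint32(image[eLfanewOffset:])
	fieldOffset = ntHeader + exportDirRVAOffset
	rva = binary.LittleEndian.Uint32(image[fieldOffset:])
	return fieldOffset, rva
}

func main() {
	// Synthetic image: only the two fields we read are filled in.
	image := make([]byte, 0x200)
	binary.LittleEndian.PutUint32(image[eLfanewOffset:], 0xE8)
	binary.LittleEndian.PutUint32(image[0xE8+exportDirRVAOffset:], 0x99080) // made-up RVA

	off, rva := exportDirectoryRVA(image)
	// prints: field offset: 0x170, export RVA: 0x99080
	fmt.Printf("field offset: %#x, export RVA: %#x\n", off, rva)
}
```

Adding the dll base address to the returned RVA gives the absolute address of the export directory.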
Let's walk over the following instruction in windbg to ensure that r15 holds the RVA value we expect to see
We can now add r9 which holds the dllBase address to calculate the absolute address of the Export directory.
In windbg we can confirm that we are indeed pointing to the right location by viewing the first two double words.
We can see the values of Characteristics (00000000) and ReproChecksum (2e35230e).
The last 3 lines store the number of function names in ecx and the address of names in r14.
A quick look in PE-bear reveals that we have the right values in both ecx, and r14. We can see that the first value at r14 is the same as the first Name RVA below.
The functionality of this code is fairly simple.
Line 2: Checks whether ecx is 0 and, if so, jumps to line 10, which terminates the execution of our shellcode. An ecx of 0 means that our shellcode went through the whole export list without finding the requested function.
Line 3: Decrements ecx by 1 on every iteration.
Line 4: Zeros rsi.
Line 5: Moves the RVA of the current export name into esi. On the first iteration this is the last name in the list.
Line 6: Adds the base address to the RVA to get the absolute address in rsi.
The second time the loop reaches this point this is the output from Windbg
It matches the exported functions from PE-bear
Lines 1-4 : Zero rax & rdx and clear DF flag
The iteration code is where the hashing happens.
Line 6: lodsb loads the byte at the address held in rsi into the lowest byte of rax (al) and, with DF cleared, advances rsi to the next byte.
Lines 8-9: Check whether the byte is 0, which marks the end of the string, and if so jump to the next function.
Line 10: The x86-64 assembly instruction ror edx, 0x0d performs a "rotate right" operation on the contents of the edx register. In this case, the rotation is by 13 bits (0x0d in hexadecimal is 13 in decimal).
Imagine edx could only hold 4 bits. Here is an example of the ror effect after rotating right 1 bit.
Line 11: Adds eax to edx
Line 12: Loops to the next byte
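As a rough illustration of the rotate step, the Go sketch below rotates a toy 4-bit value right by one bit and then performs the 32-bit equivalent of ror edx, 0x0d. The helper ror4 is made up for the demo. bits.RotateLeft32 is from Go's standard library, and a negative count rotates right.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ror4 rotates the low 4 bits of v right by n positions, treating v as a
// toy 4-bit register. (Hypothetical helper, for illustration only.)
func ror4(v uint8, n uint) uint8 {
	n %= 4
	v &= 0xF
	return ((v >> n) | (v << (4 - n))) & 0xF
}

func main() {
	// The bit that falls off the right end re-enters on the left.
	// prints: 1001 -> 1100
	fmt.Printf("%04b -> %04b\n", 0b1001, ror4(0b1001, 1))

	// 32-bit equivalent of "ror edx, 0x0d" applied to the byte 'A' (0x41).
	// prints: 0x2080000
	fmt.Printf("%#x\n", bits.RotateLeft32(0x41, -13))
}
```

Unlike a plain shift, a rotate never discards bits, so every character of the name keeps influencing the final value.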
The last part of the code is where the actual comparison takes place with the provided hash. Our hash will be located in r8d.
Line 2: Compares calculated hash from the previous function with the one we provided
Line 3: If they are not equal it jumps back to our search_function loop to get the next entry.
Lines 4-10: Only execute if the provided and calculated hashes match.
Line 4: r15 holds the address of the export directory. The offset 0x24 points to AddressOfNameOrdinals.
Line 5: Adds the base address to the RVA to get the absolute address of AddressOfNameOrdinals.
Line 6: Adds the ordinal value of the function above the desired one in ecx
As we can see the ordinal value in ecx is pointing to the function WideCharToMultiByte
The ordinal value of WinExec is 639.
Lines 7-8: Point to the addresses of functions. That's the value we need to call the function.
Line 9: Gets the RVA of the address of function for WinExec in eax
Comparing with the previous screenshot we can see that it's a match
Line 10: We add the base address and we have the function in the rax register ready to be called as needed.
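The lookup described above (name, then ordinal, then function RVA, then absolute address) can be modelled with three parallel slices. This is a simplified sketch, not the shellcode: exportTable, resolve and all RVAs and the base address are hypothetical, names are stored as Go strings instead of RVAs, and ror13Hash is my reading of the hashing loop described earlier.

```go
package main

import (
	"fmt"
	"math/bits"
)

// exportTable is a simplified stand-in for the three arrays referenced by
// the export directory.
type exportTable struct {
	names     []string // AddressOfNames (strings instead of RVAs)
	ordinals  []uint16 // AddressOfNameOrdinals, indexed like names
	functions []uint32 // AddressOfFunctions (RVAs), indexed by ordinal
}

// ror13Hash rotates the running hash right by 13 bits and adds each byte.
func ror13Hash(name string) uint32 {
	var h uint32
	for i := 0; i < len(name); i++ {
		h = bits.RotateLeft32(h, -13) + uint32(name[i])
	}
	return h
}

// resolve walks the names from last to first, as the loop in the text does,
// and returns base + RVA for the first name whose hash matches.
func resolve(base uint64, t exportTable, want uint32) (uint64, bool) {
	for i := len(t.names) - 1; i >= 0; i-- {
		if ror13Hash(t.names[i]) != want {
			continue // hashes differ, try the next entry
		}
		ordinal := t.ordinals[i] // same index, ordinal table
		return base + uint64(t.functions[ordinal]), true
	}
	return 0, false
}

func main() {
	// Made-up table contents and base address.
	t := exportTable{
		names:     []string{"AcquireSRWLockExclusive", "WideCharToMultiByte", "WinExec"},
		ordinals:  []uint16{0, 2, 1},
		functions: []uint32{0x1000, 0x3000, 0x2000},
	}
	addr, ok := resolve(0x7FF800000000, t, ror13Hash("WinExec"))
	// prints: found=true addr=0x7ff800003000
	fmt.Printf("found=%v addr=%#x\n", ok, addr)
}
```

The point of the indirection is that the name index and the function index are not the same, which is why the ordinal table sits between them.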
If you made it to this point, you are probably wondering how you can calculate the hash and provide it to the assembly code.
The following code will calculate and print the hash for us:
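A minimal Go sketch of such a calculator is shown here, assuming the hash is built exactly as the loop above describes: rotate the running value right by 13 bits, then add the next byte, stopping before the null terminator. The function name ror13Hash is my own, and the original tool may differ in details.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ror13Hash mirrors the described loop: for every byte of the name,
// rotate the 32-bit hash right by 13 bits and add the byte.
// The terminating null byte is not part of the hash.
func ror13Hash(name string) uint32 {
	var hash uint32
	for i := 0; i < len(name); i++ {
		hash = bits.RotateLeft32(hash, -13) // negative count = rotate right
		hash += uint32(name[i])
	}
	return hash
}

func main() {
	name := "WinExec"
	// prints: WinExec: 0x0e8afe98
	fmt.Printf("%s: 0x%08x\n", name, ror13Hash(name))
}
```

Because the hash depends on every byte, function names are case sensitive here: "winexec" produces a different value than "WinExec".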
We now reach the end of our code.
We then feed the value to r8d.
Line 3: We call the parse_module function. If everything went well, rax will hold the address of the function.
Great, we now only have to pass the arguments to the function.
This is a good place to pause and take a quick look at the x64 calling convention. When calling a function on x64 Windows, the first four arguments go in the registers rcx, rdx, r8 and r9, and all the rest go on the stack from right to left. So the last argument is pushed to the stack first, and so on.
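As a small illustration of that rule, the Go sketch below maps argument positions to where they are passed. It only covers integer and pointer arguments, and the function name argLocations and the "stack slot" labels are made up for the demo.

```go
package main

import "fmt"

// Registers used for the first four integer/pointer arguments on x64 Windows.
var argRegisters = []string{"rcx", "rdx", "r8", "r9"}

// argLocations returns, for n integer or pointer arguments, where each one
// is passed: a register for the first four, a stack slot for the rest.
// (Hypothetical helper, for illustration only.)
func argLocations(n int) []string {
	locs := make([]string, n)
	for i := 0; i < n; i++ {
		if i < len(argRegisters) {
			locs[i] = argRegisters[i]
		} else {
			locs[i] = fmt.Sprintf("stack slot %d", i-len(argRegisters))
		}
	}
	return locs
}

func main() {
	// WinExec takes two arguments, so both fit in registers.
	fmt.Println(argLocations(2)) // prints: [rcx rdx]
	// A six-argument call spills the last two onto the stack.
	fmt.Println(argLocations(6))
}
```

For WinExec this means no stack arguments at all: the string pointer goes in rcx and the second argument in rdx.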
With this knowledge let's pass the arguments to WinExec.
The WinExec definition from Microsoft states that the first argument should be a pointer to the null-terminated string.
Lines 4-8:
Line 4: Zeros rcx.
Line 5: Pushes 0 to the stack. This acts as the null terminator.
Line 6: Moves the hex values of calc.exe into rcx.
calc.exe = 63 61 6C 63 2E 65 78 65 + 00
Line 7: Pushes the string to the stack.
Line 8: Gets a pointer to the string in the rcx register (first argument).
Lines 9-10: Zero rdx and increment it by 1 (second argument).
Line 11: Reserves argument storage space (shadow space) and aligns the stack.
Line 12: Finally calling the function.
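As an alternative to converting the string by hand, the 64-bit immediate for line 6 can be computed with a short Go sketch. It assumes the string is at most 8 bytes and is packed little-endian, so that pushing the register leaves the bytes in memory in reading order. The function name packString8 is my own.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packString8 packs a string of at most 8 bytes into a 64-bit value whose
// little-endian memory layout spells the string in order.
// (Hypothetical helper, for illustration only.)
func packString8(s string) (uint64, error) {
	if len(s) > 8 {
		return 0, fmt.Errorf("%q is longer than 8 bytes", s)
	}
	var buf [8]byte // unused trailing bytes stay zero
	copy(buf[:], s)
	return binary.LittleEndian.Uint64(buf[:]), nil
}

func main() {
	v, err := packString8("calc.exe")
	if err != nil {
		panic(err)
	}
	// prints: 0x6578652e636c6163
	fmt.Printf("%#x\n", v)
}
```

Read right to left, the printed value is the byte sequence 63 61 6C 63 2E 65 78 65. Because "calc.exe" fills all 8 bytes, the null terminator has to come from the separate push of 0 on line 5.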
Just before calling the function this is what we see in rcx,rdx which is exactly what we expect.
Stepping over the function should launch a calc.exe process
The whole list of registers can be found on . Let's have a quick look on how some of the registers will be used within the shellcode.
A good template I found online when I was looking for one can be found .
I am not a big fan of python so I ported the above script in go. Also to make it easy for development and debugging I will include the and automatically launch and attach Windbg Preview.
The ported code can be downloaded from my github page . If you are more familiar with python you can use the original template from exploitdb and follow along. Only caveat is that you will have to write your own shellcode runner to execute the code.
Get the address of the structure from
From PEB we can get the
And from PEB_LDR_DATA we can get
R8d -> should have the hash of the function (more on this )
The shellcode author in this case came up with a smart algorithm that generates a hash based on the Function name. It then compares the generated hash with the hash we provide it. The caveat of that is that we have to write a to calculate that hash for us.
Referring back to the we need to get a pointer to a null terminated string, in this example a pointer to 'calc.exe' and then call the function.
Line 2: we can use the to calculate the function hash:
A great source of information is .
To convert ascii to hex I am using this online converter
The whole shellcode template can be found .