1. Keystone Engine

#keystone-engine #assembler #golang-bindings

Keystone Engine

The official website describes Keystone as follows:

Keystone is a lightweight multi-platform, multi-architecture assembler framework.

Highlight features:

  • Multi-architecture, with support for Arm, Arm64 (AArch64/Armv8), Ethereum Virtual Machine, Hexagon, Mips, PowerPC, Sparc, SystemZ, & X86 (include 16/32/64bit).

  • Clean/simple/lightweight/intuitive architecture-neutral API.

  • Implemented in C/C++ languages, with bindings for Java, Masm, Visual Basic, C#, PowerShell, Perl, Python, NodeJS, Ruby, Go, Rust, Haskell & OCaml available.

  • Native support for Windows & *nix (with Mac OSX, Linux, *BSD & Solaris confirmed).

  • Thread-safe by design.

  • Open source.

Keystone is based on LLVM, but it goes much further with a lot more to offer. Find in this Blackhat USA 2016 slides more technical details behind our assembler engine.

Keystone with GoLang

A package with Go bindings is provided here, but unfortunately it requires cgo to compile. I love both Go and C, but cgo is appalling in my opinion. Thankfully, Keystone ships a precompiled DLL that we can load directly. The caveat is that this approach only works on Windows.

If you need to run Keystone on Linux it's probably easier to use Python instead (or cgo).

Documentation

There is no real documentation for the framework. A few examples can be found here and some on their GitHub page.

What I found useful is to download the Windows-Core Engine from here (it includes the precompiled DLL).

Make sure to download the DLL that matches the architecture of your Go build (a 64-bit process cannot load a 32-bit DLL, and vice versa).

The keystone.h file in the 'includes' folder describes the exported functions and how the framework should be used. A sample from the header is shown below:

/*
 Assemble a string given its the buffer, size, start address and number
 of instructions to be decoded.
 This API dynamically allocate memory to contain assembled instruction.
 Resulted array of bytes containing the machine code  is put into @*encoding

 NOTE 1: this API will automatically determine memory needed to contain
 output bytes in *encoding.

 NOTE 2: caller must free the allocated memory itself to avoid memory leaking.

 @ks: handle returned by ks_open()
 @str: NULL-terminated assembly string. Use ; or \n to separate statements.
 @address: address of the first assembly instruction, or 0 to ignore.
 @encoding: array of bytes containing encoding of input assembly string.
	   NOTE: *encoding will be allocated by this function, and should be freed
	   with ks_free() function.
 @encoding_size: size of *encoding
 @stat_count: number of statements successfully processed

 @return: 0 on success, or -1 on failure.

 On failure, call ks_errno() for error code.
*/
KEYSTONE_EXPORT
int ks_asm(ks_engine *ks,
        const char *string,
        uint64_t address,
        unsigned char **encoding, size_t *encoding_size,
        size_t *stat_count);

Golang Implementation

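Up until now I have only needed to develop 32-bit and 64-bit shellcode for the x86 architecture, so I will not bother adding extra constants for Arm and the other architectures.

I am also not planning to use this as part of a large project, so I will not implement functions such as ks_free(), which frees the buffer allocated by ks_asm. The buffer is therefore leaked, which is acceptable for a short-lived tool but not for a long-running process.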

Keystone functions

The following functions will be implemented:

  • ks_open (creates a new instance of keystone)

  • ks_asm (receives the assembly string and returns the equivalent machine code bytes)

Let's dive into it.

Code

First we need to download the DLL from here and place it in our current working directory. We then load the DLL using LoadLibrary from the golang.org/x/sys/windows package.


	fmt.Println("[+] Loading keystone.dll")
	hModule, err := windows.LoadLibrary("keystone.dll")
	if err != nil {
		// Wrap the underlying error so the caller can see why loading failed.
		return []byte{}, fmt.Errorf("failed to load keystone.dll: %w", err)
	}

If the DLL is in a different directory, make sure to pass its absolute path.

As mentioned previously, of the exported functions we will only use ks_open and ks_asm. Using GetProcAddress from the windows package we can get the addresses of both functions.

	fmt.Println("[+] Getting function addresses")
	ks_open_proc, err := windows.GetProcAddress(hModule, "ks_open")
	if err != nil {
		return []byte{}, fmt.Errorf("failed to get address for ks_open: %w", err)
	}
	ks_asm_proc, err := windows.GetProcAddress(hModule, "ks_asm")
	if err != nil {
		return []byte{}, fmt.Errorf("failed to get address for ks_asm: %w", err)
	}

ks_open() function

/*
 Create new instance of Keystone engine.

 @arch: architecture type (KS_ARCH_*)
 @mode: hardware mode. This is combined of KS_MODE_*
 @ks: pointer to ks_engine, which will be updated at return time

 @return KS_ERR_OK on success, or other value on failure (refer to ks_err enum
   for detailed error).
*/
KEYSTONE_EXPORT
ks_err ks_open(ks_arch arch, int mode, ks_engine **ks);

As we can see from the above definition in the Keystone header, we need to specify the architecture and the mode, and provide a pointer to the location where our session handle will be stored.

The architecture constants are defined below:

// Architecture type
typedef enum ks_arch {
    KS_ARCH_ARM = 1,    // ARM architecture (including Thumb, Thumb-2)
    KS_ARCH_ARM64,      // ARM-64, also called AArch64
    KS_ARCH_MIPS,       // Mips architecture
    KS_ARCH_X86,        // X86 architecture (including x86 & x86-64)
    KS_ARCH_PPC,        // PowerPC architecture (currently unsupported)
    KS_ARCH_SPARC,      // Sparc architecture
    KS_ARCH_SYSTEMZ,    // SystemZ architecture (S390X)
    KS_ARCH_HEXAGON,    // Hexagon architecture
    KS_ARCH_EVM,        // Ethereum Virtual Machine architecture
    KS_ARCH_MAX,
} ks_arch;

And the modes:

// Mode type
typedef enum ks_mode {
    KS_MODE_LITTLE_ENDIAN = 0,    // little-endian mode (default mode)
    KS_MODE_BIG_ENDIAN = 1 << 30, // big-endian mode
    // arm / arm64
    KS_MODE_ARM = 1 << 0,              // ARM mode
    KS_MODE_THUMB = 1 << 4,       // THUMB mode (including Thumb-2)
    KS_MODE_V8 = 1 << 6,          // ARMv8 A32 encodings for ARM
    // mips
    KS_MODE_MICRO = 1 << 4,       // MicroMips mode
    KS_MODE_MIPS3 = 1 << 5,       // Mips III ISA
    KS_MODE_MIPS32R6 = 1 << 6,    // Mips32r6 ISA
    KS_MODE_MIPS32 = 1 << 2,      // Mips32 ISA
    KS_MODE_MIPS64 = 1 << 3,      // Mips64 ISA
    // x86 / x64
    KS_MODE_16 = 1 << 1,          // 16-bit mode
    KS_MODE_32 = 1 << 2,          // 32-bit mode
    KS_MODE_64 = 1 << 3,          // 64-bit mode
    // ppc 
    KS_MODE_PPC32 = 1 << 2,       // 32-bit mode
    KS_MODE_PPC64 = 1 << 3,       // 64-bit mode
    KS_MODE_QPX = 1 << 4,         // Quad Processing eXtensions mode
    // sparc
    KS_MODE_SPARC32 = 1 << 2,     // 32-bit mode
    KS_MODE_SPARC64 = 1 << 3,     // 64-bit mode
    KS_MODE_V9 = 1 << 4,          // SparcV9 mode
} ks_mode;

From the definitions above we will only implement the following:

const (
	MODE_32 = 4
	MODE_64 = 8

	ARCH_X86 = 4
)
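The literal values above can be cross-checked against the header. The following is a minimal sketch, assuming we mirror the relevant part of the C enums with iota and bit shifts. The names archX86, mode32 and so on are illustrative and do not come from any existing Keystone Go package.

```go
package main

import "fmt"

// Mirrors the start of ks_arch from keystone.h; the C enum starts at 1.
// These names are illustrative, not taken from an existing Go binding.
const (
	archARM   = iota + 1 // KS_ARCH_ARM = 1
	archARM64            // KS_ARCH_ARM64
	archMIPS             // KS_ARCH_MIPS
	archX86              // KS_ARCH_X86
)

// Mirrors the x86 entries of ks_mode from keystone.h.
const (
	mode16 = 1 << 1 // KS_MODE_16
	mode32 = 1 << 2 // KS_MODE_32
	mode64 = 1 << 3 // KS_MODE_64
)

func main() {
	fmt.Println(archX86, mode32, mode64) // prints "4 4 8"
}
```

Writing the constants this way keeps them visibly tied to the header, at the cost of a few extra lines.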

With everything required in place we can go ahead and call the function. If 32-bit shellcode is required we should change MODE_64 to MODE_32.

fmt.Println("[+] Running ks_open_proc")
var ksSession uintptr
// ks_open returns a ks_err (a C int), so only the low 32 bits of r1 are meaningful.
r1, _, _ := syscall.SyscallN(ks_open_proc, uintptr(ARCH_X86), uintptr(MODE_64), uintptr(unsafe.Pointer(&ksSession)))
if int32(r1) != 0 {
	return []byte{}, fmt.Errorf("ks_open failed with ks_err %d", int32(r1))
}
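Rather than editing the call by hand, the mode could be picked from a parameter. The following is only a sketch: modeForBits is a hypothetical helper that is not part of the original code, and it reuses the MODE_32 and MODE_64 values defined earlier.

```go
package main

import "fmt"

// Values of KS_MODE_32 and KS_MODE_64 from keystone.h.
const (
	MODE_32 = 4
	MODE_64 = 8
)

// modeForBits is a hypothetical helper: it maps a target bitness to the
// matching Keystone x86 mode constant and rejects anything else.
func modeForBits(bits int) (uintptr, error) {
	switch bits {
	case 32:
		return MODE_32, nil
	case 64:
		return MODE_64, nil
	default:
		return 0, fmt.Errorf("unsupported x86 bitness: %d", bits)
	}
}

func main() {
	mode, err := modeForBits(32)
	fmt.Println(mode, err) // prints "4 <nil>"
}
```

The returned value can be passed to ks_open in place of uintptr(MODE_64).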

ks_asm() function

/*
 Assemble a string given its the buffer, size, start address and number
 of instructions to be decoded.
 This API dynamically allocate memory to contain assembled instruction.
 Resulted array of bytes containing the machine code  is put into @*encoding

 NOTE 1: this API will automatically determine memory needed to contain
 output bytes in *encoding.

 NOTE 2: caller must free the allocated memory itself to avoid memory leaking.

 @ks: handle returned by ks_open()
 @str: NULL-terminated assembly string. Use ; or \n to separate statements.
 @address: address of the first assembly instruction, or 0 to ignore.
 @encoding: array of bytes containing encoding of input assembly string.
	   NOTE: *encoding will be allocated by this function, and should be freed
	   with ks_free() function.
 @encoding_size: size of *encoding
 @stat_count: number of statements successfully processed

 @return: 0 on success, or -1 on failure.

 On failure, call ks_errno() for error code.
*/
KEYSTONE_EXPORT
int ks_asm(ks_engine *ks,
        const char *string,
        uint64_t address,
        unsigned char **encoding, size_t *encoding_size,
        size_t *stat_count);

From the above definition we will require the following:

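  • ks: the handle returned by ks_open, stored in the variable ksSession.

  • string: a pointer to our null-terminated assembly string.

  • address: can be ignored, so it will be set to 0. Note that it is a uint64_t, so passing a single 0 only lines up with the C signature in a 64-bit build.

  • encoding: a pointer that receives the address of the output buffer.

  • encoding_size: a pointer that receives the size of the output buffer.

  • stat_count: a pointer that receives the number of statements successfully processed.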

To get a pointer to a null-terminated string, the following code can be used.

ptr, err := syscall.BytePtrFromString(asm)
if err != nil {
    // BytePtrFromString fails if the string contains a NUL byte.
    return []byte{}, fmt.Errorf("failed to get byte ptr from string: %w", err)
}

We now have everything we need to call the ks_asm function:

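	// bytearray receives the address of the buffer allocated by Keystone,
	// size its length in bytes and count the number of statements processed.
	var bytearray, size, count uintptr

	r1, _, _ = syscall.SyscallN(ks_asm_proc,
		ksSession,
		uintptr(unsafe.Pointer(ptr)),
		0, // address: ignored
		uintptr(unsafe.Pointer(&bytearray)),
		uintptr(unsafe.Pointer(&size)),
		uintptr(unsafe.Pointer(&count)),
	)
	// ks_asm returns a C int (0 on success, -1 on failure).
	if int32(r1) != 0 {
		return []byte{}, fmt.Errorf("ks_asm failed")
	}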
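SyscallN hands back the raw return register as a uintptr, while ks_asm is declared to return a C int (0 or -1). The sketch below shows one way to narrow that value before comparing it. The cInt helper is hypothetical and assumes a 32-bit int, which is the case on Windows.

```go
package main

import "fmt"

// cInt reinterprets the low 32 bits of a raw return value as a C int.
// Hypothetical helper; it assumes the callee returns a 32-bit int.
func cInt(r uintptr) int32 {
	return int32(uint32(r))
}

func main() {
	fmt.Println(cInt(0))          // prints 0
	fmt.Println(cInt(0xFFFFFFFF)) // prints -1
}
```

Comparing the raw uintptr against 0 usually works as well, but narrowing first avoids depending on the upper half of the register.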

Bytes in a slice

We then copy the contents of the buffer allocated by Keystone into a Go byte slice using the following code.

fmt.Println("[+] Copying bytes from memory to byte slice")
bytes := make([]byte, size)
copy(bytes, (*[1 << 30]byte)(unsafe.Pointer(bytearray))[:size])
return bytes, nil

Test code

Let's test our code to ensure we get the expected results.

func main() {
	asmString := "nop; nop; inc rax;"
	bytes, err := GenerateShellcode(asmString)
	if err != nil {
		log.Fatalln(err)
	}
	for _, bt := range bytes {
		fmt.Printf("0x%x ", bt)
	}
}
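If the bytes need to be pasted into another tool, a C-style escaped string is often more convenient than space-separated values. The following is a small sketch. The cEscaped helper is hypothetical and not part of the original code.

```go
package main

import (
	"fmt"
	"strings"
)

// cEscaped renders bytes as a C-style escaped string, e.g. \x90\x90.
// Hypothetical helper for pasting shellcode into other tools.
func cEscaped(b []byte) string {
	var sb strings.Builder
	for _, v := range b {
		fmt.Fprintf(&sb, "\\x%02x", v)
	}
	return sb.String()
}

func main() {
	fmt.Println(cEscaped([]byte{0x90, 0x0f, 0x05})) // prints \x90\x0f\x05
}
```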

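For this sequence of instructions we should expect the following output, since nop encodes to 0x90 and inc rax encodes to 48 ff c0 in 64-bit mode:

0x90 0x90 0x48 0xff 0xc0

The output of our program matches the expected result. Great :)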
