Coffee: A COFF loader made in Rust

Abstract

COFF stands for Common Object File Format. It is the file format that compilers emit after the code-generation stage, and an object file typically contains only the machine code generated from the corresponding source code, without its external dependencies linked in. beacon_inline_execute is a custom Windows COFF loader made by Cobalt Strike. Its main goal is to load a BOF (Beacon Object File) in memory and handle two kinds of functions: custom internal functions that are implemented in the loader's code, and external functions, which in this case are Windows API calls.

Coffee works by parsing the object file, allocating the memory needed for execution, and calling the entry point inside our loader's process. The code then runs and its output is written to the terminal. BOFs, which are just object files, are known for their ability to stay hidden in red-team operations: because the code is executed dynamically in memory, it is less exposed to static file-signature detection by design. Other advantages include the ability to add custom functionality by writing new internal functions, and the small size of the compiled result, which helps in post-exploitation.

Coffee's GitHub repository can be found at: https://github.com/hakaioffsec/coffee

Why Rust

Rust is relatively new compared to C and C++, but it brings a lot with it: type safety, the ability to work at both low and high levels, and fast execution. It is useful not only for offensive tooling but for software development in general. It also comes with a great dependency system, including good crates for offensive development on Windows such as windows-sys, windows, and winapi. winapi provides raw FFI bindings for calling Windows API functions and feels more like C programming, while windows and windows-sys provide auto-generated bindings, with windows offering the more idiomatic Rust interface.

The compiler

Rust is not only productive; its compilation system is excellent too. It can target a wide range of platforms, including WebAssembly. Behind the scenes, the rustc compiler uses LLVM for code generation. From the Overview of the compiler page:

"We then begin what is vaguely called code generation or codegen. The code generation stage is when higher level representations of source are turned into an executable binary. rustc uses LLVM for code generation. The first step is to convert the MIR to LLVM Intermediate Representation (LLVM IR). This is where the MIR is actually monomorphized, according to the list we created in the previous step. The LLVM IR is passed to LLVM, which does a lot more optimizations on it. It then emits machine code. It is basically assembly code with additional low-level types and annotations added (e.g. an ELF object or WASM). The different libraries/binaries are then linked together to produce the final binary."

What's important to note about the code generation stage is that monomorphization takes place: each generic function is compiled into a separate specialized copy for every concrete type it is used with. Besides enabling more optimization, this changes the static layout and execution flow of the functions in our binary, which can make static signature scanning harder, although it shouldn't be considered a "bypass" in and of itself.
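As a small illustration of monomorphization, the sketch below uses a hypothetical generic function, byte_width, which is not part of Coffee. Each concrete type it is called with gets its own compiled copy.

```rust
use std::mem::size_of;

// Hypothetical generic helper: the compiler emits one specialized copy of
// this function for every concrete `T` it is called with.
fn byte_width<T>(_value: T) -> usize {
    size_of::<T>()
}

// The two calls below resolve at compile time to two separate functions,
// roughly `byte_width::<u8>` and `byte_width::<u64>`.
fn widths() -> (usize, usize) {
    (byte_width(0u8), byte_width(0u64))
}
```

Calling widths() returns the sizes of u8 and u64 in bytes, each computed by a different specialized copy of byte_width.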

The COFF format

COFF stands for Common Object File Format. It is a format for executable code, object code, and shared library code; our main target here is object code. COFF was initially created for Unix systems, but it was adopted by Microsoft and has since been used in the PE (Portable Executable) format on Windows. Most of our code is based on Microsoft's documentation of the PE and COFF formats, which is listed on the References page.

Writing a COFF loader in Rust

Our first idea for this project was to port the original trustedsec/COFFLoader project from C to Rust. The first real challenge was the syntax: memory pointers, references, casting, and types work quite differently in Rust than in C or C++. Another issue is that the original code is a bit confusing, and understanding what is happening can be challenging, especially when porting it to Rust.

After porting some parts of the original COFFLoader, we realized it would be much better to write our own from scratch in Rust. Things work a little differently in Rust, and by not porting the whole project we could write cleaner, easier-to-read code.

The flow-of-execution of our COFF loader goes as follows:

Diagram of the flow of execution of the COFF loader

Knowing the flow of execution, we'll run the demo code shown below:

#include "beacon.h"

void go(char *args, int length)
{
    BeaconOutput(CALLBACK_OUTPUT, "Hello World from BeaconOutput", 30);
    BeaconPrintf(CALLBACK_OUTPUT, "Hello World %s", "from BeaconPrintf");
}
Our demo code written in C; beacon.h provides the compatibility layer imports

We then get the following output:

Coffee executing the compiled BOF built from our demo.c program

As you can see, most of the work is parsing the COFF file header, getting the right values, and allocating and writing them correctly so that the relocated addresses and transmuted functions can be executed in memory.

First, we check whether the loaded BOF is x86_64 or x86, because we can't load an x86_64 COFF into a running x86 process and vice versa. After that, we walk through the sections, allocate memory for each one, and copy the section's raw data into the allocated space.
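The architecture check and section copy can be sketched roughly as follows. This is a minimal illustration, not Coffee's actual code: the names machine_matches_host, load_sections, and Section are hypothetical, and a plain Vec<u8> stands in for the executable memory a real loader would allocate. The field offsets follow the COFF file header (20 bytes) and section header (40 bytes) layouts in Microsoft's PE format documentation.

```rust
const IMAGE_FILE_MACHINE_I386: u16 = 0x014c;
const IMAGE_FILE_MACHINE_AMD64: u16 = 0x8664;
const FILE_HEADER_SIZE: usize = 20;
const SECTION_HEADER_SIZE: usize = 40;

// Hypothetical in-memory copy of one section (a real loader would use
// executable memory instead of a Vec).
#[derive(Debug, PartialEq)]
struct Section {
    name: String,
    data: Vec<u8>,
}

fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

// True when the Machine field (offset 0 of the file header) matches the
// architecture this loader was compiled for.
fn machine_matches_host(machine: u16) -> bool {
    match machine {
        IMAGE_FILE_MACHINE_AMD64 => cfg!(target_arch = "x86_64"),
        IMAGE_FILE_MACHINE_I386 => cfg!(target_arch = "x86"),
        _ => false,
    }
}

// Walk the section table and copy each section's raw data.
// Returns None if any header or data range falls outside the file.
fn load_sections(coff: &[u8]) -> Option<Vec<Section>> {
    let number_of_sections = read_u16(coff, 2)? as usize;
    let optional_header_size = read_u16(coff, 16)? as usize;
    let table = FILE_HEADER_SIZE + optional_header_size;

    let mut sections = Vec::with_capacity(number_of_sections);
    for i in 0..number_of_sections {
        let hdr = table + i * SECTION_HEADER_SIZE;
        let name_bytes = coff.get(hdr..hdr + 8)?;
        let name = String::from_utf8_lossy(name_bytes)
            .trim_end_matches('\0')
            .to_string();
        let size = read_u32(coff, hdr + 16)? as usize; // SizeOfRawData
        let ptr = read_u32(coff, hdr + 20)? as usize; // PointerToRawData

        // Sections without raw data in the file (such as .bss) are zero-filled.
        let data = if ptr == 0 {
            vec![0u8; size]
        } else {
            coff.get(ptr..ptr.checked_add(size)?)?.to_vec()
        };
        sections.push(Section { name, data });
    }
    Some(sections)
}
```

A caller would read the Machine field with read_u16(bytes, 0), reject the file if machine_matches_host returns false, and only then call load_sections.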

After that, we handle relocations. This is where the relocatable code is re-mapped into our own memory by reading the file's relocation entries. It gets a little tricky, because we need the right addresses and their respective sizes; all relocation types are described in Microsoft's documentation. More in-depth information about the relocation process can be found in OtterHacker's CoffLoader post but, basically, we just need to write each resolved address into the address space we allocated beforehand.
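To make the relocation step more concrete, here is a minimal sketch that patches a copied section for two of the x86_64 relocation types, IMAGE_REL_AMD64_ADDR64 and IMAGE_REL_AMD64_REL32. The function names and the error type are assumptions for illustration, not Coffee's API, and a real loader has to cover the remaining types listed in Microsoft's documentation.

```rust
const IMAGE_REL_AMD64_ADDR64: u16 = 0x0001;
const IMAGE_REL_AMD64_REL32: u16 = 0x0004;

// Borrow the `width` bytes to patch, failing if they fall outside the section.
fn site(section: &mut [u8], offset: usize, width: usize) -> Result<&mut [u8], String> {
    let end = offset.checked_add(width).ok_or("relocation offset overflows")?;
    section
        .get_mut(offset..end)
        .ok_or_else(|| "relocation outside section".to_string())
}

// Hypothetical helper: patch one relocation in `section`.
// `section_base` is the address the section was loaded at, `offset` is the
// relocation's VirtualAddress, `target` is the resolved symbol address.
fn apply_relocation(
    section: &mut [u8],
    section_base: u64,
    offset: usize,
    kind: u16,
    target: u64,
) -> Result<(), String> {
    match kind {
        IMAGE_REL_AMD64_ADDR64 => {
            // Absolute 64-bit address: target plus the addend stored in place.
            let slot = site(section, offset, 8)?;
            let mut raw = [0u8; 8];
            raw.copy_from_slice(slot);
            let value = target.wrapping_add(u64::from_le_bytes(raw));
            slot.copy_from_slice(&value.to_le_bytes());
        }
        IMAGE_REL_AMD64_REL32 => {
            // 32-bit displacement relative to the byte after the patched field.
            let slot = site(section, offset, 4)?;
            let mut raw = [0u8; 4];
            raw.copy_from_slice(slot);
            let addend = i32::from_le_bytes(raw) as i64;
            let next = section_base.wrapping_add(offset as u64 + 4);
            let delta = (target.wrapping_sub(next) as i64).wrapping_add(addend);
            let rel = i32::try_from(delta).map_err(|_| "target out of 32-bit range")?;
            slot.copy_from_slice(&rel.to_le_bytes());
        }
        other => return Err(format!("unsupported relocation type {:#06x}", other)),
    }
    Ok(())
}
```

The range check in the REL32 branch reflects a practical constraint: a 32-bit relative displacement can only reach targets within about 2 GiB of the patched instruction.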

At this point it is important that we have already resolved the import address for the symbol being processed. The get_import_from_symbol function checks whether the symbol's raw name refers to an internal function or an external (Windows API) function. Depending on which, it returns either a pointer to the extern "C" Rust function or the address obtained through LoadLibrary and GetProcAddress. It's worth noting that other ways of resolving the procedure address can be implemented here as well, such as a custom GetProcAddress.
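The internal-versus-external decision can be illustrated with a small name classifier. This is a sketch under assumptions, not the body of get_import_from_symbol: it assumes the common x86_64 BOF naming convention, where imports appear as __imp_LIBRARY$Function for Windows API calls and __imp_Name for internal functions. The Import enum and classify_import are hypothetical names, and the x86 variant with its extra underscore and @N suffix is not handled.

```rust
// Hypothetical result of classifying an import symbol name.
#[derive(Debug, PartialEq)]
enum Import<'a> {
    // Implemented by the loader itself, e.g. BeaconPrintf.
    Internal(&'a str),
    // To be resolved from a DLL, e.g. via LoadLibrary and GetProcAddress.
    External { library: &'a str, function: &'a str },
}

// Returns None when the symbol is not an import or is malformed.
fn classify_import(symbol: &str) -> Option<Import<'_>> {
    let name = symbol.strip_prefix("__imp_")?;
    match name.split_once('$') {
        Some((library, function)) if !library.is_empty() && !function.is_empty() => {
            Some(Import::External { library, function })
        }
        // A '$' with an empty library or function part is malformed.
        Some(_) => None,
        None if name.is_empty() => None,
        None => Some(Import::Internal(name)),
    }
}
```

For example, "__imp_KERNEL32$GetProcAddress" is classified as an external import from KERNEL32, while "__imp_BeaconPrintf" is classified as internal. A real loader would then look up the internal name in its table of extern "C" functions, or load the library and resolve the function address for the external case.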

We then gather all symbols, iterate over them, and look for the symbol whose name contains "go", the default entry-point name for BOFs. We obtain the entry point by transmuting its address in the text section into a Rust function pointer, and then call it. Once the entry point runs, the BOF in turn calls the internal functions we implemented in Rust.

This brings us to the challenge of understanding how Cobalt Strike's functions work. Most of the code is dedicated to the compatibility layer for Cobalt Strike's original functions, so that we can run our BOFs unchanged. BeaconOutput and BeaconPrintf are by far the most used internal functions, because they handle the output of the execution. They work by filling a char buffer inside the beacon compatibility layer. After execution, the buffered data is retrieved with get_output_data, converted into a slice, and printed to our terminal with Rust's println! macro. Most of the internal functions are sourced from yamakadi's unfinished ldr.
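The output-buffer idea can be sketched as follows. This is a simplified stand-in rather than Coffee's compatibility layer: beacon_output and take_output are hypothetical names, the buffer is a global Mutex<Vec<u8>>, and the callback type argument is ignored.

```rust
use std::os::raw::{c_char, c_int};
use std::sync::Mutex;

// Shared buffer that collects everything the BOF prints.
static OUTPUT: Mutex<Vec<u8>> = Mutex::new(Vec::new());

// Stand-in for BeaconOutput: append `len` bytes starting at `data`.
// Unsafe because the caller must guarantee `data` points to `len` readable bytes.
unsafe extern "C" fn beacon_output(_kind: c_int, data: *const c_char, len: c_int) {
    if data.is_null() || len <= 0 {
        return;
    }
    let bytes = unsafe { std::slice::from_raw_parts(data as *const u8, len as usize) };
    if let Ok(mut buf) = OUTPUT.lock() {
        buf.extend_from_slice(bytes);
    }
}

// Hypothetical counterpart of get_output_data: drain the buffer and return
// it as text, dropping trailing NUL terminators.
fn take_output() -> String {
    let bytes = match OUTPUT.lock() {
        Ok(mut buf) => std::mem::take(&mut *buf),
        Err(_) => Vec::new(),
    };
    String::from_utf8_lossy(&bytes)
        .trim_end_matches('\0')
        .to_string()
}
```

In this sketch the loader would hand the address of beacon_output to the BOF as the resolved import for BeaconOutput, run the entry point, and then print the result of take_output() with println!.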

Conclusion

A COFF loader is a very interesting concept: it can be used to rapidly develop and adapt a payload to a given scenario. Rust is a great fit for such projects, as it is type-safe, productive, and has an excellent compilation system.

We have successfully implemented a custom COFF loader in Rust from scratch, along with a basic compatibility layer for Cobalt Strike's original BOF functions. Our code is also heavily commented and credited so anyone can understand what's going on in the loader. The loader also supports multiple architectures, meaning it can load BOFs for both x86 and x86_64.

We faced significant challenges during the development of this project, such as understanding exactly how the COFF format works, building a compatibility layer for Cobalt Strike's BOF functions, learning how to manipulate memory pointers in Rust, and learning how to use the FFI bindings properly to implement our internal functions.

We are very proud of the outcome of this project, and we expect it to be a great tool for red-team operations.

The GitHub repository can be found at: https://github.com/hakaioffsec/coffee

References

Thanks to the amazing people who have written about COFF loaders and helped me understand the format: