WCC: Reversing the Compilation Pipeline to Turn Linux Executables Back Into Object Files

Hook

What if you could take any compiled Linux binary—say, /bin/ls—and reverse the linking process to get back relocatable object files? Or load it into a shell and call its internal functions interactively, like reflection in Java but for C?

Context

Traditional compilation flows in one direction: source code becomes object files, then a linker combines them into executables or shared libraries. This pipeline is lossy by design—once you link an executable, relocation information gets resolved and discarded, symbols may be stripped, and the binary becomes a monolithic blob. If you want to understand what's inside, you reach for disassemblers like IDA Pro or debuggers like GDB. If you want to reuse code from a binary, you're usually out of luck without the original objects.

The Witchcraft Compiler Collection (WCC) attacks this one-way street from the opposite direction. Created for GNU/Linux and POSIX systems, it's a suite of tools that treat compiled binaries as malleable artifacts: wld converts executables into shared libraries, wcc 'unlinks' binaries back into relocatable object files, and wsh provides a Lua-based shell that loads executables and lets you call their functions interactively. It's the kind of low-level binary sorcery that makes reverse engineers grin and compiler purists nervous.

Technical Insight

[Figure: System architecture (auto-generated). Binary inputs (ELF executable, ET_EXEC; shared library) pass through the WCC toolkit: wld (linker) modifies headers, ET_EXEC → ET_DYN, to produce a shared library; wcc (compiler) analyzes the binary and reconstructs symbols and relocations to produce a relocatable object (.o file); wsh (shell) loads the binary into memory with dlopen and exposes its functions through a Lua interface. Parsing layer: libbfd, libelf, and the Capstone disassembler.]

WCC consists of three core components that leverage libbfd, libelf, and the Capstone disassembler to manipulate ELF binaries. Let's examine each and how they achieve their black magic.

The wld (Witchcraft Linker) performs the simplest transformation: converting ELF executables into shared libraries. This works by modifying ELF headers and segments—changing the e_type field from ET_EXEC to ET_DYN and adjusting program headers to make the binary loadable as a library. Why would you want this? Because shared libraries can be loaded into other processes with dlopen(), while executables cannot. This creates what WCC calls 'non-relocatable shared libraries'—they're not position-independent, but they're callable. A typical invocation looks like:

cp /bin/ls /tmp/ls.so      # wld patches its input in place, so work on a copy
wld -libify /tmp/ls.so
# /tmp/ls.so can now be loaded with dlopen() and its functions called
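The header change described above can be sketched in a few lines of Python. This is only an illustration of flipping the e_type field on a synthetic header, not wld's actual implementation, which also has to adjust other parts of the file. The offsets and constants come from the ELF specification; the helper names (elf_byte_order, read_e_type, mark_as_shared_object) are made up for this example.

```python
import struct

ET_EXEC, ET_DYN = 2, 3   # e_type values from the ELF specification
E_TYPE_OFFSET = 16       # e_type follows the 16-byte e_ident array

def elf_byte_order(image: bytes) -> str:
    """Return the struct byte-order prefix for an ELF image."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (index 5): 1 = little-endian, 2 = big-endian
    return "<" if image[5] == 1 else ">"

def read_e_type(image: bytes) -> int:
    """Read the 16-bit e_type field from the ELF header."""
    fmt = elf_byte_order(image) + "H"
    return struct.unpack_from(fmt, image, E_TYPE_OFFSET)[0]

def mark_as_shared_object(image: bytes) -> bytes:
    """Return a copy of image with e_type changed from ET_EXEC to ET_DYN."""
    if read_e_type(image) != ET_EXEC:
        raise ValueError("expected an ET_EXEC image")
    patched = bytearray(image)
    fmt = elf_byte_order(image) + "H"
    struct.pack_into(fmt, patched, E_TYPE_OFFSET, ET_DYN)
    return bytes(patched)

# Synthetic 64-byte little-endian ELF64 header with e_type = ET_EXEC.
header = b"\x7fELF\x02\x01\x01" + bytes(9) + struct.pack("<H", ET_EXEC) + bytes(46)
patched = mark_as_shared_object(header)
print(read_e_type(header), "->", read_e_type(patched))
```

Working on a copy of the bytes rather than the file itself keeps the sketch side-effect free; nothing else in the header is touched.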

The wcc (Witchcraft Compiler) is where things get interesting. It reverses the linking process by analyzing compiled binaries and reconstructing the symbol tables and relocation information that were resolved during linking. When you compile with gcc -c, you get relocatable object files (.o) with unresolved symbols. When you link them, those relocations get fixed up and the information is discarded. WCC reads the executable, disassembles it, analyzes cross-references, and rebuilds relocation entries. For Intel x86_64 ELF binaries, it can generate fully relocatable objects:

wcc -c /usr/bin/program -o program.o
# Now program.o can be linked with other objects
ld program.o mycode.o -o hybrid_binary

This unlinking capability is powerful for binary patching—you can unlink an executable, link in your own object files with modified functionality, and relink. The limitation is significant though: full relocation rebuilding only works reliably on Intel ELF x86_64. ARM, SPARC, and other architectures can be processed, but WCC can't perfectly reconstruct their relocations.
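To make "rebuilding relocation entries" concrete, here is a rough sketch of what a single x86_64 relocation record looks like once encoded. The Elf64_Rela layout and the relocation type numbers come from the ELF and x86-64 ABI specifications; the call-site offset and symbol index are hypothetical values chosen for the example, and pack_rela/unpack_rela are illustrative helpers, not part of WCC.

```python
import struct

R_X86_64_64 = 1     # absolute 64-bit address: S + A
R_X86_64_PC32 = 2   # 32-bit PC-relative value: S + A - P

def pack_rela(r_offset: int, sym_index: int, r_type: int, r_addend: int) -> bytes:
    """Encode one little-endian Elf64_Rela entry (24 bytes)."""
    r_info = (sym_index << 32) | r_type   # same packing as ELF64_R_INFO(sym, type)
    return struct.pack("<QQq", r_offset, r_info, r_addend)

def unpack_rela(entry: bytes):
    """Decode an Elf64_Rela entry into (offset, symbol index, type, addend)."""
    r_offset, r_info, r_addend = struct.unpack("<QQq", entry)
    return r_offset, r_info >> 32, r_info & 0xFFFFFFFF, r_addend

# Hypothetical case: the 32-bit displacement of a call sits at section offset
# 0x15 and refers to symbol #7. The addend of -4 is the usual value for a call:
# the CPU measures the displacement from the end of the 4-byte field.
entry = pack_rela(0x15, 7, R_X86_64_PC32, -4)
print(unpack_rela(entry))
```

An unlinker has to recover each of these fields from the resolved machine code: where the reference lives, which symbol it pointed at, and how the address was computed. That is why the work is tied so closely to one architecture's encoding.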

The wsh (Witchcraft Shell) is perhaps the most immediately useful tool. It's an embedded Lua interpreter that loads ELF binaries into its address space and exposes their functions, global variables, and sections through a scripting interface. This provides reflection-like capabilities for compiled code. The session below is illustrative; wsh's built-in help() lists the functions your version actually provides:

-- In wsh interactive shell
wsh> loadbin("/bin/ls")
wsh> symbols()  -- List all symbols in the loaded binary
wsh> sections() -- Show all ELF sections
wsh> help(main) -- Get information about the main function

-- Resolve a function by name and inspect it
wsh> execve = resolve("execve")
wsh> help(execve)

-- Search memory
wsh> grep("ELF", 0x400000, 0x1000)  -- Find "ELF" signature in memory

-- Even call internal functions if you know their signatures
wsh> call("internal_parser", {arg1, arg2})

Under the hood, wsh uses dlopen() to load binaries (after potentially converting them with wld), then parses their ELF structures to build a symbol map. The Lua bindings expose this information and allow direct memory access. You can inspect data structures, call functions with crafted arguments, and essentially treat the binary as a live, queryable object.
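The same load, resolve, and call pattern can be tried without WCC using Python's ctypes module, which wraps dlopen() and dlsym(). This is an analogy for the mechanism described above, not wsh's own code: it resolves a libc function in the current process on Linux, and call_by_name is a hypothetical helper written for this example.

```python
import ctypes

# CDLL(None) corresponds to dlopen(NULL): a handle onto the symbols already
# loaded in this process, which includes libc on Linux.
process = ctypes.CDLL(None)

def call_by_name(name, restype, argtypes, *args):
    """Resolve `name` in the current process and call it with a declared signature."""
    func = getattr(process, name)    # symbol lookup, dlsym-style
    func.restype = restype           # the caller must supply the signature:
    func.argtypes = list(argtypes)   # compiled code carries no type information
    return func(*args)

length = call_by_name("strlen", ctypes.c_size_t, [ctypes.c_char_p], b"witchcraft")
print(length)
```

As with calling internal functions from wsh, the caller is responsible for knowing the signature. A wrong guess does not raise a tidy error; it corrupts memory or crashes the process.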

The architecture relies heavily on libbfd (from GNU binutils) for binary format parsing and on Capstone for multi-architecture disassembly. This lets WCC read PE/COFF files as well as ELF, and work across Intel, ARM, and SPARC architectures, though with varying levels of relocation reconstruction capability. The codebase is written in C, embracing the philosophy that tools for manipulating C binaries should themselves be written in C.

Gotcha

WCC's most significant limitation is architectural scope: while it can read and analyze binaries across multiple architectures, the full unlinking capability (wcc -c to produce relocatable objects) only works reliably for Intel ELF x86_64 binaries. If you're working with ARM binaries, SPARC executables, or anything beyond x86_64, you'll get limited functionality. The relocation reconstruction is fundamentally architecture-specific, and only the Intel implementation is complete. This is a hard technical barrier—reconstructing relocations requires deep understanding of each architecture's addressing modes and relocation types.
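Before attempting a full unlink, it can be worth checking that a binary is actually an x86_64 ELF. A minimal sketch, assuming only the header layout from the ELF specification: the check reads e_ident[EI_CLASS] and e_machine. The function names and the synthetic headers are made up for the example.

```python
import struct

EM_X86_64 = 62   # e_machine value for AMD x86-64
ELFCLASS64 = 2   # e_ident[EI_CLASS] value for 64-bit objects

def is_x86_64_elf(image: bytes) -> bool:
    """True if image starts with an ELF64 header whose e_machine is EM_X86_64."""
    if len(image) < 20 or image[:4] != b"\x7fELF":
        return False
    byte_order = "<" if image[5] == 1 else ">"   # e_ident[EI_DATA]
    (e_machine,) = struct.unpack_from(byte_order + "H", image, 18)
    return image[4] == ELFCLASS64 and e_machine == EM_X86_64

def fake_header(ei_class: int, e_machine: int) -> bytes:
    """Build a synthetic little-endian ELF header prefix for demonstration."""
    ident = b"\x7fELF" + bytes([ei_class, 1, 1]) + bytes(9)
    return ident + struct.pack("<HH", 2, e_machine)   # e_type = ET_EXEC

print(is_x86_64_elf(fake_header(2, 62)))   # 64-bit, EM_X86_64
print(is_x86_64_elf(fake_header(1, 40)))   # 32-bit, EM_ARM = 40
```

On a real file, the first 64 bytes are enough: is_x86_64_elf(open(path, "rb").read(64)).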

Format conversion has similar constraints. wld operates only on ELF files; wcc can read PE/COFF files through libbfd, but its output is always ELF, so you can't use WCC to turn Windows binaries into linkable objects for a Windows toolchain.

The documentation also freely admits that this is experimental 'black magic.' Stripped binaries lose the symbol information WCC relies on for accurate reconstruction. Heavily obfuscated binaries, packed executables, and binaries that use non-standard linking tricks (custom loaders, position-independent executables with unusual relocations) may produce unexpected results or fail entirely. wsh has the same dependency on symbols: calling a function by name doesn't work if the name was stripped, though you can still call it by address if you know it.
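One quick way to tell whether a binary has been stripped is to look for a section of type SHT_SYMTAB, which strip removes while leaving the dynamic symbol table in place. The sketch below assumes a little-endian ELF64 image and ignores the extended section numbering used by files with very many sections. The field offsets come from the ELF specification; section_types, looks_stripped, and fake_image are illustrative names, and the images are synthetic.

```python
import struct

SHT_SYMTAB = 2    # full symbol table, removed by strip
SHT_DYNSYM = 11   # dynamic symbols, kept because the loader needs them

def section_types(image: bytes) -> list:
    """Return sh_type for every section header of a little-endian ELF64 image."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_shoff,) = struct.unpack_from("<Q", image, 40)
    e_shentsize, e_shnum = struct.unpack_from("<HH", image, 58)
    # sh_type is the second 32-bit field of each section header (offset 4)
    return [struct.unpack_from("<I", image, e_shoff + i * e_shentsize + 4)[0]
            for i in range(e_shnum)]

def looks_stripped(image: bytes) -> bool:
    """True if the image has no SHT_SYMTAB section."""
    return SHT_SYMTAB not in section_types(image)

def fake_image(types) -> bytes:
    """Synthetic ELF64 image: a header followed directly by bare section headers."""
    header = bytearray(64)
    header[:7] = b"\x7fELF\x02\x01\x01"
    struct.pack_into("<Q", header, 40, 64)                # e_shoff
    struct.pack_into("<HH", header, 58, 64, len(types))   # e_shentsize, e_shnum
    sections = b"".join(struct.pack("<II", 0, t) + bytes(56) for t in types)
    return bytes(header) + sections

print(looks_stripped(fake_image([0, 1, SHT_DYNSYM])))      # only dynamic symbols
print(looks_stripped(fake_image([0, 1, SHT_SYMTAB, 3])))   # full symbol table present
```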

Verdict

Use if: You're doing security research, malware analysis, or reverse engineering on Linux x86_64 systems and need to interact with compiled code programmatically. WCC's wsh shell is invaluable for rapid prototyping: calling a binary's functions from Lua takes far less setup than writing C wrappers or GDB scripts. Use it when you need to patch existing binaries by unlinking, modifying, and relinking, or when you want to repurpose code from executables without source access. It shines for understanding legacy systems where documentation is poor but binaries are available.

Skip if: You need production-grade tooling with guarantees, cross-platform support beyond x86_64 Linux, or are working primarily with stripped or obfuscated binaries. For standard binary analysis, mature tools like Radare2, Binary Ninja, or GDB provide more comprehensive features and better documentation. If you need runtime instrumentation rather than static manipulation, Frida is more appropriate. And if you're not comfortable with experimental tools that might break on edge cases, stick with objcopy and traditional binutils. WCC is explicitly 'witchcraft,' not industrial-strength engineering.
