Exporters & deliverables

warden export converts the knowledge base (KB) for a given version into one of four deliverable formats. All four are deterministic: given the same KB state, the output is byte-identical on every run, so exports diff cleanly in git and fit into CI pipelines without special handling.

The four formats

headers

A C header of recovered function prototypes. Use this to feed names into a downstream C/C++ toolchain or as a human-readable symbol sheet.

pseudo

Per-function listings: the recovered name, type signature, agent summary, provenance, and lifted pseudo-C (requires the original .wasm to be present; falls back to a mnemonic count otherwise). Use this for manual review.

kb-text

A columnar, stable text dump of every symbol (index, stable id, lock flag, provenance, confidence, and name), sorted by function index. The primary format for committing alongside source and reading diffs across versions.

ghidra

A Python script that pushes recovered names back into a Ghidra project. Run it from Ghidra’s Python console after loading the same module.

Basic usage

warden export <label> --format <fmt>

By default, output goes to stdout. Pass --out <file> to write to a file instead.

warden export v1 --format kb-text                        # stdout
warden export v1 --format kb-text   --out v1.kb.txt      # file
warden export v1 --format headers   --out recovered.h
warden export v1 --format pseudo    --out v1.pseudo.txt
warden export v1 --format ghidra    --out rename_v1.py

--db selects the project database when it is not the default warden.db:

warden export v1 --format kb-text --db /path/to/project.db --out v1.kb.txt
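
To regenerate all four deliverables in one step, the commands above can be scripted. The following is a minimal sketch, assuming warden is on your PATH. The helper names export_command and export_all and the output filenames are illustrative and not part of WARDEN; only the --format, --out, and --db flags documented on this page are used.

```python
import subprocess

FORMATS = ("headers", "pseudo", "kb-text", "ghidra")

def export_command(label, fmt, out, db=None):
    """Build the argv list for one `warden export` call (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError("unknown format: %r" % fmt)
    cmd = ["warden", "export", label, "--format", fmt, "--out", out]
    if db is not None:
        cmd += ["--db", db]
    return cmd

def export_all(label, outputs, db=None):
    """Run one export per (format, path) pair; raises if any call fails."""
    for fmt, out in outputs.items():
        subprocess.run(export_command(label, fmt, out, db), check=True)

# Example call (requires warden to be installed):
# export_all("v1", {"kb-text": "v1.kb.txt", "headers": "recovered.h",
#                   "pseudo": "v1.pseudo.txt", "ghidra": "rename_v1.py"})
```

Building the argument list separately from running it keeps the command easy to log or inspect before anything is written to disk.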

Format details

`headers`

Emits a C header wrapped in an include guard. Each defined function gets a comment line with its function index, Wasm type signature, provenance, and confidence score, followed by a skeleton declaration. Imported functions are not emitted.

/* WARDEN recovered header */
/* version: v1  emscripten: 3.1.55 */
#ifndef WARDEN_RECOVERED_H
#define WARDEN_RECOVERED_H
#include <stdint.h>

/* idx=3 (i32) -> (i32)  [oracle 0.94] */
void malloc(void); /* TODO: real prototype from type sig */

/* idx=7 () -> ()  [human 1.00] */
void verify_license(void); /* TODO: real prototype from type sig */

#endif /* WARDEN_RECOVERED_H */

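The prototypes are stubs: every function is declared as void name(void) regardless of its actual signature, because parameter and return types are not yet derived from the Wasm type signature. The comment above each declaration carries the full signature string so you can fill in the prototype by hand. This is a known limitation; proper C prototype reconstruction is on the roadmap.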

When to use it: feeding into a C toolchain, generating a symbol cheat-sheet, or as a starting point for writing a real header by hand.
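
Until prototype reconstruction lands, the signature string in each comment can be turned into a first-draft prototype mechanically. The sketch below is one way to do that and is not part of WARDEN. The function name c_prototype, the parameter names p0, p1, and so on, and the type mapping are assumptions. Wasm integer types carry no signedness, so the signed stdint types here are only a starting guess.

```python
import re

# Assumed mapping; Wasm does not record signedness or pointer-ness.
C_TYPES = {"i32": "int32_t", "i64": "int64_t", "f32": "float", "f64": "double"}

SIG_RE = re.compile(r"^\(([^)]*)\)\s*->\s*\(([^)]*)\)$")

def c_prototype(name, sig):
    """Turn a signature string such as '(i32) -> (i32)' into a draft C prototype."""
    m = SIG_RE.match(sig.strip())
    if m is None:
        raise ValueError("unrecognized signature: %r" % sig)
    params = [t.strip() for t in m.group(1).split(",") if t.strip()]
    results = [t.strip() for t in m.group(2).split(",") if t.strip()]
    if len(results) > 1:
        raise ValueError("multi-value results have no direct C equivalent")
    ret = C_TYPES[results[0]] if results else "void"
    args = ", ".join("%s p%d" % (C_TYPES[t], i) for i, t in enumerate(params))
    return "%s %s(%s);" % (ret, name, args or "void")

print(c_prototype("malloc", "(i32) -> (i32)"))
print(c_prototype("verify_license", "() -> ()"))
```

Types outside the four-entry table raise a KeyError, which is deliberate: a draft that silently guesses is harder to review than one that stops.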

`pseudo`

Emits a readable listing of every defined function with its recovered name, type signature, agent-generated summary, and provenance/confidence. When the original .wasm path is still accessible in the KB, each function body is populated from the module; otherwise the listing notes the instruction count without disassembly. The sample below shows the raw mnemonic form of a populated body, which the lifter described under Built-in decompiler now replaces with pseudo-C.

// WARDEN pseudocode: v1

// ---- verify_license  () -> () ----
// Checks the license key against a hardcoded salt and calls abort if invalid.
// provenance=human confidence=1.00
function verify_license {  // stable=a3f9e1c042d8
    i32.const
    call
    i32.eqz
    if
    call
    end
}

// ---- malloc  (i32) -> (i32) ----
// provenance=oracle confidence=0.94
function malloc {  // stable=7b2d88fc901a
    // 142 instructions (disassembly not loaded)
}

When to use it: manual review of agent and Oracle output, writing a report, or orienting a second analyst. The stable ID truncated to 12 hex characters appears in every function header so cross-referencing the KB is easy.

`kb-text`

A columnar dump of every function in index order (including imports) with a fixed-width layout designed for git diff. The format is:

# WARDEN KB export (version_id=1)
index  stable_id          lk provenance  conf   name
4a9f31b2c8e60d17    oracle      0.94  malloc
7d1e55a3f0b2c8e6    import      1.00  emscripten_memcpy_big
a3f9e1c042d8b917  L human       1.00  verify_license
0c8b47f21e93da55    agent       0.61  process_frame
3d7f1a82bc094e61    -           -     -

Columns: index (Wasm function index), stable_id (first 16 hex chars of the full stable identity hash), lk (L when the symbol is locked; space otherwise), provenance, confidence, and name. Unnamed functions show a dash in the provenance, confidence, and name columns.

Commit kb-text output alongside your source code. When the vendor ships a new .wasm, run warden ingest, then warden diff, then warden export v2 --format kb-text. The resulting git diff shows exactly which functions changed, were added, or were dropped, and which annotations carried over automatically.

When to use it: source-controlled annotation snapshots, CI regression detection, sharing the current KB state without giving someone access to the database.
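
For CI checks that need more than a textual diff, the columns can be read back into records. The sketch below is a minimal parser, not part of WARDEN. It assumes rows are whitespace-separated with the index column present as described above, and that names contain no spaces. The Symbol type, the parse_kb_text name, and the index values in the sample are hypothetical.

```python
from collections import namedtuple

Symbol = namedtuple("Symbol", "index stable_id locked provenance confidence name")

def parse_kb_text(text):
    """Parse kb-text rows into Symbol records; dashes become None."""
    symbols = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines, the '#' banner, and the column header row.
        if not tokens or tokens[0].startswith("#") or tokens[0] == "index":
            continue
        locked = len(tokens) == 6 and tokens[2] == "L"
        if locked:
            del tokens[2]
        if len(tokens) != 5:
            raise ValueError("unexpected row: %r" % line)
        index, stable_id, provenance, conf, name = tokens
        symbols.append(Symbol(
            int(index), stable_id, locked,
            None if provenance == "-" else provenance,
            None if conf == "-" else float(conf),
            None if name == "-" else name,
        ))
    return symbols

sample = """# WARDEN KB export (version_id=1)
index  stable_id          lk provenance  conf   name
    3  4a9f31b2c8e60d17    oracle      0.94  malloc
    7  a3f9e1c042d8b917  L human       1.00  verify_license
    9  3d7f1a82bc094e61    -           -     -
"""
for sym in parse_kb_text(sample):
    print(sym)
```

A CI job could, for example, fail when any locked symbol present in the previous snapshot is missing from the new one.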

`ghidra`

Emits a Python script for Ghidra’s built-in scripting console. The script iterates over every defined function that has a recovered name and calls fn.setName(name, SourceType.USER_DEFINED) to apply it.

# WARDEN -> Ghidra rename script (run in Ghidra's Python console).
# Assumes the nneonneo/ghidra-wasm-plugin loaded the same module.
from ghidra.program.model.symbol import SourceType
fm = currentProgram.getFunctionManager()
renames = [
    (3, 'malloc'),
    (7, 'verify_license'),
    (12, 'process_frame'),
]
for idx, name in renames:
    # Map wasm function index -> Ghidra function (plugin-specific helper).
    fn = getFunctionByWasmIndex(idx) if 'getFunctionByWasmIndex' in dir() else None
    if fn is not None:
        fn.setName(name, SourceType.USER_DEFINED)
print('WARDEN: applied %d renames' % len(renames))

The Ghidra round-trip is a Phase 1 bridge that targets the nneonneo/ghidra-wasm-plugin and calls its getFunctionByWasmIndex helper. If that helper is not present in your Ghidra environment, the rename loop silently skips every function, and the final print still reports the full length of the renames list rather than the number of renames actually applied. Verify the plugin is loaded before running the script.

When to use it: you already have a Ghidra project open for the same module and want WARDEN’s recovered names applied without re-doing the work interactively. The index-based mapping is stable as long as the loaded .wasm is the same binary that WARDEN ingested for that version.
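
If you want the script to tell you when renames were skipped, the loop can be restructured to count what it actually applied. The following is a sketch of that idea and is not what WARDEN emits. The apply_renames name is hypothetical, and the lookup and source type are passed in as arguments so the function can be exercised outside Ghidra. It avoids Python 3-only syntax so it should also run in a Jython-based console.

```python
def apply_renames(renames, lookup, source_type):
    """Apply (index, name) pairs; return (applied_count, skipped_pairs).

    lookup is a callable mapping a Wasm function index to a function
    object (or None), or None itself when no such helper is available.
    """
    applied = 0
    skipped = []
    for idx, name in renames:
        fn = lookup(idx) if lookup is not None else None
        if fn is None:
            skipped.append((idx, name))
            continue
        fn.setName(name, source_type)
        applied += 1
    return applied, skipped

# Inside Ghidra, assuming the plugin exposes the helper as a global:
# lookup = globals().get('getFunctionByWasmIndex')
# applied, skipped = apply_renames(renames, lookup, SourceType.USER_DEFINED)
# print('applied %d, skipped %d' % (applied, len(skipped)))
```

Returning the skipped pairs rather than only a count makes it easy to see whether the helper was missing entirely or only some indices failed to resolve.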

Built-in decompiler

The warden.lift module contains a pure-Python stack-machine lifter that re-folds Wasm stack operations back into readable pseudo-C. It covers the integer subset comprehensively: infix arithmetic, memory loads and stores, local and global variables, and function calls. For anything unmodeled it degrades gracefully, emitting a /* mnemonic */ comment and an opaque temporary instead of crashing.
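
To make the re-folding idea concrete, here is a toy expression folder. It is illustrative only and is not the warden.lift implementation. The instruction encoding as (mnemonic, immediate) pairs, the fold name, the pN naming for every local, and the assumption that an unmodeled op leaves one value on the stack are all simplifications made for this sketch.

```python
BINOPS = {
    "i32.add": "+", "i32.sub": "-", "i32.mul": "*",
    "i32.and": "&", "i32.or": "|", "i32.xor": "^",
}

def fold(instrs):
    """Fold (mnemonic, immediate) pairs into pseudo-C statement strings."""
    stack, out, temps = [], [], 0
    for op, imm in instrs:
        if op == "local.get":
            stack.append("p%d" % imm)      # toy naming: every local is pN
        elif op == "i32.const":
            stack.append(str(imm))
        elif op in BINOPS and len(stack) >= 2:
            rhs = stack.pop()              # top of stack is the right operand
            lhs = stack.pop()
            stack.append("(%s %s %s)" % (lhs, BINOPS[op], rhs))
        else:
            # Unmodeled op: emit a comment and continue with an opaque temporary.
            out.append("/* %s */" % op)
            stack.append("t%d" % temps)
            temps += 1
    if stack:
        out.append("return %s;" % stack.pop())
    return out

body = fold([("local.get", 0), ("local.get", 1), ("i32.add", None),
             ("i32.const", 7), ("i32.mul", None)])
print("\n".join(body))  # return ((p0 + p1) * 7);
```

Each operand is kept as a string expression on a simulated stack, so a binary operator only has to pop two strings and push their parenthesized combination.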

How `--format pseudo` uses it

When you run warden export --format pseudo and the original .wasm is available, the exporter now calls the lifter instead of dumping raw instruction mnemonics. Each function block contains a proper pseudo-C body:

// WARDEN pseudocode: v1

// ---- parse_token  (i32, i32) -> (i32) ----
// Parses a token from the input buffer.
// provenance=oracle confidence=0.91
i32 parse_token(i32 p0, i32 p1) {
    return ((p0 + p1) * 7);
}  // stable=c4a8f21d903b

// ---- verify_license  () -> () ----
// Checks the license key against a hardcoded salt and calls abort if invalid.
// provenance=human confidence=1.00
void verify_license() {
    /* unreachable */
}  // stable=a3f9e1c042d8

Functions whose .wasm is not on disk fall back to the previous mnemonic-count note; the switch is automatic.

Targeting a single function

Use warden lift to decompile one function by name without running a full export:

warden lift v1 parse_token                # first match by name
warden lift v1 parse_token --index 7      # disambiguate by function index
warden lift v1 parse_token --out out.c    # write to file instead of stdout

The --index N flag is useful when the KB is ambiguous and several functions share the same recovered name.

Python API

from warden.lift import lift_function, lift_module

pseudo_one = lift_function(module, func)   # -> str  (one function)
pseudo_all = lift_module(module)           # -> str  (all defined functions, index order)

lift_module skips imports (they have no body) and concatenates in function-index order so the result diffs cleanly across builds.

The lifter covers the integer and control-flow subset that Emscripten-compiled C/C++ produces in practice. Floating-point ops and SIMD instructions emit /* mnemonic */ placeholders. The output is always valid pseudo-C, never a crash or a partial file.

HTML report

warden report writes a self-contained HTML file: no server, no CDN, no build step. Everything is inlined so the file opens from any clone with a double-click, and the output is deterministic (same KB state in, byte-identical HTML out) so it diffs cleanly in git.

warden report v1                          # writes warden-report-v1.html to cwd
warden report v1 --out reports/v1.html   # explicit path
warden report v1 --db /path/to/project.db --out v1.html

What the report contains

Section	Description
Coverage summary	Named / total defined functions with a progress bar broken down by provenance (oracle, human, agent).
Confidence heatmap	Every defined function in index order. Row background hue encodes provenance; alpha encodes confidence. Solid green rows are human-verified; fading amber rows are agent guesses that need review.
Thread and memory model	Atomic sites, pthread markers, and shared-memory facts recorded by `warden analyze`. Hidden when the KB has no thread facts.
Changelog	The diff from the nearest earlier version: a chip summary (unchanged / moved / modified / new / deleted) followed by a “needs review” list of genuine app-level deltas. Hidden for the first version.

The heatmap color key:

Color	Provenance	Trust level
Emerald	`human`	Verified by hand
Blue	`oracle`	Matched against a known corpus
Cyan / teal	`export` / `import`	Free fact from the binary
Violet	`string-xref`	Inferred from a string reference
Amber	`diff-carry`	Carried across a version bump
Dark amber	`agent`	Model guess (lowest trust)
Zinc (desaturated)	(unnamed)	No symbol recovered
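
The hue-plus-alpha encoding can be pictured as a small lookup. The sketch below is not taken from warden.report. The row_style name, the RGB triples, and the fixed alpha for unnamed rows are hypothetical values chosen only to match the color names in the key.

```python
# Hypothetical RGB values approximating the named colors in the key above.
PROVENANCE_RGB = {
    "human": (16, 185, 129),        # emerald
    "oracle": (59, 130, 246),       # blue
    "export": (20, 184, 166),       # teal
    "import": (20, 184, 166),       # teal
    "string-xref": (139, 92, 246),  # violet
    "diff-carry": (245, 158, 11),   # amber
    "agent": (180, 83, 9),          # dark amber
}
UNNAMED_RGB = (113, 113, 122)       # zinc
UNNAMED_ALPHA = 0.30                # assumed fixed alpha for unnamed rows

def row_style(provenance, confidence):
    """Return a CSS declaration: hue from provenance, alpha from confidence."""
    if provenance not in PROVENANCE_RGB:
        r, g, b = UNNAMED_RGB
        alpha = UNNAMED_ALPHA
    else:
        r, g, b = PROVENANCE_RGB[provenance]
        alpha = min(1.0, max(0.0, confidence))  # clamp to [0, 1]
    return "background-color: rgba(%d, %d, %d, %.2f)" % (r, g, b, alpha)

print(row_style("human", 1.0))
print(row_style("agent", 0.61))
```

Under this scheme a human-verified row at confidence 1.00 renders fully opaque, while a low-confidence agent guess fades toward the page background.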

Python API

from warden.report import render_report, write_report

html: str = render_report(kb, version_id)              # returns HTML string
write_report(kb, version_id, "reports/v1.html")        # writes UTF-8 file

Pass module=<Module> to either function if you have the parsed .wasm on hand. The argument is optional and currently reserved for future inline disassembly views; the report is fully driven by the KB without it.

Commit the HTML report alongside kb-text snapshots. The report is byte-identical for the same KB state, so git diff --stat will tell you at a glance whether anything actually changed between runs. This is useful in CI to detect spurious annotation drift.
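
Because the report is byte-identical for identical KB state, a drift check reduces to comparing file hashes. The following is a minimal sketch using only the standard library. The function names file_digest and report_drifted, and the idea of comparing a committed report against a freshly generated one, are illustrative rather than a built-in WARDEN feature.

```python
import hashlib

def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def report_drifted(committed_path, fresh_path):
    """True when a freshly generated report differs from the committed one."""
    return file_digest(committed_path) != file_digest(fresh_path)

# Example CI usage, after `warden report v1 --out fresh/v1.html`:
# if report_drifted("reports/v1.html", "fresh/v1.html"):
#     raise SystemExit("report changed: KB state differs from the commit")
```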

Comparing across versions

Because all formats are deterministic, you can snapshot them at each version and use standard diff tooling to review what changed:

warden export v1 --format kb-text --out snapshots/v1.kb.txt
warden export v2 --format kb-text --out snapshots/v2.kb.txt
diff snapshots/v1.kb.txt snapshots/v2.kb.txt

Functions with unchanged stable_id and annotations appear as unchanged lines. New functions, dropped functions, and any confidence or provenance changes are visible immediately.
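
The same comparison can be done from Python when a script needs the changed rows rather than terminal output. The sketch below uses the standard difflib module. The function names snapshot_diff and changed_rows and the default file labels are illustrative.

```python
import difflib

def snapshot_diff(old_text, new_text, old_name="v1.kb.txt", new_name="v2.kb.txt"):
    """Return unified-diff lines between two kb-text snapshots."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=old_name, tofile=new_name, lineterm=""))

def changed_rows(old_text, new_text):
    """Return only the added ('+') and removed ('-') rows."""
    diff = snapshot_diff(old_text, new_text)
    # The first two lines, when present, are the '---' and '+++' file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

# Example:
# with open("snapshots/v1.kb.txt") as f1, open("snapshots/v2.kb.txt") as f2:
#     for row in changed_rows(f1.read(), f2.read()):
#         print(row)
```

An empty result from changed_rows means the two snapshots are textually identical.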

For a richer semantic changelog (which functions are new, removed, carried over, or only partially matched), use warden diff before exporting. The diff engine runs the same fingerprinting that export relies on, so the two views are consistent.

Reference

Flag	Default	Description
`--format, -f`	`kb-text`	Output format: `headers`, `pseudo`, `kb-text`, or `ghidra`.
`--out, -o`	(stdout)	Write output to a file instead of printing to stdout.
`--db`	`warden.db`	Project database path (or `WARDEN_DB` env var).
