All contributions are welcome: bug fixes, test coverage, new pipeline stages, documentation, and toolchain integrations. WARDEN is in alpha: the core pipeline works, many planned phases are only scaffolded, and there is real ground to cover. The project is released under the MIT License. By submitting a pull request you agree that your contribution will be released under the same terms.
Not sure where to start? See the roadmap for the planned phases and their current status, or open an issue on GitHub and ask. The maintainers are happy to point you at something appropriately scoped.

Dev setup

Python 3.10 or later is required. The core path has no native dependencies.
1. Clone the repo and create a virtual environment

git clone https://github.com/purpshell/warden.git
cd warden

python -m venv .venv
source .venv/bin/activate     # Windows: .venv\Scripts\activate
2. Install with dev extras

make dev
make dev is equivalent to pip install -e '.[all]' followed by pre-commit install. The [all] group is shorthand for [agents,mcp,dev]:
Extra     What it adds
agents    openai>=1.68, anthropic>=0.40 (LLM crew; falls back to offline heuristic without a key)
mcp       mcp>=1.2 (Model Context Protocol tool surface)
dev       pytest, pytest-cov, ruff, mypy, pre-commit
If you only want to run the core pipeline without agents or MCP, pip install -e . is sufficient. The hard dependencies are just typer and rich.
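To see which optional extras are importable in your environment, a small check along these lines can help. It is a sketch, not part of WARDEN: the helper name installed_extras is hypothetical, and it assumes the import names match the package names listed above (openai, anthropic, mcp).

```python
from __future__ import annotations

from importlib.util import find_spec

# Extra name -> top-level modules it is expected to provide.
# Assumption: import names equal the distribution names listed above.
EXTRAS: dict[str, tuple[str, ...]] = {
    "agents": ("openai", "anthropic"),
    "mcp": ("mcp",),
}


def installed_extras(extras: dict[str, tuple[str, ...]] = EXTRAS) -> dict[str, bool]:
    """Report, per extra, whether every module it provides can be found."""
    return {
        name: all(find_spec(module) is not None for module in modules)
        for name, modules in extras.items()
    }


if __name__ == "__main__":
    for name, present in installed_extras().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```

An extra counts as installed only when all of its modules are found, so a partial install of [agents] is reported as missing.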
3. Verify the install

warden version
make demo        # runs `warden demo` end-to-end: the whole pipeline, offline
If make demo completes without error, the environment is good.

Pre-commit hooks

make dev installs pre-commit hooks that run ruff, ruff-format, mypy, and a set of standard file-health checks (trailing whitespace, YAML/TOML syntax, merge-conflict markers, a binary-file size guard at 512 KB) on every commit. If a hook rejects your commit, fix the reported issue and re-stage. Do not skip hooks with --no-verify.

Development workflow

make check      # ruff lint + mypy typecheck + pytest: what CI runs
make fmt        # auto-format and fix with ruff
make test       # pytest alone
make cov        # pytest --cov=warden --cov-report=term-missing
make demo       # warden demo end-to-end
Run make help to see all targets. make check is the gate: CI runs the same target, so a local pass is a strong signal that your PR will be green.

Project layout

Each package under src/warden/ maps to one conceptual pipeline stage. New functionality should go in the matching package; cross-cutting utilities go in the nearest sensible parent.
src/warden/
    cli.py          # Typer application: all CLI entry-points live here
    project.py      # WardenProject: per-project config and DB path resolution
    samples.py      # Pure-Python .wasm emitter used by demo and tests
    ingest/         # Binary parser: sections, functions, name section, LEB-128, JS glue
    identity/       # Stable function fingerprinting and cross-version similarity
    kb/             # KnowledgeBase (SQLite): schema, queries, migrations
    diff/           # Cross-version diffing and annotation carry-over
    oracle/         # Emscripten Oracle: seed signatures, corpus matching
    agents/         # LLM agent crew (requires [agents] extra)
    export/         # Output formats (JSON, etc.)
    verify/         # Annotation verification helpers
    mcp/            # MCP tool surface (requires [mcp] extra)

tests/
    conftest.py     # Shared fixtures
    fixtures/       # Reserved for on-disk fixture files
    test_cli.py
    test_fingerprint.py
    test_ingest.py
    test_kb.py
    test_leb128.py
    test_pipeline.py

docs/
    VISION.md       # Architecture and design rationale
    AGENTS.md       # Agent crew design
    MCP.md          # MCP integration
    LIMITATIONS.md  # Honest current limitations
    VERIFICATION.md # Verification pipeline

Testing

The suite is intentionally runnable on a bare checkout with no native toolchain installed.

The samples fixture (no toolchain needed)

src/warden/samples.py is a pure-Python .wasm emitter. It produces three related modules (reference_module(), app_v1(), app_v2()) that exercise the entire pipeline end-to-end: ingest, fingerprinting, Oracle match, diff, and carry-over. These are generated at test time in memory. No Emscripten, WABT, or wasm-tools is required.
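For readers who have not seen a pure-Python .wasm emitter before, the idea can be sketched in a few lines. This is not the code in samples.py: the function names below are hypothetical, and the module it builds (one function of type () -> () with an empty body) is only an illustration of the WebAssembly binary layout.

```python
from __future__ import annotations

WASM_MAGIC = b"\x00asm"
WASM_VERSION = (1).to_bytes(4, "little")


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 requires a non-negative integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def section(section_id: int, payload: bytes) -> bytes:
    """Wrap a payload as a section: id byte, LEB128 size, then the payload."""
    return bytes([section_id]) + uleb128(len(payload)) + payload


def empty_function_module() -> bytes:
    """Build a module with a single function of type () -> () and an empty body."""
    # Type section (id 1): one func type (0x60) with 0 params and 0 results.
    type_sec = section(1, uleb128(1) + b"\x60" + uleb128(0) + uleb128(0))
    # Function section (id 3): one function that uses type index 0.
    func_sec = section(3, uleb128(1) + uleb128(0))
    # Code section (id 10): one body with zero local groups and an `end` opcode.
    body = uleb128(0) + b"\x0b"
    code_sec = section(10, uleb128(1) + uleb128(len(body)) + body)
    return WASM_MAGIC + WASM_VERSION + type_sec + func_sec + code_sec
```

Because everything is plain bytes concatenation, modules like this can be built in memory at test time without any external toolchain.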

Shared conftest fixtures

tests/conftest.py exposes the following fixtures. Use these rather than creating ad-hoc modules in individual test files.
Fixture           Type              Provides
reference_wasm    bytes             Labeled runtime module with a name section
app_v1_wasm       bytes             Stripped v1 target
app_v2_wasm       bytes             v2 target with one modified and one new function
demo_glue         str               Sample Emscripten JS glue
sample_dir        dict[str, Path]   All artifacts written to tmp_path
kb                KnowledgeBase     Isolated in-memory-equivalent DB in tmp_path, auto-closed

Writing tests

  • Every non-trivial change should come with a test. PRs that add behavior without tests will be asked to add them before merge.
  • Name test files test_<module>.py, mirroring the package they cover.
  • Use pytest’s built-in tmp_path fixture for any filesystem work; never write into the repo tree.
  • Tests that require optional extras (openai, anthropic, mcp) must be guarded with pytest.importorskip("openai"), pytest.importorskip("anthropic"), or an equivalent skip so CI passes without those extras.
  • Avoid network calls in tests. If a test genuinely requires them, mark it with a custom marker and document that it is skipped in standard CI.
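
The rules above could look like this in a test file. It is a sketch only: the test names and the network marker are hypothetical, and a custom marker would still need to be registered in the project's pytest configuration.

```python
from __future__ import annotations

from pathlib import Path

import pytest


def test_output_stays_in_tmp_path(tmp_path: Path) -> None:
    # All filesystem work happens under pytest's per-test temporary directory.
    out = tmp_path / "symbols.json"
    out.write_text("{}", encoding="utf-8")
    assert out.read_text(encoding="utf-8") == "{}"


def test_openai_client_is_available() -> None:
    # Skips (rather than fails) when the [agents] extra is not installed.
    openai = pytest.importorskip("openai")
    assert hasattr(openai, "OpenAI")


@pytest.mark.network  # hypothetical marker name
def test_live_endpoint_is_reachable() -> None:
    """Placeholder for a network test; deselect it with -m "not network"."""
```

Calling pytest.importorskip inside the test, rather than at module level, keeps the other tests in the file runnable when the extra is missing.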

Code style

Style is enforced by ruff (lint and format) and mypy (static types). make fmt applies ruff’s auto-fixes; make check runs both tools and treats failures as errors. Key ruff settings from pyproject.toml:
Setting              Value
Line length          100
Target               Python 3.10
Selected rule sets   E, F, I (isort), UP, B, W
Ignored              E501 (long lines tolerated in data/tables), B008 (Typer’s Argument/Option-in-defaults pattern is intentional)
Type hints are required for all public functions and methods. mypy is configured with ignore_missing_imports = true and warn_unused_ignores = true. Match the annotation density of the file you are editing. If the surrounding code is fully annotated, your additions must be too.

Docstrings follow the same density rule: match what is already in the module. Most public functions have a one-line summary; complex functions include a short description of parameters and return value. Do not add docstrings to private helpers that lack them in existing code.

from __future__ import annotations appears at the top of every source file; keep that convention.
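
As an illustration of these conventions, a public function in this style might look like the following. The function is hypothetical and not taken from the WARDEN source; it only shows the future import, full annotations, and a one-line docstring together.

```python
from __future__ import annotations


def decode_uleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one unsigned LEB128 integer, returning (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated LEB128 value")
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, offset
        shift += 7
```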

The two invariants

Any change that touches identity/ or kb/ must preserve these properties.
The stable_id computed for a given function body must be identical across Python interpreter runs, platforms, and WARDEN versions. It is the primary key of the knowledge base: if it drifts, every annotation stored under the old key becomes orphaned.
If you change the fingerprint algorithm in a way that would alter existing stable_id values, you must write a migration in kb/ and bump the schema version. Include a test that verifies the new algorithm produces the same stable_id for all fixtures in tests/conftest.py (or documents the intentional remap if a migration is present).
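
To make the stability requirement concrete, here is a sketch of a fingerprint that is deterministic across runs and platforms, together with a golden-value test that would catch drift. It is not WARDEN's fingerprint algorithm: stable_id_sketch and the truncated SHA-256 scheme are assumptions chosen for illustration. The contrast to keep in mind is Python's built-in hash(), which is salted per process for bytes and str and so can never serve as a stored key.

```python
from __future__ import annotations

import hashlib


def stable_id_sketch(body: bytes) -> str:
    """Return a run-independent identifier derived only from the function body bytes."""
    # A cryptographic digest depends on nothing but its input, unlike hash().
    return hashlib.sha256(body).hexdigest()[:16]


# Golden values pinned in the test suite: any algorithm change that alters an
# existing id makes this test fail, which is the cue to write a migration.
GOLDEN: dict[bytes, str] = {
    b"": "e3b0c44298fc1c14",  # first 16 hex digits of SHA-256 of the empty input
}


def test_stable_id_matches_golden_values() -> None:
    for body, expected in GOLDEN.items():
        assert stable_id_sketch(body) == expected
```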
Every write to the symbols table must go through KnowledgeBase.upsert_symbol. That method enforces the rank/confidence ordering described in concepts: human writes are sovereign; agents may only fill empty slots or overwrite lower-confidence agent output; other automated sources resolve by rank and then confidence.
Do not insert or update the symbols table directly. A PR that bypasses upsert_symbol will not merge.
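
The precedence rule can be read as a single decision function. The sketch below is one interpretation of the sentence above, not the logic inside upsert_symbol: the Candidate type, the should_replace name, the source names other than human and agent, and all rank values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical ranks for illustration; the real ordering lives in kb/.
RANK: dict[str, int] = {"human": 100, "oracle": 50, "diff": 40, "agent": 10}


@dataclass(frozen=True)
class Candidate:
    source: str
    confidence: float


def should_replace(existing: Candidate | None, incoming: Candidate) -> bool:
    """Decide whether an incoming annotation may overwrite the stored one."""
    if existing is None:
        return True  # anyone may fill an empty slot
    if incoming.source == "human":
        return True  # human writes are sovereign
    if existing.source == "human":
        return False  # nothing automated overwrites a human
    if incoming.source == "agent":
        # Agents may only beat lower-confidence agent output.
        return existing.source == "agent" and incoming.confidence > existing.confidence
    # Other automated sources: compare by rank first, then by confidence.
    incoming_key = (RANK[incoming.source], incoming.confidence)
    existing_key = (RANK[existing.source], existing.confidence)
    return incoming_key > existing_key
```

Keeping the decision in one place is what makes the rule enforceable, which is why direct writes to the table are rejected.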

PR etiquette

1. Branch from main

Use a short, descriptive branch name: feat/diff-carry-weights, fix/leb128-signed-overflow, docs/mcp-tool-surface.
2. Keep commits focused

One logical change per commit. The commit message subject should use the imperative mood, so that it completes the sentence “This commit will…”. Squash fixup commits before opening the PR.
3. Pass make check

Run make check locally before pushing. CI runs the same target (ruff check, mypy, and pytest), so a local pass should mean a green CI run.
4. Include tests and docs

PRs that add behavior without tests will be asked to add them before merge. Bug-fix PRs should add a regression test. If your change affects a CLI flag, a KB schema column, a public API, or a documented pipeline stage, update the relevant file in docs/ or the inline help strings in cli.py.
5. Write a useful PR description

Explain why the change is needed, what approach you took, and how to verify it. Link any related issue. For large changes, open a draft PR early so the direction can be discussed before a lot of code is written.
Keep each PR to one feature.

Where to start

Good first contributions map to the early phases in the roadmap.

Parser completeness

src/warden/ingest/ handles the common section types. Edge-case sections, import table coverage, and data segments are good targets that do not require architecture changes.

Oracle corpus expansion

Adding Emscripten runtime signatures to seed_signatures.json in src/warden/oracle/ is high-value work with a clear interface and no architectural dependency.

Diff carry-over heuristics

The similarity classifier in src/warden/identity/ has room for better structural and type-sensitive weights. Improvements there sharpen both the Oracle and the diff.

Export formats

src/warden/export/ accepts new emitters: a Ghidra script export, DWARF-like annotation output, or a structured JSON schema for third-party tooling.

Test coverage

make cov shows what is not yet exercised. Any module below 80% line coverage is a reasonable, self-contained target.

Documentation

docs/LIMITATIONS.md tracks known gaps. Fixing a limitation and removing it from that list is a clean, contained contribution.
If you are unsure where to begin, open an issue on GitHub and ask.
Last modified on June 7, 2026