Everything you need.
Nothing you don't.
D.U.H. is designed around a clean hexagonal kernel: providers plug in at the edge, tools compose in the middle, and security wraps every layer.
Provider Freedom
Anthropic Claude, OpenAI API, ChatGPT Plus/Pro (Codex via OAuth), local Ollama, or a deterministic stub for CI. Switch providers at runtime: same harness, any model.
Security-First by Design
3-layer pluggable security: 13 vulnerability scanners, taint-propagating UntrustedStr, HMAC-bound confirmation tokens, lethal trifecta check, macOS Seatbelt + Linux Landlock sandboxing.
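To make the taint-propagation idea concrete, here is a minimal sketch of a string subclass that carries an origin label through a handful of operations. It is an illustration only, not D.U.H.'s actual `UntrustedStr`: the `origin` attribute, the constructor signature, and the helper `_propagating` are assumptions, and only concatenation, slicing, and a few methods are covered (f-strings and `%` formatting are not).

```python
class UntrustedStr(str):
    """A str that carries an origin label and re-applies it to derived strings.

    Hypothetical sketch; the real class may differ in every detail.
    """

    def __new__(cls, value, origin="unknown"):
        obj = super().__new__(cls, value)
        obj.origin = origin
        return obj

    def __add__(self, other):
        return UntrustedStr(str.__add__(self, other), self.origin)

    def __radd__(self, other):
        # Called for `plain + tainted`; the result inherits the taint.
        return UntrustedStr(str.__add__(other, self), self.origin)


def _propagating(name):
    """Wrap a plain str method so its result is re-tagged with the origin."""
    plain = getattr(str, name)

    def method(self, *args, **kwargs):
        return UntrustedStr(plain(self, *args, **kwargs), self.origin)

    method.__name__ = name
    return method


# Only a small, illustrative subset of string operations is wrapped here.
for _name in ("__getitem__", "upper", "lower", "strip", "replace"):
    setattr(UntrustedStr, _name, _propagating(_name))


fetched = UntrustedStr("rm -rf /", origin="web")
command = "echo " + fetched          # still tainted after concatenation
print(type(command).__name__, command.origin)
```

A tool wrapper could then refuse any argument for which `isinstance(arg, UntrustedStr)` is true until the user confirms it.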
28% Faster than Claude Code
45.7s average vs 63.2s for Claude Code on identical tasks with the same model. Direct Python process: no Node.js runtime, no Ink TUI overhead. 3/3 runs succeeded (100%) vs 2/3 (67%).
26 Built-in Tools
Read, Write, Edit, MultiEdit, Bash, Glob, Grep, WebSearch, WebFetch, Task (subagents), Docker, Database, HTTP, GitHub, LSP, TestImpact, MemoryStore, NotebookEdit, and more.
Rich TUI & REPL
Full Textual TUI, Rich-powered REPL with 21 slash commands including /connect (OAuth), /snapshot, /plan, /pr, /compact. Print mode for CI pipelines. SDK mode via NDJSON.
Claude Agent SDK Drop-in
Implements the full NDJSON streaming protocol. Drop duh anywhere claude is expected: same interface, any provider behind it. Verified end-to-end.
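For readers unfamiliar with NDJSON, the framing is simply one JSON object per line. The sketch below shows how a consumer might read and write such a stream in Python. The message fields (`type`, `text`) are placeholders chosen for illustration and are not the actual Claude Agent SDK schema.

```python
import io
import json


def write_ndjson(stream, message):
    """Serialize one message as a single line terminated by a newline."""
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")


def iter_ndjson(stream):
    """Yield one decoded JSON object per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Round-trip two hypothetical messages through an in-memory stream.
buffer = io.StringIO()
write_ndjson(buffer, {"type": "text", "text": "hello"})
write_ndjson(buffer, {"type": "done"})
buffer.seek(0)
for message in iter_ndjson(buffer):
    print(message["type"])
```

Because each line is independently parseable, a consumer can act on messages as they arrive instead of waiting for the whole response.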
Numbers don't lie.
3 independent runs per tool, same model (Claude Haiku 4.5), same prompt, same task (FastAPI URL shortener from spec), fully isolated directories.
| Metric | D.U.H. (n=3) | Claude Code (n=2 successful) |
|---|---|---|
| Avg completion time | 45.7s | 63.2s |
| Success rate | 3/3 (100%) | 2/3 (67%) |
| Avg LOC generated | 419 | 273 |
| Avg tests generated | 18 | 10.5 |
| All tests pass (on success) | 3/3 ✓ | 2/2 ✓ |
| Self-correction behavior | Yes (detected & fixed Pydantic issue) | Minimal |
| Runtime / startup | Python (direct) | Node.js + Ink TUI |
Methodology: Same Anthropic API key, same model (Haiku 4.5), same prompt verbatim, same TASK.md, separate directories, fresh git repo per run, --max-turns 15 on both tools. D.U.H. advantage is architectural simplicity: less startup overhead means more time for actual model interaction. See full benchmark report.
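The headline percentage can be reproduced from the two averages in the table. The short calculation below uses only those published figures; the variable names are ours. It shows that "28% faster" means roughly 28% less wall-clock time, which is the same as finishing about 1.38 times as quickly.

```python
duh_avg_s = 45.7           # average completion time for D.U.H. (from the table)
claude_code_avg_s = 63.2   # average completion time for Claude Code (from the table)

# Fraction of wall-clock time saved relative to Claude Code.
time_saved = (claude_code_avg_s - duh_avg_s) / claude_code_avg_s

# Equivalent speed-up factor.
speedup = claude_code_avg_s / duh_avg_s

print(f"time saved: {time_saved:.1%}")   # 27.7%
print(f"speed-up:   {speedup:.2f}x")     # 1.38x
```

With three runs per tool (and only two successful Claude Code runs contributing to its average), these figures are best read as indicative.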
Persistent agentic-swarm extension.
A layered runtime on top of D.U.H. that turns single-shot agent invocations into a persistent, event-driven swarm. One TOML file declares the topology; one host daemon runs forever.
Persistent host daemon
duh wave start spins up a long-running process. 10 subcommands: ls, inspect, pause, resume, logs --follow, install, uninstall, web, start, stop.
Event ingress
Webhook (HMAC-verified), filewatch, cron, MCP push, manual seam: all routing through one append-only TriggerLog. Each trigger spawns a Task by topology rule.
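The ingress path described above can be sketched as two pieces: a signature check on the raw webhook body, and a log that only ever grows. Everything named here (`verify_webhook`, `ingest_webhook`, the entry fields, and this in-memory `TriggerLog`) is a hypothetical stand-in; the actual implementation and signature header format are not documented on this page.

```python
import hashlib
import hmac
import json
import time


def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


class TriggerLog:
    """Append-only, in-memory log: entries are added but never changed or removed."""

    def __init__(self):
        self._entries = []

    def append(self, source, payload):
        entry = {
            "seq": len(self._entries),
            "source": source,
            "ts": time.time(),
            "payload": payload,
        }
        self._entries.append(entry)
        return entry

    def entries(self):
        # Hand out an immutable snapshot so callers cannot reorder the log.
        return tuple(self._entries)


def ingest_webhook(log, secret, body, signature_hex):
    """Record the event only if its signature verifies; otherwise drop it."""
    if not verify_webhook(secret, body, signature_hex):
        return None
    return log.append("webhook", json.loads(body))
```

Verifying before parsing means an unauthenticated body never reaches `json.loads`, and routing every source through one `append` call gives a single ordered record from which tasks can be spawned.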
RLM substrate
Bytes addressed by reference, never summarised. Bounded recursion with cycle detection. Cites Zhang/Kraska/Khattab arXiv 2512.24601.
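As a generic illustration of "bounded recursion with cycle detection" over data addressed by reference, the sketch below walks a small reference graph, returns the raw bytes untouched, and refuses to recurse past a depth limit or around a loop. It is not the algorithm from the cited paper and not D.U.H.'s code; the store layout, function name, and exception names are all assumptions.

```python
class CycleError(Exception):
    """Raised when a reference points back to one of its own ancestors."""


class DepthLimitError(Exception):
    """Raised when recursion would exceed the configured bound."""


def collect(store, ref, max_depth=8, _path=()):
    """Return the raw byte blobs reachable from `ref`, in depth-first order.

    `store` maps a reference id to a (blob, child_refs) pair. Blobs are
    passed through as-is; nothing is summarised or truncated.
    """
    if ref in _path:
        raise CycleError(" -> ".join(_path + (ref,)))
    if len(_path) >= max_depth:
        raise DepthLimitError(f"depth limit {max_depth} reached at {ref!r}")
    blob, children = store[ref]
    blobs = [blob]
    for child in children:
        blobs.extend(collect(store, child, max_depth, _path + (ref,)))
    return blobs


store = {
    "a": (b"A", ["b", "c"]),
    "b": (b"B", ["c"]),
    "c": (b"C", []),
}
print(collect(store, "a"))
```

Tracking the current path (rather than a global visited set) lets shared children be read more than once while still catching genuine loops.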
Cross-agent handles
Workers see selected REPL handles; results bind back as new handles. Cites Yang/Zou/Pan et al. arXiv 2604.25917.
The deepest security story in open-source AI agents.
Three independent layers that address every published agent RCE in the 2024-2026 CVE corpus. No other open-source AI coding agent comes close.
Layer 1 Vulnerability Monitoring
- 13 scanners across 3 tiers (Minimal / Extended / Paranoid)
- Project-file RCE scanner (CVE-2025-59536)
- MCP tool-poisoning scanner (CVE-2025-54136)
- Sandbox bypass scanner (CVE-2025-59532)
- Command injection scanner (CVE-2026-35022)
- OAuth hardening scanner
- SARIF output for GitHub Code Scanning
- Pre-push git hook installer
duh security scan · duh security init
Layer 2 Runtime Hardening
- Taint propagation: UntrustedStr tags every string by origin and propagates through all string ops
- Confirmation tokens: HMAC-bound, prevent tainted strings reaching Bash/Write/Edit without user confirmation
- Lethal trifecta check: blocks sessions with simultaneous read-private + read-untrusted + network-egress
- MCP Unicode normalization (GlassWorm defense)
- Per-hook filesystem namespacing
- PEP 578 audit hook bridge (sub-500ns)
- Signed plugin manifests (TOFU + sigstore-ready)
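Two of the runtime checks listed above lend themselves to a short sketch: a confirmation token bound by HMAC to one specific tool call, and the lethal trifecta test as a set comparison. The function names, the capability labels, and the per-session key are hypothetical; this is a minimal illustration of the technique, not D.U.H.'s implementation.

```python
import hashlib
import hmac
import secrets

# Assumed: a random key generated once per session and never shown to the model.
SESSION_KEY = secrets.token_bytes(32)


def issue_token(tool, argument, key=SESSION_KEY):
    """Bind a confirmation to one exact (tool, argument) pair."""
    message = f"{tool}\x00{argument}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def is_confirmed(tool, argument, token, key=SESSION_KEY):
    """A token approves only the call it was issued for."""
    expected = issue_token(tool, argument, key)
    return hmac.compare_digest(expected.encode(), token.encode())


# Hypothetical capability labels for the three risky properties.
TRIFECTA = frozenset({"read_private", "read_untrusted", "network_egress"})


def violates_trifecta(capabilities):
    """True when a session holds all three risky capabilities at once."""
    return TRIFECTA <= set(capabilities)


token = issue_token("Bash", "ls -la")
print(is_confirmed("Bash", "ls -la", token))      # the approved call
print(is_confirmed("Bash", "rm -rf /", token))    # a different call, same token
print(violates_trifecta({"read_private", "network_egress"}))
```

Because the token is a MAC over the tool name and argument, approval for one command cannot be replayed for another, and text injected into the model's context cannot forge a token without the session key.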
Layer 3 OS Sandboxing
- macOS Seatbelt: sandbox-exec profiles for shell commands and MCP servers
- Linux Landlock: syscall-level filesystem access control
- Network policy layer: blocks outbound traffic unless explicitly allowed
- Approval modes: suggest/auto-edit/full-auto
- MCP subprocess sandboxing via host OS facilities
- Provider differential fuzzer (Hypothesis property tests)
Any model. Zero lock-in.
D.U.H.'s kernel never imports a provider directly. Providers are adapters at the edge: swap them without touching your workflow.
Anthropic Claude
API key or /connect
OpenAI API
GPT-4o, o1, o3
ChatGPT / Codex
PKCE OAuth, no API key
Ollama
Local, any pulled model
Stub Provider
Deterministic CI testing
litellm / OpenRouter
100+ models via proxy
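The "adapters at the edge" arrangement is the ports-and-adapters pattern: the kernel depends on an interface, and each provider implements it. The sketch below shows the shape of that idea with Python's `typing.Protocol`. The `Provider` interface, its `complete` method, and the `Kernel`, `StubProvider`, and `UppercaseProvider` classes are hypothetical names for illustration, not D.U.H.'s real API.

```python
from typing import Protocol


class Provider(Protocol):
    """The port: the only thing the kernel knows about a model backend."""

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Deterministic adapter that returns canned replies, useful in CI."""

    def __init__(self, canned):
        self._canned = dict(canned)

    def complete(self, prompt: str) -> str:
        return self._canned.get(prompt, "")


class UppercaseProvider:
    """A second toy adapter, standing in for any other backend."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class Kernel:
    """Depends only on the Provider port, never on a concrete adapter."""

    def __init__(self, provider: Provider):
        self.provider = provider

    def run(self, prompt: str) -> str:
        return self.provider.complete(prompt)


kernel = Kernel(StubProvider({"ping": "pong"}))
print(kernel.run("ping"))
kernel.provider = UppercaseProvider()   # swap the adapter at runtime
print(kernel.run("ping"))
```

Since the kernel code never names a concrete class, adding a backend means writing one adapter; nothing inside the kernel changes.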
Up in 60 seconds.
    # Install
    pip install duh-cli

    # Set your Anthropic key
    export ANTHROPIC_API_KEY=sk-ant-...

    # Run a task (print mode, great for scripts)
    duh -p "fix the bug in auth.py"

    # Interactive REPL
    duh

    # Use a different model / provider
    duh --provider openai --model gpt-4o -p "refactor db.py"

    # Local model with Ollama
    duh --provider ollama --model qwen2.5-coder -p "write tests"

    # Run diagnostics
    duh doctor

    # Security scan (SARIF output)
    duh security scan