D.U.H. vs Claude Code vs Codex vs OpenCode

Benchmark: D.U.H. vs Claude Code

Task: Build a FastAPI URL shortener from spec (TASK.md) · Model: Claude Haiku 4.5 · 3 independent runs each · Same API key, same prompt, same max-turns (15).

Raw Results

Run	Tool	Time	Files	LOC	Tests	Pass Rate
1	Claude Code	61.3s	3	227	9/9	100%
2	Claude Code	65.1s	3	319	12/12	100%
3	Claude Code	50.7s	0	0	—	FAIL (no output)
1	D.U.H.	52.6s	3	475	24/24	100%
2	D.U.H.	48.8s	3	446	18/18	100%
3	D.U.H.	35.6s	3	335	12/12	100%

Aggregated (Claude Code Run 3 excluded from averages: tool failure; it still counts toward success rate)

Metric	D.U.H. (n=3)	Claude Code (n=2)
Avg completion time	45.7s	63.2s
Avg LOC	419	273
Avg tests generated	18	10.5
Success rate	3/3 (100%)	2/3 (67%)
All tests pass (on success)	3/3 ✓	2/2 ✓

Analysis

D.U.H. is ~28% faster on average (45.7s vs 63.2s). The gap is most likely startup overhead: Claude Code loads its Ink TUI, plugin system, and Node.js runtime, while D.U.H. runs as a direct Python process with no UI overhead in print mode. D.U.H. also showed self-correction behavior in Run 1 (it detected a Pydantic URL normalization issue, fixed the test, and re-ran), which helps explain that run's higher LOC and test count. Both tools achieved 100% test pass rates on successful runs, as expected given that the same model generates the code in both cases.
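
The aggregate figures can be reproduced from the raw results table. The snippet below is a small sketch of that arithmetic; the tuple layout and the helper name `aggregate` are illustrative choices, not part of either tool.

```python
from statistics import mean

# Successful runs copied from the raw results table: (tool, seconds, LOC, tests).
# Claude Code Run 3 produced no output, so it is left out of the averages.
RUNS = [
    ("Claude Code", 61.3, 227, 9),
    ("Claude Code", 65.1, 319, 12),
    ("D.U.H.", 52.6, 475, 24),
    ("D.U.H.", 48.8, 446, 18),
    ("D.U.H.", 35.6, 335, 12),
]


def aggregate(tool):
    """Average time, LOC, and test count over one tool's successful runs."""
    rows = [r for r in RUNS if r[0] == tool]
    return {
        "time": round(mean([r[1] for r in rows]), 1),
        "loc": round(mean([r[2] for r in rows])),
        "tests": round(mean([r[3] for r in rows]), 1),
    }


duh = aggregate("D.U.H.")
cc = aggregate("Claude Code")

# Relative time saving of D.U.H. against the Claude Code average, in percent.
speedup_pct = round((cc["time"] - duh["time"]) / cc["time"] * 100, 1)
print(duh, cc, speedup_pct)
```

The saving works out to 27.7%, which the analysis rounds to ~28%.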

Security Comparison

D.U.H. is ahead of every competing open-source agent on security depth. This is by design — ADRs 053 and 054 address every published 2024–2026 agent CVE.

Security Feature	D.U.H.	Claude Code	Codex	OpenCode
Vulnerability scanner framework	13 scanners, 3 tiers	None	None	None
Project-file RCE scanner (CVE-2025-59536)	✓	—	—	—
MCP tool-poisoning scanner (CVE-2025-54136)	✓	—	—	—
Sandbox bypass scanner (CVE-2025-59532)	✓	—	—	—
SARIF output (GitHub Code Scanning)	✓	—	—	—
Taint propagation (UntrustedStr)	Unique in OSS AI agents	None	None	None
Confirmation tokens (HMAC-bound)	✓	—	—	—
Lethal trifecta check	✓	—	—	—
MCP Unicode normalization (GlassWorm)	✓ (NFKC + bidi/tag rejection)	—	—	—
MCP hash-pinning (MCPoison defense)	✓	—	—	—
Per-hook filesystem namespacing	✓	—	—	—
PEP 578 audit hook bridge	✓ (sub-500ns)	—	—	—
Signed plugin manifests	TOFU + sigstore-ready	✓	Partial	—
Plugin trust store + revocation	✓	—	—	—
macOS Seatbelt sandboxing	✓	Partial	—	—
Linux Landlock sandboxing	✓	—	✓	—
Provider differential fuzzer	✓ (Hypothesis)	—	—	—
CVE replay fixtures	4 CVEs	—	—	—
Security test count	330+	Unknown (internal)	Unknown	Unknown
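
The table credits D.U.H. with taint propagation through a type named UntrustedStr, but this page does not show how it works. The following is a minimal sketch of the general technique, assuming a `str` subclass whose taint survives concatenation and a few transforms. The set of overridden methods and the helpers `is_tainted` and `guard_shell_command` are hypothetical and may differ from D.U.H.'s actual design.

```python
class UntrustedStr(str):
    """A str that stays marked as untrusted through common operations (sketch)."""

    def __add__(self, other):
        return UntrustedStr(str.__add__(self, other))

    def __radd__(self, other):
        # Called for `plain + tainted`; the result must stay tainted.
        return UntrustedStr(str.__add__(other, self))

    def strip(self, chars=None):
        return UntrustedStr(str.strip(self, chars))

    def lower(self):
        return UntrustedStr(str.lower(self))


def is_tainted(value):
    """True if the value carries the untrusted marker."""
    return isinstance(value, UntrustedStr)


def guard_shell_command(command):
    """Hypothetical sink check: refuse commands built from untrusted text."""
    if is_tainted(command):
        raise PermissionError("command derived from untrusted input")
    return command


fetched = UntrustedStr("  rm -rf /  ")   # e.g. text returned by a web fetch
command = "echo " + fetched.strip()      # taint survives strip() and concatenation
print(is_tainted(command))
```

A sketch this small leaks taint through any method it does not override (f-strings and `str.format`, for example, return a plain `str`), so a real implementation would need to cover far more of the `str` API.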

Tool Comparison

D.U.H. ships 26 built-in tools — more than any competing open-source agent. Legend: Y = fully implemented, P = partial, N = not implemented.

Tool	D.U.H.	Claude Code	Codex	OpenCode
Read	Y	Y	Y	Y
Write	Y	Y	Y	Y
Edit (exact string match)	Y	Y	Y	Y
MultiEdit	Y	Y	N	N
Bash (subprocess)	Y	Y	Y	Y
Glob	Y	Y	Y	Y
Grep	Y	Y	Y	Y
WebSearch (Serper + Tavily)	Y	Y	Y	N
WebFetch (taint-tagged)	Y	Y	N	N
Task (subagent spawn)	Y (4 types)	Y (60+ types)	Y	N
Skill (invoke skills)	Y	Y	Y	N
ToolSearch (deferred tools)	Y	Y	N	N
NotebookEdit	Y	Y	Y	N
LSP integration	Y	Y	Y	Y
Docker	Y	N	Y	N
Database	Y	N	N	N
HTTP	Y	N	N	N
GitHub (PR/issues)	Y	Y	Y	N
TestImpact	Y	N	N	N
TodoWrite	Y	Y	N	N
AskUserQuestion	Y	Y	Y	N
EnterWorktree / ExitWorktree	Y	Y	N	N
MemoryStore / MemoryRecall	Y	Y	N	N
Total tool count	26	25+	~15	~10

Provider Support

D.U.H. supports 5 providers including ChatGPT Plus/Pro via OAuth (no API key required). The kernel is provider-agnostic — zero provider imports in core code.

Provider	D.U.H.	Claude Code	Codex	OpenCode
Anthropic (Claude)	✓ API key or /connect	✓ (native)	—	✓
OpenAI API (key)	✓ GPT-4o, o1, o3	—	✓	✓
ChatGPT Plus/Pro — Codex (OAuth)	✓ PKCE OAuth (ADR-051/052)	—	✓ (native)	—
Ollama (local models)	✓ any pulled model	—	✓	—
Stub (deterministic)	✓ DUH_STUB_PROVIDER=1	—	—	—
Google Gemini	Planned	—	—	✓
AWS Bedrock	Planned	✓	—	✓
Azure OpenAI	Planned	—	—	✓
Provider auto-detection	✓ model name inference	—	—	✓
Total providers	5	1 (Anthropic)	2	5
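
The "Provider auto-detection" row describes inference from the model name. One way this could look is sketched below. The prefix rules, the fallback to Ollama, and the function name `infer_provider` are assumptions made for illustration; only the `DUH_STUB_PROVIDER=1` switch and the model families come from the table above.

```python
import os


def infer_provider(model, env=None):
    """Guess a provider from a model name (illustrative rules only)."""
    env = os.environ if env is None else env

    # The stub provider is selected by environment variable, per the table.
    # Giving it precedence over the model name is an assumption.
    if env.get("DUH_STUB_PROVIDER") == "1":
        return "stub"

    name = model.lower()
    if name.startswith("claude"):
        return "anthropic"
    if name.startswith(("gpt-", "o1", "o3")):
        return "openai"

    # Assumption: any other name refers to a locally pulled Ollama model.
    return "ollama"


print(infer_provider("claude-haiku-4-5", env={}))
```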

ChatGPT OAuth: D.U.H. is the only Python agent that supports ChatGPT Plus/Pro subscription via PKCE OAuth against auth.openai.com — meaning you can use Codex-family models without paying for a separate API key. Tokens are stored in ~/.config/duh/auth.json (0600 permissions). See ADR-051 and ADR-052.
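
For readers unfamiliar with PKCE, the client-side part of the flow amounts to generating a random verifier and sending its SHA-256 digest as the challenge (the S256 method in RFC 7636). The sketch below shows only that step, using the standard library. The function names are illustrative, and nothing here reflects D.U.H.'s actual code or the OpenAI endpoints it talks to.

```python
import base64
import hashlib
import secrets


def _b64url(raw):
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_verifier():
    """High-entropy verifier: 32 random bytes encode to 43 URL-safe characters."""
    return _b64url(secrets.token_bytes(32))


def make_code_challenge(verifier):
    """S256 challenge: base64url(SHA-256(verifier))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
# The challenge goes in the authorization request; the verifier is sent later
# with the token request so the server can check that the two match.
print(len(verifier), len(challenge))
```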

Architecture Comparison

Project statistics and architectural dimensions across agents. All D.U.H. numbers are independently verified.

Project Statistics

Metric	D.U.H.	Claude Code	Codex	OpenCode
Language	Python	TypeScript	Rust	Go
Source LOC	24,327	~512,000	~603,000	~42,000
Test LOC	55,438	Internal	Unknown	Unknown
Tests passing	4,160	Internal	Unknown	Unknown
Test:code ratio	2.3:1	Unknown	Unknown	Unknown
Coverage	100%	Unknown	Unknown	Unknown
ADRs	54	Proprietary	Proprietary	0
License	Apache 2.0	Proprietary	Apache 2.0	MIT
Status	Production alpha	Production	Production	Archived

Architectural Dimensions

Dimension	D.U.H.	Claude Code	Codex	OpenCode
Architecture style	Hexagonal (ports & adapters)	Monolith + React hooks	Workspace (78 Rust crates)	Go packages
Kernel isolation	Strict (zero provider imports)	Entangled with React	Strict (trait boundaries)	Partial
Dependency injection	Explicit (Deps dataclass)	React context/hooks	Rust traits	Go interfaces
Test strategy	Unit + integration + property + benchmark	Internal	Unknown	Unknown
Error handling	Events + metadata flags	Events + React error boundaries	Rust Result types	Go errors
Security depth	3-layer (scanning + taint + sandbox)	Pattern-based	Sandbox-focused	Minimal
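
The hexagonal style and explicit dependency injection listed for D.U.H. can be illustrated with a small sketch. Only the name `Deps` and the idea of a kernel with zero provider imports come from the table. The `ModelPort` protocol, the `StubAdapter` class, the `run_turn` function, and the single `model` field are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


class ModelPort(Protocol):
    """Port: the only view of a provider that kernel code is allowed to see."""

    def complete(self, prompt: str) -> Iterator[str]: ...


@dataclass(frozen=True)
class Deps:
    """Explicit dependency bundle passed into the kernel (fields assumed)."""

    model: ModelPort


class StubAdapter:
    """Adapter: a deterministic provider, useful for tests."""

    def complete(self, prompt: str) -> Iterator[str]:
        yield "echo: "
        yield prompt


def run_turn(deps: Deps, prompt: str) -> str:
    # Kernel logic depends on the port only, never on a concrete provider SDK.
    return "".join(deps.model.complete(prompt))


print(run_turn(Deps(model=StubAdapter()), "hello"))
```

Swapping providers then means constructing `Deps` with a different adapter; the kernel function itself does not change.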

Full Feature Matrix

Legend: Y = Fully implemented · P = Partial (works but incomplete) · S = Scaffolded (not functional) · N = Not implemented

Snapshot as of April 2026 · See full document on GitHub

Feature	D.U.H.	Claude Code	Codex	OpenCode
Core Loop
Multi-turn agentic loop	Y	Y	Y	Y
Streaming text output	Y	Y	Y	Y
Thinking/reasoning blocks	Y	Y	N	N
Error recovery in loop	Y	Y	Y	P
MCP Integration
MCP client (stdio/SSE/HTTP/WS)	Y (4 transports)	Y	Y	N
MCP tool discovery & execution	Y	Y	Y	N
MCP Unicode normalization	Y (GlassWorm)	N	N	N
MCP subprocess sandboxing	Y (Seatbelt/Landlock)	N	N	N
MCP hash-pinning (MCPoison)	Y	N	N	N
Session & Context
Session persistence (disk)	Y	Y	Y	Y (SQLite)
Session resume (--continue)	Y	Y	Y	Y
Context compaction (truncation)	Y	Y	Y	Y
Context compaction (LLM summary)	Y (ModelCompactor)	Y (16 modules)	Y	N
CLI & Interface
Print mode (-p)	Y	Y	Y	N (REPL only)
Interactive REPL	Y (Rich, 21 commands)	Y (Ink, 114 commands)	Y (Bubble Tea)	Y (Bubble Tea)
/plan, /undo, /snapshot	Y	Y	N	N
/connect (OAuth flow)	Y	N	N	N
Doctor diagnostics	Y (duh doctor)	N	N	N
Security CLI (scan/init/exception)	Y	N	N	N
SDK & Hooks
Claude Agent SDK drop-in	Y (verified e2e)	Y (IS the SDK target)	N	N
stream-json NDJSON protocol	Y	Y (native)	N	N
Total hook event types	29	29+	Unknown	0
Hook blocking (input rewrite)	Y (ADR-045)	Y	P	N
Plugins & Skills
Skill loading (.claude/skills/)	Y (parity with CC)	Y	Y	N
Plugin discovery (entry_points)	Y	Y	Y	N
Signed plugin manifests	Y (TOFU + sigstore)	Y	P	N
Property-based testing	Y (Hypothesis)	N	N	N
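
The MCP Unicode normalization row is summarized in the security comparison as "NFKC + bidi/tag rejection". A minimal sketch of that combination follows, assuming the check runs on tool names and descriptions before they reach the model. The function names and the exact character sets are illustrative, not taken from D.U.H.

```python
import unicodedata

# Bidirectional control characters: marks, embeddings, overrides, isolates.
BIDI_CONTROLS = {
    "\u061c", "\u200e", "\u200f",
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def is_tag_char(ch):
    """True for the invisible Unicode Tags block, U+E0000 to U+E007F."""
    return 0xE0000 <= ord(ch) <= 0xE007F


def sanitize_tool_text(text):
    """Reject hidden control characters, then fold lookalikes with NFKC."""
    for ch in text:
        if ch in BIDI_CONTROLS or is_tag_char(ch):
            raise ValueError(f"rejected hidden character U+{ord(ch):04X}")
    return unicodedata.normalize("NFKC", text)


# Fullwidth letters and the "fi" ligature fold to plain ASCII under NFKC.
print(sanitize_tool_text("\uff52\uff45\uff41\uff44\uff3f\ufb01le"))
```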

Where D.U.H. is ahead of Claude Code: provider support (5 vs 1), pluggable security scanning (13 scanners), taint propagation (UntrustedStr), confirmation token gating, lethal trifecta check, MCP hash-pinning + Unicode normalization, PEP 578 audit hook bridge, signed plugin manifests with TOFU trust, provider differential fuzzer, and full documentation in 54 public ADRs.
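
Confirmation token gating is described in the security table as HMAC-bound, without further detail. One plausible reading is that an approval token is computed over the exact tool call, so it cannot be reused for a different command. The sketch below assumes that reading. The canonical JSON encoding, the function names, and the choice of SHA-256 are all assumptions rather than D.U.H.'s documented scheme.

```python
import hashlib
import hmac
import json
import secrets


def _canonical(tool, args):
    """Stable byte encoding of a tool call, independent of dict key order."""
    payload = {"tool": tool, "args": args}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_token(key, tool, args):
    """Token handed out when the user approves this exact call."""
    return hmac.new(key, _canonical(tool, args), hashlib.sha256).hexdigest()


def verify_token(key, token, tool, args):
    """Constant-time check that the token matches the call being executed."""
    expected = issue_token(key, tool, args)
    return hmac.compare_digest(expected, token)


session_key = secrets.token_bytes(32)
token = issue_token(session_key, "Bash", {"command": "ls"})
print(verify_token(session_key, token, "Bash", {"command": "ls"}))
```

Under this scheme a token approved for `ls` fails verification if the arguments are later changed, which is the property that "bound" suggests.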