Honest assessment · No marketing fluff

D.U.H. vs the field

Detailed feature-by-feature comparison with Claude Code, Codex, OpenCode, and others. Based on real code counts and direct testing, not marketing claims.

Benchmark: D.U.H. vs Claude Code

Task: Build a FastAPI URL shortener from spec (TASK.md) · Model: Claude Haiku 4.5 · 3 independent runs each · Same API key, same prompt, same max-turns (15).

Raw Results

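| Run | Tool | Time | Files | LOC | Tests (passed/total) | Pass Rate |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Claude Code | 61.3s | 3 | 227 | 9/9 | 100% |
| 2 | Claude Code | 65.1s | 3 | 319 | 12/12 | 100% |
| 3 | Claude Code | 50.7s | 0 | 0 | - | FAIL (no output) |
| 1 | D.U.H. | 52.6s | 3 | 475 | 24/24 | 100% |
| 2 | D.U.H. | 48.8s | 3 | 446 | 18/18 | 100% |
| 3 | D.U.H. | 35.6s | 3 | 335 | 12/12 | 100% |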

Aggregated (CC Run 3 excluded: tool failure)

| Metric | D.U.H. (n=3) | Claude Code (n=2) |
| --- | --- | --- |
| Avg completion time | 45.7s | 63.2s |
| Avg LOC | 419 | 273 |
| Avg tests generated | 18 | 10.5 |
| Success rate | 3/3 (100%) | 2/3 (67%) |
| All tests pass (on success) | 3/3 ✓ | 2/2 ✓ |

Analysis

D.U.H. is ~28% faster on average (45.7s vs 63.2s). The gap is startup overhead: Claude Code loads its Ink TUI, plugin system, and Node.js runtime, while D.U.H. is a direct Python process with no UI overhead in print mode. D.U.H. also showed self-correction behavior in Run 1 (it detected a Pydantic URL normalization issue, fixed the test, and re-ran), which is why that run produced more LOC and tests. Both tools achieved 100% test pass rates on successful runs; the same model generates the code in both cases.
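The aggregated figures can be re-derived from the raw results. The sketch below copies the five successful runs from the Raw Results table and recomputes the averages and the relative speed difference; the variable and function names are illustrative only and are not part of D.U.H.

```python
from statistics import mean

# Successful runs copied from the Raw Results table: (tool, seconds, LOC, tests).
# Claude Code Run 3 is left out because it produced no output.
runs = [
    ("Claude Code", 61.3, 227, 9),
    ("Claude Code", 65.1, 319, 12),
    ("D.U.H.", 52.6, 475, 24),
    ("D.U.H.", 48.8, 446, 18),
    ("D.U.H.", 35.6, 335, 12),
]


def averages(tool):
    """Average time, LOC, and test count over one tool's successful runs."""
    rows = [r for r in runs if r[0] == tool]
    return {
        "time_s": round(mean(r[1] for r in rows), 1),
        "loc": round(mean(r[2] for r in rows)),
        "tests": round(mean(r[3] for r in rows), 1),
    }


duh = averages("D.U.H.")
cc = averages("Claude Code")

# Relative time saved by D.U.H., measured against the Claude Code average.
speedup_pct = round(100 * (cc["time_s"] - duh["time_s"]) / cc["time_s"], 1)

print(duh, cc, speedup_pct)
```

With only three runs per tool, these averages are a rough indication, not a statistically robust measurement.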

Security Comparison

D.U.H. is ahead of every competing open-source agent on security depth. This is by design: ADRs 053 and 054 address every published 2024–2026 agent CVE.

| Security Feature | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Vulnerability scanner framework | 13 scanners, 3 tiers | None | None | None |
| Project-file RCE scanner (CVE-2025-59536) | ✓ | - | - | - |
| MCP tool-poisoning scanner (CVE-2025-54136) | ✓ | - | - | - |
| Sandbox bypass scanner (CVE-2025-59532) | ✓ | - | - | - |
| SARIF output (GitHub Code Scanning) | ✓ | - | - | - |
| Taint propagation (UntrustedStr) | Unique in OSS AI agents | None | None | None |
| Confirmation tokens (HMAC-bound) | ✓ | - | - | - |
| Lethal trifecta check | ✓ | - | - | - |
| MCP Unicode normalization (GlassWorm) | ✓ (NFKC + bidi/tag rejection) | - | - | - |
| MCP hash-pinning (MCPoison defense) | ✓ | - | - | - |
| Per-hook filesystem namespacing | ✓ | - | - | - |
| PEP 578 audit hook bridge | ✓ (sub-500ns) | - | - | - |
| Signed plugin manifests | TOFU + sigstore-ready | ✓ | Partial | - |
| Plugin trust store + revocation | ✓ | - | - | - |
| macOS Seatbelt sandboxing | ✓ | Partial | - | - |
| Linux Landlock sandboxing | ✓ | - | ✓ | - |
| Provider differential fuzzer | ✓ (Hypothesis) | - | - | - |
| CVE replay fixtures | 4 CVEs | - | - | - |
| Security test count | 330+ | Unknown (internal) | Unknown | Unknown |
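To make the taint propagation row more concrete, here is a minimal sketch of how a tainted string type could work. It is not D.U.H.'s actual UntrustedStr implementation: the `source` attribute, the set of wrapped methods, and the `require_trusted` guard are all assumptions made for illustration.

```python
class UntrustedStr(str):
    """Hypothetical sketch: a str that remembers it came from an untrusted source."""

    def __new__(cls, value, source="unknown"):
        obj = super().__new__(cls, value)
        obj.source = source  # assumed attribute: where the text came from
        return obj

    def __add__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        return UntrustedStr(str.__add__(self, other), self.source)

    def __radd__(self, other):
        # Called for "plain" + tainted, so the result stays tainted.
        if not isinstance(other, str):
            return NotImplemented
        return UntrustedStr(str.__add__(other, self), self.source)


def _propagating(name):
    """Wrap a str method so that its result keeps the taint."""
    def method(self, *args, **kwargs):
        result = getattr(str, name)(self, *args, **kwargs)
        return UntrustedStr(result, self.source)
    method.__name__ = name
    return method


# Only a few methods are wrapped here; a real implementation would cover more.
for _name in ("strip", "lower", "upper", "replace"):
    setattr(UntrustedStr, _name, _propagating(_name))


def require_trusted(value):
    """Hypothetical guard placed in front of a dangerous sink such as a shell."""
    if isinstance(value, UntrustedStr):
        raise PermissionError(f"refusing tainted input from {value.source}")
    return value
```

In this sketch, text fetched from the web and then concatenated into a shell command is still an `UntrustedStr`, so `require_trusted` rejects it. F-strings and `str.join` return plain `str` here, so a production version would need to handle those paths as well.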

Tool Comparison

D.U.H. ships 26 built-in tools, more than any competing open-source agent. Legend: Y = fully implemented, P = partial, N = not implemented.

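| Tool | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Read | Y | Y | Y | Y |
| Write | Y | Y | Y | Y |
| Edit (exact string match) | Y | Y | Y | Y |
| MultiEdit | Y | Y | N | N |
| Bash (subprocess) | Y | Y | Y | Y |
| Glob | Y | Y | Y | Y |
| Grep | Y | Y | Y | Y |
| WebSearch (Serper + Tavily) | Y | Y | Y | N |
| WebFetch (taint-tagged) | Y | Y | N | N |
| Task (subagent spawn) | Y (4 types) | Y (60+ types) | Y | N |
| Skill (invoke skills) | Y | Y | Y | N |
| ToolSearch (deferred tools) | Y | Y | N | N |
| NotebookEdit | Y | Y | Y | N |
| LSP integration | Y | Y | Y | Y |
| Docker | Y | N | Y | N |
| Database | Y | N | N | N |
| HTTP | Y | N | N | N |
| GitHub (PR/issues) | Y | Y | Y | N |
| TestImpact | Y | N | N | N |
| TodoWrite | Y | Y | N | N |
| AskUserQuestion | Y | Y | Y | N |
| EnterWorktree / ExitWorktree | Y | Y | N | N |
| MemoryStore / MemoryRecall | Y | Y | N | N |
| Total tool count | 26 | 25+ | ~15 | ~10 |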

Provider Support

D.U.H. supports 5 providers, including ChatGPT Plus/Pro via OAuth (no API key required). The kernel is provider-agnostic: there are zero provider imports in core code.

| Provider | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Anthropic (Claude) | ✓ API key or /connect | ✓ (native) | - | ✓ |
| OpenAI API (key) | ✓ GPT-4o, o1, o3 | - | ✓ | ✓ |
| ChatGPT Plus/Pro, Codex (OAuth) | ✓ PKCE OAuth (ADR-051/052) | - | ✓ (native) | - |
| Ollama (local models) | ✓ any pulled model | - | ✓ | - |
| Stub (deterministic) | ✓ DUH_STUB_PROVIDER=1 | - | - | - |
| Google Gemini | Planned | - | - | ✓ |
| AWS Bedrock | Planned | ✓ | - | ✓ |
| Azure OpenAI | Planned | - | - | ✓ |
| Provider auto-detection | ✓ model name inference | - | - | ✓ |
| Total providers | 5 | 1 (Anthropic) | 2 | 5 |
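Provider auto-detection by model name inference could look roughly like the sketch below. The prefix table, the `infer_provider` function, and the fallback to Ollama for unrecognized names are assumptions for illustration; only the `DUH_STUB_PROVIDER=1` switch comes from the table above, and D.U.H.'s real rules may differ.

```python
import os

# Assumed prefix rules, not D.U.H.'s actual mapping.
PREFIX_TO_PROVIDER = (
    ("claude", "anthropic"),
    ("gpt-", "openai"),
    ("o1", "openai"),
    ("o3", "openai"),
)


def infer_provider(model, env=None):
    """Guess which provider adapter should serve a given model name."""
    env = os.environ if env is None else env

    # The deterministic stub takes priority when explicitly enabled.
    if env.get("DUH_STUB_PROVIDER") == "1":
        return "stub"

    name = model.lower()
    for prefix, provider in PREFIX_TO_PROVIDER:
        if name.startswith(prefix):
            return provider

    # Assumption: anything unrecognized is treated as a locally pulled model.
    return "ollama"
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.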

ChatGPT OAuth: D.U.H. is the only Python agent that supports ChatGPT Plus/Pro subscriptions via PKCE OAuth against auth.openai.com, meaning you can use Codex-family models without paying for a separate API key. Tokens are stored in ~/.config/duh/auth.json (0600 permissions). See ADR-051 and ADR-052.
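For readers unfamiliar with PKCE, the sketch below shows the two pieces mentioned above: generating a verifier/challenge pair with the S256 method from RFC 7636, and writing a token file with 0600 permissions. The function names and token layout are hypothetical; the authorization endpoint, client ID, and D.U.H.'s actual storage code are not shown in this document and are deliberately left out.

```python
import base64
import hashlib
import json
import os
import secrets


def make_pkce_pair():
    """Return (code_verifier, code_challenge) using the S256 method."""
    # 64 random bytes encode to 86 URL-safe characters (allowed range: 43-128).
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url without padding, as required for the challenge.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def save_tokens(path, tokens):
    """Write tokens as JSON so that only the owner can read or write the file."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, mode=0o700, exist_ok=True)
    # Create the file with 0600 from the start instead of tightening it later.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(tokens, fh)
    # Also fix the mode if the file already existed with looser permissions.
    os.chmod(path, 0o600)
```

The client sends the challenge with the authorization request and the verifier with the token exchange, so an intercepted authorization code is useless without the verifier.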

Architecture Comparison

Project statistics and architectural dimensions across agents. All D.U.H. numbers are independently verified.

Project Statistics

| Metric | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Language | Python | TypeScript | Rust | Go |
| Source LOC | 24,327 | ~512,000 | ~603,000 | ~42,000 |
| Test LOC | 55,438 | Internal | Unknown | Unknown |
| Tests passing | 4,160 | Internal | Unknown | Unknown |
| Test:code ratio | 2.3:1 | Unknown | Unknown | Unknown |
| Coverage | 100% | Unknown | Unknown | Unknown |
| ADRs | 54 | Proprietary | Proprietary | 0 |
| License | Apache 2.0 | Proprietary | Apache 2.0 | MIT |
| Status | Production alpha | Production | Production | Archived |

Architectural Dimensions

| Dimension | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Architecture style | Hexagonal (ports & adapters) | Monolith + React hooks | Workspace (78 Rust crates) | Go packages |
| Kernel isolation | Strict (zero provider imports) | Entangled with React | Strict (trait boundaries) | Partial |
| Dependency injection | Explicit (Deps dataclass) | React context/hooks | Rust traits | Go interfaces |
| Test strategy | Unit + integration + property + benchmark | Internal | Unknown | Unknown |
| Error handling | Events + metadata flags | Events + React error boundaries | Rust Result types | Go errors |
| Security depth | 3-layer (scanning + taint + sandbox) | Pattern-based | Sandbox-focused | Minimal |
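The "hexagonal" and "explicit Deps dataclass" entries can be illustrated with a small sketch. Everything here (`ModelPort`, the fields of `Deps`, `StubModel`, `kernel_turn`) is a hypothetical shape chosen for the example, not D.U.H.'s real kernel API. The point is only that the kernel depends on a port and receives concrete adapters from outside.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class ModelPort(Protocol):
    """Port: the only thing the kernel knows about a model provider."""

    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class Deps:
    """Everything the kernel needs, passed in explicitly by the caller."""

    model: ModelPort
    run_tool: Callable[[str, str], str]


class StubModel:
    """Adapter: a deterministic stand-in for a real provider."""

    def complete(self, prompt: str) -> str:
        return f"stub:{prompt}"


def kernel_turn(deps: Deps, prompt: str) -> str:
    """One simplified turn: ask the model, then hand its reply to a tool."""
    reply = deps.model.complete(prompt)
    return deps.run_tool("echo", reply)
```

Because `kernel_turn` imports no provider module, swapping the stub for a real adapter only changes how `Deps` is constructed, which is also what makes deterministic tests straightforward.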

Full Feature Matrix

Legend: Y = Fully implemented · P = Partial (works but incomplete) · S = Scaffolded (not functional) · N = Not implemented

Snapshot as of April 2026 · See full document on GitHub

| Feature | D.U.H. | Claude Code | Codex | OpenCode |
| --- | --- | --- | --- | --- |
| Core Loop | | | | |
| Multi-turn agentic loop | Y | Y | Y | Y |
| Streaming text output | Y | Y | Y | Y |
| Thinking/reasoning blocks | Y | Y | N | N |
| Error recovery in loop | Y | Y | Y | P |
| MCP Integration | | | | |
| MCP client (stdio/SSE/HTTP/WS) | Y (4 transports) | Y | Y | N |
| MCP tool discovery & execution | Y | Y | Y | N |
| MCP Unicode normalization | Y (GlassWorm) | N | N | N |
| MCP subprocess sandboxing | Y (Seatbelt/Landlock) | N | N | N |
| MCP hash-pinning (MCPoison) | Y | N | N | N |
| Session & Context | | | | |
| Session persistence (disk) | Y | Y | Y | Y (SQLite) |
| Session resume (--continue) | Y | Y | Y | Y |
| Context compaction (truncation) | Y | Y | Y | Y |
| Context compaction (LLM summary) | Y (ModelCompactor) | Y (16 modules) | Y | N |
| CLI & Interface | | | | |
| Print mode (-p) | Y | Y | Y | N (REPL only) |
| Interactive REPL | Y (Rich, 21 commands) | Y (Ink, 114 commands) | Y (Bubble Tea) | Y (Bubble Tea) |
| /plan, /undo, /snapshot | Y | Y | N | N |
| /connect (OAuth flow) | Y | N | N | N |
| Doctor diagnostics | Y (duh doctor) | N | N | N |
| Security CLI (scan/init/exception) | Y | N | N | N |
| SDK & Hooks | | | | |
| Claude Agent SDK drop-in | Y (verified e2e) | Y (IS the SDK target) | N | N |
| stream-json NDJSON protocol | Y | Y (native) | N | N |
| Total hook event types | 29 | 29+ | Unknown | 0 |
| Hook blocking (input rewrite) | Y (ADR-045) | Y | P | N |
| Plugins & Skills | | | | |
| Skill loading (.claude/skills/) | Y (parity with CC) | Y | Y | N |
| Plugin discovery (entry_points) | Y | Y | Y | N |
| Signed plugin manifests | Y (TOFU + sigstore) | Y | P | N |
| Property-based testing | Y (Hypothesis) | N | N | N |
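The MCP Unicode normalization row (NFKC plus rejection of bidi and tag characters, per the security table) can be sketched as follows. The function name and the exact set of rejected code points are assumptions for illustration; D.U.H.'s real checks may be broader.

```python
import unicodedata

# Bidirectional control characters: ALM, LRM, RLM, the embedding/override
# controls U+202A..U+202E, and the isolate controls U+2066..U+2069.
BIDI_CONTROLS = {
    0x061C,
    0x200E,
    0x200F,
    *range(0x202A, 0x202F),
    *range(0x2066, 0x206A),
}


def is_tag_character(code_point):
    """True for the Unicode Tags block (U+E0000..U+E007F), which renders invisibly."""
    return 0xE0000 <= code_point <= 0xE007F


def normalize_tool_text(text):
    """Reject hidden control characters, then fold look-alikes with NFKC."""
    for ch in text:
        cp = ord(ch)
        if cp in BIDI_CONTROLS or is_tag_character(cp):
            raise ValueError(f"rejected hidden character U+{cp:04X}")
    return unicodedata.normalize("NFKC", text)
```

Under this sketch, a tool name written in fullwidth letters folds to its ASCII form, while a description carrying a right-to-left override or invisible tag characters is refused outright instead of being silently cleaned.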

Where D.U.H. is ahead of Claude Code: provider-agnostic design (5 providers vs 1), pluggable security scanning (13 scanners), taint propagation (UntrustedStr), confirmation token gating, lethal trifecta check, MCP hash-pinning and Unicode normalization, PEP 578 audit hook bridge, signed plugin manifests with TOFU trust, provider differential fuzzer, and full documentation in 54 public ADRs.
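As one illustration of confirmation token gating, the sketch below binds a token to a specific tool call with HMAC-SHA256, so that an approval for one action cannot be reused for another. The message layout, the per-session key, and both function names are assumptions; the document does not describe D.U.H.'s actual token format.

```python
import hashlib
import hmac
import json
import secrets

# Assumed: a random key generated once per session and never shown to the model.
SESSION_KEY = secrets.token_bytes(32)


def _message(tool, args):
    """Canonical byte encoding of the action being confirmed."""
    payload = {"tool": tool, "args": args}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_token(tool, args, key=SESSION_KEY):
    """Create a token that is only valid for this exact tool and arguments."""
    return hmac.new(key, _message(tool, args), hashlib.sha256).hexdigest()


def verify_token(token, tool, args, key=SESSION_KEY):
    """Check a token in constant time against the action about to run."""
    expected = issue_token(tool, args, key)
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))
```

A token issued after the user approves `ls` does not verify for `rm -rf /`, because the arguments are part of the signed message. A fuller design would likely also bind a nonce or expiry to prevent replay of the same action.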