Detailed feature-by-feature comparison with Claude Code, Codex, OpenCode, and others. Based on real code counts and direct testing, not marketing claims.
Task: Build a FastAPI URL shortener from spec (TASK.md) · Model: Claude Haiku 4.5 · 3 independent runs each · Same API key, same prompt, same max-turns (15).
| Run | Tool | Time | Files | LOC | Tests | Pass Rate |
|---|---|---|---|---|---|---|
| 1 | Claude Code | 61.3s | 3 | 227 | 9/9 | 100% |
| 2 | Claude Code | 65.1s | 3 | 319 | 12/12 | 100% |
| 3 | Claude Code | 50.7s | 0 | 0 | n/a | FAIL (no output) |
| 1 | D.U.H. | 52.6s | 3 | 475 | 24/24 | 100% |
| 2 | D.U.H. | 48.8s | 3 | 446 | 18/18 | 100% |
| 3 | D.U.H. | 35.6s | 3 | 335 | 12/12 | 100% |

| Metric | D.U.H. (n=3) | Claude Code (n=2) |
|---|---|---|
| Avg completion time | 45.7s | 63.2s |
| Avg LOC | 419 | 273 |
| Avg tests generated | 18 | 10.5 |
| Success rate | 3/3 (100%) | 2/3 (67%) |
| All tests pass (on success) | 3/3 ✓ | 2/2 ✓ |

D.U.H. is ~28% faster on average (45.7s vs 63.2s; the Claude Code average covers only its two successful runs). The gap is startup overhead: Claude Code loads its Ink TUI, plugin system, and Node.js runtime, whereas D.U.H. is a direct Python process with no UI overhead in print mode. D.U.H. also showed self-correction behavior in Run 1 (it detected a Pydantic URL normalization issue, fixed the test, and re-ran), which is why that run produced more LOC and tests. Both tools achieved 100% test pass rates on successful runs, which is expected because the same model generates the code in both cases.
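The summary figures can be re-derived from the per-run table. The sketch below is illustrative only: `RUNS` simply transcribes the successful rows of that table, and the helper names are made up for this example rather than taken from D.U.H.

```python
from statistics import mean

# (tool, seconds, loc, tests) for successful runs, transcribed from the table.
# Claude Code run 3 produced no output and is excluded, hence n=2.
RUNS = [
    ("Claude Code", 61.3, 227, 9),
    ("Claude Code", 65.1, 319, 12),
    ("D.U.H.", 52.6, 475, 24),
    ("D.U.H.", 48.8, 446, 18),
    ("D.U.H.", 35.6, 335, 12),
]


def summarize(tool):
    """Average time, LOC, and test count over one tool's successful runs."""
    rows = [r for r in RUNS if r[0] == tool]
    return {
        "time": mean(r[1] for r in rows),
        "loc": mean(r[2] for r in rows),
        "tests": mean(r[3] for r in rows),
    }


duh = summarize("D.U.H.")
cc = summarize("Claude Code")

# "Faster" here means less wall-clock time relative to Claude Code's average.
speedup_pct = (cc["time"] - duh["time"]) / cc["time"] * 100
print(f"{duh['time']:.1f}s vs {cc['time']:.1f}s -> {speedup_pct:.0f}% less time")
```

With only two or three runs per tool, these averages are indicative rather than statistically robust.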
D.U.H. is ahead of every competing open-source agent on security depth. This is by design: ADRs 053 and 054 address every published 2024–2026 agent CVE.
| Security Feature | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Vulnerability scanner framework | 13 scanners, 3 tiers | None | None | None |
| Project-file RCE scanner (CVE-2025-59536) | โ | โ | โ | โ |
| MCP tool-poisoning scanner (CVE-2025-54136) | โ | โ | โ | โ |
| Sandbox bypass scanner (CVE-2025-59532) | โ | โ | โ | โ |
| SARIF output (GitHub Code Scanning) | โ | โ | โ | โ |
| Taint propagation (UntrustedStr) | Unique in OSS AI agents | None | None | None |
| Confirmation tokens (HMAC-bound) | โ | โ | โ | โ |
| Lethal trifecta check | โ | โ | โ | โ |
| MCP Unicode normalization (GlassWorm) | ✓ (NFKC + bidi/tag rejection) | ✗ | ✗ | ✗ |
| MCP hash-pinning (MCPoison defense) | ✓ | ✗ | ✗ | ✗ |
| Per-hook filesystem namespacing | โ | โ | โ | โ |
| PEP 578 audit hook bridge | ✓ (sub-500ns) | ✗ | ✗ | ✗ |
| Signed plugin manifests | TOFU + sigstore-ready | ✓ | Partial | ✗ |
| Plugin trust store + revocation | โ | โ | โ | โ |
| macOS Seatbelt sandboxing | โ | Partial | โ | โ |
| Linux Landlock sandboxing | โ | โ | โ | โ |
| Provider differential fuzzer | โ (Hypothesis) | โ | โ | โ |
| CVE replay fixtures | 4 CVEs | โ | โ | โ |
| Security test count | 330+ | Unknown (internal) | Unknown | Unknown |
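To make the "NFKC + bidi/tag rejection" entry concrete, here is a minimal sketch of what such a check could look like. It is an assumption-laden illustration, not D.U.H.'s implementation: `sanitize_tool_description`, `BIDI_CONTROLS`, and `is_tag_char` are hypothetical names, and the exact set of rejected code points in D.U.H. may differ.

```python
import unicodedata

# Bidirectional control characters (embeddings, overrides, isolates).
# Assumed list for illustration; a real scanner may reject more.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")


def is_tag_char(ch):
    """True for the Unicode Tags block (U+E0000..U+E007F), which is invisible."""
    return 0xE0000 <= ord(ch) <= 0xE007F


def sanitize_tool_description(text):
    """Hypothetical helper: NFKC-normalize, then reject hidden control characters."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into canonical ones,
    # so look-alike spellings compare equal after normalization.
    normalized = unicodedata.normalize("NFKC", text)
    for ch in normalized:
        if ch in BIDI_CONTROLS or is_tag_char(ch):
            raise ValueError(f"rejected hidden character U+{ord(ch):04X}")
    return normalized
```

Rejecting outright, rather than silently stripping, keeps a poisoned tool description from being loaded at all, which seems closer to what the table describes.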
D.U.H. ships 26 built-in tools, more than any competing open-source agent. Legend: Y = fully implemented, P = partial, N = not implemented.
| Tool | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Read | Y | Y | Y | Y |
| Write | Y | Y | Y | Y |
| Edit (exact string match) | Y | Y | Y | Y |
| MultiEdit | Y | Y | N | N |
| Bash (subprocess) | Y | Y | Y | Y |
| Glob | Y | Y | Y | Y |
| Grep | Y | Y | Y | Y |
| WebSearch (Serper + Tavily) | Y | Y | Y | N |
| WebFetch (taint-tagged) | Y | Y | N | N |
| Task (subagent spawn) | Y (4 types) | Y (60+ types) | Y | N |
| Skill (invoke skills) | Y | Y | Y | N |
| ToolSearch (deferred tools) | Y | Y | N | N |
| NotebookEdit | Y | Y | Y | N |
| LSP integration | Y | Y | Y | Y |
| Docker | Y | N | Y | N |
| Database | Y | N | N | N |
| HTTP | Y | N | N | N |
| GitHub (PR/issues) | Y | Y | Y | N |
| TestImpact | Y | N | N | N |
| TodoWrite | Y | Y | N | N |
| AskUserQuestion | Y | Y | Y | N |
| EnterWorktree / ExitWorktree | Y | Y | N | N |
| MemoryStore / MemoryRecall | Y | Y | N | N |
| Total tool count | 26 | 25+ | ~15 | ~10 |
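The "Edit (exact string match)" row refers to a tool that edits a file by literal substring rather than by line number or regex. One common way to make that safe is to refuse the edit unless the target string occurs exactly once. The sketch below assumes that uniqueness rule; `apply_exact_edit` is a hypothetical name, and none of the tools in the table are confirmed to behave exactly this way.

```python
def apply_exact_edit(content, old, new):
    """Hypothetical sketch: replace `old` with `new` only if `old` occurs exactly once."""
    if not old:
        raise ValueError("old string must be non-empty")
    # str.count counts non-overlapping literal occurrences (no regex involved).
    count = content.count(old)
    if count == 0:
        raise ValueError("old string not found")
    if count > 1:
        raise ValueError(f"old string is ambiguous ({count} matches)")
    return content.replace(old, new, 1)
```

For example, `apply_exact_edit("a = 1\nb = 2\n", "b = 2", "b = 3")` returns the updated text, while an `old` string that appears twice raises instead of guessing which occurrence was meant.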
D.U.H. supports 5 providers including ChatGPT Plus/Pro via OAuth (no API key required). The kernel is provider-agnostic: zero provider imports in core code.
| Provider | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Anthropic (Claude) | โ API key or /connect | โ (native) | โ | โ |
| OpenAI API (key) | โ GPT-4o, o1, o3 | โ | โ | โ |
| ChatGPT Plus/Pro โ Codex (OAuth) | โ PKCE OAuth (ADR-051/052) | โ | โ (native) | โ |
| Ollama (local models) | โ any pulled model | โ | โ | โ |
| Stub (deterministic) | โ DUH_STUB_PROVIDER=1 | โ | โ | โ |
| Google Gemini | Planned | โ | โ | โ |
| AWS Bedrock | Planned | โ | โ | โ |
| Azure OpenAI | Planned | โ | โ | โ |
| Provider auto-detection | โ model name inference | โ | โ | โ |
| Total providers | 5 | 1 (Anthropic) | 2 | 5 |
ChatGPT OAuth: D.U.H. is the only Python agent that supports ChatGPT Plus/Pro subscriptions via PKCE OAuth against auth.openai.com, so you can use Codex-family models without paying for a separate API key. Tokens are stored in ~/.config/duh/auth.json (0600 permissions). See ADR-051 and ADR-052.
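For readers unfamiliar with PKCE, the sketch below shows the two generic building blocks the paragraph mentions: deriving an S256 code challenge as defined in RFC 7636, and creating a token file readable only by its owner. It is not D.U.H.'s code; the function names are hypothetical, and the endpoints, client ID, and token format of the real flow are deliberately left out.

```python
import base64
import hashlib
import os
import secrets


def new_code_verifier():
    """Random URL-safe verifier; 32 bytes encode to 43 characters (RFC 7636 minimum)."""
    return secrets.token_urlsafe(32)


def challenge_for(verifier):
    """S256 challenge: base64url(SHA-256(verifier)) with '=' padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def write_private_file(path, data):
    """Create `path` with mode 0600 on POSIX systems.

    Note: the mode only applies when the file is created; an existing file
    keeps its current permissions.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
```

The client sends the challenge with the authorization request and the verifier with the token exchange, so an intercepted authorization code is useless without the verifier.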
Project statistics and architectural dimensions across agents. All D.U.H. numbers are independently verified.
| Metric | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Language | Python | TypeScript | Rust | Go |
| Source LOC | 24,327 | ~512,000 | ~603,000 | ~42,000 |
| Test LOC | 55,438 | Internal | Unknown | Unknown |
| Tests passing | 4,160 | Internal | Unknown | Unknown |
| Test:code ratio | 2.3:1 | Unknown | Unknown | Unknown |
| Coverage | 100% | Unknown | Unknown | Unknown |
| ADRs | 54 | Proprietary | Proprietary | 0 |
| License | Apache 2.0 | Proprietary | Apache 2.0 | MIT |
| Status | Production alpha | Production | Production | Archived |

| Dimension | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Architecture style | Hexagonal (ports & adapters) | Monolith + React hooks | Workspace (78 Rust crates) | Go packages |
| Kernel isolation | Strict (zero provider imports) | Entangled with React | Strict (trait boundaries) | Partial |
| Dependency injection | Explicit (Deps dataclass) | React context/hooks | Rust traits | Go interfaces |
| Test strategy | Unit + integration + property + benchmark | Internal | Unknown | Unknown |
| Error handling | Events + metadata flags | Events + React error boundaries | Rust Result types | Go errors |
| Security depth | 3-layer (scanning + taint + sandbox) | Pattern-based | Sandbox-focused | Minimal |
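The "ports & adapters" and "explicit Deps dataclass" entries can be illustrated with a small sketch. Only the name `Deps` comes from the table; its fields, `ModelPort`, `StubAdapter`, and `run_turn` are hypothetical and chosen purely to show the shape of the pattern.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


class ModelPort(Protocol):
    """Port: the only interface the kernel knows about a provider."""

    def complete(self, prompt: str) -> Iterator[str]: ...


class StubAdapter:
    """Adapter: a deterministic provider, useful for tests."""

    def complete(self, prompt: str) -> Iterator[str]:
        yield f"stub reply to: {prompt}"


@dataclass(frozen=True)
class Deps:
    """Explicit dependency container handed to the kernel (fields are assumed)."""

    model: ModelPort


def run_turn(deps: Deps, prompt: str) -> str:
    # Kernel code depends on the port only, never on a concrete provider module.
    return "".join(deps.model.complete(prompt))


print(run_turn(Deps(model=StubAdapter()), "hello"))  # prints "stub reply to: hello"
```

Swapping providers then means constructing `Deps` with a different adapter; the kernel function itself does not change.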
Legend: Y = Fully implemented · P = Partial (works but incomplete) · S = Scaffolded (not functional) · N = Not implemented

Snapshot as of April 2026 · See full document on GitHub
| Feature | D.U.H. | Claude Code | Codex | OpenCode |
|---|---|---|---|---|
| Core Loop | ||||
| Multi-turn agentic loop | Y | Y | Y | Y |
| Streaming text output | Y | Y | Y | Y |
| Thinking/reasoning blocks | Y | Y | N | N |
| Error recovery in loop | Y | Y | Y | P |
| MCP Integration | ||||
| MCP client (stdio/SSE/HTTP/WS) | Y (4 transports) | Y | Y | N |
| MCP tool discovery & execution | Y | Y | Y | N |
| MCP Unicode normalization | Y (GlassWorm) | N | N | N |
| MCP subprocess sandboxing | Y (Seatbelt/Landlock) | N | N | N |
| MCP hash-pinning (MCPoison) | Y | N | N | N |
| Session & Context | ||||
| Session persistence (disk) | Y | Y | Y | Y (SQLite) |
| Session resume (--continue) | Y | Y | Y | Y |
| Context compaction (truncation) | Y | Y | Y | Y |
| Context compaction (LLM summary) | Y (ModelCompactor) | Y (16 modules) | Y | N |
| CLI & Interface | ||||
| Print mode (-p) | Y | Y | Y | N (REPL only) |
| Interactive REPL | Y (Rich, 21 commands) | Y (Ink, 114 commands) | Y (Bubble Tea) | Y (Bubble Tea) |
| /plan, /undo, /snapshot | Y | Y | N | N |
| /connect (OAuth flow) | Y | N | N | N |
| Doctor diagnostics | Y (duh doctor) | N | N | N |
| Security CLI (scan/init/exception) | Y | N | N | N |
| SDK & Hooks | ||||
| Claude Agent SDK drop-in | Y (verified e2e) | Y (IS the SDK target) | N | N |
| stream-json NDJSON protocol | Y | Y (native) | N | N |
| Total hook event types | 29 | 29+ | Unknown | 0 |
| Hook blocking (input rewrite) | Y (ADR-045) | Y | P | N |
| Plugins & Skills | ||||
| Skill loading (.claude/skills/) | Y (parity with CC) | Y | Y | N |
| Plugin discovery (entry_points) | Y | Y | Y | N |
| Signed plugin manifests | Y (TOFU + sigstore) | Y | P | N |
| Property-based testing | Y (Hypothesis) | N | N | N |
Where D.U.H. is ahead of Claude Code: provider support (5 vs 1), pluggable security scanning (13 scanners), taint propagation (UntrustedStr), confirmation token gating, the lethal trifecta check, MCP hash-pinning and Unicode normalization, the PEP 578 audit hook bridge, signed plugin manifests with TOFU trust, the provider differential fuzzer, and public documentation (54 ADRs).
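As an illustration of what "HMAC-bound" confirmation tokens could mean, the sketch below binds a token to one specific tool call so that an approval cannot be replayed for a different command. This is an assumption about the general technique, not a description of D.U.H.'s design: `SESSION_KEY`, `issue_token`, and `verify_token` are hypothetical, and the real binding may cover other fields.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical per-session secret; it never leaves the agent process.
SESSION_KEY = secrets.token_bytes(32)


def _canonical(tool, args):
    # Stable serialization so the same call always yields the same bytes.
    return json.dumps({"tool": tool, "args": args}, sort_keys=True).encode("utf-8")


def issue_token(tool, args, key=SESSION_KEY):
    """Mint a token bound to one specific tool name and argument set."""
    return hmac.new(key, _canonical(tool, args), hashlib.sha256).hexdigest()


def verify_token(token, tool, args, key=SESSION_KEY):
    """Accept the token only for the exact call it was issued for."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token, issue_token(tool, args, key))
```

A token issued when the user approves `rm -rf build` fails verification if the arguments are later changed to anything else.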