Features

Everything you need.
Nothing you don't.

D.U.H. is designed around a clean hexagonal kernel — providers plug in at the edge, tools compose in the middle, and security wraps every layer.

🔌

Provider Freedom

Anthropic Claude, OpenAI API, ChatGPT Plus/Pro (Codex via OAuth), local Ollama, or a deterministic stub for CI. Switch providers at runtime — same harness, any model.

🛡️

Security-First by Design

3-layer pluggable security: 13 vulnerability scanners, taint-propagating UntrustedStr, HMAC-bound confirmation tokens, lethal trifecta check, macOS Seatbelt + Linux Landlock sandboxing.
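The lethal trifecta check named above can be pictured as a simple capability test: a session is only blocked when it holds private-data read access, untrusted-input exposure, and network egress all at once. The sketch below is a minimal illustration under that assumption; the `Capability` enum and `is_lethal` function are hypothetical names, not D.U.H.'s actual API.

```python
from enum import Flag


class Capability(Flag):
    """Hypothetical capability flags; names are illustrative only."""
    READ_PRIVATE = 1
    READ_UNTRUSTED = 2
    NETWORK_EGRESS = 4


# The dangerous combination: all three capabilities held simultaneously.
LETHAL_TRIFECTA = (
    Capability.READ_PRIVATE
    | Capability.READ_UNTRUSTED
    | Capability.NETWORK_EGRESS
)


def is_lethal(session_caps: Capability) -> bool:
    """Return True only when every trifecta capability is present."""
    return (session_caps & LETHAL_TRIFECTA) == LETHAL_TRIFECTA


# Any two capabilities are allowed; adding the third trips the check.
two_of_three = Capability.READ_PRIVATE | Capability.NETWORK_EGRESS
all_three = two_of_three | Capability.READ_UNTRUSTED
```

Any two of the three capabilities pass; the check fires only on the full combination, which is why removing a single leg (for example, network egress) is enough to unblock a session.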

⚡

28% Faster than Claude Code

45.7s average vs 63.2s for Claude Code on identical tasks with the same model (Claude Code's average covers its 2 successful runs). A direct Python process: no Node.js runtime, no Ink TUI overhead. 100% success rate (3 of 3 runs) vs 67% (2 of 3).

🔧

26 Built-in Tools

Read, Write, Edit, MultiEdit, Bash, Glob, Grep, WebSearch, WebFetch, Task (subagents), Docker, Database, HTTP, GitHub, LSP, TestImpact, MemoryStore, NotebookEdit, and more.

💻

Rich TUI & REPL

Full Textual TUI, Rich-powered REPL with 21 slash commands including /connect (OAuth), /snapshot, /plan, /pr, /compact. Print mode for CI pipelines. SDK mode via NDJSON.

🔗

Claude Agent SDK Drop-in

Implements the full NDJSON streaming protocol. Drop duh anywhere claude is expected — same interface, any provider behind it. Verified end-to-end.
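For readers unfamiliar with NDJSON: it is one JSON object per line, so a consumer can decode events as they arrive instead of waiting for the full response. The sketch below shows only that generic framing. The message fields (`type`, `text`, `ok`) are made up for illustration and are not the actual protocol schema.

```python
import io
import json
from typing import Iterator, TextIO


def iter_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical messages; the real protocol's fields may differ.
sample = io.StringIO(
    '{"type": "assistant", "text": "hello"}\n'
    "\n"
    '{"type": "result", "ok": true}\n'
)
events = list(iter_ndjson(sample))
```

In practice the stream would be the stdout pipe of the agent process rather than an in-memory buffer.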

Benchmarks

Numbers don't lie.

3 independent runs per tool, same model (Claude Haiku 4.5), same prompt, same task (FastAPI URL shortener from spec), fully isolated directories.

28%
Faster average execution

45.7s

D.U.H. avg completion time

63.2s

Claude Code avg time

100%
D.U.H. success rate (3/3)

67%

Claude Code success rate (2/3)

18
Avg tests generated (D.U.H.)

10.5

Avg tests generated (CC)

419
Avg lines of code (D.U.H.)

Metric	D.U.H. (n=3)	Claude Code (n=2 successful)
Avg completion time	45.7s	63.2s
Success rate	3/3 (100%)	2/3 (67%)
Avg LOC generated	419	273
Avg tests generated	18	10.5
All tests pass (on success)	3/3 ✓	2/2 ✓
Self-correction behavior	Yes (detected & fixed Pydantic issue)	Minimal
Runtime / startup	Python (direct)	Node.js + Ink TUI
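
The headline 28% figure follows directly from the two averages in the table. The short calculation below reproduces it and also expresses the same gap as a throughput ratio. Note that the Claude Code average is taken over its 2 successful runs only.

```python
# Average completion times taken from the table above (seconds).
duh_avg_s = 45.7
cc_avg_s = 63.2

# "28% faster" here means 28% less wall-clock time per task.
time_reduction = (cc_avg_s - duh_avg_s) / cc_avg_s  # about 0.277

# The same gap as a speed ratio: how many times faster per task.
speed_ratio = cc_avg_s / duh_avg_s  # about 1.38

# Success rates from the table: 3 of 3 runs versus 2 of 3 runs.
duh_success = 3 / 3
cc_success = 2 / 3

print(f"time reduction: {time_reduction:.1%}, speed ratio: {speed_ratio:.2f}x")
```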

ℹ

Methodology: Same Anthropic API key, same model (Haiku 4.5), same prompt verbatim, same TASK.md, separate directories, fresh git repo per run, --max-turns 15 on both tools. The D.U.H. advantage is architectural simplicity: less startup overhead means more time for actual model interaction. See the full benchmark report.

duhwave

Persistent agentic-swarm extension.

A layered runtime on top of D.U.H. that turns single-shot agent invocations into a persistent, event-driven swarm. One TOML file declares the topology; one host daemon runs forever.

🛰️

Persistent host daemon

duh wave start spins up a long-running process. 10 subcommands: ls, inspect, pause, resume, logs --follow, install, uninstall, web, start, stop.

📡

Event ingress

Webhook (HMAC-verified), filewatch, cron, MCP push, manual seam — all routing through one append-only TriggerLog. Each trigger spawns a Task by topology rule.
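As a rough illustration of the ingress path described above, the sketch below verifies a webhook body against an HMAC-SHA256 signature and, only if it matches, appends the event to an append-only log. The `TriggerLog` class here is a hypothetical stand-in, and the signature scheme (hex-encoded SHA-256 over the raw body) is an assumption; duhwave's actual format may differ.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the expected and supplied HMAC."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


class TriggerLog:
    """Hypothetical append-only log: records are added, never edited."""

    def __init__(self) -> None:
        self._records = []

    def append(self, source: str, payload: dict) -> int:
        record = {
            "seq": len(self._records),
            "ts": time.time(),
            "source": source,
            "payload": payload,
        }
        self._records.append(record)
        return record["seq"]

    def records(self) -> tuple:
        # Hand out an immutable snapshot so callers cannot reorder the log.
        return tuple(self._records)


def ingest_webhook(
    log: TriggerLog, secret: bytes, body: bytes, signature_hex: str
) -> Optional[int]:
    """Log the event and return its sequence number, or None if rejected."""
    if not verify_signature(secret, body, signature_hex):
        return None
    return log.append("webhook", json.loads(body))
```

Other sources (filewatch, cron, MCP push) would call `append` with a different `source` label, which is what lets a single log drive the topology rules.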

🧬

RLM substrate

Bytes addressed by reference, never summarised. Bounded recursion with cycle detection. Cites Zhang/Kraska/Khattab arXiv 2512.24601.
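
One way to picture "addressed by reference" together with "bounded recursion with cycle detection" is a content-addressed store whose handles can link to other handles, walked with a depth limit and a record of the current path. This is a generic sketch under those assumptions, not the RLM substrate's actual design; every name in it is hypothetical.

```python
import hashlib
from typing import Dict, List, Optional, Set


class CycleError(Exception):
    """Raised when a handle refers back to one of its own ancestors."""


class DepthExceeded(Exception):
    """Raised when the walk goes deeper than the configured bound."""


def put(store: Dict[str, bytes], data: bytes) -> str:
    """Store bytes unchanged and return a content-derived handle."""
    handle = hashlib.sha256(data).hexdigest()
    store[handle] = data
    return handle


def total_size(
    store: Dict[str, bytes],
    links: Dict[str, List[str]],
    handle: str,
    max_depth: int,
    _path: Optional[Set[str]] = None,
) -> int:
    """Sum the byte lengths reachable from handle, once per path."""
    path = _path if _path is not None else set()
    if handle in path:
        raise CycleError(handle)
    if len(path) >= max_depth:
        raise DepthExceeded(handle)
    path.add(handle)
    try:
        size = len(store[handle])
        for child in links.get(handle, []):
            size += total_size(store, links, child, max_depth, path)
        return size
    finally:
        path.discard(handle)  # leave the path clean for sibling branches
```

The store only ever hands out handles and original bytes; nothing in the walk rewrites or condenses the stored content.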

🔗

Cross-agent handles

Workers see selected REPL handles; results bind back as new handles. Cites Yang/Zou/Pan et al. arXiv 2604.25917.

Read the duhwave guide →

Security

The deepest security story in open-source AI agents.

Three independent layers that address every published agent RCE in the 2024–2026 CVE corpus. No other open-source AI coding agent comes close.

Layer 1 Vulnerability Monitoring

13 scanners across 3 tiers (Minimal / Extended / Paranoid)
Project-file RCE scanner (CVE-2025-59536)
MCP tool-poisoning scanner (CVE-2025-54136)
Sandbox bypass scanner (CVE-2025-59532)
Command injection scanner (CVE-2026-35022)
OAuth hardening scanner
SARIF output for GitHub Code Scanning
Pre-push git hook installer
duh security scan · duh security init
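
For context on the SARIF output mentioned in the list above: SARIF 2.1.0 is a JSON format with a top-level `runs` array, each run carrying a `tool` description and a list of `results`. The sketch below builds a minimal document of that shape from a list of findings. The finding dictionary keys, the rule ID, and the tool name are hypothetical, and this is not the scanner's actual output code.

```python
import json
from typing import Dict, List


def to_sarif(findings: List[Dict], tool_name: str = "example-scanner") -> Dict:
    """Build a minimal SARIF 2.1.0 document from simple finding dicts."""
    rule_ids = sorted({f["rule_id"] for f in findings})
    results = [
        {
            "ruleId": f["rule_id"],
            "level": f["level"],  # one of: none, note, warning, error
            "message": {"text": f["message"]},
            "locations": [
                {
                    "physicalLocation": {
                        "artifactLocation": {"uri": f["path"]},
                        "region": {"startLine": f["line"]},
                    }
                }
            ],
        }
        for f in findings
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {
                    "driver": {
                        "name": tool_name,
                        "rules": [{"id": rule_id} for rule_id in rule_ids],
                    }
                },
                "results": results,
            }
        ],
    }


# Hypothetical finding used only to show the shape of the output.
example = to_sarif(
    [
        {
            "rule_id": "EX001",
            "level": "error",
            "message": "Shell command built from untrusted input",
            "path": "scripts/deploy.py",
            "line": 12,
        }
    ]
)
sarif_text = json.dumps(example, indent=2)
```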

Layer 2 Runtime Hardening

Taint propagation — UntrustedStr tags every string by origin and propagates through all string ops
Confirmation tokens — HMAC-bound, prevent tainted strings reaching Bash/Write/Edit without user confirmation
Lethal trifecta check — blocks sessions with simultaneous read-private + read-untrusted + network-egress
MCP Unicode normalization (GlassWorm defense)
Per-hook filesystem namespacing
PEP 578 audit hook bridge (sub-500ns)
Signed plugin manifests (TOFU + sigstore-ready)
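
The first two items in this layer can be sketched together: a `str` subclass that carries its origin through concatenation, and a tool gate that refuses tainted arguments unless they arrive with an HMAC token bound to the exact tool and argument. This is a simplified illustration, not the real implementation. It only handles `+` (the actual `UntrustedStr` is described as covering all string operations), and `issue_token` and `run_tool` are hypothetical names.

```python
import hashlib
import hmac


class UntrustedStr(str):
    """A str that remembers where its content came from."""

    def __new__(cls, value, origin="unknown"):
        obj = super().__new__(cls, value)
        obj.origin = origin
        return obj

    def __add__(self, other):
        # tainted + anything stays tainted
        return UntrustedStr(str.__add__(self, other), self.origin)

    def __radd__(self, other):
        # plain + tainted stays tainted as well
        return UntrustedStr(str.__add__(other, self), self.origin)


def issue_token(key: bytes, tool: str, argument: str) -> str:
    """Bind a confirmation to one specific tool call (hypothetical scheme)."""
    message = f"{tool}\x00{argument}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def run_tool(key: bytes, tool: str, argument: str, token=None) -> str:
    """Refuse tainted arguments unless a matching token is supplied."""
    if isinstance(argument, UntrustedStr):
        expected = issue_token(key, tool, argument)
        if token is None or not hmac.compare_digest(token, expected):
            raise PermissionError(
                f"{tool}: argument from {argument.origin!r} needs confirmation"
            )
    return f"{tool} ran"
```

Because the token covers both the tool name and the argument text, a confirmation granted for one command cannot be replayed for a different one.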

Layer 3 OS Sandboxing

macOS Seatbelt — sandbox-exec profiles for shell commands and MCP servers
Linux Landlock — syscall-level filesystem access control
Network policy layer — blocks outbound traffic unless explicitly allowed
Approval modes: suggest / auto-edit / full-auto
MCP subprocess sandboxing via host OS facilities
Provider differential fuzzer (Hypothesis property tests)

Deep dive into security →

Providers

Any model. Zero lock-in.

D.U.H.'s kernel never imports a provider directly. Providers are adapters at the edge — swap them without touching your workflow.

🔮

Anthropic Claude

API key or /connect

🤖

OpenAI API

GPT-4o, o1, o3

💬

ChatGPT / Codex

PKCE OAuth — no API key

🦙

Ollama

Local — any pulled model

🧪

Stub Provider

Deterministic CI testing

🌐

litellm / OpenRouter

100+ models via proxy
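
The adapter idea described at the top of this section can be sketched with a structural interface: the kernel depends only on a small port, and each provider is a class that happens to satisfy it. All names below (`Provider`, `StubProvider`, `UpperProvider`, `REGISTRY`, `run`) are hypothetical and chosen for illustration; D.U.H.'s real interfaces may look different.

```python
from typing import Callable, Dict, Iterator, Protocol


class Provider(Protocol):
    """The port: the only thing the kernel knows about a provider."""

    def stream(self, prompt: str) -> Iterator[str]:
        ...


class StubProvider:
    """Deterministic adapter, in the spirit of a CI stub."""

    def stream(self, prompt: str) -> Iterator[str]:
        yield "stub:"
        yield prompt


class UpperProvider:
    """A second toy adapter, standing in for a real model backend."""

    def stream(self, prompt: str) -> Iterator[str]:
        yield prompt.upper()


# Adapters are looked up by name at runtime; the kernel imports none of them.
REGISTRY: Dict[str, Callable[[], Provider]] = {
    "stub": StubProvider,
    "upper": UpperProvider,
}


def run(provider_name: str, prompt: str) -> str:
    """Kernel-side code: identical regardless of which adapter is chosen."""
    provider = REGISTRY[provider_name]()
    return "".join(provider.stream(prompt))
```

Swapping providers then comes down to changing the name passed to `run`, which mirrors the `--provider` flag shown in the Quick Start below.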

Quick Start

Up in 60 seconds.

Terminal

# Install
pip install duh-cli

# Set your Anthropic key
export ANTHROPIC_API_KEY=sk-ant-...

# Run a task (print mode — great for scripts)
duh -p "fix the bug in auth.py"

# Interactive REPL
duh

# Use a different model / provider
duh --provider openai --model gpt-4o -p "refactor db.py"

# Local model with Ollama
duh --provider ollama --model qwen2.5-coder -p "write tests"

# Run diagnostics
duh doctor

# Security scan (SARIF output)
duh security scan

Full Getting Started Guide →

By the numbers

Built for production alpha.

4160

Tests passing

100%

Line coverage

Architecture Decision Records

Built-in tools

Provider adapters

Hook event types

Vulnerability scanners

2.3x

Test:code ratio

The Universal AI Coding Agent

Everything you need. Nothing you don't.