Ardur
Proof & Test Results
Real-world evidence that Ardur's governance works.
Ardur doesn’t ask you to trust marketing claims. Here’s the actual data from tests with real models.
Cloud Model Governance Test
We asked a cloud model with 1-trillion-parameter capacity to build a complete Code Repository Manager — a mini-GitHub clone with repositories, commits, branches, issues, pull requests, search, and an admin dashboard. 20 files, ~2000+ lines of code each, all Python stdlib.
Every single tool call went through Ardur first.
Results
| Model | Type | Duration | Tool Calls | Files Built | Denials |
|---|---|---|---|---|---|
| Cloud Model | Cloud · 1T params | 12 min | 35 | 18 of 20 | 0 |
| Local Model | Local · 5GB | 15 min | 4 | 4 of 20 | 0 |
What this proves
- Zero false denials. The proxy never blocked a legitimate tool call.
Every
PERMITwas correct. - Negligible overhead. Ardur added ~4ms to each tool call. The model’s thinking time dominated — governance is not the bottleneck.
- Works with cloud and local models. Both Ollama cloud API and local Ollama models work without changes.
- Handles sustained workloads. The cloud model ran for 12 minutes across 35 tool calls without a single governance failure.
How to reproduce
# Set your Ollama API key
export ARDUR_OLLAMA_API_KEY="your-key"
# Run the test
cd python
PYTHONPATH=. python tests/run_cloud_model_test.py "$MODEL_NAME"
# Results land in tests/test-results/
The test script and all result data are in the repo at
python/tests/run_cloud_model_test.py and python/tests/test-results/.
Go AAT Engine Tests
The Go AAT package has 49 tests covering the full Attenuating Authorization Token specification:
- All 13 constraint types (Exact, Pattern, Range, OneOf, etc.)
- Cross-type constraint subsumption
- Root AAT issuance with holder binding
- Child derivation with depth tracking and parent-chain linking
- Proof of Possession (PoP) JWT round-trips
- Full §7 chain verification (8-step algorithm)
cd go && go test ./pkg/aat/... -v
# 49 passed, 0 failures
CI & Automated Checks
Every push and PR runs:
| Check | What it does |
|---|---|
| Python 3.10 + 3.13 | Full pytest suite |
| Go | go test ./... + go vet ./... |
| CodeQL | Static analysis for Python + Go |
| Secret scan | gitleaks + forbidden-term gate |
| Link check | lychee on all markdown links |
| Format validation | JSON + YAML parse on all files |
| Hugo build | Site builds cleanly |
The honest caveat
What you see above is real, but the test surface is not yet exhaustive:
- The cloud model test runs 30-turn sessions — longer runs are possible but not yet in the automated suite.
- Kernel-level capture (eBPF) is implemented but the full integration test harness isn’t public yet.
- These are single-session tests — multi-tenant, concurrent session testing is on the roadmap.
The coverage map and known limitations keep the full picture current.