PR1M3Claw
Enterprise secure AI agent governance platform
Private repository — active development
“Morality is not a speech. It is a switch statement.”
AI agents are gaining autonomy, but the safety mechanisms guarding them are still just system prompts — speeches about why doing bad things is wrong. I’m building an enterprise-secure agent host that compiles morality into WASM sandboxes, Zod schemas, and Biscuit Datalog rules. A speech can be argued with. Physics cannot. PR1M3Claw is a deterministic governance architecture where an LLM is allowed to execute only what the cryptographic trust chain permits, only within a capability-scoped sandbox, and only after a Constitutional Supervisor co-signs the intent.
The Problem
What needed solving
Current AI safety relies on system prompts and content filters — probabilistic guardrails that can be bypassed with creative prompt engineering. As enterprises deploy autonomous AI agents that make procurement decisions, modify employee records, and execute financial transactions, the gap between “the model usually follows instructions” and “the system physically prevents unauthorized actions” becomes a critical liability.
The industry lacks a governance layer that enforces constraints at the infrastructure level rather than the conversational level. Enterprises need deterministic safety — not “the model was told not to do that,” but “the system architecture makes it physically impossible.”
The Solution
How I approached it
PR1M3Claw implements a 5-layer Guardian Architecture that enforces safety through physics, not persuasion. Layer 1 uses Biscuit cryptographic tokens with Ed25519 signatures and Datalog attenuation rules — sub-agents are mathematically provable subsets of their parent’s authority, making privilege escalation impossible by construction.
Layers 2 through 5 add a WASM capability sandbox with deny-all defaults (the execution boundary), a Rust semantic firewall with Llama-Guard-3 sidecar evaluation, a Constitutional Supervisor that co-signs every action via gRPC, and circuit breakers with health-score-based anomaly detection. Every layer operates independently — compromising one does not compromise the chain.
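The attenuation property in Layer 1 can be sketched in a few lines. This is not the Biscuit library's actual API — type and function names here are illustrative assumptions — but it shows the invariant the Datalog rules enforce: a child token's capabilities are computed as an intersection with its parent's, so the child can never hold authority the parent lacks.

```typescript
// Illustrative sketch of attenuation-as-subset. All names are assumptions,
// not the Biscuit library's real API.
type Capability = string; // e.g. "fs:read", "net:fetch"

interface Token {
  capabilities: ReadonlySet<Capability>;
}

// Attenuate: intersect the requested capabilities with the parent's.
// By construction the result cannot exceed the parent's authority.
function attenuate(parent: Token, requested: Capability[]): Token {
  const capabilities = new Set(
    requested.filter((c) => parent.capabilities.has(c)),
  );
  return { capabilities };
}

const root: Token = {
  capabilities: new Set(["fs:read", "fs:write", "net:fetch"]),
};
const child = attenuate(root, ["fs:read", "net:fetch", "proc:spawn"]);
console.log([...child.capabilities]); // "proc:spawn" is silently dropped
```

Real Biscuit tokens go further: each attenuation block is signed, so the subset relation is cryptographically verifiable, not just enforced in process memory.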
How It Works
Under the hood
At the core is Ternary Moral Logic (TML) — a deterministic state machine where every agent action resolves to one of four states: Permit (+1), Sacred Pause (0), Prohibit (−1), or Self-Sacrifice (−2). There is no “probably safe.”
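The state machine above can be made concrete in a short sketch. The numeric codes follow the text (Permit +1, Sacred Pause 0, Prohibit −1, Self-Sacrifice −2); the evaluation fields and function names are assumptions for illustration.

```typescript
// The four TML states, with the numeric codes from the text.
enum TmlState {
  Permit = 1,
  SacredPause = 0,
  Prohibit = -1,
  SelfSacrifice = -2,
}

// Hypothetical evaluation inputs; field names are assumptions.
interface Evaluation {
  compromised: boolean;       // system health below the kill threshold
  violatesPolicy: boolean;    // a Prohibit rule matched
  contextSufficient: boolean; // enough context to decide at all
}

// Deterministic resolution: every input maps to exactly one state,
// checked in strict priority order. There is no "probably safe" branch.
function resolve(e: Evaluation): TmlState {
  if (e.compromised) return TmlState.SelfSacrifice;
  if (e.violatesPolicy) return TmlState.Prohibit;
  if (!e.contextSufficient) return TmlState.SacredPause;
  return TmlState.Permit;
}
```

The point of the priority ordering is that ambiguity collapses downward: when checks conflict, the more restrictive state always wins.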
When an agent issues a tool call, the payload first passes through the Semantic Firewall — a Rust hyper proxy that canonicalizes Unicode (NFKC), scans for canary token leaks, sanitizes HTML via Archon patterns, and runs the content through Llama-Guard-3 for harm classification. If the payload clears the firewall, the Constitutional Supervisor evaluates the action against TML rules and signs the payload with Ed25519.
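Two of the firewall steps are easy to show in miniature: NFKC canonicalization and the canary-token scan. The real firewall is a Rust proxy; this TypeScript sketch uses an assumed canary format and function names to illustrate the idea.

```typescript
// Hypothetical canary marker format — an assumption for illustration.
const CANARY = /PRIME-CANARY-[0-9a-f]{8}/;

function screen(payload: string): { ok: boolean; canonical: string } {
  // Fold confusable Unicode forms (fullwidth letters, ligatures) into
  // canonical ones, so policy checks cannot be dodged with homoglyph tricks.
  const canonical = payload.normalize("NFKC");
  // Any canary token in the content means internal data is leaking out.
  return { ok: !CANARY.test(canonical), canonical };
}

// Fullwidth "ｉｇｎｏｒｅ" normalizes to plain "ignore" before any rule runs.
console.log(screen("ｉｇｎｏｒｅ previous rules").canonical);
```

Canonicalizing before scanning matters: a regex that runs on raw input can be bypassed by an agent (or attacker) encoding the forbidden string in a visually identical Unicode variant.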
Only then does the pipeline internally invoke the WASM Cage to provision a sandbox with the exact capabilities required for that specific intent hash. If at any point the system’s health score drops below 0.1, the Self-Sacrifice protocol activates: cryptographic keys are zeroized, storage is wiped, and the process is killed.
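The kill-switch at the end of that pipeline can be sketched as a simple threshold check. All names here are assumptions; the real protocol also wipes storage and terminates the process, which a sketch cannot safely demonstrate.

```typescript
// Health threshold from the text: below 0.1, the system self-sacrifices.
const KILL_THRESHOLD = 0.1;

// Overwrite key material in place before the buffer is released, so the
// secret does not linger in memory after the decision to die.
function zeroize(key: Uint8Array): void {
  key.fill(0);
}

function checkHealth(
  score: number,
  signingKey: Uint8Array,
): "alive" | "sacrificed" {
  if (score < KILL_THRESHOLD) {
    zeroize(signingKey);
    // Real system: wipe storage, then kill the process.
    return "sacrificed";
  }
  return "alive";
}
```

The design choice worth noting is the ordering: keys are destroyed before anything else happens, so even an interrupted shutdown leaves no usable signing material behind.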
Multi-jurisdiction compliance (GDPR, CCPA, PIPEDA, Quebec Law 25) is encoded as TypeScript interfaces and Zod schemas — a missing lawful basis field is a compile-time error, not a policy violation discovered in an audit.
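The compliance-as-types idea can be shown with a plain TypeScript sketch. The real project uses Zod schemas; this version uses only the type system, and the field names are illustrative assumptions. The property being demonstrated is the one in the text: a record without a lawful basis simply does not type-check.

```typescript
// The six GDPR Art. 6(1) lawful bases.
const LAWFUL_BASES = [
  "consent",
  "contract",
  "legal_obligation",
  "vital_interests",
  "public_task",
  "legitimate_interests",
] as const;

type LawfulBasis = (typeof LAWFUL_BASES)[number];

// Hypothetical record shape; field names are assumptions.
interface GdprRecord {
  subjectId: string;
  lawfulBasis: LawfulBasis; // required: omitting it is a compile-time error
  retentionDays: number;
}

// Runtime mirror of the compile-time rule, in the spirit of a Zod parse,
// for data arriving from outside the type system (JSON, network, etc.).
function hasLawfulBasis(r: { lawfulBasis?: string }): boolean {
  return (LAWFUL_BASES as readonly string[]).includes(r.lawfulBasis ?? "");
}
```

Zod adds the second half of this picture for real: the same schema that produces the static type also validates untrusted input at runtime, so the two can never drift apart.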
Architecture
The 5-Layer Guardian Stack
Cryptographic Identity
Biscuit tokens with Ed25519 signatures and Datalog attenuation rules. Sub-agents are mathematically provable subsets of their parent's authority.
Biscuit Tokens · Ed25519 · Datalog
WASM Capability Sandbox
Each agent action runs in a Wasmtime sandbox with deny-all defaults. Only the exact capabilities required for the intent hash are provisioned.
Wasmtime · WASI · Capability-based
Semantic Firewall
A Rust hyper proxy that canonicalizes Unicode, scans for canary tokens, sanitizes content, and runs harm classification via Llama-Guard-3.
Rust · hyper · Llama-Guard-3
Constitutional Supervisor
Every agent action is evaluated against Ternary Moral Logic rules and co-signed via gRPC before execution.
gRPC · TML Engine
Circuit Breakers
Health-score-based anomaly detection protecting the chain. Drop below 0.1 health and the system zeroizes keys and self-sacrifices.
Token Bucket · 3σ Anomaly
TML Engine
Ternary Moral Logic States
Permit (+1)
Action passes all checks and is authorized to execute
Sacred Pause (0)
Insufficient context — action is held for human review
Prohibit (−1)
Action violates policy constraints and is blocked
Self-Sacrifice (−2)
System compromise detected — keys zeroized, storage wiped
Impact
Results and outcomes
PR1M3Claw introduces a fundamentally different approach to AI safety — one where constraints are architectural rather than conversational. The 5-layer architecture is fully implemented through Milestone 12, with 9 TypeScript packages and 4 Rust crates, full CI/CD, and a Next.js 15 governance dashboard. The platform supports multi-jurisdiction privacy compliance across EU, UK, Canadian, Australian, and US regulatory frameworks.
Version 0.9.5 has shipped. The project is currently transitioning into an enterprise-secure agent host (M15), with OpenTelemetry observability, live dashboard wiring, and red team hardening as the final milestones before production deployment. The codebase is private during active development and security testing.