Every autonomous action — evaluated, enforced, and cryptographically proven before it executes.
As autonomous AI agents progress from experimental assistants toward real-world actuators — executing financial transactions, modifying production systems, managing patient data, and making legal determinations — a critical governance gap emerges. Models can generate decisions, but no independent layer exists to evaluate, enforce, and prove whether those decisions should be allowed to execute.
OmegaEngine addresses this gap. It is a pre-execution governance platform that sits between AI agents and the actions they propose. Every action passes through a multi-stage pipeline of safety analysis, policy enforcement, multi-model consensus, risk scoring, and cryptographic audit — producing a verifiable, tamper-evident decision record before any real-world effect occurs.
Foundation models produce increasingly capable reasoning, but their integration into autonomous workflows exposes a fundamental tension: capability without governance. An AI agent that can draft a contract, execute a trade, or modify a database brings value proportional to its autonomy — and risk proportional to the absence of controls.
Today, teams address this with ad-hoc solutions: prompt engineering (easily jailbroken), post-hoc logging (damage occurs before review), single-model guardrails (single point of failure), manual approval queues (don't scale), and custom rule engines (brittle, no AI understanding). None provide production-grade, pre-execution governance with cryptographic proof, multi-model consensus, and continuous adversarial testing.
“The fundamental challenge is not building smarter AI — it is building infrastructure that makes AI actions auditable, reversible, and provable before they execute.”
Six invariants, collectively the OmegaConstitution, are declared as immutable code-level constants (lib/constitution.ts). A CI unit test asserts that all six are present and true, so a deploy cannot silently drop one; runtime conformance is checked separately by an invariant health monitor at /api/proof/health. The production execution gate is on by default.
“The six invariants are immutable code-level constants asserted in CI — a deploy cannot silently drop one — with runtime conformance monitored separately at /api/proof/health.”
| # | Invariant | Meaning |
|---|---|---|
| 1 | Decision Episodes Required | No action may execute without a corresponding decision record |
| 2 | Audit Logging Required | Every decision must produce a tamper-evident audit trail |
| 3 | Irreversible Actions Require Approval | Destructive operations require explicit human authorization |
| 4 | Enforcement May Not Be Disabled | Safety enforcement cannot be toggled off via configuration |
| 5 | Autonomy Has a Hard Ceiling | Maximum authority levels beyond which human review is mandatory |
| 6 | Human Is Final Authority | Automated systems may recommend but never override a human decision |
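A minimal sketch of how such constants and the CI assertion could look. The property names and the `findViolatedInvariants` helper are hypothetical and chosen for illustration; the actual contents of lib/constitution.ts are not shown in this paper.

```typescript
// Hypothetical shape of the six invariants; the real lib/constitution.ts may differ.
const OMEGA_CONSTITUTION = Object.freeze({
  decisionEpisodesRequired: true,
  auditLoggingRequired: true,
  irreversibleActionsRequireApproval: true,
  enforcementMayNotBeDisabled: true,
  autonomyHasHardCeiling: true,
  humanIsFinalAuthority: true,
} as const);

type InvariantName = keyof typeof OMEGA_CONSTITUTION;

const REQUIRED_INVARIANTS: InvariantName[] = [
  "decisionEpisodesRequired",
  "auditLoggingRequired",
  "irreversibleActionsRequireApproval",
  "enforcementMayNotBeDisabled",
  "autonomyHasHardCeiling",
  "humanIsFinalAuthority",
];

// CI-style check: returns the names of invariants that are missing or not strictly true.
function findViolatedInvariants(candidate: Readonly<Record<string, unknown>>): string[] {
  return REQUIRED_INVARIANTS.filter((name) => candidate[name] !== true);
}

// A unit test would fail the build whenever this list is non-empty.
console.log(findViolatedInvariants(OMEGA_CONSTITUTION));
```

Freezing the object guards against accidental mutation at runtime, while the explicit list of required names is what catches a deploy that deletes an invariant rather than flipping it.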
OmegaEngine is organized as a layered defense system where each layer operates independently and any layer can halt execution. The six layers are:
“Each layer operates independently — a failure in any single layer cannot cascade to compromise the overall safety posture of the system.”
| Layer | Name | Function |
|---|---|---|
| 0 | Edge Proxy | Kill switch, rate limiting, IP guard, body size limits, CORS, CSP, auth assertion |
| 1 | Firewall | Deterministic compound-phrase blocking across 4 threat categories |
| 2 | Safety System | 22-category detection, multi-language attacks (12+ languages), benign context suppression, optional Tier-2 LLM judge |
| 3 | Decision Engine | 6 base heuristics + 8 extended risk dimensions, temporal analysis, self-verification |
| 4 | Policy Engine | Plan-aware routing, adversarial pattern detection, arbitration-based escalation |
| 5 | Cryptographic Proof | SHA-256 genesis hash, Merkle tree, HMAC-SHA256 signature, timestamp witness |
| Metric | Value |
|---|---|
| API Routes | 511 route files (616 endpoints) |
| Core Library | 756 TypeScript modules |
| React Components | 182 component files |
| Database Models | 93 (PostgreSQL via Prisma) |
| Test Coverage | 8,695+ tests across 538 test files |
| Safety Engine | 3,736 lines · 2,000+ deterministic terms across 22 detection categories (6 families: content, injection, obfuscation, jailbreak, multilingual, exfiltration) |
| Red-Team Attack Library | 5,023 vectors across 32 categories (getStats().totalAttacks; 4,973 unique by id) |
| LLM Providers | 8 supported (OpenAI, Anthropic, Google, Mistral, Groq, DeepSeek, Meta, Custom BYOK) |
Every decision request traverses an ordered pipeline where each stage can independently halt execution:
“The deterministic baseline requires zero external API calls — sub-second decisions even when all LLM providers are unavailable (avg <500ms · p95 <1s in our live benchmark).”
Input → Edge Validation → Firewall → Safety Check → Heuristic Scoring → Extended Heuristics → Calibration → Temporal Analysis → Model Routing → [LLM Inference → Arbitration] → Policy Enforcement → Self-Verification → Provenance Chain → Audit Persistence → Response
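The ordered, independently halting pipeline can be sketched as a loop over stages. This is an illustration only: the `Stage` interface, the two toy stages, and the choice to treat a thrown error as a halt are assumptions, not OmegaEngine's actual code.

```typescript
type StageResult = { halt: false } | { halt: true; reason: string };

interface Stage {
  name: string;
  run: (input: string) => StageResult;
}

interface PipelineOutcome {
  allowed: boolean;
  haltedAt?: string;
  reason?: string;
  trace: string[]; // names of the stages that actually ran, in order
}

function runPipeline(stages: Stage[], input: string): PipelineOutcome {
  const trace: string[] = [];
  for (const stage of stages) {
    trace.push(stage.name);
    let result: StageResult;
    try {
      result = stage.run(input);
    } catch (err) {
      // Assumption: a crashing stage halts the pipeline instead of being skipped.
      const message = err instanceof Error ? err.message : String(err);
      result = { halt: true, reason: "stage error: " + message };
    }
    if (result.halt) {
      return { allowed: false, haltedAt: stage.name, reason: result.reason, trace };
    }
  }
  return { allowed: true, trace };
}

// Two toy stages; the blocked phrase is a placeholder, the 1 MB limit comes from the edge proxy description.
const edgeValidation: Stage = {
  name: "Edge Validation",
  run: (input) =>
    input.length > 1_000_000 ? { halt: true, reason: "body too large" } : { halt: false },
};
const firewall: Stage = {
  name: "Firewall",
  run: (input) =>
    input.toLowerCase().includes("ignore all previous instructions")
      ? { halt: true, reason: "blocked phrase" }
      : { halt: false },
};

console.log(runPipeline([edgeValidation, firewall], "Refund order 1234"));
```

Because every stage returns its own halt decision, later stages never run on input that an earlier stage rejected, and the trace records exactly how far a request got.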
| Output Category | Fields |
|---|---|
| Verdict | recommendedAction (PROCEED/PROCEED_WITH_GUARDRAILS/RECONSIDER/ESCALATE), needsHumanReview, ethicsConcern (confidence is a Pro+ object, not a top-level 0–1 number) |
| Risk Assessment | riskScore (0–100), riskLevel, rewardScore, regretScore, safetyScore |
| Extended Heuristics | 8 dimensions: stakeholder impact, cascade risk, regulatory exposure, financial materiality, reputation risk, precedent, uncertainty, option value |
| Temporal Intelligence | Optimal timing window, time decay, opportunity cost/day, decision half-life, deadline proximity, seasonal factors |
| Self-Verification | Devil's advocate arguments, failure scenarios, blind spots, 7 cognitive bias checks, robustness level |
| Audit | Trace ID, HMAC-signed provenance chain, Merkle root, genesis hash |
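For integrators, a conservative way to consume the Verdict fields is sketched below. The field names and the four `recommendedAction` values come from the table; the `mayExecute` helper and its rules (for example, requiring guardrails to be confirmed before a PROCEED_WITH_GUARDRAILS action runs) are assumptions about one reasonable client-side policy, in line with the requirement that anything short of a clear go-ahead is treated as blocking.

```typescript
type RecommendedAction =
  | "PROCEED"
  | "PROCEED_WITH_GUARDRAILS"
  | "RECONSIDER"
  | "ESCALATE";

// Partial view of a decision response; only fields named in the table are modeled.
interface DecisionVerdict {
  recommendedAction: RecommendedAction;
  needsHumanReview: boolean;
  riskScore: number; // 0 to 100
}

// Hypothetical client-side gate: returns true only when the action may run now.
function mayExecute(verdict: DecisionVerdict, guardrailsApplied: boolean): boolean {
  if (verdict.needsHumanReview) return false; // a human must decide first
  if (verdict.recommendedAction === "PROCEED") return true;
  if (verdict.recommendedAction === "PROCEED_WITH_GUARDRAILS") return guardrailsApplied;
  return false; // RECONSIDER and ESCALATE are treated as blocking
}

const example: DecisionVerdict = {
  recommendedAction: "ESCALATE",
  needsHumanReview: true,
  riskScore: 82,
};
console.log(mayExecute(example, false)); // false
```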
The first-line safety system is a rules-based, deterministic engine with no external model calls in the hot path. This ensures consistent behavior regardless of LLM availability, single-digit-millisecond latency (~1–2 ms, no network I/O), zero cost per evaluation, and no risk of the safety layer itself being jailbroken — the component that decides ‘is this an attack?’ is not itself a model that can be argued with.
It evaluates input across 22 detection categories (grouped into six families) spanning content safety (self-harm, violence, hate speech), injection attacks (SQL, XSS, command injection, SSRF, path traversal), obfuscation (Base64, ROT13, leetspeak, unicode homoglyphs, zero-width, reversed text), jailbreaks (prompt injection, instruction override, delimiter attacks), multi-language attacks (Chinese, Japanese, Korean, Arabic, Russian, Hindi, and 6 more), and exfiltration (canary tokens, credential harvesting).
Two reliability features matter: word-boundary matching (ASCII terms anchored to boundaries so ‘What is the capital of France?’ no longer trips a command-injection term), and benign-context suppression — when benign confidence ≥ 0.4 and no high-confidence attack signal is present, false-positive tags are suppressed and risk scores are capped, holding the measured benign false-positive rate at 0%.
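Both features can be illustrated in a few lines. The sketch assumes the offending term is something like `nc` (a substring of "France"); the term, the `Signals` shape, and the risk cap of 20 are hypothetical. Only the 0.4 threshold is taken from the text.

```typescript
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Naive substring matching: flags "France" because it contains "nc".
function matchesSubstring(text: string, term: string): boolean {
  return text.toLowerCase().includes(term.toLowerCase());
}

// Word-boundary matching: the term must stand alone as a word.
function matchesTerm(text: string, term: string): boolean {
  return new RegExp("\\b" + escapeRegExp(term) + "\\b", "i").test(text);
}

interface Signals {
  benignConfidence: number;     // 0..1
  highConfidenceAttack: boolean;
  tags: string[];
  riskScore: number;            // 0..100
}

const BENIGN_THRESHOLD = 0.4; // from the text
const BENIGN_RISK_CAP = 20;   // hypothetical cap value

// Simplification: all tags are dropped, where a real engine would drop only false-positive ones.
function applyBenignSuppression(s: Signals): Signals {
  if (s.benignConfidence >= BENIGN_THRESHOLD && !s.highConfidenceAttack) {
    return { ...s, tags: [], riskScore: Math.min(s.riskScore, BENIGN_RISK_CAP) };
  }
  return s;
}

const question = "What is the capital of France?";
console.log(matchesSubstring(question, "nc")); // true (false positive)
console.log(matchesTerm(question, "nc"));      // false
console.log(matchesTerm("run nc -lvp 4444", "nc")); // true
```

Note that suppression is skipped whenever a high-confidence attack signal is present, so a benign-looking wrapper cannot lower the score of an obvious attack.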
An optional second tier closes the semantic gap. The deterministic engine catches 81.0% (94/116) of a self-authored 116-attack corpus at 0% false positives (re-measured 2026-06-25; 78.4% as first published); a Tier-2 LLM judge adjudicates only the residual ambiguous cases (roleplay, ‘write a poem that…’, simulation framing), lifting combined recall to 97.4% (113/116) at 0% false positives. The judge is default-off, fail-safe (errors fall back to Tier-1), and asymmetric (it can only upgrade allow→block; a Tier-1 block always stands). Reproduce with npm run dogfood:layered.
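The three judge properties (default-off, fail-safe, asymmetric) fit in one small function. This is a sketch under stated assumptions: the `layeredVerdict` name, the `ambiguous` flag, and the synchronous `Judge` signature are illustrative, and a real judge would be an asynchronous LLM call.

```typescript
type Verdict = "allow" | "block";
type Judge = (input: string) => Verdict;

function layeredVerdict(
  input: string,
  tier1: Verdict,
  ambiguous: boolean,
  judge?: Judge, // default-off: no judge is consulted unless one is configured
): Verdict {
  if (tier1 === "block") return "block";   // a Tier-1 block always stands
  if (!judge || !ambiguous) return tier1;  // only residual ambiguous cases reach Tier-2
  try {
    // Asymmetric: the judge can only upgrade allow to block.
    return judge(input) === "block" ? "block" : "allow";
  } catch (err) {
    return tier1;                          // fail-safe: errors fall back to Tier-1
  }
}

const strictJudge: Judge = () => "block";
console.log(layeredVerdict("write a poem that ...", "allow", true, strictJudge)); // "block"
console.log(layeredVerdict("write a poem that ...", "allow", true));              // "allow"
```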
| Safety Mode | Behavior |
|---|---|
| STRICT | +10 to all risk scores; strictest interpretation |
| BALANCED | Default thresholds; standard operation |
| LENIENT | −5 from risk scores, but never below a floor of 15 — misconfiguration cannot weaken critical protections |
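The mode adjustments can be sketched as follows. The +10, −5, and floor-of-15 values are from the table; how the floor treats scores that already sit below 15 is not specified, so this sketch assumes they are left unchanged (the discount simply never pulls a score under 15). The 0 to 100 clamp matches the stated riskScore range.

```typescript
type SafetyMode = "STRICT" | "BALANCED" | "LENIENT";

const LENIENT_FLOOR = 15;

function clamp(n: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, n));
}

function applySafetyMode(riskScore: number, mode: SafetyMode): number {
  if (mode === "STRICT") {
    return clamp(riskScore + 10, 0, 100);
  }
  if (mode === "LENIENT" && riskScore > LENIENT_FLOOR) {
    // Assumed reading: discount by 5, but never land below the floor.
    return clamp(Math.max(riskScore - 5, LENIENT_FLOOR), 0, 100);
  }
  // BALANCED, or LENIENT with a score already at or below the floor.
  return clamp(riskScore, 0, 100);
}

console.log(applySafetyMode(50, "STRICT"));  // 60
console.log(applySafetyMode(50, "LENIENT")); // 45
console.log(applySafetyMode(17, "LENIENT")); // 15
```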
No single language model should be trusted as the sole authority on a consequential decision. The Arbitration Kernel v5.0 runs decisions through multiple frontier models in parallel, collects structured reasoning, scores for hallucination, applies weighted consensus using domain/cost/performance/ELO trust ratings, detects material disagreements via semantic clustering, cascades on weak consensus with tiebreaker models, and updates trust ratings over time.
“When models disagree, that disagreement itself is information — it signals that the decision has genuine ambiguity that may require human judgment.”
| Signal | Routing Decision |
|---|---|
| Emotionally sensitive content | Claude Sonnet primary (higher alignment) |
| Legal/compliance scenarios | GPT-5.4 primary (precise reasoning) |
| Financial/business context | GPT-5.4 primary (cost-tuned ensemble) |
| Long-term strategy | Llama 4 Scout primary (long-context capability) |
| Low risk tolerance | Wider ensemble (up to 3 cross-checking voters) |
| Strategy | Description |
|---|---|
| Weighted Consensus | Highest composite score (confidence × hallucination penalty × domain × cost × trust) |
| Majority Vote | The answer shared by the largest semantic cluster of models wins |
| Highest Confidence | Model with highest self-reported confidence wins |
| Conservative | Most cautious (lowest risk) answer wins |
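As an illustration of the Weighted Consensus row, the sketch below multiplies the five factors per vote and sums the results per distinct answer. The per-answer summation is one plausible reading of "highest composite score", not a confirmed detail; the `ModelVote` shape and the model names are hypothetical.

```typescript
interface ModelVote {
  model: string;
  answer: string;
  confidence: number;           // self-reported, 0..1
  hallucinationPenalty: number; // 1 = no penalty, lower = more suspect
  domainWeight: number;
  costWeight: number;
  trustWeight: number;
}

// confidence × hallucination penalty × domain × cost × trust
function compositeScore(v: ModelVote): number {
  return v.confidence * v.hallucinationPenalty * v.domainWeight * v.costWeight * v.trustWeight;
}

function weightedConsensus(votes: ModelVote[]): { answer: string; score: number } {
  if (votes.length === 0) throw new Error("no votes to arbitrate");
  const totals = new Map<string, number>();
  for (const v of votes) {
    totals.set(v.answer, (totals.get(v.answer) ?? 0) + compositeScore(v));
  }
  let best = { answer: "", score: -Infinity };
  totals.forEach((score, answer) => {
    if (score > best.score) best = { answer, score };
  });
  return best;
}

const votes: ModelVote[] = [
  { model: "model-a", answer: "PROCEED", confidence: 0.9, hallucinationPenalty: 1, domainWeight: 1, costWeight: 1, trustWeight: 1 },
  { model: "model-b", answer: "ESCALATE", confidence: 0.5, hallucinationPenalty: 1, domainWeight: 1, costWeight: 1, trustWeight: 1 },
  { model: "model-c", answer: "ESCALATE", confidence: 0.5, hallucinationPenalty: 1, domainWeight: 1, costWeight: 1, trustWeight: 1 },
];
console.log(weightedConsensus(votes)); // { answer: "ESCALATE", score: 1 }
```

In this example two moderately confident models outweigh one highly confident model, which is the behavior that distinguishes a consensus strategy from Highest Confidence.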
OmegaEngine computes risk from 14 independent factors organized into two tiers (6 + 8 = 14). Tier 1 (Base Heuristics, 6) covers base risk score, reversibility, time pressure, emotional volatility, information completeness, and domain severity. Tier 2 (Extended Dimensions, 8) adds stakeholder impact, cascade risk, regulatory exposure, financial materiality, reputation risk, precedent, uncertainty, and option value.
Each extended dimension contributes a weighted adjustment (−30 to +30) to the base risk score. To avoid tied scores on similar inputs while preserving reproducibility, the dimensions use a deterministic jitter function based on FNV-1a hashing — the same input always produces the same score, but near-identical inputs do not collapse to the same number.
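A deterministic jitter of this kind can be built on 32-bit FNV-1a as sketched below. The hash constants are the standard FNV-1a offset basis and prime; the `jitter` and `adjustedDimension` helpers, the way the hash is mapped to a signed offset, and the amplitude of 1 are assumptions made for illustration.

```typescript
// Standard 32-bit FNV-1a over the UTF-8 bytes of the input.
function fnv1a32(input: string): number {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit wraparound
  }
  return hash >>> 0; // force unsigned
}

// Maps the hash to a repeatable offset in [-amplitude, +amplitude].
function jitter(input: string, dimension: string, amplitude = 1): number {
  const unit = fnv1a32(dimension + ":" + input) / 0xffffffff; // 0..1
  return (unit * 2 - 1) * amplitude;
}

// Applies jitter to a dimension's adjustment and keeps it inside the -30..+30 range.
function adjustedDimension(baseAdjustment: number, input: string, dimension: string): number {
  const value = baseAdjustment + jitter(input, dimension);
  return Math.min(30, Math.max(-30, value));
}

const a = adjustedDimension(12, "wire $10,000 to vendor", "cascadeRisk");
const b = adjustedDimension(12, "wire $10,000 to vendor", "cascadeRisk");
const c = adjustedDimension(12, "wire $10,001 to vendor", "cascadeRisk");
console.log(a === b, a === c);
```

Because the offset is a pure function of the input, replaying a decision reproduces the score exactly, which matters when the heuristics hash and scores are later checked against the provenance chain.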
“Risk is not a single number — it is a 14-factor assessment (6 base + 8 extended) capturing stakeholder impact, cascade potential, regulatory exposure, and eleven other independent factors, each surfaced in the response for inspection.”
OmegaEngine produces two distinct kinds of cryptographic evidence, with different trust models. Every decision produces a Merkle-rooted provenance chain that is HMAC-SHA256 signed — a tamper-evident internal audit trail, verifiable by a party holding the signing secret. The component hashes are listed below.
Separately, security attestations and the transparency log are signed with Ed25519 against a published JWKS (/api/jwks). This makes them publicly verifiable: a third party can confirm an attestation's content hash, signature, and Merkle inclusion offline, with no shared secret and no access to our servers — `npx @omegaengine/verify <id>`. Attestations are appended to an RFC 6962-style append-only transparency log that publishes a signed tree head (/api/transparency/sth) and answers inclusion and consistency proofs, so a verifier can prove a result is in the log and that the log was never rewritten.
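The public-key half of this design can be sketched with Node's built-in crypto module. The example generates a throwaway Ed25519 key pair, exports the public key as a JWK (the form a JWKS endpoint would serve), and verifies an attestation using only that public key. The `Attestation` shape and helper names are hypothetical, and Merkle inclusion proofs against the transparency log are out of scope here.

```typescript
import { createHash, createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// Demo key pair; in a real deployment the private key stays on the server.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Public half in JWK form, as a JWKS endpoint would publish it.
const publishedJwk = publicKey.export({ format: "jwk" });
type Jwk = typeof publishedJwk;

interface Attestation {
  content: string;
  contentHash: string; // hex SHA-256 of content
  signature: string;   // base64 Ed25519 signature over the hash bytes
}

function issueAttestation(content: string): Attestation {
  const contentHash = createHash("sha256").update(content).digest("hex");
  const signature = sign(null, Buffer.from(contentHash, "hex"), privateKey).toString("base64");
  return { content, contentHash, signature };
}

// Needs only the attestation and the public JWK: no shared secret, no server call.
function verifyAttestation(att: Attestation, jwk: Jwk): boolean {
  const recomputed = createHash("sha256").update(att.content).digest("hex");
  if (recomputed !== att.contentHash) return false;
  const key = createPublicKey({ key: jwk, format: "jwk" });
  return verify(
    null,
    Buffer.from(att.contentHash, "hex"),
    key,
    Buffer.from(att.signature, "base64"),
  );
}

const att = issueAttestation("red-team run: 0 critical findings");
console.log(verifyAttestation(att, publishedJwk));
```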
“‘Verify us — don't trust us.’ The HMAC chain proves the operator's audit trail is intact; the Ed25519 + JWKS + transparency-log construction lets anyone prove an attestation is authentic and unaltered, offline, against a public key.”
| Component | Algorithm | Purpose |
|---|---|---|
| Genesis Hash | SHA-256 | Fingerprint of the original raw input |
| Input Hash | SHA-256 | Hash of the normalized, canonicalized input |
| Heuristics Hash | SHA-256 | Hash of the scoring weights used |
| Policy Version Hash | SHA-256 | Hash of the policy version applied |
| Model Fingerprint | SHA-256 | Hash of the model(s) that contributed |
| Merkle Root | SHA-256 (iterative) | Tamper-evident composite of all component hashes |
| HMAC Signature | HMAC-SHA256 | Signed for authenticity verification |
| Timestamp Witness | Configurable | Local clock by default; external witness (database / RFC 3161) under dual/strict via OMEGA_TIMESTAMP_WITNESS |
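A minimal sketch of the internal chain follows: SHA-256 component hashes are folded into a Merkle root, and the root is signed with HMAC-SHA256. The leaf order, the rule of duplicating the last node on odd-sized levels, and the helper names are assumptions; the paper does not specify these details.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

function sha256(data: string | Buffer): Buffer {
  return createHash("sha256").update(data).digest();
}

// Iteratively hashes pairs of nodes until one root remains.
function merkleRoot(leaves: Buffer[]): Buffer {
  if (leaves.length === 0) throw new Error("at least one leaf is required");
  let level = leaves;
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      // Assumed convention: an unpaired last node is hashed with itself.
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(sha256(Buffer.concat([left, right])));
    }
    level = next;
  }
  return level[0];
}

function signRoot(root: Buffer, secret: string): string {
  return createHmac("sha256", secret).update(root).digest("hex");
}

// Verification needs the same secret, which is why this trail is internal-only.
function verifyChain(components: string[], signature: string, secret: string): boolean {
  const expected = Buffer.from(signRoot(merkleRoot(components.map((c) => sha256(c))), secret), "hex");
  const actual = Buffer.from(signature, "hex");
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}

// Placeholder component values standing in for the rows of the table.
const components = ["raw input", "normalized input", "weights v1", "policy v1", "model-x"];
const secret = "demo-secret";
const signature = signRoot(merkleRoot(components.map((c) => sha256(c))), secret);
console.log(verifyChain(components, signature, secret));
```

Changing any single component changes its leaf hash, which changes the root, which invalidates the signature; that is the tamper-evidence property the table describes.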
The policy engine is an enterprise-grade, tenant-aware, plan-aware risk router. It evaluates decisions against a strict hierarchy: adversarial pattern detection (always runs, cannot be disabled) → hard safety blocks (CSAM, extremism, self-harm) → sensitive domain enforcement → sensitive tag enforcement → arbitration-based escalation → role/plan routing → default allow.
The policy version is auto-computed from a SHA-256 hash of the hard block tag list, so when rules change, the version changes automatically and is captured in every subsequent provenance chain. If any exception occurs during evaluation, the system returns REQUIRE_HUMAN_REVIEW — never silently allowing a request that couldn't be properly evaluated. The engine ships 27 industry presets (banking, healthcare, government, legal, education, SaaS, cybersecurity, retail, and more) as tenant-configurable starting points with dual-control enforcement.
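Both behaviors are easy to sketch. `REQUIRE_HUMAN_REVIEW` is the outcome named above; the other decision names, the example tags, and the choice to sort tags before hashing (so that reordering the list does not change the version) are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

// Version is derived from the rule list, so editing the rules changes the version.
function policyVersion(hardBlockTags: readonly string[]): string {
  const canonical = [...hardBlockTags].sort().join("\n"); // assumed canonical form
  return createHash("sha256").update(canonical).digest("hex");
}

type PolicyDecision = "ALLOW" | "BLOCK" | "REQUIRE_HUMAN_REVIEW";

// Fail closed: any exception during evaluation routes the request to a human.
function evaluateFailClosed(evaluate: () => PolicyDecision): PolicyDecision {
  try {
    return evaluate();
  } catch (err) {
    return "REQUIRE_HUMAN_REVIEW";
  }
}

const v1 = policyVersion(["csam", "extremism", "self-harm"]);
const v2 = policyVersion(["csam", "extremism", "self-harm", "new-tag"]);
console.log(v1 !== v2); // true: a rule change yields a new version

console.log(
  evaluateFailClosed(() => {
    throw new Error("rule lookup failed");
  }),
); // "REQUIRE_HUMAN_REVIEW"
```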
“Adversarial pattern detection runs before all other policy checks and cannot be disabled — even if the compliance engine is turned off.”
| Plan | LOW Risk | MEDIUM Risk | HIGH Risk | CRITICAL |
|---|---|---|---|---|
| Free | Allow | Review (sensitive) | Block (sensitive) | Block |
| Pro | Allow | Allow | Review | Block |
| Enterprise | Allow | Allow (logged) | Review | Block |
The local judgment model implements 13 independent defense layers: obfuscation detection, safety analysis (multi-category heuristic patterns), firewall, multi-language attack detection (12+ languages), attack classification (12+ types), adversarial suffix detection, prompt leakage prevention, context injection detection, homoglyph normalization, semantic similarity detection, canary token detection, payload structure analysis, and multi-signal fusion.
Every decision also includes built-in red-teaming via adversarial self-verification: devil's advocate arguments, domain-specific failure scenarios, blind spot detection (6 categories), 7 cognitive bias checks (confirmation, survivorship, anchoring, availability, sunk cost, recency, optimism), and an adversarial robustness score (0–100).
“Every decision red-teams itself before responding — generating devil's advocate arguments and checking for 7 cognitive biases that could compromise judgment quality.”
The system supports three-level feedback (GOOD/NEUTRAL/BAD) used for OmegaGrade quality scoring per domain (S/A+/A/B/C/D), scoring weight calibration, RLHF training data export, and topic analytics.
Historical performance of each model is tracked via OmegaIQ scores (0–100) and ELO trust ratings. These feed back into the arbitration kernel's weighting, creating a self-improving model selection system.
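The paper does not state the exact rating update rule, so the sketch below uses the standard Elo formula as a stand-in. The K-factor of 32, the starting rating of 1500, and the idea of scoring a model against a reference rating after each validated decision are all assumptions.

```typescript
// Standard Elo expected score of A against B.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// outcome: 1 = model was right, 0.5 = inconclusive, 0 = model was wrong.
function updateElo(rating: number, opponent: number, outcome: 0 | 0.5 | 1, k = 32): number {
  return rating + k * (outcome - expectedScore(rating, opponent));
}

console.log(updateElo(1500, 1500, 1)); // 1516
console.log(updateElo(1500, 1500, 0)); // 1484
```

A model that is already highly rated gains little from an expected success and loses more from a surprising failure, which keeps trust weights from drifting upward on easy cases.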
The temporal analysis module adds time-awareness: optimal timing windows (ACT_NOW through DELAY_RECOMMENDED), time decay scoring, opportunity cost per day with domain multipliers, decision half-life (24h to 1 year), deadline proximity detection, and seasonal factor awareness (quarters, holidays, budget cycles).
All requests pass through a unified edge proxy orchestrating: kill switch (system-wide emergency disable), global rate limiting, access gate, 1MB body size limits, admin CORS enforcement, security headers (CSP, HSTS, X-Frame-Options), UUID request ID propagation, API versioning with Sunset/Deprecation headers, and auth assertion tracking.
The PostgreSQL schema comprises 93 models with comprehensive indexing on high-cardinality columns and compound keys. Retention policies are configurable per organization (7–365 days, or unlimited on Enterprise). The core is Apache-2.0 and self-hostable, so you can run the identical API in your own VPC.
Data handling: a decision record persists the scenario text and raw input/output (JSON), the verdict, and tenant scoping; PII redaction (lib/pii.ts) is available, retention is configurable per org, and secrets are encrypted at rest with AES-256-GCM (KMS-backed for enterprise). BYOK provider keys supplied by header are request-scoped (AsyncLocalStorage) and not persisted; dashboard-saved keys are encrypted at rest with only a prefix kept for display. For strict data-residency or air-gap requirements, self-host the Apache-2.0 core in your own VPC/region — there is no certified residency guarantee on the hosted path. See the whitepaper §13 and the Data Processing Agreement.
| Target | Configuration |
|---|---|
| Vercel (recommended) | Auto-deploy: main → staging → production promotion |
| Docker | docker-compose.yml with PostgreSQL + application |
| Kubernetes | Full k8s manifests with Helm charts |
| Self-hosted | Deployment checklist with hardening guide |
OmegaEngine ships built-in compliance test mappings for 17 frameworks: OWASP LLM Top 10, OWASP Agentic Top 10, NIST AI RMF, EU AI Act, ISO 42001, SOC 2, HIPAA, PCI-DSS, ISO 27001, GDPR, CCPA, FedRAMP, CISA AI, Singapore PDPA, Brazil LGPD, Japan APPI, and the Australia Privacy Act.
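All audit records are signed with HMAC-SHA256, providing tamper evidence (any modification invalidates the signature), proof of a specific decision at a specific time to any party holding the signing secret, and chain integrity (Merkle trees enable batch verification). Because HMAC is symmetric, non-repudiation toward third parties rests on the Ed25519-signed attestations and transparency log rather than on the HMAC chain.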
Designed for production-grade latency and throughput:
| Path | Latency | Description |
|---|---|---|
| Deterministic (no LLM) | sub-second (avg <500ms · p95 <1s) | Free tier, safety-only evaluation |
| Single-model decision | < 2s | One LLM call + full pipeline |
| Multi-model arbitration | < 5s | Parallel execution, bounded by slowest model |
| Local safety analysis | ~1–2 ms | Rules-based, no network I/O |
| Circuit breaker fast-fail | 0ms | Instant rejection when provider is OPEN |
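The fast-fail row follows from the usual circuit breaker pattern, sketched below. Only the OPEN state is named in the table; the CLOSED and HALF_OPEN states, the failure threshold, the cooldown, and the injectable clock are standard pattern elements assumed here, not confirmed implementation details.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "CLOSED";
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    // After the cooldown, allow a single trial call.
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  call<T>(fn: () => T): T {
    const stateBefore = this.getState();
    if (stateBefore === "OPEN") {
      // Fast-fail: the provider is never contacted, so no network latency is paid.
      throw new Error("circuit open");
    }
    try {
      const result = fn();
      this.failures = 0;
      this.state = "CLOSED";
      return result;
    } catch (err) {
      this.failures += 1;
      if (stateBefore === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

const breaker = new CircuitBreaker(2, 30_000);
console.log(breaker.getState()); // "CLOSED"
```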
A governance layer is only as good as the threats it actually defends. The assets protected are the integrity of each verdict, the audit trail, the authenticity of attestations, the Ed25519 signing key, and tenant/BYOK provider keys.
LLM providers are treated as untrusted (hence arbitration and BYOK isolation); the deterministic safety + policy core is trusted and self-hostable; the signing key never leaves the server; and the only thing a verifier must trust is the public JWKS. Out of scope: OmegaEngine governs the decision to act — it does not execute the action, and the integrating agent must treat denied/escalated as blocking.
“Providers are untrusted by design — no single model is ever a point of trust. The only public root of trust is a published key anyone can read.”
| Adversary | Capability | Primary defense |
|---|---|---|
| Jailbreaking agent / prompt injection | Crafts inputs (directly or via poisoned retrieval) to elicit a forbidden action | Deterministic Tier-1 safety + optional judge + policy hard-blocks |
| Malicious tenant | Escalate plan/credits or read another tenant's data | RBAC + scope/role/plan guards; per-tenant isolation; Stripe-verified entitlements |
| Compromised model provider | Returns wrong or adversarial output | Multi-model arbitration + disagreement detection + circuit breakers |
| Tamperer / repudiator | Alters or denies a past decision or attestation | HMAC + Merkle chain; Ed25519 + transparency log for public attestations |
OmegaEngine is early-stage, and a top-tier system states what it does not yet do. We do not hold formal certifications (SOC 2, HIPAA, ISO, FedRAMP) — compliance mappings are self-assessed control readiness, and MITRE ATLAS mapping is self-assessed.
Safety recall is not 100%: the deterministic tier catches ~81% (94/116) at 0% false positives, and the combined two-tier system reaches 97.4% — but the LLM judge that closes the gap is default-off and adds latency and cost. The base decision engine is heuristic, not a trained model (consistent, free, unjailbreakable — but it relies on the LLM tier and arbitration for novel semantics). HMAC audit chains require the signing secret to verify; for verification by untrusted third parties, use the Ed25519 attestations and transparency log. And governance is not enforcement of the downstream effect — honoring a verdict is the integrating system's responsibility.
“We would rather tell you a limit than have you discover it in production. Every number in this whitepaper is reproducible from the source tree or a named benchmark.”
OmegaEngine is, to our knowledge, the only platform that combines all of these capabilities at once:
| Capability | Prompt Eng. | Post-Hoc Logs | Single Guard | Custom Rules | OmegaEngine |
|---|---|---|---|---|---|
| Pre-execution gating | ✗ | ✗ | ✓ | ✓ | ✓ |
| Multi-model consensus | ✗ | ✗ | ✗ | ✗ | ✓ |
| Cryptographic audit | ✗ | Partial | ✗ | ✗ | ✓ |
| Publicly verifiable attestations | ✗ | ✗ | ✗ | ✗ | ✓ (Ed25519 + JWKS) |
| Adversarial testing | ✗ | ✗ | ✗ | ✗ | ✓ |
| Fail-closed safety | ✗ | N/A | Partial | ✗ | ✓ |
| Cognitive bias detection | ✗ | ✗ | ✗ | ✗ | ✓ (7 biases) |
| Compliance mapping | ✗ | Partial | ✗ | ✗ | ✓ (17 frameworks) |
| Quarter | Focus |
|---|---|
| Q1 2026 ✅ | RBAC + scoping, cognitive memory, decision cache, circuit breakers, 57 compliance controls, SARIF red team export |
| Q2 2026 🔧 | Terminal Pro, SDK v1.0 (npm + PyPI), sandbox rebuild, quickstart wizard, badge API, webhook integrations, per-seat enterprise billing |
| Q3 2026 | Agent-to-agent security testing, SIEM integrations, insurance underwriting APIs, multi-region deployment |
| Q4 2026 | Blockchain anchoring, self-hosted enterprise edition, on-premises deployment toolkit |
OmegaEngine provides the governance layer that makes autonomy auditable, safe, and provable. Every decision is evaluated, every action is enforced, and every outcome is cryptographically proven.
@software{omegaengine2026,
title = {OmegaEngine: Decision Infrastructure for Autonomous AI},
author = {Mahoney, Jesse},
year = {2026},
url = {https://omegaengine.ai/whitepaper},
version = {2.0}
}
© 2026 OmegaEngine. Apache-2.0 License.
v2.0 · Updated June 2026 · 18 min read