Static attack-surface analysis · deterministic · reproducible

The Agent Security Leaderboard

Six agent blueprints, the tool stacks teams ship first, scored by OmegaEngine's static threat-model analyzer. None scores above a C. The most exposed, the DevOps / infra agent, ships 3 dangerous tools out of 5 with no approval gate.

DevOps / infra agent

Deploys services, rotates credentials, manages environments.

exfiltration ×1 · destructive ×1 · privilege ×1

3/5 dangerous tools

3 targeted attacks synthesized

surface 42/100

Finance / payments agent

Treasury copilot: invoices, payouts, transfers, refunds.

financial ×4 · has approval tool

4/5 dangerous tools

4 targeted attacks synthesized

surface 48/100

Customer-support agent

Help-desk copilot with ticketing, email, billing and account tools.

exfiltration ×1 · financial ×1

2/6 dangerous tools

2 targeted attacks synthesized

surface 53/100

Coding agent

Edits the repo, opens PRs, manages branches in CI.

exfiltration ×1 · destructive ×1

2/5 dangerous tools

2 targeted attacks synthesized

surface 58/100

Email / inbox assistant

Reads the inbox, drafts and sends replies, forwards threads.

exfiltration ×2

2/5 dangerous tools

2 targeted attacks synthesized

surface 66/100

RAG knowledge assistant

Answers questions over internal docs; posts summaries to chat.

exfiltration ×2

2/4 dangerous tools

2 targeted attacks synthesized

surface 66/100

Methodology

Each blueprint's tool schema is classified by the same open-source analyzer the agent-scan CLI uses. It matches capability keywords against tool names, descriptions and parameters, with the classes ranked destructive > privilege > financial > exfiltration, and synthesizes one targeted attack case per dangerous tool. The surface score starts at 100 and deducts per dangerous tool by severity. It then deducts again for each of the two couplings real exploits use: an exfiltration tool alongside data-read tools, and financial tools without an approval tool.
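To make the scoring rule concrete, here is a minimal Python sketch of keyword classification plus the deduction scheme. It is not the analyzer's actual code. The keyword lists, the penalty weights, the `Tool` type and the function names are all assumptions made for illustration, and the sketch reads the ranking as both a match precedence and a severity order. The placeholder weights were picked so that this toy four-tool input lands on 53/100; the real weights are not given in the text.

```python
import re
from typing import NamedTuple, Optional


class Tool(NamedTuple):
    name: str
    description: str = ""
    params: tuple = ()


# Hypothetical keyword vocabulary, listed in the ranking order from the text:
# destructive > privilege > financial > exfiltration.
CAPABILITY_KEYWORDS = {
    "destructive": {"delete", "drop", "destroy", "wipe", "terminate"},
    "privilege": {"credential", "credentials", "rotate", "grant", "permission"},
    "financial": {"refund", "payout", "transfer", "payment", "invoice"},
    "exfiltration": {"send", "email", "post", "upload", "forward"},
}
READ_KEYWORDS = {"read", "search", "lookup", "fetch", "query"}
APPROVAL_KEYWORDS = {"approve", "approval"}

# Placeholder weights (assumptions, not the analyzer's real values).
SEVERITY_PENALTY = {"destructive": 20, "privilege": 16, "financial": 13, "exfiltration": 12}
EXFIL_WITH_READ_PENALTY = 10
UNAPPROVED_FINANCIAL_PENALTY = 12


def tokens(tool: Tool) -> set:
    """Lower-case word tokens from the tool's name, description and parameters."""
    text = " ".join((tool.name, tool.description, *tool.params)).lower()
    return set(re.findall(r"[a-z]+", text))


def classify(tool: Tool) -> Optional[str]:
    """Return the highest-ranked capability class that matches, or None."""
    words = tokens(tool)
    if words & APPROVAL_KEYWORDS:
        return None  # approval gates are treated as non-dangerous in this sketch
    for capability, keywords in CAPABILITY_KEYWORDS.items():
        if words & keywords:
            return capability
    return None


def surface_score(tools: list) -> int:
    """Start at 100, deduct per dangerous tool, then deduct for the two couplings."""
    classes = [classify(t) for t in tools]
    score = 100 - sum(SEVERITY_PENALTY[c] for c in classes if c)
    has_read = any(c is None and tokens(t) & READ_KEYWORDS for t, c in zip(tools, classes))
    has_approval = any(tokens(t) & APPROVAL_KEYWORDS for t in tools)
    if "exfiltration" in classes and has_read:
        score -= EXFIL_WITH_READ_PENALTY
    if "financial" in classes and not has_approval:
        score -= UNAPPROVED_FINANCIAL_PENALTY
    return max(score, 0)


blueprint = [
    Tool("lookup_order", "Look up an order by id"),
    Tool("search_docs", "Search the help-center articles"),
    Tool("send_email", "Send an email to a customer"),
    Tool("issue_refund", "Refund a charge to the original card"),
]
print([classify(t) for t in blueprint])  # [None, None, 'exfiltration', 'financial']
print(surface_score(blueprint))          # 53
```

Under these assumed weights, adding an approval tool to the same blueprint removes only the unapproved-financial deduction. The per-tool deductions stay, which is consistent with the finance blueprint still scoring low despite having an approval tool.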

This measures the blueprint's surface, not a live agent's behavior. What your agent actually does under these attacks is what the live scan measures — and what the Agent-Security Benchmark tracks across defense levels.

Interactive · runs in your browser

Score your blueprint right here.

One tool per line (name: description). The same analyzer, running locally — nothing leaves this page.

attack surface 53/100

2/4 dangerous tools · no approval tool

Attacks the live scan would run

→ Exfiltration via send_email
→ Unapproved financial action via issue_refund
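The one-tool-per-line input format (name: description) is simple enough to parse in a few lines. The following is a sketch only, assuming blank lines are skipped and a line without a colon is treated as a tool with an empty description. The function name `parse_blueprint` is hypothetical and not part of agent-scan.

```python
def parse_blueprint(text: str) -> list:
    """Parse 'name: description' lines into (name, description) tuples."""
    tools = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split on the first colon only, so descriptions may contain colons.
        name, _, description = line.partition(":")
        tools.append((name.strip(), description.strip()))
    return tools


sample = """
send_email: Send an email to a customer
issue_refund: Refund a charge to the original card
"""
for name, description in parse_blueprint(sample):
    print(f"{name!r} -> {description!r}")
```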


Where does your agent land?

Run the same analyzer on your real tool schema — free, offline, no key. Then drive your live agent through every synthesized attack and get a signed report.

$ npx @omegaengine/agent-scan --config agent.json

Ω OmegaEngine
