Red Team API
Adversarial safety scanning for your agents and models. A campaign drives a set of probes (prompt injection, jailbreak, data extraction, …) against a target, an LLM judge decides each attempt, and the run reports an attack-success / resilience score per OWASP-LLM category — so a CI pipeline can fail when an agent regresses on safety.
The client surface (/api/client/v1/redteam/*) is read-and-trigger only: it lists the built-in probe catalog and configured campaigns, launches scans, and reads runs. Authoring campaigns and custom probes stays on the dashboard surface (/api/redteam/*). All client endpoints are API-token authenticated with a cpeer_ bearer token and are scoped to the token's project.
GET /api/client/v1/redteam/probes – built-in probe catalog
GET /api/client/v1/redteam/campaigns – list configured campaigns
POST /api/client/v1/redteam/campaigns/:key/scan – launch a scan (async)
GET /api/client/v1/redteam/runs – list runs
GET /api/client/v1/redteam/runs/:id – one run + per-attempt verdictsConcepts
- Probe — a generator for one vulnerability class (e.g.
prompt-injection,jailbreak). Each probe carries an OWASP-LLMcategoryand aseverity(low | medium | high | critical). - Custom probes — user-authored probes (created on the dashboard) are referenced with a
custom:key prefix (e.g.custom:my-leak-test). This client surface lists only built-in probes; custom probes still run when selected on a campaign. - Campaign — a named, reusable scan config: target (agent or model), the
probe_keysto run, and the judge model. - Run — one execution of a campaign. Each attempt gets a three-state verdict:
safe,vulnerable, orneeds_review. The run aggregate rolls those up into anattackSuccessRate(vulnerable / completed) andresilienceScore(1 - attackSuccessRate), broken down by severity and OWASP category.
List probes
GET /api/client/v1/redteam/probes
Authorization: Bearer cpeer_…Returns the built-in probe catalog. custom is false for every entry here (custom probes are not advertised on the client surface).
Response
{
"probes": [
{
"key": "prompt-injection",
"name": "prompt-injection",
"family": "prompt-injection",
"category": "LLM01-prompt-injection",
"severity": "high",
"description": "Attempts to override the system prompt via injected instructions.",
"custom": false
},
{
"key": "sensitive-info-disclosure",
"name": "sensitive-info-disclosure",
"family": "pii-leak",
"category": "LLM06-sensitive-information-disclosure",
"severity": "high",
"description": "...",
"custom": false
}
]
}Built-in probe keys: prompt-injection, encoding-injection, jailbreak, sensitive-info-disclosure, pii-generation, data-extraction, insecure-output-handling, excessive-agency, overreliance-hallucination.
OWASP-LLM categories used by the catalog: LLM01-prompt-injection, LLM02-insecure-output-handling, LLM06-sensitive-information-disclosure, LLM08-excessive-agency, LLM09-overreliance.
List campaigns
GET /api/client/v1/redteam/campaigns
Authorization: Bearer cpeer_…Lists the campaigns configured in the token's project.
Response
{
"campaigns": [
{
"key": "nightly-agent-scan",
"name": "Nightly agent scan",
"description": "Full OWASP sweep against the support agent",
"target_kind": "agent",
"agent_key": "support-agent",
"model_key": null,
"probe_keys": ["prompt-injection", "jailbreak", "custom:my-leak-test"],
"judge_model_key": "gpt-4o",
"created_at": "2026-06-01T09:00:00.000Z"
}
]
}| Field | Type | Notes |
|---|---|---|
key | string | Campaign identifier; used in the scan path. |
target_kind | agent | model | What the campaign tests. |
agent_key / model_key | string | null | The target reference (one is set, per target_kind). |
probe_keys | string[] | Selected probes. Empty means all built-ins run. custom:-prefixed keys reference custom probes. |
judge_model_key | string | null | LLM judge for the campaign (also drives adaptive attacker turns). |
Launch a scan
POST /api/client/v1/redteam/campaigns/:key/scan
Authorization: Bearer cpeer_…Enqueues an asynchronous scan for the campaign identified by :key and returns immediately with a pending run. No request body is required. Poll the run-detail endpoint to watch it finish.
| Param | In | Required | Notes |
|---|---|---|---|
key | path | yes | Campaign key. |
Response — 202 Accepted
{
"run": {
"id": "665f1a2b3c4d5e6f70819a2b",
"campaign_key": "nightly-agent-scan",
"target_kind": "agent",
"target_ref": "support-agent",
"status": "pending",
"aggregate": null,
"started_at": null,
"finished_at": null,
"created_at": "2026-06-15T12:00:00.000Z"
},
"status": "pending"
}If a scan for the same campaign is already running, the request is rejected with 409 Conflict.
List runs
GET /api/client/v1/redteam/runs?campaign_key=nightly-agent-scan&limit=20
Authorization: Bearer cpeer_…Returns run summaries (no per-attempt detail), newest first.
| Query | Type | Notes |
|---|---|---|
campaign_key | string | Filter to one campaign. |
limit | int | Max rows; clamped to 1..200. |
Response
{
"runs": [
{
"id": "665f1a2b3c4d5e6f70819a2b",
"campaign_key": "nightly-agent-scan",
"target_kind": "agent",
"target_ref": "support-agent",
"status": "completed",
"aggregate": {
"total": 48,
"completed": 48,
"failed": 0,
"vulnerable": 3,
"safe": 43,
"needsReview": 2,
"attackSuccessRate": 0.0625,
"resilienceScore": 0.9375,
"bySeverity": { "low": 0, "medium": 1, "high": 2, "critical": 0 },
"byCategory": {
"LLM01-prompt-injection": { "total": 12, "vulnerable": 2, "needsReview": 0 },
"LLM06-sensitive-information-disclosure": { "total": 8, "vulnerable": 1, "needsReview": 1 }
},
"avgLatencyMs": 742.0
},
"started_at": "2026-06-15T12:00:01.000Z",
"finished_at": "2026-06-15T12:03:20.000Z",
"created_at": "2026-06-15T12:00:00.000Z"
}
]
}status is one of pending, running, completed, failed, cancelled. aggregate is null until the run completes.
Run detail
GET /api/client/v1/redteam/runs/:id
Authorization: Bearer cpeer_…Returns the run summary plus progress, any fatal error, and the per-attempt verdicts.
Response
{
"run": {
"id": "665f1a2b3c4d5e6f70819a2b",
"campaign_key": "nightly-agent-scan",
"target_kind": "agent",
"target_ref": "support-agent",
"status": "completed",
"aggregate": { "...": "see List runs" },
"started_at": "2026-06-15T12:00:01.000Z",
"finished_at": "2026-06-15T12:03:20.000Z",
"created_at": "2026-06-15T12:00:00.000Z",
"progress": { "total": 48, "completed": 48, "failed": 0 },
"error": null,
"attempts": [
{
"probe_key": "prompt-injection",
"attempt_id": "pi-003",
"family": "prompt-injection",
"category": "LLM01-prompt-injection",
"severity": "high",
"outcome": "vulnerable",
"machine_outcome": "vulnerable",
"decided_by": "deterministic-canary",
"confidence": 0.97,
"reviewed": false,
"latency_ms": 810,
"error": null
}
]
}
}| Attempt field | Type | Notes |
|---|---|---|
outcome | safe | vulnerable | needs_review | Effective verdict — a human review override, if present, wins. |
machine_outcome | safe | vulnerable | needs_review | The engine's original verdict before any review. |
decided_by | string | Which decision rule fired (audit trail). |
confidence | number | Verdict confidence in [0, 1]. |
reviewed | boolean | True when a human reviewed this attempt on the dashboard. |
latency_ms | number | Target invocation latency. |
error | string | null | Set when the target invocation itself threw (counts toward failed). |
Errors
| Status | Cause |
|---|---|
| 400 | Missing campaign key on a scan request. |
| 401 | Missing or invalid API token. |
| 404 | Campaign or run not found. |
| 409 | A scan for this campaign is already in progress. |
| 500 | Internal error. |
Example
# Launch a scan and capture the run id
RUN_ID=$(curl -s -X POST \
https://api.cognipeer.com/api/client/v1/redteam/campaigns/nightly-agent-scan/scan \
-H "Authorization: Bearer cpeer_your_token" | jq -r '.run.id')
# Poll until it finishes, then read the resilience score
curl -s https://api.cognipeer.com/api/client/v1/redteam/runs/$RUN_ID \
-H "Authorization: Bearer cpeer_your_token" \
| jq '{status: .run.status, resilience: .run.aggregate.resilienceScore, vulnerable: .run.aggregate.vulnerable}'