Skip to content

Analysis API

Conversation analysis: extract structured fields from transcripts, judge quality against a rubric, and score extraction accuracy against ground truth — on demand or on a nightly cron. Built for use cases like IVR/call-center review.

All endpoints are under /api/analysis/* and are session-authenticated. Requests are tenant- and project-scoped from the session.

Concepts

definition    →  fieldSet + extraction prompt + modes (+ models, + schedule)
conversation  →  an ingested transcript { role, content }[] (+ referenceFields)
run           →  one execution of a definition over a set of conversations

Modes

ModeEffect
extractAlways on — pulls the fieldSet out of each transcript as typed JSON.
storeWrites the extracted fields back onto each conversation (extractedFields, lastAnalyzedAt).
judgeAn LLM grades each conversation against judge.rubric (0–1).
accuracyCompares extracted fields to the conversation's referenceFields, per field.

Definitions

http
GET    /api/analysis/definitions?search=intent
POST   /api/analysis/definitions
GET    /api/analysis/definitions/:id
PATCH  /api/analysis/definitions/:id
DELETE /api/analysis/definitions/:id

Create

json
{
  "name": "Call intent & resolution",
  "fieldSet": [
    { "key": "intent", "type": "enum", "enumValues": ["billing", "support"], "required": true },
    { "key": "resolved", "type": "boolean" }
  ],
  "extractionInstructions": "Focus on the caller's primary reason.",
  "modes": { "store": true, "accuracy": true, "judge": { "rubric": "Was the caller helped politely?" } },
  "extractionModelKey": "gpt-4o-mini",
  "judgeModelKey": "gpt-4o",
  "schedule": { "cron": "0 2 * * *", "enabled": true }
}
FieldTypeRequiredNotes
namestringyeskey is slugified from it.
fieldSetarrayyesEach { key, type, description?, enumValues?, required? }; typestring | number | boolean | enum. enum needs enumValues.
modesobjectyes{ store?, accuracy?, judge?: { rubric, threshold? } }.
extractionModelKeystringrecommendedModel used for extraction (required at run time).
judgeModelKeystringwhen modes.judgeModel used for grading.
scheduleobjectno{ cron, enabled }. Validated with a standard 5-field cron expression (UTC).

Conversations

http
GET    /api/analysis/conversations?search=refund&limit=100
POST   /api/analysis/conversations
GET    /api/analysis/conversations/:id
DELETE /api/analysis/conversations/:id

Ingest

Accepts a single conversation or { "conversations": [...] } for bulk import (e.g. an external call export).

json
{
  "conversations": [
    {
      "name": "Call 1042",
      "transcript": [
        { "role": "caller", "content": "I was charged twice." },
        { "role": "agent", "content": "I've issued a refund." }
      ],
      "referenceFields": { "intent": "billing", "resolved": true }
    }
  ]
}
FieldTypeRequiredNotes
transcriptarrayyesNon-empty { role, content }[].
referenceFieldsobjectnoGround truth for the accuracy mode.
namestringnoDisplay name; key is slugified/auto-generated.
sourceimported | platform | manualnoDefaults to imported.

Runs

Run a definition

http
POST /api/analysis/definitions/:key/run
json
{
  "selection": {
    "strategy": "keys",
    "conversationKeys": ["call-1042", "call-1043"]
  }
}

The request body must include a selection object describing which conversations to analyze:

FieldTypeRequiredNotes
strategystringyesOne of all | tag | random | unanalyzed | keys.
tagstringwhen strategy: tagTag to filter conversations by.
sampleSizenumberwhen strategy: randomNumber of conversations to sample.
conversationKeysstring[]when strategy: keysExplicit conversation keys to run.

The run is asynchronous: it is enqueued and returns immediately with HTTP 202 Accepted and a pending run. Extraction then runs per conversation with bounded concurrency, followed by optional judge and accuracy; the run is persisted with an aggregate once it finishes. Poll GET /api/analysis/runs/:id to watch it complete.

Response — 202 Accepted

json
{
  "run": {
    "id": "…",
    "definitionKey": "call-intent-resolution",
    "status": "pending",
    "aggregate": null,
    "items": []
  }
}

Once completed, the run's aggregate and items are populated:

json
{
  "run": {
    "id": "…",
    "definitionKey": "call-intent-resolution",
    "status": "completed",
    "aggregate": { "total": 120, "completed": 118, "failed": 2, "passed": 110, "passRate": 0.932, "avgJudgeScore": 0.88, "avgExtractionAccuracy": 0.91 },
    "items": [
      { "conversationKey": "call-1042", "passed": true, "extractedFields": { "intent": "billing", "resolved": true }, "missing": [], "judge": { "score": 0.9, "passed": true }, "accuracy": { "score": 1, "comparedCount": 2, "perField": { "intent": { "expected": "billing", "actual": "billing", "match": true } } } }
    ]
  }
}

avgJudgeScore / avgExtractionAccuracy are null when no item used that mode.

List / get runs

http
GET /api/analysis/runs?definitionKey=call-intent-resolution&limit=50
GET /api/analysis/runs/:id

Scheduling

When a definition has schedule.enabled with a valid cron, the background analysis scheduler fires it automatically (e.g. 0 2 * * * = 02:00 UTC nightly). Each cron slot fires at most once, decided against the most recent run. See the guide.

Alerting

Run aggregates feed the alert system through the analysis module:

MetricSource
analysis_pass_rateaggregate.passRate × 100, averaged over completed runs in the window.
analysis_avg_judge_scoreaggregate.avgJudgeScore × 100.
analysis_avg_accuracyaggregate.avgExtractionAccuracy × 100.

A rule like analysis_avg_accuracy lt 85 over 24h notifies you when extraction quality drops on the nightly run.

Errors

StatusCause
400Missing name, empty/invalid fieldSet, enum without enumValues, bad cron, transcript missing role/content.
404Definition / conversation / run not found (or unknown definition key on run).
409A run for this definition is already in progress.
500Internal error.

Community edition is AGPL-3.0. Commercial licensing and support are available separately.