Chat Completions API

OpenAI-compatible chat completion endpoint with optional streaming.

Endpoint

POST /api/client/v1/chat/completions

Request

json
{
  "model": "gpt-4",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false,
  "request_id": "optional-correlation-id"
}

Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model key configured in the dashboard |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2) |
| max_tokens | number | No | Maximum tokens in response |
| max_completion_tokens | number | No | Alternative to max_tokens |
| stream | boolean | No | Enable SSE streaming (default: false) |
| request_id | string | No | Client-provided correlation ID |
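
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the gateway or any SDK: `build_chat_request` is a hypothetical helper name, and rejecting an empty `messages` list is an assumption (the table only says the field is required).

```python
from __future__ import annotations

from typing import Any, Optional


def build_chat_request(
    model: str,
    messages: list[dict[str, str]],
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: bool = False,
    request_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build a request body, checking the constraints listed in the table."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty list is treated the same as a missing field.
        raise ValueError("messages is required and must not be empty")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body: dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    # Optional fields are only included when the caller sets them.
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if request_id is not None:
        body["request_id"] = request_id
    return body
```

Failing early on the client avoids a round trip that would otherwise end in a 400 response.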

Message Format

json
{ "role": "system" | "user" | "assistant" | "tool", "content": "..." }

Response (Non-Streaming)

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  },
  "request_id": "req_abc123"
}
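
Reading the assistant's reply out of this structure could look like the sketch below. `extract_reply` is a hypothetical helper. It reads only the first choice and falls back to 0 when `usage` is absent, which is an assumption about how a caller might want to handle a missing field.

```python
from __future__ import annotations

from typing import Any


def extract_reply(response: dict[str, Any]) -> tuple[str, int]:
    """Return the first choice's text and the total token count."""
    first_choice = response["choices"][0]
    content = first_choice["message"]["content"]
    # Assumption: default to 0 if the usage block is missing.
    total_tokens = response.get("usage", {}).get("total_tokens", 0)
    return content, total_tokens
```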

Response (Streaming)

When stream is true, the response is a Server-Sent Events (SSE) stream of chat.completion.chunk objects, terminated by a data: [DONE] event:

HTTP/1.1 200 OK
Content-Type: text/event-stream
X-Request-Id: req_abc123

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"The"},"index":0}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":" capital"},"index":0}]}

data: [DONE]
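
A client can reassemble the reply by concatenating the `delta.content` fragments until it sees `[DONE]`. The sketch below parses already-decoded lines. `iter_content_deltas` is a hypothetical name, and the parser assumes each event fits on a single `data:` line, as in the example above.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by chat.completion.chunk events."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"The"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":" capital"},"index":0}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample)))  # prints: The capital
```

With the `requests` library, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`.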

Guardrail Blocking

If a guardrail blocks the request:

json
{
  "error": {
    "message": "Content blocked by guardrail",
    "type": "guardrail_block",
    "guardrail_key": "pii-checker",
    "action": "block",
    "findings": [
      { "category": "email", "message": "PII detected" }
    ]
  }
}

Status code: 400
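
Because other validation failures also return 400, a client has to inspect the body to tell a guardrail block apart. A minimal sketch, assuming the error shape shown above; `guardrail_findings` is a hypothetical helper name.

```python
from __future__ import annotations

from typing import Any, Optional


def guardrail_findings(status_code: int, body: dict[str, Any]) -> Optional[list[dict[str, str]]]:
    """Return the findings list if the response is a guardrail block, else None."""
    error = body.get("error")
    if status_code != 400 or not isinstance(error, dict):
        return None
    if error.get("type") != "guardrail_block":
        return None  # some other 400, e.g. a missing field
    return error.get("findings", [])
```

Returning `None` for anything that is not a guardrail block lets the caller fall through to its generic error handling.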

Errors

| Status | Description |
| --- | --- |
| 400 | Missing model or messages, guardrail block |
| 401 | Invalid API token |
| 429 | Rate limit or quota exceeded |
| 500 | Model not found |
| 500 | Model is not configured for chat completions |
| 500 | Provider error (circuit breaker / upstream) |
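
One way a client might act on these codes is to retry 429 and 500 with exponential backoff and give up immediately on 400 and 401. This is a sketch of one possible policy, not documented gateway guidance. `call_with_retry` and `RETRYABLE_STATUSES` are hypothetical names. Note that 500 also covers configuration problems such as "Model not found", which retrying will not fix, so the attempt count is kept small.

```python
import time
from typing import Callable

# Assumption: treat rate limiting and server-side failures as possibly transient.
RETRYABLE_STATUSES = {429, 500}


def call_with_retry(
    send: Callable[[], int],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out.

    send() performs one request and returns its HTTP status code.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE_STATUSES:
            break
        # Delay doubles on each retry: base_delay, 2 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

The `sleep` parameter is injectable so the policy can be exercised in tests without real delays.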

Examples

cURL

bash
curl -X POST https://gateway.example.com/api/client/v1/chat/completions \
  -H "Authorization: Bearer cpeer_your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Python

python
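import requests

# requests sets Content-Type: application/json automatically when json= is used.
response = requests.post(
    "https://gateway.example.com/api/client/v1/chat/completions",
    headers={"Authorization": "Bearer cpeer_your_token"},
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # raise on 4xx/5xx instead of printing an error body
print(response.json())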

Node.js

typescript
const response = await fetch('https://gateway.example.com/api/client/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer cpeer_your_token',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
const data = await response.json();

Community edition is AGPL-3.0. Commercial licensing and support are available separately.