Chat Completions API
POST /agent/:id/chat/completions returns an OpenAI Chat Completions-compatible stream of an agent run. The response is text/event-stream and parses without modification through the OpenAI Python and Node SDKs, LangChain ChatOpenAI, and AI SDK's openaiCompatible provider.
Quick start
Python (openai SDK)
from openai import OpenAI

# The SDK appends /chat/completions to base_url.
client = OpenAI(
    base_url="https://api.alien.club/agent/<agent_id>",
    api_key="<your-access-token>",
)

stream = client.chat.completions.create(
    model="agent",
    messages=[{"role": "user", "content": "What is the duration of the trial period for a SYNTEC executive?"}],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue  # a usage-only chunk may carry no choices
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
TypeScript (openai SDK)
import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://api.alien.club/agent/<agent_id>",
  apiKey: "<your-access-token>",
})

const stream = await client.chat.completions.create({
  model: "agent",
  messages: [{ role: "user", content: "What is the duration of the trial period for a SYNTEC executive?" }],
  stream: true,
})

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta
  if (delta?.content) process.stdout.write(delta.content)
}
cURL
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","messages":[{"role":"user","content":"Hello"}],"stream":true}' \
  https://api.alien.club/agent/<agent_id>/chat/completions
Request body
The request body matches OpenAI's Chat Completions create request with these constraints:
| Field | Required | Notes |
|---|---|---|
| model | yes | Free-form string. The platform routes to the configured upstream model for this agent. |
| messages | yes | Standard OpenAI message array. |
| stream | yes (must be true) | Non-streaming responses use a different endpoint. |
| stream_options.include_usage | no | When true, a final usage-only chunk is emitted before [DONE]. |
| tools | no | Currently advisory. The agent runs with its own bound tool set. |
| n | no | Must be 1 or omitted. |
| temperature, top_p, max_tokens, max_completion_tokens, seed, presence_penalty, frequency_penalty, logit_bias | no | Forwarded to the upstream model where supported. |
| function_call, functions | rejected | Deprecated by OpenAI in favour of tools; we follow suit. |
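The constraints in the table can be mirrored client-side before sending a request. The following is a minimal sketch, not the server's validation logic: check_request_body is a hypothetical helper, the wording of each message is ours, and the server remains the authority on what it accepts.

```python
def check_request_body(body: dict) -> list[str]:
    """Return a list of constraint violations; empty when the body looks acceptable."""
    problems = []
    # model and messages are the two required fields.
    for field in ("model", "messages"):
        if field not in body:
            problems.append(f"{field} is required")
    # This endpoint only streams.
    if body.get("stream") is not True:
        problems.append("stream must be true")
    # n may be omitted, otherwise it has to be 1.
    if body.get("n", 1) != 1:
        problems.append("n must be 1 or omitted")
    # The deprecated function-calling fields are rejected outright.
    for field in ("function_call", "functions"):
        if field in body:
            problems.append(f"{field} is rejected; use tools instead")
    return problems
```

An empty list means the body satisfies the documented constraints; it does not guarantee that the request will succeed.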
Stream format
The stream is a sequence of JSON-encoded chunks framed as Server-Sent Events:
data: {"id":"...","object":"chat.completion.chunk","created":1234567890,"model":"...","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}
data: {"id":"...","object":"chat.completion.chunk","created":1234567890,"model":"...","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"...","object":"chat.completion.chunk","created":1234567890,"model":"...","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Every chunk validates against the OpenAI Python SDK's openai.types.chat.ChatCompletionChunk Pydantic model. The full chunk schema, ordering rules, and tool-call streaming behaviour follow OpenAI's published reference: https://platform.openai.com/docs/api-reference/chat/streaming.
The terminator is the literal data: [DONE]\n\n frame, identical to OpenAI's behaviour.
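For consumers that read the stream without an SDK, the framing above can be decoded with a few lines of standard-library Python. This is a sketch under one assumption taken from the examples: each chunk arrives as a single data: line. The names iter_chunks and collect_text are illustrative, not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded chunks from SSE lines, stopping at the [DONE] terminator."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # blank separators, SSE comments, and unused SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content across all chunks of one stream."""
    parts = []
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Any iterable of text lines works as input, for example the line iterator of an HTTP response body.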
Resume
Not supported. POST /agent/:id/chat/completions is stateless: a reconnect means re-POSTing, which starts a new agent run. This matches OpenAI's actual behaviour — Chat Completions has no Last-Event-ID semantics and the OpenAI SDKs do not implement client-side resume.
If you need resume — e.g. to recover from a flaky network mid-run without losing the agent's progress — use the Responses API instead.
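Because a reconnect starts a new run, a client that retries on this endpoint has to discard whatever it received from the failed attempt. A minimal sketch of that pattern follows. collect_with_restart and start_run are hypothetical names, and the built-in ConnectionError stands in for whatever exception your HTTP client raises on a dropped connection.

```python
from typing import Callable, Iterable


def collect_with_restart(start_run: Callable[[], Iterable[str]], max_attempts: int = 3) -> str:
    """Run start_run() until one attempt completes; partial output of failed attempts is dropped."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        parts = []  # start from scratch: the previous run cannot be resumed
        try:
            for text in start_run():  # each call re-POSTs and starts a new agent run
                parts.append(text)
            return "".join(parts)
        except ConnectionError:
            if attempt == max_attempts:
                raise
```

Each retry repeats the whole agent run, including any tool calls it makes, so this only suits runs that are safe to repeat.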
x_alien extension
Each chunk MAY carry an x_alien top-level extension object with multi-agent context that has no native OpenAI field:
{
  "id": "...",
  "object": "chat.completion.chunk",
  "created": 1234567890,
  "model": "...",
  "choices": [...],
  "x_alien": {
    "agent_id": "subagent-legifrance",
    "agent_register": {
      "id": "subagent-legifrance",
      "kind": "subagent",
      "name": "Légifrance researcher",
      "parent_id": "MAIN",
      "dispatched_by_tool_call_id": "call_abc"
    },
    "kind": "text"
  }
}
Fields:
- agent_id: required when x_alien is present. Identifies which agent in the run produced this chunk.
- agent_register: emitted once per agent per run, on the first chunk where that agent appears. Carries the agent's identity, kind (main, subagent, or tool), display name, parent agent id, and (for subagents dispatched via the task tool) the originating tool-call id.
- kind: "text" (default) or "reasoning". Tags reasoning/thinking content so extension-aware UIs can render it separately.
- lifecycle: optional, one of "agent_end" or "subagent_dispatched". Marks notable moments without parsing finish_reason.
- error: present only on the closing chunk of a failed run; carries code and message.
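An extension-aware consumer can use these fields to attribute text to the agent that produced it. The sketch below works on chunks already decoded into dicts. collect_by_agent is a hypothetical helper, and filing chunks that carry no x_alien under the key None is our choice, not platform behaviour.

```python
from collections import defaultdict


def collect_by_agent(chunks):
    """Group streamed text per (agent_id, kind) and remember each agent's registration."""
    registry = {}            # agent_id -> agent_register payload
    text = defaultdict(str)  # (agent_id, kind) -> accumulated content
    for chunk in chunks:
        ext = chunk.get("x_alien") or {}
        agent_id = ext.get("agent_id")   # None when the chunk has no extension
        if "agent_register" in ext:
            registry[agent_id] = ext["agent_register"]
        kind = ext.get("kind", "text")   # "text" is the documented default
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                text[(agent_id, kind)] += content
    return registry, dict(text)
```

A UI could then render the ("MAIN", "text") entry as the answer and show reasoning or subagent output in separate panels.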
Standard consumers
Clients that don't know about x_alien ignore the field: ChatCompletionChunk.model_validate accepts unknown top-level keys, so you get vanilla OpenAI text streaming with no agent identity context. This is the explicit degradation contract. If your tooling parses against a strict schema that rejects unknown keys, strip x_alien first.
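For tooling that does reject unknown keys, stripping the extension is a one-line transformation on the decoded chunk. A minimal sketch, with strip_extension as a hypothetical helper name:

```python
def strip_extension(chunk: dict) -> dict:
    """Return a copy of the chunk without the x_alien key; the input is not mutated."""
    return {key: value for key, value in chunk.items() if key != "x_alien"}
```

Apply it to each chunk after JSON decoding and before handing the chunk to the strict validator.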
Errors
Pre-stream errors
If the request fails validation (HTTP 4xx) or the agent cannot be reached (HTTP 5xx), you receive a non-streaming JSON body matching OpenAI's standard error envelope:
{ "error": { "message": "...", "type": "invalid_request_error", "param": null, "code": "..." } }
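When calling the endpoint without an SDK, the envelope can be turned into an exception before any stream parsing starts. The sketch below assumes you already have the HTTP status and the response body as text. AgentRequestError and raise_for_envelope are hypothetical names, and the fallback for bodies that do not match the envelope is our choice.

```python
import json


class AgentRequestError(Exception):
    """Hypothetical exception carrying the fields of the error envelope."""

    def __init__(self, status: int, error: dict):
        super().__init__(f"HTTP {status}: {error.get('message')}")
        self.status = status
        self.type = error.get("type")
        self.code = error.get("code")
        self.param = error.get("param")


def raise_for_envelope(status: int, body: str) -> None:
    """Raise AgentRequestError for a 4xx/5xx response; return None otherwise."""
    if status < 400:
        return
    try:
        error = json.loads(body)["error"]
        if not isinstance(error, dict):
            raise TypeError("error is not an object")
    except (ValueError, KeyError, TypeError):
        error = {"message": body}  # not the documented envelope, keep the raw text
    raise AgentRequestError(status, error)
```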
Mid-stream errors
If the agent fails after some chunks have been emitted, the server emits one final chunk and closes with [DONE]. Standard consumers see finish_reason: "stop" and a clean termination. Extension-aware consumers detect the failure via x_alien.error:
data: {"id":"...","object":"chat.completion.chunk","created":1234567890,"model":"...","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"x_alien":{"agent_id":"MAIN","lifecycle":"agent_end","error":{"code":"upstream_timeout","message":"upstream model timed out after 30s"}}}
data: [DONE]
We deliberately do NOT use a non-standard finish_reason: "error" value because the OpenAI SDK's Choice.finish_reason is a closed Literal and adding "error" would fail SDK validation. Surface errors via x_alien.error and you'll get the full context; rely on finish_reason alone and you'll see a clean stop.
If full error visibility is critical, use the Responses API — its response.failed event is a first-class part of the OpenAI surface.
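Extension-aware detection of a mid-stream failure can be sketched as a check applied to every decoded chunk. AgentRunFailed and check_chunk are hypothetical names, and the code works on plain dicts rather than SDK objects.

```python
class AgentRunFailed(Exception):
    """Hypothetical exception raised when a chunk carries x_alien.error."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def check_chunk(chunk: dict) -> None:
    """Raise AgentRunFailed if the chunk reports a failed run; otherwise do nothing."""
    error = (chunk.get("x_alien") or {}).get("error")
    if error:
        raise AgentRunFailed(error.get("code"), error.get("message"))
```

Call check_chunk before acting on finish_reason, since a failed run still closes with finish_reason: "stop".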
Heartbeat
When no chunk has flowed for 15 seconds, the server emits SSE comment lines (:keep-alive\n\n) to keep the connection alive. Comments are valid SSE and ignored by every OpenAI client we've verified.
Compatibility checklist
The streams produced by this endpoint are validated against:
- The OpenAI Python SDK's openai.types.chat.ChatCompletionChunk Pydantic model (any chunk that the SDK rejects is, by definition, non-conformant).
- Reference fixtures transcribed from OpenAI's published documentation examples (https://platform.openai.com/docs/api-reference/chat/streaming).
If you find a case where a standard OpenAI consumer breaks against this endpoint, that's a bug — please report it.
See also
- Streaming overview — when to use Chat Completions vs Responses API.
- Responses API — richer typed events with native resume.
- OpenAI reference: https://platform.openai.com/docs/api-reference/chat/streaming.