
Chat Completions API

POST /agent/:id/chat/completions accepts an OpenAI Chat Completions request and drives the underlying workflow with it. Any client that speaks OpenAI Chat Completions — the Python and Node SDKs, LangChain ChatOpenAI, Vercel AI SDK — works without modification.

How the endpoint maps to the workflow

The endpoint resolves a single user_prompt from the messages array and passes it to the workflow alongside a session identifier:

  1. Walk messages in reverse to find the last entry with role: "user".
  2. Flatten its content to a plain string (joins multi-part arrays, drops non-text parts).
  3. Use conversation_id from the request body as session_id, or generate a UUID if omitted.
  4. Inject { user_prompt, session_id } as the httpRequest node's input body.

The workflow receives user_prompt and session_id. It does not receive the full messages array — the agentInput node rebuilds conversation context from session_id by querying the agent's internal session store.
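
The resolution steps above can be sketched in a few lines of Python. This is an illustration only, not the platform's implementation: the function names `flatten_content` and `resolve_workflow_input` are hypothetical, and the newline used to join text parts is an assumption, since the actual separator is not stated here.

```python
import uuid
from typing import Any


def flatten_content(content: Any) -> str:
    """Flatten OpenAI message content to a plain string, dropping non-text parts."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        texts = [
            part.get("text", "")
            for part in content
            if isinstance(part, dict) and part.get("type") == "text"
        ]
        # Assumption: text parts are joined with a newline.
        return "\n".join(texts)
    return ""


def resolve_workflow_input(body: dict) -> dict:
    """Build the { user_prompt, session_id } payload from a request body."""
    # Walk messages in reverse to find the last user entry.
    for message in reversed(body.get("messages", [])):
        if message.get("role") == "user":
            user_prompt = flatten_content(message.get("content"))
            break
    else:
        raise ValueError("at least one user message is required")
    # Reuse conversation_id when supplied, otherwise generate a UUID.
    session_id = body.get("conversation_id") or str(uuid.uuid4())
    return {"user_prompt": user_prompt, "session_id": session_id}
```

Calling `resolve_workflow_input` on a body whose last user message mixes text and image parts returns only the joined text, together with the caller's `conversation_id` as `session_id`.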

Request body

| Field | Required | Notes |
| --- | --- | --- |
| model | yes | Injected into the workflow as httpRequest-0.model. Wire it through agentInput → deepAgent to let callers select the upstream LLM per request. |
| messages | yes | Standard OpenAI message array. At least one user message is required. |
| stream | no | true for SSE streaming, false (or omitted) for a blocking response. |
| conversation_id | no | Platform extension. Identifies the conversation across turns. Generated if absent. |
| n | no | Must be 1 or omitted. |
| tools | no | Currently advisory: the agent runs with its own bound tool set. |
| temperature, top_p, max_tokens, presence_penalty, frequency_penalty | no | Forwarded to the upstream model where supported. |
| function_call, functions | rejected | Removed in favour of tools. Returns 422. |
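
A client can catch the most common mistakes before sending a request. The sketch below is a hypothetical pre-flight check (`check_request` is not part of any SDK or of the platform) that mirrors the rules in the table; only the 422 for `function_call` and `functions` is documented, so the other messages are illustrative wording rather than actual server responses.

```python
def check_request(body: dict) -> list[str]:
    """Return a list of problems found in a Chat Completions request body."""
    problems = []
    if not body.get("model"):
        problems.append("model is required")
    messages = body.get("messages") or []
    if not any(m.get("role") == "user" for m in messages):
        problems.append("messages must contain at least one user message")
    if body.get("n") not in (None, 1):
        problems.append("n must be 1 or omitted")
    # Legacy function-calling fields are rejected by the endpoint.
    for legacy in ("function_call", "functions"):
        if legacy in body:
            problems.append(f"{legacy} is rejected (422); use tools instead")
    return problems
```

An empty list means the body passes these checks; it does not guarantee that the endpoint will accept the request.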

Multi-turn conversations

The Chat Completions API is stateless on the platform side: you must resend the full conversation history in messages on every turn, alternating user and assistant roles. The endpoint extracts only the last user message as user_prompt, but the complete history you provide is what the agent uses for context.

conversation_id is forwarded to the workflow as session_id. It is a label, not a server-side history store — omitting prior messages means the agent has no context of earlier turns, even if you pass the same conversation_id.

{
  "model": "agent",
  "messages": [
    { "role": "user", "content": "What is the SYNTEC collective agreement?" },
    { "role": "assistant", "content": "The SYNTEC agreement covers IT and consulting firms..." },
    { "role": "user", "content": "What is the trial period for an executive?" }
  ],
  "conversation_id": "my-session-abc123",
  "stream": true
}

Capture conversation_id from the first response's x_alien extension and pass it on subsequent turns alongside the full history. If you want the platform to manage conversation history server-side so you don't have to resend messages, use the Responses API instead.

Tip: The Responses API handles session state for you. Pass previous_response_id and send only the new message each turn. Chat Completions requires the full messages array every time.
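
For Chat Completions, the client-side bookkeeping described in this section might look like the following minimal sketch. The `Conversation` class and its method names are hypothetical helpers, not part of the OpenAI SDK or the platform; the class only builds request bodies and leaves the HTTP call to you.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Conversation:
    """Accumulates the full message history that must be resent on every turn."""
    model: str = "agent"
    conversation_id: Optional[str] = None
    messages: list = field(default_factory=list)

    def next_request(self, user_text: str, stream: bool = True) -> dict:
        """Append the new user message and return the complete request body."""
        self.messages.append({"role": "user", "content": user_text})
        body = {"model": self.model, "messages": list(self.messages), "stream": stream}
        # Only send conversation_id once one is known; the platform generates it otherwise.
        if self.conversation_id is not None:
            body["conversation_id"] = self.conversation_id
        return body

    def record_reply(self, assistant_text: str, conversation_id: Optional[str] = None) -> None:
        """Store the assistant reply and the conversation_id captured from x_alien."""
        self.messages.append({"role": "assistant", "content": assistant_text})
        if self.conversation_id is None and conversation_id:
            self.conversation_id = conversation_id
```

With the Python SDK, a body built this way could be sent by passing `model`, `messages` and `stream` as arguments and `conversation_id` through `extra_body`, as in the quick start.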

Quick start

Python (openai SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.alien.club/agent/<agent_id>",
    api_key="<your-access-token>",
)

# First turn: omit conversation_id so the platform generates one
stream = client.chat.completions.create(
    model="agent",
    messages=[
        {"role": "user", "content": "What is the SYNTEC collective agreement?"}
    ],
    stream=True,
)

conversation_id = None
assistant_reply = ""
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
        # Keep the reply so it can be resent as history on the next turn
        assistant_reply += delta.content
    # Capture conversation_id from the first chunk's x_alien extension
    if conversation_id is None and hasattr(chunk, "x_alien"):
        conversation_id = chunk.x_alien.get("conversation_id")

print()

# Second turn: resend the full history and pass conversation_id to continue the session
stream = client.chat.completions.create(
    model="agent",
    messages=[
        {"role": "user", "content": "What is the SYNTEC collective agreement?"},
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": "What is the trial period for an executive?"},
    ],
    stream=True,
    extra_body={"conversation_id": conversation_id},
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)

TypeScript (openai SDK)

import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://api.alien.club/agent/<agent_id>",
  apiKey: "<your-access-token>",
})

// First turn
const firstStream = await client.chat.completions.create({
  model: "agent",
  messages: [
    { role: "user", content: "What is the SYNTEC collective agreement?" }
  ],
  stream: true,
})

let conversationId: string | undefined
let assistantReply = ""

for await (const chunk of firstStream) {
  const content = chunk.choices[0]?.delta?.content
  if (content) {
    process.stdout.write(content)
    assistantReply += content
  }
  // x_alien is a platform extension, so cast through unknown to access it
  const xAlien = (chunk as unknown as { x_alien?: { conversation_id?: string } }).x_alien
  if (!conversationId && xAlien?.conversation_id) {
    conversationId = xAlien.conversation_id
  }
}

// Second turn
const secondStream = await client.chat.completions.create({
  model: "agent",
  messages: [
    { role: "user", content: "What is the SYNTEC collective agreement?" },
    { role: "assistant", content: assistantReply },
    { role: "user", content: "What is the trial period for an executive?" },
  ],
  stream: true,
  // @ts-expect-error - platform extension field
  conversation_id: conversationId,
})

for await (const chunk of secondStream) {
  const content = chunk.choices[0]?.delta?.content
  if (content) process.stdout.write(content)
}

cURL

curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agent",
    "messages": [
      {"role": "user", "content": "What is the SYNTEC collective agreement?"}
    ],
    "conversation_id": "my-session-abc123",
    "stream": true
  }' \
  https://api.alien.club/agent/<agent_id>/chat/completions

Non-streaming mode

Omit stream (or set it to false) to receive a blocking JSON response in the standard ChatCompletion shape. The platform waits for the full agent run to complete before returning. Use this only for short-lived, low-latency agents — agentic runs with subagents and tool calls may take tens of seconds and will hold the connection open for the full duration.

response = client.chat.completions.create(
    model="agent",
    messages=[{"role": "user", "content": "Summarise this document in one sentence."}],
    stream=False,
    extra_body={"conversation_id": "my-session-abc123"},
)
print(response.choices[0].message.content)

x_alien extension

Each chunk (and the non-streaming response) carries an optional x_alien top-level field with platform context unavailable in the standard OpenAI schema:

| Field | When present | Description |
| --- | --- | --- |
| conversation_id | First chunk only | The session_id / conversation_id for this turn. Pass it back on the next turn. |
| agent_id | Always | ID of the agent that produced this chunk. |
| agent_register | First chunk per agent | Identity card for this agent: id, kind, name, parent_id, dispatched_by_tool_call_id. |
| kind | Always | "text" or "reasoning". Use to separate reasoning traces from final answers in the UI. |
| lifecycle | At agent boundaries | "agent_end" or "subagent_dispatched". |
| error | On failure | { code, message } carried on the closing chunk of a failed run. |

Standard consumers that don't know about x_alien ignore the field safely — the OpenAI SDKs accept unknown top-level keys.
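
A consumer that does want to use x_alien could route chunks roughly as follows. This is a sketch under stated assumptions: chunks are plain dicts (for example parsed from the SSE `data:` lines), reasoning text is assumed to arrive in `delta.content` just like answer text and to be distinguished only by `kind`, and `split_stream` is a hypothetical helper name.

```python
def split_stream(chunks) -> dict:
    """Separate answer text from reasoning traces and surface any run error."""
    answer, reasoning = [], []
    conversation_id = None
    error = None
    for chunk in chunks:
        x_alien = chunk.get("x_alien") or {}
        # conversation_id appears on the first chunk only; keep the first value seen.
        conversation_id = conversation_id or x_alien.get("conversation_id")
        # error is carried on the closing chunk of a failed run.
        error = x_alien.get("error") or error
        choices = chunk.get("choices") or []
        content = (choices[0].get("delta") or {}).get("content") if choices else None
        if content:
            # Assumption: chunks without x_alien are treated as answer text.
            target = reasoning if x_alien.get("kind") == "reasoning" else answer
            target.append(content)
    return {
        "answer": "".join(answer),
        "reasoning": "".join(reasoning),
        "conversation_id": conversation_id,
        "error": error,
    }
```

A UI could render `reasoning` in a collapsible panel and `answer` in the main transcript, and check `error` once the stream closes.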

Resuming a dropped connection

Chat Completions has no resume semantics. If the SSE connection drops mid-run, re-POSTing the request starts a new agent run. If you need to resume an in-flight run after a network blip without restarting, use the Responses API instead.

See also