Responses API
POST /agent/:id/responses is the preferred way to drive an agentic workflow. It builds on OpenAI's modern Responses surface, which is purpose-designed for multi-turn agents: session state lives on the server, multi-turn continuity is a first-class field (previous_response_id), and every in-flight run is resumable after a network drop.
The platform also exposes:
- GET /agent/:id/responses/:response_id retrieves a completed or in-flight response, or streams it from a given sequence number.
- POST /agent/:id/responses/:response_id/cancel cancels an in-flight run.
All three endpoints are drop-in compatible with the official OpenAI SDKs (Python ≥ 1.50, Node ≥ 4.50) and with the Vercel AI SDK (5+).
How the endpoint maps to the workflow
- Resolve input to a prompt string: if it is a plain string, use it directly; if it is an array of input items, walk it in reverse and use the content of the last user-role item.
- Resolve previous_response_id to a session_id by querying the server-side response store (Redis). On the first turn, generate a fresh session_id.
- Generate a response_id (format: resp_<base64url-uuid>).
- Inject { user_prompt, session_id } as the httpRequest node's input body.
- Write the response metadata to the store so the next turn can chain via previous_response_id.
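The mapping steps above can be sketched in a few lines of Python. This is an illustration only, not the platform's implementation: the function names, the dict standing in for the Redis response store, the UUID-based session_id, the stripped base64 padding, and the handling of list-valued content are all assumptions.

```python
import base64
import uuid


def resolve_prompt(input_value):
    """Return the prompt for a plain string or an array of input items."""
    if isinstance(input_value, str):
        return input_value
    # Walk the items in reverse and take the last user-role item.
    for item in reversed(input_value):
        if item.get("role") == "user":
            content = item.get("content", "")
            if isinstance(content, str):
                return content
            # Assumption: list content is a list of parts with a "text" field.
            return "".join(part.get("text", "") for part in content)
    raise ValueError("input contains no user-role item")


def new_response_id():
    """Build an id shaped like resp_<base64url-uuid> (padding stripped by assumption)."""
    raw = base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=")
    return "resp_" + raw.decode("ascii")


def handle_turn(body, store):
    """Map a request body to the workflow input; `store` is a dict standing in for Redis."""
    previous = body.get("previous_response_id")
    record = store.get(previous) if previous else None
    # Hypothetical session_id format: a fresh UUID on the first turn.
    session_id = record["session_id"] if record else str(uuid.uuid4())
    response_id = new_response_id()
    # Persist the link so the next turn can chain via previous_response_id.
    store[response_id] = {"session_id": session_id}
    workflow_input = {
        "user_prompt": resolve_prompt(body["input"]),
        "session_id": session_id,
    }
    return response_id, workflow_input
```

Calling handle_turn twice, passing the first returned id as previous_response_id on the second call, yields the same session_id for both turns.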
The workflow receives exactly the same user_prompt and session_id fields as the Chat Completions path. The difference is that here the session linkage is managed entirely by the platform — callers never need to maintain a conversation_id themselves.
Request body
| Field | Required | Notes |
|---|---|---|
| model | yes | Injected into the workflow as httpRequest-0.model. Wire it through agentInput → deepAgent to let callers select the upstream LLM per request. |
| input | yes | Plain string or array of OpenAI input items. |
| stream | no | true for SSE streaming, false for a blocking response. |
| previous_response_id | no | Links this turn to a prior run. The platform resolves it to the matching session_id. |
| instructions | no | Developer/system-level instructions. Forwarded to the runtime where supported. |
| metadata | no | Key/value map (≤ 16 keys, 64-char keys, 512-char values). |
| tools, tool_choice | no | Currently advisory. |
| temperature, top_p, max_output_tokens, parallel_tool_calls | no | Forwarded to the upstream model where supported. |
| background: true | rejected | Returns 422. Use stream: true with the GET resume endpoint instead. |
| conversation | rejected | Returns 422. Use previous_response_id for multi-turn linkage. |
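A client can check a request body against the table before sending it. The sketch below is a hypothetical pre-flight validator: the function name and the 400 status used for missing fields and metadata limits are assumptions; only the 422 responses for background: true and conversation come from the table.

```python
def validate_request(body):
    """Return (status_code, error); (200, None) means the body looks acceptable."""
    # Rejected fields, as documented in the table: both return 422.
    if body.get("background") is True:
        return 422, "background is rejected; use stream: true with the GET resume endpoint"
    if "conversation" in body:
        return 422, "conversation is rejected; use previous_response_id"

    # Required fields. The 400 status here is an assumption.
    for field in ("model", "input"):
        if field not in body:
            return 400, f"missing required field: {field}"
    if not isinstance(body["input"], (str, list)):
        return 400, "input must be a string or an array of input items"

    # Metadata limits: at most 16 keys, 64-char keys, 512-char values.
    metadata = body.get("metadata") or {}
    if len(metadata) > 16:
        return 400, "metadata allows at most 16 keys"
    for key, value in metadata.items():
        if len(key) > 64 or len(str(value)) > 512:
            return 400, f"metadata entry too long: {key[:64]}"

    return 200, None
```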
Multi-turn continuity
On each turn the response carries an id field. Pass that id as previous_response_id on the next turn — the platform resolves it to the correct session_id and the agent runtime picks up where it left off.
```
Turn 1: POST /agent/:id/responses { input: "Hello" }
        → response.id = "resp_abc"
Turn 2: POST /agent/:id/responses { input: "Who are you?", previous_response_id: "resp_abc" }
        → response.id = "resp_def"
Turn 3: POST /agent/:id/responses { input: "Thanks", previous_response_id: "resp_def" }
```
You do not need to resend message history. The agent's session_id-keyed memory handles continuity. If you lose the previous_response_id (e.g. the client crashes), start a new session — the old session remains valid in Redis for 24 hours and can be linked from another client if you stored the response_id.
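One way to survive a client crash is to persist the latest response id as soon as it is known. The sketch below is an assumption-laden illustration: the helper names and the JSON file layout are hypothetical, and the client-side save time is used only as an approximation of the 24-hour window described above.

```python
import json
import time
from pathlib import Path

SESSION_TTL_SECONDS = 24 * 60 * 60  # the 24-hour window described above


def save_last_response_id(path, response_id, now=None):
    """Write the latest response id to disk with the time it was saved."""
    saved_at = time.time() if now is None else now
    Path(path).write_text(json.dumps({"response_id": response_id, "saved_at": saved_at}))


def load_last_response_id(path, now=None):
    """Return the stored id, or None if nothing was saved or it has likely expired."""
    file = Path(path)
    if not file.exists():
        return None
    record = json.loads(file.read_text())
    age = (time.time() if now is None else now) - record["saved_at"]
    return record["response_id"] if age < SESSION_TTL_SECONDS else None
```

On restart, pass the loaded value as previous_response_id; when it is None, omit the field and let the platform start a new session.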
Quick start
Python (openai SDK)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.alien.club/agent/<agent_id>",
    api_key="<your-access-token>",
)

# First turn
with client.responses.stream(
    model="agent",
    input="What is the SYNTEC collective agreement?",
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
    first_response = stream.get_final_response()
print()

# Second turn: chain via previous_response_id
with client.responses.stream(
    model="agent",
    input="What is the trial period for an executive?",
    previous_response_id=first_response.id,
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
```
TypeScript (openai SDK)
```typescript
import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://api.alien.club/agent/<agent_id>",
  apiKey: "<your-access-token>",
})

// First turn
const firstStream = await client.responses.create({
  model: "agent",
  input: "What is the SYNTEC collective agreement?",
  stream: true,
})

let previousResponseId: string | undefined
for await (const event of firstStream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta)
  }
  if (event.type === "response.completed") {
    previousResponseId = event.response.id
  }
}

// Second turn
const secondStream = await client.responses.create({
  model: "agent",
  input: "What is the trial period for an executive?",
  previous_response_id: previousResponseId,
  stream: true,
})

for await (const event of secondStream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta)
  }
}
```
TypeScript (Vercel AI SDK)
```typescript
import { createOpenAI } from "@ai-sdk/openai"
import { streamText } from "ai"

const provider = createOpenAI({
  baseURL: "https://api.alien.club/agent/<agent_id>",
  apiKey: "<your-access-token>",
})

// First turn
const { fullStream, experimental_providerMetadata } = await streamText({
  model: provider.responses("agent"),
  prompt: "What is the SYNTEC collective agreement?",
})

let previousResponseId: string | undefined
for await (const part of fullStream) {
  if (part.type === "text-delta") process.stdout.write(part.textDelta)
  if (part.type === "finish") {
    previousResponseId = (await experimental_providerMetadata)?.openai?.responseId as string
  }
}

// Second turn
const { fullStream: secondStream } = await streamText({
  model: provider.responses("agent"),
  prompt: "What is the trial period for an executive?",
  providerOptions: {
    openai: { previousResponseId },
  },
})

for await (const part of secondStream) {
  if (part.type === "text-delta") process.stdout.write(part.textDelta)
}
```
cURL
```shell
# First turn
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","input":"Hello","stream":true}' \
  "https://api.alien.club/agent/<agent_id>/responses"

# Second turn: set previous_response_id to the id from the response.completed event
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","input":"Who are you?","previous_response_id":"resp_abc","stream":true}' \
  "https://api.alien.club/agent/<agent_id>/responses"
```
Non-streaming mode
Omit stream for a blocking JSON response in the standard Response shape. The platform waits for the full agent run. As with Chat Completions, avoid this for long-running agentic workflows.
```python
response = client.responses.create(
    model="agent",
    input="Summarise this document in one sentence.",
    previous_response_id=previous_response_id,
)
print(response.output_text)
```
Response object gotchas
Timestamp units are inconsistent
In the Response object, created_at is a Unix timestamp in seconds but completed_at is in milliseconds. Do not subtract them directly to compute duration:
```python
# Wrong: produces a nonsensical result
duration = response.completed_at - response.created_at

# Correct
duration_seconds = (response.completed_at / 1000) - response.created_at
```
x_alien_agent_registry is a JSON-encoded string
metadata.x_alien_agent_registry is a JSON string nested inside the metadata map — it requires a second JSON.parse / json.loads call:
```python
import json

registry = json.loads(response.metadata["x_alien_agent_registry"])
# registry is now a list of agent identity dicts
```

```typescript
const registry = JSON.parse(response.metadata["x_alien_agent_registry"] as string)
```
Retrieve a response
GET /agent/:id/responses/<resp_id>
Returns the full Response object in its current state. If the run has already completed, the response includes the final output. If the run is still in flight and you pass ?starting_after=<seq>, the GET streams remaining events as SSE (see Streaming — resume).
Resume after a network drop
Every streaming event carries a sequence_number. If the connection drops, reconnect by GETting the response:
GET /agent/:id/responses/<resp_id>?starting_after=<last_sequence_number>
The server replays all events with a higher sequence number. If the run is still in flight the GET continues live. If it has already completed the replay ends with the original terminal event. See Streaming Responses for full semantics.
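The reconnect logic can be sketched as a small generator that tracks the last sequence number seen. This is a transport-agnostic illustration under stated assumptions: open_stream is a hypothetical callable you supply (called with None it starts the stream, called with a number it issues the GET with starting_after), events are assumed to be dicts with type and sequence_number keys, and the set of terminal event types is taken from the OpenAI event names rather than confirmed for this platform.

```python
# Assumed terminal event types (OpenAI naming); adjust to what the platform emits.
TERMINAL_TYPES = {"response.completed", "response.failed", "response.incomplete"}


def resume_path(agent_id, response_id, last_seq):
    """Build the resume path described above."""
    return f"/agent/{agent_id}/responses/{response_id}?starting_after={last_seq}"


def consume_with_resume(open_stream, max_retries=3):
    """Yield events in order, reconnecting after a dropped connection."""
    last_seq = None
    retries = 0
    while True:
        try:
            for event in open_stream(last_seq):
                seq = event["sequence_number"]
                # Skip anything already delivered before the drop.
                if last_seq is not None and seq <= last_seq:
                    continue
                last_seq = seq
                yield event
                if event["type"] in TERMINAL_TYPES:
                    return
            return  # stream closed cleanly without a terminal event
        except ConnectionError:
            # A drop before the first event leaves nothing to resume from;
            # this sketch simply calls the opener again with None.
            retries += 1
            if retries > max_retries:
                raise
```

Keeping the transport behind a callable means the same loop works with any HTTP client, and the duplicate filter makes the consumer safe even if the server replays an event it already delivered.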
Cancel a run
POST /agent/:id/responses/<resp_id>/cancel
Returns a standard envelope with cancelled in the data object:
{ "success": true, "data": { "cancelled": true } }
cancelled is true if the run was in flight, and false if it had already completed or failed. Cancellation is best-effort: events already enqueued may still arrive before the worker acknowledges the cancel signal.
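A client can reduce the envelope to a single boolean. The helper below is a hypothetical sketch: its name is invented, and treating success: false as an error is an assumption, since the failure shape of the envelope is not described here.

```python
import json


def was_cancelled(response_body):
    """Return True only if an in-flight run was actually cancelled."""
    envelope = json.loads(response_body)
    # Assumption: a failed call is signalled by success being false.
    if not envelope.get("success"):
        raise RuntimeError("cancel request failed")
    return bool(envelope.get("data", {}).get("cancelled"))
```

A False result means the run had already finished, so the final response can still be fetched with the GET endpoint.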
Session storage and TTL
Response metadata is stored in Redis for 24 hours after creation. After expiry:
- GET /agent/:id/responses/<resp_id> returns 410 Gone.
- A previous_response_id that references an expired response generates a new session instead of returning an error. The chain is broken, but the new turn succeeds.
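The expiry behaviour can be modelled with a small in-memory store. This is a sketch for reasoning about the semantics, not the platform's Redis code: the class and function names are hypothetical, the clock is injectable so expiry can be simulated, and the sketch does not distinguish an expired id from one that never existed.

```python
import time
import uuid

TTL_SECONDS = 24 * 60 * 60  # 24 hours after creation


class ResponseStore:
    """In-memory stand-in for the Redis store, with a TTL per response."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._records = {}

    def put(self, response_id, session_id):
        self._records[response_id] = (session_id, self._clock() + TTL_SECONDS)

    def get(self, response_id):
        """Return the session_id, or None if the response is unknown or expired."""
        record = self._records.get(response_id)
        if record is None or self._clock() >= record[1]:
            self._records.pop(response_id, None)
            return None
        return record[0]


def retrieve_status(store, response_id):
    """Status for GET on a response: 200 while stored, 410 Gone after expiry."""
    return 200 if store.get(response_id) is not None else 410


def session_for_turn(store, previous_response_id):
    """Return (session_id, chained); an expired link starts a new session."""
    session_id = store.get(previous_response_id) if previous_response_id else None
    if session_id is not None:
        return session_id, True
    return str(uuid.uuid4()), False
```

Because an expired link silently starts a new session, a client that needs to detect a broken chain has to track response age itself.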
Responses API vs Chat Completions
| Capability | Responses API | Chat Completions |
|---|---|---|
| Multi-turn session | previous_response_id (server-managed) | conversation_id + full messages history (client-managed) |
| Resume after network drop | Yes — GET ?starting_after=<seq> | No |
| Typed event stream | Yes — response.* discriminated union | No — ChatCompletionChunk |
| Standard OpenAI consumer | Python ≥ 1.50, Node ≥ 4.50, AI SDK 5+ | All OpenAI-compatible clients |
| Recommended for | Agentic workflows, production UIs | Quick integrations, broad tooling reach |
See also
- Configure Your Workflow — wire the httpRequest and httpResponse nodes.
- Chat Completions API — broader client compatibility.
- Streaming Responses — SSE format, sequence numbers, and resume.
- OpenAI reference: https://platform.openai.com/docs/api-reference/responses/create.