Responses API

POST /agent/:id/responses is the preferred way to drive an agentic workflow. It builds on OpenAI's modern Responses surface, which is purpose-designed for multi-turn agents: session state lives on the server, multi-turn continuity is a first-class field (previous_response_id), and every in-flight run is resumable after a network drop.

The platform also exposes:

GET /agent/:id/responses/:response_id retrieves a completed or in-flight response, or resumes its event stream from a given sequence number.
POST /agent/:id/responses/:response_id/cancel cancels an in-flight run.

All three endpoints are drop-in compatible with the official OpenAI SDKs (Python ≥ 1.50, Node ≥ 4.50, AI SDK 5+).

How the endpoint maps to the workflow

1. Resolve input to a prompt string: if it is a plain string, use it directly; if it is an array of input items, walk it in reverse and use the content of the last user-role item.
2. Resolve previous_response_id to a session_id by querying the server-side response store (Redis). On the first turn, generate a fresh session_id.
3. Generate a response_id (format: resp_<base64url-uuid>).
4. Inject { user_prompt, session_id } as the httpRequest node's input body.
5. Write the response metadata to the store so the next turn can chain via previous_response_id.

The workflow receives exactly the same user_prompt and session_id fields as the Chat Completions path. The difference is that here the session linkage is managed entirely by the platform — callers never need to maintain a conversation_id themselves.
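
The first three steps above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the platform's code: the function names, the dict standing in for the Redis store, the handling of list-valued content, the stripped base64 padding, and the session_id format are all assumptions.

```python
import base64
import uuid


def resolve_prompt(input_value):
    """Return the prompt string for a Responses-style `input` value."""
    if isinstance(input_value, str):
        return input_value
    # Walk the items in reverse and take the last user-role item.
    for item in reversed(input_value):
        if item.get("role") == "user":
            content = item.get("content", "")
            if isinstance(content, str):
                return content
            # Assumption: list content is a sequence of parts with a "text" field.
            return "".join(part.get("text", "") for part in content)
    raise ValueError("input contains no user-role item")


def new_response_id():
    """Build an id shaped like resp_<base64url-uuid> (padding stripped by assumption)."""
    raw = uuid.uuid4().bytes
    return "resp_" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def resolve_session_id(store, previous_response_id):
    """Look up the session for a prior response, or start a fresh one.

    `store` is a plain dict standing in for the server-side response store.
    """
    if previous_response_id is not None and previous_response_id in store:
        return store[previous_response_id]
    return uuid.uuid4().hex  # hypothetical session_id format
```

Falling back to a fresh session when the id is unknown matches the expiry behaviour described under Session storage and TTL.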

Request body

| Field | Required | Notes |
| --- | --- | --- |
| `model` | yes | Injected into the workflow as `httpRequest-0.model`. Wire it through `agentInput` → `deepAgent` to let callers select the upstream LLM per request. |
| `input` | yes | Plain string or array of OpenAI input items. |
| `stream` | no | `true` for SSE streaming, `false` for a blocking response. |
| `previous_response_id` | no | Links this turn to a prior run. The platform resolves it to the matching `session_id`. |
| `instructions` | no | Developer/system-level instructions. Forwarded to the runtime where supported. |
| `metadata` | no | Key/value map (≤ 16 keys, 64-char keys, 512-char values). |
| `tools`, `tool_choice` | no | Currently advisory. |
| `temperature`, `top_p`, `max_output_tokens`, `parallel_tool_calls` | no | Forwarded to the upstream model where supported. |
| `background: true` | rejected | Returns `422`. Use `stream: true` with the GET resume endpoint instead. |
| `conversation` | rejected | Returns `422`. Use `previous_response_id` for multi-turn linkage. |
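
A client can catch the most common mistakes from the table before sending a request. The sketch below is a hypothetical client-side pre-check, not the platform's own validation: the function name and messages are invented for illustration, and the server remains the authority on what it accepts.

```python
def check_request_body(body):
    """Return a list of problems found in a request body (empty if none)."""
    problems = []

    # `model` and `input` are the only required fields.
    for field in ("model", "input"):
        if field not in body:
            problems.append(f"missing required field: {field}")

    # These two fields are rejected with 422 according to the table.
    if body.get("background") is True:
        problems.append("background: true is rejected; use stream: true and the GET resume endpoint")
    if "conversation" in body:
        problems.append("conversation is rejected; use previous_response_id")

    # Metadata limits: at most 16 keys, 64-char keys, 512-char values.
    metadata = body.get("metadata", {})
    if len(metadata) > 16:
        problems.append("metadata has more than 16 keys")
    for key, value in metadata.items():
        if len(key) > 64:
            problems.append(f"metadata key longer than 64 characters: {key[:20]}...")
        if len(str(value)) > 512:
            problems.append(f"metadata value longer than 512 characters for key: {key[:20]}")

    return problems
```

Returning a list rather than raising on the first problem lets a caller report everything wrong with a request at once.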

Multi-turn continuity

On each turn the response carries an id field. Pass that id as previous_response_id on the next turn — the platform resolves it to the correct session_id and the agent runtime picks up where it left off.

Turn 1:  POST /agent/:id/responses  { input: "Hello" }
           → response.id = "resp_abc"

Turn 2:  POST /agent/:id/responses  { input: "Who are you?", previous_response_id: "resp_abc" }
           → response.id = "resp_def"

Turn 3:  POST /agent/:id/responses  { input: "Thanks", previous_response_id: "resp_def" }

You do not need to resend message history. The agent's session_id-keyed memory handles continuity. If you lose the previous_response_id (for example, because the client crashes), start a new session. The old session remains valid in Redis for 24 hours, so another client can still chain to it if the response id was stored somewhere it can reach.
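
The only client-side state needed for chaining is the id of the last response. A minimal sketch of that bookkeeping follows; the class and method names are hypothetical, and the transport (SDK call or raw HTTP) is deliberately left out.

```python
class TurnChain:
    """Tracks the last response id so each new turn links to the previous one."""

    def __init__(self, model="agent"):
        self.model = model
        self.last_response_id = None

    def next_body(self, text, stream=True):
        """Build the request body for the next turn."""
        body = {"model": self.model, "input": text, "stream": stream}
        # The first turn carries no previous_response_id.
        if self.last_response_id is not None:
            body["previous_response_id"] = self.last_response_id
        return body

    def record(self, response_id):
        """Remember the id returned by the turn that just finished."""
        self.last_response_id = response_id
```

Persisting last_response_id outside the process (for example, alongside the user's session) is what lets a restarted client continue the same conversation.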

Quick start

Python (`openai` SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.alien.club/agent/<agent_id>",
    api_key="<your-access-token>",
)

# First turn
with client.responses.stream(
    model="agent",
    input="What is the SYNTEC collective agreement?",
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
    first_response = stream.get_final_response()

print()

# Second turn — chain via previous_response_id
with client.responses.stream(
    model="agent",
    input="What is the trial period for an executive?",
    previous_response_id=first_response.id,
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)

TypeScript (`openai` SDK)

import OpenAI from "openai"

const client = new OpenAI({
    baseURL: "https://api.alien.club/agent/<agent_id>",
    apiKey: "<your-access-token>",
})

// First turn
const firstStream = await client.responses.create({
    model: "agent",
    input: "What is the SYNTEC collective agreement?",
    stream: true,
})

let previousResponseId: string | undefined

for await (const event of firstStream) {
    if (event.type === "response.output_text.delta") {
        process.stdout.write(event.delta)
    }
    if (event.type === "response.completed") {
        previousResponseId = event.response.id
    }
}

// Second turn
const secondStream = await client.responses.create({
    model: "agent",
    input: "What is the trial period for an executive?",
    previous_response_id: previousResponseId,
    stream: true,
})

for await (const event of secondStream) {
    if (event.type === "response.output_text.delta") {
        process.stdout.write(event.delta)
    }
}

TypeScript (Vercel AI SDK)

import { createOpenAI } from "@ai-sdk/openai"
import { streamText } from "ai"

const provider = createOpenAI({
    baseURL: "https://api.alien.club/agent/<agent_id>",
    apiKey: "<your-access-token>",
})

// First turn
const { fullStream, experimental_providerMetadata } = await streamText({
    model: provider.responses("agent"),
    prompt: "What is the SYNTEC collective agreement?",
})

let previousResponseId: string | undefined

for await (const part of fullStream) {
    if (part.type === "text-delta") process.stdout.write(part.textDelta)
    if (part.type === "finish") {
        previousResponseId = (await experimental_providerMetadata)?.openai?.responseId as string
    }
}

// Second turn
const { fullStream: secondStream } = await streamText({
    model: provider.responses("agent"),
    prompt: "What is the trial period for an executive?",
    providerOptions: {
        openai: { previousResponseId },
    },
})

for await (const part of secondStream) {
    if (part.type === "text-delta") process.stdout.write(part.textDelta)
}

cURL

# First turn
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","input":"Hello","stream":true}' \
  https://api.alien.club/agent/<agent_id>/responses

# Second turn — set previous_response_id to the id from the response.completed event
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","input":"Who are you?","previous_response_id":"resp_abc","stream":true}' \
  https://api.alien.club/agent/<agent_id>/responses

Non-streaming mode

Omit stream for a blocking JSON response in the standard Response shape. The platform waits for the full agent run. As with Chat Completions, avoid this for long-running agentic workflows.

response = client.responses.create(
    model="agent",
    input="Summarise this document in one sentence.",
    previous_response_id=previous_response_id,
)
print(response.output_text)

Response object gotchas

Timestamp units are inconsistent

In the Response object, created_at is a Unix timestamp in seconds but completed_at is in milliseconds. Do not subtract them directly to compute duration:

# Wrong — produces a nonsensical result
duration = response.completed_at - response.created_at

# Correct
duration_seconds = (response.completed_at / 1000) - response.created_at
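
If both timestamps are needed as datetime values, it can help to normalise them in one place. The helper below is a small sketch with a hypothetical name; it assumes only the units stated above (seconds for created_at, milliseconds for completed_at).

```python
from datetime import datetime, timezone


def response_times(created_at, completed_at):
    """Return (created, completed, duration_seconds) with units normalised."""
    # created_at is in seconds; completed_at is in milliseconds.
    created = datetime.fromtimestamp(created_at, tz=timezone.utc)
    completed = datetime.fromtimestamp(completed_at / 1000, tz=timezone.utc)
    return created, completed, (completed - created).total_seconds()
```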

`x_alien_agent_registry` is a JSON-encoded string

metadata.x_alien_agent_registry is a JSON string nested inside the metadata map — it requires a second JSON.parse / json.loads call:

import json
registry = json.loads(response.metadata["x_alien_agent_registry"])
# registry is now a list of agent identity dicts

const registry = JSON.parse(response.metadata["x_alien_agent_registry"] as string)

Retrieve a response

GET /agent/:id/responses/<resp_id>

Returns the full Response object in its current state. If the run has already completed, the response includes the final output. If the run is still in flight and you pass ?starting_after=<seq>, the GET streams remaining events as SSE (see Streaming — resume).

Resume after a network drop

Every streaming event carries a sequence_number. If the connection drops, reconnect by GETting the response:

GET /agent/:id/responses/<resp_id>?starting_after=<last_sequence_number>

The server replays all events with a higher sequence number. If the run is still in flight the GET continues live. If it has already completed the replay ends with the original terminal event. See Streaming Responses for full semantics.
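
The reconnect logic can be sketched independently of any HTTP client. In the example below, open_stream is a hypothetical callable supplied by the caller: it receives the last seen sequence number (or None on the first attempt) and returns an iterable of event dicts, for instance by issuing the GET above with starting_after set. The set of terminal event types is an assumption based on the OpenAI event names.

```python
# Assumption: these event types end a run.
TERMINAL_TYPES = {"response.completed", "response.failed", "response.incomplete"}


def consume_with_resume(open_stream, max_reconnects=5):
    """Yield events in order, reconnecting from the last seen sequence number."""
    last_seq = None
    reconnects = 0
    while True:
        try:
            for event in open_stream(last_seq):
                # Remember progress before handing the event to the caller.
                last_seq = event["sequence_number"]
                yield event
                if event["type"] in TERMINAL_TYPES:
                    return
            return  # stream closed cleanly without a terminal event
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise
```

Because the server only replays events with a higher sequence number, the caller never sees an event twice as long as last_seq is updated before each yield.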

Cancel a run

POST /agent/:id/responses/<resp_id>/cancel

Returns a standard envelope with cancelled in the data object:

{ "success": true, "data": { "cancelled": true } }

cancelled is true if the run was in flight, and false if it had already completed or failed. Cancellation is best-effort: events already enqueued may still arrive before the worker acknowledges the cancel signal.
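
A caller usually wants to distinguish the three possible outcomes of a cancel request. The sketch below interprets the envelope shown above; the function name and the returned labels are hypothetical.

```python
def interpret_cancel(envelope):
    """Classify a cancel response as 'cancelled', 'already_finished', or 'error'."""
    if not envelope.get("success"):
        return "error"
    # cancelled is true only if the run was still in flight.
    if envelope.get("data", {}).get("cancelled"):
        return "cancelled"
    return "already_finished"
```

Even after a "cancelled" result, a client that is still reading the stream should keep consuming until the terminal event, since already-enqueued events may still arrive.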

Session storage and TTL

Response metadata is stored in Redis for 24 hours after creation. After expiry:

GET /agent/:id/responses/<resp_id> returns 410 Gone.
previous_response_id referencing an expired response generates a new session instead of returning an error — the chain is broken but the new turn succeeds.
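
Because an expired previous_response_id silently starts a new session instead of failing, a client that cares about continuity may want to detect the situation itself. The following is a rough client-side check, assuming the stored created_at is the Response object's value in seconds; the function and constant names are hypothetical, and clock skew between client and server is ignored.

```python
import time

RESPONSE_TTL_SECONDS = 24 * 60 * 60  # 24 hours after creation


def chain_probably_expired(created_at, now=None):
    """Return True if a response created at `created_at` (seconds) is past its TTL."""
    now = time.time() if now is None else now
    return now - created_at >= RESPONSE_TTL_SECONDS
```

When the check returns True, the client can warn the user that earlier context is gone, or resend a summary of it in the new turn's input.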

Responses API vs Chat Completions

| Capability | Responses API | Chat Completions |
| --- | --- | --- |
| Multi-turn session | `previous_response_id` (server-managed) | `conversation_id` + full `messages` history (client-managed) |
| Resume after network drop | Yes, via `GET ?starting_after=<seq>` | No |
| Typed event stream | Yes, a `response.*` discriminated union | No, `ChatCompletionChunk` only |
| Standard OpenAI consumer | Python ≥ 1.50, Node ≥ 4.50, AI SDK 5+ | All OpenAI-compatible clients |
| Recommended for | Agentic workflows, production UIs | Quick integrations, broad tooling reach |
