Responses API

The Responses API is OpenAI's modern streaming surface. It uses a richer, strongly typed event taxonomy and supports native resume, which lets production UIs recover from network drops without losing in-flight runs.

The Alien platform exposes two endpoints:

POST /agent/:id/responses — start a new response, optionally streaming.
GET /agent/:id/responses/:respId?starting_after=<seq> — resume an existing response from a sequence number.

Both are drop-in compatible with client.responses.create(...) from the official OpenAI SDKs (Python ≥ 1.50, Node ≥ 4.50, AI SDK 5+).

Quick start

Python (`openai` SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.alien.club/agent/<agent_id>",
    api_key="<your-access-token>",
)

with client.responses.stream(
    model="agent",
    input="What is the duration of the trial period for a SYNTEC executive?",
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)

TypeScript (`openai` SDK)

import OpenAI from "openai"

const client = new OpenAI({
    baseURL: "https://api.alien.club/agent/<agent_id>",
    apiKey: "<your-access-token>",
})

const stream = await client.responses.create({
    model: "agent",
    input: "What is the duration of the trial period for a SYNTEC executive?",
    stream: true,
})

for await (const event of stream) {
    if (event.type === "response.output_text.delta") {
        process.stdout.write(event.delta)
    }
}

cURL — initial request

curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  -H "Content-Type: application/json" \
  -d '{"model":"agent","input":"Hello","stream":true}' \
  https://api.alien.club/agent/<agent_id>/responses

cURL — resume after a network drop

# Read the last sequence_number you saw, then resume from it.
# The URL is quoted so the shell does not interpret the "?" in the query string.
curl -N \
  -H "Authorization: Bearer <your-access-token>" \
  "https://api.alien.club/agent/<agent_id>/responses/<resp_id>?starting_after=<last_seq>"

Request body

The request body matches OpenAI's Responses create request:

Field	Required	Notes
`model`	yes	Free-form string. The platform routes to the configured upstream model for this agent.
`input`	yes	Either a free-form string (treated as a user message) or an array of input items per OpenAI's input taxonomy.
`instructions`	no	System or developer-style instructions.
`stream`	no	Defaults to `false`. Set to `true` for streaming.
`metadata`	no	Free-form key/value map (≤16 keys, 64-char keys, 512-char values). Forwarded to `Response.metadata`.
`tools`, `tool_choice`	no	Currently advisory.
`temperature`, `top_p`, `max_output_tokens`, `parallel_tool_calls`, `previous_response_id`, `conversation`	no	Accepted, forwarded to the runtime where supported.
`background: true`	rejected	Background responses are out of scope for v1. Use `stream: true` and resume via `GET` instead.
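The `metadata` limits in the table can be checked client-side before sending a request. The following is a minimal sketch of such a pre-flight check; the function name `validate_metadata` is hypothetical, lengths are counted in characters as the table states, and the platform's own validation may be stricter or differ in detail.

```python
MAX_KEYS = 16          # documented cap on the number of metadata keys
MAX_KEY_CHARS = 64     # documented cap on key length
MAX_VALUE_CHARS = 512  # documented cap on value length


def validate_metadata(metadata: dict) -> list:
    """Return a list of problems; an empty list means the map fits the documented limits."""
    problems = []
    if len(metadata) > MAX_KEYS:
        problems.append(f"too many keys: {len(metadata)} > {MAX_KEYS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_CHARS:
            problems.append(f"key longer than {MAX_KEY_CHARS} chars: {key[:20]}...")
        if not isinstance(value, str):
            # Assumption: values are strings, as in the examples on this page.
            problems.append(f"value for {key[:20]!r} is not a string")
        elif len(value) > MAX_VALUE_CHARS:
            problems.append(f"value for {key[:20]!r} longer than {MAX_VALUE_CHARS} chars")
    return problems
```

Calling `validate_metadata({"session": "abc"})` returns an empty list, so the request can be sent as-is.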

Event format

Each event is a Server-Sent Events frame carrying both the SSE event: line (the event type discriminator) and a data: line (the JSON payload):

event: response.created
data: {"type":"response.created","sequence_number":0,"response":{...}}

event: response.output_item.added
data: {"type":"response.output_item.added","sequence_number":1,"output_index":0,"item":{...}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","sequence_number":2,"item_id":"...","output_index":0,"content_index":0,"delta":"Hello","logprobs":[]}

event: response.completed
data: {"type":"response.completed","sequence_number":N,"response":{...}}

Every event validates against the OpenAI Python SDK's openai.types.responses.ResponseStreamEvent discriminated union.

There is no [DONE] terminator. The stream closes after exactly one terminal event:

response.completed — successful run.
response.failed — failure. Carries Response.error with code and message.
response.incomplete — truncated by max_output_tokens or content filter.
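For clients that read the raw SSE stream rather than using an SDK, the framing and termination rules above can be sketched as a small parser. This is an illustrative sketch, not the platform's reference client: the name `iter_events` is hypothetical, and it assumes the caller supplies the response body as an iterable of text lines.

```python
import json
from typing import Iterable, Iterator

TERMINAL_TYPES = {"response.completed", "response.failed", "response.incomplete"}


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE frame and stop after the terminal event."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(":"):
            continue  # SSE comment line, carries no event
        if line.startswith("data:"):
            # Per the SSE format, one optional space follows the colon.
            data_lines.append(line[5:].removeprefix(" "))
            continue
        if line == "" and data_lines:
            # A blank line closes the frame; parse the accumulated data.
            payload = json.loads("\n".join(data_lines))
            data_lines = []
            yield payload
            if payload["type"] in TERMINAL_TYPES:
                return  # no [DONE] marker: the terminal event ends the stream
        # The "event:" line is ignored here because the payload repeats the type.
```

The parser relies on the `type` field inside the payload rather than the `event:` line, since both carry the same discriminator.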

Event types

This endpoint emits the following subset of OpenAI's streaming events and no others:

Event type	When
`response.created`	First event. Carries the `Response` object with `status: "in_progress"`.
`response.in_progress`	Optional progress signal during long runs.
`response.output_item.added`	A new top-level output item appears (message, function_call, or reasoning item).
`response.content_part.added`	Within a message, a new content part starts (`output_text`, `refusal`, or `reasoning_text`).
`response.output_text.delta`	Text token delta within an `output_text` part.
`response.output_text.done`	Closes an `output_text` part.
`response.function_call_arguments.delta`	Streaming JSON-string fragments of a function call's arguments.
`response.function_call_arguments.done`	Closes a function call's argument stream. The item itself is closed by `response.output_item.done`.
`response.reasoning_summary_part.added`	A new reasoning summary part appears.
`response.reasoning_summary_text.delta`	Text delta inside a reasoning summary.
`response.reasoning_summary_text.done`	Closes a reasoning summary part.
`response.content_part.done`	Closes a content part.
`response.output_item.done`	Closes a top-level output item.
`response.completed`	Terminal success. `Response.usage` populated.
`response.failed`	Terminal failure. `Response.error` populated.
`response.incomplete`	Terminal partial result. `Response.incomplete_details.reason` populated.

The full schema for each event is documented at https://platform.openai.com/docs/api-reference/responses-streaming.
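As an illustration of how a consumer might use these events, the sketch below rebuilds the text of each `output_text` part from its deltas. The helper name `accumulate_text` is hypothetical; it assumes events have already been decoded into dictionaries and uses only the `item_id`, `content_index`, and `delta` fields shown in the event format section.

```python
from collections import defaultdict
from typing import Iterable


def accumulate_text(events: Iterable[dict]) -> dict:
    """Concatenate output_text deltas, keyed by (item_id, content_index)."""
    chunks = defaultdict(list)
    for event in events:
        if event.get("type") == "response.output_text.delta":
            key = (event["item_id"], event["content_index"])
            chunks[key].append(event["delta"])
    # Join once at the end to avoid repeated string concatenation.
    return {key: "".join(parts) for key, parts in chunks.items()}
```

Keying by item and content part keeps text from different output items separate, which matters when several items stream in the same response.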

Sequence numbers and resume

Every event carries a monotonically increasing sequence_number, starting at 0. This is the resume cursor.

When a streaming connection drops mid-run, reconnect by issuing a GET for the response id with starting_after set to the last sequence number you successfully processed:

GET /agent/:id/responses/<resp_id>?starting_after=<last_seq>

The server replays all events with sequence_number > <last_seq>. If the run is still in flight, the GET continues live as new events arrive. If the run has already terminated, the GET replays the tail and closes with the original terminal event.
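The resume flow can be sketched as a generator that tracks the cursor and reconnects on a drop. This is a sketch under stated assumptions: `stream_with_resume`, `start`, and `resume` are hypothetical names, the two callables stand in for the POST and GET requests, and the exception types that signal a dropped connection depend on your HTTP library (`ConnectionError` and `TimeoutError` are used here as placeholders).

```python
from typing import Callable, Iterable, Iterator, Optional

TERMINAL_TYPES = {"response.completed", "response.failed", "response.incomplete"}


def stream_with_resume(
    start: Callable[[], Iterable[dict]],
    resume: Callable[[str, int], Iterable[dict]],
    max_reconnects: int = 5,
) -> Iterator[dict]:
    """Yield each event once, resuming from the last seen sequence_number on a drop."""
    response_id: Optional[str] = None
    last_seq: Optional[int] = None
    reconnects = 0
    source = start()
    while True:
        try:
            for event in source:
                seq = event["sequence_number"]
                if last_seq is not None and seq <= last_seq:
                    continue  # already processed; skip defensively
                if response_id is None and "response" in event:
                    response_id = event["response"]["id"]
                last_seq = seq
                yield event
                if event["type"] in TERMINAL_TYPES:
                    return
            # The body ended without a terminal event: treat it as a drop.
            raise ConnectionError("stream ended before a terminal event")
        except (ConnectionError, TimeoutError):
            reconnects += 1
            if response_id is None or last_seq is None or reconnects > max_reconnects:
                raise  # nothing to resume from, or too many attempts
            source = resume(response_id, last_seq)
```

Injecting `start` and `resume` keeps the cursor logic independent of any particular HTTP client, so the same loop works with raw SSE parsing or an SDK stream.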

Storage and TTL

Streamed responses are persisted server-side in Redis for 24 hours after creation. After expiry:

GET /agent/:id/responses/<resp_id> returns HTTP 410 Gone.
The response cannot be resumed and must be re-issued via POST.

24 hours covers any realistic network-recovery window. If you need durable replay beyond a day, store the events client-side as you receive them.
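One way to keep events beyond the server-side TTL is an append-only JSON Lines file per response. The sketch below is illustrative only: the function names and the one-file-per-response layout are assumptions, not part of the platform.

```python
import json
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a JSON line; call this for every event you process."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, ensure_ascii=False) + "\n")


def replay_events(log_path: Path, starting_after: int = -1) -> list:
    """Return logged events with sequence_number > starting_after, in file order."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as fh:
        events = [json.loads(line) for line in fh if line.strip()]
    return [e for e in events if e["sequence_number"] > starting_after]
```

The `starting_after` parameter mirrors the server's resume cursor, so local replay and server replay can share the same calling code.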

Failure modes for resume

Status	Reason
200	Normal — replay or live tail begins.
400	`starting_after` is invalid (not a non-negative integer, or beyond the response's last sequence number).
404	Response unknown to this agent — wrong id or wrong agent.
410	Response existed but its TTL expired.
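A client can map these status codes to a next step. The sketch below is one possible mapping; the enum and function names are hypothetical, and the fallback for codes not listed in the table (retry later) is an assumption rather than documented behavior.

```python
from enum import Enum


class ResumeAction(Enum):
    READ_STREAM = "read_stream"    # 200: consume the replay or live tail
    FIX_CURSOR = "fix_cursor"      # 400: starting_after is invalid, do not retry as-is
    CHECK_IDS = "check_ids"        # 404: wrong response id or wrong agent
    REISSUE_POST = "reissue_post"  # 410: TTL expired, start a new response
    RETRY_LATER = "retry_later"    # anything else (assumption: treat as transient)


def classify_resume_status(status: int) -> ResumeAction:
    """Translate the HTTP status of a resume GET into a client-side action."""
    table = {
        200: ResumeAction.READ_STREAM,
        400: ResumeAction.FIX_CURSOR,
        404: ResumeAction.CHECK_IDS,
        410: ResumeAction.REISSUE_POST,
    }
    return table.get(status, ResumeAction.RETRY_LATER)
```

Only the 410 case calls for a fresh POST; 400 and 404 indicate a client-side mistake that retrying will not fix.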

Subagent context via `metadata.x_alien_*`

Multi-agent runs (where the main agent dispatches subagents via tools) carry the agent registry in the Response.metadata field. The Responses API permits arbitrary metadata per response — this is a documented extension point, not a standards violation.

Example Response.metadata carrying subagent context:

{
  "x_alien_root_agent_id": "MAIN",
  "x_alien_agent_registry": "[{\"id\":\"MAIN\",\"kind\":\"main\",\"name\":\"main\",\"parent_id\":null},{\"id\":\"sub-legifrance\",\"kind\":\"subagent\",\"name\":\"Légifrance researcher\",\"parent_id\":\"MAIN\"}]"
}

Per-item agent identity is encoded in the item's id using a structured prefix: agent:<agent_id>::msg_<random> for messages, agent:<agent_id>::fc_<random> for function calls. Standard consumers treat the prefix as opaque (which is how the SDK treats every item id); extension-aware consumers parse the prefix to render per-subagent affordances.

The 512-character cap on a single metadata value means very large registries (≥50 agents) may be truncated. If truncation occurs, metadata.x_alien_registry_truncated is set to "true" and consumers should fall back to parsing per-item id prefixes.
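An extension-aware consumer could decode the registry and the item id prefix roughly as follows. This is a sketch: the helper names are hypothetical, the regular expression covers only the `msg_` and `fc_` prefixes described above, and returning an empty registry when the truncation flag is set is an assumption about how a client might choose to fall back.

```python
import json
import re
from typing import Optional

# Matches agent:<agent_id>::msg_<random> and agent:<agent_id>::fc_<random>.
_ITEM_ID = re.compile(r"^agent:(?P<agent_id>.+?)::(?:msg|fc)_.+$")


def agent_of(item_id: str) -> Optional[str]:
    """Return the agent id encoded in an item id, or None if there is no prefix."""
    match = _ITEM_ID.match(item_id)
    return match.group("agent_id") if match else None


def load_registry(metadata: dict) -> dict:
    """Decode x_alien_agent_registry into {agent_id: entry}; empty if absent or truncated."""
    raw = metadata.get("x_alien_agent_registry")
    if raw is None or metadata.get("x_alien_registry_truncated") == "true":
        return {}  # caller falls back to agent_of() on each item id
    return {entry["id"]: entry for entry in json.loads(raw)}
```

Note that the registry value is a JSON-encoded string, so it needs its own `json.loads` call after the response itself has been decoded.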

Errors

Pre-stream errors

Failures before the first event return HTTP 4xx/5xx with a JSON body matching OpenAI's standard error envelope:

{ "error": { "message": "...", "type": "invalid_request_error", "code": "..." } }

Mid-stream errors

Failures after the first event are surfaced via the response.failed terminal event:

{
  "type": "response.failed",
  "sequence_number": <int>,
  "response": {
    "id": "resp_...",
    "status": "failed",
    "error": { "code": "server_error", "message": "..." },
    "metadata": {
      "x_alien_error_code": "upstream_timeout",
      "x_alien_error_message": "upstream model timed out after 30s"
    },
    ...
  }
}

Response.error.code is constrained to OpenAI's documented Literal codes (server_error, rate_limit_exceeded, invalid_prompt, etc.) — Alien-specific codes are surfaced via metadata.x_alien_error_code.
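When handling a `response.failed` event, a client will usually want both the OpenAI-compatible code and the Alien-specific one. The sketch below shows one way to collect them; `describe_failure` is a hypothetical helper and assumes the event has already been decoded into a dictionary with the shape shown above.

```python
def describe_failure(event: dict) -> dict:
    """Collect the standard and Alien-specific error details from a response.failed event."""
    response = event["response"]
    error = response.get("error") or {}
    metadata = response.get("metadata") or {}
    return {
        "code": error.get("code"),                          # OpenAI-documented code
        "message": error.get("message"),
        "alien_code": metadata.get("x_alien_error_code"),   # platform-specific detail
        "alien_message": metadata.get("x_alien_error_message"),
    }
```

Missing fields come back as `None`, so the helper also works for failures that carry no Alien-specific metadata.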

Heartbeat

The server emits SSE comment lines (:keep-alive\n\n) every 15 seconds during quiet periods. Comments do not parse as events and do not advance sequence_number.
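Because a heartbeat arrives every 15 seconds, prolonged silence is a reasonable signal that the connection is dead. The sketch below shows a simple stall detector; the class name and the tolerance of two missed heartbeats are assumptions, not platform recommendations.

```python
import time
from typing import Callable


class StallDetector:
    """Flags a connection as stalled when nothing, not even a heartbeat, has arrived recently."""

    def __init__(
        self,
        heartbeat_interval: float = 15.0,  # documented heartbeat period, in seconds
        missed_allowed: int = 2,           # assumption: tolerate two missed heartbeats
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = heartbeat_interval * missed_allowed
        self._clock = clock
        self._last_activity = clock()

    def touch(self) -> None:
        """Call on every received line, including comment lines."""
        self._last_activity = self._clock()

    def stalled(self) -> bool:
        """True when the silence has lasted longer than the allowed window."""
        return self._clock() - self._last_activity > self._limit
```

A reader loop would call `touch()` for each line it receives and, once `stalled()` returns true, close the connection and resume from the last processed sequence number.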

Compatibility checklist

The streams produced by these endpoints are validated against:

The OpenAI Python SDK's openai.types.responses.ResponseStreamEvent discriminated union (ResponseCreatedEvent, ResponseOutputItemAddedEvent, ResponseTextDeltaEvent, etc.).
Reference fixtures derived from OpenAI's Responses streaming reference and the OpenAI Python SDK source.

If a standard OpenAI Responses consumer breaks against this endpoint, that's a bug — please report it.
