Streaming agent runs

Alien Intelligence exposes three streaming surfaces. The first two are drop-in compatible with the official OpenAI Python and Node SDKs — point base_url at https://api.alien.club/agent/<id> and the SDKs work without modification. The third is the platform-native envelope used by builders who want full multi-modal visibility into a workflow run.

Surface	Endpoint	Resume	Best for
Chat Completions	`POST /agent/:id/chat/completions`	No	Broadest tooling reach. Drop-in for any OpenAI Chat Completions consumer (LangChain `ChatOpenAI`, AI SDK `openaiCompatible`, plain `openai-python`).
Responses API	`POST /agent/:id/responses` + `GET /agent/:id/responses/:respId?starting_after=<seq>`	Yes (native, via `sequence_number`)	Production UIs that need to recover from network blips, resume mid-run, or surface reasoning summaries with strong typing.
NodeStreamEvent aggregator	`GET /jobs/:job_id/stream`	Yes (via `Last-Event-ID`)	Debug UIs and builders who want every workflow node's events on one stream — agent execution today, image generation and voice synthesis as those nodes ship.

The OpenAI surfaces are sourced from the deep_agent payloads on the same internal stream that the aggregator endpoint exposes — they filter on node_type === "deep_agent" and translate the inner agent events to the OpenAI wire formats. Behaviour around tool calls, subagent dispatches, and finishing reasons is identical between Chat Completions and Responses; the differentiator is transport (stateless vs stateful + resume).

Which one should I use?

Use Chat Completions when:

Your client is already configured for OpenAI's Chat Completions API and you want the smallest possible integration delta.
You don't need resume on network drops — a reconnect = re-run is acceptable.
You're integrating via LangChain, AI SDK's openaiCompatible provider, or any tool whose primary support story is Chat Completions.

Use Responses API when:

Network reliability matters: you want the client to drop and reconnect to the same in-flight run.
You need to display reasoning summaries from o-series and gpt-5 models with native event types.
You need strict per-event sequencing for replay/audit (sequence_number is monotonic and durable for 24h).
Your client is on AI SDK 5+ or openai-python ≥ 1.50 (the SDK versions that natively support client.responses.create).

Use the NodeStreamEvent aggregator when:

You want full multi-modal visibility into a workflow run — agent execution alongside image generation, voice synthesis, or any other workflow node type as they ship.
You're building a debug or internal-tooling UI that renders the entire stream grouped by node.
You need a stable, generic envelope that absorbs new node types without protocol churn on your side.

Authentication

Both endpoints use the platform's standard bearer-token auth. Pass the access token in the Authorization: Bearer <token> header — exactly as you would for OpenAI directly. API keys (X-API-Key header) are also accepted for service-to-service callers.

Compatibility guarantee

The wire format conforms to the OpenAI Python SDK's own Pydantic models — the same models used by every official OpenAI client at runtime. We run round-trip validation in CI: every chunk and event we emit must parse through openai.types.chat.ChatCompletionChunk (Chat Completions) or openai.types.responses.ResponseStreamEvent (Responses) without rejection. If the OpenAI SDK accepts it, your tooling will too.

In practice, this means:

All standard fields (id, object, created, model, choices[], delta, usage, finish_reason, sequence_number, output_index, content_index, item_id, etc.) are present and shaped per OpenAI's spec.
We add x_alien (Chat Completions, top-level on each chunk) or metadata.x_alien_* (Responses API, on the Response object) extension fields to surface multi-agent context — these are transparently ignored by clients that don't know about them.

Read the per-surface details:

Chat Completions API
Responses API
Agent Artifacts — downloadable docx/xlsx/pdf produced by the agent mid-run.
NodeStreamEvent envelope and /jobs/:id/stream

OpenAI reference docs

The wire format is a strict subset of OpenAI's published streaming behaviour. When in doubt about a field's semantics, the canonical references are:

Chat Completions streaming: https://platform.openai.com/docs/api-reference/chat/streaming
Responses streaming: https://platform.openai.com/docs/api-reference/responses-streaming

The Alien-specific behaviour layered on top is documented on the per-surface pages.

Which one should I use?​

Authentication​

Compatibility guarantee​

OpenAI reference docs​

Which one should I use?

Authentication

Compatibility guarantee

OpenAI reference docs