
Security Model

The Alien Intelligence platform uses multiple independent security layers. No single authentication mechanism spans the entire request path — each service hop re-authenticates independently, and authorization is enforced at every layer from the API gateway to individual database queries.

Authentication Overview

The platform authenticates five distinct types of actors, each through a mechanism suited to its access pattern:

| Actor | Mechanism | When Used |
| --- | --- | --- |
| Human users | OAuth2 + OIDC | Web dashboard, interactive sessions |
| API consumers | API tokens | Programmatic access, scripts, integrations |
| Data clusters | Service keys | Cluster-to-platform communication (heartbeat, sync) |
| Data plane operators | Service keys | Operator-to-platform communication (registration, heartbeat) |
| AI agents (MCP) | OAuth2 PKCE | AI assistant access to data via MCP tools |

Dual-Guard System

The platform backend accepts two types of credentials on every API route:

  1. OAuth JWT — issued by the identity provider, passed in an HTTP header. Used by web sessions and MCP-authenticated requests.
  2. API token — an opaque string prefixed with oat_, passed as a Bearer token. Used by programmatic consumers.

The backend tries OAuth first, then falls back to API token authentication. This means:

  • Web users authenticate seamlessly through single sign-on
  • API consumers use long-lived tokens without needing OAuth flows
  • If the identity provider is temporarily unavailable, API token authentication still works
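The fallback order can be sketched as a small dispatch function. This is a minimal illustration, assuming each guard returns a user on success and None on failure; the guard functions themselves are hypothetical, not the platform's actual API.

```python
def authenticate(bearer: str, oauth_guard, token_guard):
    """Dual-guard dispatch: try OAuth first, fall back to API tokens.

    Each guard returns a user object on success or None on failure,
    so an identity-provider outage only disables the OAuth path.
    """
    user = oauth_guard(bearer)
    if user is None:
        user = token_guard(bearer)
    return user  # None signals a 401 to the caller
```

Because the guards are independent, an outage in one path degrades only that path rather than all authentication.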

User Authentication (OAuth2 + OIDC)

Human users authenticate through a self-hosted OIDC identity provider. The flow:

  1. User visits the platform and is redirected to the identity provider
  2. User signs in (or uses existing SSO session)
  3. Identity provider issues a short-lived JWT containing user identity, email, and group memberships
  4. Platform backend validates the JWT using the provider's JWKS endpoint
  5. If the user does not exist in the platform database, they are auto-provisioned on first login

The JWT is short-lived and is refreshed automatically via a refresh token obtained through the offline_access scope. Session state is stored server-side.

info

The identity provider handles all credential storage, password policies, MFA, and session management. The platform backend never sees or stores user passwords.

API Token Authentication

For programmatic access — scripts, CI/CD integrations, third-party applications — the platform issues API tokens.

Token Properties

| Property | Description |
| --- | --- |
| Format | Opaque string prefixed with oat_ |
| Storage | scrypt hash comparison against the database — the raw token is not stored |
| Abilities | Granular permission scopes per token |
| Expiration | Optional expiration date |
| Revocation | Can be revoked at any time from the dashboard |

Token Abilities

Each API token is issued with a specific set of abilities that control what operations it can perform:

| Ability | Grants |
| --- | --- |
| CLUSTER_READ | View cluster status, metadata, and configuration |
| CLUSTER_WRITE | Create clusters, modify configuration, trigger operations |
| DATASET_READ | List and view datasets and entries |
| DATASET_WRITE | Create datasets, upload entries, trigger pipelines |

Tokens can be scoped to specific abilities. A token with only DATASET_READ cannot upload files or modify configurations, even if the user who created it has full access.
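This scoping rule amounts to a lookup from operation to required ability, checked against the token rather than the user. The route-to-ability mapping below is illustrative, not the platform's actual routing table.

```python
# Hypothetical mapping from (method, route) to the ability a token
# must carry. The token's abilities, not the issuing user's role,
# bound what each request may do.
REQUIRED = {
    ("GET", "/datasets"): "DATASET_READ",
    ("POST", "/datasets"): "DATASET_WRITE",
    ("GET", "/clusters"): "CLUSTER_READ",
}

def authorize(token_abilities: set, method: str, path: str) -> bool:
    needed = REQUIRED.get((method, path))
    return needed is not None and needed in token_abilities
```

A token carrying only DATASET_READ passes the GET check and fails the POST check, regardless of what its creator could do.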

note

The abilities listed above are examples — token authorization is fully ability-based. New abilities can be defined as the platform evolves, and each token can be issued with any combination of available abilities.

Cluster Authentication

Data clusters and the platform communicate using service keys — random strings generated during cluster registration.

Registration Flow

  1. A new data cluster boots with a one-time registration token
  2. The cluster's Data API exchanges this token with the platform for a permanent service API key
  3. The service API key is stored as a Kubernetes Secret on the cluster
  4. All subsequent calls to the platform (heartbeats, sync, status updates) use this service key as a Bearer token

The platform validates service keys by comparing their scrypt hash against the stored hash in the database. The raw key is never stored on the platform side.
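Hash-only storage of this kind can be sketched with Python's standard-library scrypt. The cost parameters and salt handling here are assumptions for illustration, not the platform's actual settings.

```python
import hashlib
import hmac
import secrets

def hash_key(raw: str, salt: bytes) -> bytes:
    # Derive a memory-hard scrypt hash; only this hash is persisted.
    return hashlib.scrypt(raw.encode(), salt=salt, n=2**14, r=8, p=1)

def verify_key(raw: str, salt: bytes, stored_hash: bytes) -> bool:
    # Constant-time comparison avoids leaking hash prefixes via timing.
    return hmac.compare_digest(hash_key(raw, salt), stored_hash)

# Registration time: generate, hash, store only salt + hash.
salt = secrets.token_bytes(16)
stored = hash_key("example-service-key", salt)
```

A database compromise then yields only salted scrypt hashes, which cannot be replayed as credentials.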

Cluster-to-Platform Calls

| Call | Authentication | Frequency |
| --- | --- | --- |
| Heartbeat | Service API key | Every 30 seconds |
| Batch sync | Service API key | Every 30 seconds |
| Registration | One-time token (exchanged for key) | Once |
| Deletion notification | Service API key | On cluster deletion |

Platform-to-Cluster Calls (Proxy)

When the platform proxies a user request to a data cluster, it uses the cluster's service API key to authenticate with the Data API. The proxy also forwards the user's identity headers so the Data API knows who originated the request.

MCP Authentication (AI Agents)

AI agents (Claude, GPT-4, custom agents) access the platform through MCP servers. Authentication uses OAuth2 with PKCE (Proof Key for Code Exchange), designed for public clients that cannot securely store a client secret.
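The PKCE half of this flow (RFC 7636, S256 method) can be sketched with the standard library: the client keeps code_verifier secret and sends only the derived code_challenge in the authorization request, so an intercepted challenge cannot be replayed.

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    # code_verifier: high-entropy random string, kept by the client.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # code_challenge: base64url(SHA-256(verifier)), sent in the
    # authorization request; the verifier is revealed only at token exchange.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The authorization server recomputes the challenge from the verifier at token exchange, which is what lets a public client authenticate without a stored secret.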

MCP Authentication Flow

The critical property of this flow is that the AI agent's access is bounded by the authorizing user's permissions. An agent cannot access data that the user who authorized it cannot access.

Per-Tool RBAC

MCP servers enforce per-tool role-based access control. Each tool declares what abilities it requires (e.g., dataset:read, entry:read), and the MCP framework checks the user's permissions before executing the tool.
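One plausible shape for this declare-then-check pattern is a decorator. The requires helper and the example tool below are illustrative, not the MCP framework's actual API.

```python
import functools

def requires(*abilities):
    """Declare the abilities a tool needs; check the caller before running it."""
    def wrap(tool):
        @functools.wraps(tool)
        def guarded(user_abilities: set, *args, **kwargs):
            missing = set(abilities) - user_abilities
            if missing:
                raise PermissionError(f"missing abilities: {sorted(missing)}")
            return tool(*args, **kwargs)
        return guarded
    return wrap

@requires("dataset:read", "entry:read")
def list_entries(dataset_id: str):
    # Hypothetical tool body; runs only if the check above passed.
    return f"entries of {dataset_id}"
```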

Enterprise: OAuth Bypass via API Key or JWT

For enterprise MCP configurations, the OAuth PKCE flow can be bypassed by injecting a pre-existing credential directly in the Authorization header. The MCP server's token validation supports three authentication paths:

  1. Proxy-issued token — the standard path, issued through the OAuth PKCE flow described above.
  2. JWT token — if the Bearer token is a JWT (detected by its three-segment format), the MCP server validates it via the identity provider's introspection endpoint. This allows web applications or services that already hold a valid JWT to use MCP tools directly without a separate OAuth flow.
  3. Platform API token — if the Bearer token is an opaque token (e.g., prefixed with oat_), the MCP server validates it by calling the platform backend's /users/me endpoint. The backend authenticates the token against its access token store and returns the user's identity, organization, abilities, and roles.

In both bypass cases, the MCP server extracts the user's identity and permissions from the validation response and enforces the same per-tool RBAC as the standard OAuth flow. The user's access is bounded by the abilities associated with the token — no escalation is possible.

This enables enterprise integrations where an organization provisions API tokens or manages JWTs through their own identity infrastructure, then configures their MCP clients to use these tokens directly.
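The dispatch between the three validation paths can be sketched as a classification by credential shape, matching the detection rules described above (oat_ prefix for opaque platform tokens, three dot-separated segments for a JWT). The labels are illustrative.

```python
def classify_credential(bearer: str) -> str:
    """Pick the validation path for a Bearer credential by its shape."""
    if bearer.startswith("oat_"):
        return "platform-api-token"   # validated via the backend's /users/me
    if bearer.count(".") == 2:
        return "jwt"                  # validated via the introspection endpoint
    return "proxy-token"              # standard OAuth PKCE path
```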

Authorization Model

Organization Roles

Access control is organized around organizations. Every user belongs to one or more organizations, and their role within each organization determines what they can do:

| Role | Capabilities |
| --- | --- |
| Viewer | Read-only access to organization datasets and clusters |
| Writer | Upload entries, trigger pipelines, run workflows |
| Owner | Full organization admin: cluster management, billing, user management |
| Client | Issue tokens to the organization's own clients for reselling access (Enterprise only) |

Roles are enforced by a policy system in the backend. Every controller action checks the user's organization role before accessing data. The authorization check happens before any database query or external call.

Data Owner Isolation

For organizations that expose their datasets to other users (e.g., public datasets in a marketplace), a dedicated middleware ensures that data owners only see analytics about access to their own datasets — they cannot see other customers' data or access patterns.

Cluster Proxy Authorization

When a request is proxied to a data cluster, the authorization check verifies:

  1. The user has the right to access the cluster — this can be because they belong to the owning organization, the dataset or cluster is public, or they have been explicitly granted access
  2. The cluster is in an active state (not offline or suspended)
  3. The request source type (human, worker, or MCP) is recorded for auditing

Offline and suspended clusters reject all proxy requests. This prevents access to clusters that are undergoing maintenance or have been disabled.
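The two authorization checks above (access right and active state) can be sketched as a single gate; the field names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    org_id: str
    status: str          # e.g. "active", "offline", "suspended"
    public: bool = False

def may_proxy(user_orgs: set, granted: set, cluster_id: str, cluster: Cluster) -> bool:
    # Check 1: membership in the owning org, public visibility,
    # or an explicit grant.
    has_access = (cluster.org_id in user_orgs
                  or cluster.public
                  or cluster_id in granted)
    # Check 2: only active clusters accept proxied requests.
    return has_access and cluster.status == "active"
```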

Network Security

Platform Internal — Service Mesh

All services within the platform cluster communicate through a service mesh (Istio) with mutual TLS (mTLS) encryption. This means:

  • All pod-to-pod traffic is encrypted — even within the same cluster
  • Service identity is verified — each service authenticates with its own certificate
  • Access policies restrict communication — not every service can call every other service. For example, only the backend can call the Skupper Gateway.

Cross-Cluster — mTLS Tunnels

Communication between the platform and data clusters uses Skupper mTLS tunnels:

  • Encrypted end-to-end — traffic is encrypted from the platform to the data cluster
  • Mutually authenticated — both sides verify identity with certificates
  • Outbound-only from data clusters — for on-premise deployments, no inbound firewall rules needed on your infrastructure
  • Single-use access grants — tunnel establishment uses time-limited, single-use tokens

Data Cluster Internal

Within each data cluster, services are secured individually:

| Service | Security Mechanism |
| --- | --- |
| PostgreSQL | Per-tenant database role with scoped credentials |
| MinIO | TLS via auto-generated certificates, per-tenant IAM user |
| Qdrant | JWT-based RBAC with per-tenant tokens |
| Meilisearch | Master key + per-tenant API key scoped to tenant indexes |
| Data API | No public ingress — reachable only via Skupper tunnel |

note

The Data API has no public-facing endpoint. It is accessible only through the Skupper mTLS tunnel from the platform. This means there is no attack surface exposed to the internet from the data cluster side.

Data Isolation

Data isolation operates at every layer of the stack. See Data Sovereignty for the full treatment. Here is the summary:

Per-Tenant Resources

Each tenant gets completely separate data infrastructure:

| Resource | Isolation | Cross-Tenant Access |
| --- | --- | --- |
| Database | Separate PostgreSQL database | Impossible — different DB, different credentials |
| Object storage | Separate MinIO bucket + IAM user | Impossible — IAM policy scopes to bucket |
| Vector database | Separate Qdrant collection + JWT | Impossible — JWT scopes to collection |
| Search indexes | Separate Meilisearch indexes + API key | Impossible — API key scopes to indexes |
| API deployment | Separate pod in separate namespace | Impossible — network policies enforce namespace boundary |

Platform-Side Isolation

On the platform side, all data access is scoped by organization. Database queries filter by organization_id, and the policy system prevents cross-organization data access. The platform stores only metadata — never document content — so even in the event of a platform-side breach, customer documents are not exposed.
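A minimal illustration of this scoping rule as a filter applied before any other logic (in the real backend this would be a WHERE clause plus the policy layer, not an in-memory filter):

```python
def scoped(records: list, org_id: str) -> list:
    """Return only the records belonging to the caller's organization."""
    return [r for r in records if r["organization_id"] == org_id]
```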

Audit Trail

The platform maintains audit logs for all data access:

| Log Type | What Is Recorded | Where Stored |
| --- | --- | --- |
| Proxy call logs | Every proxied request: user, cluster, path, timestamp, response status | Platform database |
| Execution histories | Data access analytics: who accessed what, when, via which path (human/worker/MCP) | Platform database |
| Sync logs | Batch sync events: what metadata was synchronized, when | Platform database |
| Kubernetes audit | Infrastructure operations: deployments, secret access, CRD changes | Kubernetes audit log |

These logs support compliance requirements for GDPR (data access tracking), HIPAA (access audit trail), and ISO 27001 (security event logging).

Next Steps