What is Alien Intelligence?
Alien Intelligence is a data platform for organizations that need to manage large document collections. It combines AI-powered processing, search, and agent integration with strong per-tenant data isolation.
The platform separates orchestration (the platform layer) from data storage and processing (isolated data clusters). Each tenant gets dedicated databases, storage, vector indexes, and search engines — fully isolated from other tenants. By default, Alien hosts and manages everything for you. For enterprises with strict data sovereignty requirements (GDPR, HIPAA, regulated industries), data clusters can optionally be deployed on your own infrastructure.
Key Capabilities
Data Management
Create, organize, and version large document collections from a single control plane. Datasets support typed schemas, lifecycle tracking, and manifest-based storage that separates metadata from files. The platform catalog stores only metadata pointers — content stays in your isolated data cluster.
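To make the metadata/content split concrete, here is a minimal sketch of manifest-based storage. The class and field names (`ManifestEntry`, `content_uri`, `catalog_view`) are illustrative assumptions, not the platform's actual schema; the point is that the catalog derives only counts and identifiers, never content locations.

```python
# Hypothetical sketch: the platform catalog sees metadata pointers only,
# while content URIs and files stay inside the tenant's data cluster.
# All names and fields here are illustrative, not the real schema.
from dataclasses import dataclass


@dataclass
class ManifestEntry:
    entry_id: str      # stable identifier, safe to sync to the catalog
    content_uri: str   # resolvable only inside the tenant's data cluster
    checksum: str      # integrity check for the stored file
    size_bytes: int


def catalog_view(entries: list[ManifestEntry]) -> dict:
    """What the platform catalog stores: IDs and counts, never content URIs."""
    return {
        "entry_count": len(entries),
        "entry_ids": [e.entry_id for e in entries],
    }


entries = [
    ManifestEntry("doc-001", "s3://tenant-a/raw/doc-001.pdf", "sha256:ab12", 48210),
    ManifestEntry("doc-002", "s3://tenant-a/raw/doc-002.pdf", "sha256:cd34", 19554),
]
print(catalog_view(entries))
```

Versioning then becomes a matter of snapshotting manifests, since the manifest, not the file store, defines what a dataset contains at a point in time.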
Intelligent Processing
Raw documents are automatically transformed into searchable, AI-queryable knowledge bases through composable pipeline components. The platform ships with pre-built pipeline presets for common formats (PDF, scientific XML, DOCX) and supports custom pipelines by composing existing components. Processing runs as parallelized workflows with automatic retry and scratch space per job.
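The composition model above can be sketched as functions chained over a document record. The component names (`extract_text`, `chunk`) and the `compose` helper are hypothetical stand-ins for the platform's pipeline components, shown only to illustrate how a preset is assembled from reusable parts.

```python
# Illustrative sketch of composing a pipeline preset from components.
# Component names and the document shape are assumptions for this example.
from typing import Callable

Component = Callable[[dict], dict]


def extract_text(doc: dict) -> dict:
    # Stand-in for format-specific extraction (PDF, XML, DOCX, ...).
    doc["text"] = doc["raw"].decode("utf-8", errors="ignore")
    return doc


def chunk(doc: dict) -> dict:
    # Fixed-width chunking keeps the example short; real chunkers vary.
    text = doc["text"]
    doc["chunks"] = [text[i:i + 40] for i in range(0, len(text), 40)]
    return doc


def compose(*components: Component) -> Component:
    """Build a pipeline that runs each component in order."""
    def pipeline(doc: dict) -> dict:
        for component in components:
            doc = component(doc)
        return doc
    return pipeline


pdf_preset = compose(extract_text, chunk)
result = pdf_preset({"raw": b"Alien Intelligence processes documents into chunks."})
print(len(result["chunks"]))  # 51 characters at width 40 -> 2 chunks
```

A custom pipeline is the same idea with a different component list; retries and scratch space wrap each component invocation rather than the whole pipeline.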
Search and Discovery
Two complementary search engines run on every data cluster:
- Keyword search with typo tolerance and faceted filtering, returning results in under 50ms
- Vector similarity search across embedded document chunks for semantic discovery
Multi-cluster fan-out lets a single query span datasets stored on different clusters simultaneously, with results merged and ranked by relevance.
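Fan-out can be sketched as concurrent per-cluster queries followed by a single ranked merge. The cluster names, stored hits, and scoring below are illustrative; a real deployment would call each cluster's search endpoint instead of reading a local dict.

```python
# Minimal sketch of multi-cluster fan-out: query clusters concurrently,
# then merge hits into one relevance-ranked list. Data is illustrative.
from concurrent.futures import ThreadPoolExecutor

CLUSTERS = {
    "cluster-eu": [{"id": "a", "score": 0.91}, {"id": "b", "score": 0.40}],
    "cluster-us": [{"id": "c", "score": 0.77}],
}


def search_cluster(name: str, query: str) -> list[dict]:
    # Stand-in for a per-cluster search call over the network.
    return CLUSTERS[name]


def fan_out(query: str, clusters: list[str]) -> list[dict]:
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(lambda c: search_cluster(c, query), clusters))
    merged = [hit for hits in result_sets for hit in hits]
    return sorted(merged, key=lambda h: h["score"], reverse=True)


print([h["id"] for h in fan_out("alloys", ["cluster-eu", "cluster-us"])])
# → ['a', 'c', 'b']
```

Because each cluster answers independently, a slow or unreachable cluster delays only its own shard of the results rather than blocking the other queries.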
AI Agent Integration
AI assistants (Claude, GPT-4, custom agents) can directly search, read, and analyze your document collections through the Model Context Protocol (MCP). The platform provides tools covering the complete data access workflow, from dataset discovery to reading individual figures from processed documents. Access control follows end-user permissions: agents never receive more access than the human who authorized them.
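The permission model can be illustrated with a small sketch: every agent tool call is checked against the authorizing user's grants before it is served. The user names, dataset names, and tool names here are hypothetical and stand in for the platform's actual access checks.

```python
# Hedged sketch of end-user-scoped agent access: an agent acting for a
# user can only reach datasets that user was granted. Names are invented.
USER_GRANTS = {
    "alice": {"papers-2024"},
    "bob": {"papers-2024", "internal-memos"},
}


def agent_tool_call(user: str, tool: str, dataset: str) -> str:
    """Dispatch an agent tool call, enforcing the authorizing user's grants."""
    if dataset not in USER_GRANTS.get(user, set()):
        raise PermissionError(f"{user} may not access {dataset}")
    return f"{tool} ok on {dataset}"


print(agent_tool_call("alice", "search", "papers-2024"))
try:
    agent_tool_call("alice", "read_figure", "internal-memos")
except PermissionError as exc:
    print("denied:", exc)
```

The key property is that the check keys on the human's identity, not the agent's, so delegating to an agent never widens the accessible surface.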
Research Intelligence
The platform includes built-in access to public research databases: over 600 million research products via OpenAIRE and 14 million bibliographic records from France's national library (BnF). AI agents can conduct literature reviews, profile authors, track funding outputs, and navigate historical document collections without additional subscriptions.
Data Isolation by Design
Every tenant on Alien Intelligence gets strong data isolation, regardless of where the cluster is hosted:
- Namespace isolation: Each tenant gets dedicated databases, storage buckets, vector collections, and search indexes with scoped credentials.
- Metadata-only sync: Only dataset names, entry counts, and sync status flow to the platform catalog — never content.
- Proxy architecture: The platform never holds a direct route to your storage. All data access is authenticated and per-entry.
- No bulk egress: There are no API endpoints that export data back to the platform.
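The metadata-only sync rule above can be sketched as a whitelist at the catalog boundary: sync payloads may carry status fields and nothing else. The field names are illustrative assumptions, not the actual sync schema.

```python
# Hedged sketch of metadata-only sync: the catalog sync path accepts a
# fixed whitelist of status fields and rejects anything content-shaped.
# Field names are illustrative.
ALLOWED_SYNC_FIELDS = {"dataset_name", "entry_count", "sync_status"}


def validate_sync_payload(payload: dict) -> dict:
    """Reject any sync payload carrying fields outside the metadata whitelist."""
    extra = set(payload) - ALLOWED_SYNC_FIELDS
    if extra:
        raise ValueError(f"fields blocked from catalog sync: {sorted(extra)}")
    return payload


print(validate_sync_payload(
    {"dataset_name": "papers-2024", "entry_count": 1200, "sync_status": "synced"}
))
```

Enforcing the whitelist on the platform side complements the proxy architecture: even a misbehaving cluster component cannot push content into the catalog.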
On-Premise Deployment (Enterprise)
For organizations in regulated industries — healthcare, defense, financial services, government research — data clusters can be deployed on your own infrastructure. This provides full data sovereignty:
- Network topology: On-premise clusters initiate outbound-only connections via mutual-TLS (mTLS) tunnels. No inbound firewall rules required.
- Physical data residency: Documents, embeddings, and indexes physically reside on infrastructure you control.
On-premise deployment is recommended only for teams with the capacity to manage Kubernetes infrastructure. Alien-hosted clusters are maintained by Alien, with faster support response times and automatic updates.
Both deployment modes are structurally compatible with GDPR, HIPAA, and ISO 27001 requirements.
Who Uses Alien Intelligence?
- Research institutions building searchable corpora of scientific literature with semantic search and AI-powered analysis
- Content providers ingesting proprietary document formats into structured, AI-queryable collections
- Cultural heritage organizations providing AI-native access to large historical document archives
- Regulated enterprises that need vector search, LLM analysis, and multi-agent workflows with strong data isolation — optionally on their own infrastructure
Where to Go Next
Core Concepts
Understand the platform architecture, data clusters, datasets, pipelines, and search.
How-To Guides
Step-by-step guides for creating clusters, uploading documents, and configuring pipelines.
API Reference
Interactive API documentation for the Platform API and Data API with request/response schemas.
SDK Reference
Python and TypeScript client libraries for programmatic access to the Data API.
Architecture
Deep dives into deployment models, networking, infrastructure, processing engine, and compliance.