
AI Capabilities

The platform integrates AI at multiple levels: document processing pipelines use OCR and embedding models to make documents searchable, the workflow engine orchestrates LLM calls across multiple providers, and MCP servers give AI agents direct access to your data. This page covers the AI architecture and how these capabilities work together.

Multi-Provider LLM Support

The platform supports multiple LLM providers, configurable per-node in workflows and per-tenant for embeddings. This avoids vendor lock-in and enables cost optimization by routing different tasks to the most appropriate model.

Supported Providers

| Provider | Capabilities | Typical Use Cases |
|---|---|---|
| OpenAI | Chat completion, structured output, embeddings | General-purpose reasoning, JSON output, embeddings |
| Anthropic | Chat completion, long-context analysis | Complex analysis, research synthesis, long documents |
| Mistral | Chat completion, OCR, embeddings | European data processing, document extraction, multilingual |
| Google Gemini | Chat completion, embeddings | Cost-effective batch processing, embeddings |

Provider Selection

Provider selection happens at two levels:

  1. Per-node in workflows — each node in the visual workflow editor can be configured with a specific provider and model. A single workflow can use OpenAI for summarization, Anthropic for analysis, and Mistral for translation.

  2. Per-tenant for embeddings — each data cluster tenant has a configured embedding provider. All documents in that tenant's datasets use the same embedding model for consistency in vector search.
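Per-node provider selection can be pictured as a workflow definition in which each node names its own provider and model. The sketch below is illustrative only: the field names (`provider`, `model`, `prompt`) and the `{{node_id.output}}` reference style are assumptions, not the platform's actual schema.

```python
# Hypothetical workflow definition mixing providers across nodes.
# Field names and template syntax are illustrative, not the real schema.
workflow = {
    "nodes": [
        {"id": "summarize", "type": "llm.chat", "provider": "openai",
         "model": "gpt-4o-mini", "prompt": "Summarize: {{input.text}}"},
        {"id": "analyze", "type": "llm.chat", "provider": "anthropic",
         "model": "claude-sonnet", "prompt": "Analyze: {{summarize.output}}"},
        {"id": "translate", "type": "llm.chat", "provider": "mistral",
         "model": "mistral-large", "prompt": "Translate: {{analyze.output}}"},
    ],
    # Edges define data flow: summarize -> analyze -> translate
    "edges": [("summarize", "analyze"), ("analyze", "translate")],
}

# A single workflow fans out across three providers:
providers = {node["provider"] for node in workflow["nodes"]}
```

Because the provider is a per-node setting, swapping a model for a cheaper or more capable one is a local change that does not affect the rest of the workflow.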

Info: Embedding provider and vector dimensions are configured when creating a data cluster. Changing the embedding provider after documents have been processed requires re-embedding all existing documents. Choose your embedding provider carefully at cluster creation time.

Embedding Generation

Embeddings are vector representations of text that enable semantic search — finding documents by meaning rather than exact keyword matches.

How Embedding Works in Pipelines

During document processing, the embedding step runs after chunking: each chunk produces a fixed-dimensional vector (the dimension depends on the model). These vectors are stored in Qdrant alongside the chunk text and metadata, enabling semantic similarity search.
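The chunk-to-vector step can be sketched as follows. The `toy_embed` function below is a deterministic stand-in for a provider API call (real models produce, e.g., 1536-dimensional vectors for OpenAI's small embedding model); the point shape stored per chunk (vector, text, metadata) mirrors the description above.

```python
import hashlib
import math

EMBEDDING_DIM = 8  # stand-in; real models use hundreds to thousands of dims

def toy_embed(text: str, dim: int = EMBEDDING_DIM) -> list[float]:
    """Deterministic stand-in for a provider embedding call:
    hash the text into `dim` components, then L2-normalize."""
    digest = hashlib.sha256(text.encode()).digest()
    raw = [digest[i % len(digest)] - 127.5 for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]

chunks = ["## Intro\nThe platform overview...", "## Usage\nTo get started..."]

# One point per chunk: fixed-dimensional vector plus the chunk text and metadata.
points = [
    {"vector": toy_embed(chunk), "text": chunk, "metadata": {"chunk_index": i}}
    for i, chunk in enumerate(chunks)
]
```

Every point carries a vector of the same dimension, which is what the vector store requires for a single collection.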

Embedding Providers

| Provider | Characteristics |
|---|---|
| OpenAI | High-quality embeddings, multiple model sizes available |
| Mistral | Strong multilingual support |
| Google Gemini | Cost-effective, competitive quality |

The embedding provider is configured per-tenant, ensuring all vectors in a collection use the same model and dimensions. This is critical for search quality — mixing embedding models in the same collection produces unreliable similarity scores.
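The warning above can be made concrete: cosine similarity is only defined for vectors of the same dimension, and even same-dimension vectors from different models occupy unrelated spaces, so cross-model scores are meaningless. A minimal similarity function that at least rejects the dimension mismatch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors. Rejects dimension mismatches;
    note that even same-dimension vectors from *different* embedding models
    yield meaningless scores, which a dimension check cannot catch."""
    if len(a) != len(b):
        raise ValueError("vectors have different dimensions; "
                         "they likely come from different embedding models")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

Enforcing one embedding model per tenant collection removes both failure modes at the source.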

OCR via Mistral Document AI

The platform uses Mistral's Document AI service for optical character recognition (OCR). This is the first step in making PDF documents searchable.

What OCR Produces

| Output | Format | Description |
|---|---|---|
| Extracted text | Markdown | Full document text with heading structure preserved |
| Figures | Images (base64) | All images, diagrams, and figures extracted from the document |
| OCR metadata | JSON | Page-by-page extraction details |

The OCR output preserves document structure — headings, paragraphs, lists, and tables are represented in Markdown. This structure is important for the chunking step, which uses heading boundaries to create semantically coherent chunks.
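Heading-aware chunking of the OCR Markdown can be sketched as below. This is a simplified illustration: the platform's actual chunker may also enforce size limits and chunk overlap, which are omitted here.

```python
import re

def chunk_by_headings(markdown: str) -> list[str]:
    """Split Markdown at heading boundaries so each chunk covers one
    semantically coherent section (simplified sketch)."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # Start a new chunk at every Markdown heading (#, ##, ... ######).
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]

doc = "# Title\nIntro text.\n## Methods\nDetails.\n## Results\nFindings."
sections = chunk_by_headings(doc)
```

Because the OCR step preserved the heading structure, each resulting chunk corresponds to one section of the source document rather than an arbitrary window of text.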

Figure Extraction

Figures extracted during OCR are processed through a figure linking step that:

  1. Resolves figure references in the Markdown text (e.g., "Figure 1" links to the actual image)
  2. Converts all images to a consistent format (PNG)
  3. Stores figures alongside the processed document in MinIO
  4. Makes figures accessible via the Data API
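Step 1 above (resolving figure references) can be sketched with a simple regex pass. The URL scheme in the example is hypothetical; the real step resolves references against the figures stored in MinIO.

```python
import re

def link_figures(markdown: str, figure_urls: dict[int, str]) -> str:
    """Replace textual references like 'Figure 1' with Markdown links
    to the extracted images (simplified sketch; URL scheme is made up)."""
    def repl(match: re.Match) -> str:
        number = int(match.group(1))
        url = figure_urls.get(number)
        # Leave unresolved references untouched.
        return f"[Figure {number}]({url})" if url else match.group(0)
    return re.sub(r"Figure (\d+)", repl, markdown)

text = "As shown in Figure 1, throughput scales linearly."
linked = link_figures(text, {1: "/data/doc-123/figures/fig-1.png"})
```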

The MCP Architecture

The platform implements the Model Context Protocol (MCP) — an open standard for giving AI agents structured access to external data and tools. MCP servers expose your data to AI assistants like Claude, GPT-4, and custom agents through a standardized tool interface.

How MCP Works

The critical property of this architecture is that the AI agent accesses data with the authorizing user's permissions. The agent cannot see data the user cannot see, and every access is logged in the platform's audit trail.

Available MCP Servers

| Server | Tools | Data Source | Purpose |
|---|---|---|---|
| Data Cluster | 7 tools | Your document collections | Browse datasets, keyword search, semantic search, read documents, view figures |
| OpenAIRE | 29 tools | OpenAIRE Graph (600M+ products) | Literature review, citation analysis, author profiling, research trends |
| BnF | 15 tools | Bibliothèque nationale de France | Historical documents, bibliographic records, digitized collections |

MCP Authentication

MCP servers support three authentication paths, all of which enforce the same per-tool RBAC:

  1. OAuth PKCE — the standard path for interactive AI assistants. The user authorizes the agent through a browser-based OAuth flow.
  2. JWT relay — for web applications or services that already hold a valid JWT from the identity provider.
  3. API token — for enterprise integrations using platform API tokens.

In all cases, the MCP server extracts the user's identity and permissions from the token and enforces per-tool access control. Each tool declares what abilities it requires (e.g., dataset:read, entry:read), and the framework checks permissions before executing the tool.
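The per-tool check can be sketched as below. The ability names (`dataset:read`, `entry:read`) come from the text above, but the registry and dispatch shape are assumptions, not the framework's actual API.

```python
# Hypothetical tool registry: each tool declares the abilities it requires.
TOOL_ABILITIES: dict[str, set[str]] = {
    "semantic_search": {"dataset:read", "entry:read"},
    "read_document": {"entry:read"},
}

def call_tool(tool: str, user_abilities: set[str], **kwargs) -> dict:
    """Check the caller's abilities (extracted from their token)
    against the tool's declared requirements before executing."""
    required = TOOL_ABILITIES[tool]
    missing = required - user_abilities
    if missing:
        raise PermissionError(f"{tool} requires abilities: {sorted(missing)}")
    # Permission granted: execute the tool (stubbed here).
    return {"tool": tool, "status": "executed", "args": kwargs}

result = call_tool("read_document", {"entry:read"}, entry_id="doc-42")
```

The key property is that the check runs on every call with the identity from the presented token, so the same enforcement applies regardless of which of the three authentication paths was used.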

How AI Agents Access Data Securely

The MCP architecture enforces multiple security boundaries:

  1. User-scoped access — the agent inherits the authorizing user's permissions
  2. Per-tool RBAC — each tool declares required abilities, checked before execution
  3. Proxy layer — all data access goes through the platform's authenticated proxy (no direct cluster access)
  4. Audit logging — every tool call is logged with the user identity, MCP session ID, and request details
  5. Organization scoping — agents can only access data within the user's organization

Workflow Orchestration

The visual workflow editor enables building multi-step AI pipelines without writing code. Workflows are defined as directed acyclic graphs (DAGs) where nodes represent operations and edges define data flow.

Node Categories

| Category | Examples | Description |
|---|---|---|
| Data Access | Vector search, keyword search, download entry | Retrieve data from your clusters |
| LLM | Chat completion, structured output | Call language models with configurable providers |
| Document Processing | Text splitter, summarizer, translator | Transform and analyze text |
| Audio | Text-to-speech | Generate audio from text |
| Research | OpenAIRE search, citation analysis | Access research intelligence tools |
| Agents | Agent node, group node | Multi-agent orchestration and composable sub-workflows |
| System | Conditional, loop, merge | Control flow and data routing |

Building a Workflow

  1. Add nodes to the canvas — each node represents one operation
  2. Connect nodes with edges — defines the data flow between steps
  3. Configure parameters — set model, prompt, search query, etc. per node
  4. Reference upstream outputs — use template syntax to pass data between nodes
  5. Run the workflow — the platform executes the DAG, tracking cost and status per node
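The execution model behind steps 4 and 5 can be sketched with a minimal DAG runner: nodes execute in topological order, and a `{{node_id}}` template pulls the referenced node's output into a downstream input. The template syntax and node shape here are assumptions for illustration.

```python
import re
from graphlib import TopologicalSorter

# Toy two-node workflow: a search node feeding a summarization node.
# "op" stands in for the real node handler (LLM call, search, etc.).
nodes = {
    "search": {"op": lambda _: "3 matching chunks", "input": ""},
    "summarize": {"op": lambda text: f"summary of: {text}",
                  "input": "{{search}}"},
}
edges = {"summarize": {"search"}}  # summarize depends on search

outputs: dict[str, str] = {}
for node_id in TopologicalSorter(edges).static_order():
    spec = nodes[node_id]
    # Resolve {{node_id}} references against already-computed outputs.
    resolved = re.sub(r"\{\{(\w+)\}\}",
                      lambda m: outputs[m.group(1)], spec["input"])
    outputs[node_id] = spec["op"](resolved)
```

Topological ordering is what guarantees every upstream output exists before a downstream node references it, which is why workflows must be acyclic.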

Multi-Agent Orchestration

For complex tasks, the platform supports hierarchical multi-agent execution:

  • Agent nodes execute autonomous AI agents with access to tools and sub-agents
  • Group nodes define composable sub-workflows that dissolve into the parent DAG
  • Tool routing — agents can call vector search, keyword search, and other data access tools as part of their reasoning
  • State management — agent conversations and intermediate state are tracked for debugging

This enables building sophisticated AI applications — for example, a research agent that searches your document collection, cross-references findings with the global research graph (via OpenAIRE), and produces a structured analysis.

AI Cost Tracking

The platform tracks AI costs at multiple levels:

| Level | What Is Tracked |
|---|---|
| Per-node | Token counts (input/output), API call costs, model used |
| Per-job | Aggregate cost across all nodes in the workflow |
| Per-organization | Historical cost data for billing and budgeting |

Cost data is available in the job execution summary, allowing you to understand which steps in your workflows are most expensive and optimize accordingly — for example, by switching to a smaller model for simple classification tasks or batching multiple documents into a single LLM call.
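The per-node to per-job roll-up can be sketched as below. The record fields and dollar figures are illustrative, not the platform's billing schema.

```python
# Hypothetical per-node cost records from one job execution.
node_costs = [
    {"node": "summarize", "model": "gpt-4o-mini",
     "input_tokens": 1200, "output_tokens": 300, "cost_usd": 0.00045},
    {"node": "analyze", "model": "claude-sonnet",
     "input_tokens": 900, "output_tokens": 600, "cost_usd": 0.01170},
]

# Per-job cost is the sum over nodes; the max identifies the
# optimization target (e.g., a candidate for a smaller model).
job_cost = sum(n["cost_usd"] for n in node_costs)
most_expensive = max(node_costs, key=lambda n: n["cost_usd"])
```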

Next Steps