Python SDK Quickstart
The data-api-client Python package provides a fully typed client for the Data API. It is auto-generated from the OpenAPI specification and includes Pydantic models for all request and response bodies.
Requirements
- Python 3.11 or later
- pip, uv, or another Python package manager
Installation
The package is published to the Alien Intelligence GitLab PyPI registry.
Using pip
pip install data-api-client \
  --extra-index-url https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple
Using uv
uv pip install data-api-client \
  --extra-index-url https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple
Permanent Configuration
To avoid passing the registry URL every time, configure pip globally:
pip config set global.extra-index-url \
  "https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple https://pypi.org/simple"
Then install with:
pip install data-api-client
For Projects Using pyproject.toml (uv)
Add to your pyproject.toml:
[tool.uv.sources]
data-api-client = { index = "gitlab" }
[[tool.uv.index]]
name = "gitlab"
url = "https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple"
Then:
uv add data-api-client
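Whichever install method above is used, the standard library can confirm that the distribution is visible to the current interpreter. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of the SDK.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "data-api-client") -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("data-api-client is not installed in this environment")
    else:
        print(f"data-api-client {found} is installed")
```

If the helper reports that the package is missing, the usual cause is an interpreter or virtual environment different from the one the install command targeted.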
Configuration
Alien Hosted (Default)
For Alien Hosted deployments, requests go through the platform proxy:
from data_api_client import ApiClient, Configuration
config = Configuration(
    host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
)
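# Authenticate with an API token. Generated clients prepend the prefix to the
# key when building the header, so the key holds the bare token.
config.api_key["Authorization"] = "oat_YOUR_API_TOKEN"
config.api_key_prefix["Authorization"] = "Bearer"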
Get your API token from the Keys section in the platform dashboard. See Manage Your Organization for details.
On-Premise (Direct Access)
For on-premise deployments where you have direct access to the data cluster:
config = Configuration(
    host="https://your-data-cluster.internal.example.com"
)
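# Authenticate with the cluster service API key. Generated clients prepend the
# prefix to the key when building the header, so the key holds the bare value.
config.api_key["Authorization"] = "YOUR_SERVICE_API_KEY"
config.api_key_prefix["Authorization"] = "Bearer"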
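Since the two deployment modes differ only in host and credential, both can be resolved from the environment instead of being hard-coded. The sketch below assumes three hypothetical environment variables (`DATA_API_TOKEN`, `DATA_API_HOST`, `DATA_API_CLUSTER_ID`); none of these names come from the SDK. The Alien Hosted URL pattern is the one shown above.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping

# URL pattern for Alien Hosted deployments, as shown in this guide.
ALIEN_HOSTED_TEMPLATE = "https://api.alien.club/clusters/{cluster_id}/proxy"


@dataclass(frozen=True)
class ClientSettings:
    host: str
    token: str


def load_settings(env: Mapping[str, str] = os.environ) -> ClientSettings:
    """Resolve host and token from environment variables (names are hypothetical)."""
    token = env.get("DATA_API_TOKEN")
    if not token:
        raise ValueError("DATA_API_TOKEN is not set")
    # An explicit host wins (on-premise); otherwise build the hosted proxy URL.
    host = env.get("DATA_API_HOST")
    if not host:
        cluster_id = env.get("DATA_API_CLUSTER_ID")
        if not cluster_id:
            raise ValueError("Set DATA_API_HOST or DATA_API_CLUSTER_ID")
        host = ALIEN_HOSTED_TEMPLATE.format(cluster_id=cluster_id)
    return ClientSettings(host=host.rstrip("/"), token=token)
```

The resolved values can then be passed to `Configuration(host=settings.host)` and used for the `Authorization` key as in the snippets above. This keeps tokens out of source control.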
Basic Operations
List Datasets
from data_api_client import ApiClient, Configuration
from data_api_client.api.datasets_api import DatasetsApi
config = Configuration(
    host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
)
# Store the bare token; the "Bearer" prefix is added from api_key_prefix
config.api_key["Authorization"] = "oat_YOUR_API_TOKEN"
config.api_key_prefix["Authorization"] = "Bearer"
with ApiClient(config) as client:
    datasets_api = DatasetsApi(client)
    datasets = datasets_api.list_datasets()
    for dataset in datasets:
        print(f"{dataset.id}: {dataset.name} ({dataset.entry_count} entries)")
Get a Dataset
with ApiClient(config) as client:
    datasets_api = DatasetsApi(client)
    dataset = datasets_api.get_dataset(id=1)
    print(f"Name: {dataset.name}")
    print(f"Schema: {dataset.schema}")
    print(f"Entries: {dataset.entry_count}")
Batch Get Entries
Efficiently retrieve multiple entries in a single request:
from data_api_client.api.entries_api import EntriesApi
with ApiClient(config) as client:
    entries_api = EntriesApi(client)
    response = entries_api.batch_get_entries(
        dataset_id=1,
        limit=50
    )
    print(f"Retrieved {len(response.entries)} entries")
    for entry in response.entries:
        print(f"  {entry.id}: {entry.name} - {entry.status}")
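A batch response is often easier to read as a summary than as a raw listing. The sketch below tallies entries per status. It relies only on each entry exposing a `status` attribute, as in the example above; the helper name `count_by_status` is illustrative.

```python
from collections import Counter
from typing import Iterable


def count_by_status(entries: Iterable) -> Counter:
    """Count entries per status value; each entry only needs a .status attribute."""
    return Counter(entry.status for entry in entries)
```

Calling `count_by_status(response.entries)` gives a quick view of how many entries are in each state.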
Upload a File
# Open the file in a with-block so the handle is closed after the upload
with ApiClient(config) as client, open("paper.pdf", "rb") as pdf_file:
    entries_api = EntriesApi(client)
    entry = entries_api.create_entry(
        dataset_id=1,
        name="Research Paper 2024",
        file=pdf_file
    )
    print(f"Created entry {entry.id} - status: {entry.status}")
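Uploading a folder of documents is a loop around the same call. The sketch below assumes `create_entry` accepts the `dataset_id`, `name`, and `file` arguments used in the example above. The helper `upload_directory`, the glob pattern, and the use of the file stem as the entry name are all illustrative choices.

```python
from pathlib import Path


def upload_directory(entries_api, dataset_id: int, directory, pattern: str = "*.pdf") -> list:
    """Upload every file matching `pattern` in `directory`, one entry per file."""
    created = []
    # Sort for a deterministic upload order.
    for path in sorted(Path(directory).glob(pattern)):
        # Open each file in a with-block so its handle is closed after the upload.
        with path.open("rb") as handle:
            entry = entries_api.create_entry(
                dataset_id=dataset_id,
                name=path.stem,
                file=handle,
            )
        created.append(entry)
    return created
```

Inside a `with ApiClient(config) as client:` block, this would be called as `upload_directory(EntriesApi(client), dataset_id=1, directory="papers")`.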
Search
Keyword Search
from data_api_client.api.search_api import SearchApi
with ApiClient(config) as client:
    search_api = SearchApi(client)
    results = search_api.keyword_search(
        query="protein folding",
        dataset_ids=[1, 2],
        limit=20
    )
    for hit in results.hits:
        print(f"{hit.name} - score: {hit.score}")
Vector Search (Semantic)
with ApiClient(config) as client:
    search_api = SearchApi(client)
    results = search_api.vector_search_chunks(
        query="novel approaches to protein structure prediction",
        dataset_ids=[1],
        score_threshold=0.7,
        limit=10
    )
    for chunk in results.results:
        print(f"Score: {chunk.score:.2f}")
        print(f"Text: {chunk.chunk_text[:150]}...")
        print()
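Vector search results are commonly passed on to another step, such as a prompt or a summary, so it can help to collapse the best chunks into a single string. The sketch below is one way to do that. It assumes only the `score` and `chunk_text` attributes used in the example above; `build_context` and its parameters are hypothetical.

```python
def build_context(chunks, max_chars: int = 2000, min_score: float = 0.0) -> str:
    """Join the highest-scoring chunk texts into one string within a character budget.

    The budget counts chunk text only, not the blank lines used as separators.
    """
    ranked = sorted(
        (c for c in chunks if c.score >= min_score),
        key=lambda c: c.score,
        reverse=True,
    )
    parts, used = [], 0
    for chunk in ranked:
        text = chunk.chunk_text.strip()
        # Stop at the first chunk that would exceed the budget.
        if used + len(text) > max_chars:
            break
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)
```

A call such as `build_context(results.results, max_chars=4000)` would return the top-ranked passages in descending score order.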
Async Usage
The client supports async operations for high-concurrency applications:
import asyncio
from data_api_client import ApiClient, Configuration
from data_api_client.api.datasets_api import DatasetsApi
async def list_all_datasets():
    config = Configuration(
        host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
    )
    # Store the bare token; the "Bearer" prefix is added from api_key_prefix
    config.api_key["Authorization"] = "oat_YOUR_API_TOKEN"
    config.api_key_prefix["Authorization"] = "Bearer"
    async with ApiClient(config) as client:
        datasets_api = DatasetsApi(client)
        datasets = await datasets_api.list_datasets()
        return datasets
datasets = asyncio.run(list_all_datasets())
for ds in datasets:
    print(ds.name)
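The benefit of the async client shows when several requests run at once. The sketch below fetches multiple datasets concurrently while capping the number of in-flight requests. It assumes the async client exposes `get_dataset(id=...)` as an awaitable, in the same way `list_datasets()` is awaited above. `fetch_datasets` and `max_concurrency` are illustrative names.

```python
import asyncio


async def fetch_datasets(datasets_api, dataset_ids, max_concurrency: int = 5) -> list:
    """Fetch several datasets concurrently, capping the number of in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(dataset_id):
        # The semaphore limits how many requests run at the same time.
        async with semaphore:
            return await datasets_api.get_dataset(id=dataset_id)

    # gather preserves input order, so results line up with dataset_ids.
    return await asyncio.gather(*(fetch_one(i) for i in dataset_ids))
```

The cap keeps a large ID list from opening hundreds of simultaneous connections to the proxy or cluster.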
Error Handling
All API errors raise ApiException, which exposes the status code, reason, and response body:
from data_api_client.exceptions import ApiException
with ApiClient(config) as client:
    datasets_api = DatasetsApi(client)
    try:
        dataset = datasets_api.get_dataset(id=999)
    except ApiException as e:
        print(f"API Error: {e.status} - {e.reason}")
        print(f"Response: {e.body}")
Common error codes:
| Status | Meaning |
|---|---|
| 401 | Authentication failed — check your API token |
| 403 | Insufficient permissions — check your role and token abilities |
| 404 | Resource not found — verify the ID exists |
| 422 | Validation error — check request parameters |
| 503 | Data cluster unavailable — the cluster may be offline or unreachable |
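Of the codes in the table, 503 is the one most likely to be transient, so retrying it with a short backoff is a common pattern. The sketch below is not an SDK feature. It inspects the `status` attribute that `ApiException` carries, and `call_with_retry`, `RETRYABLE_STATUSES`, and the delay values are assumptions chosen for illustration.

```python
import time

# Statuses treated as transient; 503 is taken from the table above.
RETRYABLE_STATUSES = {503}


def call_with_retry(operation, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call `operation()`, retrying with exponential backoff on a retryable status."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Exceptions without a .status attribute, non-retryable statuses,
            # and the final failed attempt are all re-raised unchanged.
            status = getattr(exc, "status", None)
            if status not in RETRYABLE_STATUSES or attempt == attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A call would look like `call_with_retry(lambda: datasets_api.get_dataset(id=1))`. Errors such as 401, 403, 404, and 422 are raised immediately, since retrying them cannot succeed without changing the request.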
Next Steps
- TypeScript SDK Quickstart — TypeScript/JavaScript client
- Data API Overview — API conventions and direct access
- Data API Reference — Full auto-generated endpoint documentation
- Upload Documents — End-to-end upload workflow
- Search and Query — Search patterns and examples