Python SDK Quickstart

The data-api-client Python package provides a fully typed client for the Data API. It is auto-generated from the OpenAPI specification and includes Pydantic models for all request and response bodies.

Requirements

  • Python 3.11 or later
  • pip, uv, or another Python package manager

Installation

The package is published to the Alien Intelligence GitLab PyPI registry.

Using pip

pip install data-api-client \
    --extra-index-url https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple

Using uv

uv pip install data-api-client \
    --extra-index-url https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple

Permanent Configuration

To avoid passing the registry URL every time, configure pip globally:

pip config set global.extra-index-url \
    "https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple https://pypi.org/simple"

Then install with:

pip install data-api-client

For Projects Using pyproject.toml (uv)

Add to your pyproject.toml:

[tool.uv.sources]
data-api-client = { index = "gitlab" }

[[tool.uv.index]]
name = "gitlab"
url = "https://gitlab.com/api/v4/projects/75857874/packages/pypi/simple"

Then:

uv add data-api-client

Configuration

Alien Hosted (Default)

For Alien Hosted deployments, requests go through the platform proxy:

from data_api_client import ApiClient, Configuration

config = Configuration(
    host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
)

# Authenticate with an API token
config.api_key["Authorization"] = "Bearer oat_YOUR_API_TOKEN"
config.api_key_prefix["Authorization"] = "Bearer"

Get your API token from the Keys section in the platform dashboard. See Manage Your Organization for details.
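
Hard-coding a token in source files makes it easy to leak. The following is a minimal sketch of reading the cluster ID and token from the environment instead. The variable names ALIEN_CLUSTER_ID and ALIEN_API_TOKEN and the helper load_settings are hypothetical and not part of the SDK.

```python
import os

PROXY_URL_TEMPLATE = "https://api.alien.club/clusters/{cluster_id}/proxy"


def load_settings(env=None):
    """Build the host URL and Authorization value from environment variables.

    ALIEN_CLUSTER_ID and ALIEN_API_TOKEN are hypothetical names chosen for
    this sketch; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    required = ("ALIEN_CLUSTER_ID", "ALIEN_API_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": PROXY_URL_TEMPLATE.format(cluster_id=env["ALIEN_CLUSTER_ID"]),
        "authorization": f"Bearer {env['ALIEN_API_TOKEN']}",
    }
```

The returned "host" value can be passed to Configuration(host=...) and the "authorization" value assigned to config.api_key["Authorization"], in place of the literal strings used in the example above.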

On-Premise (Direct Access)

For on-premise deployments where you have direct access to the data cluster:

config = Configuration(
    host="https://your-data-cluster.internal.example.com"
)

# Authenticate with the cluster service API key
config.api_key["Authorization"] = "Bearer YOUR_SERVICE_API_KEY"
config.api_key_prefix["Authorization"] = "Bearer"

Basic Operations

List Datasets

from data_api_client import ApiClient, Configuration
from data_api_client.api.datasets_api import DatasetsApi

config = Configuration(
    host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
)
config.api_key["Authorization"] = "Bearer oat_YOUR_API_TOKEN"
config.api_key_prefix["Authorization"] = "Bearer"

with ApiClient(config) as client:
    datasets_api = DatasetsApi(client)

    datasets = datasets_api.list_datasets()
    for dataset in datasets:
        print(f"{dataset.id}: {dataset.name} ({dataset.entry_count} entries)")

Get a Dataset

with ApiClient(config) as client:
    datasets_api = DatasetsApi(client)

    dataset = datasets_api.get_dataset(id=1)
    print(f"Name: {dataset.name}")
    print(f"Schema: {dataset.schema}")
    print(f"Entries: {dataset.entry_count}")

Batch Get Entries

Efficiently retrieve multiple entries in a single request:

from data_api_client.api.entries_api import EntriesApi

with ApiClient(config) as client:
    entries_api = EntriesApi(client)

    response = entries_api.batch_get_entries(
        dataset_id=1,
        limit=50
    )

    print(f"Retrieved {len(response.entries)} entries")
    for entry in response.entries:
        print(f"  {entry.id}: {entry.name} ({entry.status})")

Upload a File

with ApiClient(config) as client:
    entries_api = EntriesApi(client)

    # Open the file in a context manager so the handle is closed after the upload
    with open("paper.pdf", "rb") as pdf:
        entry = entries_api.create_entry(
            dataset_id=1,
            name="Research Paper 2024",
            file=pdf
        )
    print(f"Created entry {entry.id}, status: {entry.status}")

Keyword Search

from data_api_client.api.search_api import SearchApi

with ApiClient(config) as client:
    search_api = SearchApi(client)

    results = search_api.keyword_search(
        query="protein folding",
        dataset_ids=[1, 2],
        limit=20
    )

    for hit in results.hits:
        print(f"{hit.name} (score: {hit.score})")

Vector Search (Semantic)

with ApiClient(config) as client:
    search_api = SearchApi(client)

    results = search_api.vector_search_chunks(
        query="novel approaches to protein structure prediction",
        dataset_ids=[1],
        score_threshold=0.7,
        limit=10
    )

    for chunk in results.results:
        print(f"Score: {chunk.score:.2f}")
        print(f"Text: {chunk.chunk_text[:150]}...")
        print()
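
To make the score_threshold and limit parameters concrete, here is a small local sketch of what threshold-and-limit filtering over similarity scores looks like. This guide does not state how the service computes scores, so cosine similarity is an assumption here, and the functions cosine_similarity and top_chunks are illustrative only, not part of the SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec, chunks, score_threshold=0.7, limit=10):
    """Score (text, vector) pairs against query_vec and keep the best matches.

    Returns a list of (score, text) tuples, highest score first, containing
    only scores >= score_threshold and at most `limit` items.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [item for item in scored if item[0] >= score_threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:limit]
```

Under this assumption, raising the threshold trades recall for precision: fewer chunks come back, but each is closer to the query.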

Async Usage

The client supports async operations for high-concurrency applications:

import asyncio
from data_api_client import ApiClient, Configuration
from data_api_client.api.datasets_api import DatasetsApi

async def list_all_datasets():
    config = Configuration(
        host="https://api.alien.club/clusters/YOUR_CLUSTER_ID/proxy"
    )
    config.api_key["Authorization"] = "Bearer oat_YOUR_API_TOKEN"
    config.api_key_prefix["Authorization"] = "Bearer"

    async with ApiClient(config) as client:
        datasets_api = DatasetsApi(client)
        datasets = await datasets_api.list_datasets()
        return datasets

datasets = asyncio.run(list_all_datasets())
for ds in datasets:
    print(ds.name)
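
When issuing many requests concurrently, it is usually worth capping how many are in flight at once so the cluster is not flooded. The sketch below uses only the standard library; gather_bounded is a hypothetical helper, not part of the SDK, and it works with any awaitable-returning callables.

```python
import asyncio


async def gather_bounded(coro_factories, max_concurrency=5):
    """Run zero-argument coroutine factories with a cap on concurrent calls.

    Results are returned in the same order as the factories were given.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(factory):
        # Wait for a free slot before starting the underlying call
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run(factory) for factory in coro_factories))
```

Inside the async with block of the example above, a call might look like await gather_bounded([lambda i=i: datasets_api.get_dataset(id=i) for i in ids]), assuming get_dataset is awaitable in async mode in the same way list_datasets is.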

Error Handling

All API errors raise ApiException with status code, reason, and response body:

from data_api_client.exceptions import ApiException

# datasets_api is a DatasetsApi instance created as in the examples above
try:
    dataset = datasets_api.get_dataset(id=999)
except ApiException as e:
    print(f"API Error: {e.status} {e.reason}")
    print(f"Response: {e.body}")

Common error codes:

Status  Meaning
401     Authentication failed: check your API token
403     Insufficient permissions: check your role and token abilities
404     Resource not found: verify the ID exists
422     Validation error: check request parameters
503     Data cluster unavailable: the cluster may be offline or unreachable
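
Of the codes above, 503 is the one most likely to be transient, so retrying it with a short backoff can be reasonable, while 401, 403, 404, and 422 should fail immediately. The helper below is a sketch under that assumption. call_with_retry is hypothetical and not part of the SDK, and it inspects only the status attribute of the raised exception.

```python
import time


def call_with_retry(func, retry_statuses=(503,), attempts=3, base_delay=1.0,
                    sleep=time.sleep):
    """Call func() and retry when the raised exception has a retryable status.

    Any exception exposing a `status` attribute is supported. Other errors,
    and the final failed attempt, are re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            status = getattr(exc, "status", None)
            if status not in retry_statuses or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A call such as call_with_retry(lambda: datasets_api.get_dataset(id=1)) would then retry only while the cluster reports itself unavailable.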

Next Steps