EntryCreateResponse

Response after creating an entry.

entry (object, required)

Created entry

name (string, required)

Entry name

slug (string, required)

URL-friendly slug

description (anyOf: string)

Entry description

status (string, EntryStatus enum, required)

Current processing status

Possible values: [pending, uploading, uploaded, processing, processed, error]

mime_type (string, required)

MIME type of primary file

id (integer, required)

Entry ID

manifest (object, required)

Entry manifest with all file locations and metadata

schema_version (string, required)

Schema version (e.g., 'v3')

dataset_schema_id (string, required)

Dataset schema identifier (e.g., 'arxiv_papers_ocr')

original (anyOf: object)

Original files section

files (object[])

List of original files

Array [

  key (string, required)

  S3 key for the file

  size (integer, required)

  File size in bytes

  mime_type (string, required)

  MIME type of the file

  hash (anyOf: string)

  SHA256 hash of the file

  created_at (anyOf: string<date-time>)

  File creation timestamp

  expires_at (anyOf: string<date-time>)

  File expiration timestamp (for processing artifacts)

]

metadata (object)

Original metadata (title, author, etc.)

property name* (any)

processing (anyOf: object)

Processing artifacts section

steps_completed (string[])

List of completed processing steps

files (object[])

Intermediate processing files

Array [

  key (string, required)

  S3 key for the file

  size (integer, required)

  File size in bytes

  mime_type (string, required)

  MIME type of the file

  hash (anyOf: string)

  SHA256 hash of the file

  created_at (anyOf: string<date-time>)

  File creation timestamp

  expires_at (anyOf: string<date-time>)

  File expiration timestamp (for processing artifacts)

]

processed (anyOf: object)

Processed content section

content_key (anyOf: string)

S3 key for main content.json file

size (anyOf: integer)

Size of content.json in bytes

fields_summary (object)

Quick stats for UI (text_length, chunk_count, etc.)

property name* (any)

completed_at (anyOf: string<date-time>)

Processing completion timestamp

additional_files (anyOf: object[])

Additional processed files (figures, etc.)

Array [

  key (string, required)

  S3 key for the file

  size (integer, required)

  File size in bytes

  mime_type (string, required)

  MIME type of the file

  hash (anyOf: string)

  SHA256 hash of the file

  created_at (anyOf: string<date-time>)

  File creation timestamp

  expires_at (anyOf: string<date-time>)

  File expiration timestamp (for processing artifacts)

]

full_manifest_key (anyOf: string)

S3 key if manifest >5KB (stored externally)

storage_path (string, required)

Base storage path (e.g., 'datasets/123/entries/456')

primary_file_key (anyOf: string)

S3 key of primary original file (cached)

processed_content_key (anyOf: string)

S3 key of processed content (cached)

file_size_bytes (anyOf: integer)

Total size of all files in bytes (cached)

dataset_id (integer, required)

Parent dataset ID

created_at (string<date-time>, required)

Creation timestamp

updated_at (string<date-time>, required)

Last update timestamp

processing_completed_at (anyOf: string<date-time>)

Processing completion timestamp

version (integer)

Version number for optimistic locking

Default value: 1

upload_url (anyOf: string)

Pre-signed upload URL (if applicable)

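
The status values and the optional upload_url listed above can be handled on the client side roughly as follows. This is a minimal Python sketch, not part of the API itself: the helper names (`parse_status`, `is_terminal`, `pending_upload_url`) are hypothetical, and two points are assumptions not stated in the schema: that "processed" and "error" are the terminal states, and that the upload URL is meant to be used while the entry is still "pending".

```python
from enum import Enum
from typing import Any, Mapping, Optional


class EntryStatus(str, Enum):
    """The six status values listed for the `status` field."""
    PENDING = "pending"
    UPLOADING = "uploading"
    UPLOADED = "uploaded"
    PROCESSING = "processing"
    PROCESSED = "processed"
    ERROR = "error"


# Assumption: these two values mean no further processing will happen.
TERMINAL_STATUSES = {EntryStatus.PROCESSED, EntryStatus.ERROR}


def parse_status(response: Mapping[str, Any]) -> EntryStatus:
    """Read entry.status from a decoded EntryCreateResponse body."""
    return EntryStatus(response["entry"]["status"])


def is_terminal(status: EntryStatus) -> bool:
    """True when the status is one of the assumed terminal states."""
    return status in TERMINAL_STATUSES


def pending_upload_url(response: Mapping[str, Any]) -> Optional[str]:
    """Return upload_url if present and the entry is still pending.

    upload_url is optional in the schema, so a missing key or a null
    value both yield None here.
    """
    url = response.get("upload_url")
    if url and parse_status(response) is EntryStatus.PENDING:
        return url
    return None
```

An unknown status string raises `ValueError` in `parse_status`, which makes it visible if the server starts returning a value not listed here.
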
EntryCreateResponse

{
  "entry": {
    "name": "string",
    "slug": "string",
    "description": "string",
    "status": "pending",
    "mime_type": "string",
    "id": 0,
    "manifest": {
      "dataset_schema_id": "arxiv_papers_ocr",
      "original": {
        "files": [
          {
            "created_at": "2025-11-04T10:00:00Z",
            "hash": "sha256:abc123...",
            "key": "datasets/123/entries/456/original/paper.pdf",
            "mime_type": "application/pdf",
            "size": 2048000
          },
          {
            "key": "datasets/123/entries/456/original/thumbnail.jpg",
            "mime_type": "image/jpeg",
            "size": 50000
          }
        ],
        "metadata": {
          "arxiv_id": "2024.12345",
          "authors": [
            "John Doe",
            "Jane Smith"
          ],
          "published_date": "2024-11-01",
          "title": "Deep Learning for Computer Vision"
        }
      },
      "processed": {
        "additional_files": [
          {
            "key": "datasets/123/entries/456/processed/figures/fig_001.png",
            "mime_type": "image/png",
            "size": 80000
          }
        ],
        "completed_at": "2025-11-04T10:30:00Z",
        "content_key": "datasets/123/entries/456/processed/content.json",
        "fields_summary": {
          "chunk_count": 120,
          "figure_count": 8,
          "text_length": 45000
        },
        "size": 150000
      },
      "processing": {
        "files": [
          {
            "expires_at": "2025-12-04T10:00:00Z",
            "key": "datasets/123/entries/456/processing/embeddings.npy",
            "mime_type": "application/octet-stream",
            "size": 200000
          }
        ],
        "steps_completed": [
          "ocr",
          "chunking",
          "embedding"
        ]
      },
      "schema_version": "v3"
    },
    "storage_path": "string",
    "primary_file_key": "string",
    "processed_content_key": "string",
    "file_size_bytes": 0,
    "dataset_id": 0,
    "created_at": "2024-07-29T15:51:28.071Z",
    "updated_at": "2024-07-29T15:51:28.071Z",
    "processing_completed_at": "2024-07-29T15:51:28.071Z",
    "version": 1
  },
  "upload_url": "string"
}
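
To illustrate how the manifest in the example above is laid out, the following Python sketch walks the three file lists (original.files, processing.files, processed.additional_files) and sums their sizes. The function names are hypothetical. Because each section is optional, missing or null sections are skipped. Whether the server's cached file_size_bytes is computed this way (for example, whether it also counts content.json) is not stated in the schema, so the total here is only the sum of the listed file objects.

```python
from typing import Any, Iterator, Mapping

# (section name, list field) pairs that hold file objects in the manifest.
FILE_LISTS = (
    ("original", "files"),
    ("processing", "files"),
    ("processed", "additional_files"),
)


def iter_manifest_files(manifest: Mapping[str, Any]) -> Iterator[Mapping[str, Any]]:
    """Yield every file object listed in the manifest's sections."""
    for section_name, list_name in FILE_LISTS:
        # Sections and their lists may be absent or null.
        section = manifest.get(section_name) or {}
        for file_obj in section.get(list_name) or []:
            yield file_obj


def total_listed_size(manifest: Mapping[str, Any]) -> int:
    """Sum the required `size` field (bytes) over all listed files."""
    return sum(file_obj["size"] for file_obj in iter_manifest_files(manifest))
```

Given a decoded response `body`, the manifest is at `body["entry"]["manifest"]`, so the call would be `total_listed_size(body["entry"]["manifest"])`.
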