Vector search for similar chunks with auto-embedding (PRIMARY)

POST /api/v1/vector/chunks

Primary vector search endpoint. Returns matching text chunks directly from Qdrant.

This is the FAST search method that returns chunks without fetching full entry content. Use this when you want quick results with snippet-level granularity.

Auto-Embedding Support:

Provide query (text) to have the embedding generated automatically
OR provide query_vector (a pre-computed embedding)
If both are provided, query_vector takes precedence
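
A minimal sketch of how a client might assemble the request body under the rules above. Only the field names query and query_vector come from this page; the helper name build_chunk_search_payload is hypothetical, and any other request fields the endpoint may accept are not shown because they are not documented here.

```python
from typing import Any, Dict, Optional, Sequence


def build_chunk_search_payload(
    query: Optional[str] = None,
    query_vector: Optional[Sequence[float]] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable body holding either `query` or `query_vector`.

    Hypothetical client-side helper; it is not part of the documented API.
    """
    if query_vector is not None:
        # Mirror the documented precedence: a supplied vector wins over text,
        # so the text is dropped instead of sending both fields.
        return {"query_vector": [float(x) for x in query_vector]}
    if query:
        return {"query": query}
    raise ValueError("Provide either query or query_vector")
```

Sending only one of the two fields keeps the request unambiguous, even though the server is documented to resolve the conflict itself.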

Flow:

If query is provided, an embedding is generated using the configured provider
The query vector is sent to Qdrant
Matching chunks are returned with scores and metadata
No database or S3 lookups are performed (very fast!)
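
The flow above is triggered by a single POST. The following sketch uses only the Python standard library. The base URL, the JSON content type, the absence of authentication headers, and the function names are all assumptions; the response schema is not documented on this page, so the decoded JSON is returned as-is.

```python
import json
import urllib.request
from typing import Any, Dict

# Hypothetical base URL; replace with the address of your deployment.
BASE_URL = "http://localhost:8000"


def make_chunk_search_request(
    payload: Dict[str, Any], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /api/v1/vector/chunks."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/vector/chunks",
        data=body,
        # Assumption: the endpoint accepts a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_chunks(
    payload: Dict[str, Any], base_url: str = BASE_URL, timeout: float = 10.0
) -> Any:
    """Send the request and return the decoded JSON response unchanged."""
    request = make_chunk_search_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, search_chunks({"query": "how do I rotate API keys?"}) would exercise the auto-embedding path, while passing {"query_vector": [...]} would skip embedding generation.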

Performance:

With query_vector: < 100ms
With query (auto-embed): 500ms-2s (includes embedding generation)
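
Because the query_vector path avoids embedding generation, a client that repeats the same queries could cache embeddings locally and send them as query_vector. The sketch below assumes the caller has its own embed function that produces vectors compatible with the server's configured provider; that function and the helper name make_cached_payload_builder are hypothetical and not part of this API.

```python
from typing import Callable, Dict, List, Tuple


def make_cached_payload_builder(
    embed: Callable[[str], List[float]],
) -> Callable[[str], Dict[str, List[float]]]:
    """Return a function that maps query text to a `query_vector` payload.

    `embed` is a caller-supplied (hypothetical) embedding function. Its
    vectors must match the model used to index the chunks, otherwise the
    similarity scores are meaningless.
    """
    cache: Dict[str, Tuple[float, ...]] = {}

    def build(query: str) -> Dict[str, List[float]]:
        if query not in cache:
            # Embed once per distinct query text; later calls reuse the vector.
            cache[query] = tuple(embed(query))
        return {"query_vector": list(cache[query])}

    return build
```

The cache here is unbounded and lives in process memory, which is enough to illustrate the idea but would need an eviction policy in a long-running service.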

Use cases:

Quick similarity search
Preview/snippet display
Finding relevant passages
High-throughput scenarios

Request

Responses

Successful Response
