
Vector search for similar chunks with auto-embedding (PRIMARY)

POST /api/v1/vector/chunks

Primary vector search endpoint. Returns matching text chunks directly from Qdrant.

This is the fast search method: it returns chunks without fetching full entry content. Use it when you want quick results at snippet-level granularity.

Auto-Embedding Support:

  • Provide query (text) to have the embedding generated automatically
  • OR provide query_vector (a pre-computed embedding)
  • If both are provided, query_vector takes precedence
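
The two input modes can be illustrated with a small client-side sketch. Only the field names query and query_vector and the path /api/v1/vector/chunks come from this page; the base URL, the absence of authentication headers, and the helper names build_payload and search_chunks are assumptions made for illustration, and the real request schema may include further fields.

```python
import json
import urllib.request
from typing import Optional, Sequence

ENDPOINT_PATH = "/api/v1/vector/chunks"  # path taken from this page


def build_payload(query: Optional[str] = None,
                  query_vector: Optional[Sequence[float]] = None) -> dict:
    """Build a request body that mirrors the documented precedence rule.

    Hypothetical helper: when both inputs are given, only query_vector is
    sent, because the endpoint would ignore query in that case anyway.
    """
    if query_vector is not None:
        return {"query_vector": list(query_vector)}
    if query:
        return {"query": query}
    raise ValueError("provide either query or query_vector")


def search_chunks(base_url: str, payload: dict, timeout: float = 5.0) -> dict:
    """POST the payload as JSON and return the decoded response.

    base_url is an assumption (for example "http://localhost:8000"); add
    whatever authentication headers the deployment requires.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Sending only one of the two fields keeps the request small and makes the caller's intent explicit, rather than relying on the server-side precedence rule.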

Flow:

  1. If query is provided (and query_vector is not), an embedding is generated using the configured provider
  2. The query vector is sent to Qdrant
  3. Matching chunks are returned with scores and metadata
  4. No database or S3 lookups are performed (very fast!)
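
The flow above can be sketched as a single function. This is not the service's actual implementation: embed and search are hypothetical callables standing in for the configured embedding provider and the Qdrant query, and handle_chunk_search is an illustrative name.

```python
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def handle_chunk_search(
    query: Optional[str],
    query_vector: Optional[Vector],
    embed: Callable[[str], Vector],          # stand-in for the embedding provider
    search: Callable[[Vector], List[dict]],  # stand-in for the Qdrant query
) -> List[dict]:
    """Resolve the query vector, then run the vector search.

    Step 1: embed the text only when no pre-computed vector was supplied.
    Step 2: pass the vector to the search backend.
    Step 3: return the matching chunks as-is (scores and metadata included).
    Step 4: nothing else is fetched, so no database or S3 lookup happens here.
    """
    if query_vector is None:
        if not query:
            raise ValueError("either query or query_vector is required")
        query_vector = embed(query)
    return search(query_vector)
```

Because query_vector is checked first, a request that carries both fields never triggers embedding generation, which matches the precedence rule stated earlier.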

Performance:

  • With query_vector: < 100ms
  • With query (auto-embed): 500ms-2s (includes embedding generation)
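
Given the gap between the two paths, a client that repeats the same queries may benefit from computing each embedding once and sending it as query_vector afterwards. The sketch below assumes the caller has its own embed function producing vectors from the same model used to index the chunks; that function and the make_cached_embedder helper are hypothetical, not part of this API.

```python
from functools import lru_cache
from typing import Callable, Sequence, Tuple


def make_cached_embedder(
    embed: Callable[[str], Sequence[float]],
    maxsize: int = 1024,
) -> Callable[[str], Tuple[float, ...]]:
    """Wrap an embedding function so repeated texts are embedded only once.

    Results are stored as tuples so cached vectors cannot be mutated by
    callers. The cache is keyed on the exact query string.
    """

    @lru_cache(maxsize=maxsize)
    def cached(text: str) -> Tuple[float, ...]:
        return tuple(embed(text))

    return cached
```

A cached vector can then be passed as query_vector, which avoids the embedding step on repeat requests. Whether this pays off depends on how often queries recur in practice.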

Use cases:

  • Quick similarity search
  • Preview/snippet display
  • Finding relevant passages
  • High-throughput scenarios

Request

Responses

Successful Response