
deepSearch

Runs a multi-agent research workflow over one or more RAG datasets. Builds a LangGraph state machine composed of a researcher agent, a tools node (wrapping vectorSearch and rerank), an advisor agent that scores research quality and identifies evidence gaps, and a synthesize node that extracts structured facts. The graph iterates — researcher → tools → advisor → researcher — until the advisor scores quality ≥ 0.9 with no evidence gaps, or max_research_iterations is reached.
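The iteration rule above can be sketched as a small predicate. Names here are illustrative (the real loop condition lives inside the node's LangGraph wiring), but the thresholds come straight from the description: continue cycling until quality ≥ 0.9 with no evidence gaps, or the iteration cap is hit.

```python
QUALITY_THRESHOLD = 0.9  # advisor score needed to stop early

def should_continue(quality_score, evidence_gaps, iteration, max_research_iterations):
    """Return True while another researcher -> tools -> advisor cycle is needed.

    Hypothetical sketch of the loop condition described in the docs.
    """
    if iteration >= max_research_iterations:
        return False  # hard cap reached
    if quality_score >= QUALITY_THRESHOLD and not evidence_gaps:
        return False  # advisor is satisfied: high quality, no remaining gaps
    return True
```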

Dataset resolution: if collection_id is provided and dataset_ids is not, the node calls getCollection to resolve dataset IDs. If neither is provided, a ValueError is raised.
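The resolution order can be expressed as a short helper. This is a sketch, not the node's actual code: `get_collection` stands in for the real `getCollection` call, and the shape of its return value (a dict with a `datasets` list) is an assumption.

```python
def resolve_dataset_ids(dataset_ids=None, collection_id=None, get_collection=None):
    """Resolve dataset IDs in the documented priority order:
    explicit dataset_ids first, then collection_id, else ValueError."""
    if dataset_ids:
        return list(dataset_ids)  # dataset_ids takes priority over collection_id
    if collection_id is not None:
        # Assumed return shape; the real getCollection response may differ.
        collection = get_collection(collection_id)
        return [d["id"] for d in collection["datasets"]]
    raise ValueError("Either dataset_ids or collection_id must be provided")
```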

Parameters

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | string | Yes | | Research question. Minimum length: 1 |
| dataset_ids | integer[]? | No | null | Dataset IDs to search. Takes priority over collection_id |
| collection_id | integer? | No | null | Collection to resolve dataset IDs from, when dataset_ids is not provided |
| max_research_iterations | integer | No | 3 | Maximum iteration cycles. Range: 1–10 |
| max_concurrent_agents | integer | No | 3 | Maximum concurrent research agents. Range: 1–10 |
| allow_clarification | boolean | No | false | Ask clarifying questions before research |
| max_tool_calls_per_iteration | integer | No | 10 | Maximum tool calls per iteration. Range: 1–30 |
| research_model | string | No | "gpt-4o-mini" | LLM model for researcher and advisor agents |
| research_provider | string? | No | null | LLM provider. Auto-detected from model when not set |
| summarization_model | string? | No | null | LLM model for summarization. Falls back to research_model |
| final_report_model | string? | No | null | LLM model for the final report. Falls back to research_model |
| search_k | integer | No | 10 | Documents retrieved per RAG search call. Range: 1–50 |
| embedding_model | string | No | "BAAI/bge-m3" | Embedding model for vector search |
| embedding_provider | string | No | "runpod" | Embedding provider |
| max_documents_per_agent | integer | No | 20 | Maximum documents each agent can analyze. Range: 1–100 |
| report_format | string | No | "detailed" | One of: summary, detailed, comprehensive |
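The integer parameters above all carry range constraints. A minimal validator sketch, with names and bounds taken from the table (the validator itself is hypothetical, not part of the node's API):

```python
# Ranges copied from the parameter table above.
INT_RANGES = {
    "max_research_iterations": (1, 10),
    "max_concurrent_agents": (1, 10),
    "max_tool_calls_per_iteration": (1, 30),
    "search_k": (1, 50),
    "max_documents_per_agent": (1, 100),
}

def validate_params(params):
    """Check the documented constraints: query is required and non-empty,
    integer parameters must fall within their ranges."""
    if not params.get("query"):
        raise ValueError("query is required and must be non-empty")
    for name, (lo, hi) in INT_RANGES.items():
        value = params.get(name)
        if value is not None and not lo <= value <= hi:
            raise ValueError(f"{name} must be in [{lo}, {hi}], got {value}")
    return params
```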

Output

| Field | Type | Description |
| --- | --- | --- |
| extracted_facts | ExtractedFact[] | Structured facts extracted by the synthesis LLM |
| citations | Citation[] | Unique citations deduplicated by DOI |
| research_summary | string | Free-text summary produced by the synthesis LLM |
| evidence_gaps | string[] | Gaps in evidence identified by the advisor |
| total_documents_analyzed | integer | Count of unique documents analyzed across all iterations |
| total_iterations | integer | Number of researcher→tools→advisor cycles completed |
| research_quality_score | float | Final quality score assigned by the advisor (0.0–1.0) |
| execution_time_seconds | float | Wall-clock time in seconds |

Each ExtractedFact:

| Field | Type | Description |
| --- | --- | --- |
| fact | string | Extracted fact |
| confidence_score | float | Confidence (0.0–1.0) |
| supporting_citations | integer[] | Citation id values supporting this fact |
| relevance_score | float | Relevance to the query (0.0–1.0) |

Each Citation:

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Sequential citation ID |
| title | string | Paper title |
| doi | string | DOI |
| subject_area | string | Subject area or field |
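The output describes citations as "deduplicated by DOI" with sequential IDs. A sketch of what that pass could look like; note that reassigning IDs after deduplication is an illustrative assumption, not a documented detail of the node:

```python
def dedupe_citations(citations):
    """Keep the first citation per DOI and assign sequential ids.

    Illustrative sketch of the documented dedup-by-DOI behavior;
    the id-reassignment step is an assumption.
    """
    seen_dois = set()
    unique = []
    for citation in citations:
        if citation["doi"] in seen_dois:
            continue  # later duplicates of the same DOI are dropped
        seen_dois.add(citation["doi"])
        unique.append({**citation, "id": len(unique) + 1})
    return unique
```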

Example

```json
{
  "id": "research",
  "type": "deepSearch",
  "data": {
    "label": "Deep Search",
    "isExecuted": false,
    "handles": ["inputs", "outputs"],
    "schema": {},
    "params": {
      "query": { "value": "{{ $input.question }}", "isExpression": true, "isAttachedToInputNode": false },
      "collection_id": { "value": 42, "isExpression": false, "isAttachedToInputNode": false },
      "max_research_iterations": { "value": 4, "isExpression": false, "isAttachedToInputNode": false },
      "research_model": { "value": "gpt-4o-mini", "isExpression": false, "isAttachedToInputNode": false },
      "search_k": { "value": 25, "isExpression": false, "isAttachedToInputNode": false },
      "report_format": { "value": "detailed", "isExpression": false, "isAttachedToInputNode": false }
    },
    "inputs": [], "outputs": [], "errors": []
  },
  "position": { "x": 0, "y": 0 },
  "isSelected": false,
  "isDragging": false
}
```
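Each entry in `params` wraps a raw value with an `isExpression` flag; expression values such as `"{{ $input.question }}"` are evaluated against workflow inputs before the node runs. A hypothetical helper showing how such a params map could be read (the evaluator is a stand-in, not the engine's real API):

```python
def read_params(node, evaluate_expression):
    """Resolve a node's params map: evaluate expression-flagged values,
    pass literal values through unchanged. Sketch only."""
    resolved = {}
    for name, spec in node["data"]["params"].items():
        if spec.get("isExpression"):
            resolved[name] = evaluate_expression(spec["value"])
        else:
            resolved[name] = spec["value"]
    return resolved
```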