# Retrieval

## Retrieve

**post** `/api/v1/retrieval/retrieve`

Retrieve relevant chunks via hybrid search (vector + full-text), with filtering on built-in or user-defined metadata.

### Query Parameters

- `organization_id: optional string`
- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `index_id: string` ID of the index to retrieve against.
- `query: string` Natural-language query to retrieve relevant chunks.
- `custom_filters: optional map[object { operator, value } or array of object { operator, value } ]` Filters on user-defined metadata fields.
  - `FilterTypeUnionStrIntBoolFloat object { operator, value }`
    - `operator: "eq" or "ne" or "gt" or 5 more`
      - `"eq"`
      - `"ne"`
      - `"gt"`
      - `"lt"`
      - `"gte"`
      - `"lte"`
      - `"in"`
      - `"nin"`
    - `value: string or boolean or number or array of string or boolean or number`
      - `string`
      - `boolean`
      - `number`
      - `array of string or boolean or number`
        - `string`
        - `boolean`
        - `number`
  - `array of object { operator, value }`
    - `operator: "eq" or "ne" or "gt" or 5 more`
      - `"eq"`
      - `"ne"`
      - `"gt"`
      - `"lt"`
      - `"gte"`
      - `"lte"`
      - `"in"`
      - `"nin"`
    - `value: number or array of number`
      - `number`
      - `array of number`
- `full_text_pipeline_weight: optional number` Weight of the full-text search pipeline (0-1).
- `num_candidates: optional number` Number of candidates for approximate nearest neighbor search.
- `rerank: optional object { enabled, top_n }` Reranking configuration applied after hybrid search. Enabled by default.
  - `enabled: optional boolean` Set to false to disable reranking.
  - `top_n: optional number` Number of results to return after reranking.
- `score_threshold: optional number` Minimum score threshold for returned results.
- `static_filters: optional object { parsed_directory_file_id }` Filters on built-in document fields (page range, chunk index, etc.).
  - `parsed_directory_file_id: optional object { operator, value }`
    - `operator: "eq" or "ne" or "gt" or 5 more`
      - `"eq"`
      - `"ne"`
      - `"gt"`
      - `"lt"`
      - `"gte"`
      - `"lte"`
      - `"in"`
      - `"nin"`
    - `value: string or array of string`
      - `string`
      - `array of string`
- `top_k: optional number` Maximum number of results to return.
- `vector_pipeline_weight: optional number` Weight of the vector search pipeline (0-1).

### Returns

- `results: array of object { content, metadata, rerank_score, 2 more }` Ordered list of retrieved chunks.
  - `content: string` Text content of the retrieved chunk.
  - `metadata: optional map[string or number or number or 3 more]` User-defined metadata associated with the chunk.
    - `string`
    - `number`
    - `number`
    - `boolean`
    - `unknown`
    - `MetadataListValue = array of string`
  - `rerank_score: optional number` Relevance score from the reranker, if reranking was applied.
  - `score: optional number` Hybrid search relevance score.
  - `static_fields: optional object { attachments, chunk_end_char, chunk_index, 5 more }` Built-in fields stored for every exported chunk.
    - `attachments: optional array of object { attachment_name, source_id, type }` Attachments associated with the chunk
      - `attachment_name: string` Attachment-relative path, e.g. 'screenshots/page_7.jpg'.
      - `source_id: string` File ID to pass as source_id when fetching the attachment.
      - `type: string` Attachment kind, e.g. 'screenshot', 'items'.
    - `chunk_end_char: optional number` End character offset of the chunk.
    - `chunk_index: optional number` Index of the chunk within the file.
    - `chunk_start_char: optional number` Start character offset of the chunk.
    - `chunk_token_count: optional number` Token count of the chunk.
    - `page_range_end: optional number` Last page number covered by this chunk.
    - `page_range_start: optional number` First page number covered by this chunk.
    - `parsed_directory_file_id: optional string` ID of the parsed file.
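
The body parameters above combine in several ways, so a small request builder can help. The following is a minimal Python sketch, not an official client: `make_filter`, `build_retrieve_body`, and `post_json` are hypothetical helper names, `year` is a made-up user-defined metadata field, and the host is taken from the curl examples in this reference. How the service combines several filters on one field is not stated here, so treat the range filter as an assumption to verify.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.cloud.llamaindex.ai"  # host used by the curl examples

VALID_OPERATORS = {"eq", "ne", "gt", "lt", "gte", "lte", "in", "nin"}


def make_filter(operator, value):
    """Build one { operator, value } filter object, checking the operator."""
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return {"operator": operator, "value": value}


def build_retrieve_body(index_id, query, *, top_k=None, custom_filters=None,
                        file_ids=None, rerank_top_n=None):
    """Assemble a request body, leaving out optional fields that are unset."""
    body = {"index_id": index_id, "query": query}
    if top_k is not None:
        body["top_k"] = top_k
    if custom_filters:
        body["custom_filters"] = custom_filters
    if file_ids:
        # static_filters.parsed_directory_file_id accepts a string or a list.
        body["static_filters"] = {
            "parsed_directory_file_id": make_filter("in", list(file_ids))
        }
    if rerank_top_n is not None:
        body["rerank"] = {"enabled": True, "top_n": rerank_top_n}
    return body


def post_json(path, body, api_key=None):
    """POST a JSON body with bearer auth and return the decoded response."""
    api_key = api_key or os.environ["LLAMA_CLOUD_API_KEY"]
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_retrieve_body(
    "idx-abc123",
    "What are the key findings?",
    top_k=5,
    # "year" is a hypothetical user-defined metadata field.
    custom_filters={"year": [make_filter("gte", 2020), make_filter("lt", 2024)]},
    file_ids=["file_id"],
    rerank_top_n=3,
)
print(json.dumps(body, indent=2))
```

With a valid key in `LLAMA_CLOUD_API_KEY`, the body could then be sent with `post_json("/api/v1/retrieval/retrieve", body)`. The sketch only prints the body so that it runs without network access.
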

### Example

```http
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/retrieve \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "index_id": "idx-abc123",
          "query": "What are the key findings?"
        }'
```

#### Response

```json
{
  "results": [
    {
      "content": "content",
      "metadata": {
        "foo": "string"
      },
      "rerank_score": 0,
      "score": 0,
      "static_fields": {
        "attachments": [
          {
            "attachment_name": "attachment_name",
            "source_id": "source_id",
            "type": "type"
          }
        ],
        "chunk_end_char": 0,
        "chunk_index": 0,
        "chunk_start_char": 0,
        "chunk_token_count": 0,
        "page_range_end": 0,
        "page_range_start": 0,
        "parsed_directory_file_id": "parsed_directory_file_id"
      }
    }
  ]
}
```

## Find Files

**post** `/api/v1/retrieval/files/find`

Search for files by name.

### Query Parameters

- `organization_id: optional string`
- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `index_id: string` ID of the index to search within.
- `file_name: optional string` Exact file name to match.
- `file_name_contains: optional string` Substring match on file name (case-insensitive).
- `page_size: optional number` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.
- `page_token: optional string` A page token, received from a previous list call. Provide this to retrieve the subsequent page.

### Returns

- `items: array of object { file_id, file_name }` The list of items.
  - `file_id: string` ID of the file.
  - `file_name: string` Display name of the file.
- `next_page_token: optional string` A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.
- `total_size: optional number` The total number of items available. This is only populated when specifically requested.
  The value may be an estimate and can be used for display purposes only.

### Example

```http
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/find \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "index_id": "idx-abc123"
        }'
```

#### Response

```json
{
  "items": [
    {
      "file_id": "file_id",
      "file_name": "file_name"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Grep File

**post** `/api/v1/retrieval/files/grep`

Grep within a file's parsed content using a regex pattern.

### Query Parameters

- `organization_id: optional string`
- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `file_id: string` ID of the file to grep.
- `index_id: string` ID of the index the file belongs to.
- `pattern: string` Regex pattern to search for.
- `context_chars: optional number` Number of characters of context to include before and after the matched pattern in the content field of the response
- `page_size: optional number` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.
- `page_token: optional string` A page token, received from a previous list call. Provide this to retrieve the subsequent page.

### Returns

- `items: array of object { content, end_char, start_char }` The list of items.
  - `content: string` Matched text content.
  - `end_char: number` End character offset of the match.
  - `start_char: number` Start character offset of the match.
- `next_page_token: optional string` A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.
- `total_size: optional number` The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.
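
Both Find Files and Grep File page their results through `page_token` and `next_page_token`. The loop below is a hedged Python sketch of following those tokens until the field is omitted. `iter_items` and `fake_fetch` are hypothetical names, and the canned pages stand in for real HTTP responses so the example runs offline.

```python
def iter_items(fetch_page, body):
    """Yield every item across pages by following next_page_token."""
    request = dict(body)  # copy so the caller's dict is not mutated
    while True:
        page = fetch_page(request)
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:
            break  # field omitted: no subsequent pages
        request["page_token"] = token


# Canned responses keyed by the page_token of the incoming request.
# They stand in for the real HTTP call; the values are made up.
FAKE_PAGES = {
    None: {
        "items": [{"content": "revenue", "start_char": 10, "end_char": 17}],
        "next_page_token": "page-2",
    },
    "page-2": {
        "items": [{"content": "profit", "start_char": 40, "end_char": 46}],
    },
}


def fake_fetch(request):
    """Return the canned page matching the request's page_token."""
    return FAKE_PAGES[request.get("page_token")]


grep_body = {
    "file_id": "file_id",
    "index_id": "idx-abc123",
    "pattern": "revenue|profit",
    "page_size": 1,
}
matches = list(iter_items(fake_fetch, grep_body))
for match in matches:
    print(match["start_char"], match["end_char"], match["content"])
```

In real use, `fetch_page` would be a function that posts the request body to `/api/v1/retrieval/files/grep` (or `/api/v1/retrieval/files/find`) and returns the decoded JSON.
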

### Example

```http
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/grep \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "file_id": "file_id",
          "index_id": "idx-abc123",
          "pattern": "revenue|profit"
        }'
```

#### Response

```json
{
  "items": [
    {
      "content": "content",
      "end_char": 0,
      "start_char": 0
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Read File

**post** `/api/v1/retrieval/files/read`

Read the parsed text content of a specific file.

### Query Parameters

- `organization_id: optional string`
- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `file_id: string` ID of the file to read.
- `index_id: string` ID of the index the file belongs to.
- `max_length: optional number` Maximum number of characters to read from the offset.
- `offset: optional number` Starting character offset.

### Returns

- `content: string` Parsed text content of the file.

### Example

```http
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/read \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "file_id": "file_id",
          "index_id": "idx-abc123"
        }'
```

#### Response

```json
{
  "content": "content"
}
```

## Domain Types

### Retrieval Retrieve Response

- `RetrievalRetrieveResponse object { results }` Response containing retrieval results.
  - `results: array of object { content, metadata, rerank_score, 2 more }` Ordered list of retrieved chunks.
    - `content: string` Text content of the retrieved chunk.
    - `metadata: optional map[string or number or number or 3 more]` User-defined metadata associated with the chunk.
      - `string`
      - `number`
      - `number`
      - `boolean`
      - `unknown`
      - `MetadataListValue = array of string`
    - `rerank_score: optional number` Relevance score from the reranker, if reranking was applied.
    - `score: optional number` Hybrid search relevance score.
    - `static_fields: optional object { attachments, chunk_end_char, chunk_index, 5 more }` Built-in fields stored for every exported chunk.
      - `attachments: optional array of object { attachment_name, source_id, type }` Attachments associated with the chunk
        - `attachment_name: string` Attachment-relative path, e.g. 'screenshots/page_7.jpg'.
        - `source_id: string` File ID to pass as source_id when fetching the attachment.
        - `type: string` Attachment kind, e.g. 'screenshot', 'items'.
      - `chunk_end_char: optional number` End character offset of the chunk.
      - `chunk_index: optional number` Index of the chunk within the file.
      - `chunk_start_char: optional number` Start character offset of the chunk.
      - `chunk_token_count: optional number` Token count of the chunk.
      - `page_range_end: optional number` Last page number covered by this chunk.
      - `page_range_start: optional number` First page number covered by this chunk.
      - `parsed_directory_file_id: optional string` ID of the parsed file.

### Retrieval Find Response

- `RetrievalFindResponse object { file_id, file_name }` A file returned by find.
  - `file_id: string` ID of the file.
  - `file_name: string` Display name of the file.

### Retrieval Grep Response

- `RetrievalGrepResponse object { content, end_char, start_char }` A single grep match within a file.
  - `content: string` Matched text content.
  - `end_char: number` End character offset of the match.
  - `start_char: number` Start character offset of the match.

### Retrieval Read Response

- `RetrievalReadResponse object { content }` File read result.
  - `content: string` Parsed text content of the file.
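
Because most fields of a retrieved chunk are optional, client code usually needs defaults when decoding a response. The Python sketch below shows one possible way to map a `RetrievalRetrieveResponse` payload onto a dataclass. `RetrievedChunk`, `parse_results`, and the `best_score` convenience property are hypothetical and not part of the API, and the sample payload is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RetrievedChunk:
    """One entry of `results`, with defaults for the optional fields."""
    content: str
    metadata: dict = field(default_factory=dict)
    score: Optional[float] = None
    rerank_score: Optional[float] = None
    static_fields: dict = field(default_factory=dict)

    @property
    def best_score(self):
        """Prefer the reranker score when present, else the hybrid score.

        This preference is a convenience chosen for the sketch.
        """
        if self.rerank_score is not None:
            return self.rerank_score
        return self.score


def parse_results(payload):
    """Convert a decoded response dict into a list of RetrievedChunk."""
    chunks = []
    for raw in payload.get("results", []):
        chunks.append(
            RetrievedChunk(
                content=raw["content"],
                metadata=raw.get("metadata") or {},
                score=raw.get("score"),
                rerank_score=raw.get("rerank_score"),
                static_fields=raw.get("static_fields") or {},
            )
        )
    return chunks


# Invented payload shaped like the response documented above.
sample = {
    "results": [
        {
            "content": "first chunk",
            "score": 0.4,
            "rerank_score": 0.9,
            "static_fields": {"page_range_start": 7, "page_range_end": 8},
        },
        {"content": "second chunk", "score": 0.6},
    ]
}
chunks = parse_results(sample)
for chunk in chunks:
    print(chunk.best_score, chunk.static_fields.get("page_range_start"), chunk.content)
```

Keeping `metadata` and `static_fields` as plain dicts avoids guessing at fields the service may add later; only `content`, the one required field, is accessed strictly.
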