# Retrieval

## Retrieve

**post** `/api/v1/retrieval/retrieve`

Retrieve relevant chunks via hybrid search (vector + full-text), with filtering on built-in or user-defined metadata.

### Query Parameters

- `organization_id: optional string`

- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `index_id: string`

  ID of the index to retrieve against.

- `query: string`

  Natural-language query to retrieve relevant chunks.

- `custom_filters: optional map[object { operator, value } or array of object { operator, value }]`

  Filters on user-defined metadata fields.

  - `FilterTypeUnionStrIntBoolFloat object { operator, value }`

    - `operator: "eq" or "ne" or "gt" or 5 more`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: string or boolean or number or array of string or boolean or number`

      - `string`

      - `boolean`

      - `number`

      - `array of string or boolean or number`

        - `string`

        - `boolean`

        - `number`

  - `array of object { operator, value }`

    - `operator: "eq" or "ne" or "gt" or 5 more`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: number or array of number`

      - `number`

      - `array of number`

- `full_text_pipeline_weight: optional number`

  Weight of the full-text search pipeline (0-1).

- `num_candidates: optional number`

  Number of candidates for approximate nearest neighbor search.

- `rerank: optional object { enabled, top_n }`

  Reranking configuration applied after hybrid search. Enabled by default.

  - `enabled: optional boolean`

    Set to false to disable reranking.

  - `top_n: optional number`

    Number of results to return after reranking.

- `score_threshold: optional number`

  Minimum score threshold for returned results.

- `static_filters: optional object { parsed_directory_file_id }`

  Filters on built-in document fields (page range, chunk index, etc.).

  - `parsed_directory_file_id: optional object { operator, value }`

    - `operator: "eq" or "ne" or "gt" or 5 more`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: string or array of string`

      - `string`

      - `array of string`

- `top_k: optional number`

  Maximum number of results to return.

- `vector_pipeline_weight: optional number`

  Weight of the vector search pipeline (0-1).
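
The body parameters above can be combined into a single request. The following is a minimal Python sketch using only the standard library, not an official client. The helper names `build_retrieve_body` and `retrieve` are hypothetical, as are the metadata fields `year` and `department`. How multiple conditions in the array form of a custom filter are combined is not stated in this reference; the range below assumes they are applied together.

```python
import json
import os
import urllib.request

API_URL = "https://api.cloud.llamaindex.ai/api/v1/retrieval/retrieve"


def build_retrieve_body(index_id, query, *, top_k=None, file_ids=None,
                        custom_filters=None, rerank_top_n=None,
                        vector_weight=None, full_text_weight=None):
    """Assemble a JSON-serializable body, omitting optional fields left unset."""
    body = {"index_id": index_id, "query": query}
    if top_k is not None:
        body["top_k"] = top_k
    if file_ids:
        # static_filters: restrict results to specific parsed files.
        body["static_filters"] = {
            "parsed_directory_file_id": {"operator": "in", "value": list(file_ids)}
        }
    if custom_filters:
        body["custom_filters"] = custom_filters
    if rerank_top_n is not None:
        body["rerank"] = {"enabled": True, "top_n": rerank_top_n}
    weights = (("vector_pipeline_weight", vector_weight),
               ("full_text_pipeline_weight", full_text_weight))
    for name, weight in weights:
        if weight is not None:
            # The reference documents both weights as values in the range 0-1.
            if not 0 <= weight <= 1:
                raise ValueError(f"{name} must be between 0 and 1")
            body[name] = weight
    return body


def retrieve(body, api_key=None):
    """POST the body and return the decoded JSON response."""
    api_key = api_key or os.environ["LLAMA_CLOUD_API_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_retrieve_body(
    "idx-abc123",
    "What are the key findings?",
    top_k=5,
    # "year" and "department" are hypothetical user-defined metadata fields.
    custom_filters={
        "year": [{"operator": "gte", "value": 2020},
                 {"operator": "lte", "value": 2023}],
        "department": {"operator": "eq", "value": "finance"},
    },
)
print(json.dumps(body, indent=2))
```

Calling `retrieve(body)` sends the request; it is left out of the sketch so the snippet runs without network access or an API key.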

### Returns

- `results: array of object { content, metadata, rerank_score, 2 more }`

  Ordered list of retrieved chunks.

  - `content: string`

    Text content of the retrieved chunk.

  - `metadata: optional map[string or number or number or 3 more]`

    User-defined metadata associated with the chunk.

    - `string`

    - `number`

    - `number`

    - `boolean`

    - `unknown`

    - `MetadataListValue = array of string`

  - `rerank_score: optional number`

    Relevance score from the reranker, if reranking was applied.

  - `score: optional number`

    Hybrid search relevance score.

  - `static_fields: optional object { attachments, chunk_end_char, chunk_index, 5 more }`

    Built-in fields stored for every exported chunk.

    - `attachments: optional array of object { attachment_name, source_id, type }`

      Attachments associated with the chunk.

      - `attachment_name: string`

        Attachment-relative path, e.g. 'screenshots/page_7.jpg'.

      - `source_id: string`

        File ID to pass as source_id when fetching the attachment.

      - `type: string`

        Attachment kind, e.g. 'screenshot', 'items'.

    - `chunk_end_char: optional number`

      End character offset of the chunk.

    - `chunk_index: optional number`

      Index of the chunk within the file.

    - `chunk_start_char: optional number`

      Start character offset of the chunk.

    - `chunk_token_count: optional number`

      Token count of the chunk.

    - `page_range_end: optional number`

      Last page number covered by this chunk.

    - `page_range_start: optional number`

      First page number covered by this chunk.

    - `parsed_directory_file_id: optional string`

      ID of the parsed file.
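
Because `rerank_score`, `score`, and `static_fields` are all optional, client code needs to handle their absence. The sketch below shows one way to do that in Python. The helpers `effective_score` and `describe_source` and the sample values are hypothetical, and preferring `rerank_score` over `score` is a design choice of this example rather than documented behavior.

```python
def effective_score(result):
    """Prefer rerank_score when present, otherwise fall back to the hybrid score."""
    rerank = result.get("rerank_score")
    if rerank is not None:
        return rerank
    return result.get("score")


def describe_source(result):
    """Build a short source label from the optional static_fields."""
    fields = result.get("static_fields") or {}
    file_id = fields.get("parsed_directory_file_id", "unknown file")
    start = fields.get("page_range_start")
    end = fields.get("page_range_end")
    if start is None:
        return file_id
    if end is None or end == start:
        return f"{file_id}, p. {start}"
    return f"{file_id}, pp. {start}-{end}"


# Hypothetical response shaped like the schema above.
sample_response = {
    "results": [
        {"content": "Revenue grew year over year.", "score": 0.61, "rerank_score": 0.93,
         "static_fields": {"parsed_directory_file_id": "file-1",
                           "page_range_start": 4, "page_range_end": 5}},
        {"content": "Headcount was flat.", "score": 0.48,
         "static_fields": {"parsed_directory_file_id": "file-2",
                           "page_range_start": 9, "page_range_end": 9}},
        {"content": "A chunk without static fields."},
    ]
}

for result in sample_response["results"]:
    print(effective_score(result), describe_source(result), result["content"], sep=" | ")
```

The two scores may not share a scale, so the sketch picks one per result and does not compare a `rerank_score` against a `score`.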

### Example

```bash
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/retrieve \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "index_id": "idx-abc123",
          "query": "What are the key findings?"
        }'
```

#### Response

```json
{
  "results": [
    {
      "content": "content",
      "metadata": {
        "foo": "string"
      },
      "rerank_score": 0,
      "score": 0,
      "static_fields": {
        "attachments": [
          {
            "attachment_name": "attachment_name",
            "source_id": "source_id",
            "type": "type"
          }
        ],
        "chunk_end_char": 0,
        "chunk_index": 0,
        "chunk_start_char": 0,
        "chunk_token_count": 0,
        "page_range_end": 0,
        "page_range_start": 0,
        "parsed_directory_file_id": "parsed_directory_file_id"
      }
    }
  ]
}
```

## Find Files

**post** `/api/v1/retrieval/files/find`

Search for files by name.

### Query Parameters

- `organization_id: optional string`

- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `index_id: string`

  ID of the index to search within.

- `file_name: optional string`

  Exact file name to match.

- `file_name_contains: optional string`

  Substring match on file name (case-insensitive).

- `page_size: optional number`

  The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `page_token: optional string`

  A page token received in the `next_page_token` field of a previous response. Provide this to retrieve the subsequent page.

### Returns

- `items: array of object { file_id, file_name }`

  The list of items.

  - `file_id: string`

    ID of the file.

  - `file_name: string`

    Display name of the file.

- `next_page_token: optional string`

  A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.

- `total_size: optional number`

  The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.
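
The `page_token` and `next_page_token` fields suggest a standard pagination loop. The Python sketch below illustrates it with an injected `fetch_page` callable so that it runs without network access. `iter_files` and `fake_fetch_page` are hypothetical names; a real `fetch_page` would POST the body to `/api/v1/retrieval/files/find` and return the decoded JSON.

```python
def iter_files(fetch_page, index_id, *, file_name_contains=None, page_size=None):
    """Yield every item across pages until next_page_token is omitted."""
    body = {"index_id": index_id}
    if file_name_contains is not None:
        body["file_name_contains"] = file_name_contains
    if page_size is not None:
        body["page_size"] = page_size
    while True:
        page = fetch_page(body)
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:
            return  # No token means there are no subsequent pages.
        # Keep the same filters and only swap in the new token.
        body = {**body, "page_token": token}


def fake_fetch_page(body):
    """In-memory stand-in for the endpoint, returning two hypothetical pages."""
    pages = {
        None: {"items": [{"file_id": "f1", "file_name": "q1-report.pdf"}],
               "next_page_token": "t1"},
        "t1": {"items": [{"file_id": "f2", "file_name": "q2-report.pdf"}]},
    }
    return pages[body.get("page_token")]


names = [f["file_name"]
         for f in iter_files(fake_fetch_page, "idx-abc123", file_name_contains="report")]
print(names)
```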

### Example

```bash
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/find \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "index_id": "idx-abc123"
        }'
```

#### Response

```json
{
  "items": [
    {
      "file_id": "file_id",
      "file_name": "file_name"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Grep File

**post** `/api/v1/retrieval/files/grep`

Grep within a file's parsed content using a regex pattern.

### Query Parameters

- `organization_id: optional string`

- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `file_id: string`

  ID of the file to grep.

- `index_id: string`

  ID of the index the file belongs to.

- `pattern: string`

  Regex pattern to search for.

- `context_chars: optional number`

  Number of characters of context to include before and after each match in the `content` field of the response.

- `page_size: optional number`

  The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `page_token: optional string`

  A page token received in the `next_page_token` field of a previous response. Provide this to retrieve the subsequent page.

### Returns

- `items: array of object { content, end_char, start_char }`

  The list of items.

  - `content: string`

    Matched text content.

  - `end_char: number`

    End character offset of the match.

  - `start_char: number`

    Start character offset of the match.

- `next_page_token: optional string`

  A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.

- `total_size: optional number`

  The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.
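
A malformed pattern is cheaper to catch before the request is sent. The Python sketch below builds a grep request body and checks the pattern locally first. `build_grep_body` and `format_match` are hypothetical helpers, and the check uses Python's `re` dialect, which is an assumption: this reference does not say which regex dialect the service uses, so a pattern that compiles locally may still be rejected or behave differently.

```python
import re


def build_grep_body(index_id, file_id, pattern, *, context_chars=None, page_size=None):
    """Validate inputs locally and assemble the JSON body for a grep request."""
    try:
        re.compile(pattern)  # Local sanity check only; the server dialect may differ.
    except re.error as exc:
        raise ValueError(f"invalid regex {pattern!r}: {exc}") from exc
    if context_chars is not None and context_chars < 0:
        raise ValueError("context_chars must not be negative")
    body = {"file_id": file_id, "index_id": index_id, "pattern": pattern}
    if context_chars is not None:
        body["context_chars"] = context_chars
    if page_size is not None:
        body["page_size"] = page_size
    return body


def format_match(item):
    """Render one returned item as '[start-end] content'."""
    return f"[{item['start_char']}-{item['end_char']}] {item['content']}"


body = build_grep_body("idx-abc123", "file_id", "revenue|profit", context_chars=40)
print(body)

# Hypothetical item shaped like the Returns schema above.
print(format_match({"content": "net profit rose", "start_char": 120, "end_char": 135}))
```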

### Example

```bash
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/grep \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "file_id": "file_id",
          "index_id": "idx-abc123",
          "pattern": "revenue|profit"
        }'
```

#### Response

```json
{
  "items": [
    {
      "content": "content",
      "end_char": 0,
      "start_char": 0
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Read File

**post** `/api/v1/retrieval/files/read`

Read the parsed text content of a specific file.

### Query Parameters

- `organization_id: optional string`

- `project_id: optional string`

### Cookie Parameters

- `session: optional string`

### Body Parameters

- `file_id: string`

  ID of the file to read.

- `index_id: string`

  ID of the index the file belongs to.

- `max_length: optional number`

  Maximum number of characters to read from the offset.

- `offset: optional number`

  Starting character offset.

### Returns

- `content: string`

  Parsed text content of the file.
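
For large files, `offset` and `max_length` allow the content to be read in pieces. The Python sketch below shows one possible loop, using an injected `read_chunk` callable so that it runs offline. `read_in_chunks` and `fake_read_chunk` are hypothetical names. It assumes that offsets count characters the same way Python's `len` does and that a response shorter than `max_length`, or an empty one, marks the end of the file; neither is confirmed by this reference.

```python
def read_in_chunks(read_chunk, index_id, file_id, *, chunk_size=4000):
    """Yield successive pieces of a file's parsed content."""
    offset = 0
    while True:
        response = read_chunk({
            "index_id": index_id,
            "file_id": file_id,
            "offset": offset,
            "max_length": chunk_size,
        })
        content = response["content"]
        if not content:
            return  # Assumed: empty content means the offset is at or past the end.
        yield content
        if len(content) < chunk_size:
            return  # Assumed: a short chunk means the end was reached.
        offset += len(content)


FAKE_TEXT = "abcdefghij"  # Stand-in for a file's parsed content.


def fake_read_chunk(body):
    """In-memory stand-in for the endpoint, slicing FAKE_TEXT by offset and length."""
    start = body["offset"]
    return {"content": FAKE_TEXT[start:start + body["max_length"]]}


chunks = list(read_in_chunks(fake_read_chunk, "idx-abc123", "file_id", chunk_size=4))
print(chunks)
```

When the file length is an exact multiple of `chunk_size`, the loop makes one extra request that returns empty content before stopping.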

### Example

```bash
curl https://api.cloud.llamaindex.ai/api/v1/retrieval/files/read \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -d '{
          "file_id": "file_id",
          "index_id": "idx-abc123"
        }'
```

#### Response

```json
{
  "content": "content"
}
```

## Domain Types

### Retrieval Retrieve Response

- `RetrievalRetrieveResponse object { results }`

  Response containing retrieval results.

  - `results: array of object { content, metadata, rerank_score, 2 more }`

    Ordered list of retrieved chunks.

    - `content: string`

      Text content of the retrieved chunk.

    - `metadata: optional map[string or number or number or 3 more]`

      User-defined metadata associated with the chunk.

      - `string`

      - `number`

      - `number`

      - `boolean`

      - `unknown`

      - `MetadataListValue = array of string`

    - `rerank_score: optional number`

      Relevance score from the reranker, if reranking was applied.

    - `score: optional number`

      Hybrid search relevance score.

    - `static_fields: optional object { attachments, chunk_end_char, chunk_index, 5 more }`

      Built-in fields stored for every exported chunk.

      - `attachments: optional array of object { attachment_name, source_id, type }`

        Attachments associated with the chunk.

        - `attachment_name: string`

          Attachment-relative path, e.g. 'screenshots/page_7.jpg'.

        - `source_id: string`

          File ID to pass as source_id when fetching the attachment.

        - `type: string`

          Attachment kind, e.g. 'screenshot', 'items'.

      - `chunk_end_char: optional number`

        End character offset of the chunk.

      - `chunk_index: optional number`

        Index of the chunk within the file.

      - `chunk_start_char: optional number`

        Start character offset of the chunk.

      - `chunk_token_count: optional number`

        Token count of the chunk.

      - `page_range_end: optional number`

        Last page number covered by this chunk.

      - `page_range_start: optional number`

        First page number covered by this chunk.

      - `parsed_directory_file_id: optional string`

        ID of the parsed file.

### Retrieval Find Response

- `RetrievalFindResponse object { file_id, file_name }`

  A file returned by find.

  - `file_id: string`

    ID of the file.

  - `file_name: string`

    Display name of the file.

### Retrieval Grep Response

- `RetrievalGrepResponse object { content, end_char, start_char }`

  A single grep match within a file.

  - `content: string`

    Matched text content.

  - `end_char: number`

    End character offset of the match.

  - `start_char: number`

    Start character offset of the match.

### Retrieval Read Response

- `RetrievalReadResponse object { content }`

  File read result.

  - `content: string`

    Parsed text content of the file.