# Retrieval

## Retrieve

`beta.retrieval.retrieve(RetrievalRetrieveParams**kwargs)  -> RetrievalRetrieveResponse`

**post** `/api/v1/retrieval/retrieve`

Retrieve relevant chunks via hybrid search (vector + full-text), with filtering on built-in or user-defined metadata.

### Parameters

- `index_id: str`

  ID of the index to retrieve against.

- `query: str`

  Natural-language query to retrieve relevant chunks.

- `organization_id: Optional[str]`

- `project_id: Optional[str]`

- `custom_filters: Optional[Dict[str, Optional[CustomFilters]]]`

  Filters on user-defined metadata fields.

  - `class CustomFiltersFilterTypeUnionStrIntBoolFloat: …`

    - `operator: Literal["eq", "ne", "gt", 5 more]`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: Union[str, bool, float, Sequence[Union[str, bool, float]]]`

      - `str`

      - `bool`

      - `float`

      - `Sequence[Union[str, bool, float]]`

        - `str`

        - `bool`

        - `float`

  - `Iterable[CustomFiltersUnionMember1]`

    - `operator: Literal["eq", "ne", "gt", 5 more]`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: Union[float, Iterable[float]]`

      - `float`

      - `Iterable[float]`

- `full_text_pipeline_weight: Optional[float]`

  Weight of the full-text search pipeline (0-1).

- `num_candidates: Optional[int]`

  Number of candidates for approximate nearest neighbor search.

- `rerank: Optional[Rerank]`

  Reranking configuration applied after hybrid search. Enabled by default.

  - `enabled: Optional[bool]`

    Set to `False` to disable reranking.

  - `top_n: Optional[int]`

    Number of results to return after reranking.

- `score_threshold: Optional[float]`

  Minimum score threshold for returned results.

- `static_filters: Optional[StaticFilters]`

  Filters on built-in document fields (page range, chunk index, etc.).

  - `parsed_directory_file_id: Optional[StaticFiltersParsedDirectoryFileID]`

    - `operator: Literal["eq", "ne", "gt", 5 more]`

      - `"eq"`

      - `"ne"`

      - `"gt"`

      - `"lt"`

      - `"gte"`

      - `"lte"`

      - `"in"`

      - `"nin"`

    - `value: Union[str, Sequence[str]]`

      - `str`

      - `Sequence[str]`

- `top_k: Optional[int]`

  Maximum number of results to return.

- `vector_pipeline_weight: Optional[float]`

  Weight of the vector search pipeline (0-1).
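The filter and weighting parameters above nest fairly deeply, so a sketch of how they might be assembled may help. The helper name `build_retrieve_kwargs`, the metadata fields `department` and `year`, and the file IDs are hypothetical; treating the two pipeline weights as complementary is also an assumption, since the reference only gives a 0-1 range for each.

```python
from typing import Any, Dict


def build_retrieve_kwargs(
    index_id: str,
    query: str,
    *,
    vector_weight: float = 0.7,
    top_k: int = 5,
) -> Dict[str, Any]:
    """Assemble keyword arguments for beta.retrieval.retrieve (illustrative helper)."""
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be within [0, 1]")
    return {
        "index_id": index_id,
        "query": query,
        "top_k": top_k,
        "vector_pipeline_weight": vector_weight,
        # Assumption: the weights are complementary; the reference does not say so.
        "full_text_pipeline_weight": round(1.0 - vector_weight, 6),
        "custom_filters": {
            # Single-filter form: one operator/value pair per metadata field.
            "department": {"operator": "eq", "value": "finance"},
            # List form: several numeric conditions on the same field.
            "year": [
                {"operator": "gte", "value": 2020},
                {"operator": "lt", "value": 2024},
            ],
        },
        "static_filters": {
            "parsed_directory_file_id": {
                "operator": "in",
                "value": ["file-1", "file-2"],  # hypothetical file IDs
            },
        },
        "rerank": {"enabled": True, "top_n": 3},
    }


kwargs = build_retrieve_kwargs("idx-abc123", "What are the key findings?")
# With a configured client: client.beta.retrieval.retrieve(**kwargs)
```

Building the arguments as a plain dictionary keeps them easy to log or unit-test before any request is sent.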

### Returns

- `class RetrievalRetrieveResponse: …`

  Response containing retrieval results.

  - `results: List[Result]`

    Ordered list of retrieved chunks.

    - `content: str`

      Text content of the retrieved chunk.

    - `metadata: Optional[Dict[str, Union[str, int, float, 3 more]]]`

      User-defined metadata associated with the chunk.

      - `str`

      - `int`

      - `float`

      - `bool`

      - `None`

      - `List[str]`

    - `rerank_score: Optional[float]`

      Relevance score from the reranker, if reranking was applied.

    - `score: Optional[float]`

      Hybrid search relevance score.

    - `static_fields: Optional[ResultStaticFields]`

      Built-in fields stored for every exported chunk.

      - `attachments: Optional[List[ResultStaticFieldsAttachment]]`

        Attachments associated with the chunk.

        - `attachment_name: str`

          Attachment-relative path, e.g. 'screenshots/page_7.jpg'.

        - `source_id: str`

          File ID to pass as source_id when fetching the attachment.

        - `type: str`

          Attachment kind, e.g. 'screenshot', 'items'.

      - `chunk_end_char: Optional[int]`

        End character offset of the chunk.

      - `chunk_index: Optional[int]`

        Index of the chunk within the file.

      - `chunk_start_char: Optional[int]`

        Start character offset of the chunk.

      - `chunk_token_count: Optional[int]`

        Token count of the chunk.

      - `page_range_end: Optional[int]`

        Last page number covered by this chunk.

      - `page_range_start: Optional[int]`

        First page number covered by this chunk.

      - `parsed_directory_file_id: Optional[str]`

        ID of the parsed file.

### Example

```python
import os
from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)
retrieval = client.beta.retrieval.retrieve(
    index_id="idx-abc123",
    query="What are the key findings?",
)
print(retrieval.results)
```

#### Response

```json
{
  "results": [
    {
      "content": "content",
      "metadata": {
        "foo": "string"
      },
      "rerank_score": 0,
      "score": 0,
      "static_fields": {
        "attachments": [
          {
            "attachment_name": "attachment_name",
            "source_id": "source_id",
            "type": "type"
          }
        ],
        "chunk_end_char": 0,
        "chunk_index": 0,
        "chunk_start_char": 0,
        "chunk_token_count": 0,
        "page_range_end": 0,
        "page_range_start": 0,
        "parsed_directory_file_id": "parsed_directory_file_id"
      }
    }
  ]
}
```
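Because `rerank_score`, `score`, and `static_fields` are all optional, code that consumes results generally needs fallbacks. The following is a minimal sketch that works on plain dictionaries shaped like the JSON response above (the SDK itself returns objects with attribute access). The helper names and sample values are hypothetical, and preferring `rerank_score` over `score` is an assumption rather than documented behavior.

```python
from typing import Any, Dict


def effective_score(result: Dict[str, Any]) -> float:
    """Prefer the reranker score when present, else the hybrid search score."""
    rerank = result.get("rerank_score")
    if rerank is not None:
        return float(rerank)
    score = result.get("score")
    return float(score) if score is not None else 0.0


def describe(result: Dict[str, Any]) -> str:
    """Build a short citation string from a result's built-in fields."""
    fields = result.get("static_fields") or {}
    start = fields.get("page_range_start")
    end = fields.get("page_range_end")
    if start is None:
        pages = "pages unknown"
    elif end is None or end == start:
        pages = f"p. {start}"
    else:
        pages = f"pp. {start}-{end}"
    file_id = fields.get("parsed_directory_file_id") or "unknown file"
    return f"{file_id} ({pages}), score={effective_score(result):.2f}"


sample = {
    "content": "Revenue grew year over year.",
    "rerank_score": 0.91,
    "score": 0.4,
    "static_fields": {
        "page_range_start": 3,
        "page_range_end": 4,
        "parsed_directory_file_id": "file-1",
    },
}
print(describe(sample))
```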

## Find Files

`beta.retrieval.find(RetrievalFindParams**kwargs)  -> SyncPaginatedCursorPost[RetrievalFindResponse]`

**post** `/api/v1/retrieval/files/find`

Search for files by name.

### Parameters

- `index_id: str`

  ID of the index to search within.

- `organization_id: Optional[str]`

- `project_id: Optional[str]`

- `file_name: Optional[str]`

  Exact file name to match.

- `file_name_contains: Optional[str]`

  Substring match on file name (case-insensitive).

- `page_size: Optional[int]`

  The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `page_token: Optional[str]`

  A page token returned as `next_page_token` by a previous call. Provide this to retrieve the subsequent page.

### Returns

- `class RetrievalFindResponse: …`

  A file returned by find.

  - `file_id: str`

    ID of the file.

  - `file_name: str`

    Display name of the file.

### Example

```python
import os
from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)
page = client.beta.retrieval.find(
    index_id="idx-abc123",
)
file = page.items[0]  # first file on this page; raises IndexError if the page is empty
print(file.file_id)
```

#### Response

```json
{
  "items": [
    {
      "file_id": "file_id",
      "file_name": "file_name"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```
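The `page_token` parameter and the `next_page_token` field in the response suggest a simple cursor loop. The sketch below shows that loop against a stand-in fetcher so it can run without network access. `iter_items`, `fake_fetch`, and the sample pages are hypothetical, and it assumes that an empty or missing `next_page_token` marks the last page. The returned `SyncPaginatedCursorPost` object may offer its own iteration helpers, which this reference does not describe.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# A fetcher takes the current page token (None for the first page)
# and returns a dict shaped like the JSON response above.
PageFetcher = Callable[[Optional[str]], Dict[str, Any]]


def iter_items(fetch_page: PageFetcher, max_pages: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield items from successive pages until no next_page_token is returned."""
    token: Optional[str] = None
    for _ in range(max_pages):  # hard cap guards against a token that never ends
        page = fetch_page(token)
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:
            return


# Stand-in for the real endpoint, keyed by page token.
_FAKE_PAGES: Dict[Optional[str], Dict[str, Any]] = {
    None: {"items": [{"file_id": "f1", "file_name": "a.pdf"}], "next_page_token": "t2"},
    "t2": {"items": [{"file_id": "f2", "file_name": "b.pdf"}], "next_page_token": None},
}


def fake_fetch(token: Optional[str]) -> Dict[str, Any]:
    return _FAKE_PAGES[token]


file_ids = [item["file_id"] for item in iter_items(fake_fetch)]
```

Against the real service, the fetcher would wrap `client.beta.retrieval.find(index_id=..., page_token=token)` and convert the returned page into the same shape.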

## Grep File

`beta.retrieval.grep(RetrievalGrepParams**kwargs)  -> SyncPaginatedCursorPost[RetrievalGrepResponse]`

**post** `/api/v1/retrieval/files/grep`

Grep within a file's parsed content using a regex pattern.

### Parameters

- `file_id: str`

  ID of the file to grep.

- `index_id: str`

  ID of the index the file belongs to.

- `pattern: str`

  Regex pattern to search for.

- `organization_id: Optional[str]`

- `project_id: Optional[str]`

- `context_chars: Optional[int]`

  Number of characters of context to include before and after each match in the `content` field of the response.

- `page_size: Optional[int]`

  The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `page_token: Optional[str]`

  A page token returned as `next_page_token` by a previous call. Provide this to retrieve the subsequent page.

### Returns

- `class RetrievalGrepResponse: …`

  A single grep match within a file.

  - `content: str`

    Matched text content.

  - `end_char: int`

    End character offset of the match.

  - `start_char: int`

    Start character offset of the match.

### Example

```python
import os
from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)
page = client.beta.retrieval.grep(
    file_id="file_id",
    index_id="idx-abc123",
    pattern="revenue|profit",
)
match = page.items[0]  # first match on this page; raises IndexError if the page is empty
print(match.content)
```

#### Response

```json
{
  "items": [
    {
      "content": "content",
      "end_char": 0,
      "start_char": 0
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```
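To make the relationship between `pattern`, `context_chars`, `start_char`, and `end_char` more concrete, here is a local emulation built on Python's `re` module. It is only an analogy, not the service's implementation: it assumes the offsets describe the match itself, that `end_char` is exclusive, and that `content` is widened by the context on both sides. The regex dialect the service accepts may also differ from Python's. `local_grep` and the sample text are hypothetical.

```python
import re
from typing import Dict, List, Union

Match = Dict[str, Union[str, int]]


def local_grep(text: str, pattern: str, context_chars: int = 0) -> List[Match]:
    """Return match records shaped like the grep response, computed locally."""
    matches: List[Match] = []
    for m in re.finditer(pattern, text):
        # Widen the returned content by context_chars, clamped to the text bounds.
        lo = max(0, m.start() - context_chars)
        hi = min(len(text), m.end() + context_chars)
        matches.append(
            {"content": text[lo:hi], "start_char": m.start(), "end_char": m.end()}
        )
    return matches


text = "Q1 revenue rose 8%. Net profit fell."
hits = local_grep(text, r"revenue|profit", context_chars=3)
for hit in hits:
    print(hit["start_char"], hit["end_char"], repr(hit["content"]))
```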

## Read File

`beta.retrieval.read(RetrievalReadParams**kwargs)  -> RetrievalReadResponse`

**post** `/api/v1/retrieval/files/read`

Read the parsed text content of a specific file.

### Parameters

- `file_id: str`

  ID of the file to read.

- `index_id: str`

  ID of the index the file belongs to.

- `organization_id: Optional[str]`

- `project_id: Optional[str]`

- `max_length: Optional[int]`

  Maximum number of characters to read from the offset.

- `offset: Optional[int]`

  Starting character offset.

### Returns

- `class RetrievalReadResponse: …`

  File read result.

  - `content: str`

    Parsed text content of the file.

### Example

```python
import os
from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)
response = client.beta.retrieval.read(
    file_id="file_id",
    index_id="idx-abc123",
)
print(response.content)
```

#### Response

```json
{
  "content": "content"
}
```

## Domain Types

### Retrieval Retrieve Response

- `class RetrievalRetrieveResponse: …`

  Response containing retrieval results.

  - `results: List[Result]`

    Ordered list of retrieved chunks.

    - `content: str`

      Text content of the retrieved chunk.

    - `metadata: Optional[Dict[str, Union[str, int, float, 3 more]]]`

      User-defined metadata associated with the chunk.

      - `str`

      - `int`

      - `float`

      - `bool`

      - `None`

      - `List[str]`

    - `rerank_score: Optional[float]`

      Relevance score from the reranker, if reranking was applied.

    - `score: Optional[float]`

      Hybrid search relevance score.

    - `static_fields: Optional[ResultStaticFields]`

      Built-in fields stored for every exported chunk.

      - `attachments: Optional[List[ResultStaticFieldsAttachment]]`

        Attachments associated with the chunk.

        - `attachment_name: str`

          Attachment-relative path, e.g. 'screenshots/page_7.jpg'.

        - `source_id: str`

          File ID to pass as source_id when fetching the attachment.

        - `type: str`

          Attachment kind, e.g. 'screenshot', 'items'.

      - `chunk_end_char: Optional[int]`

        End character offset of the chunk.

      - `chunk_index: Optional[int]`

        Index of the chunk within the file.

      - `chunk_start_char: Optional[int]`

        Start character offset of the chunk.

      - `chunk_token_count: Optional[int]`

        Token count of the chunk.

      - `page_range_end: Optional[int]`

        Last page number covered by this chunk.

      - `page_range_start: Optional[int]`

        First page number covered by this chunk.

      - `parsed_directory_file_id: Optional[str]`

        ID of the parsed file.

### Retrieval Find Response

- `class RetrievalFindResponse: …`

  A file returned by find.

  - `file_id: str`

    ID of the file.

  - `file_name: str`

    Display name of the file.

### Retrieval Grep Response

- `class RetrievalGrepResponse: …`

  A single grep match within a file.

  - `content: str`

    Matched text content.

  - `end_char: int`

    End character offset of the match.

  - `start_char: int`

    Start character offset of the match.

### Retrieval Read Response

- `class RetrievalReadResponse: …`

  File read result.

  - `content: str`

    Parsed text content of the file.