# Retrieval

## Retrieve

`$ llamacloud-prod beta:retrieval retrieve`

**post** `/api/v1/retrieval/retrieve`

Retrieve relevant chunks via hybrid search (vector + full-text), with filtering on built-in or user-defined metadata.

### Parameters

- `--index-id: string`

  Body param: ID of the index to retrieve against.

- `--query: string`

  Body param: Natural-language query to retrieve relevant chunks.

- `--organization-id: optional string`

  Query param

- `--project-id: optional string`

  Query param

- `--custom-filters: optional map[object { operator, value }  or array of object { operator, value } ]`

  Body param: Filters on user-defined metadata fields.

- `--full-text-pipeline-weight: optional number`

  Body param: Weight of the full-text search pipeline (0-1).

- `--num-candidates: optional number`

  Body param: Number of candidates for approximate nearest neighbor search.

- `--rerank: optional object { enabled, top_n }`

  Body param: Reranking configuration applied after hybrid search. Enabled by default.

- `--score-threshold: optional number`

  Body param: Minimum score threshold for returned results.

- `--static-filters: optional object { parsed_directory_file_id }`

  Body param: Filters on built-in document fields. Only `parsed_directory_file_id` appears in this parameter's signature.

- `--top-k: optional number`

  Body param: Maximum number of results to return.

- `--vector-pipeline-weight: optional number`

  Body param: Weight of the vector search pipeline (0-1).
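
The body parameters above can be combined into a single JSON payload. The following is a minimal sketch, not an official client: `build_retrieve_body` is a hypothetical helper, and it assumes that the body keys are the snake_case forms of the CLI flags (for example `--top-k` becomes `top_k`). It only checks the documented 0-1 range for the two pipeline weights and omits unset optional fields.

```python
import json
from typing import Any, Optional


def build_retrieve_body(
    index_id: str,
    query: str,
    *,
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
    vector_weight: Optional[float] = None,
    full_text_weight: Optional[float] = None,
    rerank_top_n: Optional[int] = None,
    parsed_directory_file_id: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a body for POST /api/v1/retrieval/retrieve.

    Assumption: body keys mirror the CLI flags in snake_case.
    """
    weights = {
        "vector_pipeline_weight": vector_weight,
        "full_text_pipeline_weight": full_text_weight,
    }
    for name, weight in weights.items():
        # The reference documents both weights as values in the range 0-1.
        if weight is not None and not 0.0 <= weight <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {weight}")

    body: dict[str, Any] = {"index_id": index_id, "query": query}
    optional = {"top_k": top_k, "score_threshold": score_threshold, **weights}
    # Leave out anything the caller did not set so server defaults apply.
    body.update({key: value for key, value in optional.items() if value is not None})

    if rerank_top_n is not None:
        body["rerank"] = {"enabled": True, "top_n": rerank_top_n}
    if parsed_directory_file_id is not None:
        body["static_filters"] = {"parsed_directory_file_id": parsed_directory_file_id}
    return body


body = build_retrieve_body(
    "idx-abc123",
    "What are the key findings?",
    top_k=5,
    vector_weight=0.7,
    full_text_weight=0.3,
    rerank_top_n=3,
)
print(json.dumps(body, indent=2))
```

`--custom-filters` is left out on purpose, because the accepted `operator` values are not listed in this reference.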

### Returns

- `BetaRetrievalGetResponse: object { results }`

  Response containing retrieval results.

  - `results: array of object { content, metadata, rerank_score, 2 more }`

    Ordered list of retrieved chunks.

    - `content: string`

      Text content of the retrieved chunk.

    - `metadata: optional map[string or number or number or 3 more]`

      User-defined metadata associated with the chunk.

      - `union_member_0: string`

      - `union_member_1: number`

      - `union_member_2: number`

      - `union_member_3: boolean`

      - `union_member_4: unknown`

      - `MetadataListValue: array of string`

    - `rerank_score: optional number`

      Relevance score from the reranker, if reranking was applied.

    - `score: optional number`

      Hybrid search relevance score.

    - `static_fields: optional object { attachments, chunk_end_char, chunk_index, 5 more }`

      Built-in fields stored for every exported chunk.

      - `attachments: optional array of object { attachment_name, source_id, type }`

        Attachments associated with the chunk.

        - `attachment_name: string`

          Attachment-relative path, e.g. 'screenshots/page_7.jpg'.

        - `source_id: string`

          File ID to pass as source_id when fetching the attachment.

        - `type: string`

          Attachment kind, e.g. 'screenshot', 'items'.

      - `chunk_end_char: optional number`

        End character offset of the chunk.

      - `chunk_index: optional number`

        Index of the chunk within the file.

      - `chunk_start_char: optional number`

        Start character offset of the chunk.

      - `chunk_token_count: optional number`

        Token count of the chunk.

      - `page_range_end: optional number`

        Last page number covered by this chunk.

      - `page_range_start: optional number`

        First page number covered by this chunk.

      - `parsed_directory_file_id: optional string`

        ID of the parsed file.

### Example

```cli
llamacloud-prod beta:retrieval retrieve \
  --api-key 'My API Key' \
  --index-id idx-abc123 \
  --query 'What are the key findings?'
```

#### Response

```json
{
  "results": [
    {
      "content": "content",
      "metadata": {
        "foo": "string"
      },
      "rerank_score": 0,
      "score": 0,
      "static_fields": {
        "attachments": [
          {
            "attachment_name": "attachment_name",
            "source_id": "source_id",
            "type": "type"
          }
        ],
        "chunk_end_char": 0,
        "chunk_index": 0,
        "chunk_start_char": 0,
        "chunk_token_count": 0,
        "page_range_end": 0,
        "page_range_start": 0,
        "parsed_directory_file_id": "parsed_directory_file_id"
      }
    }
  ]
}
```
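
Because `rerank_score`, `score`, and every field in `static_fields` are optional, client code has to handle missing values. The sketch below shows one way to do that. The helpers `effective_score` and `page_label` are hypothetical, and the sample response is invented to show the shape only; it is not real output.

```python
from typing import Any


def effective_score(result: dict[str, Any]) -> float:
    """Prefer rerank_score when present, else fall back to the hybrid score."""
    for key in ("rerank_score", "score"):
        value = result.get(key)
        if value is not None:
            return float(value)
    return 0.0


def page_label(result: dict[str, Any]) -> str:
    """Render the chunk's page range, tolerating absent static fields."""
    fields = result.get("static_fields") or {}
    start = fields.get("page_range_start")
    end = fields.get("page_range_end")
    if start is None:
        return "page unknown"
    if end is None or end == start:
        return f"p. {start}"
    return f"pp. {start}-{end}"


# Illustrative data only; the values are invented for this sketch.
response = {
    "results": [
        {
            "content": "Revenue grew year over year.",
            "rerank_score": 0.91,
            "score": 0.42,
            "static_fields": {"page_range_start": 3, "page_range_end": 4},
        },
        {"content": "Appendix notes.", "score": 0.37},
    ]
}

for result in response["results"]:
    print(f"{effective_score(result):.2f}  {page_label(result)}  {result['content']}")
```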

## Find Files

`$ llamacloud-prod beta:retrieval find`

**post** `/api/v1/retrieval/files/find`

Search for files by name.

### Parameters

- `--index-id: string`

  Body param: ID of the index to search within.

- `--organization-id: optional string`

  Query param

- `--project-id: optional string`

  Query param

- `--file-name: optional string`

  Body param: Exact file name to match.

- `--file-name-contains: optional string`

  Body param: Substring match on file name (case-insensitive).

- `--page-size: optional number`

  Body param: The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `--page-token: optional string`

  Body param: A page token received from a previous `find` call. Provide it to retrieve the next page.

### Returns

- `FileFindResult: object { items, next_page_token, total_size }`

  Paginated file find results.

  - `items: array of object { file_id, file_name }`

    The list of items.

    - `file_id: string`

      ID of the file.

    - `file_name: string`

      Display name of the file.

  - `next_page_token: optional string`

    A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.

  - `total_size: optional number`

    The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.

### Example

```cli
llamacloud-prod beta:retrieval find \
  --api-key 'My API Key' \
  --index-id idx-abc123
```

#### Response

```json
{
  "items": [
    {
      "file_id": "file_id",
      "file_name": "file_name"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```
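
The `page_size`, `page_token`, and `next_page_token` fields describe token-based pagination: send each `next_page_token` back as `page_token` until the field is absent. Here is a minimal sketch of that loop. `iter_items` and `fake_fetch` are hypothetical names; `fake_fetch` is an in-memory stand-in for whatever function actually posts the request body and returns the decoded JSON response.

```python
from typing import Any, Callable, Iterator, Optional

# Any callable that sends a request body and returns the decoded JSON response.
PageFetcher = Callable[[dict[str, Any]], dict[str, Any]]


def iter_items(
    fetch_page: PageFetcher,
    body: dict[str, Any],
    page_size: Optional[int] = None,
) -> Iterator[dict[str, Any]]:
    """Yield every item across pages by following next_page_token."""
    request = dict(body)
    if page_size is not None:
        request["page_size"] = page_size
    while True:
        page = fetch_page(request)
        yield from page.get("items", [])
        token = page.get("next_page_token")
        if not token:
            # An omitted token means there are no subsequent pages.
            return
        request = {**request, "page_token": token}


def fake_fetch(request: dict[str, Any]) -> dict[str, Any]:
    """In-memory stand-in for the endpoint (illustration only)."""
    pages = {
        None: {
            "items": [{"file_id": "f1", "file_name": "a.pdf"}],
            "next_page_token": "t1",
        },
        "t1": {"items": [{"file_id": "f2", "file_name": "b.pdf"}]},
    }
    return pages[request.get("page_token")]


files = list(iter_items(fake_fetch, {"index_id": "idx-abc123"}))
print([f["file_name"] for f in files])
```

The same loop applies to any endpoint in this section that returns `items` and `next_page_token`.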

## Grep File

`$ llamacloud-prod beta:retrieval grep`

**post** `/api/v1/retrieval/files/grep`

Grep within a file's parsed content using a regex pattern.

### Parameters

- `--file-id: string`

  Body param: ID of the file to grep.

- `--index-id: string`

  Body param: ID of the index the file belongs to.

- `--pattern: string`

  Body param: Regex pattern to search for.

- `--organization-id: optional string`

  Query param

- `--project-id: optional string`

  Query param

- `--context-chars: optional number`

  Body param: Number of characters of context to include before and after the matched pattern in the `content` field of the response.

- `--page-size: optional number`

  Body param: The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.

- `--page-token: optional string`

  Body param: A page token received from a previous `grep` call. Provide it to retrieve the next page.

### Returns

- `FileGrepResult: object { items, next_page_token, total_size }`

  Paginated grep results for a file.

  - `items: array of object { content, end_char, start_char }`

    The list of items.

    - `content: string`

      Matched text content.

    - `end_char: number`

      End character offset of the match.

    - `start_char: number`

      Start character offset of the match.

  - `next_page_token: optional string`

    A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.

  - `total_size: optional number`

    The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.

### Example

```cli
llamacloud-prod beta:retrieval grep \
  --api-key 'My API Key' \
  --file-id file_id \
  --index-id idx-abc123 \
  --pattern 'revenue|profit'
```

#### Response

```json
{
  "items": [
    {
      "content": "content",
      "end_char": 0,
      "start_char": 0
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```
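
Each match carries `start_char` and `end_char`, which can be used to request a wider window of the same file through the Read File endpoint. The sketch below assumes that grep offsets and the read `offset` count characters in the same parsed text, which this reference does not state explicitly. `read_request_around` is a hypothetical helper, and its body keys are assumed to be the snake_case forms of the CLI flags.

```python
from typing import Any


def read_request_around(
    match: dict[str, Any],
    *,
    index_id: str,
    file_id: str,
    padding: int = 200,
) -> dict[str, Any]:
    """Build a Read File body covering a grep match plus surrounding text.

    Assumption: start_char/end_char and offset share one coordinate space.
    """
    if padding < 0:
        raise ValueError("padding must be non-negative")
    # Clamp at zero so matches near the start of the file stay valid.
    offset = max(0, match["start_char"] - padding)
    end = match["end_char"] + padding
    return {
        "index_id": index_id,
        "file_id": file_id,
        "offset": offset,
        "max_length": end - offset,
    }


match = {"content": "revenue", "start_char": 1000, "end_char": 1007}
print(read_request_around(match, index_id="idx-abc123", file_id="file_id"))
```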

## Read File

`$ llamacloud-prod beta:retrieval read`

**post** `/api/v1/retrieval/files/read`

Read the parsed text content of a specific file.

### Parameters

- `--file-id: string`

  Body param: ID of the file to read.

- `--index-id: string`

  Body param: ID of the index the file belongs to.

- `--organization-id: optional string`

  Query param

- `--project-id: optional string`

  Query param

- `--max-length: optional number`

  Body param: Maximum number of characters to read from the offset.

- `--offset: optional number`

  Body param: Starting character offset.

### Returns

- `BetaRetrievalReadResponse: object { content }`

  File read result.

  - `content: string`

    Parsed text content of the file.

### Example

```cli
llamacloud-prod beta:retrieval read \
  --api-key 'My API Key' \
  --file-id file_id \
  --index-id idx-abc123
```

#### Response

```json
{
  "content": "content"
}
```
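
For large files, `--offset` and `--max-length` allow reading in fixed-size windows. The following sketch assumes that a response shorter than `max_length` (or empty) signals the end of the file, which this reference does not confirm. `iter_file_chunks` and `fake_read` are hypothetical names, and `fake_read` stands in for a real call to the endpoint.

```python
from typing import Any, Callable, Iterator

# Any callable that sends a Read File body and returns the decoded response.
FileReader = Callable[[dict[str, Any]], dict[str, Any]]


def iter_file_chunks(
    read_file: FileReader,
    index_id: str,
    file_id: str,
    chunk_chars: int = 4000,
) -> Iterator[str]:
    """Yield a file's parsed text in windows of at most chunk_chars characters."""
    if chunk_chars <= 0:
        raise ValueError("chunk_chars must be positive")
    offset = 0
    while True:
        response = read_file(
            {
                "index_id": index_id,
                "file_id": file_id,
                "offset": offset,
                "max_length": chunk_chars,
            }
        )
        content = response.get("content", "")
        if content:
            yield content
        # Assumption: a short or empty window means the file is exhausted.
        if len(content) < chunk_chars:
            return
        offset += len(content)


TEXT = "abcdefghij"


def fake_read(request: dict[str, Any]) -> dict[str, Any]:
    """In-memory stand-in for the endpoint (illustration only)."""
    start = request["offset"]
    return {"content": TEXT[start : start + request["max_length"]]}


chunks = list(iter_file_chunks(fake_read, "idx-abc123", "file_id", chunk_chars=4))
print(chunks)
```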