# Llama Parse Python SDK

[![PyPI version](https://img.shields.io/pypi/v/llama_cloud.svg?label=pypi%20(stable))](https://pypi.org/project/llama_cloud/)

The official Python SDK for [LlamaParse](https://cloud.llamaindex.ai) - the enterprise platform for agentic OCR and document processing.

With this SDK, create powerful workflows across many features:

- **Parse** - Agentic OCR and parsing for 130+ formats
- **Extract** - Structured data extraction with custom schemas
- **Classify** - Document categorization with natural-language rules
- **Agents** - Deploy document agents as APIs
- **Index** - Document ingestion and embedding for RAG

## Documentation

- [Get an API Key](https://cloud.llamaindex.ai)
- [Getting Started Guide](https://developers.llamaindex.ai/python/cloud/)
- [Full API Reference](https://developers.api.llamaindex.ai/api/python)

## Installation

```sh
pip install llama_cloud
```

## Quick Start

```python
import os

from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)

# Parse a document
job = client.parsing.create(
    tier="agentic",
    version="latest",
    file_id="your-file-id",
)
print(job.id)
```

## File Uploads

```python
from pathlib import Path

from llama_cloud import LlamaCloud

client = LlamaCloud()

# Upload using a Path
client.files.create(
    file=Path("/path/to/document.pdf"),
    purpose="parse",
)

# Or using bytes with a tuple of (filename, contents, media_type)
client.files.create(
    file=("document.txt", b"content", "text/plain"),
    purpose="parse",
)
```

## Async Usage

```python
import asyncio

from llama_cloud import AsyncLlamaCloud

client = AsyncLlamaCloud()


async def main():
    job = await client.parsing.create(
        tier="agentic",
        version="latest",
        file_id="your-file-id",
    )
    print(job.id)


asyncio.run(main())
```

## MCP Server

Use the Llama Cloud MCP Server to enable AI assistants to interact with the API:

[![Add to
Cursor](https://cursor.com/en-US/install-mcp?name=%40llamaindex%2Fllama-cloud-mcp&config=eyJuYW1lIjoiQGxsYW1haW5kZXgvbGxhbWEtY2xvdWQtbWNwIiwidHJhbnNwb3J0IjoiaHR0cCIsInVybCI6Imh0dHBzOi8vbGxhbWFjbG91ZC1wcm9kLnN0bG1jcC5jb20iLCJoZWFkZXJzIjp7IngtbGxhbWEtY2xvdWQtYXBpLWtleSI6Ik15IEFQSSBLZXkifX0) [![Install in VS Code](https://img.shields.io/badge/_-Add_to_VS_Code-blue?style=for-the-badge&logo=data:image/svg%2bxml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIGZpbGw9Im5vbmUiIHZpZXdCb3g9IjAgMCA0MCA0MCI+PHBhdGggZmlsbD0iI0VFRSIgZmlsbC1ydWxlPSJldmVub2RkIiBkPSJNMzAuMjM1IDM5Ljg4NGEyLjQ5MSAyLjQ5MSAwIDAgMS0xLjc4MS0uNzMwTDEyLjcgMjQuNzhsLTMuNDYgMi42MjQtMy40MDYgMi41ODJhMS42NjUgMS42NjUgMCAwIDEtMS4wODIuMzM4IDEuNjY0IDEuNjY0IDAgMCAxLTEuMDQ2LS40MzFsLTIuMi0yYTEuNjY2IDEuNjY2IDAgMCAxIDAtMi40NjNMNy40NTggMjAgNC42NyAxNy40NTMgMS41MDcgMTQuNTdhMS42NjUgMS42NjUgMCAwIDEgMC0yLjQ2M2wyLjItMmExLjY2NSAxLjY2NSAwIDAgMSAyLjEzLS4wOTdsNi44NjMgNS4yMDlMMjguNDUyLjg0NGEyLjQ4OCAyLjQ4OCAwIDAgMSAxLjg0MS0uNzI5Yy4zNTEuMDA5LjY5OS4wOTEgMS4wMTkuMjQ1bDguMjM2IDMuOTYxYTIuNSAyLjUgMCAwIDEgMS40MTUgMi4yNTN2LjA5OS0uMDQ1VjMzLjM3di0uMDQ1LjA5NWEyLjUwMSAyLjUwMSAwIDAgMS0xLjQxNiAyLjI1N2wtOC4yMzUgMy45NjFhMi40OTIgMi40OTIgMCAwIDEtMS4wNzcuMjQ2Wm0uNzE2LTI4Ljk0Ny0xMS45NDggOS4wNjIgMTEuOTUyIDkuMDY1LS4wMDQtMTguMTI3WiIvPjwvc3ZnPg==)](https://vscode.stainless.com/mcp/%7B%22name%22%3A%22%40llamaindex%2Fllama-cloud-mcp%22%2C%22type%22%3A%22http%22%2C%22url%22%3A%22https%3A%2F%2Fllamacloud-prod.stlmcp.com%22%2C%22headers%22%3A%7B%22x-llama-cloud-api-key%22%3A%22My%20API%20Key%22%7D%7D)

## Error Handling

When the API returns a non-success status code, an `APIError` subclass is raised:

```python
import llama_cloud
from llama_cloud import LlamaCloud

client = LlamaCloud()

try:
    client.pipelines.list(project_id="my-project-id")
except llama_cloud.APIError as e:
    print(e.status_code)  # 400
    print(e.__class__.__name__)  # BadRequestError
```

| Status Code | Error Type                 |
| ----------- | -------------------------- |
| 400         | `BadRequestError`          |
| 401         | `AuthenticationError`      |
| 403         | `PermissionDeniedError`    |
| 404         | `NotFoundError`            |
| 422         | `UnprocessableEntityError` |
| 429         | `RateLimitError`           |
| >=500       | `InternalServerError`      |
| N/A         | `APIConnectionError`       |

## Retries and Timeouts

The SDK automatically retries requests 2 times on connection errors, timeouts, rate limits, and 5xx errors. Requests time out after 1 minute by default. Functions that combine multiple API calls (e.g. `client.parsing.parse()`) have larger default timeouts to account for the multiple requests and polling.

```python
client = LlamaCloud(
    max_retries=0,  # Disable retries (default: 2)
    timeout=30.0,  # 30 second timeout (default: 1 minute)
)
```

## Pagination

List methods support auto-pagination with `for` loops:

```python
for run in client.extraction.runs.list(
    extraction_agent_id="agent-id",
    limit=20,
):
    print(run)
```

Or fetch one page at a time:

```python
page = client.extraction.runs.list(extraction_agent_id="agent-id", limit=20)

while True:
    for run in page.items:
        print(run)
    if not page.has_next_page():
        break
    page = page.get_next_page()
```

## Logging

Configure logging via the `LLAMA_CLOUD_LOG` environment variable or the `log` option:

```python
client = LlamaCloud(
    log="debug",  # "debug" | "info" | "warn" | "error" | "off"
)
```

## Requirements

- Python 3.9+

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md).
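## Putting It Together

As a sketch of how the upload and parse calls combine into one workflow: the helper below is not part of the SDK (`upload_and_parse` is a hypothetical name), and it assumes the `files.create` response exposes the new file's ID as `.id`. It requires `LLAMA_CLOUD_API_KEY` to be set when called.

```python
from pathlib import Path


def upload_and_parse(path: str, tier: str = "agentic") -> str:
    """Upload a local file, then start a parse job for it.

    Hypothetical convenience helper; returns the new job's ID.
    """
    # Imported lazily so this module loads even without credentials.
    from llama_cloud import LlamaCloud

    client = LlamaCloud()  # reads LLAMA_CLOUD_API_KEY from the environment
    uploaded = client.files.create(file=Path(path), purpose="parse")
    job = client.parsing.create(
        tier=tier,
        version="latest",
        file_id=uploaded.id,  # assumes the upload response carries an `id` field
    )
    return job.id
```

For one-off scripts, the single-call `client.parsing.parse()` convenience (mentioned under Retries and Timeouts) may be preferable, since it also handles polling for you.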