List Jobs

GET /api/v1/extraction/jobs

Query Parameters
extraction_agent_id: string
Cookie Parameters
session: optional string
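
As a rough illustration of how the required `extraction_agent_id` query parameter fits into a request, here is a minimal Python sketch that builds (but does not send) the call using only the standard library. The base URL and Bearer header are taken from the curl example on this page; the function name `build_list_jobs_request` is hypothetical and not part of any official SDK.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the curl example on this page.
BASE_URL = "https://api.cloud.llamaindex.ai"


def build_list_jobs_request(extraction_agent_id: str, api_key: str) -> Request:
    """Build a GET request for /api/v1/extraction/jobs (hypothetical helper)."""
    # extraction_agent_id is sent as a URL query parameter.
    query = urlencode({"extraction_agent_id": extraction_agent_id})
    return Request(
        f"{BASE_URL}/api/v1/extraction/jobs?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

The returned object can be passed to `urllib.request.urlopen`, and the response body decoded with `json.load`, to obtain the list of jobs.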
Returns
id: string

The id of the extraction job

format: uuid
extraction_agent: ExtractAgent { id, config, data_schema, 5 more }

The agent that the job was run on.

id: string

The id of the extraction agent.

format: uuid
config: ExtractConfig { chunk_mode, citation_bbox, cite_sources, 13 more }

The configuration parameters for the extraction agent.

chunk_mode: optional "PAGE" or "SECTION"

The mode to use for chunking the document.

Accepts one of the following:
"PAGE"
"SECTION"
citation_bbox: optional boolean (deprecated)

Whether to fetch citation bounding boxes for the extraction. Only available in PREMIUM mode. Deprecated: this is now synonymous with cite_sources.

cite_sources: optional boolean

Whether to cite sources for the extraction.

confidence_scores: optional boolean

Whether to fetch confidence scores for the extraction.

extract_model: optional "openai-gpt-4-1" or "openai-gpt-4-1-mini" or "openai-gpt-4-1-nano" or 8 more or string

The extract model to use for data extraction. If not provided, uses the default for the extraction mode.

Accepts one of the following:
ExtractModels = "openai-gpt-4-1" or "openai-gpt-4-1-mini" or "openai-gpt-4-1-nano" or 8 more

Extract model options.

Accepts one of the following:
"openai-gpt-4-1"
"openai-gpt-4-1-mini"
"openai-gpt-4-1-nano"
"openai-gpt-5"
"openai-gpt-5-mini"
"gemini-2.0-flash"
"gemini-2.5-flash"
"gemini-2.5-flash-lite"
"gemini-2.5-pro"
"openai-gpt-4o"
"openai-gpt-4o-mini"
UnionMember1 = string
extraction_mode: optional "FAST" or "BALANCED" or "PREMIUM" or "MULTIMODAL"

The extraction mode specified (FAST, BALANCED, MULTIMODAL, PREMIUM).

Accepts one of the following:
"FAST"
"BALANCED"
"PREMIUM"
"MULTIMODAL"
extraction_target: optional "PER_DOC" or "PER_PAGE" or "PER_TABLE_ROW"

The extraction target specified.

Accepts one of the following:
"PER_DOC"
"PER_PAGE"
"PER_TABLE_ROW"
high_resolution_mode: optional boolean

Whether to use high resolution mode for the extraction.

invalidate_cache: optional boolean

Whether to invalidate the cache for the extraction.

multimodal_fast_mode: optional boolean

DEPRECATED: Whether to use fast mode for multimodal extraction.

num_pages_context: optional number

Number of pages to pass as context on long document extraction.

minimum: 1
page_range: optional string

Comma-separated list of page numbers or ranges to extract from (1-based, e.g., '1,3,5-7,9' or '1-3,8-10').
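
To make the `page_range` format concrete, the following sketch expands a range string into explicit 1-based page numbers. It is a client-side illustration only; the function name `expand_page_range` is hypothetical, and the service's own parsing and validation rules may differ.

```python
def expand_page_range(page_range: str) -> list[int]:
    """Expand a string such as '1,3,5-7,9' into sorted 1-based page numbers."""
    pages: set[int] = set()
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_s, end_s = part.split("-", 1)
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(part)
        # Pages are 1-based and a range must not run backwards.
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

For the first example string above, `expand_page_range("1,3,5-7,9")` returns `[1, 3, 5, 6, 7, 9]`.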

parse_model: optional "openai-gpt-4o" or "openai-gpt-4o-mini" or "openai-gpt-4-1" or 23 more

Public model names.

Accepts one of the following:
"openai-gpt-4o"
"openai-gpt-4o-mini"
"openai-gpt-4-1"
"openai-gpt-4-1-mini"
"openai-gpt-4-1-nano"
"openai-gpt-5"
"openai-gpt-5-mini"
"openai-gpt-5-nano"
"openai-text-embedding-3-large"
"openai-text-embedding-3-small"
"openai-whisper-1"
"anthropic-sonnet-3.5"
"anthropic-sonnet-3.5-v2"
"anthropic-sonnet-3.7"
"anthropic-sonnet-4.0"
"anthropic-sonnet-4.5"
"anthropic-haiku-3.5"
"anthropic-haiku-4.5"
"gemini-2.5-flash"
"gemini-3.0-pro"
"gemini-2.5-pro"
"gemini-2.0-flash"
"gemini-2.0-flash-lite"
"gemini-2.5-flash-lite"
"gemini-1.5-flash"
"gemini-1.5-pro"
priority: optional "low" or "medium" or "high" or "critical"

The priority for the request. This field may be ignored or overwritten depending on the organization tier.

Accepts one of the following:
"low"
"medium"
"high"
"critical"
system_prompt: optional string

The system prompt to use for the extraction.

use_reasoning: optional boolean

Whether to use reasoning for the extraction.

data_schema: map[map[unknown] or array of unknown or string or 2 more]

The schema of the data.

Accepts one of the following:
UnionMember0 = map[unknown]
UnionMember1 = array of unknown
UnionMember2 = string
UnionMember3 = number
UnionMember4 = boolean
name: string

The name of the extraction agent.

project_id: string

The ID of the project that the extraction agent belongs to.

format: uuid
created_at: optional string

The creation time of the extraction agent.

format: date-time
custom_configuration: optional "default"

Custom configuration type for the extraction agent. Currently supports 'default'.

updated_at: optional string

The last update time of the extraction agent.

format: date-time
status: "PENDING" or "SUCCESS" or "ERROR" or 2 more

The status of the extraction job

Accepts one of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
error: optional string

The error that occurred during extraction
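
One way to work with the `status` and `error` fields is to tally jobs by status and collect any error messages. The sketch below assumes the response has already been decoded into a list of dictionaries; `summarize_statuses` is a hypothetical helper name, not part of the API.

```python
from collections import Counter

# Status values copied from the list above.
JOB_STATUSES = ("PENDING", "SUCCESS", "ERROR", "PARTIAL_SUCCESS", "CANCELLED")


def summarize_statuses(jobs):
    """Return (count per status, {job id: error message}) for a list of job dicts."""
    counts = Counter()
    errors = {}
    for job in jobs:
        status = job["status"]
        if status not in JOB_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        counts[status] += 1
        # `error` is optional, so it may be missing or null.
        if job.get("error"):
            errors[job["id"]] = job["error"]
    return counts, errors
```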

file: optional File { id, name, project_id, 11 more } (deprecated)

Schema for a file.

id: string

Unique identifier

format: uuid
name: string
project_id: string

The ID of the project that the file belongs to

format: uuid
created_at: optional string

Creation datetime

format: date-time
data_source_id: optional string

The ID of the data source that the file belongs to

format: uuid
expires_at: optional string

The expiration date for the file. Files past this date can be deleted.

format: date-time
external_file_id: optional string

The ID of the file in the external system

file_size: optional number

Size of the file in bytes

minimum: 0
file_type: optional string

File type (e.g. pdf, docx, etc.)

maxLength: 3000
minLength: 1
last_modified_at: optional string

The last modified time of the file

format: date-time
permission_info: optional map[map[unknown] or array of unknown or string or 2 more]

Permission information for the file

Accepts one of the following:
UnionMember0 = map[unknown]
UnionMember1 = array of unknown
UnionMember2 = string
UnionMember3 = number
UnionMember4 = boolean
purpose: optional string

The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify')

resource_info: optional map[map[unknown] or array of unknown or string or 2 more]

Resource information for the file

Accepts one of the following:
UnionMember0 = map[unknown]
UnionMember1 = array of unknown
UnionMember2 = string
UnionMember3 = number
UnionMember4 = boolean
updated_at: optional string

Update datetime

format: date-time
file_id: optional string

The id of the file that the data was extracted from

format: uuid

List Jobs

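# extraction_agent_id is a required query parameter; $EXTRACTION_AGENT_ID is a placeholder variable.
curl "https://api.cloud.llamaindex.ai/api/v1/extraction/jobs?extraction_agent_id=$EXTRACTION_AGENT_ID" \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"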
Returns Examples
[
  {
    "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
    "extraction_agent": {
      "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
      "config": {
        "chunk_mode": "PAGE",
        "citation_bbox": true,
        "cite_sources": true,
        "confidence_scores": true,
        "extract_model": "openai-gpt-4-1",
        "extraction_mode": "FAST",
        "extraction_target": "PER_DOC",
        "high_resolution_mode": true,
        "invalidate_cache": true,
        "multimodal_fast_mode": true,
        "num_pages_context": 1,
        "page_range": "page_range",
        "parse_model": "openai-gpt-4o",
        "priority": "low",
        "system_prompt": "system_prompt",
        "use_reasoning": true
      },
      "data_schema": {
        "foo": {
          "foo": "bar"
        }
      },
      "name": "name",
      "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
      "created_at": "2019-12-27T18:11:19.117Z",
      "custom_configuration": "default",
      "updated_at": "2019-12-27T18:11:19.117Z"
    },
    "status": "PENDING",
    "error": "error",
    "file": {
      "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
      "name": "x",
      "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
      "created_at": "2019-12-27T18:11:19.117Z",
      "data_source_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
      "expires_at": "2019-12-27T18:11:19.117Z",
      "external_file_id": "external_file_id",
      "file_size": 0,
      "file_type": "x",
      "last_modified_at": "2019-12-27T18:11:19.117Z",
      "permission_info": {
        "foo": {
          "foo": "bar"
        }
      },
      "purpose": "purpose",
      "resource_info": {
        "foo": {
          "foo": "bar"
        }
      },
      "updated_at": "2019-12-27T18:11:19.117Z"
    },
    "file_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e"
  }
]
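
A response shaped like the example above could be reduced to a few fields of interest as follows. This is a sketch under stated assumptions: the `JobSummary` class and `summarize_jobs` function are hypothetical names, and only fields documented on this page are read. Optional fields (`error`, `file_id`) are read with `.get` because they may be absent.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobSummary:
    """A trimmed-down, illustrative view of one extraction job."""
    id: str
    agent_name: str
    status: str
    error: Optional[str]
    file_id: Optional[str]


def summarize_jobs(payload: str) -> list[JobSummary]:
    """Parse the JSON array returned by the endpoint into JobSummary objects."""
    return [
        JobSummary(
            id=job["id"],
            agent_name=job["extraction_agent"]["name"],
            status=job["status"],
            error=job.get("error"),      # optional field
            file_id=job.get("file_id"),  # optional field
        )
        for job in json.loads(payload)
    ]
```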