Runs
List Extract Runs
Get Run
Delete Extraction Run
Get Run By Job Id
Models
ExtractConfig = object { chunk_mode, citation_bbox, cite_sources, 13 more }
Configuration parameters for the extraction agent.
chunk_mode: optional "PAGE" or "SECTION"
The mode to use for chunking the document.
citation_bbox: optional boolean (deprecated)
Whether to fetch citation bounding boxes for the extraction. Only available in PREMIUM mode. Deprecated: this is now synonymous with cite_sources.
cite_sources: optional boolean
Whether to cite sources for the extraction.
confidence_scores: optional boolean
Whether to fetch confidence scores for the extraction.
extract_model: optional "openai-gpt-4-1" or "openai-gpt-4-1-mini" or "openai-gpt-4-1-nano" or 8 more or string
The extract model to use for data extraction. If not provided, uses the default for the extraction mode.
ExtractModels = "openai-gpt-4-1" or "openai-gpt-4-1-mini" or "openai-gpt-4-1-nano" or 8 more
Extract model options.
extraction_mode: optional "FAST" or "BALANCED" or "PREMIUM" or "MULTIMODAL"
The extraction mode specified (FAST, BALANCED, MULTIMODAL, PREMIUM).
extraction_target: optional "PER_DOC" or "PER_PAGE" or "PER_TABLE_ROW"
The extraction target specified.
high_resolution_mode: optional boolean
Whether to use high resolution mode for the extraction.
invalidate_cache: optional boolean
Whether to invalidate the cache for the extraction.
multimodal_fast_mode: optional boolean
DEPRECATED: Whether to use fast mode for multimodal extraction.
num_pages_context: optional number
Number of pages to pass as context on long document extraction.
page_range: optional string
Comma-separated list of page numbers or ranges to extract from (1-based, e.g., '1,3,5-7,9' or '1-3,8-10').
parse_model: optional "openai-gpt-4o" or "openai-gpt-4o-mini" or "openai-gpt-4-1" or 23 more
Public model names.
priority: optional "low" or "medium" or "high" or "critical"
The priority for the request. This field may be ignored or overwritten depending on the organization tier.
system_prompt: optional string
The system prompt to use for the extraction.
use_reasoning: optional boolean
Whether to use reasoning for the extraction.
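To illustrate the fields above, the following sketch assembles an ExtractConfig payload as a plain dictionary and checks the enumerated values before it would be sent. The helper names `build_extract_config` and `expand_page_range` are hypothetical and not part of any SDK. The allowed values are copied from the field list above, and the client-side validation is only an assumption about what a caller might find useful, not a description of how the service validates requests.

```python
import json

# Allowed values copied from the ExtractConfig field list above.
ALLOWED = {
    "chunk_mode": {"PAGE", "SECTION"},
    "extraction_mode": {"FAST", "BALANCED", "PREMIUM", "MULTIMODAL"},
    "extraction_target": {"PER_DOC", "PER_PAGE", "PER_TABLE_ROW"},
    "priority": {"low", "medium", "high", "critical"},
}


def expand_page_range(page_range: str) -> list[int]:
    """Expand a 1-based page_range string such as '1,3,5-7,9' into page numbers."""
    pages: list[int] = []
    for part in page_range.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range segment: {part!r}")
        pages.extend(range(start, end + 1))
    return pages


def build_extract_config(**fields) -> dict:
    """Hypothetical helper: validate enum fields and drop unset (None) options."""
    config = {key: value for key, value in fields.items() if value is not None}
    for key, allowed in ALLOWED.items():
        if key in config and config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    if "page_range" in config:
        expand_page_range(config["page_range"])  # raises ValueError if malformed
    return config


config = build_extract_config(
    extraction_mode="BALANCED",
    extraction_target="PER_DOC",
    cite_sources=True,  # used instead of the deprecated citation_bbox
    page_range="1,3,5-7,9",
    system_prompt=None,  # unset, so it is omitted from the payload
)
print(json.dumps(config, indent=2))
```

Dropping unset options keeps the payload minimal, so fields such as extract_model fall back to the default for the chosen extraction mode, as described above.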
ExtractRun = object { id, config, data_schema, 12 more }
Schema for an extraction run.
id: string
The id of the extraction run
config: ExtractConfig
The config used for extraction
data_schema: map[map[unknown] or array of unknown or string or 2 more]
The schema used for extraction
extraction_agent_id: string
The id of the extraction agent
from_ui: boolean
Whether this extraction run was triggered from the UI
project_id: string
The id of the project that the extraction run belongs to
status: "CREATED" or "PENDING" or "SUCCESS" or "ERROR"
The status of the extraction run
created_at: optional string
Creation datetime
data: optional map[map[unknown] or array of unknown or string or 2 more] or array of map[map[unknown] or array of unknown or string or 2 more]
The data extracted from the file
UnionMember0 = map[map[unknown] or array of unknown or string or 2 more]
UnionMember1 = array of map[map[unknown] or array of unknown or string or 2 more]
error: optional string
The error that occurred during extraction
extraction_metadata: optional map[map[unknown] or array of unknown or string or 2 more]
The metadata extracted from the file
file: optional object { id, project_id, created_at, 10 more }
Schema for a file.
id: string
Unique identifier
project_id: string
The ID of the project that the file belongs to
created_at: optional string
Creation datetime
data_source_id: optional string
The ID of the data source that the file belongs to
expires_at: optional string
The expiration date for the file. Files past this date can be deleted.
external_file_id: optional string
The ID of the file in the external system
file_size: optional number
Size of the file in bytes
file_type: optional string
File type (e.g. pdf, docx, etc.)
last_modified_at: optional string
The last modified time of the file
permission_info: optional map[map[unknown] or array of unknown or string or 2 more]
Permission information for the file
purpose: optional string
The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify')
resource_info: optional map[map[unknown] or array of unknown or string or 2 more]
Resource information for the file
updated_at: optional string
Update datetime
file_id: optional string
The id of the file that the data was extracted from
job_id: optional string
The id of the job that the extraction run belongs to
updated_at: optional string
Update datetime
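The sketch below shows one way to consume an ExtractRun that has already been decoded from JSON into a dictionary. It relies on two assumptions that the reference does not state: that SUCCESS and ERROR are final statuses while CREATED and PENDING mean the run is still in progress, and that callers want `data` as a list regardless of which union member was returned. The helper names `is_finished` and `extracted_records` are hypothetical, and the sample payload is made up for illustration.

```python
from typing import Any

# Assumption: SUCCESS and ERROR are final; CREATED and PENDING are still in progress.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def is_finished(run: dict[str, Any]) -> bool:
    """Return True once the run has reached a status assumed to be final."""
    return run["status"] in TERMINAL_STATUSES


def extracted_records(run: dict[str, Any]) -> list[dict[str, Any]]:
    """Return run['data'] as a list, whichever union member it is."""
    if run["status"] == "ERROR":
        raise RuntimeError(run.get("error") or "extraction failed")
    data = run.get("data")
    if data is None:
        return []
    if isinstance(data, dict):
        return [data]  # single-object member of the union
    return list(data)  # array member of the union


# Illustrative payload only; the id and values are made up.
sample_run = {
    "id": "run-123",
    "status": "SUCCESS",
    "data": {"invoice_number": "INV-001", "total": 42.5},
}

if is_finished(sample_run):
    for record in extracted_records(sample_run):
        print(record)
```

Normalizing `data` to a list means the same loop handles a run whose extraction_target yields one object and a run that yields one object per page or table row.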