Beta

Beta Agent Data

Get Agent Data

client.beta.agentData.get(, ?, ?): AgentData { data, deployment_name, id, 4 more }

GET /api/v1/beta/agent-data/{item_id}

Update Agent Data

client.beta.agentData.update(, , ?): AgentData { data, deployment_name, id, 4 more }

PUT /api/v1/beta/agent-data/{item_id}

Delete Agent Data

client.beta.agentData.delete(, ?, ?): AgentDataDeleteResponse

DELETE /api/v1/beta/agent-data/{item_id}

Create Agent Data

client.beta.agentData.agentData(, ?): AgentData { data, deployment_name, id, 4 more }

POST /api/v1/beta/agent-data

Search Agent Data

client.beta.agentData.search(, ?): PaginatedCursorPost<AgentData { data, deployment_name, id, 4 more } >

POST /api/v1/beta/agent-data/:search

Aggregate Agent Data

client.beta.agentData.aggregate(, ?): PaginatedCursorPost<AgentDataAggregateResponse { group_key, count, first_item } >

POST /api/v1/beta/agent-data/:aggregate

Delete Agent Data By Query

client.beta.agentData.deleteByQuery(, ?): AgentDataDeleteByQueryResponse { deleted_count }

POST /api/v1/beta/agent-data/:delete

Models

AgentData { data, deployment_name, id, 4 more }

API Result for a single agent data item

data: Record<string, unknown>

deployment_name: string

id?: string | null

collection?: string

created_at?: string | null

project_id?: string | null

updated_at?: string | null
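
The field list above can be written down as a TypeScript shape and checked at runtime when a response arrives as untyped JSON. The sketch below is illustrative only: `isAgentData` is a hypothetical helper (not part of the SDK), it checks only the two required fields, and the sample values (`my-agent`, `default`) are made up.

```typescript
// Mirrors the AgentData fields listed above; optional fields may be absent or null.
interface AgentData {
  data: Record<string, unknown>;
  deployment_name: string;
  id?: string | null;
  collection?: string;
  created_at?: string | null;
  project_id?: string | null;
  updated_at?: string | null;
}

// Hypothetical helper: narrows an unknown JSON value to AgentData by
// checking the two required fields (data and deployment_name).
function isAgentData(value: unknown): value is AgentData {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.deployment_name === "string" &&
    typeof v.data === "object" &&
    v.data !== null &&
    !Array.isArray(v.data)
  );
}

// Illustrative payload; the values are placeholders.
const sample: unknown = JSON.parse(
  '{"data": {"score": 0.9}, "deployment_name": "my-agent", "collection": "default"}'
);
const sampleIsAgentData = isAgentData(sample);
```

A guard like this is mainly useful at the HTTP boundary; code that goes through the typed SDK methods already receives `AgentData` values.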

Beta Parse Configurations

Create Parse Configuration

client.beta.parseConfigurations.create(, ?): ParseConfiguration { id, created_at, name, 6 more }

POST /api/v1/beta/parse-configurations

List Parse Configurations

client.beta.parseConfigurations.list(?, ?): PaginatedCursor<ParseConfiguration { id, created_at, name, 6 more } >

GET /api/v1/beta/parse-configurations

Get Parse Configuration

client.beta.parseConfigurations.get(, ?, ?): ParseConfiguration { id, created_at, name, 6 more }

GET /api/v1/beta/parse-configurations/{config_id}

Update Parse Configuration

client.beta.parseConfigurations.update(, , ?): ParseConfiguration { id, created_at, name, 6 more }

PUT /api/v1/beta/parse-configurations/{config_id}

Delete Parse Configuration

client.beta.parseConfigurations.delete(, ?, ?): void

DELETE /api/v1/beta/parse-configurations/{config_id}

Models

ParseConfiguration { id, created_at, name, 6 more }

Parse configuration schema.

id: string

Unique identifier for the parse configuration

created_at: string

Creation timestamp

format: date-time

name: string

Name of the parse configuration

parameters: LlamaParseParameters { adaptive_long_table, aggressive_table_extraction, annotate_links, 116 more }

LlamaParseParameters configuration

adaptive_long_table?: boolean | null

aggressive_table_extraction?: boolean | null

annotate_links?: boolean | null

auto_mode?: boolean | null

auto_mode_configuration_json?: string | null

auto_mode_trigger_on_image_in_page?: boolean | null

auto_mode_trigger_on_regexp_in_page?: string | null

auto_mode_trigger_on_table_in_page?: boolean | null

auto_mode_trigger_on_text_in_page?: string | null

azure_openai_api_version?: string | null

azure_openai_deployment_name?: string | null

azure_openai_endpoint?: string | null

azure_openai_key?: string | null

bbox_bottom?: number | null

bbox_left?: number | null

bbox_right?: number | null

bbox_top?: number | null

bounding_box?: string | null

compact_markdown_table?: boolean | null

complemental_formatting_instruction?: string | null

content_guideline_instruction?: string | null

continuous_mode?: boolean | null

disable_image_extraction?: boolean | null

disable_ocr?: boolean | null

disable_reconstruction?: boolean | null

do_not_cache?: boolean | null

do_not_unroll_columns?: boolean | null

enable_cost_optimizer?: boolean | null

extract_charts?: boolean | null

extract_layout?: boolean | null

extract_printed_page_number?: boolean | null

fast_mode?: boolean | null

formatting_instruction?: string | null

gpt4o_api_key?: string | null

gpt4o_mode?: boolean | null

guess_xlsx_sheet_name?: boolean | null

hide_footers?: boolean | null

hide_headers?: boolean | null

high_res_ocr?: boolean | null

html_make_all_elements_visible?: boolean | null

html_remove_fixed_elements?: boolean | null

html_remove_navigation_elements?: boolean | null

http_proxy?: string | null

ignore_document_elements_for_layout_detection?: boolean | null

images_to_save?: Array<"screenshot" | "embedded" | "layout"> | null

Accepts one of the following:

"screenshot"

"embedded"

"layout"

inline_images_in_markdown?: boolean | null

input_s3_path?: string | null

input_s3_region?: string | null

input_url?: string | null

internal_is_screenshot_job?: boolean | null

invalidate_cache?: boolean | null

is_formatting_instruction?: boolean | null

job_timeout_extra_time_per_page_in_seconds?: number | null

job_timeout_in_seconds?: number | null

keep_page_separator_when_merging_tables?: boolean | null

languages?: Array<ParsingLanguages>

Accepts one of the following:

"af"

"az"

"bs"

"cs"

"cy"

"da"

"de"

"en"

"es"

"et"

"fr"

"ga"

"hr"

"hu"

"id"

"is"

"it"

"ku"

"la"

"lt"

"lv"

"mi"

"ms"

"mt"

"nl"

"no"

"oc"

"pi"

"pl"

"pt"

"ro"

"rs_latin"

"sk"

"sl"

"sq"

"sv"

"sw"

"tl"

"tr"

"uz"

"vi"

"ar"

"fa"

"ug"

"ur"

"bn"

"as"

"mni"

"ru"

"rs_cyrillic"

"be"

"bg"

"uk"

"mn"

"abq"

"ady"

"kbd"

"ava"

"dar"

"inh"

"che"

"lbe"

"lez"

"tab"

"tjk"

"hi"

"mr"

"ne"

"bh"

"mai"

"ang"

"bho"

"mah"

"sck"

"new"

"gom"

"sa"

"bgc"

"th"

"ch_sim"

"ch_tra"

"ja"

"ko"

"ta"

"te"

"kn"

layout_aware?: boolean | null

line_level_bounding_box?: boolean | null

markdown_table_multiline_header_separator?: string | null

max_pages?: number | null

max_pages_enforced?: number | null

merge_tables_across_pages_in_markdown?: boolean | null

model?: string | null

outlined_table_extraction?: boolean | null

output_pdf_of_document?: boolean | null

output_s3_path_prefix?: string | null

output_s3_region?: string | null

output_tables_as_HTML?: boolean | null

page_error_tolerance?: number | null

page_footer_prefix?: string | null

page_footer_suffix?: string | null

page_header_prefix?: string | null

page_header_suffix?: string | null

page_prefix?: string | null

page_separator?: string | null

page_suffix?: string | null

parse_mode?: ParsingMode | null

Enum for representing the mode of parsing to be used.

Accepts one of the following:

"parse_page_without_llm"

"parse_page_with_llm"

"parse_page_with_lvm"

"parse_page_with_agent"

"parse_page_with_layout_agent"

"parse_document_with_llm"

"parse_document_with_lvm"

"parse_document_with_agent"

parsing_instruction?: string | null

precise_bounding_box?: boolean | null

premium_mode?: boolean | null

presentation_out_of_bounds_content?: boolean | null

presentation_skip_embedded_data?: boolean | null

preserve_layout_alignment_across_pages?: boolean | null

preserve_very_small_text?: boolean | null

preset?: string | null

priority?: "low" | "medium" | "high" | "critical" | null

The priority for the request. This field may be ignored or overwritten depending on the organization tier.

Accepts one of the following:

"low"

"medium"

"high"

"critical"

project_id?: string | null

remove_hidden_text?: boolean | null

replace_failed_page_mode?: FailPageMode | null

Enum for representing the different available page error handling modes.

Accepts one of the following:

"raw_text"

"blank_page"

"error_message"

replace_failed_page_with_error_message_prefix?: string | null

replace_failed_page_with_error_message_suffix?: string | null

save_images?: boolean | null

skip_diagonal_text?: boolean | null

specialized_chart_parsing_agentic?: boolean | null

specialized_chart_parsing_efficient?: boolean | null

specialized_chart_parsing_plus?: boolean | null

specialized_image_parsing?: boolean | null

spreadsheet_extract_sub_tables?: boolean | null

spreadsheet_force_formula_computation?: boolean | null

spreadsheet_include_hidden_sheets?: boolean | null

strict_mode_buggy_font?: boolean | null

strict_mode_image_extraction?: boolean | null

strict_mode_image_ocr?: boolean | null

strict_mode_reconstruction?: boolean | null

structured_output?: boolean | null

structured_output_json_schema?: string | null

structured_output_json_schema_name?: string | null

system_prompt?: string | null

system_prompt_append?: string | null

take_screenshot?: boolean | null

target_pages?: string | null

tier?: string | null

use_vendor_multimodal_model?: boolean | null

user_prompt?: string | null

vendor_multimodal_api_key?: string | null

vendor_multimodal_model_name?: string | null

version?: string | null

webhook_configurations?: Array<WebhookConfiguration { webhook_events, webhook_headers, webhook_output_format, webhook_url } > | null

The outbound webhook configurations

webhook_events?: Array<"extract.pending" | "extract.success" | "extract.error" | 14 more> | null

List of event names to subscribe to

Accepts one of the following:

"extract.pending"

"extract.success"

"extract.error"

"extract.partial_success"

"extract.cancelled"

"parse.pending"

"parse.running"

"parse.success"

"parse.error"

"parse.partial_success"

"parse.cancelled"

"classify.pending"

"classify.success"

"classify.error"

"classify.partial_success"

"classify.cancelled"

"unmapped_event"

webhook_headers?: Record<string, string> | null

Custom HTTP headers to include with webhook requests.

webhook_output_format?: string | null

The output format to use for the webhook. Defaults to string if none supplied. Currently supported values: string, json

webhook_url?: string | null

The URL to send webhook notifications to.

webhook_url?: string | null

source_id: string

ID of the source

source_type: string

Type of the source (e.g., 'project')

updated_at: string

Last update timestamp

format: date-time

version: string

Version of the configuration

creator?: string | null

Creator of the configuration

ParseConfigurationCreate { name, parameters, version, 3 more }

Schema for creating a new parse configuration (API boundary).

name: string

Name of the parse configuration

parameters: LlamaParseParameters { adaptive_long_table, aggressive_table_extraction, annotate_links, 116 more }

LlamaParseParameters configuration

adaptive_long_table?: boolean | null

aggressive_table_extraction?: boolean | null

annotate_links?: boolean | null

auto_mode?: boolean | null

auto_mode_configuration_json?: string | null

auto_mode_trigger_on_image_in_page?: boolean | null

auto_mode_trigger_on_regexp_in_page?: string | null

auto_mode_trigger_on_table_in_page?: boolean | null

auto_mode_trigger_on_text_in_page?: string | null

azure_openai_api_version?: string | null

azure_openai_deployment_name?: string | null

azure_openai_endpoint?: string | null

azure_openai_key?: string | null

bbox_bottom?: number | null

bbox_left?: number | null

bbox_right?: number | null

bbox_top?: number | null

bounding_box?: string | null

compact_markdown_table?: boolean | null

complemental_formatting_instruction?: string | null

content_guideline_instruction?: string | null

continuous_mode?: boolean | null

disable_image_extraction?: boolean | null

disable_ocr?: boolean | null

disable_reconstruction?: boolean | null

do_not_cache?: boolean | null

do_not_unroll_columns?: boolean | null

enable_cost_optimizer?: boolean | null

extract_charts?: boolean | null

extract_layout?: boolean | null

extract_printed_page_number?: boolean | null

fast_mode?: boolean | null

formatting_instruction?: string | null

gpt4o_api_key?: string | null

gpt4o_mode?: boolean | null

guess_xlsx_sheet_name?: boolean | null

hide_footers?: boolean | null

hide_headers?: boolean | null

high_res_ocr?: boolean | null

html_make_all_elements_visible?: boolean | null

html_remove_fixed_elements?: boolean | null

html_remove_navigation_elements?: boolean | null

http_proxy?: string | null

ignore_document_elements_for_layout_detection?: boolean | null

images_to_save?: Array<"screenshot" | "embedded" | "layout"> | null

Accepts one of the following:

"screenshot"

"embedded"

"layout"

inline_images_in_markdown?: boolean | null

input_s3_path?: string | null

input_s3_region?: string | null

input_url?: string | null

internal_is_screenshot_job?: boolean | null

invalidate_cache?: boolean | null

is_formatting_instruction?: boolean | null

job_timeout_extra_time_per_page_in_seconds?: number | null

job_timeout_in_seconds?: number | null

keep_page_separator_when_merging_tables?: boolean | null

languages?: Array<ParsingLanguages>

Accepts one of the following:

"af"

"az"

"bs"

"cs"

"cy"

"da"

"de"

"en"

"es"

"et"

"fr"

"ga"

"hr"

"hu"

"id"

"is"

"it"

"ku"

"la"

"lt"

"lv"

"mi"

"ms"

"mt"

"nl"

"no"

"oc"

"pi"

"pl"

"pt"

"ro"

"rs_latin"

"sk"

"sl"

"sq"

"sv"

"sw"

"tl"

"tr"

"uz"

"vi"

"ar"

"fa"

"ug"

"ur"

"bn"

"as"

"mni"

"ru"

"rs_cyrillic"

"be"

"bg"

"uk"

"mn"

"abq"

"ady"

"kbd"

"ava"

"dar"

"inh"

"che"

"lbe"

"lez"

"tab"

"tjk"

"hi"

"mr"

"ne"

"bh"

"mai"

"ang"

"bho"

"mah"

"sck"

"new"

"gom"

"sa"

"bgc"

"th"

"ch_sim"

"ch_tra"

"ja"

"ko"

"ta"

"te"

"kn"

layout_aware?: boolean | null

line_level_bounding_box?: boolean | null

markdown_table_multiline_header_separator?: string | null

max_pages?: number | null

max_pages_enforced?: number | null

merge_tables_across_pages_in_markdown?: boolean | null

model?: string | null

outlined_table_extraction?: boolean | null

output_pdf_of_document?: boolean | null

output_s3_path_prefix?: string | null

output_s3_region?: string | null

output_tables_as_HTML?: boolean | null

page_error_tolerance?: number | null

page_footer_prefix?: string | null

page_footer_suffix?: string | null

page_header_prefix?: string | null

page_header_suffix?: string | null

page_prefix?: string | null

page_separator?: string | null

page_suffix?: string | null

parse_mode?: ParsingMode | null

Enum for representing the mode of parsing to be used.

Accepts one of the following:

"parse_page_without_llm"

"parse_page_with_llm"

"parse_page_with_lvm"

"parse_page_with_agent"

"parse_page_with_layout_agent"

"parse_document_with_llm"

"parse_document_with_lvm"

"parse_document_with_agent"

parsing_instruction?: string | null

precise_bounding_box?: boolean | null

premium_mode?: boolean | null

presentation_out_of_bounds_content?: boolean | null

presentation_skip_embedded_data?: boolean | null

preserve_layout_alignment_across_pages?: boolean | null

preserve_very_small_text?: boolean | null

preset?: string | null

priority?: "low" | "medium" | "high" | "critical" | null

The priority for the request. This field may be ignored or overwritten depending on the organization tier.

Accepts one of the following:

"low"

"medium"

"high"

"critical"

project_id?: string | null

remove_hidden_text?: boolean | null

replace_failed_page_mode?: FailPageMode | null

Enum for representing the different available page error handling modes.

Accepts one of the following:

"raw_text"

"blank_page"

"error_message"

replace_failed_page_with_error_message_prefix?: string | null

replace_failed_page_with_error_message_suffix?: string | null

save_images?: boolean | null

skip_diagonal_text?: boolean | null

specialized_chart_parsing_agentic?: boolean | null

specialized_chart_parsing_efficient?: boolean | null

specialized_chart_parsing_plus?: boolean | null

specialized_image_parsing?: boolean | null

spreadsheet_extract_sub_tables?: boolean | null

spreadsheet_force_formula_computation?: boolean | null

spreadsheet_include_hidden_sheets?: boolean | null

strict_mode_buggy_font?: boolean | null

strict_mode_image_extraction?: boolean | null

strict_mode_image_ocr?: boolean | null

strict_mode_reconstruction?: boolean | null

structured_output?: boolean | null

structured_output_json_schema?: string | null

structured_output_json_schema_name?: string | null

system_prompt?: string | null

system_prompt_append?: string | null

take_screenshot?: boolean | null

target_pages?: string | null

tier?: string | null

use_vendor_multimodal_model?: boolean | null

user_prompt?: string | null

vendor_multimodal_api_key?: string | null

vendor_multimodal_model_name?: string | null

version?: string | null

webhook_configurations?: Array<WebhookConfiguration { webhook_events, webhook_headers, webhook_output_format, webhook_url } > | null

The outbound webhook configurations

webhook_events?: Array<"extract.pending" | "extract.success" | "extract.error" | 14 more> | null

List of event names to subscribe to

Accepts one of the following:

"extract.pending"

"extract.success"

"extract.error"

"extract.partial_success"

"extract.cancelled"

"parse.pending"

"parse.running"

"parse.success"

"parse.error"

"parse.partial_success"

"parse.cancelled"

"classify.pending"

"classify.success"

"classify.error"

"classify.partial_success"

"classify.cancelled"

"unmapped_event"

webhook_headers?: Record<string, string> | null

Custom HTTP headers to include with webhook requests.

webhook_output_format?: string | null

The output format to use for the webhook. Defaults to string if none supplied. Currently supported values: string, json

webhook_url?: string | null

The URL to send webhook notifications to.

webhook_url?: string | null

version: string

Version of the configuration

creator?: string | null

Creator of the configuration

source_id?: string | null

ID of the source

source_type?: string | null

Type of the source (e.g., 'project')
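
As a rough illustration of how the create schema fits together, the sketch below types a small subset of the fields listed above and builds one request body for `POST /api/v1/beta/parse-configurations`. The interface names ending in `Sketch`, the configuration name, the `version` value `"1"`, and the webhook URL are all placeholders, and `name` is assumed to be a string.

```typescript
// The eight ParsingMode values listed in the schema above.
type ParsingMode =
  | "parse_page_without_llm"
  | "parse_page_with_llm"
  | "parse_page_with_lvm"
  | "parse_page_with_agent"
  | "parse_page_with_layout_agent"
  | "parse_document_with_llm"
  | "parse_document_with_lvm"
  | "parse_document_with_agent";

interface WebhookConfigurationSketch {
  webhook_events?: string[] | null;
  webhook_headers?: Record<string, string> | null;
  webhook_output_format?: string | null; // "string" or "json" per the schema
  webhook_url?: string | null;
}

// Deliberately partial: only a handful of the LlamaParseParameters fields.
interface ParseParametersSketch {
  parse_mode?: ParsingMode | null;
  languages?: string[];
  max_pages?: number | null;
  webhook_configurations?: WebhookConfigurationSketch[] | null;
}

interface ParseConfigurationCreateSketch {
  name: string; // assumed to be a string
  parameters: ParseParametersSketch;
  version: string;
  creator?: string | null;
  source_id?: string | null;
  source_type?: string | null;
}

// Placeholder values throughout; adjust to your own project.
const createPayload: ParseConfigurationCreateSketch = {
  name: "invoices-default",
  version: "1",
  parameters: {
    parse_mode: "parse_page_with_llm",
    languages: ["en", "de"],
    max_pages: 50,
    webhook_configurations: [
      {
        webhook_url: "https://example.com/hooks/parse",
        webhook_events: ["parse.success", "parse.error"],
        webhook_output_format: "json",
      },
    ],
  },
};

// The JSON string that would be sent as the request body.
const createBody = JSON.stringify(createPayload);
```

Typing `parse_mode` as a string-literal union lets the compiler reject values outside the documented enum before a request is ever sent.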

ParseConfigurationQueryResponse { items, next_page_token, total_size }

Response schema for paginated parse configuration queries.

items: Array<ParseConfiguration { id, created_at, name, 6 more } >

The list of items.

id: string

Unique identifier for the parse configuration

created_at: string

Creation timestamp

format: date-time

name: string

Name of the parse configuration

parameters: LlamaParseParameters { adaptive_long_table, aggressive_table_extraction, annotate_links, 116 more }

LlamaParseParameters configuration

adaptive_long_table?: boolean | null

aggressive_table_extraction?: boolean | null

annotate_links?: boolean | null

auto_mode?: boolean | null

auto_mode_configuration_json?: string | null

auto_mode_trigger_on_image_in_page?: boolean | null

auto_mode_trigger_on_regexp_in_page?: string | null

auto_mode_trigger_on_table_in_page?: boolean | null

auto_mode_trigger_on_text_in_page?: string | null

azure_openai_api_version?: string | null

azure_openai_deployment_name?: string | null

azure_openai_endpoint?: string | null

azure_openai_key?: string | null

bbox_bottom?: number | null

bbox_left?: number | null

bbox_right?: number | null

bbox_top?: number | null

bounding_box?: string | null

compact_markdown_table?: boolean | null

complemental_formatting_instruction?: string | null

content_guideline_instruction?: string | null

continuous_mode?: boolean | null

disable_image_extraction?: boolean | null

disable_ocr?: boolean | null

disable_reconstruction?: boolean | null

do_not_cache?: boolean | null

do_not_unroll_columns?: boolean | null

enable_cost_optimizer?: boolean | null

extract_charts?: boolean | null

extract_layout?: boolean | null

extract_printed_page_number?: boolean | null

fast_mode?: boolean | null

formatting_instruction?: string | null

gpt4o_api_key?: string | null

gpt4o_mode?: boolean | null

guess_xlsx_sheet_name?: boolean | null

hide_footers?: boolean | null

hide_headers?: boolean | null

high_res_ocr?: boolean | null

html_make_all_elements_visible?: boolean | null

html_remove_fixed_elements?: boolean | null

html_remove_navigation_elements?: boolean | null

http_proxy?: string | null

ignore_document_elements_for_layout_detection?: boolean | null

images_to_save?: Array<"screenshot" | "embedded" | "layout"> | null

Accepts one of the following:

"screenshot"

"embedded"

"layout"

inline_images_in_markdown?: boolean | null

input_s3_path?: string | null

input_s3_region?: string | null

input_url?: string | null

internal_is_screenshot_job?: boolean | null

invalidate_cache?: boolean | null

is_formatting_instruction?: boolean | null

job_timeout_extra_time_per_page_in_seconds?: number | null

job_timeout_in_seconds?: number | null

keep_page_separator_when_merging_tables?: boolean | null

languages?: Array<ParsingLanguages>

Accepts one of the following:

"af"

"az"

"bs"

"cs"

"cy"

"da"

"de"

"en"

"es"

"et"

"fr"

"ga"

"hr"

"hu"

"id"

"is"

"it"

"ku"

"la"

"lt"

"lv"

"mi"

"ms"

"mt"

"nl"

"no"

"oc"

"pi"

"pl"

"pt"

"ro"

"rs_latin"

"sk"

"sl"

"sq"

"sv"

"sw"

"tl"

"tr"

"uz"

"vi"

"ar"

"fa"

"ug"

"ur"

"bn"

"as"

"mni"

"ru"

"rs_cyrillic"

"be"

"bg"

"uk"

"mn"

"abq"

"ady"

"kbd"

"ava"

"dar"

"inh"

"che"

"lbe"

"lez"

"tab"

"tjk"

"hi"

"mr"

"ne"

"bh"

"mai"

"ang"

"bho"

"mah"

"sck"

"new"

"gom"

"sa"

"bgc"

"th"

"ch_sim"

"ch_tra"

"ja"

"ko"

"ta"

"te"

"kn"

layout_aware?: boolean | null

line_level_bounding_box?: boolean | null

markdown_table_multiline_header_separator?: string | null

max_pages?: number | null

max_pages_enforced?: number | null

merge_tables_across_pages_in_markdown?: boolean | null

model?: string | null

outlined_table_extraction?: boolean | null

output_pdf_of_document?: boolean | null

output_s3_path_prefix?: string | null

output_s3_region?: string | null

output_tables_as_HTML?: boolean | null

page_error_tolerance?: number | null

page_footer_prefix?: string | null

page_footer_suffix?: string | null

page_header_prefix?: string | null

page_header_suffix?: string | null

page_prefix?: string | null

page_separator?: string | null

page_suffix?: string | null

parse_mode?: ParsingMode | null

Enum for representing the mode of parsing to be used.

Accepts one of the following:

"parse_page_without_llm"

"parse_page_with_llm"

"parse_page_with_lvm"

"parse_page_with_agent"

"parse_page_with_layout_agent"

"parse_document_with_llm"

"parse_document_with_lvm"

"parse_document_with_agent"

parsing_instruction?: string | null

precise_bounding_box?: boolean | null

premium_mode?: boolean | null

presentation_out_of_bounds_content?: boolean | null

presentation_skip_embedded_data?: boolean | null

preserve_layout_alignment_across_pages?: boolean | null

preserve_very_small_text?: boolean | null

preset?: string | null

priority?: "low" | "medium" | "high" | "critical" | null

The priority for the request. This field may be ignored or overwritten depending on the organization tier.

Accepts one of the following:

"low"

"medium"

"high"

"critical"

project_id?: string | null

remove_hidden_text?: boolean | null

replace_failed_page_mode?: FailPageMode | null

Enum for representing the different available page error handling modes.

Accepts one of the following:

"raw_text"

"blank_page"

"error_message"

replace_failed_page_with_error_message_prefix?: string | null

replace_failed_page_with_error_message_suffix?: string | null

save_images?: boolean | null

skip_diagonal_text?: boolean | null

specialized_chart_parsing_agentic?: boolean | null

specialized_chart_parsing_efficient?: boolean | null

specialized_chart_parsing_plus?: boolean | null

specialized_image_parsing?: boolean | null

spreadsheet_extract_sub_tables?: boolean | null

spreadsheet_force_formula_computation?: boolean | null

spreadsheet_include_hidden_sheets?: boolean | null

strict_mode_buggy_font?: boolean | null

strict_mode_image_extraction?: boolean | null

strict_mode_image_ocr?: boolean | null

strict_mode_reconstruction?: boolean | null

structured_output?: boolean | null

structured_output_json_schema?: string | null

structured_output_json_schema_name?: string | null

system_prompt?: string | null

system_prompt_append?: string | null

take_screenshot?: boolean | null

target_pages?: string | null

tier?: string | null

use_vendor_multimodal_model?: boolean | null

user_prompt?: string | null

vendor_multimodal_api_key?: string | null

vendor_multimodal_model_name?: string | null

version?: string | null

webhook_configurations?: Array<WebhookConfiguration { webhook_events, webhook_headers, webhook_output_format, webhook_url } > | null

The outbound webhook configurations

webhook_events?: Array<"extract.pending" | "extract.success" | "extract.error" | 14 more> | null

List of event names to subscribe to

Accepts one of the following:

"extract.pending"

"extract.success"

"extract.error"

"extract.partial_success"

"extract.cancelled"

"parse.pending"

"parse.running"

"parse.success"

"parse.error"

"parse.partial_success"

"parse.cancelled"

"classify.pending"

"classify.success"

"classify.error"

"classify.partial_success"

"classify.cancelled"

"unmapped_event"

webhook_headers?: Record<string, string> | null

Custom HTTP headers to include with webhook requests.

webhook_output_format?: string | null

The output format to use for the webhook. Defaults to string if none supplied. Currently supported values: string, json

webhook_url?: string | null

The URL to send webhook notifications to.

webhook_url?: string | null

source_id: string

ID of the source

source_type: string

Type of the source (e.g., 'project')

updated_at: string

Last update timestamp

format: date-time

version: string

Version of the configuration

creator?: string | null

Creator of the configuration

next_page_token?: string | null

A token, which can be sent as page_token to retrieve the next page. If this field is omitted, there are no subsequent pages.

total_size?: number | null

The total number of items available. This is only populated when specifically requested. The value may be an estimate and can be used for display purposes only.
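
The `next_page_token` contract described above (send it back as `page_token`; a missing token means there are no further pages) can be sketched as a small loop. This is a minimal sketch, not the SDK's own pagination: `collectAll`, `FetchPage`, and the in-memory `fakeFetch` are hypothetical names, and the page fetcher is injected so that the loop does not depend on any particular HTTP client.

```typescript
// Shape of one page, following the query response fields listed above.
interface Page<T> {
  items: T[];
  next_page_token?: string | null;
  total_size?: number | null;
}

// Caller-supplied function that fetches one page for a given page_token.
type FetchPage<T> = (pageToken?: string) => Promise<Page<T>>;

// Follows next_page_token until it is absent, with a safety cap on page count.
async function collectAll<T>(fetchPage: FetchPage<T>, maxPages = 100): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined = undefined;
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(token);
    all.push(...page.items);
    if (!page.next_page_token) break; // omitted or null: no subsequent pages
    token = page.next_page_token;
  }
  return all;
}

// In-memory stand-in for the real endpoint, used only for illustration.
const fakePages: Record<string, Page<string>> = {
  start: { items: ["a", "b"], next_page_token: "p2" },
  p2: { items: ["c"], next_page_token: null },
};
const fakeFetch: FetchPage<string> = async (pageToken) =>
  fakePages[pageToken ?? "start"];
```

The `maxPages` cap is a defensive choice so that a misbehaving server that keeps returning tokens cannot loop forever.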

Beta Sheets

Create Spreadsheet Job

client.beta.sheets.create(, ?): SheetsJob { id, config, created_at, 10 more }

POST /api/v1/beta/sheets/jobs

List Spreadsheet Jobs

client.beta.sheets.list(?, ?): PaginatedCursor<SheetsJob { id, config, created_at, 10 more } >

GET /api/v1/beta/sheets/jobs

Get Spreadsheet Job

client.beta.sheets.get(, ?, ?): SheetsJob { id, config, created_at, 10 more }

GET /api/v1/beta/sheets/jobs/{spreadsheet_job_id}

Get Result Region

client.beta.sheets.getResultTable(, , ?): PresignedURL { expires_at, url, form_fields }

GET /api/v1/beta/sheets/jobs/{spreadsheet_job_id}/regions/{region_id}/result/{region_type}

Delete Spreadsheet Job

client.beta.sheets.deleteJob(, ?, ?): SheetDeleteJobResponse

DELETE /api/v1/beta/sheets/jobs/{spreadsheet_job_id}
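
The Get Result Region route takes three path parameters. When calling the REST endpoint directly rather than through the SDK, the path can be assembled as sketched below. `resultRegionPath` is a hypothetical helper, and the region type value `"table"` in the usage line is an assumption, not a documented enum value.

```typescript
// Builds the documented path
//   /api/v1/beta/sheets/jobs/{spreadsheet_job_id}/regions/{region_id}/result/{region_type}
// and percent-encodes each segment so that unusual IDs cannot break the URL.
function resultRegionPath(
  spreadsheetJobId: string,
  regionId: string,
  regionType: string
): string {
  const enc = encodeURIComponent;
  return (
    `/api/v1/beta/sheets/jobs/${enc(spreadsheetJobId)}` +
    `/regions/${enc(regionId)}/result/${enc(regionType)}`
  );
}

// Placeholder IDs; "table" is an assumed region type.
const examplePath = resultRegionPath("job-1", "r 1", "table");
```

According to the signature above, the endpoint returns a `PresignedURL { expires_at, url, form_fields }`, so the region data itself would be downloaded from `url` in a second request.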

Models

SheetsJob { id, config, created_at, 10 more }

A spreadsheet parsing job

id: string

The ID of the job

config: SheetsParsingConfig { extraction_range, flatten_hierarchical_tables, generate_additional_metadata, 5 more }

Configuration for the parsing job

extraction_range?: string | null

A1 notation of the range to extract a single region from. If None, the entire sheet is used.

flatten_hierarchical_tables?: boolean

Return a flattened dataframe when a detected table is recognized as hierarchical.

generate_additional_metadata?: boolean

Whether to generate additional metadata (title, description) for each extracted region.

include_hidden_cells?: boolean

Whether to include hidden cells when extracting regions from the spreadsheet.

sheet_names?: Array<string> | null

The names of the sheets to extract regions from. If empty, all sheets will be processed.

specialization?: string | null

Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline.

table_merge_sensitivity?: "strong" | "weak"

Influences how likely similar-looking regions are merged into a single table. Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging).

Accepts one of the following:

"strong"

"weak"

use_experimental_processing?: boolean

Enables experimental processing. Accuracy may be impacted.

created_at: string

When the job was created

file_id: string | null

The ID of the input file

format: uuid

project_id: string

The ID of the project

format: uuid

status: StatusEnum

The status of the parsing job

Accepts one of the following:

"PENDING"

"SUCCESS"

"ERROR"

"PARTIAL_SUCCESS"

"CANCELLED"

updated_at: string

When the job was last updated

user_id: string

The ID of the user

errors?: Array<string>

Any errors encountered

file?: File { id, name, project_id, 11 more } | null (Deprecated)

Schema for a file.

id: string

Unique identifier

format: uuid

project_id: string

The ID of the project that the file belongs to

format: uuid

created_at?: string | null

Creation datetime

format: date-time

data_source_id?: string | null

The ID of the data source that the file belongs to

format: uuid

expires_at?: string | null

The expiration date for the file. Files past this date can be deleted.

format: date-time

external_file_id?: string | null

The ID of the file in the external system

file_size?: number | null

Size of the file in bytes

minimum: 0

file_type?: string | null

File type (e.g. pdf, docx, etc.)

maxLength: 3000

minLength: 1

last_modified_at?: string | null

The last modified time of the file

format: date-time

Permission information for the file

Accepts one of the following:

Record<string, unknown>

Array<unknown>

string

number

boolean

purpose?: string | null

The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify')

Resource information for the file

Accepts one of the following:

Record<string, unknown>

Array<unknown>

string

number

boolean

updated_at?: string | null

Update datetime

format: date-time

regions?: Array<Region>

All extracted regions (populated when job is complete)

location: string

Location of the region in the spreadsheet

region_type: string

Type of the extracted region

sheet_name: string

Worksheet name where region was found

description?: string | null

Generated description for the region

region_id?: string

Unique identifier for this region within the file

title?: string | null

Generated title for the region

success?: boolean | null

Whether the job completed successfully

worksheet_metadata?: Array<WorksheetMetadata>

Metadata for each processed worksheet (populated when job is complete)

sheet_name: string

Name of the worksheet

description?: string | null

Generated description of the worksheet

title?: string | null

Generated title for the worksheet
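
Because `regions` and `worksheet_metadata` are only populated once a job is complete, a caller typically polls Get Spreadsheet Job until `status` leaves `PENDING`. The sketch below assumes that `PENDING` is the only non-terminal value in the status enum listed above. `waitForJob`, `JobLike`, and the injected `getJob` and `sleep` functions are hypothetical names, not SDK calls.

```typescript
// Status values listed for SheetsJob above.
type StatusEnum = "PENDING" | "SUCCESS" | "ERROR" | "PARTIAL_SUCCESS" | "CANCELLED";

// Minimal view of a job: only the fields this loop reads.
interface JobLike {
  id: string;
  status: StatusEnum;
  errors?: string[];
}

interface WaitOptions {
  intervalMs: number;
  maxAttempts: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Polls getJob until the status is no longer PENDING (assumed to be the only
// non-terminal state), or throws after maxAttempts polls.
async function waitForJob(
  getJob: () => Promise<JobLike>,
  opts: WaitOptions
): Promise<JobLike> {
  const sleep =
    opts.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const job = await getJob();
    if (job.status !== "PENDING") return job;
    await sleep(opts.intervalMs);
  }
  throw new Error("job still PENDING after maxAttempts polls");
}
```

In real use, `getJob` would wrap a call to the Get Spreadsheet Job endpoint for a fixed job ID. The caller still needs to inspect the returned `status` and `errors`, since `ERROR`, `PARTIAL_SUCCESS`, and `CANCELLED` also end the loop.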

SheetsParsingConfig { extraction_range, flatten_hierarchical_tables, generate_additional_metadata, 5 more }

Configuration for spreadsheet parsing and region extraction

extraction_range?: string | null

A1 notation of the range to extract a single region from. If None, the entire sheet is used.

flatten_hierarchical_tables?: boolean

Return a flattened dataframe when a detected table is recognized as hierarchical.

generate_additional_metadata?: boolean

Whether to generate additional metadata (title, description) for each extracted region.

include_hidden_cells?: boolean

Whether to include hidden cells when extracting regions from the spreadsheet.

sheet_names?: Array<string> | null

The names of the sheets to extract regions from. If empty, all sheets will be processed.

specialization?: string | null

Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline.

table_merge_sensitivity?: "strong" | "weak"

Accepts one of the following:

"strong"

"weak"

use_experimental_processing?: boolean

Enables experimental processing. Accuracy may be impacted.
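
Since `specialization` is typed as a plain string but only three values are documented as supported, a client-side check can catch typos before a job is created. The following is a minimal sketch under that assumption. `checkSheetsConfig` is a hypothetical helper, and the sheet name and range in the example are placeholders.

```typescript
// Mirrors the SheetsParsingConfig fields listed above.
interface SheetsParsingConfig {
  extraction_range?: string | null;
  flatten_hierarchical_tables?: boolean;
  generate_additional_metadata?: boolean;
  include_hidden_cells?: boolean;
  sheet_names?: string[] | null;
  specialization?: string | null;
  table_merge_sensitivity?: "strong" | "weak";
  use_experimental_processing?: boolean;
}

// Specialization values documented as supported.
const SUPPORTED_SPECIALIZATIONS = [
  "financial-standard",
  "financial-enhanced",
  "financial-precise",
];

// Returns a list of human-readable problems; an empty list means no issue found.
function checkSheetsConfig(config: SheetsParsingConfig): string[] {
  const problems: string[] = [];
  const spec = config.specialization;
  if (spec != null && !SUPPORTED_SPECIALIZATIONS.includes(spec)) {
    problems.push(`unsupported specialization: ${spec}`);
  }
  return problems;
}

// Placeholder configuration: one sheet, one A1 range, weak table merging.
const exampleConfig: SheetsParsingConfig = {
  sheet_names: ["Q1"],
  extraction_range: "A1:F40",
  table_merge_sensitivity: "weak",
  specialization: "financial-standard",
};
const exampleProblems = checkSheetsConfig(exampleConfig);
```

The server remains the authority on which values are accepted; a check like this only shortens the feedback loop for obvious mistakes.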

Beta Directories

Create Directory

client.beta.directories.create(, ?): DirectoryCreateResponse { id, name, project_id, 5 more }

POST /api/v1/beta/directories

List Directories

client.beta.directories.list(?, ?): PaginatedCursor<DirectoryListResponse { id, name, project_id, 5 more } >

GET /api/v1/beta/directories

Get Directory

client.beta.directories.get(, ?, ?): DirectoryGetResponse { id, name, project_id, 5 more }

GET /api/v1/beta/directories/{directory_id}

Update Directory

client.beta.directories.update(, , ?): DirectoryUpdateResponse { id, name, project_id, 5 more }

PATCH /api/v1/beta/directories/{directory_id}

Delete Directory

client.beta.directories.delete(, ?, ?): void

DELETE /api/v1/beta/directories/{directory_id}

Beta Directories Files

Add Directory File

client.beta.directories.files.add(, , ?): FileAddResponse { id, directory_id, display_name, 7 more }

POST /api/v1/beta/directories/{directory_id}/files

List Directory Files

client.beta.directories.files.list(, ?, ?): PaginatedCursor<FileListResponse { id, directory_id, display_name, 7 more } >

GET /api/v1/beta/directories/{directory_id}/files

Get Directory File

client.beta.directories.files.get(, , ?): FileGetResponse { id, directory_id, display_name, 7 more }

GET /api/v1/beta/directories/{directory_id}/files/{directory_file_id}

Update Directory File

client.beta.directories.files.update(, , ?): FileUpdateResponse { id, directory_id, display_name, 7 more }

PATCH /api/v1/beta/directories/{directory_id}/files/{directory_file_id}

Delete Directory File

client.beta.directories.files.delete(, , ?): void

DELETE /api/v1/beta/directories/{directory_id}/files/{directory_file_id}

Upload File To Directory

client.beta.directories.files.upload(, , ?): FileUploadResponse { id, directory_id, display_name, 7 more }

POST /api/v1/beta/directories/{directory_id}/files/upload

Beta Batch

Create Batch Job

client.beta.batch.create(, ?): BatchCreateResponse { id, job_type, project_id, 14 more }

POST /api/v1/beta/batch-processing

List Batch Jobs

client.beta.batch.list(?, ?): PaginatedBatchItems<BatchListResponse { id, job_type, project_id, 14 more } >

GET /api/v1/beta/batch-processing

Get Batch Job Status

client.beta.batch.getStatus(, ?, ?): BatchGetStatusResponse { job, progress_percentage }

GET /api/v1/beta/batch-processing/{job_id}

Cancel Batch Job

client.beta.batch.cancel(, , ?): BatchCancelResponse { job_id, message, processed_items, status }

POST /api/v1/beta/batch-processing/{job_id}/cancel

Beta Batch Job Items

List Batch Job Items

client.beta.batch.jobItems.list(, ?, ?): PaginatedBatchItems<JobItemListResponse { item_id, item_name, status, 7 more } >

GET /api/v1/beta/batch-processing/{job_id}/items

Get Item Processing Results

client.beta.batch.jobItems.getProcessingResults(, ?, ?): JobItemGetProcessingResultsResponse { item_id, item_name, processing_results }

GET /api/v1/beta/batch-processing/items/{item_id}/processing-results

Beta Split

Models

SplitCategory { name, description }

Category definition for document splitting.

name: string

Name of the category.

maxLength: 200

minLength: 1

description?: string | null

Optional description of what content belongs in this category.

maxLength: 2000

minLength: 1

SplitDocumentInput { type, value }

Document input specification.

type: string

Type of document input. Valid values are: file_id

value: string

Document identifier.

SplitResultResponse { segments }

Result of a completed split job.

segments: Array<SplitSegmentResponse { category, confidence_category, pages } >

List of document segments.

category: string

Category name this split belongs to.

confidence_category: string

Categorical confidence level. Valid values are: high, medium, low.

pages: Array<number>

1-indexed page numbers in this split.

SplitSegmentResponse { category, confidence_category, pages }

A segment of the split document.

category: string

Category name this split belongs to.

confidence_category: string

Categorical confidence level. Valid values are: high, medium, low.

pages: Array<number>

1-indexed page numbers in this split.
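
A split result is a flat list of segments, so a common first step is to group page numbers by category and to flag segments whose `confidence_category` is `low`. The sketch below works purely on the response shape listed above. `pagesByCategory`, `lowConfidenceSegments`, and the category names in the sample data are hypothetical.

```typescript
// Shapes taken from the SplitSegmentResponse and SplitResultResponse fields above.
interface SplitSegmentResponse {
  category: string;
  confidence_category: string; // "high", "medium", or "low"
  pages: number[]; // 1-indexed page numbers
}

interface SplitResultResponse {
  segments: SplitSegmentResponse[];
}

// Collects the 1-indexed pages of every segment under its category name.
function pagesByCategory(result: SplitResultResponse): Map<string, number[]> {
  const grouped = new Map<string, number[]>();
  for (const segment of result.segments) {
    const pages = grouped.get(segment.category) ?? [];
    pages.push(...segment.pages);
    grouped.set(segment.category, pages);
  }
  return grouped;
}

// Segments that probably deserve a manual review.
function lowConfidenceSegments(result: SplitResultResponse): SplitSegmentResponse[] {
  return result.segments.filter((s) => s.confidence_category === "low");
}

// Made-up result used only for illustration.
const sampleResult: SplitResultResponse = {
  segments: [
    { category: "invoice", confidence_category: "high", pages: [1, 2] },
    { category: "contract", confidence_category: "medium", pages: [3, 4] },
    { category: "invoice", confidence_category: "low", pages: [5] },
  ],
};
const groupedPages = pagesByCategory(sampleResult);
const needsReview = lowConfidenceSegments(sampleResult);
```

Because `pages` is 1-indexed, subtract 1 from each page number before using it as an index into a zero-based page array.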
