Framework Docs

Parsing

Parse File
$ llamacloud-prod parsing create
POST /api/v2/parse
Get Parse Job
$ llamacloud-prod parsing get
GET /api/v2/parse/{job_id}
List Parse Jobs
$ llamacloud-prod parsing list
GET /api/v2/parse
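
The three endpoints above share one path prefix. As a minimal sketch, the helpers below only assemble the method and URL for each call and send nothing. `BASE_URL` and the function names are placeholders chosen for illustration and do not come from this reference.

```python
from urllib.parse import quote

# Placeholder host (an assumption): replace with the base URL of your deployment.
BASE_URL = "https://api.example.invalid"

# Path prefix shared by all three endpoints listed above.
PARSE_PATH = "/api/v2/parse"


def create_parse_request() -> tuple[str, str]:
    """Method and URL for 'Parse File'."""
    return ("POST", f"{BASE_URL}{PARSE_PATH}")


def get_parse_job_request(job_id: str) -> tuple[str, str]:
    """Method and URL for 'Get Parse Job'; the job id is percent-encoded."""
    return ("GET", f"{BASE_URL}{PARSE_PATH}/{quote(job_id, safe='')}")


def list_parse_jobs_request() -> tuple[str, str]:
    """Method and URL for 'List Parse Jobs'."""
    return ("GET", f"{BASE_URL}{PARSE_PATH}")
```

Keeping URL construction separate from the HTTP client makes it easy to check the paths before wiring in authentication or a request body, neither of which is specified in this section.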
Models
b_box: object { h, w, x, 5 more }

Bounding box with coordinates and optional metadata.

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text
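
A bounding box as described above can be modelled client-side with a small dataclass. This is a sketch under assumptions: the class name `BBox`, the `from_dict` constructor, and the `area` helper are illustrative, and the reference does not state the coordinate origin or units, so the code makes no claim about them.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class BBox:
    # Required fields from the model above.
    h: float
    w: float
    x: float
    y: float
    # Optional fields default to None when the response omits them.
    confidence: Optional[float] = None
    end_index: Optional[int] = None
    label: Optional[str] = None
    start_index: Optional[int] = None

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "BBox":
        # Missing required keys raise KeyError; optional keys fall back to None.
        return cls(
            h=data["h"],
            w=data["w"],
            x=data["x"],
            y=data["y"],
            confidence=data.get("confidence"),
            end_index=data.get("end_index"),
            label=data.get("label"),
            start_index=data.get("start_index"),
        )

    def area(self) -> float:
        # Width times height, in whatever units the service reports.
        return self.w * self.h


# Hypothetical payload for illustration.
box = BBox.from_dict({"x": 10, "y": 20, "w": 100, "h": 50, "label": "text"})
print(box.area())  # 5000
```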

code_item: object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

"code"
fail_page_mode: "raw_text" or "blank_page" or "error_message"

Enum for representing the different available page error handling modes.

"raw_text"
"blank_page"
"error_message"
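
Because `fail_page_mode` accepts only the three values listed above, a client can validate it before submitting a job. A minimal sketch, where the class and function names are assumptions made for illustration:

```python
from enum import Enum


class FailPageMode(str, Enum):
    # Values copied from the enum above.
    RAW_TEXT = "raw_text"
    BLANK_PAGE = "blank_page"
    ERROR_MESSAGE = "error_message"


def parse_fail_page_mode(value: str) -> FailPageMode:
    """Return the matching enum member or raise a ValueError listing valid values."""
    try:
        return FailPageMode(value)
    except ValueError:
        allowed = ", ".join(mode.value for mode in FailPageMode)
        raise ValueError(
            f"unknown fail_page_mode {value!r}; expected one of: {allowed}"
        ) from None
```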

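footer_item: object { items, md, bbox, type }
items: array of TextItem { md, value, bbox, type } or HeadingItem { level, md, value, 2 more } or ListItem { items, md, ordered, 2 more } or 4 more

List of items within the footer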

text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
heading_item: object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

"heading"
list_item: object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
list_item
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

"list"
code_item: object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

"code"
table_item: object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

union_member_0: string
union_member_1: number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

"table"
image_item: object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

"image"

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

md: string

Markdown representation preserving formatting

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "footer"

Page footer container

"footer"

header_item: object { items, md, bbox, type }
items: array of TextItem { md, value, bbox, type } or HeadingItem { level, md, value, 2 more } or ListItem { items, md, ordered, 2 more } or 4 more

List of items within the header

text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
heading_item: object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

"heading"
list_item: object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
list_item
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

"list"
code_item: object { md, value, bbox, 2 more }
md: string

Markdown representation preserving formatting

value: string

Code content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

language: optional string

Programming language identifier

type: optional "code"

Code block item type

"code"
table_item: object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

union_member_0: string
union_member_1: number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

"table"
image_item: object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

"image"

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

md: string

Markdown representation preserving formatting

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "header"

Page header container

"header"
heading_item: object { level, md, value, 2 more }
level: number

Heading level (1-6)

md: string

Markdown representation preserving formatting

value: string

Heading text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "heading"

Heading item type

"heading"
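
Since each heading item carries a `level` between 1 and 6 and its text in `value`, a list of them can be turned into an indented outline. The function name and sample data below are hypothetical; this is a sketch, not part of the API.

```python
from typing import Any


def render_outline(headings: list[dict[str, Any]]) -> str:
    """Indent each heading's text by two spaces per level below 1."""
    lines = []
    for heading in headings:
        # Clamp to the documented 1-6 range in case of unexpected input.
        level = min(max(int(heading["level"]), 1), 6)
        lines.append("  " * (level - 1) + heading["value"])
    return "\n".join(lines)


# Hypothetical heading items for illustration.
sample_headings = [
    {"type": "heading", "level": 1, "md": "# Intro", "value": "Intro"},
    {"type": "heading", "level": 2, "md": "## Scope", "value": "Scope"},
]
print(render_outline(sample_headings))
```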
image_item: object { caption, md, url, 2 more }
caption: string

Image caption

md: string

Markdown representation preserving formatting

url: string

URL to the image

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "image"

Image item type

"image"

Markdown representation preserving formatting

Display text of the link

URL of the link

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

Link item type

list_item: object { items, md, ordered, 2 more }
items: array of TextItem { md, value, bbox, type } or ListItem { items, md, ordered, 2 more }

List of nested text or list items

text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
list_item
md: string

Markdown representation preserving formatting

ordered: boolean

Whether the list is ordered or unordered

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "list"

List item type

"list"
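
A list item's `items` array may contain text items or further list items, so reading its content calls for a recursive walk. The sketch below yields each text leaf with its nesting depth. The function name and sample payload are assumptions for illustration. Because `type` is optional, the presence of an `items` key is also treated as marking a nested list.

```python
from typing import Any, Iterator


def iter_list_text(item: dict[str, Any], depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, text) for every text leaf under a list item."""
    for child in item.get("items", []):
        # Nested lists carry their own "items"; text leaves carry "value".
        if child.get("type") == "list" or "items" in child:
            yield from iter_list_text(child, depth + 1)
        else:
            yield depth, child["value"]


# Hypothetical list item with one nested list.
sample_list = {
    "type": "list",
    "ordered": False,
    "md": "- a\n  - b",
    "items": [
        {"type": "text", "md": "a", "value": "a"},
        {
            "type": "list",
            "ordered": False,
            "md": "- b",
            "items": [{"type": "text", "md": "b", "value": "b"}],
        },
    ],
}
print(list(iter_list_text(sample_list)))  # [(0, 'a'), (1, 'b')]
```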
llama_parse_supported_file_extensions: ".pdf" or ".abw" or ".awt" or 143 more

Enum for supported file extensions.

".pdf"
".abw"
".awt"
".cgm"
".cwk"
".doc"
".docm"
".docx"
".dot"
".dotm"
".dotx"
".fodg"
".fodp"
".fopd"
".fodt"
".fb2"
".hwp"
".lwp"
".mcw"
".mw"
".mwd"
".odf"
".odt"
".otg"
".ott"
".pages"
".pbd"
".psw"
".rtf"
".sda"
".sdd"
".sdp"
".sdw"
".sgl"
".std"
".stw"
".sxd"
".sxg"
".sxm"
".sxw"
".uof"
".uop"
".uot"
".vor"
".wpd"
".wps"
".wpt"
".wri"
".wn"
".xml"
".zabw"
".key"
".odp"
".odg"
".otp"
".pot"
".potm"
".potx"
".ppt"
".pptm"
".pptx"
".sti"
".sxi"
".vsd"
".vsdm"
".vsdx"
".vdx"
".bmp"
".gif"
".heic"
".heif"
".jpg"
".jpeg"
".png"
".svg"
".tif"
".tiff"
".webp"
".htm"
".html"
".xhtm"
".csv"
".dbf"
".dif"
".et"
".eth"
".fods"
".numbers"
".ods"
".ots"
".prn"
".qpw"
".slk"
".stc"
".sxc"
".sylk"
".tsv"
".uos1"
".uos2"
".uos"
".wb1"
".wb2"
".wb3"
".wk1"
".wk2"
".wk3"
".wk4"
".wks"
".wq1"
".wq2"
".xlr"
".xls"
".xlsb"
".xlsm"
".xlsx"
".xlw"
".azw"
".azw3"
".azw4"
".cb7"
".cbc"
".cbr"
".cbz"
".chm"
".djvu"
".epub"
".fbz"
".htmlz"
".lit"
".lrf"
".md"
".mobi"
".pdb"
".pml"
".prc"
".rb"
".snb"
".tcr"
".txtz"
".m4a"
".mp3"
".mp4"
".mpeg"
".mpga"
".wav"
".webm"
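
A client can reject unsupported files before uploading by comparing the file suffix against this enum. The sketch below uses only a small subset of the values above to stay short. The helper name is hypothetical, and lowercasing the suffix is an assumption based on the enum listing lowercase values only.

```python
from pathlib import PurePath

# Partial subset of the enum above, for illustration only.
SUPPORTED_SUBSET = frozenset(
    {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".png", ".jpg", ".csv", ".epub", ".mp3"}
)


def has_supported_extension(filename: str, supported: frozenset = SUPPORTED_SUBSET) -> bool:
    """True when the final suffix of filename, lowercased, is in the supported set."""
    return PurePath(filename).suffix.lower() in supported
```

Only the final suffix is checked, so `archive.tar.gz` is tested as `.gz`.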
parsing_job: object { id, status, error_code, error_message }

A parse job (v1).

id: string

Unique parse job identifier

status: "PENDING" or "SUCCESS" or "ERROR" or 2 more

Current job status

"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
error_code: optional string

Machine-readable error code when failed

error_message: optional string

Human-readable error details when failed
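
When polling a parse job, a client typically needs to know whether the job is still running and, if it failed, why. The sketch below assumes that every status other than `PENDING` is final; the reference does not say so explicitly. The helper names and sample payload are hypothetical.

```python
from typing import Any

# Assumption: every status except PENDING means the job will not change again.
TERMINAL_STATUSES = frozenset({"SUCCESS", "ERROR", "PARTIAL_SUCCESS", "CANCELLED"})


def is_finished(job: dict[str, Any]) -> bool:
    """True when the job status is one of the assumed terminal states."""
    return job["status"] in TERMINAL_STATUSES


def describe_job(job: dict[str, Any]) -> str:
    """One-line summary including the optional error fields when present."""
    summary = f"{job['id']}: {job['status']}"
    code = job.get("error_code")
    message = job.get("error_message")
    if code:
        summary += f" [{code}]"
    if message:
        summary += f" {message}"
    return summary


# Hypothetical job payload; the id and error values are made up.
failed_job = {
    "id": "job-123",
    "status": "ERROR",
    "error_code": "E_EXAMPLE",
    "error_message": "example failure",
}
print(describe_job(failed_job))  # job-123: ERROR [E_EXAMPLE] example failure
```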

parsing_languages: "af" or "az" or "bs" or 83 more

Enum for representing the languages supported by the parser.

"af"
"az"
"bs"
"cs"
"cy"
"da"
"de"
"en"
"es"
"et"
"fr"
"ga"
"hr"
"hu"
"id"
"is"
"it"
"ku"
"la"
"lt"
"lv"
"mi"
"ms"
"mt"
"nl"
"no"
"oc"
"pi"
"pl"
"pt"
"ro"
"rs_latin"
"sk"
"sl"
"sq"
"sv"
"sw"
"tl"
"tr"
"uz"
"vi"
"ar"
"fa"
"ug"
"ur"
"bn"
"as"
"mni"
"ru"
"rs_cyrillic"
"be"
"bg"
"uk"
"mn"
"abq"
"ady"
"kbd"
"ava"
"dar"
"inh"
"che"
"lbe"
"lez"
"tab"
"tjk"
"hi"
"mr"
"ne"
"bh"
"mai"
"ang"
"bho"
"mah"
"sck"
"new"
"gom"
"sa"
"bgc"
"th"
"ch_sim"
"ch_tra"
"ja"
"ko"
"ta"
"te"
"kn"
parsing_mode: "parse_page_without_llm" or "parse_page_with_llm" or "parse_page_with_lvm" or 5 more

Enum for representing the mode of parsing to be used.

"parse_page_without_llm"
"parse_page_with_llm"
"parse_page_with_lvm"
"parse_page_with_agent"
"parse_page_with_layout_agent"
"parse_document_with_llm"
"parse_document_with_lvm"
"parse_document_with_agent"
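
The reference does not describe what each mode does. The sketch below only validates a value against the enum and reads the scope (`page` or `document`) off the value's name. That grouping is an inference from the naming and not documented behaviour, and the function name is hypothetical.

```python
# Values copied from the enum above.
PARSING_MODES = (
    "parse_page_without_llm",
    "parse_page_with_llm",
    "parse_page_with_lvm",
    "parse_page_with_agent",
    "parse_page_with_layout_agent",
    "parse_document_with_llm",
    "parse_document_with_lvm",
    "parse_document_with_agent",
)


def mode_scope(mode: str) -> str:
    """Return 'page' or 'document' based on the prefix of a valid mode name."""
    if mode not in PARSING_MODES:
        raise ValueError(f"unknown parsing_mode {mode!r}")
    # Inferred from naming only: parse_page_* versus parse_document_*.
    return "page" if mode.startswith("parse_page_") else "document"
```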
status_enum: "PENDING" or "SUCCESS" or "ERROR" or 2 more

Enum for representing the status of a job

"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
table_item: object { csv, html, md, 6 more }
csv: string

CSV representation of the table

html: string

HTML representation of the table

md: string

Markdown representation preserving formatting

rows: array of array of string or number

Table data as array of arrays (string, number, or null)

union_member_0: string
union_member_1: number
bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

merged_from_pages: optional array of number

List of page numbers with tables that were merged into this table (e.g., [1, 2, 3, 4])

merged_into_page: optional number

Populated when merged into another table. Page number where the full merged table begins (used on empty tables).

parse_concerns: optional array of object { details, type }

Quality concerns detected during table extraction, indicating the table may have issues

details: string

Human-readable details about the concern

type: string

Type of parse concern (e.g. header_value_type_mismatch, inconsistent_row_cell_count)

type: optional "table"

Table item type

"table"
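
The `rows`, `parse_concerns`, and `merged_into_page` fields are the ones most useful for post-processing a table. The sketch below converts `rows` into dictionaries, collects any quality concerns, and detects tables that were merged into another page. Treating the first row as the header is an assumption not stated in this reference, and the function names and sample data are hypothetical.

```python
from typing import Any


def table_records(table: dict[str, Any]) -> list[dict[Any, Any]]:
    """Map each body row to a dict keyed by the first row (assumed to be the header)."""
    rows = table.get("rows") or []
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def table_warnings(table: dict[str, Any]) -> list[str]:
    """Format each parse concern as 'type: details'."""
    return [f"{c['type']}: {c['details']}" for c in table.get("parse_concerns") or []]


def is_merged_elsewhere(table: dict[str, Any]) -> bool:
    """True when merged_into_page is set, meaning the full table lives on another page."""
    return table.get("merged_into_page") is not None


# Hypothetical table item; only the fields used here are filled in.
sample_table = {
    "type": "table",
    "rows": [["name", "qty"], ["bolt", 4], ["nut", 12]],
    "parse_concerns": [
        {"type": "inconsistent_row_cell_count", "details": "example detail"}
    ],
}
print(table_records(sample_table))
```

Checking `is_merged_elsewhere` first lets a consumer skip the empty placeholder tables and read the merged table from the page it points to.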
text_item: object { md, value, bbox, type }
md: string

Markdown representation preserving formatting

value: string

Text content

bbox: optional array of BBox { h, w, x, 5 more }

List of bounding boxes

h: number

Height of the bounding box

w: number

Width of the bounding box

x: number

X coordinate of the bounding box

y: number

Y coordinate of the bounding box

confidence: optional number

Confidence score

end_index: optional number

End index in the text

label: optional string

Label for the bounding box

start_index: optional number

Start index in the text

type: optional "text"

Text item type

"text"
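
Every item model above with a visible field list carries an `md` field, and most carry an optional `type` literal. As a sketch, a mixed list of items can therefore be rendered or tallied without knowing each concrete model. The function names and sample items are assumptions for illustration.

```python
from collections import Counter
from typing import Any


def items_to_markdown(items: list[dict[str, Any]]) -> str:
    """Join the md field of each item, separated by blank lines; empty md is skipped."""
    return "\n\n".join(item["md"] for item in items if item.get("md"))


def count_item_types(items: list[dict[str, Any]]) -> Counter:
    """Tally items by their optional type literal; a missing type counts as 'unknown'."""
    return Counter(item.get("type", "unknown") for item in items)


# Hypothetical items; the last one omits the optional type field.
sample_items = [
    {"type": "heading", "level": 1, "md": "# Title", "value": "Title"},
    {"type": "text", "md": "Hello *world*", "value": "Hello world"},
    {"md": "plain", "value": "plain"},
]
print(items_to_markdown(sample_items))
```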