Parsing

Parse File
parsing.create(**kwargs: ParsingCreateParams) -> ParsingCreateResponse
POST /api/v2/parse
Get Parse Job
parsing.get(job_id: str, **kwargs: ParsingGetParams) -> ParsingGetResponse
GET /api/v2/parse/{job_id}
List Parse Jobs
parsing.list(**kwargs: ParsingListParams) -> SyncPaginatedCursor[ParsingListResponse]
GET /api/v2/parse
Models
class BBox:

Bounding box with coordinates and optional metadata.

h: float

Height of the bounding box

w: float

Width of the bounding box

x: float

X coordinate of the bounding box

y: float

Y coordinate of the bounding box

confidence: Optional[float]

Confidence score

end_index: Optional[int]

End index in the text

label: Optional[str]

Label for the bounding box

start_index: Optional[int]

Start index in the text
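
As a rough illustration of how these fields fit together, the sketch below mirrors BBox as a local dataclass and derives corner coordinates and the referenced text span. It is a stand-in, not the SDK's own class. Two things are assumptions not stated in this reference: that (x, y) is the top-left corner of the box, and that start_index/end_index form a half-open range into the item's text. The helper names to_corners and covered_text are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BBox:
    """Local stand-in mirroring the documented BBox fields (not the SDK class)."""

    h: float
    w: float
    x: float
    y: float
    confidence: Optional[float] = None
    end_index: Optional[int] = None
    label: Optional[str] = None
    start_index: Optional[int] = None


def to_corners(box: BBox) -> Tuple[float, float, float, float]:
    """Return (x0, y0, x1, y1). Assumes (x, y) is the top-left corner."""
    return (box.x, box.y, box.x + box.w, box.y + box.h)


def covered_text(box: BBox, text: str) -> Optional[str]:
    """Slice the span the box refers to. Assumes a half-open [start, end) range."""
    if box.start_index is None or box.end_index is None:
        return None  # the box carries no text offsets
    return text[box.start_index:box.end_index]


box = BBox(h=12.0, w=80.0, x=10.0, y=20.0, start_index=0, end_index=5)
print(to_corners(box))
print(covered_text(box, "Hello world"))
```

If the service turns out to use a different origin or an inclusive end index, only the two helper functions need to change.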

Literal["raw_text", "blank_page", "error_message"]

Enum for representing the different available page error handling modes.

Accepts one of the following:
"raw_text"
"blank_page"
"error_message"
class ListItem:
items: List[Item]

List of nested text or list items

Accepts one of the following:
class ItemTextItem:
md: str

Markdown representation preserving formatting

value: str

Text content

bbox: Optional[List[BBox]]

List of bounding boxes

type: Optional[Literal["text"]]

Text item type

class ListItem:

Nested list item. It has the same fields as the enclosing ListItem (items, md, ordered, bbox, type), and its items may themselves be text items or further nested lists.

md: str

Markdown representation preserving formatting

ordered: bool

Whether the list is ordered or unordered

bbox: Optional[List[BBox]]

List of bounding boxes

type: Optional[Literal["list"]]

List item type
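
Because ListItem nests recursively, consumers usually need a small tree walk. The sketch below flattens a ListItem-shaped dictionary into indented lines. It assumes the response has been converted to plain dicts with the field names documented above, and it tells nested lists from text items by the presence of an "items" key, since type is optional. The function name flatten_list_item and the sample data are hypothetical.

```python
from typing import Any, Dict, List


def flatten_list_item(item: Dict[str, Any], depth: int = 0) -> List[str]:
    """Walk a ListItem-shaped dict and return one indented line per text item."""
    lines: List[str] = []
    number = 0  # counts text items only, used for ordered lists
    for child in item["items"]:
        if "items" in child:
            # Nested ListItem: recurse one level deeper.
            lines.extend(flatten_list_item(child, depth + 1))
        else:
            number += 1
            marker = f"{number}." if item["ordered"] else "-"
            lines.append(f"{'  ' * depth}{marker} {child['value']}")
    return lines


# Made-up sample shaped like the schema above.
example = {
    "type": "list",
    "ordered": True,
    "md": "1. Install\n2. Configure\n  - Set key",
    "items": [
        {"type": "text", "value": "Install", "md": "Install"},
        {"type": "text", "value": "Configure", "md": "Configure"},
        {
            "type": "list",
            "ordered": False,
            "md": "- Set key",
            "items": [{"type": "text", "value": "Set key", "md": "Set key"}],
        },
    ],
}

for line in flatten_list_item(example):
    print(line)
```

When the original formatting matters more than structure, the md field of the outermost ListItem is the simpler source.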

Literal[".pdf", ".abw", ".awt", 141 more]

Enum for supported file extensions.

Accepts one of the following:
".pdf"
".abw"
".awt"
".cgm"
".cwk"
".doc"
".docm"
".docx"
".dot"
".dotm"
".dotx"
".fodg"
".fodp"
".fopd"
".fodt"
".fb2"
".hwp"
".lwp"
".mcw"
".mw"
".mwd"
".odf"
".odt"
".otg"
".ott"
".pages"
".pbd"
".psw"
".rtf"
".sda"
".sdd"
".sdp"
".sdw"
".sgl"
".std"
".stw"
".sxd"
".sxg"
".sxm"
".sxw"
".uof"
".uop"
".uot"
".vor"
".wpd"
".wps"
".wpt"
".wri"
".wn"
".xml"
".zabw"
".key"
".odp"
".odg"
".otp"
".pot"
".potm"
".potx"
".ppt"
".pptm"
".pptx"
".sti"
".sxi"
".vsd"
".vsdm"
".vsdx"
".vdx"
".bmp"
".gif"
".jpg"
".jpeg"
".png"
".svg"
".tif"
".tiff"
".webp"
".htm"
".html"
".xhtm"
".csv"
".dbf"
".dif"
".et"
".eth"
".fods"
".numbers"
".ods"
".ots"
".prn"
".qpw"
".slk"
".stc"
".sxc"
".sylk"
".tsv"
".uos1"
".uos2"
".uos"
".wb1"
".wb2"
".wb3"
".wk1"
".wk2"
".wk3"
".wk4"
".wks"
".wq1"
".wq2"
".xlr"
".xls"
".xlsb"
".xlsm"
".xlsx"
".xlw"
".azw"
".azw3"
".azw4"
".cb7"
".cbc"
".cbr"
".cbz"
".chm"
".djvu"
".epub"
".fbz"
".htmlz"
".lit"
".lrf"
".md"
".mobi"
".pdb"
".pml"
".prc"
".rb"
".snb"
".tcr"
".txtz"
".m4a"
".mp3"
".mp4"
".mpeg"
".mpga"
".wav"
".webm"
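
A client-side check against this list can reject unsupported files before upload. The sketch below uses an abbreviated subset of the extensions above. The helper name is_supported is hypothetical, and lower-casing the suffix is an assumption, since this reference does not say whether matching is case-sensitive.

```python
from pathlib import Path

# Abbreviated subset of the documented extensions, for illustration only.
SUPPORTED_EXTENSIONS = frozenset(
    {".pdf", ".docx", ".pptx", ".xlsx", ".html", ".png", ".jpg", ".csv", ".epub", ".mp3"}
)


def is_supported(filename: str) -> bool:
    """Return True if the file's final suffix is in the supported set."""
    # Assumption: extensions are compared case-insensitively.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("report.PDF"))
print(is_supported("archive.tar.gz"))
```

Only the final suffix is inspected, so a name like "archive.tar.gz" is checked as ".gz".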
class ParsingJob:

Response schema for a parsing job.

id: str
status: StatusEnum

Enum for representing the status of a job

Accepts one of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"
error_code: Optional[str]
error_message: Optional[str]
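
Since a job starts out as PENDING, callers typically poll until the status changes. The sketch below is one way to do that under stated assumptions: every status other than PENDING is treated as terminal, which this reference does not confirm, and fetch_job is a hypothetical callable returning an object with a status attribute like ParsingJob. The helper wait_for_job is not part of the SDK.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable

# Assumption: every status except "PENDING" means the job has finished.
TERMINAL_STATUSES = frozenset({"SUCCESS", "ERROR", "PARTIAL_SUCCESS", "CANCELLED"})


def wait_for_job(
    fetch_job: Callable[[str], Any],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch_job(job_id) repeatedly until its status is terminal or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.status} after {timeout}s")
        sleep(interval)


# Demo with a fake fetcher that reports PENDING twice and then SUCCESS.
_statuses = iter(["PENDING", "PENDING", "SUCCESS"])


def fake_fetch(job_id: str) -> SimpleNamespace:
    return SimpleNamespace(id=job_id, status=next(_statuses), error_code=None, error_message=None)


finished = wait_for_job(fake_fetch, "job-1", interval=0.0, sleep=lambda _seconds: None)
print(finished.status)
```

In real use, fetch_job would wrap whatever call retrieves the job, for example the Get Parse Job operation above. On a terminal status other than SUCCESS, error_code and error_message are the fields to inspect.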
Literal["af", "az", "bs", 83 more]

Enum for representing the languages supported by the parser.

Accepts one of the following:
"af"
"az"
"bs"
"cs"
"cy"
"da"
"de"
"en"
"es"
"et"
"fr"
"ga"
"hr"
"hu"
"id"
"is"
"it"
"ku"
"la"
"lt"
"lv"
"mi"
"ms"
"mt"
"nl"
"no"
"oc"
"pi"
"pl"
"pt"
"ro"
"rs_latin"
"sk"
"sl"
"sq"
"sv"
"sw"
"tl"
"tr"
"uz"
"vi"
"ar"
"fa"
"ug"
"ur"
"bn"
"as"
"mni"
"ru"
"rs_cyrillic"
"be"
"bg"
"uk"
"mn"
"abq"
"ady"
"kbd"
"ava"
"dar"
"inh"
"che"
"lbe"
"lez"
"tab"
"tjk"
"hi"
"mr"
"ne"
"bh"
"mai"
"ang"
"bho"
"mah"
"sck"
"new"
"gom"
"sa"
"bgc"
"th"
"ch_sim"
"ch_tra"
"ja"
"ko"
"ta"
"te"
"kn"
Literal["parse_page_without_llm", "parse_page_with_llm", "parse_page_with_lvm", 5 more]

Enum for representing the mode of parsing to be used.

Accepts one of the following:
"parse_page_without_llm"
"parse_page_with_llm"
"parse_page_with_lvm"
"parse_page_with_agent"
"parse_page_with_layout_agent"
"parse_document_with_llm"
"parse_document_with_lvm"
"parse_document_with_agent"
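
The mode names follow a parse_<scope>_... pattern, where the scope is either page or document. The sketch below validates a mode string and extracts that scope. Treating the second name component as a processing scope is an inference from the names alone, not documented behavior, and the helper mode_scope is hypothetical.

```python
PARSE_MODES = (
    "parse_page_without_llm",
    "parse_page_with_llm",
    "parse_page_with_lvm",
    "parse_page_with_agent",
    "parse_page_with_layout_agent",
    "parse_document_with_llm",
    "parse_document_with_lvm",
    "parse_document_with_agent",
)


def mode_scope(mode: str) -> str:
    """Return "page" or "document" for a documented parse mode."""
    if mode not in PARSE_MODES:
        raise ValueError(f"unknown parse mode: {mode!r}")
    # Second underscore-separated component, e.g. "parse_page_with_llm" -> "page".
    return mode.split("_")[1]


print(mode_scope("parse_page_with_layout_agent"))
print(mode_scope("parse_document_with_lvm"))
```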
Literal["PENDING", "SUCCESS", "ERROR", 2 more]

Enum for representing the status of a job

Accepts one of the following:
"PENDING"
"SUCCESS"
"ERROR"
"PARTIAL_SUCCESS"
"CANCELLED"