Documents

Create Batch Pipeline Documents
POST /api/v1/pipelines/{pipeline_id}/documents
Paginated List Pipeline Documents
GET /api/v1/pipelines/{pipeline_id}/documents/paginated
Get Pipeline Document
GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}
Delete Pipeline Document
DELETE /api/v1/pipelines/{pipeline_id}/documents/{document_id}
Get Pipeline Document Status
GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}/status
Sync Pipeline Document
POST /api/v1/pipelines/{pipeline_id}/documents/{document_id}/sync
List Pipeline Document Chunks
GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}/chunks
Upsert Batch Pipeline Documents
PUT /api/v1/pipelines/{pipeline_id}/documents
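
The endpoints above all hang off one pipeline-scoped prefix, so a client can build them from two small helpers. The following is a minimal sketch, not an official client: the helper names documents_path and document_path are hypothetical, and the host and authentication are deliberately left out because this page does not state them.

```python
from urllib.parse import quote

# Path prefix taken from the endpoint list; the host is not given in this reference.
API_PREFIX = "/api/v1/pipelines"


def documents_path(pipeline_id: str, suffix: str = "") -> str:
    """Collection path: POST (create batch), PUT (upsert batch), or GET with 'paginated'."""
    path = f"{API_PREFIX}/{quote(pipeline_id, safe='')}/documents"
    return f"{path}/{suffix}" if suffix else path


def document_path(pipeline_id: str, document_id: str, action: str = "") -> str:
    """Single-document path; action may be '', 'status', 'sync', or 'chunks'."""
    base = f"{documents_path(pipeline_id)}/{quote(document_id, safe='')}"
    return f"{base}/{action}" if action else base
```

For example, document_path("p1", "d1", "status") yields the path for Get Pipeline Document Status. Percent-encoding the IDs is a defensive choice for IDs that contain reserved characters such as "/".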
Models
CloudDocument = object { id, metadata, text, 4 more }

Cloud document stored in S3.

id: string
metadata: map[unknown]
text: string
excluded_embed_metadata_keys: optional array of string
excluded_llm_metadata_keys: optional array of string
page_positions: optional array of number

Indices in CloudDocument.text where each new page begins. For example, the second page starts at the index given by page_positions[1].

status_metadata: optional map[unknown]
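
The page_positions field of CloudDocument can be used to recover per-page text. The sketch below assumes that the positions are sorted in ascending order and that page_positions[0] marks the start of the first page (typically 0); neither assumption is stated in this reference. The function name split_pages is hypothetical.

```python
def split_pages(text: str, page_positions: list[int]) -> list[str]:
    """Slice text into pages, where page i starts at page_positions[i].

    Assumes ascending positions; the last page runs to the end of the text.
    """
    if not page_positions:
        # No page information: treat the whole text as a single page.
        return [text]
    bounds = list(page_positions) + [len(text)]
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]


pages = split_pages("Page one.Page two.", [0, 9])
# pages[1] is the second page, starting at page_positions[1] == 9.
```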
CloudDocumentCreate = object { metadata, text, id, 3 more }

Create a new cloud document.

metadata: map[unknown]
text: string
id: optional string
excluded_embed_metadata_keys: optional array of string
excluded_llm_metadata_keys: optional array of string
page_positions: optional array of number

Indices in the document text where each new page begins. For example, the second page starts at the index given by page_positions[1].
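
As an illustration of the CloudDocumentCreate shape, the following sketch mirrors the fields above in a Python dataclass and serializes only the fields that are set. The dataclass and its to_payload method are hypothetical helpers, not part of any official SDK, and this page does not state how the batch endpoints wrap these objects in the request body.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class CloudDocumentCreate:
    # Required fields, per the model definition above.
    metadata: dict[str, Any]
    text: str
    # Optional fields default to None and are omitted from the payload.
    id: Optional[str] = None
    excluded_embed_metadata_keys: Optional[list[str]] = None
    excluded_llm_metadata_keys: Optional[list[str]] = None
    page_positions: Optional[list[int]] = None

    def to_payload(self) -> dict[str, Any]:
        """Return a JSON-serializable dict without unset optional fields."""
        return {k: v for k, v in asdict(self).items() if v is not None}


doc = CloudDocumentCreate(metadata={"source": "report.pdf"}, text="Hello world")
body = json.dumps(doc.to_payload())
```

Dropping unset fields keeps the payload minimal. Whether the API treats an explicit null differently from an absent field is not specified here.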

TextNode = object { class_name, embedding, end_char_idx, 11 more }

Provided for backward compatibility.

Note: the field name metadata_seperator keeps its misspelling ("seperator") to maintain backward compatibility with serialized objects.

class_name: optional string
embedding: optional array of number

Embedding of the node.

end_char_idx: optional number

End char index of the node.

excluded_embed_metadata_keys: optional array of string

Metadata keys that are excluded from text for the embed model.

excluded_llm_metadata_keys: optional array of string

Metadata keys that are excluded from text for the LLM.

extra_info: optional map[unknown]

A flat dictionary of metadata fields.

id_: optional string

Unique ID of the node.

metadata_seperator: optional string

Separator between metadata fields when converting to string.

metadata_template: optional string

Template for how metadata is formatted, with {key} and {value} placeholders.

mimetype: optional string

MIME type of the node content.

relationships: optional map[object { node_id, class_name, hash, 2 more } or array of object { node_id, class_name, hash, 2 more } ]

A mapping of relationships to other node information.

Accepts one of the following:
RelatedNodeInfo = object { node_id, class_name, hash, 2 more }
node_id: string
class_name: optional string
hash: optional string
metadata: optional map[unknown]
node_type: optional "1" or "2" or "3" or 2 more or string
Accepts one of the following:
ObjectType = "1" or "2" or "3" or 2 more
Accepts one of the following:
"1"
"2"
"3"
"4"
"5"
UnionMember1 = string
UnionMember1 = array of object { node_id, class_name, hash, 2 more }
node_id: string
class_name: optional string
hash: optional string
metadata: optional map[unknown]
node_type: optional "1" or "2" or "3" or 2 more or string
Accepts one of the following:
ObjectType = "1" or "2" or "3" or 2 more
Accepts one of the following:
"1"
"2"
"3"
"4"
"5"
UnionMember1 = string
start_char_idx: optional number

Start char index of the node.

text: optional string

Text content of the node.

text_template: optional string

Template for how text is formatted, with {content} and {metadata_str} placeholders.
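
The template fields of TextNode describe how metadata and text are combined into one string. The sketch below shows one plausible way they fit together, based only on the placeholder names documented above; the function render_node_text and the template values passed to it are illustrative assumptions, not the service's actual implementation or defaults.

```python
from typing import Any, Iterable


def render_node_text(
    text: str,
    metadata: dict[str, Any],
    metadata_template: str,
    metadata_separator: str,  # corresponds to the API field metadata_seperator
    text_template: str,
    excluded_keys: Iterable[str] = (),
) -> str:
    """Format each metadata entry, join the entries, then fill the text template."""
    excluded = set(excluded_keys)
    metadata_str = metadata_separator.join(
        metadata_template.format(key=key, value=value)
        for key, value in metadata.items()
        if key not in excluded
    )
    return text_template.format(metadata_str=metadata_str, content=text)


rendered = render_node_text(
    text="Hello",
    metadata={"title": "Doc", "secret": "x"},
    metadata_template="{key}: {value}",
    metadata_separator="\n",
    text_template="{metadata_str}\n\n{content}",
    excluded_keys=["secret"],  # e.g. the keys in excluded_llm_metadata_keys
)
```

Passing excluded_embed_metadata_keys or excluded_llm_metadata_keys as excluded_keys would produce the embed-model and LLM views of the same node, respectively.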