Documents

Create Batch Pipeline Documents

POST /api/v1/pipelines/{pipeline_id}/documents

Paginated List Pipeline Documents

GET /api/v1/pipelines/{pipeline_id}/documents/paginated

Get Pipeline Document

GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}

Delete Pipeline Document

DELETE /api/v1/pipelines/{pipeline_id}/documents/{document_id}

Get Pipeline Document Status

GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}/status

Sync Pipeline Document

POST /api/v1/pipelines/{pipeline_id}/documents/{document_id}/sync

List Pipeline Document Chunks

GET /api/v1/pipelines/{pipeline_id}/documents/{document_id}/chunks

Upsert Batch Pipeline Documents

PUT /api/v1/pipelines/{pipeline_id}/documents
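The endpoints above share one path prefix and differ only in the optional document ID and a trailing segment. As a minimal sketch, the helper below assembles those paths. The function name `documents_path` and its parameters are hypothetical, not part of any client library, and no host or authentication scheme is assumed.

```python
from typing import Optional
from urllib.parse import quote

# Path prefix taken from the endpoint list above.
API_PREFIX = "/api/v1/pipelines"


def documents_path(
    pipeline_id: str,
    document_id: Optional[str] = None,
    action: Optional[str] = None,
) -> str:
    """Build the path for a pipeline-documents endpoint (hypothetical helper)."""
    # Percent-encode IDs so unusual characters cannot alter the path structure.
    path = f"{API_PREFIX}/{quote(pipeline_id, safe='')}/documents"
    if document_id is not None:
        path += f"/{quote(document_id, safe='')}"
    if action is not None:
        # e.g. "paginated", "status", "sync", or "chunks"
        path += f"/{action}"
    return path


if __name__ == "__main__":
    print(documents_path("my-pipeline"))                    # batch create (POST) / upsert (PUT)
    print(documents_path("my-pipeline", action="paginated"))
    print(documents_path("my-pipeline", "doc-1", "status"))
```

Keeping path construction in one place makes it harder to mistype a segment when calling several of these endpoints.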

Models

CloudDocument = object { id, metadata, text, 4 more }

Cloud document stored in S3.

id: string

metadata: map[unknown]

text: string

excluded_embed_metadata_keys: optional array of string

excluded_llm_metadata_keys: optional array of string

page_positions: optional array of number

Indices in CloudDocument.text where a new page begins. For example, the second page starts at the index given by page_positions[1].
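To illustrate how page_positions relates to text, here is a minimal sketch that slices a document's text into pages. It assumes the positions are sorted in ascending order and that page_positions[0] marks the start of the first page; neither assumption is stated in this reference. The function name `split_pages` is hypothetical.

```python
from typing import List, Sequence


def split_pages(text: str, page_positions: Sequence[float]) -> List[str]:
    """Slice `text` into pages using the start index of each page.

    Assumes ascending positions, with page_positions[0] being the start
    of the first page (an assumption, not confirmed by the API reference).
    """
    if not page_positions:
        # No page information: treat the whole text as a single page.
        return [text]
    # The schema types the values as numbers, so coerce them to ints for slicing.
    starts = [int(p) for p in page_positions]
    # Each page ends where the next begins; the last page runs to the end.
    ends = starts[1:] + [len(text)]
    return [text[start:end] for start, end in zip(starts, ends)]


if __name__ == "__main__":
    pages = split_pages("AlphaBetaGamma", [0, 5, 9])
    print(pages)
```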

status_metadata: optional map[unknown]

CloudDocumentCreate = object { metadata, text, id, 3 more }

Create a new cloud document.

metadata: map[unknown]

text: string

id: optional string

excluded_embed_metadata_keys: optional array of string

excluded_llm_metadata_keys: optional array of string

page_positions: optional array of number

Indices in CloudDocument.text where a new page begins. For example, the second page starts at the index given by page_positions[1].
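As a rough sketch of the CloudDocumentCreate shape, the dataclass below mirrors the fields listed above, with the two required fields first and the optional ones defaulting to None. Omitting unset optional fields from the serialized payload is an assumption made for illustration; this reference does not say how the server treats explicit nulls. The class and method here are local stand-ins, not an official client type.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CloudDocumentCreate:
    """Local stand-in mirroring the CloudDocumentCreate fields listed above."""

    metadata: Dict[str, Any]  # required
    text: str                 # required
    id: Optional[str] = None
    excluded_embed_metadata_keys: Optional[List[str]] = None
    excluded_llm_metadata_keys: Optional[List[str]] = None
    page_positions: Optional[List[int]] = None

    def to_payload(self) -> Dict[str, Any]:
        # Assumption: unset optional fields are left out rather than sent as null.
        return {k: v for k, v in asdict(self).items() if v is not None}


if __name__ == "__main__":
    doc = CloudDocumentCreate(
        metadata={"source": "report.txt"},
        text="Page one. Page two.",
        page_positions=[0, 10],
    )
    print(json.dumps(doc.to_payload(), indent=2))
```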

TextNode = object { class_name, embedding, end_char_idx, 11 more }

Provided for backward compatibility.

Note: the field name metadata_seperator keeps the typo "seperator" to maintain backward compatibility for serialized objects.

class_name: optional string

embedding: optional array of number

Embedding of the node.

end_char_idx: optional number

End char index of the node.

excluded_embed_metadata_keys: optional array of string

Metadata keys that are excluded from text for the embed model.

excluded_llm_metadata_keys: optional array of string

Metadata keys that are excluded from text for the LLM.

extra_info: optional map[unknown]

A flat dictionary of metadata fields.

id_: optional string

Unique ID of the node.

metadata_seperator: optional string

Separator between metadata fields when converting to string.

metadata_template: optional string

Template for how metadata is formatted, with {key} and {value} placeholders.

mimetype: optional string

MIME type of the node content.

relationships: optional map[object { node_id, class_name, hash, 2 more } or array of object { node_id, class_name, hash, 2 more } ]

A mapping of relationships to other node information.

Accepts one of the following:

RelatedNodeInfo = object { node_id, class_name, hash, 2 more }

node_id: string

class_name: optional string

hash: optional string

metadata: optional map[unknown]

node_type: optional "1" or "2" or "3" or 2 more or string

Accepts one of the following:

ObjectType = "1" or "2" or "3" or 2 more

Accepts one of the following:

"1"

"2"

"3"

"4"

"5"

UnionMember1 = string

UnionMember1 = array of object { node_id, class_name, hash, 2 more }

node_id: string

class_name: optional string

hash: optional string

metadata: optional map[unknown]

node_type: optional "1" or "2" or "3" or 2 more or string

Accepts one of the following:

ObjectType = "1" or "2" or "3" or 2 more

Accepts one of the following:

"1"

"2"

"3"

"4"

"5"

UnionMember1 = string
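Because each relationships value may be either a single RelatedNodeInfo object or an array of them, consumers usually need to normalize both shapes before iterating. The sketch below shows one way to do that. The helper names and the relationship keys in the example ("rel_a", "rel_b") are placeholders; this reference does not list the actual keys.

```python
from typing import Any, Dict, List, Union

RelatedNode = Dict[str, Any]  # a RelatedNodeInfo-shaped dict; node_id is required


def as_related_list(value: Union[RelatedNode, List[RelatedNode]]) -> List[RelatedNode]:
    """Return the relationship value as a list, whichever shape it arrived in."""
    if isinstance(value, list):
        return list(value)
    return [value]


def related_node_ids(
    relationships: Dict[str, Union[RelatedNode, List[RelatedNode]]],
) -> Dict[str, List[str]]:
    """Map each relationship key to the node_id values it points at."""
    return {
        key: [node["node_id"] for node in as_related_list(value)]
        for key, value in relationships.items()
    }


if __name__ == "__main__":
    example = {
        "rel_a": {"node_id": "n1"},                        # single object
        "rel_b": [{"node_id": "n2"}, {"node_id": "n3"}],  # array of objects
    }
    print(related_node_ids(example))
```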

start_char_idx: optional number

Start char index of the node.

text: optional string

Text content of the node.

text_template: optional string

Template for how text is formatted, with {content} and {metadata_str} placeholders.
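The template and separator fields describe how a node's metadata and text are combined into one string. The following sketch shows one plausible way the pieces fit together, based only on the placeholder names given above. It is not the library's implementation, and the template values used as defaults here are illustrative assumptions rather than documented defaults.

```python
from typing import Dict, Iterable, Optional


def render_node_text(
    text: str,
    metadata: Dict[str, object],
    excluded_keys: Optional[Iterable[str]] = None,
    metadata_template: str = "{key}: {value}",           # illustrative value
    metadata_separator: str = "\n",                      # API field: metadata_seperator
    text_template: str = "{metadata_str}\n\n{content}",  # illustrative value
) -> str:
    """Combine metadata and text using the node's templates (hypothetical helper)."""
    excluded = set(excluded_keys or [])
    # Format each kept metadata entry, then join the entries with the separator.
    metadata_str = metadata_separator.join(
        metadata_template.format(key=key, value=value)
        for key, value in metadata.items()
        if key not in excluded
    )
    return text_template.format(metadata_str=metadata_str, content=text)


if __name__ == "__main__":
    rendered = render_node_text(
        "Hello",
        {"title": "Intro", "internal_id": "x"},
        excluded_keys=["internal_id"],  # e.g. the excluded_llm_metadata_keys value
    )
    print(rendered)
```

Passing the excluded_embed_metadata_keys or excluded_llm_metadata_keys list as `excluded_keys` would yield the embed-model or LLM view of the same node, respectively.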