Create Split Job

client.beta.split.create(params: SplitCreateParams { categories, document_input, organization_id, 2 more }, options?: RequestOptions): SplitCreateResponse { id, categories, document_input, 7 more }
POST /api/v1/beta/split/jobs

Create a document split job. Experimental: This endpoint is not yet ready for production use and is subject to change at any time.

Parameters
params: SplitCreateParams { categories, document_input, organization_id, 2 more }
categories: Array<SplitCategory { name, description } >

Body param: Categories to split the document into.

name: string

Name of the category.

maxLength: 200
minLength: 1
description?: string | null

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
document_input: SplitDocumentInput { type, value }

Body param: Document to be split.

type: string

Type of document input. Valid values are: file_id

value: string

Document identifier.

organization_id?: string | null

Query param

format: uuid
project_id?: string | null

Query param

format: uuid
splitting_strategy?: SplittingStrategy

Body param: Strategy for splitting the document.

allow_uncategorized?: boolean

Whether pages that don't match any category may be grouped as 'uncategorized'. If false, every page must be assigned to a defined category.
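The length limits on categories can be checked before sending a request. The following is a minimal sketch, not part of the SDK: `validateCategories` is a hypothetical helper, the `SplitCategory` interface is a local mirror of the shape documented above, and the file ID and category names are placeholders. It assumes the limits count characters the same way JavaScript string length does, which may differ from server-side validation.

```typescript
// Local mirror of the documented category shape; not imported from the SDK.
interface SplitCategory {
  name: string;
  description?: string | null;
}

// Hypothetical helper: returns a list of problems, empty when the input passes.
// Assumption: limits are compared against JavaScript string length.
function validateCategories(categories: SplitCategory[]): string[] {
  const problems: string[] = [];
  categories.forEach((category, index) => {
    const nameLength = category.name.length;
    if (nameLength < 1 || nameLength > 200) {
      problems.push(`categories[${index}].name must be 1 to 200 characters`);
    }
    const description = category.description;
    if (description != null && (description.length < 1 || description.length > 2000)) {
      problems.push(`categories[${index}].description must be 1 to 2000 characters`);
    }
  });
  return problems;
}

// Example request parameters using the documented fields; values are placeholders.
const params = {
  categories: [
    { name: 'invoice', description: 'Pages that list billed items and totals.' },
    { name: 'contract' },
  ],
  document_input: { type: 'file_id', value: 'your-file-id' },
  splitting_strategy: { allow_uncategorized: true },
};

console.log(validateCategories(params.categories)); // prints []
```

An object shaped like `params` could then be passed to client.beta.split.create once the check returns no problems.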

Returns
SplitCreateResponse { id, categories, document_input, 7 more }

A document split job.

id: string

Unique identifier for the split job.

categories: Array<SplitCategory { name, description } >

Categories used for splitting.

name: string

Name of the category.

maxLength: 200
minLength: 1
description?: string | null

Optional description of what content belongs in this category.

maxLength: 2000
minLength: 1
document_input: SplitDocumentInput { type, value }

Document that was split.

type: string

Type of document input. Valid values are: file_id

value: string

Document identifier.

project_id: string

Project ID this job belongs to.

status: string

Current status of the job. Valid values are: pending, processing, completed, failed.

user_id: string

User ID who created this job.

created_at?: string | null

Date and time the job was created.

format: date-time
error_message?: string | null

Error message if the job failed.

result?: SplitResultResponse { segments } | null

Result of a completed split job.

segments: Array<SplitSegmentResponse { category, confidence_category, pages } >

List of document segments.

category: string

Category name this split belongs to.

confidence_category: string

Categorical confidence level. Valid values are: high, medium, low.

pages: Array<number>

1-indexed page numbers in this split.

updated_at?: string | null

Date and time the job was last updated.

format: date-time
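The status and error_message fields can be combined into a readable summary of a job. The following is a minimal sketch under stated assumptions: `isTerminal` and `describeJob` are hypothetical helpers, `SplitJobLike` is a local subset of the response shape above, and treating completed and failed as final states is an assumption that the reference does not state explicitly.

```typescript
// Local subset of the documented response shape; not imported from the SDK.
interface SplitJobLike {
  id: string;
  status: string;
  error_message?: string | null;
}

// Assumption: 'completed' and 'failed' are final states.
function isTerminal(status: string): boolean {
  return status === 'completed' || status === 'failed';
}

// Hypothetical helper: turns a job into a one-line, human-readable summary.
function describeJob(job: SplitJobLike): string {
  switch (job.status) {
    case 'completed':
      return `Job ${job.id} completed`;
    case 'failed':
      return `Job ${job.id} failed: ${job.error_message ?? 'no error message provided'}`;
    case 'pending':
    case 'processing':
      return `Job ${job.id} is still ${job.status}`;
    default:
      // The endpoint is experimental, so unknown statuses are reported rather than rejected.
      return `Job ${job.id} has unrecognized status "${job.status}"`;
  }
}
```

Falling through to a default branch keeps the helper usable if new status values are introduced while the endpoint is experimental.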

Create Split Job

import LlamaCloud from '@llamaindex/llama-cloud';

const client = new LlamaCloud({
  apiKey: process.env['LLAMA_CLOUD_API_KEY'], // This is the default and can be omitted
});

const split = await client.beta.split.create({
  categories: [{ name: 'x' }],
  document_input: { type: 'file_id', value: 'value' }, // 'file_id' is the only documented input type
});

console.log(split.id);
Returns Examples
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "file_id",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          1
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
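Once a job has a result, its segments can be regrouped for downstream use. The following is a minimal sketch, assuming a response shaped like the example above: `pagesByCategory` and `toZeroBased` are hypothetical helpers, `SplitSegment` is a local mirror of the documented segment shape, and the sample segments are made up for illustration. Because pages are documented as 1-indexed, the second helper converts them for code that indexes pages from 0.

```typescript
// Local mirror of the documented segment shape; not imported from the SDK.
interface SplitSegment {
  category: string;
  confidence_category: string;
  pages: number[];
}

// Collects the page numbers of all segments per category, sorted ascending.
function pagesByCategory(segments: SplitSegment[]): Map<string, number[]> {
  const grouped = new Map<string, number[]>();
  for (const segment of segments) {
    const pages = grouped.get(segment.category) ?? [];
    pages.push(...segment.pages);
    grouped.set(segment.category, pages);
  }
  grouped.forEach((pages) => pages.sort((a, b) => a - b));
  return grouped;
}

// Converts the documented 1-indexed page numbers to 0-based array indices.
function toZeroBased(pages: number[]): number[] {
  return pages.map((page) => page - 1);
}

// Illustrative input only; real segments come from result.segments.
const exampleSegments: SplitSegment[] = [
  { category: 'invoice', confidence_category: 'high', pages: [1, 2] },
  { category: 'contract', confidence_category: 'medium', pages: [3] },
];
const exampleGrouped = pagesByCategory(exampleSegments);
console.log(toZeroBased(exampleGrouped.get('invoice') ?? []));
```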