# Split

## Create Split Job

`SplitCreateResponse beta().split().create(SplitCreateParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/beta/split/jobs`

Create a document split job.

### Parameters

- `SplitCreateParams params`

  - `Optional organizationId`

  - `Optional projectId`

  - `SplitDocumentInput documentInput`

    Document to be split.

  - `Optional configuration`

    Split configuration with categories and splitting strategy.

    - `List categories`

      Categories to split documents into.

      - `String name`

        Name of the category.

      - `Optional description`

        Optional description of what content belongs in this category.

    - `Optional splittingStrategy`

      Strategy for splitting documents.

      - `Optional allowUncategorized`

        Controls handling of pages that don't match any category. 'include': pages can be grouped as 'uncategorized' and included in results. 'forbid': all pages must be assigned to a defined category. 'omit': pages can be classified as 'uncategorized' but are excluded from results.

        - `INCLUDE("include")`

        - `FORBID("forbid")`

        - `OMIT("omit")`

  - `Optional configurationId`

    Saved split configuration ID.

### Returns

- `class SplitCreateResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id.

    - `String value`

      Document identifier.

  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional configurationId`

    Split configuration ID used for this job.
  - `Optional createdAt`

    Creation datetime

  - `Optional errorMessage`

    Error message if the job failed.

  - `Optional result`

    Result of a completed split job.

    - `List segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List pages`

        1-indexed page numbers in this split.

  - `Optional updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitCreateParams;
import com.llamacloud_prod.api.models.beta.split.SplitCreateResponse;
import com.llamacloud_prod.api.models.beta.split.SplitDocumentInput;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitCreateParams params = SplitCreateParams.builder()
            .documentInput(SplitDocumentInput.builder()
                .type("type")
                .value("value")
                .build())
            .build();
        SplitCreateResponse split = client.beta().split().create(params);
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## List Split Jobs

`SplitListPage beta().split().list(SplitListParams params = SplitListParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs`

List document split jobs.
### Parameters

- `SplitListParams params`

  - `Optional createdAtOnOrAfter`

    Include items created at or after this timestamp (inclusive)

  - `Optional createdAtOnOrBefore`

    Include items created at or before this timestamp (inclusive)

  - `Optional<List<String>> jobIds`

    Filter by specific job IDs

  - `Optional organizationId`

  - `Optional pageSize`

  - `Optional pageToken`

  - `Optional projectId`

  - `Optional status`

    Filter by job status (pending, processing, completed, failed, cancelled)

    - `PENDING("pending")`

    - `PROCESSING("processing")`

    - `COMPLETED("completed")`

    - `FAILED("failed")`

    - `CANCELLED("cancelled")`

### Returns

- `class SplitListResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id.

    - `String value`

      Document identifier.

  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional configurationId`

    Split configuration ID used for this job.

  - `Optional createdAt`

    Creation datetime

  - `Optional errorMessage`

    Error message if the job failed.

  - `Optional result`

    Result of a completed split job.

    - `List segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List pages`

        1-indexed page numbers in this split.
  - `Optional updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitListPage;
import com.llamacloud_prod.api.models.beta.split.SplitListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitListPage page = client.beta().split().list();
    }
}
```

#### Response

```json
{
  "items": [
    {
      "id": "id",
      "categories": [
        {
          "name": "x",
          "description": "x"
        }
      ],
      "document_input": {
        "type": "type",
        "value": "value"
      },
      "project_id": "project_id",
      "status": "status",
      "user_id": "user_id",
      "configuration_id": "configuration_id",
      "created_at": "2019-12-27T18:11:19.117Z",
      "error_message": "error_message",
      "result": {
        "segments": [
          {
            "category": "category",
            "confidence_category": "confidence_category",
            "pages": [
              0
            ]
          }
        ]
      },
      "updated_at": "2019-12-27T18:11:19.117Z"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Get Split Job

`SplitGetResponse beta().split().get(SplitGetParams params = SplitGetParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs/{split_job_id}`

Get a document split job.

### Parameters

- `SplitGetParams params`

  - `Optional splitJobId`

  - `Optional organizationId`

  - `Optional projectId`

### Returns

- `class SplitGetResponse:`

  Beta response that uses a nested `document_input` object.

  - `String id`

    Unique identifier for the split job.

  - `List categories`

    Categories used for splitting.

    - `String name`

      Name of the category.

    - `Optional description`

      Optional description of what content belongs in this category.

  - `SplitDocumentInput documentInput`

    Document that was split.

    - `String type`

      Type of document input. Valid values are: file_id.

    - `String value`

      Document identifier.
  - `String projectId`

    Project ID this job belongs to.

  - `String status`

    Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.

  - `String userId`

    User ID who created this job.

  - `Optional configurationId`

    Split configuration ID used for this job.

  - `Optional createdAt`

    Creation datetime

  - `Optional errorMessage`

    Error message if the job failed.

  - `Optional result`

    Result of a completed split job.

    - `List segments`

      List of document segments.

      - `String category`

        Category name this split belongs to.

      - `String confidenceCategory`

        Categorical confidence level. Valid values are: high, medium, low.

      - `List pages`

        1-indexed page numbers in this split.

  - `Optional updatedAt`

    Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitGetParams;
import com.llamacloud_prod.api.models.beta.split.SplitGetResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitGetResponse split = client.beta().split().get("split_job_id");
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Domain Types

### Split Category

- `class SplitCategory:`

  Category definition for document splitting.

  - `String name`

    Name of the category.

  - `Optional description`

    Optional description of what content belongs in this category.
### Split Document Input

- `class SplitDocumentInput:`

  Document input specification for beta API.

  - `String type`

    Type of document input. Valid values are: file_id.

  - `String value`

    Document identifier.

### Split Result Response

- `class SplitResultResponse:`

  Result of a completed split job.

  - `List segments`

    List of document segments.

    - `String category`

      Category name this split belongs to.

    - `String confidenceCategory`

      Categorical confidence level. Valid values are: high, medium, low.

    - `List pages`

      1-indexed page numbers in this split.

### Split Segment Response

- `class SplitSegmentResponse:`

  A segment of the split document.

  - `String category`

    Category name this split belongs to.

  - `String confidenceCategory`

    Categorical confidence level. Valid values are: high, medium, low.

  - `List pages`

    1-indexed page numbers in this split.
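
As a rough illustration of consuming the `segments` list described above, the sketch below groups pages by category and converts the documented 1-indexed page numbers to zero-based indices, which most PDF and array APIs expect. It also flags segments whose `confidenceCategory` is not `high`. `SplitSegmentGrouping` and its nested `Segment` class are hypothetical local stand-ins, not SDK types; real code would copy the values out of the SDK's response objects, whose accessor names are not shown in this reference.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of the SDK.
final class SplitSegmentGrouping {
    private SplitSegmentGrouping() {}

    // Local stand-in mirroring the documented segment fields.
    static final class Segment {
        final String category;
        final String confidenceCategory; // "high", "medium", or "low"
        final List<Integer> pages;       // 1-indexed page numbers

        Segment(String category, String confidenceCategory, List<Integer> pages) {
            this.category = category;
            this.confidenceCategory = confidenceCategory;
            this.pages = pages;
        }
    }

    // Collects zero-based page indices per category, keeping first-seen category order.
    static Map<String, List<Integer>> zeroBasedPagesByCategory(List<Segment> segments) {
        Map<String, List<Integer>> byCategory = new LinkedHashMap<>();
        for (Segment segment : segments) {
            List<Integer> indices =
                byCategory.computeIfAbsent(segment.category, k -> new ArrayList<>());
            for (int page : segment.pages) {
                if (page < 1) {
                    throw new IllegalArgumentException("pages are 1-indexed, got " + page);
                }
                indices.add(page - 1);
            }
        }
        return byCategory;
    }

    // Returns segments whose confidence is anything other than "high".
    static List<Segment> needsReview(List<Segment> segments) {
        List<Segment> flagged = new ArrayList<>();
        for (Segment segment : segments) {
            if (!"high".equals(segment.confidenceCategory)) {
                flagged.add(segment);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment("invoice", "high", List.of(1, 2)),
            new Segment("contract", "low", List.of(3, 4, 5)),
            new Segment("invoice", "medium", List.of(6)));

        System.out.println(zeroBasedPagesByCategory(segments)); // {invoice=[0, 1, 5], contract=[2, 3, 4]}
        System.out.println(needsReview(segments).size());       // 2
    }
}
```

Treating anything below `high` as needing review is an arbitrary threshold chosen for the example; the reference above does not prescribe how the confidence levels should be used.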