# Beta

# Indexes

## Get Index

`IndexGetResponse beta().indexes().get(IndexGetParams params = IndexGetParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/indexes/{index_id}`

Get an index by ID.

### Parameters

- `IndexGetParams params`
  - `Optional indexId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class IndexGetResponse:` A searchable index over a directory of documents.
  - `String id` Unique identifier
  - `String exportConfigId` ID of the export configuration.
  - `String name` Index name.
  - `String projectId` Project this index belongs to.
  - `String sourceDirectoryId` ID of the source directory.
  - `String syncConfigId` ID of the sync configuration.
  - `Optional createdAt` Creation datetime
  - `Optional description` Index description.
  - `Optional lastExportedAt` Last export time.
  - `Optional lastSyncedAt` Last sync time.
  - `Optional metadata` Build state and diagnostic info.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.indexes.IndexGetParams;
import com.llamacloud_prod.api.models.beta.indexes.IndexGetResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        IndexGetResponse index = client.beta().indexes().get("index_id");
    }
}
```

#### Response

```json
{
  "id": "id",
  "export_config_id": "export_config_id",
  "name": "name",
  "project_id": "project_id",
  "source_directory_id": "source_directory_id",
  "sync_config_id": "sync_config_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "description": "description",
  "last_exported_at": "2019-12-27T18:11:19.117Z",
  "last_synced_at": "2019-12-27T18:11:19.117Z",
  "metadata": {
    "foo": "bar"
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Delete Index

`beta().indexes().delete(IndexDeleteParams params = IndexDeleteParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**delete** `/api/v1/indexes/{index_id}`

Delete an index.

### Parameters

- `IndexDeleteParams params`
  - `Optional indexId`
  - `Optional organizationId`
  - `Optional projectId`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.indexes.IndexDeleteParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        client.beta().indexes().delete("index_id");
    }
}
```

## Create Index

`IndexCreateResponse beta().indexes().create(IndexCreateParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/indexes`

Create a searchable index over a source directory.

### Parameters

- `IndexCreateParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `String sourceDirectoryId` ID of the source directory containing your documents.
  - `Optional description` Optional description of the index.
  - `Optional name` Optional display name for the index. If omitted, the index is named after the source directory.
  - `Optional> products` Product configurations for syncing. Omit to use a default parse configuration. Include an explicit entry per product type (e.g. parse, extract) to override the default.
    - `String productConfigId` ID of the product configuration.
    - `String productType` Product type. One of: parse, extract.
  - `Optional> storeAttachments` Attachment kinds to store alongside parsed output. Each entry must be one of: screenshots, items. For example, ['screenshots'] renders and stores per-page screenshots; ['items'] stores structured items with bounding boxes. Omit or pass an empty list to skip attachments.
  - `Optional syncFrequency` How often to re-run the sync. One of: manual, daily, on_source_change. Defaults to manual.
  - `Optional vectorTarget` Vector export destination for the index. 'DEFAULT' exports to the managed vector DB destination resolved from configuration. 'DISABLED' skips vector export; the export destination falls back to 'Download'.
    - `DEFAULT("DEFAULT")`
    - `DISABLED("DISABLED")`

### Returns

- `class IndexCreateResponse:` A searchable index over a directory of documents.
  - `String id` Unique identifier
  - `String exportConfigId` ID of the export configuration.
  - `String name` Index name.
  - `String projectId` Project this index belongs to.
  - `String sourceDirectoryId` ID of the source directory.
  - `String syncConfigId` ID of the sync configuration.
  - `Optional createdAt` Creation datetime
  - `Optional description` Index description.
  - `Optional lastExportedAt` Last export time.
  - `Optional lastSyncedAt` Last sync time.
  - `Optional metadata` Build state and diagnostic info.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.indexes.IndexCreateParams;
import com.llamacloud_prod.api.models.beta.indexes.IndexCreateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        IndexCreateParams params = IndexCreateParams.builder()
            .sourceDirectoryId("dir-abc123")
            .build();
        IndexCreateResponse index = client.beta().indexes().create(params);
    }
}
```

#### Response

```json
{
  "id": "id",
  "export_config_id": "export_config_id",
  "name": "name",
  "project_id": "project_id",
  "source_directory_id": "source_directory_id",
  "sync_config_id": "sync_config_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "description": "description",
  "last_exported_at": "2019-12-27T18:11:19.117Z",
  "last_synced_at": "2019-12-27T18:11:19.117Z",
  "metadata": {
    "foo": "bar"
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Sync Index

`JsonValue beta().indexes().sync(IndexSyncParams params = IndexSyncParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/indexes/{index_id}/sync`

Trigger a sync and export for an existing index, re-parsing changed files and exporting updated chunks.

### Parameters

- `IndexSyncParams params`
  - `Optional indexId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class IndexSyncResponse:`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.indexes.IndexSyncParams;
import com.llamacloud_prod.api.models.beta.indexes.IndexSyncResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        IndexSyncResponse response = client.beta().indexes().sync("index_id");
    }
}
```

#### Response

```json
{}
```

## List Indexes

`IndexListPage beta().indexes().list(IndexListParams params = IndexListParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/indexes`

List indexes for the current project.

### Parameters

- `IndexListParams params`
  - `Optional organizationId`
  - `Optional pageSize`
  - `Optional pageToken`
  - `Optional projectId`
  - `Optional sourceDirectoryId`

### Returns

- `class IndexListResponse:` A searchable index over a directory of documents.
  - `String id` Unique identifier
  - `String exportConfigId` ID of the export configuration.
  - `String name` Index name.
  - `String projectId` Project this index belongs to.
  - `String sourceDirectoryId` ID of the source directory.
  - `String syncConfigId` ID of the sync configuration.
  - `Optional createdAt` Creation datetime
  - `Optional description` Index description.
  - `Optional lastExportedAt` Last export time.
  - `Optional lastSyncedAt` Last sync time.
  - `Optional metadata` Build state and diagnostic info.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.indexes.IndexListPage;
import com.llamacloud_prod.api.models.beta.indexes.IndexListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        IndexListPage page = client.beta().indexes().list();
    }
}
```

#### Response

```json
{
  "items": [
    {
      "id": "id",
      "export_config_id": "export_config_id",
      "name": "name",
      "project_id": "project_id",
      "source_directory_id": "source_directory_id",
      "sync_config_id": "sync_config_id",
      "created_at": "2019-12-27T18:11:19.117Z",
      "description": "description",
      "last_exported_at": "2019-12-27T18:11:19.117Z",
      "last_synced_at": "2019-12-27T18:11:19.117Z",
      "metadata": {
        "foo": "bar"
      },
      "updated_at": "2019-12-27T18:11:19.117Z"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

# Retrieval

## Retrieve

`RetrievalRetrieveResponse beta().retrieval().retrieve(RetrievalRetrieveParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/retrieval/retrieve`

Retrieve relevant chunks via hybrid search (vector + full-text), with filtering on built-in or user-defined metadata.

### Parameters

- `RetrievalRetrieveParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `String indexId` ID of the index to retrieve against.
  - `String query` Natural-language query to retrieve relevant chunks.
  - `Optional customFilters` Filters on user-defined metadata fields.
    - `class FilterTypeUnionStrIntBoolFloat:`
      - `Operator operator`
        - `EQ("eq")`
        - `NE("ne")`
        - `GT("gt")`
        - `LT("lt")`
        - `GTE("gte")`
        - `LTE("lte")`
        - `IN("in")`
        - `NIN("nin")`
      - `Value value`
        - `String`
        - `boolean`
        - `double`
        - `List`
          - `String`
          - `boolean`
          - `double`
    - `List`
      - `Operator operator`
        - `EQ("eq")`
        - `NE("ne")`
        - `GT("gt")`
        - `LT("lt")`
        - `GTE("gte")`
        - `LTE("lte")`
        - `IN("in")`
        - `NIN("nin")`
      - `Value value`
        - `double`
        - `List`
  - `Optional fullTextPipelineWeight` Weight of the full-text search pipeline (0-1).
  - `Optional numCandidates` Number of candidates for approximate nearest neighbor search.
  - `Optional rerank` Reranking configuration applied after hybrid search. Enabled by default.
    - `Optional enabled` Set to false to disable reranking.
    - `Optional topN` Number of results to return after reranking.
  - `Optional scoreThreshold` Minimum score threshold for returned results.
  - `Optional staticFilters` Filters on built-in document fields (page range, chunk index, etc.).
    - `Optional parsedDirectoryFileId`
      - `Operator operator`
        - `EQ("eq")`
        - `NE("ne")`
        - `GT("gt")`
        - `LT("lt")`
        - `GTE("gte")`
        - `LTE("lte")`
        - `IN("in")`
        - `NIN("nin")`
      - `Value value`
        - `String`
        - `List`
  - `Optional topK` Maximum number of results to return.
  - `Optional vectorPipelineWeight` Weight of the vector search pipeline (0-1).

### Returns

- `class RetrievalRetrieveResponse:` Response containing retrieval results.
  - `List results` Ordered list of retrieved chunks.
    - `String content` Text content of the retrieved chunk.
    - `Optional metadata` User-defined metadata associated with the chunk.
      - `String`
      - `long`
      - `double`
      - `boolean`
      - `JsonValue;`
      - `List`
    - `Optional rerankScore` Relevance score from the reranker, if reranking was applied.
    - `Optional score` Hybrid search relevance score.
    - `Optional staticFields` Built-in fields stored for every exported chunk.
      - `Optional> attachments` Attachments associated with the chunk
        - `String attachmentName` Attachment-relative path, e.g. 'screenshots/page_7.jpg'.
        - `String sourceId` File ID to pass as source_id when fetching the attachment.
        - `String type` Attachment kind, e.g. 'screenshot', 'items'.
      - `Optional chunkEndChar` End character offset of the chunk.
      - `Optional chunkIndex` Index of the chunk within the file.
      - `Optional chunkStartChar` Start character offset of the chunk.
      - `Optional chunkTokenCount` Token count of the chunk.
      - `Optional pageRangeEnd` Last page number covered by this chunk.
      - `Optional pageRangeStart` First page number covered by this chunk.
      - `Optional parsedDirectoryFileId` ID of the parsed file.

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalRetrieveParams;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalRetrieveResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        RetrievalRetrieveParams params = RetrievalRetrieveParams.builder()
            .indexId("idx-abc123")
            .query("What are the key findings?")
            .build();
        RetrievalRetrieveResponse retrieval = client.beta().retrieval().retrieve(params);
    }
}
```

#### Response

```json
{
  "results": [
    {
      "content": "content",
      "metadata": {
        "foo": "string"
      },
      "rerank_score": 0,
      "score": 0,
      "static_fields": {
        "attachments": [
          {
            "attachment_name": "attachment_name",
            "source_id": "source_id",
            "type": "type"
          }
        ],
        "chunk_end_char": 0,
        "chunk_index": 0,
        "chunk_start_char": 0,
        "chunk_token_count": 0,
        "page_range_end": 0,
        "page_range_start": 0,
        "parsed_directory_file_id": "parsed_directory_file_id"
      }
    }
  ]
}
```

## Find Files

`RetrievalFindPage beta().retrieval().find(RetrievalFindParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/retrieval/files/find`

Search for files by name.

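
Find results are paginated: the request accepts a `pageToken`, and each response carries a `next_page_token`. The token loop can be sketched without the SDK as follows. `Page`, `PageFetcher`, `collectAll`, and `fakeFetch` are hypothetical names used only for illustration, with `PageFetcher` standing in for a call to `beta().retrieval().find(...)`. Treating a missing or empty token as "last page" is an assumption that this reference does not confirm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PageTokenLoop {
    /** Hypothetical stand-in for one page of results plus its continuation token. */
    static final class Page {
        final List<String> items;
        final String nextPageToken; // null or empty is assumed to mean "last page"

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    /** Hypothetical stand-in for the remote call; a real one would send the token as pageToken. */
    interface PageFetcher {
        Page fetch(String pageToken);
    }

    /** Follows continuation tokens until none is returned, collecting every item. */
    static List<String> collectAll(PageFetcher fetcher) {
        List<String> all = new ArrayList<>();
        String token = null; // the first request carries no token
        do {
            Page page = fetcher.fetch(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null && !token.isEmpty());
        return all;
    }

    /** In-memory fetcher used only to exercise the loop: two pages, then stop. */
    static Page fakeFetch(String token) {
        if (token == null) {
            return new Page(Arrays.asList("report.pdf", "notes.md"), "page-2");
        }
        return new Page(Arrays.asList("summary.docx"), null);
    }

    public static void main(String[] args) {
        // prints [report.pdf, notes.md, summary.docx]
        System.out.println(collectAll(PageTokenLoop::fakeFetch));
    }
}
```

The same loop shape applies to the other paginated endpoints in this reference that expose `pageToken`.
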
### Parameters

- `RetrievalFindParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `String indexId` ID of the index to search within.
  - `Optional fileName` Exact file name to match.
  - `Optional fileNameContains` Substring match on file name (case-insensitive).
  - `Optional pageSize` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.
  - `Optional pageToken` A page token, received from a previous list call. Provide this to retrieve the subsequent page.

### Returns

- `class RetrievalFindResponse:` A file returned by find.
  - `String fileId` ID of the file.
  - `String fileName` Display name of the file.

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalFindPage;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalFindParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        RetrievalFindParams params = RetrievalFindParams.builder()
            .indexId("idx-abc123")
            .build();
        RetrievalFindPage page = client.beta().retrieval().find(params);
    }
}
```

#### Response

```json
{
  "items": [
    {
      "file_id": "file_id",
      "file_name": "file_name"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Grep File

`RetrievalGrepPage beta().retrieval().grep(RetrievalGrepParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/retrieval/files/grep`

Grep within a file's parsed content using a regex pattern.

### Parameters

- `RetrievalGrepParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `String fileId` ID of the file to grep.
  - `String indexId` ID of the index the file belongs to.
  - `String pattern` Regex pattern to search for.
  - `Optional contextChars` Number of characters of context to include before and after the matched pattern in the content field of the response
  - `Optional pageSize` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum.
  - `Optional pageToken` A page token, received from a previous list call. Provide this to retrieve the subsequent page.

### Returns

- `class RetrievalGrepResponse:` A single grep match within a file.
  - `String content` Matched text content.
  - `long endChar` End character offset of the match.
  - `long startChar` Start character offset of the match.

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalGrepPage;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalGrepParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        RetrievalGrepParams params = RetrievalGrepParams.builder()
            .fileId("file_id")
            .indexId("idx-abc123")
            .pattern("revenue|profit")
            .build();
        RetrievalGrepPage page = client.beta().retrieval().grep(params);
    }
}
```

#### Response

```json
{
  "items": [
    {
      "content": "content",
      "end_char": 0,
      "start_char": 0
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Read File

`RetrievalReadResponse beta().retrieval().read(RetrievalReadParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/retrieval/files/read`

Read the parsed text content of a specific file.

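
A long file can be read in pieces by advancing `offset` in steps of `maxLength`. The sketch below models that loop locally rather than calling the service. `readWindow` and `readAll` are hypothetical helpers, and two behaviors are assumed rather than documented here: that a read returns at most `maxLength` characters starting at `offset`, and that reading past the end yields an empty string.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowedRead {
    /** Hypothetical stand-in for the read call: at most maxLength characters from offset. */
    static String readWindow(String fullText, int offset, int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        if (offset >= fullText.length()) {
            return ""; // assumed behavior when the offset is past the end
        }
        int end = Math.min(fullText.length(), offset + maxLength);
        return fullText.substring(offset, end);
    }

    /** Reads the whole text window by window, advancing by the characters actually received. */
    static List<String> readAll(String fullText, int maxLength) {
        List<String> windows = new ArrayList<>();
        int offset = 0;
        while (true) {
            String chunk = readWindow(fullText, offset, maxLength);
            if (chunk.isEmpty()) {
                break;
            }
            windows.add(chunk);
            offset += chunk.length();
        }
        return windows;
    }

    public static void main(String[] args) {
        // prints [abcd, efgh, ij]
        System.out.println(readAll("abcdefghij", 4));
    }
}
```

Advancing by the length of the returned chunk, rather than by `maxLength`, keeps the loop correct even if a read returns fewer characters than requested.
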
### Parameters

- `RetrievalReadParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `String fileId` ID of the file to read.
  - `String indexId` ID of the index the file belongs to.
  - `Optional maxLength` Maximum number of characters to read from the offset.
  - `Optional offset` Starting character offset.

### Returns

- `class RetrievalReadResponse:` File read result.
  - `String content` Parsed text content of the file.

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalReadParams;
import com.llamacloud_prod.api.models.beta.retrieval.RetrievalReadResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        RetrievalReadParams params = RetrievalReadParams.builder()
            .fileId("file_id")
            .indexId("idx-abc123")
            .build();
        RetrievalReadResponse response = client.beta().retrieval().read(params);
    }
}
```

#### Response

```json
{
  "content": "content"
}
```

# Chat

## List Sessions

`ChatListPage beta().chat().list(ChatListParams params = ChatListParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/chat`

List all chat sessions for the current project.

### Parameters

- `ChatListParams params`
  - `Optional organizationId`
  - `Optional pageSize`
  - `Optional pageToken`
  - `Optional projectId`

### Returns

- `class ChatListResponse:` Summary of a chat session, including its title and last run metadata.
  - `String lastUpdatedAt` ISO-format timestamp showing when the session was last updated.
  - `String sessionId` Unique session identifier.
  - `Optional generatedTitle` Auto-generated title derived from the first user message.
  - `Optional> indexIds` Indexes this session is bound to. Null on unbound sessions.
  - `Optional jobMetadata` Token usage and status from the most recent run. Null if the session has not been run yet.
    - `Optional durationMs`
    - `Optional error`
    - `Optional> exportConfigIds`
    - `Optional isError`
    - `Optional totalInputTokens`
    - `Optional totalOutputTokens`
    - `Optional turns`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatListPage;
import com.llamacloud_prod.api.models.beta.chat.ChatListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        ChatListPage page = client.beta().chat().list();
    }
}
```

#### Response

```json
{
  "items": [
    {
      "last_updated_at": "2026-04-22T12:34:41.342245",
      "session_id": "ses-abc123",
      "generated_title": "What were the main findings in Q3?...",
      "index_ids": [
        "idx-abc123",
        "idx-def456"
      ],
      "job_metadata": {
        "duration_ms": 0,
        "error": "error",
        "export_config_ids": [
          "string"
        ],
        "is_error": true,
        "total_input_tokens": 0,
        "total_output_tokens": 0,
        "turns": 0
      }
    }
  ],
  "next_page_token": "next_page_token"
}
```

## Create Session

`ChatCreateResponse beta().chat().create(ChatCreateParams params = ChatCreateParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/chat`

Create a chat session, optionally bound to indexes (locked after the first message).

### Parameters

- `ChatCreateParams params`
  - `Optional organizationId`
  - `Optional projectId`
  - `Optional> indexIds` Indexes this session will retrieve from. Once set and the first message has been sent, the source set is locked for the session's lifetime. Leave null to create an unbound session.

### Returns

- `class ChatCreateResponse:` Summary of a chat session, including its title and last run metadata.
  - `String lastUpdatedAt` ISO-format timestamp showing when the session was last updated.
  - `String sessionId` Unique session identifier.
  - `Optional generatedTitle` Auto-generated title derived from the first user message.
  - `Optional> indexIds` Indexes this session is bound to. Null on unbound sessions.
  - `Optional jobMetadata` Token usage and status from the most recent run. Null if the session has not been run yet.
    - `Optional durationMs`
    - `Optional error`
    - `Optional> exportConfigIds`
    - `Optional isError`
    - `Optional totalInputTokens`
    - `Optional totalOutputTokens`
    - `Optional turns`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatCreateParams;
import com.llamacloud_prod.api.models.beta.chat.ChatCreateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        ChatCreateResponse chat = client.beta().chat().create();
    }
}
```

#### Response

```json
{
  "last_updated_at": "2026-04-22T12:34:41.342245",
  "session_id": "ses-abc123",
  "generated_title": "What were the main findings in Q3?...",
  "index_ids": [
    "idx-abc123",
    "idx-def456"
  ],
  "job_metadata": {
    "duration_ms": 0,
    "error": "error",
    "export_config_ids": [
      "string"
    ],
    "is_error": true,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "turns": 0
  }
}
```

## Get Full Session

`ChatRetrieveResponse beta().chat().retrieve(ChatRetrieveParams params = ChatRetrieveParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/chat/{session_id}`

Retrieve a full session by ID, including its event history.

### Parameters

- `ChatRetrieveParams params`
  - `Optional sessionId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class ChatRetrieveResponse:` Full chat session including its complete event history.
  - `List events` Ordered list of events that make up the conversation history.
    - `class ThinkingDelta:`
      - `String content`
      - `Optional type`
        - `THINKING_DELTA("thinking_delta")`
    - `class TextDelta:`
      - `String content`
      - `Optional type`
        - `TEXT_DELTA("text_delta")`
    - `class Thinking:`
      - `String content`
      - `Optional type`
        - `THINKING("thinking")`
    - `class Text:`
      - `String content`
      - `Optional type`
        - `TEXT("text")`
    - `class ToolCall:`
      - `Arguments arguments`
      - `String callId`
      - `String name`
      - `Optional type`
        - `TOOL_CALL("tool_call")`
    - `class ToolResult:`
      - `String callId`
      - `String name`
      - `JsonValue result`
      - `Optional imageAttachment` Coordinates for lazily resolving a page screenshot presigned URL.
        - `String attachmentName`
        - `String sourceId`
      - `Optional type`
        - `TOOL_RESULT("tool_result")`
    - `class Stop:`
      - `Optional error`
      - `boolean isError`
      - `Usage usage`
        - `Optional durationMs`
        - `Optional totalInputTokens`
        - `Optional totalOutputTokens`
        - `Optional turns`
      - `Optional type`
        - `STOP("stop")`
    - `class UserInput:`
      - `String content`
      - `Optional type`
        - `USER_INPUT("user_input")`
  - `String lastUpdatedAt` ISO-format timestamp showing when the session was last updated.
  - `String sessionId` Unique session identifier.
  - `Optional generatedTitle` Auto-generated title derived from the first user message.
  - `Optional> indexIds` Indexes this session is bound to. Null on unbound sessions.
  - `Optional jobMetadata` Token usage and status from the most recent run. Null if the session has not been run yet.
    - `Optional durationMs`
    - `Optional error`
    - `Optional> exportConfigIds`
    - `Optional isError`
    - `Optional totalInputTokens`
    - `Optional totalOutputTokens`
    - `Optional turns`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatRetrieveParams;
import com.llamacloud_prod.api.models.beta.chat.ChatRetrieveResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        ChatRetrieveResponse chat = client.beta().chat().retrieve("session_id");
    }
}
```

#### Response

```json
{
  "events": [
    {
      "content": "content",
      "type": "thinking_delta"
    }
  ],
  "last_updated_at": "2026-04-22T12:34:41.342245",
  "session_id": "ses-abc123",
  "generated_title": "What were the main findings in Q3?...",
  "index_ids": [
    "idx-abc123",
    "idx-def456"
  ],
  "job_metadata": {
    "duration_ms": 0,
    "error": "error",
    "export_config_ids": [
      "string"
    ],
    "is_error": true,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "turns": 0
  }
}
```

## Delete Session

`beta().chat().delete(ChatDeleteParams params = ChatDeleteParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**delete** `/api/v1/chat/{session_id}`

Delete a session.

### Parameters

- `ChatDeleteParams params`
  - `Optional sessionId`
  - `Optional organizationId`
  - `Optional projectId`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatDeleteParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        client.beta().chat().delete("session_id");
    }
}
```

## Get Session Summary

`ChatGetSummaryResponse beta().chat().getSummary(ChatGetSummaryParams params = ChatGetSummaryParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/chat/{session_id}/summary`

Retrieve a session summary by ID.

### Parameters

- `ChatGetSummaryParams params`
  - `Optional sessionId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class ChatGetSummaryResponse:` Summary of a chat session, including its title and last run metadata.
  - `String lastUpdatedAt` ISO-format timestamp showing when the session was last updated.
  - `String sessionId` Unique session identifier.
  - `Optional generatedTitle` Auto-generated title derived from the first user message.
  - `Optional> indexIds` Indexes this session is bound to. Null on unbound sessions.
  - `Optional jobMetadata` Token usage and status from the most recent run. Null if the session has not been run yet.
    - `Optional durationMs`
    - `Optional error`
    - `Optional> exportConfigIds`
    - `Optional isError`
    - `Optional totalInputTokens`
    - `Optional totalOutputTokens`
    - `Optional turns`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatGetSummaryParams;
import com.llamacloud_prod.api.models.beta.chat.ChatGetSummaryResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        ChatGetSummaryResponse response = client.beta().chat().getSummary("session_id");
    }
}
```

#### Response

```json
{
  "last_updated_at": "2026-04-22T12:34:41.342245",
  "session_id": "ses-abc123",
  "generated_title": "What were the main findings in Q3?...",
  "index_ids": [
    "idx-abc123",
    "idx-def456"
  ],
  "job_metadata": {
    "duration_ms": 0,
    "error": "error",
    "export_config_ids": [
      "string"
    ],
    "is_error": true,
    "total_input_tokens": 0,
    "total_output_tokens": 0,
    "turns": 0
  }
}
```

## Stream Messages

`JsonValue beta().chat().stream(ChatStreamParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/chat/{session_id}/messages/stream`

Stream agent events for a chat turn as Server-Sent Events.

### Parameters

- `ChatStreamParams params`
  - `Optional sessionId`
  - `Optional organizationId`
  - `Optional projectId`
  - `List indexIds` Indexes to retrieve data from.
  - `String prompt` User message for this chat turn.

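
Because this endpoint responds with Server-Sent Events, a client reads blocks of `field: value` lines, each block ended by a blank line. The sketch below parses that generic framing from any `Reader` and collects each event's `data` payload. It covers only the standard SSE format: `SseDataParser` is a hypothetical helper, the sample payloads are illustrative rather than taken from the service, and how the SDK itself exposes the stream is not described in this reference.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SseDataParser {
    /** Returns the data payload of every complete event read from the source. */
    static List<String> parse(Reader source) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    // A blank line ends the current event.
                    if (hasData) {
                        events.add(data.toString());
                    }
                    data.setLength(0);
                    hasData = false;
                } else if (line.startsWith(":")) {
                    // Comment line, often used as a keep-alive; ignored.
                } else if (line.startsWith("data:")) {
                    String value = line.substring("data:".length());
                    if (value.startsWith(" ")) {
                        value = value.substring(1); // one optional space after the colon
                    }
                    if (hasData) {
                        data.append('\n'); // multiple data lines join with newlines
                    }
                    data.append(value);
                    hasData = true;
                }
                // Other fields (event, id, retry) are skipped in this sketch.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // An event not terminated by a blank line before end of stream is dropped.
        return events;
    }

    public static void main(String[] args) {
        // Illustrative payloads only; the real event JSON is not specified here.
        String stream = ": keep-alive\n"
                + "data: {\"type\":\"text_delta\",\"content\":\"Hel\"}\n\n"
                + "data: {\"type\":\"text_delta\",\"content\":\"lo\"}\n\n"
                + "data: {\"type\":\"stop\"}\n\n";
        for (String payload : parse(new StringReader(stream))) {
            System.out.println(payload);
        }
    }
}
```

Each collected payload would then be decoded as JSON by whatever library the application already uses.
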
### Returns

- `class ChatStreamResponse:`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.chat.ChatStreamParams;
import com.llamacloud_prod.api.models.beta.chat.ChatStreamResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        ChatStreamParams params = ChatStreamParams.builder()
            .sessionId("session_id")
            .addIndexId("idx-abc123")
            .addIndexId("idx-def456")
            .prompt("What were the main findings in Q3?")
            .build();
        ChatStreamResponse response = client.beta().chat().stream(params);
    }
}
```

#### Response

```json
{}
```

# Agent Data

## Get Agent Data

`AgentData beta().agentData().get(AgentDataGetParams params = AgentDataGetParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/agent-data/{item_id}`

Get agent data by ID.

### Parameters

- `AgentDataGetParams params`
  - `Optional itemId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class AgentData:` API Result for a single agent data item
  - `Data data`
  - `String deploymentName`
  - `Optional id`
  - `Optional collection`
  - `Optional createdAt`
  - `Optional projectId`
  - `Optional updatedAt`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.agentdata.AgentData;
import com.llamacloud_prod.api.models.beta.agentdata.AgentDataGetParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        AgentData agentData = client.beta().agentData().get("item_id");
    }
}
```

#### Response

```json
{
  "data": {
    "foo": "bar"
  },
  "deployment_name": "deployment_name",
  "id": "id",
  "collection": "collection",
  "created_at": "2019-12-27T18:11:19.117Z",
  "project_id": "project_id",
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Update Agent Data

`AgentData beta().agentData().update(AgentDataUpdateParams params, RequestOptions requestOptions = RequestOptions.none())`

**put** `/api/v1/beta/agent-data/{item_id}`

Update agent data by ID (overwrites).

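
Since the update overwrites the stored item, changing one field likely means sending the complete `data` object again. The sketch below models that read-modify-write pattern with plain maps in place of the SDK types. `overwrite` and `withField` are hypothetical helpers, and the assumption that the whole `data` object is replaced rather than merged is a reading of "(overwrites)", not something this reference spells out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OverwriteUpdate {
    /** Models a PUT that replaces the stored data wholesale (assumed semantics). */
    static Map<String, Object> overwrite(Map<String, Object> stored, Map<String, Object> incoming) {
        return new LinkedHashMap<>(incoming); // nothing from `stored` survives
    }

    /** Copies the current data and changes one field, producing a full replacement body. */
    static Map<String, Object> withField(Map<String, Object> current, String key, Object value) {
        Map<String, Object> next = new LinkedHashMap<>(current);
        next.put(key, value);
        return next;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("status", "pending");
        stored.put("score", 0.4);

        // Sending only the changed field would drop "score".
        Map<String, Object> partial = new LinkedHashMap<>();
        partial.put("status", "done");
        System.out.println(overwrite(stored, partial)); // prints {status=done}

        // Sending the full, modified object keeps it.
        Map<String, Object> full = withField(stored, "status", "done");
        System.out.println(overwrite(stored, full)); // prints {status=done, score=0.4}
    }
}
```

In SDK terms, this corresponds to fetching the item with `get`, adjusting its data, and passing the whole object to `update`.
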
### Parameters

- `AgentDataUpdateParams params`
  - `Optional itemId`
  - `Optional organizationId`
  - `Optional projectId`
  - `Data data`

### Returns

- `class AgentData:` API Result for a single agent data item
  - `Data data`
  - `String deploymentName`
  - `Optional id`
  - `Optional collection`
  - `Optional createdAt`
  - `Optional projectId`
  - `Optional updatedAt`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.core.JsonValue;
import com.llamacloud_prod.api.models.beta.agentdata.AgentData;
import com.llamacloud_prod.api.models.beta.agentdata.AgentDataUpdateParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        AgentDataUpdateParams params = AgentDataUpdateParams.builder()
            .itemId("item_id")
            .data(AgentDataUpdateParams.Data.builder()
                .putAdditionalProperty("foo", JsonValue.from("bar"))
                .build())
            .build();
        AgentData agentData = client.beta().agentData().update(params);
    }
}
```

#### Response

```json
{
  "data": {
    "foo": "bar"
  },
  "deployment_name": "deployment_name",
  "id": "id",
  "collection": "collection",
  "created_at": "2019-12-27T18:11:19.117Z",
  "project_id": "project_id",
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Delete Agent Data

`AgentDataDeleteResponse beta().agentData().delete(AgentDataDeleteParams params = AgentDataDeleteParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**delete** `/api/v1/beta/agent-data/{item_id}`

Delete agent data by ID.

### Parameters - `AgentDataDeleteParams params` - `Optional itemId` - `Optional organizationId` - `Optional projectId` ### Returns - `class AgentDataDeleteResponse:` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataDeleteParams; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataDeleteResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); AgentDataDeleteResponse agentData = client.beta().agentData().delete("item_id"); } } ``` #### Response ```json { "foo": "string" } ``` ## Create Agent Data `AgentData beta().agentData().create(AgentDataCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/agent-data` Create new agent data. ### Parameters - `AgentDataCreateParams params` - `Optional organizationId` - `Optional projectId` - `Data data` - `String deploymentName` - `Optional collection` ### Returns - `class AgentData:` API Result for a single agent data item - `Data data` - `String deploymentName` - `Optional id` - `Optional collection` - `Optional createdAt` - `Optional projectId` - `Optional updatedAt` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.core.JsonValue; import com.llamacloud_prod.api.models.beta.agentdata.AgentData; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataCreateParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); AgentDataCreateParams params = AgentDataCreateParams.builder() .data(AgentDataCreateParams.Data.builder() 
.putAdditionalProperty("foo", JsonValue.from("bar")) .build()) .deploymentName("deployment_name") .build(); AgentData agentData = client.beta().agentData().create(params); } } ``` #### Response ```json { "data": { "foo": "bar" }, "deployment_name": "deployment_name", "id": "id", "collection": "collection", "created_at": "2019-12-27T18:11:19.117Z", "project_id": "project_id", "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## Search Agent Data `AgentDataSearchPage beta().agentData().search(AgentDataSearchParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/agent-data/:search` Search agent data with filtering, sorting, and pagination. ### Parameters - `AgentDataSearchParams params` - `Optional organizationId` - `Optional projectId` - `String deploymentName` The agent deployment's name to search within - `Optional collection` The logical agent data collection to search within - `Optional filter` A filter object or expression that filters resources listed in the response. - `Optional eq` - `double` - `String` - `LocalDateTime` - `Optional> excludes` - `double` - `String` - `LocalDateTime` - `Optional gt` - `double` - `String` - `LocalDateTime` - `Optional gte` - `double` - `String` - `LocalDateTime` - `Optional> includes` - `double` - `String` - `LocalDateTime` - `Optional lt` - `double` - `String` - `LocalDateTime` - `Optional lte` - `double` - `String` - `LocalDateTime` - `Optional ne` - `double` - `String` - `LocalDateTime` - `Optional includeTotal` Whether to include the total number of items in the response - `Optional offset` The offset to start from. If not provided, the first page is returned - `Optional orderBy` A comma-separated list of fields to order by, sorted in ascending order. Use 'field_name desc' to specify descending order. - `Optional pageSize` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. 
The maximum value is typically 1000; values above this will be coerced to the maximum. - `Optional pageToken` A page token, received from a previous list call. Provide this to retrieve the subsequent page. ### Returns - `class AgentData:` API Result for a single agent data item - `Data data` - `String deploymentName` - `Optional id` - `Optional collection` - `Optional createdAt` - `Optional projectId` - `Optional updatedAt` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataSearchPage; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataSearchParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); AgentDataSearchParams params = AgentDataSearchParams.builder() .deploymentName("deployment_name") .build(); AgentDataSearchPage page = client.beta().agentData().search(params); } } ``` #### Response ```json { "items": [ { "data": { "foo": "bar" }, "deployment_name": "deployment_name", "id": "id", "collection": "collection", "created_at": "2019-12-27T18:11:19.117Z", "project_id": "project_id", "updated_at": "2019-12-27T18:11:19.117Z" } ], "next_page_token": "next_page_token", "total_size": 0 } ``` ## Aggregate Agent Data `AgentDataAggregatePage beta().agentData().aggregate(AgentDataAggregateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/agent-data/:aggregate` Aggregate agent data with grouping and optional counting/first item retrieval. 
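To make grouping, counting, and first-item retrieval concrete, here is a small local simulation that produces rows shaped like the documented aggregate response (`group_key`, `count`, `first_item`). It is only a sketch of the semantics as described, not the service's implementation: `AggregateSketch` and its helpers are hypothetical, items are modelled as flat maps, and "first" is taken to mean the earliest `created_at`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AggregateSketch {
    private AggregateSketch() {}

    // Builds a flat demo item; real items carry more fields.
    static Map<String, Object> item(String status, String createdAt) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("status", status);
        m.put("created_at", createdAt);
        return m;
    }

    // Groups items by the given fields and returns one row per group.
    // An empty groupBy list puts every item into a single group.
    static List<Map<String, Object>> aggregate(List<Map<String, Object>> items, List<String> groupBy) {
        Map<Map<String, Object>, List<Map<String, Object>>> groups = new LinkedHashMap<>();
        for (Map<String, Object> item : items) {
            Map<String, Object> key = new LinkedHashMap<>();
            for (String field : groupBy) {
                key.put(field, item.get(field));
            }
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(item);
        }

        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map.Entry<Map<String, Object>, List<Map<String, Object>>> entry : groups.entrySet()) {
            // ISO-8601 timestamps in the same zone sort correctly as strings.
            Map<String, Object> first = entry.getValue().stream()
                .min(Comparator.comparing((Map<String, Object> m) -> (String) m.get("created_at")))
                .orElseThrow();
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("group_key", entry.getKey());
            row.put("count", entry.getValue().size());
            row.put("first_item", first);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> items = List.of(
            item("open", "2024-01-02T00:00:00Z"),
            item("open", "2024-01-01T00:00:00Z"),
            item("closed", "2024-01-03T00:00:00Z"));

        for (Map<String, Object> row : aggregate(items, List.of("status"))) {
            System.out.println(row.get("group_key") + " -> " + row.get("count"));
        }
    }
}
```

Passing an empty `groupBy` list yields a single row whose count is the size of the whole dataset, which matches the "simple count" use mentioned for that parameter.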
### Parameters - `AgentDataAggregateParams params` - `Optional organizationId` - `Optional projectId` - `String deploymentName` The agent deployment's name to aggregate data for - `Optional collection` The logical agent data collection to aggregate data for - `Optional count` Whether to count the number of items in each group - `Optional filter` A filter object or expression that filters resources listed in the response. - `Optional eq` - `double` - `String` - `LocalDateTime` - `Optional> excludes` - `double` - `String` - `LocalDateTime` - `Optional gt` - `double` - `String` - `LocalDateTime` - `Optional gte` - `double` - `String` - `LocalDateTime` - `Optional> includes` - `double` - `String` - `LocalDateTime` - `Optional lt` - `double` - `String` - `LocalDateTime` - `Optional lte` - `double` - `String` - `LocalDateTime` - `Optional ne` - `double` - `String` - `LocalDateTime` - `Optional first` Whether to return the first item in each group (Sorted by created_at) - `Optional> groupBy` The fields to group by. If empty, the entire dataset is grouped on. e.g. if left out, can be used for simple count operations - `Optional offset` The offset to start from. If not provided, the first page is returned - `Optional orderBy` A comma-separated list of fields to order by, sorted in ascending order. Use 'field_name desc' to specify descending order. - `Optional pageSize` The maximum number of items to return. The service may return fewer than this value. If unspecified, a default page size will be used. The maximum value is typically 1000; values above this will be coerced to the maximum. - `Optional pageToken` A page token, received from a previous list call. Provide this to retrieve the subsequent page. 
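The `pageToken` parameter and the `next_page_token` response field together form a loop: request a page, read its token, and pass that token to the next request until none is returned. The sketch below shows that loop against a fake page source so it runs without network access. `PaginationSketch`, `Page`, and `fetchAll` are hypothetical stand-ins rather than SDK types, and treating a null or empty token as "no more pages" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class PaginationSketch {
    private PaginationSketch() {}

    // Hypothetical stand-in for one response page: items plus next_page_token.
    static final class Page {
        final List<String> items;
        final String nextPageToken; // null when no further page exists (assumed)

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Collects every item by feeding each next_page_token back in as the page token.
    static List<String> fetchAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // no token requests the first page
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null && !token.isEmpty());
        return all;
    }

    public static void main(String[] args) {
        // Fake two-page source standing in for a real search or aggregate call.
        Function<String, Page> fake = token -> {
            if (token == null) {
                return new Page(List.of("a", "b"), "t1");
            }
            if (token.equals("t1")) {
                return new Page(List.of("c"), null);
            }
            throw new IllegalArgumentException("unknown token: " + token);
        };
        System.out.println(fetchAll(fake)); // prints [a, b, c]
    }
}
```

With the real client, the function passed to `fetchAll` would wrap a call such as `aggregate(params)` with the token set on the params builder.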
### Returns

- `class AgentDataAggregateResponse:` API Result for a single group in the aggregate response
  - `GroupKey groupKey`
  - `Optional count`
  - `Optional firstItem`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.agentdata.AgentDataAggregatePage;
import com.llamacloud_prod.api.models.beta.agentdata.AgentDataAggregateParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        AgentDataAggregateParams params = AgentDataAggregateParams.builder()
            .deploymentName("deployment_name")
            .build();
        AgentDataAggregatePage page = client.beta().agentData().aggregate(params);
    }
}
```

#### Response

```json
{
  "items": [
    {
      "group_key": {
        "foo": "bar"
      },
      "count": 0,
      "first_item": {
        "foo": "bar"
      }
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Delete Agent Data By Query

`AgentDataDeleteByQueryResponse beta().agentData().deleteByQuery(AgentDataDeleteByQueryParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/beta/agent-data/:delete`

Bulk delete agent data by query (deployment_name, collection, optional filters).
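Since a bulk delete cannot be previewed from its response alone, it can help to reason about what a filter would select before sending it. The sketch below evaluates a single filter operator against a value locally. The operator names come from this reference, but their semantics here are assumptions: comparisons for `eq`, `ne`, `gt`, `gte`, `lt`, `lte`, membership in a list for `includes`, and non-membership for `excludes`. `FilterSketch` and `matches` are hypothetical and not part of the SDK.

```java
import java.util.List;

final class FilterSketch {
    private FilterSketch() {}

    // Returns true when `value` satisfies operator `op` for the given operands.
    // Comparison operators read only the first operand; includes/excludes read the whole list.
    static <T extends Comparable<? super T>> boolean matches(T value, String op, List<T> operands) {
        switch (op) {
            case "includes":
                return operands.contains(value);
            case "excludes":
                return !operands.contains(value);
            default:
                break;
        }

        int cmp = value.compareTo(operands.get(0));
        switch (op) {
            case "eq":
                return cmp == 0;
            case "ne":
                return cmp != 0;
            case "gt":
                return cmp > 0;
            case "gte":
                return cmp >= 0;
            case "lt":
                return cmp < 0;
            case "lte":
                return cmp <= 0;
            default:
                throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(matches(5.0, "gte", List.of(5.0)));                      // prints true
        System.out.println(matches(5.0, "gt", List.of(5.0)));                       // prints false
        System.out.println(matches("draft", "includes", List.of("draft", "done"))); // prints true
    }
}
```

A more reliable check against live data is to run `search` with the same `deploymentName`, `collection`, and `filter`, inspect the matches, and only then call `deleteByQuery`.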
### Parameters - `AgentDataDeleteByQueryParams params` - `Optional organizationId` - `Optional projectId` - `String deploymentName` The agent deployment's name to delete data for - `Optional collection` The logical agent data collection to delete from - `Optional filter` Optional filters to select which items to delete - `Optional eq` - `double` - `String` - `LocalDateTime` - `Optional> excludes` - `double` - `String` - `LocalDateTime` - `Optional gt` - `double` - `String` - `LocalDateTime` - `Optional gte` - `double` - `String` - `LocalDateTime` - `Optional> includes` - `double` - `String` - `LocalDateTime` - `Optional lt` - `double` - `String` - `LocalDateTime` - `Optional lte` - `double` - `String` - `LocalDateTime` - `Optional ne` - `double` - `String` - `LocalDateTime` ### Returns - `class AgentDataDeleteByQueryResponse:` API response for bulk delete operation - `long deletedCount` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataDeleteByQueryParams; import com.llamacloud_prod.api.models.beta.agentdata.AgentDataDeleteByQueryResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); AgentDataDeleteByQueryParams params = AgentDataDeleteByQueryParams.builder() .deploymentName("deployment_name") .build(); AgentDataDeleteByQueryResponse response = client.beta().agentData().deleteByQuery(params); } } ``` #### Response ```json { "deleted_count": 0 } ``` ## Domain Types ### Agent Data - `class AgentData:` API Result for a single agent data item - `Data data` - `String deploymentName` - `Optional id` - `Optional collection` - `Optional createdAt` - `Optional projectId` - `Optional updatedAt` # Sheets ## Create Spreadsheet Job `SheetsJob 
beta().sheets().create(SheetCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/sheets/jobs` Create a spreadsheet parsing job. Provide at most one of `configuration` (an inline parsing configuration) or `configuration_id` (a saved configuration preset). If neither is provided, a default configuration is used. Optionally include `webhook_configurations` to receive `sheets.*` status notifications. Experimental: not production-ready and subject to change. ### Parameters - `SheetCreateParams params` - `Optional organizationId` - `Optional projectId` - `String fileId` The ID of the file to parse - `Optional config` Configuration for spreadsheet parsing and region extraction - `Optional configuration` Configuration for spreadsheet parsing and region extraction - `Optional configurationId` Saved configuration ID - `Optional> webhookConfigurations` Outbound webhook endpoints to notify on job status changes - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. 
- `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications ### Returns - `class SheetsJob:` A spreadsheet parsing job. - `String id` The ID of the job - `SheetsParsingConfig configuration` Configuration applied to the parsing job (inline or resolved from a saved preset). - `Optional extractionRange` A1 notation of the range to extract a single region from. If None, the entire sheet is used. - `Optional flattenHierarchicalTables` Return a flattened dataframe when a detected table is recognized as hierarchical. - `Optional generateAdditionalMetadata` Whether to generate additional metadata (title, description) for each extracted region. - `Optional includeHiddenCells` Whether to include hidden cells when extracting regions from the spreadsheet. - `Optional> sheetNames` The names of the sheets to extract regions from. 
If empty, all sheets will be processed. - `Optional specialization` Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline. - `Optional tableMergeSensitivity` Influences how likely similar-looking regions are merged into a single table. Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging). - `STRONG("strong")` - `WEAK("weak")` - `Optional useExperimentalProcessing` Enables experimental processing. Accuracy may be impacted. - `String createdAt` When the job was created - `Optional fileId` The ID of the input file - `String projectId` The ID of the project - `Status status` The status of the parsing job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String updatedAt` When the job was last updated - `String userId` The ID of the user - `Optional config` Configuration for spreadsheet parsing and region extraction - `Optional configurationId` The saved product configuration ID used at create time, if any. - `Optional> errors` Any errors encountered - `Optional file` Schema for a file. - `String id` Unique identifier - `String name` - `String projectId` The ID of the project that the file belongs to - `Optional createdAt` Creation datetime - `Optional dataSourceId` The ID of the data source that the file belongs to - `Optional expiresAt` The expiration date for the file. Files past this date can be deleted. - `Optional externalFileId` The ID of the file in the external system - `Optional fileSize` Size of the file in bytes - `Optional fileType` File type (e.g. pdf, docx, etc.) 
- `Optional lastModifiedAt` The last modified time of the file - `Optional permissionInfo` Permission information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional purpose` The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify') - `Optional resourceInfo` Resource information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional updatedAt` Update datetime - `Optional metadataStateTransitions` Per-status entry timestamps. Returned only when requested via `?expand=metadata_state_transitions`. - `Optional parameters` Job-time parameters such as webhook configurations. - `Optional> webhookConfigurations` Webhook configurations for job status notifications. - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. - `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. 
auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional> regions` All extracted regions (populated when job is complete) - `String location` Location of the region in the spreadsheet - `String regionType` Type of the extracted region - `String sheetName` Worksheet name where region was found - `Optional description` Generated description for the region - `Optional regionId` Unique identifier for this region within the file - `Optional title` Generated title for the region - `Optional success` Whether the job completed successfully - `Optional> worksheetMetadata` Metadata for each processed worksheet (populated when job is complete) - `String sheetName` Name of the worksheet - `Optional description` Generated description of the worksheet - `Optional title` Generated title for the worksheet ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.sheets.SheetCreateParams; import com.llamacloud_prod.api.models.beta.sheets.SheetsJob; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); SheetCreateParams params = SheetCreateParams.builder() .fileId("182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e") .build(); SheetsJob sheetsJob = client.beta().sheets().create(params); } } ``` #### Response ```json { "id": "id", "configuration": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "created_at": "created_at", "file_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", 
"project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "status": "PENDING", "updated_at": "updated_at", "user_id": "user_id", "config": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "configuration_id": "configuration_id", "errors": [ "string" ], "file": { "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "name": "x", "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "created_at": "2019-12-27T18:11:19.117Z", "data_source_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "expires_at": "2019-12-27T18:11:19.117Z", "external_file_id": "external_file_id", "file_size": 0, "file_type": "x", "last_modified_at": "2019-12-27T18:11:19.117Z", "permission_info": { "foo": { "foo": "bar" } }, "purpose": "purpose", "resource_info": { "foo": { "foo": "bar" } }, "updated_at": "2019-12-27T18:11:19.117Z" }, "metadata_state_transitions": { "foo": "bar" }, "parameters": { "webhook_configurations": [ { "webhook_events": [ "parse.success", "parse.error" ], "webhook_headers": { "Authorization": "Bearer sk-..." }, "webhook_output_format": "json", "webhook_url": "https://example.com/webhooks/llamacloud" } ] }, "regions": [ { "location": "location", "region_type": "region_type", "sheet_name": "sheet_name", "description": "description", "region_id": "region_id", "title": "title" } ], "success": true, "worksheet_metadata": [ { "sheet_name": "sheet_name", "description": "description", "title": "title" } ] } ``` ## List Spreadsheet Jobs `SheetListPage beta().sheets().list(SheetListParamsparams = SheetListParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/sheets/jobs` List spreadsheet parsing jobs. Experimental: not production-ready and subject to change. 
### Parameters - `SheetListParams params` - `Optional configurationId` Filter by saved configuration ID - `Optional createdAtOnOrAfter` Include items created at or after this timestamp (inclusive) - `Optional createdAtOnOrBefore` Include items created at or before this timestamp (inclusive) - `Optional includeResults` - `Optional> jobIds` Filter by specific job IDs - `Optional organizationId` - `Optional pageSize` - `Optional pageToken` - `Optional projectId` - `Optional status` Filter by job status - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` ### Returns - `class SheetsJob:` A spreadsheet parsing job. - `String id` The ID of the job - `SheetsParsingConfig configuration` Configuration applied to the parsing job (inline or resolved from a saved preset). - `Optional extractionRange` A1 notation of the range to extract a single region from. If None, the entire sheet is used. - `Optional flattenHierarchicalTables` Return a flattened dataframe when a detected table is recognized as hierarchical. - `Optional generateAdditionalMetadata` Whether to generate additional metadata (title, description) for each extracted region. - `Optional includeHiddenCells` Whether to include hidden cells when extracting regions from the spreadsheet. - `Optional> sheetNames` The names of the sheets to extract regions from. If empty, all sheets will be processed. - `Optional specialization` Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline. - `Optional tableMergeSensitivity` Influences how likely similar-looking regions are merged into a single table. Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging). 
- `STRONG("strong")` - `WEAK("weak")` - `Optional useExperimentalProcessing` Enables experimental processing. Accuracy may be impacted. - `String createdAt` When the job was created - `Optional fileId` The ID of the input file - `String projectId` The ID of the project - `Status status` The status of the parsing job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String updatedAt` When the job was last updated - `String userId` The ID of the user - `Optional config` Configuration for spreadsheet parsing and region extraction - `Optional configurationId` The saved product configuration ID used at create time, if any. - `Optional> errors` Any errors encountered - `Optional file` Schema for a file. - `String id` Unique identifier - `String name` - `String projectId` The ID of the project that the file belongs to - `Optional createdAt` Creation datetime - `Optional dataSourceId` The ID of the data source that the file belongs to - `Optional expiresAt` The expiration date for the file. Files past this date can be deleted. - `Optional externalFileId` The ID of the file in the external system - `Optional fileSize` Size of the file in bytes - `Optional fileType` File type (e.g. pdf, docx, etc.) - `Optional lastModifiedAt` The last modified time of the file - `Optional permissionInfo` Permission information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional purpose` The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify') - `Optional resourceInfo` Resource information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional updatedAt` Update datetime - `Optional metadataStateTransitions` Per-status entry timestamps. Returned only when requested via `?expand=metadata_state_transitions`. - `Optional parameters` Job-time parameters such as webhook configurations. 
- `Optional> webhookConfigurations` Webhook configurations for job status notifications. - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. - `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. 
auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional> regions` All extracted regions (populated when job is complete) - `String location` Location of the region in the spreadsheet - `String regionType` Type of the extracted region - `String sheetName` Worksheet name where region was found - `Optional description` Generated description for the region - `Optional regionId` Unique identifier for this region within the file - `Optional title` Generated title for the region - `Optional success` Whether the job completed successfully - `Optional> worksheetMetadata` Metadata for each processed worksheet (populated when job is complete) - `String sheetName` Name of the worksheet - `Optional description` Generated description of the worksheet - `Optional title` Generated title for the worksheet ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.sheets.SheetListPage; import com.llamacloud_prod.api.models.beta.sheets.SheetListParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); SheetListPage page = client.beta().sheets().list(); } } ``` #### Response ```json { "items": [ { "id": "id", "configuration": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "created_at": "created_at", "file_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "status": "PENDING", "updated_at": "updated_at", 
"user_id": "user_id", "config": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "configuration_id": "configuration_id", "errors": [ "string" ], "file": { "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "name": "x", "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "created_at": "2019-12-27T18:11:19.117Z", "data_source_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "expires_at": "2019-12-27T18:11:19.117Z", "external_file_id": "external_file_id", "file_size": 0, "file_type": "x", "last_modified_at": "2019-12-27T18:11:19.117Z", "permission_info": { "foo": { "foo": "bar" } }, "purpose": "purpose", "resource_info": { "foo": { "foo": "bar" } }, "updated_at": "2019-12-27T18:11:19.117Z" }, "metadata_state_transitions": { "foo": "bar" }, "parameters": { "webhook_configurations": [ { "webhook_events": [ "parse.success", "parse.error" ], "webhook_headers": { "Authorization": "Bearer sk-..." }, "webhook_output_format": "json", "webhook_url": "https://example.com/webhooks/llamacloud" } ] }, "regions": [ { "location": "location", "region_type": "region_type", "sheet_name": "sheet_name", "description": "description", "region_id": "region_id", "title": "title" } ], "success": true, "worksheet_metadata": [ { "sheet_name": "sheet_name", "description": "description", "title": "title" } ] } ], "next_page_token": "next_page_token", "total_size": 0 } ``` ## Get Spreadsheet Job `SheetsJob beta().sheets().get(SheetGetParamsparams = SheetGetParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/sheets/jobs/{spreadsheet_job_id}` Get a spreadsheet parsing job. When `include_results=True` (default), embeds extracted regions and results if complete, skipping the separate `/results` call. 
Experimental: not production-ready and subject to change. ### Parameters - `SheetGetParams params` - `Optional spreadsheetJobId` - `Optional> expand` Optional fields to populate on the response. Valid values: metadata_state_transitions. - `Optional includeResults` - `Optional organizationId` - `Optional projectId` ### Returns - `class SheetsJob:` A spreadsheet parsing job. - `String id` The ID of the job - `SheetsParsingConfig configuration` Configuration applied to the parsing job (inline or resolved from a saved preset). - `Optional extractionRange` A1 notation of the range to extract a single region from. If None, the entire sheet is used. - `Optional flattenHierarchicalTables` Return a flattened dataframe when a detected table is recognized as hierarchical. - `Optional generateAdditionalMetadata` Whether to generate additional metadata (title, description) for each extracted region. - `Optional includeHiddenCells` Whether to include hidden cells when extracting regions from the spreadsheet. - `Optional> sheetNames` The names of the sheets to extract regions from. If empty, all sheets will be processed. - `Optional specialization` Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline. - `Optional tableMergeSensitivity` Influences how likely similar-looking regions are merged into a single table. Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging). - `STRONG("strong")` - `WEAK("weak")` - `Optional useExperimentalProcessing` Enables experimental processing. Accuracy may be impacted. 
- `String createdAt` When the job was created - `Optional fileId` The ID of the input file - `String projectId` The ID of the project - `Status status` The status of the parsing job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String updatedAt` When the job was last updated - `String userId` The ID of the user - `Optional config` Configuration for spreadsheet parsing and region extraction - `Optional configurationId` The saved product configuration ID used at create time, if any. - `Optional> errors` Any errors encountered - `Optional file` Schema for a file. - `String id` Unique identifier - `String name` - `String projectId` The ID of the project that the file belongs to - `Optional createdAt` Creation datetime - `Optional dataSourceId` The ID of the data source that the file belongs to - `Optional expiresAt` The expiration date for the file. Files past this date can be deleted. - `Optional externalFileId` The ID of the file in the external system - `Optional fileSize` Size of the file in bytes - `Optional fileType` File type (e.g. pdf, docx, etc.) - `Optional lastModifiedAt` The last modified time of the file - `Optional permissionInfo` Permission information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional purpose` The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify') - `Optional resourceInfo` Resource information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional updatedAt` Update datetime - `Optional metadataStateTransitions` Per-status entry timestamps. Returned only when requested via `?expand=metadata_state_transitions`. - `Optional parameters` Job-time parameters such as webhook configurations. - `Optional> webhookConfigurations` Webhook configurations for job status notifications. - `Optional> webhookEvents` Events to subscribe to (e.g. 
'parse.success', 'extract.error'). If null, all events are delivered. - `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. 
auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional> regions` All extracted regions (populated when job is complete) - `String location` Location of the region in the spreadsheet - `String regionType` Type of the extracted region - `String sheetName` Worksheet name where region was found - `Optional description` Generated description for the region - `Optional regionId` Unique identifier for this region within the file - `Optional title` Generated title for the region - `Optional success` Whether the job completed successfully - `Optional> worksheetMetadata` Metadata for each processed worksheet (populated when job is complete) - `String sheetName` Name of the worksheet - `Optional description` Generated description of the worksheet - `Optional title` Generated title for the worksheet ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.sheets.SheetGetParams; import com.llamacloud_prod.api.models.beta.sheets.SheetsJob; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); SheetsJob sheetsJob = client.beta().sheets().get("spreadsheet_job_id"); } } ``` #### Response ```json { "id": "id", "configuration": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "created_at": "created_at", "file_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "status": "PENDING", "updated_at": "updated_at", 
"user_id": "user_id", "config": { "extraction_range": "extraction_range", "flatten_hierarchical_tables": true, "generate_additional_metadata": true, "include_hidden_cells": true, "sheet_names": [ "string" ], "specialization": "specialization", "table_merge_sensitivity": "strong", "use_experimental_processing": true }, "configuration_id": "configuration_id", "errors": [ "string" ], "file": { "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "name": "x", "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "created_at": "2019-12-27T18:11:19.117Z", "data_source_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "expires_at": "2019-12-27T18:11:19.117Z", "external_file_id": "external_file_id", "file_size": 0, "file_type": "x", "last_modified_at": "2019-12-27T18:11:19.117Z", "permission_info": { "foo": { "foo": "bar" } }, "purpose": "purpose", "resource_info": { "foo": { "foo": "bar" } }, "updated_at": "2019-12-27T18:11:19.117Z" }, "metadata_state_transitions": { "foo": "bar" }, "parameters": { "webhook_configurations": [ { "webhook_events": [ "parse.success", "parse.error" ], "webhook_headers": { "Authorization": "Bearer sk-..." }, "webhook_output_format": "json", "webhook_url": "https://example.com/webhooks/llamacloud" } ] }, "regions": [ { "location": "location", "region_type": "region_type", "sheet_name": "sheet_name", "description": "description", "region_id": "region_id", "title": "title" } ], "success": true, "worksheet_metadata": [ { "sheet_name": "sheet_name", "description": "description", "title": "title" } ] } ``` ## Get Result Region `PresignedUrl beta().sheets().getResultTable(SheetGetResultTableParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/sheets/jobs/{spreadsheet_job_id}/regions/{region_id}/result/{region_type}` Generate a presigned URL to download a specific extracted region. Experimental: not production-ready and subject to change. 
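The presigned URL returned by this endpoint is time-limited, so a caller may want to check its expiry before fetching it. The following is a minimal sketch, not part of the SDK: `PresignedDownload`, `isUsable`, and `download` are hypothetical names, the 30-second safety margin is an arbitrary choice, and it assumes the URL can be fetched with a plain unauthenticated GET. It uses only the JDK's `java.net.http` client and takes the `url` and `expiresAt` values from the response as plain arguments.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.time.Duration;
import java.time.LocalDateTime;

/** Hypothetical helper (not part of the SDK) for fetching a presigned URL. */
final class PresignedDownload {
    // Arbitrary safety margin so a download is not started right before expiry.
    private static final Duration MARGIN = Duration.ofSeconds(30);

    private PresignedDownload() {}

    /** Returns true if the URL is still valid at {@code now}, allowing for the margin. */
    static boolean isUsable(LocalDateTime expiresAt, LocalDateTime now) {
        return now.plus(MARGIN).isBefore(expiresAt);
    }

    /**
     * Downloads the presigned URL to {@code target} with a plain HTTP GET.
     * Assumption: no extra headers are required. Throws if the status is not 200.
     */
    static Path download(String url, Path target) throws IOException, InterruptedException {
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<Path> response = http.send(request, HttpResponse.BodyHandlers.ofFile(target));
        if (response.statusCode() != 200) {
            throw new IOException("Download failed with HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

If `isUsable` returns false, requesting a fresh URL from `getResultTable` is likely simpler than retrying a download that may fail partway through.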
### Parameters

- `SheetGetResultTableParams params`
  - `String spreadsheetJobId`
  - `String regionId`
  - `Optional regionType`
    - `TABLE("table")`
    - `EXTRA("extra")`
    - `CELL_METADATA("cell_metadata")`
  - `Optional expiresAtSeconds`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class PresignedUrl:` Schema for a presigned URL.
  - `LocalDateTime expiresAt` The time at which the presigned URL expires
  - `String url` A presigned URL for IO operations against a private file
  - `Optional formFields` Form fields for a presigned POST request

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.sheets.SheetGetResultTableParams;
import com.llamacloud_prod.api.models.files.PresignedUrl;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SheetGetResultTableParams params = SheetGetResultTableParams.builder()
            .spreadsheetJobId("spreadsheet_job_id")
            .regionId("region_id")
            .regionType(SheetGetResultTableParams.RegionType.TABLE)
            .build();
        PresignedUrl presignedUrl = client.beta().sheets().getResultTable(params);
    }
}
```

#### Response

```json
{
  "expires_at": "2019-12-27T18:11:19.117Z",
  "url": "https://example.com",
  "form_fields": {
    "foo": "string"
  }
}
```

## Delete Spreadsheet Job

`JsonValue beta().sheets().deleteJob(SheetDeleteJobParams params = SheetDeleteJobParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**delete** `/api/v1/beta/sheets/jobs/{spreadsheet_job_id}`

Delete a spreadsheet parsing job and its associated data. Experimental: not production-ready and subject to change.
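Because deletion removes the job together with its associated data, a caller may prefer to delete only jobs that have finished. The sketch below is illustrative only: `JobCleanup` and `isTerminal` are hypothetical names, and treating every status other than `PENDING` as terminal is an assumption drawn from the `Status` values listed in this reference (`PENDING`, `SUCCESS`, `ERROR`, `PARTIAL_SUCCESS`, `CANCELLED`), not documented behavior.

```java
import java.util.Set;

/** Hypothetical guard (not part of the SDK) for deciding when a job may be deleted. */
final class JobCleanup {
    // Assumption: a job in one of these statuses will not change any further.
    private static final Set<String> TERMINAL =
        Set.of("SUCCESS", "ERROR", "PARTIAL_SUCCESS", "CANCELLED");

    private JobCleanup() {}

    /** Returns true when the status string names a finished job; null is treated as unfinished. */
    static boolean isTerminal(String status) {
        return status != null && TERMINAL.contains(status);
    }
}
```

A caller could fetch the job first, pass its status value as a string to `isTerminal`, and call `deleteJob` only when the result is true.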
### Parameters - `SheetDeleteJobParams params` - `Optional spreadsheetJobId` - `Optional organizationId` - `Optional projectId` ### Returns - `class SheetDeleteJobResponse:` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.sheets.SheetDeleteJobParams; import com.llamacloud_prod.api.models.beta.sheets.SheetDeleteJobResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); SheetDeleteJobResponse response = client.beta().sheets().deleteJob("spreadsheet_job_id"); } } ``` #### Response ```json {} ``` ## Domain Types ### Sheets Job - `class SheetsJob:` A spreadsheet parsing job. - `String id` The ID of the job - `SheetsParsingConfig configuration` Configuration applied to the parsing job (inline or resolved from a saved preset). - `Optional extractionRange` A1 notation of the range to extract a single region from. If None, the entire sheet is used. - `Optional flattenHierarchicalTables` Return a flattened dataframe when a detected table is recognized as hierarchical. - `Optional generateAdditionalMetadata` Whether to generate additional metadata (title, description) for each extracted region. - `Optional includeHiddenCells` Whether to include hidden cells when extracting regions from the spreadsheet. - `Optional> sheetNames` The names of the sheets to extract regions from. If empty, all sheets will be processed. - `Optional specialization` Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline. - `Optional tableMergeSensitivity` Influences how likely similar-looking regions are merged into a single table. 
Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging). - `STRONG("strong")` - `WEAK("weak")` - `Optional useExperimentalProcessing` Enables experimental processing. Accuracy may be impacted. - `String createdAt` When the job was created - `Optional fileId` The ID of the input file - `String projectId` The ID of the project - `Status status` The status of the parsing job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String updatedAt` When the job was last updated - `String userId` The ID of the user - `Optional config` Configuration for spreadsheet parsing and region extraction - `Optional configurationId` The saved product configuration ID used at create time, if any. - `Optional> errors` Any errors encountered - `Optional file` Schema for a file. - `String id` Unique identifier - `String name` - `String projectId` The ID of the project that the file belongs to - `Optional createdAt` Creation datetime - `Optional dataSourceId` The ID of the data source that the file belongs to - `Optional expiresAt` The expiration date for the file. Files past this date can be deleted. - `Optional externalFileId` The ID of the file in the external system - `Optional fileSize` Size of the file in bytes - `Optional fileType` File type (e.g. pdf, docx, etc.) - `Optional lastModifiedAt` The last modified time of the file - `Optional permissionInfo` Permission information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional purpose` The intended purpose of the file (e.g., 'user_data', 'parse', 'extract', 'split', 'classify') - `Optional resourceInfo` Resource information for the file - `class UnionMember0:` - `List` - `String` - `double` - `boolean` - `Optional updatedAt` Update datetime - `Optional metadataStateTransitions` Per-status entry timestamps. 
Returned only when requested via `?expand=metadata_state_transitions`. - `Optional parameters` Job-time parameters such as webhook configurations. - `Optional> webhookConfigurations` Webhook configurations for job status notifications. - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. - `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. 
auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional> regions` All extracted regions (populated when job is complete) - `String location` Location of the region in the spreadsheet - `String regionType` Type of the extracted region - `String sheetName` Worksheet name where region was found - `Optional description` Generated description for the region - `Optional regionId` Unique identifier for this region within the file - `Optional title` Generated title for the region - `Optional success` Whether the job completed successfully - `Optional> worksheetMetadata` Metadata for each processed worksheet (populated when job is complete) - `String sheetName` Name of the worksheet - `Optional description` Generated description of the worksheet - `Optional title` Generated title for the worksheet ### Sheets Parsing Config - `class SheetsParsingConfig:` Configuration for spreadsheet parsing and region extraction - `Optional extractionRange` A1 notation of the range to extract a single region from. If None, the entire sheet is used. - `Optional flattenHierarchicalTables` Return a flattened dataframe when a detected table is recognized as hierarchical. - `Optional generateAdditionalMetadata` Whether to generate additional metadata (title, description) for each extracted region. - `Optional includeHiddenCells` Whether to include hidden cells when extracting regions from the spreadsheet. - `Optional> sheetNames` The names of the sheets to extract regions from. If empty, all sheets will be processed. - `Optional specialization` Optional specialization mode for domain-specific extraction. Supported values: 'financial-standard', 'financial-enhanced', 'financial-precise'. Default None uses the general-purpose pipeline. - `Optional tableMergeSensitivity` Influences how likely similar-looking regions are merged into a single table. 
Useful for spreadsheets that either have sparse tables (strong merging) or many distinct tables close together (weak merging). - `STRONG("strong")` - `WEAK("weak")` - `Optional useExperimentalProcessing` Enables experimental processing. Accuracy may be impacted. # Directories ## Create Directory `DirectoryCreateResponse beta().directories().create(DirectoryCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/directories` Create a new directory within the specified project. ### Parameters - `DirectoryCreateParams params` - `Optional organizationId` - `Optional projectId` - `String name` Human-readable name for the directory. - `Optional description` Optional description shown to users. - `Optional expiresAt` When this directory expires. Required for ephemeral directories. - `Optional systemMetadata` Reserved system-managed metadata. - `Optional type` Directory type. Use 'ephemeral' for batch processing with automatic cleanup. - `USER("user")` - `EPHEMERAL("ephemeral")` ### Returns - `class DirectoryCreateResponse:` API response schema for a directory. - `String id` Unique identifier for the directory. - `String name` Human-readable name for the directory. - `String projectId` Project the directory belongs to. - `Optional createdAt` Creation datetime - `Optional deletedAt` Optional timestamp of when the directory was deleted. Null if not deleted. - `Optional description` Optional description shown to users. - `Optional expiresAt` When this directory expires and is eligible for cleanup. - `Optional systemMetadata` Reserved system-managed metadata. - `Optional type` Directory type: 'user', 'index', 'ephemeral', or 'system_ephemeral'. 
- `USER("user")` - `INDEX("index")` - `EPHEMERAL("ephemeral")` - `SYSTEM_EPHEMERAL("system_ephemeral")` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.DirectoryCreateParams; import com.llamacloud_prod.api.models.beta.directories.DirectoryCreateResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); DirectoryCreateParams params = DirectoryCreateParams.builder() .name("x") .build(); DirectoryCreateResponse directory = client.beta().directories().create(params); } } ``` #### Response ```json { "id": "id", "name": "x", "project_id": "project_id", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "description": "description", "expires_at": "2019-12-27T18:11:19.117Z", "system_metadata": { "foo": "bar" }, "type": "user", "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## List Directories `DirectoryListPage beta().directories().list(DirectoryListParamsparams = DirectoryListParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/directories` List Directories ### Parameters - `DirectoryListParams params` - `Optional includeDeleted` - `Optional name` - `Optional organizationId` - `Optional pageSize` - `Optional pageToken` - `Optional projectId` - `Optional type` - `USER("user")` - `INDEX("index")` - `EPHEMERAL("ephemeral")` ### Returns - `class DirectoryListResponse:` API response schema for a directory. - `String id` Unique identifier for the directory. - `String name` Human-readable name for the directory. - `String projectId` Project the directory belongs to. - `Optional createdAt` Creation datetime - `Optional deletedAt` Optional timestamp of when the directory was deleted. 
Null if not deleted. - `Optional description` Optional description shown to users. - `Optional expiresAt` When this directory expires and is eligible for cleanup. - `Optional systemMetadata` Reserved system-managed metadata. - `Optional type` Directory type: 'user', 'index', 'ephemeral', or 'system_ephemeral'. - `USER("user")` - `INDEX("index")` - `EPHEMERAL("ephemeral")` - `SYSTEM_EPHEMERAL("system_ephemeral")` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.DirectoryListPage; import com.llamacloud_prod.api.models.beta.directories.DirectoryListParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); DirectoryListPage page = client.beta().directories().list(); } } ``` #### Response ```json { "items": [ { "id": "id", "name": "x", "project_id": "project_id", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "description": "description", "expires_at": "2019-12-27T18:11:19.117Z", "system_metadata": { "foo": "bar" }, "type": "user", "updated_at": "2019-12-27T18:11:19.117Z" } ], "next_page_token": "next_page_token", "total_size": 0 } ``` ## Get Directory `DirectoryGetResponse beta().directories().get(DirectoryGetParamsparams = DirectoryGetParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/directories/{directory_id}` Retrieve a directory by its identifier. ### Parameters - `DirectoryGetParams params` - `Optional directoryId` - `Optional organizationId` - `Optional projectId` ### Returns - `class DirectoryGetResponse:` API response schema for a directory. - `String id` Unique identifier for the directory. - `String name` Human-readable name for the directory. 
- `String projectId` Project the directory belongs to. - `Optional createdAt` Creation datetime - `Optional deletedAt` Optional timestamp of when the directory was deleted. Null if not deleted. - `Optional description` Optional description shown to users. - `Optional expiresAt` When this directory expires and is eligible for cleanup. - `Optional systemMetadata` Reserved system-managed metadata. - `Optional type` Directory type: 'user', 'index', 'ephemeral', or 'system_ephemeral'. - `USER("user")` - `INDEX("index")` - `EPHEMERAL("ephemeral")` - `SYSTEM_EPHEMERAL("system_ephemeral")` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.DirectoryGetParams; import com.llamacloud_prod.api.models.beta.directories.DirectoryGetResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); DirectoryGetResponse directory = client.beta().directories().get("directory_id"); } } ``` #### Response ```json { "id": "id", "name": "x", "project_id": "project_id", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "description": "description", "expires_at": "2019-12-27T18:11:19.117Z", "system_metadata": { "foo": "bar" }, "type": "user", "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## Update Directory `DirectoryUpdateResponse beta().directories().update(DirectoryUpdateParamsparams = DirectoryUpdateParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **patch** `/api/v1/beta/directories/{directory_id}` Update directory metadata. ### Parameters - `DirectoryUpdateParams params` - `Optional directoryId` - `Optional organizationId` - `Optional projectId` - `Optional description` Updated description for the directory. 
- `Optional name` Updated name for the directory. ### Returns - `class DirectoryUpdateResponse:` API response schema for a directory. - `String id` Unique identifier for the directory. - `String name` Human-readable name for the directory. - `String projectId` Project the directory belongs to. - `Optional createdAt` Creation datetime - `Optional deletedAt` Optional timestamp of when the directory was deleted. Null if not deleted. - `Optional description` Optional description shown to users. - `Optional expiresAt` When this directory expires and is eligible for cleanup. - `Optional systemMetadata` Reserved system-managed metadata. - `Optional type` Directory type: 'user', 'index', 'ephemeral', or 'system_ephemeral'. - `USER("user")` - `INDEX("index")` - `EPHEMERAL("ephemeral")` - `SYSTEM_EPHEMERAL("system_ephemeral")` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.DirectoryUpdateParams; import com.llamacloud_prod.api.models.beta.directories.DirectoryUpdateResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); DirectoryUpdateResponse directory = client.beta().directories().update("directory_id"); } } ``` #### Response ```json { "id": "id", "name": "x", "project_id": "project_id", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "description": "description", "expires_at": "2019-12-27T18:11:19.117Z", "system_metadata": { "foo": "bar" }, "type": "user", "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## Delete Directory `beta().directories().delete(DirectoryDeleteParamsparams = DirectoryDeleteParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **delete** 
`/api/v1/beta/directories/{directory_id}`

Permanently delete a directory.

### Parameters

- `DirectoryDeleteParams params`
  - `Optional directoryId`
  - `Optional organizationId`
  - `Optional projectId`

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.directories.DirectoryDeleteParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        client.beta().directories().delete("directory_id");
    }
}
```

# Files

## Add Directory File

`FileAddResponse beta().directories().files().add(FileAddParams params, RequestOptions requestOptions = RequestOptions.none())`

**post** `/api/v1/beta/directories/{directory_id}/files`

Create a new file within the specified directory; the directory must exist in the project and `file_id` must reference an existing file.

### Parameters

- `FileAddParams params`
  - `Optional directoryId`
  - `Optional organizationId`
  - `Optional projectId`
  - `String fileId` File ID for the storage location (required).
  - `Optional displayName` Display name for the file. If not provided, will use the file's name.
  - `Optional metadata` User-defined metadata key-value pairs to associate with the file.
    - `String`
    - `long`
    - `double`
    - `boolean`
    - `JsonValue`
    - `List`
  - `Optional uniqueId` Unique identifier for the file in the directory. If not provided, will use the file's external_file_id or name.

### Returns

- `class FileAddResponse:` API response schema for a directory file.
  - `String id` Unique identifier for the directory file.
  - `String directoryId` Directory the file belongs to.
  - `String displayName` Display name for the file.
  - `String projectId` Project the directory file belongs to.
- `String uniqueId` Unique identifier for the file in the directory - `Optional createdAt` Creation datetime - `Optional deletedAt` Soft delete marker when the file is removed upstream or by user action. - `Optional downloadUrl` Schema for a presigned URL. - `LocalDateTime expiresAt` The time at which the presigned URL expires - `String url` A presigned URL for IO operations against a private file - `Optional formFields` Form fields for a presigned POST request - `Optional fileId` File ID for the storage location. - `Optional metadata` Merged metadata from all sources. Higher-priority sources override lower. - `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileAddParams; import com.llamacloud_prod.api.models.beta.directories.files.FileAddResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileAddParams params = FileAddParams.builder() .directoryId("directory_id") .fileId("file_id") .build(); FileAddResponse response = client.beta().directories().files().add(params); } } ``` #### Response ```json { "id": "id", "directory_id": "directory_id", "display_name": "x", "project_id": "project_id", "unique_id": "x", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "download_url": { "expires_at": "2019-12-27T18:11:19.117Z", "url": "https://example.com", "form_fields": { "foo": "string" } }, "file_id": "file_id", "metadata": { "foo": "string" }, "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## List Directory Files `FileListPage beta().directories().files().list(FileListParamsparams = FileListParams.none(), RequestOptionsrequestOptions = 
RequestOptions.none())` **get** `/api/v1/beta/directories/{directory_id}/files` List all files within the specified directory with optional filtering and pagination. ### Parameters - `FileListParams params` - `Optional directoryId` - `Optional displayName` - `Optional displayNameContains` - `Optional> expand` Fields to expand on each directory file. - `Optional fileId` - `Optional includeDeleted` - `Optional organizationId` - `Optional pageSize` - `Optional pageToken` - `Optional projectId` - `Optional uniqueId` - `Optional updatedAtOnOrAfter` Include items updated at or after this timestamp (inclusive) - `Optional updatedAtOnOrBefore` Include items updated at or before this timestamp (inclusive) ### Returns - `class FileListResponse:` API response schema for a directory file. - `String id` Unique identifier for the directory file. - `String directoryId` Directory the file belongs to. - `String displayName` Display name for the file. - `String projectId` Project the directory file belongs to. - `String uniqueId` Unique identifier for the file in the directory - `Optional createdAt` Creation datetime - `Optional deletedAt` Soft delete marker when the file is removed upstream or by user action. - `Optional downloadUrl` Schema for a presigned URL. - `LocalDateTime expiresAt` The time at which the presigned URL expires - `String url` A presigned URL for IO operations against a private file - `Optional formFields` Form fields for a presigned POST request - `Optional fileId` File ID for the storage location. - `Optional metadata` Merged metadata from all sources. Higher-priority sources override lower. 
- `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileListPage; import com.llamacloud_prod.api.models.beta.directories.files.FileListParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileListPage page = client.beta().directories().files().list("directory_id"); } } ``` #### Response ```json { "items": [ { "id": "id", "directory_id": "directory_id", "display_name": "x", "project_id": "project_id", "unique_id": "x", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "download_url": { "expires_at": "2019-12-27T18:11:19.117Z", "url": "https://example.com", "form_fields": { "foo": "string" } }, "file_id": "file_id", "metadata": { "foo": "string" }, "updated_at": "2019-12-27T18:11:19.117Z" } ], "next_page_token": "next_page_token", "total_size": 0 } ``` ## Get Directory File `FileGetResponse beta().directories().files().get(FileGetParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/directories/{directory_id}/files/{directory_file_id}` Get a directory file by `directory_file_id`; to look up by `unique_id`, use the list endpoint with a filter. ### Parameters - `FileGetParams params` - `String directoryId` - `Optional directoryFileId` - `Optional> expand` Fields to expand. - `Optional organizationId` - `Optional projectId` ### Returns - `class FileGetResponse:` API response schema for a directory file. - `String id` Unique identifier for the directory file. - `String directoryId` Directory the file belongs to. - `String displayName` Display name for the file. 
- `String projectId` Project the directory file belongs to. - `String uniqueId` Unique identifier for the file in the directory - `Optional createdAt` Creation datetime - `Optional deletedAt` Soft delete marker when the file is removed upstream or by user action. - `Optional downloadUrl` Schema for a presigned URL. - `LocalDateTime expiresAt` The time at which the presigned URL expires - `String url` A presigned URL for IO operations against a private file - `Optional formFields` Form fields for a presigned POST request - `Optional fileId` File ID for the storage location. - `Optional metadata` Merged metadata from all sources. Higher-priority sources override lower. - `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileGetParams; import com.llamacloud_prod.api.models.beta.directories.files.FileGetResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileGetParams params = FileGetParams.builder() .directoryId("directory_id") .directoryFileId("directory_file_id") .build(); FileGetResponse file = client.beta().directories().files().get(params); } } ``` #### Response ```json { "id": "id", "directory_id": "directory_id", "display_name": "x", "project_id": "project_id", "unique_id": "x", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "download_url": { "expires_at": "2019-12-27T18:11:19.117Z", "url": "https://example.com", "form_fields": { "foo": "string" } }, "file_id": "file_id", "metadata": { "foo": "string" }, "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## Update Directory File `FileUpdateResponse 
beta().directories().files().update(FileUpdateParams params, RequestOptions requestOptions = RequestOptions.none())` **patch** `/api/v1/beta/directories/{directory_id}/files/{directory_file_id}` Update directory-file metadata by `directory_file_id`. To move the file to a different directory, set `targetDirectoryId`; `directoryId` identifies the directory in the request path. To resolve a file from its `unique_id`, list with a filter first. ### Parameters - `FileUpdateParams params` - `String directoryId` - `Optional directoryFileId` - `Optional organizationId` - `Optional projectId` - `Optional displayName` Updated display name. - `Optional metadata` User-defined metadata key-value pairs. Replaces the user metadata layer. - `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional targetDirectoryId` Move file to a different directory. - `Optional uniqueId` Updated unique identifier. ### Returns - `class FileUpdateResponse:` API response schema for a directory file. - `String id` Unique identifier for the directory file. - `String directoryId` Directory the file belongs to. - `String displayName` Display name for the file. - `String projectId` Project the directory file belongs to. - `String uniqueId` Unique identifier for the file in the directory - `Optional createdAt` Creation datetime - `Optional deletedAt` Soft delete marker when the file is removed upstream or by user action. - `Optional downloadUrl` Schema for a presigned URL. - `LocalDateTime expiresAt` The time at which the presigned URL expires - `String url` A presigned URL for IO operations against a private file - `Optional formFields` Form fields for a presigned POST request - `Optional fileId` File ID for the storage location. - `Optional metadata` Merged metadata from all sources. Higher-priority sources override lower.
- `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileUpdateParams; import com.llamacloud_prod.api.models.beta.directories.files.FileUpdateResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileUpdateParams params = FileUpdateParams.builder() .directoryId("directory_id") .directoryFileId("directory_file_id") .build(); FileUpdateResponse file = client.beta().directories().files().update(params); } } ``` #### Response ```json { "id": "id", "directory_id": "directory_id", "display_name": "x", "project_id": "project_id", "unique_id": "x", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "download_url": { "expires_at": "2019-12-27T18:11:19.117Z", "url": "https://example.com", "form_fields": { "foo": "string" } }, "file_id": "file_id", "metadata": { "foo": "string" }, "updated_at": "2019-12-27T18:11:19.117Z" } ``` ## Delete Directory File `beta().directories().files().delete(FileDeleteParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **delete** `/api/v1/beta/directories/{directory_id}/files/{directory_file_id}` Delete a directory file by `directory_file_id`; to resolve from `unique_id`, list with a filter first. 
### Parameters - `FileDeleteParams params` - `String directoryId` - `Optional directoryFileId` - `Optional organizationId` - `Optional projectId` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileDeleteParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileDeleteParams params = FileDeleteParams.builder() .directoryId("directory_id") .directoryFileId("directory_file_id") .build(); client.beta().directories().files().delete(params); } } ``` ## Upload File To Directory `FileUploadResponse beta().directories().files().upload(FileUploadParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/directories/{directory_id}/files/upload` Upload a file and create its directory entry in one call; `unique_id` / `display_name` default to values derived from file metadata. ### Parameters - `FileUploadParams params` - `Optional directoryId` - `Optional organizationId` - `Optional projectId` - `String uploadFile` - `Optional displayName` - `Optional externalFileId` - `Optional metadata` User metadata as a JSON object string. - `Optional uniqueId` ### Returns - `class FileUploadResponse:` API response schema for a directory file. - `String id` Unique identifier for the directory file. - `String directoryId` Directory the file belongs to. - `String displayName` Display name for the file. - `String projectId` Project the directory file belongs to. - `String uniqueId` Unique identifier for the file in the directory - `Optional createdAt` Creation datetime - `Optional deletedAt` Soft delete marker when the file is removed upstream or by user action. - `Optional downloadUrl` Schema for a presigned URL. 
- `LocalDateTime expiresAt` The time at which the presigned URL expires - `String url` A presigned URL for IO operations against a private file - `Optional formFields` Form fields for a presigned POST request - `Optional fileId` File ID for the storage location. - `Optional metadata` Merged metadata from all sources. Higher-priority sources override lower. - `String` - `long` - `double` - `boolean` - `JsonValue;` - `List` - `Optional updatedAt` Update datetime ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.directories.files.FileUploadParams; import com.llamacloud_prod.api.models.beta.directories.files.FileUploadResponse; import java.io.ByteArrayInputStream; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); FileUploadParams params = FileUploadParams.builder() .directoryId("directory_id") .uploadFile(new ByteArrayInputStream("Example data".getBytes())) .build(); FileUploadResponse response = client.beta().directories().files().upload(params); } } ``` #### Response ```json { "id": "id", "directory_id": "directory_id", "display_name": "x", "project_id": "project_id", "unique_id": "x", "created_at": "2019-12-27T18:11:19.117Z", "deleted_at": "2019-12-27T18:11:19.117Z", "download_url": { "expires_at": "2019-12-27T18:11:19.117Z", "url": "https://example.com", "form_fields": { "foo": "string" } }, "file_id": "file_id", "metadata": { "foo": "string" }, "updated_at": "2019-12-27T18:11:19.117Z" } ``` # Batch ## Create Batch Job `BatchCreateResponse beta().batch().create(BatchCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/batch-processing` Create a batch processing job. Processes files from a directory or a specific list of item IDs. 
Supports batch parsing and classification operations. Provide either `directory_id` to process all files in a directory, or `item_ids` for specific items. The job runs asynchronously; poll `GET /api/v1/beta/batch-processing/{job_id}` for progress. ### Parameters - `BatchCreateParams params` - `Optional organizationId` - `Optional projectId` - `Optional temporalNamespace` - `JobConfig jobConfig` Job configuration: either a parse or a classify config - `class BatchParseJobRecordCreate:` Batch-specific parse job record for batch processing. This model contains the metadata and configuration for a batch parse job, but excludes file-specific information. It's used as input to the batch parent workflow and combined with DirectoryFile data to create full ParseJobRecordCreate instances for each file. Attributes: `job_name`: Must be PARSE_RAW_FILE; `partitions`: Partitions for job output location; `parameters`: Generic parse configuration (BatchParseJobConfig); `session_id`: Upstream request ID for tracking; `correlation_id`: Correlation ID for cross-service tracking; `parent_job_execution_id`: Parent job execution ID if nested; `user_id`: User who created the job; `project_id`: Project this job belongs to; `webhook_url`: Optional webhook URL for job completion notifications. - `Optional correlationId` The correlation ID for this job. Used for tracking the job across services. - `Optional jobName` - `PARSE_RAW_FILE_JOB("parse_raw_file_job")` - `Optional parameters` Generic parse job configuration for batch processing. This model contains the parsing configuration that applies to all files in a batch, but excludes file-specific fields like file_name, file_id, etc. Those file-specific fields are populated from DirectoryFile data when creating individual ParseJobRecordCreate instances for each file. The fields in this model should be generic settings that apply uniformly to all files being processed in the batch.
- `Optional adaptiveLongTable` - `Optional aggressiveTableExtraction` - `Optional annotateLinks` - `Optional autoMode` - `Optional autoModeConfigurationJson` - `Optional autoModeTriggerOnImageInPage` - `Optional autoModeTriggerOnRegexpInPage` - `Optional autoModeTriggerOnTableInPage` - `Optional autoModeTriggerOnTextInPage` - `Optional azureOpenAIApiVersion` - `Optional azureOpenAIDeploymentName` - `Optional azureOpenAIEndpoint` - `Optional azureOpenAIKey` - `Optional bboxBottom` - `Optional bboxLeft` - `Optional bboxRight` - `Optional bboxTop` - `Optional boundingBox` - `Optional compactMarkdownTable` - `Optional complementalFormattingInstruction` - `Optional contentGuidelineInstruction` - `Optional continuousMode` - `Optional customMetadata` The custom metadata to attach to the documents. - `Optional disableImageExtraction` - `Optional disableOcr` - `Optional disableReconstruction` - `Optional doNotCache` - `Optional doNotUnrollColumns` - `Optional enableCostOptimizer` - `Optional extractCharts` - `Optional extractLayout` - `Optional extractPrintedPageNumber` - `Optional fastMode` - `Optional formattingInstruction` - `Optional gpt4oApiKey` - `Optional gpt4oMode` - `Optional guessXlsxSheetName` - `Optional hideFooters` - `Optional hideHeaders` - `Optional highResOcr` - `Optional htmlMakeAllElementsVisible` - `Optional htmlRemoveFixedElements` - `Optional htmlRemoveNavigationElements` - `Optional httpProxy` - `Optional ignoreDocumentElementsForLayoutDetection` - `Optional> imagesToSave` - `SCREENSHOT("screenshot")` - `EMBEDDED("embedded")` - `LAYOUT("layout")` - `Optional inlineImagesInMarkdown` - `Optional inputS3Path` - `Optional inputS3Region` The region for the input S3 bucket. - `Optional inputUrl` - `Optional internalIsScreenshotJob` - `Optional invalidateCache` - `Optional isFormattingInstruction` - `Optional jobTimeoutExtraTimePerPageInSeconds` - `Optional jobTimeoutInSeconds` - `Optional keepPageSeparatorWhenMergingTables` - `Optional lang` The language. 
- `Optional> languages` - `AF("af")` - `AZ("az")` - `BS("bs")` - `CS("cs")` - `CY("cy")` - `DA("da")` - `DE("de")` - `EN("en")` - `ES("es")` - `ET("et")` - `FR("fr")` - `GA("ga")` - `HR("hr")` - `HU("hu")` - `ID("id")` - `IS("is")` - `IT("it")` - `KU("ku")` - `LA("la")` - `LT("lt")` - `LV("lv")` - `MI("mi")` - `MS("ms")` - `MT("mt")` - `NL("nl")` - `NO("no")` - `OC("oc")` - `PI("pi")` - `PL("pl")` - `PT("pt")` - `RO("ro")` - `RS_LATIN("rs_latin")` - `SK("sk")` - `SL("sl")` - `SQ("sq")` - `SV("sv")` - `SW("sw")` - `TL("tl")` - `TR("tr")` - `UZ("uz")` - `VI("vi")` - `AR("ar")` - `FA("fa")` - `UG("ug")` - `UR("ur")` - `BN("bn")` - `AS("as")` - `MNI("mni")` - `RU("ru")` - `RS_CYRILLIC("rs_cyrillic")` - `BE("be")` - `BG("bg")` - `UK("uk")` - `MN("mn")` - `ABQ("abq")` - `ADY("ady")` - `KBD("kbd")` - `AVA("ava")` - `DAR("dar")` - `INH("inh")` - `CHE("che")` - `LBE("lbe")` - `LEZ("lez")` - `TAB("tab")` - `TJK("tjk")` - `HI("hi")` - `MR("mr")` - `NE("ne")` - `BH("bh")` - `MAI("mai")` - `ANG("ang")` - `BHO("bho")` - `MAH("mah")` - `SCK("sck")` - `NEW("new")` - `GOM("gom")` - `SA("sa")` - `BGC("bgc")` - `TH("th")` - `CH_SIM("ch_sim")` - `CH_TRA("ch_tra")` - `JA("ja")` - `KO("ko")` - `TA("ta")` - `TE("te")` - `KN("kn")` - `Optional layoutAware` - `Optional lineLevelBoundingBox` - `Optional markdownTableMultilineHeaderSeparator` - `Optional maxPages` - `Optional maxPagesEnforced` - `Optional mergeTablesAcrossPagesInMarkdown` - `Optional model` - `Optional outlinedTableExtraction` - `Optional outputPdfOfDocument` - `Optional outputS3PathPrefix` If specified, LlamaParse saves its output under this path, and every output file uses it as a prefix. Must be a valid s3:// URL. - `Optional outputS3Region` The region for the output S3 bucket. - `Optional outputTablesAsHtml` - `Optional outputBucket` The output bucket.
- `Optional pageErrorTolerance` - `Optional pageFooterPrefix` - `Optional pageFooterSuffix` - `Optional pageHeaderPrefix` - `Optional pageHeaderSuffix` - `Optional pagePrefix` - `Optional pageSeparator` - `Optional pageSuffix` - `Optional parseMode` Enum for representing the mode of parsing to be used. - `PARSE_PAGE_WITHOUT_LLM("parse_page_without_llm")` - `PARSE_PAGE_WITH_LLM("parse_page_with_llm")` - `PARSE_PAGE_WITH_LVM("parse_page_with_lvm")` - `PARSE_PAGE_WITH_AGENT("parse_page_with_agent")` - `PARSE_PAGE_WITH_LAYOUT_AGENT("parse_page_with_layout_agent")` - `PARSE_DOCUMENT_WITH_LLM("parse_document_with_llm")` - `PARSE_DOCUMENT_WITH_LVM("parse_document_with_lvm")` - `PARSE_DOCUMENT_WITH_AGENT("parse_document_with_agent")` - `Optional parsingInstruction` - `Optional pipelineId` The pipeline ID. - `Optional preciseBoundingBox` - `Optional premiumMode` - `Optional presentationOutOfBoundsContent` - `Optional presentationSkipEmbeddedData` - `Optional preserveLayoutAlignmentAcrossPages` - `Optional preserveVerySmallText` - `Optional preset` - `Optional priority` The priority for the request. This field may be ignored or overwritten depending on the organization tier. - `LOW("low")` - `MEDIUM("medium")` - `HIGH("high")` - `CRITICAL("critical")` - `Optional projectId` - `Optional removeHiddenText` - `Optional replaceFailedPageMode` Enum for representing the different available page error handling modes. 
- `RAW_TEXT("raw_text")` - `BLANK_PAGE("blank_page")` - `ERROR_MESSAGE("error_message")` - `Optional replaceFailedPageWithErrorMessagePrefix` - `Optional replaceFailedPageWithErrorMessageSuffix` - `Optional resourceInfo` The resource info about the file - `Optional saveImages` - `Optional skipDiagonalText` - `Optional specializedChartParsingAgentic` - `Optional specializedChartParsingEfficient` - `Optional specializedChartParsingPlus` - `Optional specializedImageParsing` - `Optional spreadsheetExtractSubTables` - `Optional spreadsheetForceFormulaComputation` - `Optional spreadsheetIncludeHiddenSheets` - `Optional strictModeBuggyFont` - `Optional strictModeImageExtraction` - `Optional strictModeImageOcr` - `Optional strictModeReconstruction` - `Optional structuredOutput` - `Optional structuredOutputJsonSchema` - `Optional structuredOutputJsonSchemaName` - `Optional systemPrompt` - `Optional systemPromptAppend` - `Optional takeScreenshot` - `Optional targetPages` - `Optional tier` - `Optional type` - `PARSE("parse")` - `Optional useVendorMultimodalModel` - `Optional userPrompt` - `Optional vendorMultimodalApiKey` - `Optional vendorMultimodalModelName` - `Optional version` - `Optional> webhookConfigurations` Outbound webhook endpoints to notify on job status changes - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. 
- `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional webhookUrl` - `Optional parentJobExecutionId` The ID of the parent job execution. - `Optional partitions` The partitions for this execution. Used for determining where to save job output. - `Optional projectId` The ID of the project this job belongs to. - `Optional sessionId` The upstream request ID that created this job. Used for tracking the job across services. - `Optional userId` The ID of the user that created this job - `Optional webhookUrl` The URL that needs to be called at the end of the parsing job. - `class ClassifyJob:` A classify job. - `String id` Unique identifier - `String projectId` The ID of the project - `List rules` The rules to classify the files - `String description` Natural language description of what to classify. 
Be specific about the content characteristics that identify this document type. - `String type` The document type to assign when this rule matches (e.g., 'invoice', 'receipt', 'contract') - `StatusEnum status` The status of the classify job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String userId` The ID of the user - `Optional createdAt` Creation datetime - `Optional effectiveAt` - `Optional errorMessage` Error message for the latest job attempt, if any. - `Optional jobRecordId` The job record ID associated with this status, if any. - `Optional mode` The classification mode to use - `FAST("FAST")` - `MULTIMODAL("MULTIMODAL")` - `Optional parsingConfiguration` The configuration for the parsing job - `Optional lang` The language to parse the files in - `AF("af")` - `AZ("az")` - `BS("bs")` - `CS("cs")` - `CY("cy")` - `DA("da")` - `DE("de")` - `EN("en")` - `ES("es")` - `ET("et")` - `FR("fr")` - `GA("ga")` - `HR("hr")` - `HU("hu")` - `ID("id")` - `IS("is")` - `IT("it")` - `KU("ku")` - `LA("la")` - `LT("lt")` - `LV("lv")` - `MI("mi")` - `MS("ms")` - `MT("mt")` - `NL("nl")` - `NO("no")` - `OC("oc")` - `PI("pi")` - `PL("pl")` - `PT("pt")` - `RO("ro")` - `RS_LATIN("rs_latin")` - `SK("sk")` - `SL("sl")` - `SQ("sq")` - `SV("sv")` - `SW("sw")` - `TL("tl")` - `TR("tr")` - `UZ("uz")` - `VI("vi")` - `AR("ar")` - `FA("fa")` - `UG("ug")` - `UR("ur")` - `BN("bn")` - `AS("as")` - `MNI("mni")` - `RU("ru")` - `RS_CYRILLIC("rs_cyrillic")` - `BE("be")` - `BG("bg")` - `UK("uk")` - `MN("mn")` - `ABQ("abq")` - `ADY("ady")` - `KBD("kbd")` - `AVA("ava")` - `DAR("dar")` - `INH("inh")` - `CHE("che")` - `LBE("lbe")` - `LEZ("lez")` - `TAB("tab")` - `TJK("tjk")` - `HI("hi")` - `MR("mr")` - `NE("ne")` - `BH("bh")` - `MAI("mai")` - `ANG("ang")` - `BHO("bho")` - `MAH("mah")` - `SCK("sck")` - `NEW("new")` - `GOM("gom")` - `SA("sa")` - `BGC("bgc")` - `TH("th")` - `CH_SIM("ch_sim")` - `CH_TRA("ch_tra")` - 
`JA("ja")` - `KO("ko")` - `TA("ta")` - `TE("te")` - `KN("kn")` - `Optional maxPages` The maximum number of pages to parse - `Optional> targetPages` The pages to target for parsing (0-indexed, so first page is at 0) - `Optional updatedAt` Update datetime - `Optional continueAsNewThreshold` Maximum files to process per execution cycle in directory mode. Defaults to page_size. - `Optional directoryId` ID of the directory containing files to process - `Optional> itemIds` List of specific item IDs to process. Either this or directory_id must be provided. - `Optional pageSize` Number of files to process per batch when using directory mode ### Returns - `class BatchCreateResponse:` Response schema for a batch processing job. - `String id` Unique identifier for the batch job - `JobType jobType` Type of processing operation (parse or classify) - `PARSE("parse")` - `EXTRACT("extract")` - `CLASSIFY("classify")` - `String projectId` Project this job belongs to - `Status status` Current job status - `PENDING("pending")` - `RUNNING("running")` - `DISPATCHED("dispatched")` - `COMPLETED("completed")` - `FAILED("failed")` - `CANCELLED("cancelled")` - `long totalItems` Total number of items in the job - `Optional completedAt` Timestamp when job completed - `Optional createdAt` Creation datetime - `Optional directoryId` Directory being processed - `Optional effectiveAt` - `Optional errorMessage` Error message for the latest job attempt, if any. - `Optional failedItems` Number of items that failed processing - `Optional jobRecordId` The job record ID associated with this status, if any. 
- `Optional processedItems` Number of items processed so far - `Optional skippedItems` Number of items skipped (already processed or size limit) - `Optional startedAt` Timestamp when job processing started - `Optional updatedAt` Update datetime - `Optional workflowId` Async job tracking ID ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.batch.BatchCreateParams; import com.llamacloud_prod.api.models.beta.batch.BatchCreateResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); BatchCreateParams params = BatchCreateParams.builder() .jobConfig(BatchCreateParams.JobConfig.BatchParseJobRecordCreate.builder().build()) .build(); BatchCreateResponse batch = client.beta().batch().create(params); } } ``` #### Response ```json { "id": "bjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "job_type": "parse", "project_id": "proj-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "status": "pending", "total_items": 0, "completed_at": "2019-12-27T18:11:19.117Z", "created_at": "2019-12-27T18:11:19.117Z", "directory_id": "dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "effective_at": "2019-12-27T18:11:19.117Z", "error_message": "error_message", "failed_items": 0, "job_record_id": "job_record_id", "processed_items": 0, "skipped_items": 0, "started_at": "2019-12-27T18:11:19.117Z", "updated_at": "2019-12-27T18:11:19.117Z", "workflow_id": "workflow_id" } ``` ## List Batch Jobs `BatchListPage beta().batch().list(BatchListParamsparams = BatchListParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/batch-processing` List batch processing jobs with optional filtering. Filter by `directory_id`, `job_type`, or `status`. Results are paginated with configurable `limit` and `offset`. 
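The `limit` / `offset` pagination mentioned above can be driven by a simple loop. The following is a minimal sketch rather than SDK code: `OffsetPager` and `PageFetcher` are hypothetical names standing in for whatever call returns one page of jobs, and treating a short page as the end of the result set is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of offset-based pagination; not part of the SDK. */
final class OffsetPager {
    private OffsetPager() {}

    /** Hypothetical stand-in for a call that returns up to `limit` items starting at `offset`. */
    interface PageFetcher<T> {
        List<T> fetch(int limit, int offset);
    }

    /** Collects every item by advancing the offset until a page is shorter than `limit`. */
    static <T> List<T> fetchAll(PageFetcher<T> fetcher, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        List<T> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<T> page = fetcher.fetch(limit, offset);
            all.addAll(page);
            // Assumption: a short (or empty) page means there is nothing left to read.
            if (page.size() < limit) {
                return all;
            }
            offset += page.size();
        }
    }

    public static void main(String[] args) {
        // In-memory data standing in for the server-side job list.
        List<String> jobs = Arrays.asList("job-1", "job-2", "job-3", "job-4", "job-5");
        PageFetcher<String> fake = (limit, offset) ->
                jobs.subList(Math.min(offset, jobs.size()), Math.min(offset + limit, jobs.size()));
        System.out.println(fetchAll(fake, 2));
    }
}
```

In real use the lambda would wrap a call to the list method with the current `limit` and `offset`; the loop itself does not depend on the item type.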
### Parameters - `BatchListParams params` - `Optional directoryId` Filter by directory ID - `Optional jobType` Filter by job type (PARSE, EXTRACT, CLASSIFY) - `PARSE("parse")` - `EXTRACT("extract")` - `CLASSIFY("classify")` - `Optional limit` Maximum number of jobs to return - `Optional offset` Number of jobs to skip for pagination - `Optional organizationId` - `Optional projectId` - `Optional status` Filter by job status (PENDING, RUNNING, COMPLETED, FAILED, CANCELLED) - `PENDING("pending")` - `RUNNING("running")` - `DISPATCHED("dispatched")` - `COMPLETED("completed")` - `FAILED("failed")` - `CANCELLED("cancelled")` ### Returns - `class BatchListResponse:` Response schema for a batch processing job. - `String id` Unique identifier for the batch job - `JobType jobType` Type of processing operation (parse or classify) - `PARSE("parse")` - `EXTRACT("extract")` - `CLASSIFY("classify")` - `String projectId` Project this job belongs to - `Status status` Current job status - `PENDING("pending")` - `RUNNING("running")` - `DISPATCHED("dispatched")` - `COMPLETED("completed")` - `FAILED("failed")` - `CANCELLED("cancelled")` - `long totalItems` Total number of items in the job - `Optional completedAt` Timestamp when job completed - `Optional createdAt` Creation datetime - `Optional directoryId` Directory being processed - `Optional effectiveAt` - `Optional errorMessage` Error message for the latest job attempt, if any. - `Optional failedItems` Number of items that failed processing - `Optional jobRecordId` The job record ID associated with this status, if any. 
- `Optional processedItems` Number of items processed so far - `Optional skippedItems` Number of items skipped (already processed or size limit) - `Optional startedAt` Timestamp when job processing started - `Optional updatedAt` Update datetime - `Optional workflowId` Async job tracking ID ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.batch.BatchListPage; import com.llamacloud_prod.api.models.beta.batch.BatchListParams; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); BatchListPage page = client.beta().batch().list(); } } ``` #### Response ```json { "items": [ { "id": "bjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "job_type": "parse", "project_id": "proj-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "status": "pending", "total_items": 0, "completed_at": "2019-12-27T18:11:19.117Z", "created_at": "2019-12-27T18:11:19.117Z", "directory_id": "dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "effective_at": "2019-12-27T18:11:19.117Z", "error_message": "error_message", "failed_items": 0, "job_record_id": "job_record_id", "processed_items": 0, "skipped_items": 0, "started_at": "2019-12-27T18:11:19.117Z", "updated_at": "2019-12-27T18:11:19.117Z", "workflow_id": "workflow_id" } ], "next_page_token": "next_page_token", "total_size": 0 } ``` ## Get Batch Job Status `BatchGetStatusResponse beta().batch().getStatus(BatchGetStatusParamsparams = BatchGetStatusParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **get** `/api/v1/beta/batch-processing/{job_id}` Get detailed status of a batch processing job. Returns current progress percentage, file counts (total, processed, failed, skipped), and timestamps. 
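Because batch jobs run asynchronously, a caller typically polls this endpoint until the job reaches a final state. A minimal sketch of such a loop follows. It assumes that `completed`, `failed`, and `cancelled` are the terminal values of the status enum listed in this reference; `StatusPoller` and the `Supplier<String>` are hypothetical stand-ins for a real status call, not SDK types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch of a polling loop for an asynchronous job; not part of the SDK. */
final class StatusPoller {
    private StatusPoller() {}

    // Assumption: these three status values mean the job will not change any further.
    private static final Set<String> TERMINAL =
            new HashSet<>(Arrays.asList("completed", "failed", "cancelled"));

    /**
     * Calls `statusSource` until it reports a terminal status or `maxAttempts` is used up,
     * sleeping `delayMillis` between attempts. Returns the last status seen.
     */
    static String pollUntilDone(Supplier<String> statusSource, int maxAttempts, long delayMillis) {
        if (maxAttempts <= 0) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        String status = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = statusSource.get();
            if (TERMINAL.contains(status)) {
                return status;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling early.
                    Thread.currentThread().interrupt();
                    return status;
                }
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Canned statuses standing in for successive responses from the status endpoint.
        Iterator<String> canned = Arrays.asList("pending", "running", "completed").iterator();
        System.out.println(pollUntilDone(canned::next, 10, 10L));
    }
}
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely; a production loop would likely also add backoff between attempts.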
### Parameters - `BatchGetStatusParams params` - `Optional jobId` - `Optional organizationId` - `Optional projectId` ### Returns - `class BatchGetStatusResponse:` Detailed status response for a batch processing job. - `Job job` Response schema for a batch processing job. - `String id` Unique identifier for the batch job - `JobType jobType` Type of processing operation (parse or classify) - `PARSE("parse")` - `EXTRACT("extract")` - `CLASSIFY("classify")` - `String projectId` Project this job belongs to - `Status status` Current job status - `PENDING("pending")` - `RUNNING("running")` - `DISPATCHED("dispatched")` - `COMPLETED("completed")` - `FAILED("failed")` - `CANCELLED("cancelled")` - `long totalItems` Total number of items in the job - `Optional completedAt` Timestamp when job completed - `Optional createdAt` Creation datetime - `Optional directoryId` Directory being processed - `Optional effectiveAt` - `Optional errorMessage` Error message for the latest job attempt, if any. - `Optional failedItems` Number of items that failed processing - `Optional jobRecordId` The job record ID associated with this status, if any. 
- `Optional processedItems` Number of items processed so far - `Optional skippedItems` Number of items skipped (already processed or size limit) - `Optional startedAt` Timestamp when job processing started - `Optional updatedAt` Update datetime - `Optional workflowId` Async job tracking ID - `double progressPercentage` Percentage of items processed (0-100) ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.batch.BatchGetStatusParams; import com.llamacloud_prod.api.models.beta.batch.BatchGetStatusResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); BatchGetStatusResponse response = client.beta().batch().getStatus("job_id"); } } ``` #### Response ```json { "job": { "id": "bjb-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "job_type": "parse", "project_id": "proj-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "status": "pending", "total_items": 0, "completed_at": "2019-12-27T18:11:19.117Z", "created_at": "2019-12-27T18:11:19.117Z", "directory_id": "dir-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "effective_at": "2019-12-27T18:11:19.117Z", "error_message": "error_message", "failed_items": 0, "job_record_id": "job_record_id", "processed_items": 0, "skipped_items": 0, "started_at": "2019-12-27T18:11:19.117Z", "updated_at": "2019-12-27T18:11:19.117Z", "workflow_id": "workflow_id" }, "progress_percentage": 0 } ``` ## Cancel Batch Job `BatchCancelResponse beta().batch().cancel(BatchCancelParamsparams = BatchCancelParams.none(), RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/batch-processing/{job_id}/cancel` Cancel a running batch processing job. Stops processing and marks pending items as cancelled. Items currently being processed may still complete. 
### Parameters - `BatchCancelParams params` - `Optional jobId` - `Optional organizationId` - `Optional projectId` - `Optional temporalNamespace` - `Optional reason` Optional reason for cancelling the job ### Returns - `class BatchCancelResponse:` Response after cancelling a batch job. - `String jobId` ID of the cancelled job - `String message` Confirmation message - `long processedItems` Number of items processed before cancellation - `Status status` New status (should be 'cancelled') - `PENDING("pending")` - `RUNNING("running")` - `DISPATCHED("dispatched")` - `COMPLETED("completed")` - `FAILED("failed")` - `CANCELLED("cancelled")` ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.batch.BatchCancelParams; import com.llamacloud_prod.api.models.beta.batch.BatchCancelResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); BatchCancelResponse response = client.beta().batch().cancel("job_id"); } } ``` #### Response ```json { "job_id": "job_id", "message": "message", "processed_items": 0, "status": "pending" } ``` # Job Items ## List Batch Job Items `JobItemListPage beta().batch().jobItems().list(JobItemListParams params = JobItemListParams.none(), RequestOptions requestOptions = RequestOptions.none())` **get** `/api/v1/beta/batch-processing/{job_id}/items` List items in a batch job with optional status filtering. Useful for finding failed items, viewing completed items, or debugging processing issues.
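As one illustration of the "finding failed items" use case, the sketch below picks the failed entries out of a page of items and maps each item ID to its error message. `FailedItems` and `Item` are hypothetical, simplified stand-ins, not SDK classes; their fields mirror `item_id`, `status`, and `error_message` from the response, and the placeholder text for a missing message is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of summarising failed batch items; not part of the SDK. */
final class FailedItems {
    private FailedItems() {}

    /** Hypothetical, trimmed-down view of one job item. */
    static final class Item {
        final String itemId;
        final String status;
        final String errorMessage; // may be null when no error was recorded

        Item(String itemId, String status, String errorMessage) {
            this.itemId = itemId;
            this.status = status;
            this.errorMessage = errorMessage;
        }
    }

    /** Returns item ID to error message for every item whose status is "failed", in input order. */
    static Map<String, String> errorsByItemId(List<Item> items) {
        Map<String, String> errors = new LinkedHashMap<>();
        for (Item item : items) {
            if ("failed".equals(item.status)) {
                errors.put(item.itemId,
                        item.errorMessage == null ? "(no error message)" : item.errorMessage);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Made-up items standing in for one page of results.
        List<Item> page = Arrays.asList(
                new Item("file-a", "completed", null),
                new Item("file-b", "failed", "timeout while parsing"),
                new Item("file-c", "skipped", null));
        System.out.println(errorsByItemId(page));
    }
}
```

When only failed items are of interest, passing the `status` filter to the endpoint avoids fetching the rest; a client-side pass like this is mainly useful when one listing feeds several reports.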
### Parameters

- `JobItemListParams params`
  - `Optional jobId`
  - `Optional limit` Maximum number of items to return
  - `Optional offset` Number of items to skip
  - `Optional organizationId`
  - `Optional projectId`
  - `Optional status` Filter items by status
    - `PENDING("pending")`
    - `PROCESSING("processing")`
    - `COMPLETED("completed")`
    - `FAILED("failed")`
    - `SKIPPED("skipped")`
    - `CANCELLED("cancelled")`

### Returns

- `class JobItemListResponse:` Detailed information about an item in a batch job.
  - `String itemId` ID of the item
  - `String itemName` Name of the item
  - `Status status` Processing status of this item
    - `PENDING("pending")`
    - `PROCESSING("processing")`
    - `COMPLETED("completed")`
    - `FAILED("failed")`
    - `SKIPPED("skipped")`
    - `CANCELLED("cancelled")`
  - `Optional completedAt` When processing completed for this item
  - `Optional effectiveAt`
  - `Optional errorMessage` Error message for the latest job attempt, if any.
  - `Optional jobId` Job ID for the underlying processing job (links to parse/extract job results)
  - `Optional jobRecordId` The job record ID associated with this status, if any.
  - `Optional skipReason` Reason item was skipped (e.g., 'already_processed', 'size_limit_exceeded')
  - `Optional startedAt` When processing started for this item

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.batch.jobitems.JobItemListPage;
import com.llamacloud_prod.api.models.beta.batch.jobitems.JobItemListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        JobItemListPage page = client.beta().batch().jobItems().list("job_id");
    }
}
```

#### Response

```json
{
  "items": [
    {
      "item_id": "item_id",
      "item_name": "item_name",
      "status": "pending",
      "completed_at": "2019-12-27T18:11:19.117Z",
      "effective_at": "2019-12-27T18:11:19.117Z",
      "error_message": "error_message",
      "job_id": "job_id",
      "job_record_id": "job_record_id",
      "skip_reason": "skip_reason",
      "started_at": "2019-12-27T18:11:19.117Z"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Get Item Processing Results

`JobItemGetProcessingResultsResponse beta().batch().jobItems().getProcessingResults(JobItemGetProcessingResultsParams params = JobItemGetProcessingResultsParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/batch-processing/items/{item_id}/processing-results`

Get all processing results for a specific item.

Returns the complete processing history for an item including what operations were performed, parameters used, and where outputs are stored. Optionally filter by `job_type`.
### Parameters

- `JobItemGetProcessingResultsParams params`
  - `Optional itemId`
  - `Optional jobType` Filter results by job type
    - `PARSE("parse")`
    - `EXTRACT("extract")`
    - `CLASSIFY("classify")`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class JobItemGetProcessingResultsResponse:` Response containing all processing results for an item.
  - `String itemId` ID of the source item
  - `String itemName` Name of the source item
  - `Optional> processingResults` List of all processing operations performed on this item
    - `String itemId` Source item that was processed
    - `JobConfig jobConfig` Job configuration used for processing
      - `class BatchParseJobRecordCreate:` Batch-specific parse job record for batch processing. This model contains the metadata and configuration for a batch parse job, but excludes file-specific information. It's used as input to the batch parent workflow and combined with DirectoryFile data to create full ParseJobRecordCreate instances for each file. Attributes: `job_name`: must be PARSE_RAW_FILE; `partitions`: partitions for job output location; `parameters`: generic parse configuration (BatchParseJobConfig); `session_id`: upstream request ID for tracking; `correlation_id`: correlation ID for cross-service tracking; `parent_job_execution_id`: parent job execution ID if nested; `user_id`: user who created the job; `project_id`: project this job belongs to; `webhook_url`: optional webhook URL for job completion notifications.
        - `Optional correlationId` The correlation ID for this job. Used for tracking the job across services.
        - `Optional jobName`
          - `PARSE_RAW_FILE_JOB("parse_raw_file_job")`
        - `Optional parameters` Generic parse job configuration for batch processing. This model contains the parsing configuration that applies to all files in a batch, but excludes file-specific fields like file_name, file_id, etc. Those file-specific fields are populated from DirectoryFile data when creating individual ParseJobRecordCreate instances for each file.
The fields in this model should be generic settings that apply uniformly to all files being processed in the batch. - `Optional adaptiveLongTable` - `Optional aggressiveTableExtraction` - `Optional annotateLinks` - `Optional autoMode` - `Optional autoModeConfigurationJson` - `Optional autoModeTriggerOnImageInPage` - `Optional autoModeTriggerOnRegexpInPage` - `Optional autoModeTriggerOnTableInPage` - `Optional autoModeTriggerOnTextInPage` - `Optional azureOpenAIApiVersion` - `Optional azureOpenAIDeploymentName` - `Optional azureOpenAIEndpoint` - `Optional azureOpenAIKey` - `Optional bboxBottom` - `Optional bboxLeft` - `Optional bboxRight` - `Optional bboxTop` - `Optional boundingBox` - `Optional compactMarkdownTable` - `Optional complementalFormattingInstruction` - `Optional contentGuidelineInstruction` - `Optional continuousMode` - `Optional customMetadata` The custom metadata to attach to the documents. - `Optional disableImageExtraction` - `Optional disableOcr` - `Optional disableReconstruction` - `Optional doNotCache` - `Optional doNotUnrollColumns` - `Optional enableCostOptimizer` - `Optional extractCharts` - `Optional extractLayout` - `Optional extractPrintedPageNumber` - `Optional fastMode` - `Optional formattingInstruction` - `Optional gpt4oApiKey` - `Optional gpt4oMode` - `Optional guessXlsxSheetName` - `Optional hideFooters` - `Optional hideHeaders` - `Optional highResOcr` - `Optional htmlMakeAllElementsVisible` - `Optional htmlRemoveFixedElements` - `Optional htmlRemoveNavigationElements` - `Optional httpProxy` - `Optional ignoreDocumentElementsForLayoutDetection` - `Optional> imagesToSave` - `SCREENSHOT("screenshot")` - `EMBEDDED("embedded")` - `LAYOUT("layout")` - `Optional inlineImagesInMarkdown` - `Optional inputS3Path` - `Optional inputS3Region` The region for the input S3 bucket. 
- `Optional inputUrl` - `Optional internalIsScreenshotJob` - `Optional invalidateCache` - `Optional isFormattingInstruction` - `Optional jobTimeoutExtraTimePerPageInSeconds` - `Optional jobTimeoutInSeconds` - `Optional keepPageSeparatorWhenMergingTables` - `Optional lang` The language. - `Optional> languages` - `AF("af")` - `AZ("az")` - `BS("bs")` - `CS("cs")` - `CY("cy")` - `DA("da")` - `DE("de")` - `EN("en")` - `ES("es")` - `ET("et")` - `FR("fr")` - `GA("ga")` - `HR("hr")` - `HU("hu")` - `ID("id")` - `IS("is")` - `IT("it")` - `KU("ku")` - `LA("la")` - `LT("lt")` - `LV("lv")` - `MI("mi")` - `MS("ms")` - `MT("mt")` - `NL("nl")` - `NO("no")` - `OC("oc")` - `PI("pi")` - `PL("pl")` - `PT("pt")` - `RO("ro")` - `RS_LATIN("rs_latin")` - `SK("sk")` - `SL("sl")` - `SQ("sq")` - `SV("sv")` - `SW("sw")` - `TL("tl")` - `TR("tr")` - `UZ("uz")` - `VI("vi")` - `AR("ar")` - `FA("fa")` - `UG("ug")` - `UR("ur")` - `BN("bn")` - `AS("as")` - `MNI("mni")` - `RU("ru")` - `RS_CYRILLIC("rs_cyrillic")` - `BE("be")` - `BG("bg")` - `UK("uk")` - `MN("mn")` - `ABQ("abq")` - `ADY("ady")` - `KBD("kbd")` - `AVA("ava")` - `DAR("dar")` - `INH("inh")` - `CHE("che")` - `LBE("lbe")` - `LEZ("lez")` - `TAB("tab")` - `TJK("tjk")` - `HI("hi")` - `MR("mr")` - `NE("ne")` - `BH("bh")` - `MAI("mai")` - `ANG("ang")` - `BHO("bho")` - `MAH("mah")` - `SCK("sck")` - `NEW("new")` - `GOM("gom")` - `SA("sa")` - `BGC("bgc")` - `TH("th")` - `CH_SIM("ch_sim")` - `CH_TRA("ch_tra")` - `JA("ja")` - `KO("ko")` - `TA("ta")` - `TE("te")` - `KN("kn")` - `Optional layoutAware` - `Optional lineLevelBoundingBox` - `Optional markdownTableMultilineHeaderSeparator` - `Optional maxPages` - `Optional maxPagesEnforced` - `Optional mergeTablesAcrossPagesInMarkdown` - `Optional model` - `Optional outlinedTableExtraction` - `Optional outputPdfOfDocument` - `Optional outputS3PathPrefix` If specified, llamaParse will save the output to the specified path. 
All output files will use this prefix, which should be a valid s3:// URL. - `Optional outputS3Region` The region for the output S3 bucket. - `Optional outputTablesAsHtml` - `Optional outputBucket` The output bucket. - `Optional pageErrorTolerance` - `Optional pageFooterPrefix` - `Optional pageFooterSuffix` - `Optional pageHeaderPrefix` - `Optional pageHeaderSuffix` - `Optional pagePrefix` - `Optional pageSeparator` - `Optional pageSuffix` - `Optional parseMode` Enum for representing the mode of parsing to be used. - `PARSE_PAGE_WITHOUT_LLM("parse_page_without_llm")` - `PARSE_PAGE_WITH_LLM("parse_page_with_llm")` - `PARSE_PAGE_WITH_LVM("parse_page_with_lvm")` - `PARSE_PAGE_WITH_AGENT("parse_page_with_agent")` - `PARSE_PAGE_WITH_LAYOUT_AGENT("parse_page_with_layout_agent")` - `PARSE_DOCUMENT_WITH_LLM("parse_document_with_llm")` - `PARSE_DOCUMENT_WITH_LVM("parse_document_with_lvm")` - `PARSE_DOCUMENT_WITH_AGENT("parse_document_with_agent")` - `Optional parsingInstruction` - `Optional pipelineId` The pipeline ID. - `Optional preciseBoundingBox` - `Optional premiumMode` - `Optional presentationOutOfBoundsContent` - `Optional presentationSkipEmbeddedData` - `Optional preserveLayoutAlignmentAcrossPages` - `Optional preserveVerySmallText` - `Optional preset` - `Optional priority` The priority for the request. This field may be ignored or overwritten depending on the organization tier. - `LOW("low")` - `MEDIUM("medium")` - `HIGH("high")` - `CRITICAL("critical")` - `Optional projectId` - `Optional removeHiddenText` - `Optional replaceFailedPageMode` Enum for representing the different available page error handling modes.
- `RAW_TEXT("raw_text")` - `BLANK_PAGE("blank_page")` - `ERROR_MESSAGE("error_message")` - `Optional replaceFailedPageWithErrorMessagePrefix` - `Optional replaceFailedPageWithErrorMessageSuffix` - `Optional resourceInfo` The resource info about the file - `Optional saveImages` - `Optional skipDiagonalText` - `Optional specializedChartParsingAgentic` - `Optional specializedChartParsingEfficient` - `Optional specializedChartParsingPlus` - `Optional specializedImageParsing` - `Optional spreadsheetExtractSubTables` - `Optional spreadsheetForceFormulaComputation` - `Optional spreadsheetIncludeHiddenSheets` - `Optional strictModeBuggyFont` - `Optional strictModeImageExtraction` - `Optional strictModeImageOcr` - `Optional strictModeReconstruction` - `Optional structuredOutput` - `Optional structuredOutputJsonSchema` - `Optional structuredOutputJsonSchemaName` - `Optional systemPrompt` - `Optional systemPromptAppend` - `Optional takeScreenshot` - `Optional targetPages` - `Optional tier` - `Optional type` - `PARSE("parse")` - `Optional useVendorMultimodalModel` - `Optional userPrompt` - `Optional vendorMultimodalApiKey` - `Optional vendorMultimodalModelName` - `Optional version` - `Optional> webhookConfigurations` Outbound webhook endpoints to notify on job status changes - `Optional> webhookEvents` Events to subscribe to (e.g. 'parse.success', 'extract.error'). If null, all events are delivered. 
- `EXTRACT_PENDING("extract.pending")` - `EXTRACT_SUCCESS("extract.success")` - `EXTRACT_ERROR("extract.error")` - `EXTRACT_PARTIAL_SUCCESS("extract.partial_success")` - `EXTRACT_CANCELLED("extract.cancelled")` - `PARSE_PENDING("parse.pending")` - `PARSE_RUNNING("parse.running")` - `PARSE_SUCCESS("parse.success")` - `PARSE_ERROR("parse.error")` - `PARSE_PARTIAL_SUCCESS("parse.partial_success")` - `PARSE_CANCELLED("parse.cancelled")` - `CLASSIFY_PENDING("classify.pending")` - `CLASSIFY_RUNNING("classify.running")` - `CLASSIFY_SUCCESS("classify.success")` - `CLASSIFY_ERROR("classify.error")` - `CLASSIFY_PARTIAL_SUCCESS("classify.partial_success")` - `CLASSIFY_CANCELLED("classify.cancelled")` - `SHEETS_PENDING("sheets.pending")` - `SHEETS_SUCCESS("sheets.success")` - `SHEETS_ERROR("sheets.error")` - `SHEETS_PARTIAL_SUCCESS("sheets.partial_success")` - `SHEETS_CANCELLED("sheets.cancelled")` - `UNMAPPED_EVENT("unmapped_event")` - `Optional webhookHeaders` Custom HTTP headers sent with each webhook request (e.g. auth tokens) - `Optional webhookOutputFormat` Response format sent to the webhook: 'string' (default) or 'json' - `Optional webhookUrl` URL to receive webhook POST notifications - `Optional webhookUrl` - `Optional parentJobExecutionId` The ID of the parent job execution. - `Optional partitions` The partitions for this execution. Used for determining where to save job output. - `Optional projectId` The ID of the project this job belongs to. - `Optional sessionId` The upstream request ID that created this job. Used for tracking the job across services. - `Optional userId` The ID of the user that created this job - `Optional webhookUrl` The URL that needs to be called at the end of the parsing job. - `class ClassifyJob:` A classify job. - `String id` Unique identifier - `String projectId` The ID of the project - `List rules` The rules to classify the files - `String description` Natural language description of what to classify. 
Be specific about the content characteristics that identify this document type. - `String type` The document type to assign when this rule matches (e.g., 'invoice', 'receipt', 'contract') - `StatusEnum status` The status of the classify job - `PENDING("PENDING")` - `SUCCESS("SUCCESS")` - `ERROR("ERROR")` - `PARTIAL_SUCCESS("PARTIAL_SUCCESS")` - `CANCELLED("CANCELLED")` - `String userId` The ID of the user - `Optional createdAt` Creation datetime - `Optional effectiveAt` - `Optional errorMessage` Error message for the latest job attempt, if any. - `Optional jobRecordId` The job record ID associated with this status, if any. - `Optional mode` The classification mode to use - `FAST("FAST")` - `MULTIMODAL("MULTIMODAL")` - `Optional parsingConfiguration` The configuration for the parsing job - `Optional lang` The language to parse the files in - `AF("af")` - `AZ("az")` - `BS("bs")` - `CS("cs")` - `CY("cy")` - `DA("da")` - `DE("de")` - `EN("en")` - `ES("es")` - `ET("et")` - `FR("fr")` - `GA("ga")` - `HR("hr")` - `HU("hu")` - `ID("id")` - `IS("is")` - `IT("it")` - `KU("ku")` - `LA("la")` - `LT("lt")` - `LV("lv")` - `MI("mi")` - `MS("ms")` - `MT("mt")` - `NL("nl")` - `NO("no")` - `OC("oc")` - `PI("pi")` - `PL("pl")` - `PT("pt")` - `RO("ro")` - `RS_LATIN("rs_latin")` - `SK("sk")` - `SL("sl")` - `SQ("sq")` - `SV("sv")` - `SW("sw")` - `TL("tl")` - `TR("tr")` - `UZ("uz")` - `VI("vi")` - `AR("ar")` - `FA("fa")` - `UG("ug")` - `UR("ur")` - `BN("bn")` - `AS("as")` - `MNI("mni")` - `RU("ru")` - `RS_CYRILLIC("rs_cyrillic")` - `BE("be")` - `BG("bg")` - `UK("uk")` - `MN("mn")` - `ABQ("abq")` - `ADY("ady")` - `KBD("kbd")` - `AVA("ava")` - `DAR("dar")` - `INH("inh")` - `CHE("che")` - `LBE("lbe")` - `LEZ("lez")` - `TAB("tab")` - `TJK("tjk")` - `HI("hi")` - `MR("mr")` - `NE("ne")` - `BH("bh")` - `MAI("mai")` - `ANG("ang")` - `BHO("bho")` - `MAH("mah")` - `SCK("sck")` - `NEW("new")` - `GOM("gom")` - `SA("sa")` - `BGC("bgc")` - `TH("th")` - `CH_SIM("ch_sim")` - `CH_TRA("ch_tra")` - 
`JA("ja")` - `KO("ko")` - `TA("ta")` - `TE("te")` - `KN("kn")` - `Optional maxPages` The maximum number of pages to parse - `Optional> targetPages` The pages to target for parsing (0-indexed, so first page is at 0) - `Optional updatedAt` Update datetime - `JobType jobType` Type of processing performed - `PARSE("parse")` - `EXTRACT("extract")` - `CLASSIFY("classify")` - `String outputS3Path` Location of the processing output - `String parametersHash` Content hash of the job configuration for dedup - `LocalDateTime processedAt` When this processing occurred - `String resultId` Unique identifier for this result - `Optional outputMetadata` Metadata about processing output. Currently empty - will be populated with job-type-specific metadata fields in the future. ### Example ```java package com.llamacloud_prod.api.example; import com.llamacloud_prod.api.client.LlamaCloudClient; import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient; import com.llamacloud_prod.api.models.beta.batch.jobitems.JobItemGetProcessingResultsParams; import com.llamacloud_prod.api.models.beta.batch.jobitems.JobItemGetProcessingResultsResponse; public final class Main { private Main() {} public static void main(String[] args) { LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv(); JobItemGetProcessingResultsResponse response = client.beta().batch().jobItems().getProcessingResults("item_id"); } } ``` #### Response ```json { "item_id": "item_id", "item_name": "item_name", "processing_results": [ { "item_id": "item_id", "job_config": { "correlation_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "job_name": "parse_raw_file_job", "parameters": { "adaptive_long_table": true, "aggressive_table_extraction": true, "annotate_links": true, "auto_mode": true, "auto_mode_configuration_json": "auto_mode_configuration_json", "auto_mode_trigger_on_image_in_page": true, "auto_mode_trigger_on_regexp_in_page": "auto_mode_trigger_on_regexp_in_page", "auto_mode_trigger_on_table_in_page": true, 
"auto_mode_trigger_on_text_in_page": "auto_mode_trigger_on_text_in_page", "azure_openai_api_version": "azure_openai_api_version", "azure_openai_deployment_name": "azure_openai_deployment_name", "azure_openai_endpoint": "azure_openai_endpoint", "azure_openai_key": "azure_openai_key", "bbox_bottom": 0, "bbox_left": 0, "bbox_right": 0, "bbox_top": 0, "bounding_box": "bounding_box", "compact_markdown_table": true, "complemental_formatting_instruction": "complemental_formatting_instruction", "content_guideline_instruction": "content_guideline_instruction", "continuous_mode": true, "custom_metadata": { "foo": "bar" }, "disable_image_extraction": true, "disable_ocr": true, "disable_reconstruction": true, "do_not_cache": true, "do_not_unroll_columns": true, "enable_cost_optimizer": true, "extract_charts": true, "extract_layout": true, "extract_printed_page_number": true, "fast_mode": true, "formatting_instruction": "formatting_instruction", "gpt4o_api_key": "gpt4o_api_key", "gpt4o_mode": true, "guess_xlsx_sheet_name": true, "hide_footers": true, "hide_headers": true, "high_res_ocr": true, "html_make_all_elements_visible": true, "html_remove_fixed_elements": true, "html_remove_navigation_elements": true, "http_proxy": "http_proxy", "ignore_document_elements_for_layout_detection": true, "images_to_save": [ "screenshot" ], "inline_images_in_markdown": true, "input_s3_path": "input_s3_path", "input_s3_region": "input_s3_region", "input_url": "input_url", "internal_is_screenshot_job": true, "invalidate_cache": true, "is_formatting_instruction": true, "job_timeout_extra_time_per_page_in_seconds": 0, "job_timeout_in_seconds": 0, "keep_page_separator_when_merging_tables": true, "lang": "lang", "languages": [ "af" ], "layout_aware": true, "line_level_bounding_box": true, "markdown_table_multiline_header_separator": "markdown_table_multiline_header_separator", "max_pages": 0, "max_pages_enforced": 0, "merge_tables_across_pages_in_markdown": true, "model": "model", 
"outlined_table_extraction": true, "output_pdf_of_document": true, "output_s3_path_prefix": "output_s3_path_prefix", "output_s3_region": "output_s3_region", "output_tables_as_HTML": true, "outputBucket": "outputBucket", "page_error_tolerance": 0, "page_footer_prefix": "page_footer_prefix", "page_footer_suffix": "page_footer_suffix", "page_header_prefix": "page_header_prefix", "page_header_suffix": "page_header_suffix", "page_prefix": "page_prefix", "page_separator": "page_separator", "page_suffix": "page_suffix", "parse_mode": "parse_page_without_llm", "parsing_instruction": "parsing_instruction", "pipeline_id": "pipeline_id", "precise_bounding_box": true, "premium_mode": true, "presentation_out_of_bounds_content": true, "presentation_skip_embedded_data": true, "preserve_layout_alignment_across_pages": true, "preserve_very_small_text": true, "preset": "preset", "priority": "low", "project_id": "project_id", "remove_hidden_text": true, "replace_failed_page_mode": "raw_text", "replace_failed_page_with_error_message_prefix": "replace_failed_page_with_error_message_prefix", "replace_failed_page_with_error_message_suffix": "replace_failed_page_with_error_message_suffix", "resource_info": { "foo": "bar" }, "save_images": true, "skip_diagonal_text": true, "specialized_chart_parsing_agentic": true, "specialized_chart_parsing_efficient": true, "specialized_chart_parsing_plus": true, "specialized_image_parsing": true, "spreadsheet_extract_sub_tables": true, "spreadsheet_force_formula_computation": true, "spreadsheet_include_hidden_sheets": true, "strict_mode_buggy_font": true, "strict_mode_image_extraction": true, "strict_mode_image_ocr": true, "strict_mode_reconstruction": true, "structured_output": true, "structured_output_json_schema": "structured_output_json_schema", "structured_output_json_schema_name": "structured_output_json_schema_name", "system_prompt": "system_prompt", "system_prompt_append": "system_prompt_append", "take_screenshot": true, "target_pages": 
"target_pages", "tier": "tier", "type": "parse", "use_vendor_multimodal_model": true, "user_prompt": "user_prompt", "vendor_multimodal_api_key": "vendor_multimodal_api_key", "vendor_multimodal_model_name": "vendor_multimodal_model_name", "version": "version", "webhook_configurations": [ { "webhook_events": [ "parse.success", "parse.error" ], "webhook_headers": { "Authorization": "Bearer sk-..." }, "webhook_output_format": "json", "webhook_url": "https://example.com/webhooks/llamacloud" } ], "webhook_url": "webhook_url" }, "parent_job_execution_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "partitions": { "foo": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e" }, "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "session_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e", "user_id": "user_id", "webhook_url": "webhook_url" }, "job_type": "parse", "output_s3_path": "output_s3_path", "parameters_hash": "parameters_hash", "processed_at": "2019-12-27T18:11:19.117Z", "result_id": "result_id", "output_metadata": {} } ] } ``` # Split ## Create Split Job `SplitCreateResponse beta().split().create(SplitCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())` **post** `/api/v1/beta/split/jobs` Create a document split job. ### Parameters - `SplitCreateParams params` - `Optional organizationId` - `Optional projectId` - `SplitDocumentInput documentInput` Document to be split. - `Optional configuration` Split configuration with categories and splitting strategy. - `List categories` Categories to split documents into. - `String name` Name of the category. - `Optional description` Optional description of what content belongs in this category. - `Optional splittingStrategy` Strategy for splitting documents. - `Optional allowUncategorized` Controls handling of pages that don't match any category. 'include': pages can be grouped as 'uncategorized' and included in results. 'forbid': all pages must be assigned to a defined category. 
'omit': pages can be classified as 'uncategorized' but are excluded from results.
  - `INCLUDE("include")`
  - `FORBID("forbid")`
  - `OMIT("omit")`
- `Optional configurationId` Saved split configuration ID.

### Returns

- `class SplitCreateResponse:` Beta response: uses nested document_input object.
  - `String id` Unique identifier for the split job.
  - `List categories` Categories used for splitting.
    - `String name` Name of the category.
    - `Optional description` Optional description of what content belongs in this category.
  - `SplitDocumentInput documentInput` Document that was split.
    - `String type` Type of document input. Valid values are: file_id
    - `String value` Document identifier.
  - `String projectId` Project ID this job belongs to.
  - `String status` Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.
  - `String userId` User ID who created this job.
  - `Optional configurationId` Split configuration ID used for this job.
  - `Optional createdAt` Creation datetime
  - `Optional errorMessage` Error message if the job failed.
  - `Optional result` Result of a completed split job.
    - `List segments` List of document segments.
      - `String category` Category name this split belongs to.
      - `String confidenceCategory` Categorical confidence level. Valid values are: high, medium, low.
      - `List pages` 1-indexed page numbers in this split.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitCreateParams;
import com.llamacloud_prod.api.models.beta.split.SplitCreateResponse;
import com.llamacloud_prod.api.models.beta.split.SplitDocumentInput;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitCreateParams params = SplitCreateParams.builder()
            .documentInput(SplitDocumentInput.builder()
                .type("type")
                .value("value")
                .build())
            .build();
        SplitCreateResponse split = client.beta().split().create(params);
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## List Split Jobs

`SplitListPage beta().split().list(SplitListParams params = SplitListParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs`

List document split jobs.
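
This endpoint is paginated with a page token: the request accepts `pageToken`, and the response carries `next_page_token`. A minimal sketch of following those tokens is shown here, under loudly stated assumptions: `PageTokenSketch`, `Page`, `collectAll`, and `fakeFetch` are hypothetical names, the fetch step is an in-memory fake rather than an HTTP call, and the loop assumes that a missing (null) next-page token marks the last page, which this reference does not state explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of token-based pagination; it does not use the SDK.
final class PageTokenSketch {
    // Hypothetical page shape: a batch of job IDs plus the token for the next page.
    static final class Page {
        final List<String> items;
        final String nextPageToken; // ASSUMPTION: null on the last page

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Follows next-page tokens until none is returned, collecting every item.
    // `fetch` receives the current token (null for the first request).
    static List<String> collectAll(Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = fetch.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    // In-memory fake standing in for the HTTP call: two pages of invented data.
    static Page fakeFetch(String token) {
        if (token == null) {
            return new Page(Arrays.asList("job-1", "job-2"), "t2");
        }
        return new Page(Arrays.asList("job-3"), null);
    }

    public static void main(String[] args) {
        // prints [job-1, job-2, job-3]
        System.out.println(collectAll(PageTokenSketch::fakeFetch));
    }
}
```
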
### Parameters

- `SplitListParams params`
  - `Optional createdAtOnOrAfter` Include items created at or after this timestamp (inclusive)
  - `Optional createdAtOnOrBefore` Include items created at or before this timestamp (inclusive)
  - `Optional> jobIds` Filter by specific job IDs
  - `Optional organizationId`
  - `Optional pageSize`
  - `Optional pageToken`
  - `Optional projectId`
  - `Optional status` Filter by job status (pending, processing, completed, failed, cancelled)
    - `PENDING("pending")`
    - `PROCESSING("processing")`
    - `COMPLETED("completed")`
    - `FAILED("failed")`
    - `CANCELLED("cancelled")`

### Returns

- `class SplitListResponse:` Beta response: uses nested document_input object.
  - `String id` Unique identifier for the split job.
  - `List categories` Categories used for splitting.
    - `String name` Name of the category.
    - `Optional description` Optional description of what content belongs in this category.
  - `SplitDocumentInput documentInput` Document that was split.
    - `String type` Type of document input. Valid values are: file_id
    - `String value` Document identifier.
  - `String projectId` Project ID this job belongs to.
  - `String status` Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.
  - `String userId` User ID who created this job.
  - `Optional configurationId` Split configuration ID used for this job.
  - `Optional createdAt` Creation datetime
  - `Optional errorMessage` Error message if the job failed.
  - `Optional result` Result of a completed split job.
    - `List segments` List of document segments.
      - `String category` Category name this split belongs to.
      - `String confidenceCategory` Categorical confidence level. Valid values are: high, medium, low.
      - `List pages` 1-indexed page numbers in this split.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitListPage;
import com.llamacloud_prod.api.models.beta.split.SplitListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitListPage page = client.beta().split().list();
    }
}
```

#### Response

```json
{
  "items": [
    {
      "id": "id",
      "categories": [
        {
          "name": "x",
          "description": "x"
        }
      ],
      "document_input": {
        "type": "type",
        "value": "value"
      },
      "project_id": "project_id",
      "status": "status",
      "user_id": "user_id",
      "configuration_id": "configuration_id",
      "created_at": "2019-12-27T18:11:19.117Z",
      "error_message": "error_message",
      "result": {
        "segments": [
          {
            "category": "category",
            "confidence_category": "confidence_category",
            "pages": [
              0
            ]
          }
        ]
      },
      "updated_at": "2019-12-27T18:11:19.117Z"
    }
  ],
  "next_page_token": "next_page_token",
  "total_size": 0
}
```

## Get Split Job

`SplitGetResponse beta().split().get(SplitGetParams params = SplitGetParams.none(), RequestOptions requestOptions = RequestOptions.none())`

**get** `/api/v1/beta/split/jobs/{split_job_id}`

Get a document split job.

### Parameters

- `SplitGetParams params`
  - `Optional splitJobId`
  - `Optional organizationId`
  - `Optional projectId`

### Returns

- `class SplitGetResponse:` Beta response: uses nested document_input object.
  - `String id` Unique identifier for the split job.
  - `List categories` Categories used for splitting.
    - `String name` Name of the category.
    - `Optional description` Optional description of what content belongs in this category.
  - `SplitDocumentInput documentInput` Document that was split.
    - `String type` Type of document input. Valid values are: file_id
    - `String value` Document identifier.
  - `String projectId` Project ID this job belongs to.
  - `String status` Current status of the job. Valid values are: pending, processing, completed, failed, cancelled.
  - `String userId` User ID who created this job.
  - `Optional configurationId` Split configuration ID used for this job.
  - `Optional createdAt` Creation datetime
  - `Optional errorMessage` Error message if the job failed.
  - `Optional result` Result of a completed split job.
    - `List segments` List of document segments.
      - `String category` Category name this split belongs to.
      - `String confidenceCategory` Categorical confidence level. Valid values are: high, medium, low.
      - `List pages` 1-indexed page numbers in this split.
  - `Optional updatedAt` Update datetime

### Example

```java
package com.llamacloud_prod.api.example;

import com.llamacloud_prod.api.client.LlamaCloudClient;
import com.llamacloud_prod.api.client.okhttp.LlamaCloudOkHttpClient;
import com.llamacloud_prod.api.models.beta.split.SplitGetParams;
import com.llamacloud_prod.api.models.beta.split.SplitGetResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        LlamaCloudClient client = LlamaCloudOkHttpClient.fromEnv();

        SplitGetResponse split = client.beta().split().get("split_job_id");
    }
}
```

#### Response

```json
{
  "id": "id",
  "categories": [
    {
      "name": "x",
      "description": "x"
    }
  ],
  "document_input": {
    "type": "type",
    "value": "value"
  },
  "project_id": "project_id",
  "status": "status",
  "user_id": "user_id",
  "configuration_id": "configuration_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "result": {
    "segments": [
      {
        "category": "category",
        "confidence_category": "confidence_category",
        "pages": [
          0
        ]
      }
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
```

## Domain Types

### Split Category

- `class SplitCategory:` Category definition for document splitting.
  - `String name` Name of the category.
  - `Optional description` Optional description of what content belongs in this category.
### Split Document Input

- `class SplitDocumentInput:` Document input specification for beta API.
  - `String type` Type of document input. Valid values are: file_id
  - `String value` Document identifier.

### Split Result Response

- `class SplitResultResponse:` Result of a completed split job.
  - `List segments` List of document segments.
    - `String category` Category name this split belongs to.
    - `String confidenceCategory` Categorical confidence level. Valid values are: high, medium, low.
    - `List pages` 1-indexed page numbers in this split.

### Split Segment Response

- `class SplitSegmentResponse:` A segment of the split document.
  - `String category` Category name this split belongs to.
  - `String confidenceCategory` Categorical confidence level. Valid values are: high, medium, low.
  - `List pages` 1-indexed page numbers in this split.
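
Because segment pages are documented as 1-indexed and each segment carries a categorical confidence level, consumers often need two small helpers: one to convert page numbers to 0-based offsets and one to flag low-confidence segments for review. The sketch below illustrates both under stated assumptions: `SegmentSketch` and `Segment` are hypothetical stand-ins (not SDK classes), and the sample categories are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative helpers for working with split segments; not part of the SDK.
final class SegmentSketch {
    // Hypothetical stand-in for a split segment.
    static final class Segment {
        final String category;
        final String confidenceCategory; // "high", "medium" or "low"
        final List<Integer> pages;       // 1-indexed page numbers

        Segment(String category, String confidenceCategory, List<Integer> pages) {
            this.category = category;
            this.confidenceCategory = confidenceCategory;
            this.pages = pages;
        }
    }

    // Converts 1-indexed page numbers to 0-based offsets, e.g. for array access.
    static List<Integer> toZeroBased(List<Integer> pages) {
        List<Integer> offsets = new ArrayList<>();
        for (int page : pages) {
            offsets.add(page - 1);
        }
        return offsets;
    }

    // Collects the categories of segments reported with "low" confidence.
    static List<String> lowConfidenceCategories(List<Segment> segments) {
        List<String> categories = new ArrayList<>();
        for (Segment segment : segments) {
            if ("low".equals(segment.confidenceCategory)) {
                categories.add(segment.category);
            }
        }
        return categories;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment("invoice", "high", Arrays.asList(1, 2)),
                new Segment("receipt", "low", Arrays.asList(3)));
        System.out.println(toZeroBased(segments.get(0).pages)); // prints [0, 1]
        System.out.println(lowConfidenceCategories(segments));  // prints [receipt]
    }
}
```
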