Create Classify Job

classifier.jobs.create() -> ClassifyJob

POST /api/v1/classifier/jobs

Create a classify job. Experimental: not production-ready and subject to change.

Parameters

file_ids: Sequence[str]

The IDs of the files to classify

rules: Iterable[ClassifierRuleParam]

The rules used to classify the files

description: str

Natural language description of what to classify. Be specific about the content characteristics that identify this document type.

maxLength: 2000

minLength: 10

type: str

The document type to assign when this rule matches (e.g., ‘invoice’, ‘receipt’, ‘contract’)

maxLength: 50

minLength: 1
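The length limits on `description` and `type` can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: `validate_rule` is a hypothetical helper, and it assumes the limits are counted in Python string characters.

```python
from typing import Dict, List

# Limits copied from the parameter reference above.
DESCRIPTION_MIN, DESCRIPTION_MAX = 10, 2000
TYPE_MIN, TYPE_MAX = 1, 50


def validate_rule(rule: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one rule dict (empty if none)."""
    problems = []
    description = rule.get("description", "")
    doc_type = rule.get("type", "")
    if not DESCRIPTION_MIN <= len(description) <= DESCRIPTION_MAX:
        problems.append(
            f"description must be {DESCRIPTION_MIN}-{DESCRIPTION_MAX} "
            f"characters, got {len(description)}"
        )
    if not TYPE_MIN <= len(doc_type) <= TYPE_MAX:
        problems.append(
            f"type must be {TYPE_MIN}-{TYPE_MAX} characters, got {len(doc_type)}"
        )
    return problems


rules = [
    {
        "description": "contains invoice number, line items, and total amount",
        "type": "invoice",
    },
    # This description has 9 characters, one short of the minimum.
    {"description": "too short", "type": "receipt"},
]
for rule in rules:
    print(rule["type"], validate_rule(rule))
```

Checking locally only gives earlier feedback; the service remains the authority on what it accepts.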

organization_id: Optional[str]

project_id: Optional[str]

mode: Optional[Literal["FAST", "MULTIMODAL"]]

The classification mode to use

One of the following:

"FAST"

"MULTIMODAL"

parsing_configuration: Optional[ClassifyParsingConfigurationParam]

The configuration for the parsing job

lang: Optional[ParsingLanguages]

The language to parse the files in

One of the following:

"abq"

"ady"

"af"

"ang"

"ar"

"as"

"ava"

"az"

"be"

"bg"

"bgc"

"bh"

"bho"

"bn"

"bs"

"ch_sim"

"ch_tra"

"che"

"cs"

"cy"

"da"

"dar"

"de"

"en"

"es"

"et"

"fa"

"fr"

"ga"

"gom"

"hi"

"hr"

"hu"

"id"

"inh"

"is"

"it"

"ja"

"kbd"

"kn"

"ko"

"ku"

"la"

"lbe"

"lez"

"lt"

"lv"

"mah"

"mai"

"mi"

"mn"

"mni"

"mr"

"ms"

"mt"

"ne"

"new"

"nl"

"no"

"oc"

"pi"

"pl"

"pt"

"ro"

"rs_cyrillic"

"rs_latin"

"ru"

"sa"

"sck"

"sk"

"sl"

"sq"

"sv"

"sw"

"ta"

"tab"

"te"

"th"

"tjk"

"tl"

"tr"

"ug"

"uk"

"ur"

"uz"

"vi"

max_pages: Optional[int]

The maximum number of pages to parse

target_pages: Optional[List[int]]

The pages to target for parsing (0-indexed, so first page is at 0)
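Because `target_pages` is 0-indexed, page numbers read from a document viewer (which usually start at 1) need to be shifted by one. The sketch below shows one way to do that and to assemble the keyword arguments for `classifier.jobs.create()`. `to_target_pages` and `build_job_params` are hypothetical helpers, not SDK functions; the dictionary keys are the parameter names listed above.

```python
from typing import Any, Dict, Iterable, List


def to_target_pages(one_based_pages: Iterable[int]) -> List[int]:
    """Convert 1-based page numbers to sorted, de-duplicated 0-indexed pages."""
    pages = sorted(set(one_based_pages))
    if pages and pages[0] < 1:
        raise ValueError("page numbers must start at 1")
    return [page - 1 for page in pages]


def build_job_params(
    file_ids: Iterable[str],
    rules: Iterable[Dict[str, str]],
    one_based_pages: Iterable[int],
    lang: str = "en",
    mode: str = "FAST",
) -> Dict[str, Any]:
    """Assemble keyword arguments using the parameter names from this page."""
    return {
        "file_ids": list(file_ids),
        "rules": list(rules),
        "mode": mode,
        "parsing_configuration": {
            "lang": lang,
            "target_pages": to_target_pages(one_based_pages),
        },
    }


params = build_job_params(
    file_ids=["182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e"],
    rules=[
        {
            "description": "contains invoice number, line items, and total amount",
            "type": "invoice",
        }
    ],
    one_based_pages=[1, 3],
)
print(params["parsing_configuration"]["target_pages"])  # [0, 2]

# With a configured client, the dict can be unpacked into the call:
# classify_job = client.classifier.jobs.create(**params)
```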

webhook_configurations: Optional[Iterable[WebhookConfiguration]]

List of webhook configurations for notifications

webhook_events: Optional[Sequence[str]]

Events that trigger this webhook. Options: 'parse.success' (job completed), 'parse.error' (job failed), 'parse.partial_success' (some pages failed), 'parse.pending', 'parse.running', 'parse.cancelled'. If not specified, the webhook fires for all events.

webhook_headers: Optional[Dict[str, object]]

Custom HTTP headers to include in webhook requests. Use for authentication tokens or custom routing. Example: {'Authorization': 'Bearer xyz'}

webhook_output_format: Optional[Literal["json", "string"]]

Format of the webhook payload body. ‘string’ (default) sends the payload as a JSON-encoded string; ‘json’ sends it as a JSON object.

One of the following:

"json"

"string"

webhook_signing_secret: Optional[str]

Shared signing secret used to sign webhook deliveries. When set, each request includes an HMAC-SHA256 signature of the request body in the ‘LC-Signature’ header (value ‘sha256=’). Recompute the HMAC over the raw request body with this secret to verify the delivery is authentic.

webhook_url: Optional[str]

HTTPS URL to receive webhook POST requests. Must be publicly accessible
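On the receiving side, a delivery signed with `webhook_signing_secret` can be checked by recomputing the HMAC-SHA256 over the raw request body, as described above. The following is a sketch under stated assumptions: it assumes the `LC-Signature` header value is `sha256=` followed by the hex-encoded digest, which this page does not spell out, and `sign_body` and `verify_signature` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_body(raw_body: bytes, secret: str) -> str:
    """Build the header value this sketch expects.

    Assumption: 'sha256=' followed by the hex HMAC-SHA256 of the raw body.
    """
    digest = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Return True if header_value matches the HMAC recomputed from raw_body."""
    expected = sign_body(raw_body, secret)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), header_value.encode("utf-8")
    )


secret = "example-secret"  # placeholder value for the demo
body = b'{"event": "parse.success"}'
header = sign_body(body, secret)

print(verify_signature(body, header, secret))         # True
print(verify_signature(body + b" ", header, secret))  # False
```

The check must run on the bytes exactly as received; re-serializing parsed JSON can change whitespace or key order and produce a different digest.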

Returns

class ClassifyJob: …

A classify job.

id: str

Unique identifier

format: uuid

project_id: str

The ID of the project

format: uuid

rules: List[ClassifierRule]

The rules used to classify the files

description: str

Natural language description of what to classify. Be specific about the content characteristics that identify this document type.

maxLength: 2000

minLength: 10

type: str

The document type to assign when this rule matches (e.g., ‘invoice’, ‘receipt’, ‘contract’)

maxLength: 50

minLength: 1

status: StatusEnum

The status of the classify job

One of the following:

"CANCELLED"

"ERROR"

"PARTIAL_SUCCESS"

"PENDING"

"SUCCESS"

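A caller that does not use webhooks will typically poll until `status` leaves `"PENDING"`. The sketch below shows such a loop. It assumes every status other than `"PENDING"` is final, which this page does not state explicitly, and it takes a hypothetical `fetch_status` callable instead of naming a specific SDK method for retrieving a job.

```python
import time
from typing import Callable

# Assumption: all listed statuses except PENDING are final.
TERMINAL_STATUSES = {"CANCELLED", "ERROR", "PARTIAL_SUCCESS", "SUCCESS"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
) -> str:
    """Call fetch_status until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status} after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for a real status lookup: two PENDING results, then SUCCESS.
fake_statuses = iter(["PENDING", "PENDING", "SUCCESS"])
print(wait_for_job(lambda: next(fake_statuses), interval_s=0.01))  # SUCCESS
```

A `"PARTIAL_SUCCESS"` or `"ERROR"` result is returned rather than raised, so the caller decides how to handle it, for example by inspecting `error_message`.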
user_id: str

The ID of the user

created_at: Optional[datetime]

Creation datetime

format: date-time

effective_at: Optional[datetime]

error_message: Optional[str]

Error message for the latest job attempt, if any.

job_record_id: Optional[str]

The job record ID associated with this status, if any.

mode: Optional[Literal["FAST", "MULTIMODAL"]]

The classification mode to use

One of the following:

"FAST"

"MULTIMODAL"

parsing_configuration: Optional[ClassifyParsingConfiguration]

The configuration for the parsing job

lang: Optional[ParsingLanguages]

The language to parse the files in

One of the following:

"abq"

"ady"

"af"

"ang"

"ar"

"as"

"ava"

"az"

"be"

"bg"

"bgc"

"bh"

"bho"

"bn"

"bs"

"ch_sim"

"ch_tra"

"che"

"cs"

"cy"

"da"

"dar"

"de"

"en"

"es"

"et"

"fa"

"fr"

"ga"

"gom"

"hi"

"hr"

"hu"

"id"

"inh"

"is"

"it"

"ja"

"kbd"

"kn"

"ko"

"ku"

"la"

"lbe"

"lez"

"lt"

"lv"

"mah"

"mai"

"mi"

"mn"

"mni"

"mr"

"ms"

"mt"

"ne"

"new"

"nl"

"no"

"oc"

"pi"

"pl"

"pt"

"ro"

"rs_cyrillic"

"rs_latin"

"ru"

"sa"

"sck"

"sk"

"sl"

"sq"

"sv"

"sw"

"ta"

"tab"

"te"

"th"

"tjk"

"tl"

"tr"

"ug"

"uk"

"ur"

"uz"

"vi"

max_pages: Optional[int]

The maximum number of pages to parse

target_pages: Optional[List[int]]

The pages to target for parsing (0-indexed, so first page is at 0)

updated_at: Optional[datetime]

Update datetime

format: date-time

Create Classify Job

import os
from llama_cloud import LlamaCloud

client = LlamaCloud(
    api_key=os.environ.get("LLAMA_CLOUD_API_KEY"),  # This is the default and can be omitted
)
classify_job = client.classifier.jobs.create(
    file_ids=["182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e"],
    rules=[{
        "description": "contains invoice number, line items, and total amount",
        "type": "invoice",
    }],
)
print(classify_job.id)

{
  "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  "rules": [
    {
      "description": "contains invoice number, line items, and total amount",
      "type": "invoice"
    }
  ],
  "status": "CANCELLED",
  "user_id": "user_id",
  "created_at": "2019-12-27T18:11:19.117Z",
  "effective_at": "2019-12-27T18:11:19.117Z",
  "error_message": "error_message",
  "job_record_id": "job_record_id",
  "mode": "FAST",
  "parsing_configuration": {
    "lang": "abq",
    "max_pages": 0,
    "target_pages": [
      0
    ]
  },
  "updated_at": "2019-12-27T18:11:19.117Z"
}
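When the endpoint is called over plain HTTP rather than through the SDK, the timestamps arrive as ISO 8601 strings with a trailing `Z`, as in the response above. The following sketch parses such a payload with the standard library only. `parse_timestamp` is a hypothetical helper, and the payload is a trimmed copy of the example response.

```python
import json
from datetime import datetime
from typing import Optional

raw = """
{
  "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  "status": "CANCELLED",
  "created_at": "2019-12-27T18:11:19.117Z",
  "effective_at": null,
  "parsing_configuration": {"lang": "abq", "max_pages": 0, "target_pages": [0]}
}
"""


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp; optional fields may be null."""
    if value is None:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


job = json.loads(raw)
created_at = parse_timestamp(job.get("created_at"))
effective_at = parse_timestamp(job.get("effective_at"))

print(job["status"], created_at, effective_at)
```

The SDK's `ClassifyJob` already types these fields as `Optional[datetime]`, so this conversion is only relevant for raw JSON.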
