Get Data Source
Get a data source by ID.
Parameters
Returns
DataSource { id, component, name, 6 more }
Schema for a data source.
id: string
Unique identifier
component: Record<string, unknown> | CloudS3DataSource { bucket, aws_access_id, aws_access_secret, 5 more } | CloudAzStorageBlobDataSource { account_url, container_name, account_key, 8 more } | 8 more
Component that implements the data source
CloudS3DataSource { bucket, aws_access_id, aws_access_secret, 5 more }
bucket: string
The name of the S3 bucket to read from.
aws_access_id?: string | null
The AWS access ID to use for authentication.
aws_access_secret?: string | null
The AWS access secret to use for authentication.
prefix?: string | null
The prefix of the S3 objects to read from.
regex_pattern?: string | null
The regex pattern to filter S3 objects. Must be a valid regex pattern.
s3_endpoint_url?: string | null
The S3 endpoint URL to use for authentication.
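To illustrate how `prefix` and `regex_pattern` could narrow the set of objects read, here is a minimal sketch. The helper `filterKeys` and the `S3Filter` type are hypothetical and not part of the SDK. It assumes the regex is tested against the full object key and that both filters must pass; this reference does not confirm either behavior.

```typescript
// Hypothetical helper for illustration only; not part of the LlamaCloud SDK.
interface S3Filter {
  prefix?: string | null;
  regex_pattern?: string | null;
}

function filterKeys(keys: string[], filter: S3Filter): string[] {
  // Compile the pattern once; an invalid pattern throws a SyntaxError here.
  const regex = filter.regex_pattern ? new RegExp(filter.regex_pattern) : null;
  const prefix = filter.prefix || "";
  return keys.filter((key) => {
    // Assumption: the prefix must match the start of the full object key.
    if (key.slice(0, prefix.length) !== prefix) return false;
    // Assumption: the regex is tested against the full object key.
    if (regex && !regex.test(key)) return false;
    return true;
  });
}

const objectKeys = ["reports/2024/q1.pdf", "reports/2024/q1.tmp", "archive/old.pdf"];
const selectedKeys = filterKeys(objectKeys, { prefix: "reports/", regex_pattern: "\\.pdf$" });
console.log(selectedKeys);
```

Only `reports/2024/q1.pdf` passes both filters in this sample.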
CloudAzStorageBlobDataSource { account_url, container_name, account_key, 8 more }
account_url: string
The Azure Storage Blob account URL to use for authentication.
container_name: string
The name of the Azure Storage Blob container to read from.
account_key?: string | null
The Azure Storage Blob account key to use for authentication.
account_name?: string | null
The Azure Storage Blob account name to use for authentication.
blob?: string | null
The blob name to read from.
client_id?: string | null
The Azure AD client ID to use for authentication.
client_secret?: string | null
The Azure AD client secret to use for authentication.
prefix?: string | null
The prefix of the Azure Storage Blob objects to read from.
tenant_id?: string | null
The Azure AD tenant ID to use for authentication.
CloudOneDriveDataSource { client_id, client_secret, tenant_id, 6 more }
client_id: string
The client ID to use for authentication.
client_secret: string
The client secret to use for authentication.
tenant_id: string
The tenant ID to use for authentication.
user_principal_name: string
The user principal name to use for authentication.
folder_id?: string | null
The ID of the OneDrive folder to read from.
folder_path?: string | null
The path of the OneDrive folder to read from.
required_exts?: Array<string> | null
The list of required file extensions.
CloudSharepointDataSource { client_id, client_secret, tenant_id, 11 more }
client_id: string
The client ID to use for authentication.
client_secret: string
The client secret to use for authentication.
tenant_id: string
The tenant ID to use for authentication.
drive_name?: string | null
The name of the SharePoint drive to read from.
exclude_path_patterns?: Array<string> | null
List of regex patterns for file paths to exclude. Files whose paths (including filename) match any pattern will be excluded. Example: ['/temp/', '/backup/', '.git/', '.tmp$', '^~']
folder_id?: string | null
The ID of the SharePoint folder to read from.
folder_path?: string | null
The path of the SharePoint folder to read from.
get_permissions?: boolean | null
Whether to get permissions for the SharePoint site.
include_path_patterns?: Array<string> | null
List of regex patterns for file paths to include. Full paths (including filename) must match at least one pattern to be included. Example: ['/reports/', '/docs/.*\.pdf$', '^Report.*\.pdf$']
required_exts?: Array<string> | null
The list of required file extensions.
site_id?: string | null
The ID of the SharePoint site to download from.
site_name?: string | null
The name of the SharePoint site to download from.
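The `include_path_patterns` and `exclude_path_patterns` rules described above can be sketched as follows. The function `isPathSelected` is hypothetical and not part of the SDK. The descriptions state that a path matching any exclude pattern is dropped and that a path must match at least one include pattern; the order of evaluation (exclude first) and the treatment of an empty include list as "no filter" are assumptions.

```typescript
// Hypothetical helper for illustration only; not part of the LlamaCloud SDK.
function isPathSelected(
  path: string,
  includePatterns: string[] | null,
  excludePatterns: string[] | null,
): boolean {
  const matchesAny = (patterns: string[]): boolean =>
    patterns.some((pattern) => new RegExp(pattern).test(path));

  // Assumption: exclusion is checked first and wins over inclusion.
  if (excludePatterns && matchesAny(excludePatterns)) return false;
  // Assumption: with no include patterns, every non-excluded path is selected.
  if (!includePatterns || includePatterns.length === 0) return true;
  return matchesAny(includePatterns);
}

const includePatterns = ["/reports/", "\\.pdf$"];
const excludePatterns = ["/temp/", "\\.tmp$"];

const inReports = isPathSelected("/sites/hr/reports/summary.docx", includePatterns, excludePatterns);
const inTemp = isPathSelected("/sites/hr/temp/reports/draft.pdf", includePatterns, excludePatterns);
const unmatched = isPathSelected("/sites/hr/notes.txt", includePatterns, excludePatterns);
console.log(inReports, inTemp, unmatched);
```

In this sample the first path is selected because it matches `/reports/`, the second is rejected by the `/temp/` exclusion, and the third matches no include pattern.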
CloudSlackDataSource { slack_token, channel_ids, channel_patterns, 6 more }
slack_token: string
Slack Bot Token.
channel_ids?: string | null
Slack Channel.
channel_patterns?: string | null
Slack Channel name pattern.
earliest_date?: string | null
Earliest date.
earliest_date_timestamp?: number | null
Earliest date timestamp.
latest_date?: string | null
Latest date.
latest_date_timestamp?: number | null
Latest date timestamp.
CloudNotionPageDataSource { integration_token, class_name, database_ids, 2 more }
integration_token: string
The integration token to use for authentication.
database_ids?: string | null
The Notion database IDs to read content from.
page_ids?: string | null
The IDs of the Notion pages to read from.
CloudConfluenceDataSource { authentication_mechanism, server_url, api_token, 10 more }
authentication_mechanism: string
Type of Authentication for connecting to Confluence APIs.
server_url: string
The server URL of the Confluence instance.
api_token?: string | null
The API token to use for authentication.
cql?: string | null
The CQL query to use for fetching pages.
Configuration for handling failures during processing. Key-value object controlling failure handling behaviors.
Example: { "skip_list_failures": true }
Currently supports:
- skip_list_failures: Skip failed batches/lists and continue processing
skip_list_failures?: boolean
Whether to skip failed batches/lists and continue processing
index_restricted_pages?: boolean
Whether to index restricted pages.
keep_markdown_format?: boolean
Whether to keep the markdown format.
label?: string | null
The label to use for fetching pages.
page_ids?: string | null
The page IDs of the Confluence to read from.
space_key?: string | null
The space key to read from.
user_name?: string | null
The username to use for authentication.
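The `skip_list_failures` option above is described as skipping failed batches and continuing. A minimal sketch of that control flow, under stated assumptions, might look like this. The function `processBatches` and its return shape are hypothetical; how the service actually batches or reports failures is not documented here.

```typescript
// Hypothetical helper for illustration only; not the service's implementation.
function processBatches<T>(
  batches: T[][],
  handle: (batch: T[]) => void,
  skipListFailures: boolean,
): { processed: number; failed: number } {
  let processed = 0;
  let failed = 0;
  for (const batch of batches) {
    try {
      handle(batch);
      processed += 1;
    } catch (err) {
      // Without the flag, the first failure aborts the whole run.
      if (!skipListFailures) throw err;
      failed += 1;
    }
  }
  return { processed, failed };
}

const batchSummary = processBatches(
  [["page-1"], ["page-2"], ["page-3"]],
  (batch) => {
    // Simulate a failure while handling the second batch.
    if (batch.indexOf("page-2") !== -1) throw new Error("simulated fetch failure");
  },
  true,
);
console.log(batchSummary);
```

With the flag set to `false`, the same call would rethrow the simulated error instead of returning a summary.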
CloudJiraDataSource { authentication_mechanism, query, api_token, 5 more }
Cloud Jira Data Source integrating JiraReader.
authentication_mechanism: string
Type of Authentication for connecting to Jira APIs.
query: string
JQL (Jira Query Language) query to search.
api_token?: string | null
The API access token used for Basic, PAT, and OAuth2 authentication.
cloud_id?: string | null
The cloud ID, used in case of OAuth2.
email?: string | null
The email address to use for authentication.
server_url?: string | null
The server URL for Jira Cloud.
CloudJiraDataSourceV2 { authentication_mechanism, query, server_url, 10 more }
Cloud Jira Data Source integrating JiraReaderV2.
authentication_mechanism: string
Type of Authentication for connecting to Jira APIs.
query: string
JQL (Jira Query Language) query to search.
server_url: string
The server URL for Jira Cloud.
api_token?: string | null
The API Access Token used for Basic, PAT and OAuth2 authentication.
api_version?: "2" | "3"
Jira REST API version to use (2 or 3). 3 supports Atlassian Document Format (ADF).
cloud_id?: string | null
The cloud ID, used in case of OAuth2.
email?: string | null
The email address to use for authentication.
expand?: string | null
Fields to expand in the response.
fields?: Array<string> | null
List of fields to retrieve from Jira. If null, retrieves all fields.
get_permissions?: boolean
Whether to fetch project role permissions and issue-level security
requests_per_minute?: number | null
Rate limit for Jira API requests per minute.
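As a rough illustration of what a `requests_per_minute` value implies, the sketch below converts it into a minimum spacing between requests. The function `minIntervalMs` is hypothetical, and treating `null` as "no throttling" is an assumption; the reference does not say how the reader enforces the limit.

```typescript
// Hypothetical helper for illustration only; not part of the LlamaCloud SDK.
function minIntervalMs(requestsPerMinute: number | null): number {
  // Assumption: null means no client-side throttling is applied.
  if (requestsPerMinute === null) return 0;
  if (requestsPerMinute <= 0) {
    throw new RangeError("requests_per_minute must be positive");
  }
  // One minute is 60000 ms, spread evenly across the allowed requests.
  return 60000 / requestsPerMinute;
}

const spacingMs = minIntervalMs(120);
console.log(spacingMs);
```

A value of 120 works out to one request every 500 ms when requests are spaced evenly.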
CloudBoxDataSource { authentication_mechanism, class_name, client_id, 6 more }
authentication_mechanism: "developer_token" | "ccg"
The type of authentication to use (Developer Token or CCG)
client_id?: string | null
Box API key used for identifying the application the user is authenticating with
client_secret?: string | null
Box API secret used for making auth requests.
developer_token?: string | null
Developer token for authentication if authentication_mechanism is 'developer_token'.
enterprise_id?: string | null
Box Enterprise ID. If provided, authenticates as a service.
folder_id?: string | null
The ID of the Box folder to read from.
user_id?: string | null
Box User ID. If provided, authenticates as a user.
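Because which Box fields matter depends on `authentication_mechanism`, a small pre-flight check can catch incomplete configurations before they are submitted. The sketch below is hypothetical: `BoxConfig` mirrors the fields listed above, and `missingBoxFields` is not part of the SDK. The rule that CCG needs `client_id`, `client_secret`, and one of `enterprise_id` or `user_id` is an assumption based on the field descriptions, not something this reference states as required.

```typescript
// Hypothetical types and helper for illustration only; not part of the LlamaCloud SDK.
interface BoxConfig {
  authentication_mechanism: "developer_token" | "ccg";
  client_id?: string | null;
  client_secret?: string | null;
  developer_token?: string | null;
  enterprise_id?: string | null;
  user_id?: string | null;
}

function missingBoxFields(config: BoxConfig): string[] {
  const missing: string[] = [];
  if (config.authentication_mechanism === "developer_token") {
    if (!config.developer_token) missing.push("developer_token");
  } else {
    // Assumption: CCG needs the app credentials plus a subject to act as.
    if (!config.client_id) missing.push("client_id");
    if (!config.client_secret) missing.push("client_secret");
    if (!config.enterprise_id && !config.user_id) {
      missing.push("enterprise_id or user_id");
    }
  }
  return missing;
}

const boxMissing = missingBoxFields({
  authentication_mechanism: "ccg",
  client_id: "example-client-id", // placeholder value
});
console.log(boxMissing);
```

An empty result means the configuration satisfies the assumed rules for the chosen mechanism.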
name: string
The name of the data source.
source_type: "S3" | "AZURE_STORAGE_BLOB" | "GOOGLE_DRIVE" | 8 more
created_at?: string | null
Creation datetime
custom_metadata?: Record<string, Record<string, unknown> | Array<unknown> | string | 2 more | null> | null
Custom metadata that will be present on all data loaded from the data source
updated_at?: string | null
Update datetime
Version metadata for the data source
reader_version?: "1.0" | "2.0" | "2.1" | null
The version of the reader to use for this data source.
Get Data Source
import LlamaCloud from '@llamaindex/llama-cloud';

const client = new LlamaCloud({
  apiKey: process.env['LLAMA_CLOUD_API_KEY'], // This is the default and can be omitted
});

const dataSource = await client.dataSources.get('182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e');

console.log(dataSource.id);

Returns Examples
{
  "id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  "component": {
    "foo": "bar"
  },
  "name": "name",
  "project_id": "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  "source_type": "S3",
  "created_at": "2019-12-27T18:11:19.117Z",
  "custom_metadata": {
    "foo": {
      "foo": "bar"
    }
  },
  "updated_at": "2019-12-27T18:11:19.117Z",
  "version_metadata": {
    "reader_version": "1.0"
  }
}
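Since `component` is a union that includes a generic `Record<string, unknown>`, callers may want to check `source_type` and the shape of `component` before reading connector-specific fields. The sketch below works on plain objects shaped like the response above. `DataSourceLike` and `describeSource` are hypothetical names, and the assumption that an S3 component carries a string `bucket` comes from the `CloudS3DataSource` listing.

```typescript
// Hypothetical types and helper for illustration only; not part of the LlamaCloud SDK.
interface DataSourceLike {
  id: string;
  name: string;
  source_type: string;
  component: Record<string, unknown>;
}

function describeSource(source: DataSourceLike): string {
  const bucket = source.component["bucket"];
  // Only trust the bucket field when the type tag and runtime shape agree.
  if (source.source_type === "S3" && typeof bucket === "string") {
    return source.name + ": S3 bucket " + bucket;
  }
  return source.name + ": " + source.source_type + " source";
}

// The sample response has a placeholder component without a bucket field.
const genericDescription = describeSource({
  id: "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  name: "name",
  source_type: "S3",
  component: { foo: "bar" },
});

// "example-bucket" is a made-up value for illustration.
const s3Description = describeSource({
  id: "182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
  name: "name",
  source_type: "S3",
  component: { bucket: "example-bucket" },
});
console.log(genericDescription);
console.log(s3Description);
```

Checking the runtime shape rather than relying on `source_type` alone keeps the code safe when the component falls into the generic record branch of the union.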