start_ingestion_job

AgentsforBedrock.Client.start_ingestion_job(**kwargs)

Begins a data ingestion job. Data sources are ingested into your knowledge base so that Large Language Models (LLMs) can use your data.

Request Syntax

response = client.start_ingestion_job(
    knowledgeBaseId='string',
    dataSourceId='string',
    clientToken='string',
    description='string'
)

Parameters:

knowledgeBaseId (string) –
[REQUIRED]

The unique identifier of the knowledge base for the data ingestion job.
dataSourceId (string) –
[REQUIRED]

The unique identifier of the data source you want to ingest into your knowledge base.
clientToken (string) –
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

This field is autopopulated if not provided.
description (string) –
A description of the data ingestion job.
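
The parameters above can be assembled as in the following minimal sketch. The helper name `start_ingestion` is hypothetical, and generating the `clientToken` with `uuid.uuid4()` is an assumption made for illustration; the service also autopopulates the field if it is omitted. The helper takes an already-constructed client so that it can be exercised without AWS access.

```python
import uuid


def start_ingestion(client, knowledge_base_id, data_source_id,
                    description=None, client_token=None):
    """Start an ingestion job and return the 'ingestionJob' dict.

    `client` is expected to be a boto3 'bedrock-agent' client (or any
    object exposing a compatible start_ingestion_job method).
    """
    params = {
        "knowledgeBaseId": knowledge_base_id,
        "dataSourceId": data_source_id,
        # Passing the same token again on a retry keeps the request idempotent.
        "clientToken": client_token or str(uuid.uuid4()),
    }
    # Only send the optional description when the caller supplied one.
    if description is not None:
        params["description"] = description
    response = client.start_ingestion_job(**params)
    return response["ingestionJob"]


# Usage (requires boto3 and AWS credentials; the IDs are placeholders):
#   import boto3
#   client = boto3.client("bedrock-agent")
#   job = start_ingestion(client, "KB_ID_PLACEHOLDER", "DS_ID_PLACEHOLDER",
#                         description="Nightly sync")
#   print(job["ingestionJobId"], job["status"])
```

To retry safely after a network error, generate the token once in the calling code and pass the same `client_token` on each attempt.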

Return type:

dict

Returns:

Response Syntax

{
    'ingestionJob': {
        'knowledgeBaseId': 'string',
        'dataSourceId': 'string',
        'ingestionJobId': 'string',
        'description': 'string',
        'status': 'STARTING'|'IN_PROGRESS'|'COMPLETE'|'FAILED'|'STOPPING'|'STOPPED',
        'statistics': {
            'numberOfDocumentsScanned': 123,
            'numberOfMetadataDocumentsScanned': 123,
            'numberOfNewDocumentsIndexed': 123,
            'numberOfModifiedDocumentsIndexed': 123,
            'numberOfMetadataDocumentsModified': 123,
            'numberOfDocumentsDeleted': 123,
            'numberOfDocumentsFailed': 123
        },
        'failureReasons': [
            'string',
        ],
        'startedAt': datetime(2015, 1, 1),
        'updatedAt': datetime(2015, 1, 1)
    }
}
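
As an illustration of reading a response with this shape, the sketch below condenses an `ingestionJob` dict into a one-line summary. The function name `summarize_ingestion_job` and every value in `sample_response` are made up for the example; they are not output from a real job.

```python
from datetime import datetime


def summarize_ingestion_job(job):
    """Return a one-line summary of an 'ingestionJob' dict."""
    # Treat a missing statistics block or counter as zero.
    stats = job.get("statistics", {})
    indexed = (stats.get("numberOfNewDocumentsIndexed", 0)
               + stats.get("numberOfModifiedDocumentsIndexed", 0))
    failed = stats.get("numberOfDocumentsFailed", 0)
    return (f"{job['ingestionJobId']}: {job['status']} "
            f"(indexed={indexed}, failed={failed})")


# Sample data shaped like the response syntax; all values are placeholders.
sample_response = {
    "ingestionJob": {
        "knowledgeBaseId": "KB_ID_PLACEHOLDER",
        "dataSourceId": "DS_ID_PLACEHOLDER",
        "ingestionJobId": "JOB_ID_PLACEHOLDER",
        "status": "COMPLETE",
        "statistics": {
            "numberOfDocumentsScanned": 12,
            "numberOfNewDocumentsIndexed": 7,
            "numberOfModifiedDocumentsIndexed": 3,
            "numberOfDocumentsFailed": 1,
        },
        "failureReasons": [],
        "startedAt": datetime(2015, 1, 1),
        "updatedAt": datetime(2015, 1, 1),
    }
}

print(summarize_ingestion_job(sample_response["ingestionJob"]))
# prints: JOB_ID_PLACEHOLDER: COMPLETE (indexed=10, failed=1)
```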

Response Structure

(dict) –
- ingestionJob (dict) –
  
  Contains information about the data ingestion job.
  - knowledgeBaseId (string) –
    
    The unique identifier of the knowledge base for the data ingestion job.
  - dataSourceId (string) –
    
    The unique identifier of the data source for the data ingestion job.
  - ingestionJobId (string) –
    
    The unique identifier of the data ingestion job.
  - description (string) –
    
    The description of the data ingestion job.
  - status (string) –
    
    The status of the data ingestion job.
  - statistics (dict) –
    
    Contains statistics about the data ingestion job.
    - numberOfDocumentsScanned (integer) –
      
      The total number of source documents that were scanned. Includes new, updated, and unchanged documents.
    - numberOfMetadataDocumentsScanned (integer) –
      
      The total number of metadata files that were scanned. Includes new, updated, and unchanged files.
    - numberOfNewDocumentsIndexed (integer) –
      
      The number of new source documents in the data source that were successfully indexed.
    - numberOfModifiedDocumentsIndexed (integer) –
      
      The number of modified source documents in the data source that were successfully indexed.
    - numberOfMetadataDocumentsModified (integer) –
      
      The number of metadata files that were updated or deleted.
    - numberOfDocumentsDeleted (integer) –
      
      The number of source documents that were deleted.
    - numberOfDocumentsFailed (integer) –
      
      The number of source documents that failed to be ingested.
  - failureReasons (list) –
    
    A list of reasons that the data ingestion job failed.
    - (string) –
  - startedAt (datetime) –
    
    The time the data ingestion job started.
    
    If you stop a data ingestion job, the startedAt time is the time the job was started before the job was stopped.
  - updatedAt (datetime) –
    
    The time the data ingestion job was last updated.
    
    If you stop a data ingestion job, the updatedAt time is the time the job was stopped.
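
Because the job returned by this call is typically still in the `STARTING` state, callers often poll until it finishes. The following is a minimal sketch of such a loop. It assumes the same client's `get_ingestion_job` operation, which is not described on this page, and it treats `COMPLETE`, `FAILED`, and `STOPPED` as terminal statuses, which is an inference from the status list above rather than something this page states. The helper name, polling interval, and poll limit are hypothetical.

```python
import time

# Assumption: these statuses mean the job will not change any further.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def wait_for_ingestion_job(client, job, poll_seconds=10, max_polls=60,
                           sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return it.

    `job` is the 'ingestionJob' dict returned by start_ingestion_job.
    Raises TimeoutError if the job is still running after `max_polls` polls.
    """
    for _ in range(max_polls):
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
        # Fetch the latest state of the same job.
        job = client.get_ingestion_job(
            knowledgeBaseId=job["knowledgeBaseId"],
            dataSourceId=job["dataSourceId"],
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    # Check the result of the final poll before giving up.
    if job["status"] in TERMINAL_STATUSES:
        return job
    raise TimeoutError(
        f"Ingestion job {job['ingestionJobId']} still {job['status']} "
        f"after {max_polls} polls"
    )
```

When the returned job has status `FAILED`, the `failureReasons` list is the place to look for the cause.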
