batch_put_document

QBusiness.Client.batch_put_document(**kwargs)

Adds one or more documents to an Amazon Q Business index.

You use this API to:

  • ingest your structured and unstructured documents and documents stored in an Amazon S3 bucket into an Amazon Q Business index.

  • add custom attributes to documents in an Amazon Q Business index.

  • attach an access control list to the documents added to an Amazon Q Business index.

You can see the progress of the ingestion, and any error messages related to the process, by using CloudWatch.

See also: AWS API Documentation
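As a rough illustration of the first use case, the sketch below assembles a request that ingests one inline document and one document stored in Amazon S3. The helper names (inline_document, s3_document, build_request) are hypothetical, and every identifier, bucket name, and ARN is a placeholder rather than a real resource. The actual client call is shown only as a comment, on the assumption that boto3 and AWS credentials are available.

```python
def inline_document(doc_id, text, title=None):
    """Build a document entry whose content is passed inline as bytes."""
    doc = {
        "id": doc_id,
        # 'content' is a tagged union: set 'blob' or 's3', never both.
        "content": {"blob": text.encode("utf-8")},
        "contentType": "PLAIN_TEXT",
    }
    if title is not None:
        doc["title"] = title
    return doc


def s3_document(doc_id, bucket, key, title=None):
    """Build a document entry that points at an object in Amazon S3."""
    doc = {"id": doc_id, "content": {"s3": {"bucket": bucket, "key": key}}}
    if title is not None:
        doc["title"] = title
    return doc


def build_request(application_id, index_id, documents, role_arn=None):
    """Assemble the keyword arguments for batch_put_document."""
    request = {
        "applicationId": application_id,
        "indexId": index_id,
        "documents": list(documents),
    }
    if role_arn is not None:
        # IAM role with permission to access the S3 bucket.
        request["roleArn"] = role_arn
    return request


# Placeholder values; replace them with identifiers from your own account.
request = build_request(
    application_id="example-application-id",
    index_id="example-index-id",
    documents=[
        inline_document("doc-1", "Quarterly summary text.", title="Q1 summary"),
        s3_document("doc-2", "example-bucket", "reports/q1.pdf"),
    ],
    role_arn="arn:aws:iam::123456789012:role/ExampleQBusinessRole",
)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   client = boto3.client("qbusiness")
#   response = client.batch_put_document(**request)
```

Keeping request construction separate from the client call makes the payload easy to inspect or log before anything is sent.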

Request Syntax

response = client.batch_put_document(
    applicationId='string',
    indexId='string',
    documents=[
        {
            'id': 'string',
            'attributes': [
                {
                    'name': 'string',
                    'value': {
                        'stringValue': 'string',
                        'stringListValue': [
                            'string',
                        ],
                        'longValue': 123,
                        'dateValue': datetime(2015, 1, 1)
                    }
                },
            ],
            'content': {
                'blob': b'bytes',
                's3': {
                    'bucket': 'string',
                    'key': 'string'
                }
            },
            'contentType': 'PDF'|'HTML'|'MS_WORD'|'PLAIN_TEXT'|'PPT'|'RTF'|'XML'|'XSLT'|'MS_EXCEL'|'CSV'|'JSON'|'MD',
            'title': 'string',
            'accessConfiguration': {
                'accessControls': [
                    {
                        'principals': [
                            {
                                'user': {
                                    'id': 'string',
                                    'access': 'ALLOW'|'DENY',
                                    'membershipType': 'INDEX'|'DATASOURCE'
                                },
                                'group': {
                                    'name': 'string',
                                    'access': 'ALLOW'|'DENY',
                                    'membershipType': 'INDEX'|'DATASOURCE'
                                }
                            },
                        ],
                        'memberRelation': 'AND'|'OR'
                    },
                ],
                'memberRelation': 'AND'|'OR'
            },
            'documentEnrichmentConfiguration': {
                'inlineConfigurations': [
                    {
                        'condition': {
                            'key': 'string',
                            'operator': 'GREATER_THAN'|'GREATER_THAN_OR_EQUALS'|'LESS_THAN'|'LESS_THAN_OR_EQUALS'|'EQUALS'|'NOT_EQUALS'|'CONTAINS'|'NOT_CONTAINS'|'EXISTS'|'NOT_EXISTS'|'BEGINS_WITH',
                            'value': {
                                'stringValue': 'string',
                                'stringListValue': [
                                    'string',
                                ],
                                'longValue': 123,
                                'dateValue': datetime(2015, 1, 1)
                            }
                        },
                        'target': {
                            'key': 'string',
                            'value': {
                                'stringValue': 'string',
                                'stringListValue': [
                                    'string',
                                ],
                                'longValue': 123,
                                'dateValue': datetime(2015, 1, 1)
                            },
                            'attributeValueOperator': 'DELETE'
                        },
                        'documentContentOperator': 'DELETE'
                    },
                ],
                'preExtractionHookConfiguration': {
                    'invocationCondition': {
                        'key': 'string',
                        'operator': 'GREATER_THAN'|'GREATER_THAN_OR_EQUALS'|'LESS_THAN'|'LESS_THAN_OR_EQUALS'|'EQUALS'|'NOT_EQUALS'|'CONTAINS'|'NOT_CONTAINS'|'EXISTS'|'NOT_EXISTS'|'BEGINS_WITH',
                        'value': {
                            'stringValue': 'string',
                            'stringListValue': [
                                'string',
                            ],
                            'longValue': 123,
                            'dateValue': datetime(2015, 1, 1)
                        }
                    },
                    'lambdaArn': 'string',
                    's3BucketName': 'string',
                    'roleArn': 'string'
                },
                'postExtractionHookConfiguration': {
                    'invocationCondition': {
                        'key': 'string',
                        'operator': 'GREATER_THAN'|'GREATER_THAN_OR_EQUALS'|'LESS_THAN'|'LESS_THAN_OR_EQUALS'|'EQUALS'|'NOT_EQUALS'|'CONTAINS'|'NOT_CONTAINS'|'EXISTS'|'NOT_EXISTS'|'BEGINS_WITH',
                        'value': {
                            'stringValue': 'string',
                            'stringListValue': [
                                'string',
                            ],
                            'longValue': 123,
                            'dateValue': datetime(2015, 1, 1)
                        }
                    },
                    'lambdaArn': 'string',
                    's3BucketName': 'string',
                    'roleArn': 'string'
                }
            },
            'mediaExtractionConfiguration': {
                'imageExtractionConfiguration': {
                    'imageExtractionStatus': 'ENABLED'|'DISABLED'
                }
            }
        },
    ],
    roleArn='string',
    dataSourceSyncId='string'
)
Parameters:
  • applicationId (string) –

    [REQUIRED]

    The identifier of the Amazon Q Business application.

  • indexId (string) –

    [REQUIRED]

    The identifier of the Amazon Q Business index to add the documents to.

  • documents (list) –

    [REQUIRED]

    One or more documents to add to the index.

    • (dict) –

      A document in an Amazon Q Business application.

      • id (string) – [REQUIRED]

        The identifier of the document.

      • attributes (list) –

        Custom attributes to apply to the document for refining Amazon Q Business web experience responses.

        • (dict) –

          A document attribute or metadata field.

          • name (string) – [REQUIRED]

            The identifier for the attribute.

          • value (dict) – [REQUIRED]

            The value of the attribute.

            Note

            This is a Tagged Union structure. Only one of the following top level keys can be set: stringValue, stringListValue, longValue, dateValue.

            • stringValue (string) –

              A string.

            • stringListValue (list) –

              A list of strings.

              • (string) –

            • longValue (integer) –

              A long integer value.

            • dateValue (datetime) –

              A date expressed as an ISO 8601 string.

              It’s important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.

      • content (dict) –

        The contents of the document.

        Note

        This is a Tagged Union structure. Only one of the following top level keys can be set: blob, s3.

        • blob (bytes) –

          The contents of the document. Documents passed to the blob parameter must be base64 encoded. Your code might not need to encode the document file bytes if you’re using an Amazon Web Services SDK to call Amazon Q Business APIs. If you are calling the Amazon Q Business endpoint directly using REST, you must base64 encode the contents before sending.

        • s3 (dict) –

          The path to the document in an Amazon S3 bucket.

          • bucket (string) – [REQUIRED]

            The name of the S3 bucket that contains the file.

          • key (string) – [REQUIRED]

            The name of the file.

      • contentType (string) –

        The file type of the document in the Blob field.

        If you want to index snippets or subsets of HTML documents instead of the entirety of the HTML documents, add the HTML start and closing tags (<HTML>content</HTML>) around the content.

      • title (string) –

        The title of the document.

      • accessConfiguration (dict) –

        Configuration information for access permission to a document.

        • accessControls (list) – [REQUIRED]

          A list of AccessControlList objects.

          • (dict) –

            A list of principals. Each principal can be either a USER or a GROUP and can be designated document access permissions of either ALLOW or DENY.

            • principals (list) – [REQUIRED]

              Contains a list of principals, where a principal can be either a USER or a GROUP. Each principal can have one of the following types of document access: ALLOW or DENY.

              • (dict) –

                Provides user and group information used for filtering documents to use for generating Amazon Q Business conversation responses.

                Note

                This is a Tagged Union structure. Only one of the following top level keys can be set: user, group.

                • user (dict) –

                  The user associated with the principal.

                  • id (string) –

                    The identifier of the user.

                  • access (string) – [REQUIRED]

                    Provides information about whether to allow or deny access to the principal.

                  • membershipType (string) –

                    The type of group.

                • group (dict) –

                  The group associated with the principal.

                  • name (string) –

                    The name of the group.

                  • access (string) – [REQUIRED]

                    Provides information about whether to allow or deny access to the principal.

                  • membershipType (string) –

                    The type of group.

            • memberRelation (string) –

              Describes the member relation within a principal list.

        • memberRelation (string) –

          Describes the member relation within the AccessControlList object.

      • documentEnrichmentConfiguration (dict) –

        The configuration information for altering document metadata and content during the document ingestion process.

        • inlineConfigurations (list) –

          Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Q Business.

          • (dict) –

            Provides the configuration information for applying basic logic to alter document metadata and content when ingesting documents into Amazon Q Business.

            To apply advanced logic that goes beyond what you can do with basic logic, see HookConfiguration.

            For more information, see Custom document enrichment.

            • condition (dict) –

              The condition used for the target document attribute or metadata field when ingesting documents into Amazon Q Business. You use this with DocumentAttributeTarget to apply the condition.

              For example, you can create the ‘Department’ target field and have it prefill department names associated with the documents based on information in the ‘Source_URI’ field. Set the condition that if the ‘Source_URI’ field contains ‘financial’ in its URI value, then prefill the target field ‘Department’ with the target value ‘Finance’ for the document.

              Amazon Q Business can’t create a target field if it has not already been created as an index field. After you create your index field, you can create a document metadata field using DocumentAttributeTarget. Amazon Q Business then will map your newly created metadata field to your index field.

              • key (string) – [REQUIRED]

                The identifier of the document attribute used for the condition.

                For example, ‘Source_URI’ could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.

                Amazon Q Business currently doesn’t support _document_body as an attribute key used for the condition.

              • operator (string) – [REQUIRED]

                The operator used to compare the document attribute identified by key with the condition value, for example CONTAINS or EQUALS.

              • value (dict) –

                The value of a document attribute. You can only provide one value for a document attribute.

                Note

                This is a Tagged Union structure. Only one of the following top level keys can be set: stringValue, stringListValue, longValue, dateValue.

                • stringValue (string) –

                  A string.

                • stringListValue (list) –

                  A list of strings.

                  • (string) –

                • longValue (integer) –

                  A long integer value.

                • dateValue (datetime) –

                  A date expressed as an ISO 8601 string.

                  It’s important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.

            • target (dict) –

              The target document attribute or metadata field you want to alter when ingesting documents into Amazon Q Business.

              For example, you can delete all customer identification numbers associated with the documents, stored in the document metadata field called ‘Customer_ID’, by setting the target key to ‘Customer_ID’ and attributeValueOperator to DELETE. This removes all customer ID values in the ‘Customer_ID’ field and scrubs this personally identifiable information from each document’s metadata.

              Amazon Q Business can’t create a target field if it has not already been created as an index field. After you create your index field, you can create a document metadata field using DocumentAttributeTarget. Amazon Q Business will then map your newly created document attribute to your index field.

              You can also use this with DocumentAttributeCondition.

              • key (string) – [REQUIRED]

                The identifier of the target document attribute or metadata field. For example, ‘Department’ could be an identifier for the target attribute or metadata field that includes the department names associated with the documents.

              • value (dict) –

                The value of a document attribute. You can only provide one value for a document attribute.

                Note

                This is a Tagged Union structure. Only one of the following top level keys can be set: stringValue, stringListValue, longValue, dateValue.

                • stringValue (string) –

                  A string.

                • stringListValue (list) –

                  A list of strings.

                  • (string) –

                • longValue (integer) –

                  A long integer value.

                • dateValue (datetime) –

                  A date expressed as an ISO 8601 string.

                  It’s important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.

              • attributeValueOperator (string) –

                Set to DELETE to delete the existing target value for your specified target attribute key. You cannot provide a target value and also set this to DELETE.

            • documentContentOperator (string) –

              Set to DELETE to delete the document content if the condition used for the target attribute is met.

        • preExtractionHookConfiguration (dict) –

          Provides the configuration information for invoking a Lambda function to alter document metadata and content when ingesting documents into Amazon Q Business.

          You can configure your Lambda function using the PreExtractionHookConfiguration parameter if you want to apply advanced alterations on the original or raw documents.

          If you want to apply advanced alterations on the Amazon Q Business structured documents, you must configure your Lambda function using PostExtractionHookConfiguration.

          You can only invoke one Lambda function. However, this function can invoke other functions it requires.

          For more information, see Custom document enrichment.

          • invocationCondition (dict) –

            The condition used for when a Lambda function should be invoked.

            For example, you can specify a condition that if there are empty date-time values, then Amazon Q Business should invoke a function that inserts the current date-time.

            • key (string) – [REQUIRED]

              The identifier of the document attribute used for the condition.

              For example, ‘Source_URI’ could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.

              Amazon Q Business currently doesn’t support _document_body as an attribute key used for the condition.

            • operator (string) – [REQUIRED]

              The operator used to compare the document attribute identified by key with the condition value, for example CONTAINS or EQUALS.

            • value (dict) –

              The value of a document attribute. You can only provide one value for a document attribute.

              Note

              This is a Tagged Union structure. Only one of the following top level keys can be set: stringValue, stringListValue, longValue, dateValue.

              • stringValue (string) –

                A string.

              • stringListValue (list) –

                A list of strings.

                • (string) –

              • longValue (integer) –

                A long integer value.

              • dateValue (datetime) –

                A date expressed as an ISO 8601 string.

                It’s important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.

          • lambdaArn (string) –

            The Amazon Resource Name (ARN) of the Lambda function to run during ingestion. For more information, see IAM roles for Custom Document Enrichment (CDE).

          • s3BucketName (string) –

            The name of the S3 bucket that stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.

          • roleArn (string) –

            The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process.

        • postExtractionHookConfiguration (dict) –

          Provides the configuration information for invoking a Lambda function to alter document metadata and content when ingesting documents into Amazon Q Business.

          You can configure your Lambda function using the PreExtractionHookConfiguration parameter if you want to apply advanced alterations on the original or raw documents.

          If you want to apply advanced alterations on the Amazon Q Business structured documents, you must configure your Lambda function using PostExtractionHookConfiguration.

          You can only invoke one Lambda function. However, this function can invoke other functions it requires.

          For more information, see Custom document enrichment.

          • invocationCondition (dict) –

            The condition used for when a Lambda function should be invoked.

            For example, you can specify a condition that if there are empty date-time values, then Amazon Q Business should invoke a function that inserts the current date-time.

            • key (string) – [REQUIRED]

              The identifier of the document attribute used for the condition.

              For example, ‘Source_URI’ could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.

              Amazon Q Business currently doesn’t support _document_body as an attribute key used for the condition.

            • operator (string) – [REQUIRED]

              The operator used to compare the document attribute identified by key with the condition value, for example CONTAINS or EQUALS.

            • value (dict) –

              The value of a document attribute. You can only provide one value for a document attribute.

              Note

              This is a Tagged Union structure. Only one of the following top level keys can be set: stringValue, stringListValue, longValue, dateValue.

              • stringValue (string) –

                A string.

              • stringListValue (list) –

                A list of strings.

                • (string) –

              • longValue (integer) –

                A long integer value.

              • dateValue (datetime) –

                A date expressed as an ISO 8601 string.

                It’s important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.

          • lambdaArn (string) –

            The Amazon Resource Name (ARN) of the Lambda function to run during ingestion. For more information, see IAM roles for Custom Document Enrichment (CDE).

          • s3BucketName (string) –

            The name of the S3 bucket that stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.

          • roleArn (string) –

            The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process.

      • mediaExtractionConfiguration (dict) –

        The configuration for extracting information from media in the document.

        • imageExtractionConfiguration (dict) –

          The configuration for extracting semantic meaning from images in documents. For more information, see Extracting semantic meaning from images and visuals.

          • imageExtractionStatus (string) – [REQUIRED]

            Specify whether to extract semantic meaning from images and visuals from documents.

  • roleArn (string) – The Amazon Resource Name (ARN) of an IAM role with permission to access your S3 bucket.

  • dataSourceSyncId (string) – The identifier of the data source sync during which the documents were added.
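To make the nested document fields above more concrete, here is a sketch that assembles one document entry with custom attributes, an access control list, and an inline enrichment rule. The helper names (attribute, access_configuration, department_rule) are hypothetical, as are the attribute names, user ID, group name, and bucket. The enrichment rule mirrors the ‘Source_URI’ and ‘Department’ example from the condition description and assumes both already exist as index fields.

```python
from datetime import datetime, timezone


def attribute(name, value):
    """Wrap a Python value in the attribute tagged union (exactly one key set)."""
    if isinstance(value, bool):
        # bool is a subclass of int; reject it rather than send it as longValue.
        raise TypeError("bool is not one of the supported value types")
    if isinstance(value, str):
        union = {"stringValue": value}
    elif isinstance(value, int):
        union = {"longValue": value}
    elif isinstance(value, datetime):
        if value.tzinfo is None:
            raise ValueError("dateValue should include a time zone")
        union = {"dateValue": value}
    elif isinstance(value, (list, tuple)):
        union = {"stringListValue": [str(item) for item in value]}
    else:
        raise TypeError(f"unsupported attribute type: {type(value).__name__}")
    return {"name": name, "value": union}


def access_configuration(allowed_users=(), denied_groups=(), member_relation="OR"):
    """Build one access control entry from user IDs and group names."""
    # Each principal is itself a tagged union: 'user' or 'group', not both.
    principals = [{"user": {"id": u, "access": "ALLOW"}} for u in allowed_users]
    principals += [{"group": {"name": g, "access": "DENY"}} for g in denied_groups]
    return {
        "accessControls": [
            {"principals": principals, "memberRelation": member_relation}
        ]
    }


def department_rule():
    """If 'Source_URI' contains 'financial', set 'Department' to 'Finance'."""
    return {
        "condition": {
            "key": "Source_URI",
            "operator": "CONTAINS",
            "value": {"stringValue": "financial"},
        },
        "target": {"key": "Department", "value": {"stringValue": "Finance"}},
    }


# One entry for the 'documents' list; all values are placeholders.
document = {
    "id": "doc-42",
    "title": "FY24 budget",
    "content": {"s3": {"bucket": "example-bucket", "key": "financial/budget.pdf"}},
    "attributes": [
        attribute("Source_URI", "https://example.com/financial/budget.pdf"),
        attribute("Tags", ["budget", "fy24"]),
        attribute("PageCount", 12),
        attribute("PublishedAt", datetime(2024, 3, 25, 12, 30, 10, tzinfo=timezone.utc)),
    ],
    "accessConfiguration": access_configuration(
        allowed_users=["jane@example.com"], denied_groups=["contractors"]
    ),
    "documentEnrichmentConfiguration": {"inlineConfigurations": [department_rule()]},
}
```

Routing every attribute through a single helper is one way to keep each value dictionary to exactly one key, as the tagged union notes require.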

Return type:

dict

Returns:

Response Syntax

{
    'failedDocuments': [
        {
            'id': 'string',
            'error': {
                'errorMessage': 'string',
                'errorCode': 'InternalError'|'InvalidRequest'|'ResourceInactive'|'ResourceNotFound'
            },
            'dataSourceId': 'string'
        },
    ]
}

Response Structure

  • (dict) –

    • failedDocuments (list) –

      A list of documents that were not added to the Amazon Q Business index because the document failed a validation check. Each document contains an error message that indicates why the document couldn’t be added to the index.

      • (dict) –

        A document that could not be added to the Amazon Q Business index, with an error message that indicates why it couldn’t be added.

        • id (string) –

          The identifier of the document that couldn’t be added to the Amazon Q Business index.

        • error (dict) –

          An explanation for why the document couldn’t be added to the index.

          • errorMessage (string) –

            The message explaining the Amazon Q Business request error.

          • errorCode (string) –

            The code associated with the Amazon Q Business request error.

        • dataSourceId (string) –

          The identifier of the Amazon Q Business data source connector that contains the failed document.
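Because individual documents can be reported in failedDocuments, it is probably worth inspecting that list after every call. The sketch below groups failed document IDs by error code. The function name failures_by_code and the sample response are hypothetical; the sample only follows the response syntax shown above and is not real service output.

```python
def failures_by_code(response):
    """Group the IDs of failed documents by their error code."""
    grouped = {}
    for failed in response.get("failedDocuments", []):
        error = failed.get("error", {})
        code = error.get("errorCode", "Unknown")
        grouped.setdefault(code, []).append(failed.get("id"))
    return grouped


# Hand-written sample in the documented response shape (not real output).
sample_response = {
    "failedDocuments": [
        {
            "id": "doc-2",
            "error": {
                "errorMessage": "Example validation message",
                "errorCode": "InvalidRequest",
            },
            "dataSourceId": "example-data-source-id",
        },
        {
            "id": "doc-7",
            "error": {
                "errorMessage": "Example internal error message",
                "errorCode": "InternalError",
            },
        },
    ]
}

for code, ids in failures_by_code(sample_response).items():
    print(f"{code}: {', '.join(ids)}")
```

An empty dictionary from failures_by_code means the response listed no failed documents.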

Exceptions