start_job_run

EMRContainers.Client.start_job_run(**kwargs)

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

response = client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)
Parameters:
  • name (string) – The name of the job run.

  • virtualClusterId (string) –

    [REQUIRED]

    The virtual cluster ID for which the job run request is submitted.

  • clientToken (string) –

    [REQUIRED]

    The client idempotency token of the job run request.

    This field is autopopulated if not provided.

  • executionRoleArn (string) – The execution role ARN for the job run.

  • releaseLabel (string) – The Amazon EMR release version to use for the job run.

  • jobDriver (dict) –

    The job driver for the job run.

    • sparkSubmitJobDriver (dict) –

      The job driver parameters specified for spark submit.

      • entryPoint (string) – [REQUIRED]

        The entry point of the job application.

      • entryPointArguments (list) –

        The arguments for the job application.

        • (string) –

      • sparkSubmitParameters (string) –

        The Spark submit parameters that are used for job runs.

    • sparkSqlJobDriver (dict) –

      The job driver parameters specified for a Spark SQL job.

      • entryPoint (string) –

        The SQL file to be executed.

      • sparkSqlParameters (string) –

        The Spark parameters to be included in the Spark SQL command.

  • configurationOverrides (dict) –

    The configuration overrides for the job run.

    • applicationConfiguration (list) –

      The configurations for the application run by the job run.

      • (dict) –

        A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

        • classification (string) – [REQUIRED]

          The classification within a configuration.

        • properties (dict) –

          A set of properties specified within a configuration classification.

          • (string) –

            • (string) –

        • configurations (list) –

          A list of additional configurations to apply within a configuration object.

    • monitoringConfiguration (dict) –

      The configurations for monitoring.

      • persistentAppUI (string) –

        Monitoring configurations for the persistent application UI.

      • cloudWatchMonitoringConfiguration (dict) –

        Monitoring configurations for CloudWatch.

        • logGroupName (string) – [REQUIRED]

          The name of the log group for log publishing.

        • logStreamNamePrefix (string) –

          The specified name prefix for log streams.

      • s3MonitoringConfiguration (dict) –

        Amazon S3 configuration for monitoring log publishing.

        • logUri (string) – [REQUIRED]

          Amazon S3 destination URI for log publishing.

      • containerLogRotationConfiguration (dict) –

        The settings that enable or disable container log rotation.

        • rotationSize (string) – [REQUIRED]

          The file size at which to rotate logs. The minimum is 2KB and the maximum is 2GB.

        • maxFilesToKeep (integer) – [REQUIRED]

          The number of files to keep in the container after rotation.

  • tags (dict) –

    The tags assigned to job runs.

    • (string) –

      • (string) –

  • jobTemplateId (string) – The job template ID to be used to start the job run.

  • jobTemplateParameters (dict) –

    The values of job template parameters to start a job run.

    • (string) –

      • (string) –

  • retryPolicyConfiguration (dict) –

    The retry policy configuration for the job run.

    • maxAttempts (integer) – [REQUIRED]

      The maximum number of attempts on the job’s driver.
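
The parameters above can be assembled into a request as in the following minimal sketch. The helper build_job_run_request is hypothetical and not part of boto3, and every ID, ARN, release label, and S3 path shown is a placeholder. The sketch covers only the sparkSubmitJobDriver case with optional Amazon S3 log publishing and a retry policy.

```python
import uuid


def build_job_run_request(virtual_cluster_id, execution_role_arn, release_label,
                          entry_point, entry_point_arguments=None,
                          spark_submit_parameters=None, log_uri=None,
                          max_attempts=None, name=None, client_token=None):
    """Assemble keyword arguments for start_job_run (hypothetical helper)."""
    # entryPoint is the only required key inside sparkSubmitJobDriver.
    spark_submit = {"entryPoint": entry_point}
    if entry_point_arguments:
        spark_submit["entryPointArguments"] = list(entry_point_arguments)
    if spark_submit_parameters:
        spark_submit["sparkSubmitParameters"] = spark_submit_parameters

    request = {
        "virtualClusterId": virtual_cluster_id,
        # boto3 autopopulates clientToken when it is omitted; setting it here
        # lets a caller reuse the same token if the request is retried.
        "clientToken": client_token or str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": spark_submit},
    }
    if name:
        request["name"] = name
    if log_uri:
        # logUri is required once s3MonitoringConfiguration is present.
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    if max_attempts is not None:
        request["retryPolicyConfiguration"] = {"maxAttempts": max_attempts}
    return request


# All values below are placeholders chosen for illustration.
request = build_job_run_request(
    virtual_cluster_id="example-virtual-cluster-id",
    execution_role_arn="arn:aws:iam::123456789012:role/example-job-execution-role",
    release_label="emr-6.15.0-latest",
    entry_point="s3://example-bucket/scripts/job.py",
    entry_point_arguments=["--date", "2024-01-01"],
    log_uri="s3://example-bucket/logs/",
    max_attempts=3,
    name="example-job",
)

# With AWS credentials and a region configured, the request could be sent as:
#   import boto3
#   client = boto3.client("emr-containers")
#   response = client.start_job_run(**request)
```

Building the dictionary separately from the call keeps the request easy to inspect or log before it is submitted, and optional keys are left out entirely rather than passed as None.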

Return type:

dict

Returns:

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) –

    • id (string) –

      The ID of the started job run.

    • name (string) –

      The name of the started job run.

    • arn (string) –

      The ARN of the started job run.

    • virtualClusterId (string) –

      The ID of the virtual cluster to which the job run was submitted.
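
As a rough illustration of reading the response, the sketch below extracts the four documented fields. StubClient is a hypothetical stand-in that returns a dictionary in the documented shape so the example runs without AWS credentials, and summarize_job_run is likewise an illustrative helper, not a boto3 function. The returned values are placeholders.

```python
RESPONSE_FIELDS = ("id", "name", "arn", "virtualClusterId")


def summarize_job_run(response):
    """Keep only the documented start_job_run response fields."""
    return {key: response.get(key) for key in RESPONSE_FIELDS}


class StubClient:
    """Hypothetical stand-in for boto3.client('emr-containers')."""

    def start_job_run(self, **kwargs):
        # Placeholder values; a real call returns service-generated ones.
        return {
            "id": "example-job-run-id",
            "name": kwargs.get("name"),
            "arn": "example-job-run-arn",
            "virtualClusterId": kwargs["virtualClusterId"],
            # boto3 responses also carry a ResponseMetadata entry.
            "ResponseMetadata": {"HTTPStatusCode": 200},
        }


client = StubClient()
response = client.start_job_run(
    name="example-job",
    virtualClusterId="example-virtual-cluster-id",
    clientToken="example-token",
)
summary = summarize_job_run(response)
print(summary["id"], summary["virtualClusterId"])
```

The id and virtualClusterId values are the ones to keep if the job run needs to be looked up or cancelled later.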

Exceptions