EMR / Client / add_instance_groups



Adds one or more instance groups to a running cluster.

See also: AWS API Documentation

Request Syntax

response = client.add_instance_groups(
    InstanceGroups=[
        {
            'Name': 'string',
            'Market': 'ON_DEMAND'|'SPOT',
            'InstanceRole': 'MASTER'|'CORE'|'TASK',
            'BidPrice': 'string',
            'InstanceType': 'string',
            'InstanceCount': 123,
            'Configurations': [
                {
                    'Classification': 'string',
                    'Configurations': {'... recursive ...'},
                    'Properties': {
                        'string': 'string'
                    }
                },
            ],
            'EbsConfiguration': {
                'EbsBlockDeviceConfigs': [
                    {
                        'VolumeSpecification': {
                            'VolumeType': 'string',
                            'Iops': 123,
                            'SizeInGB': 123,
                            'Throughput': 123
                        },
                        'VolumesPerInstance': 123
                    },
                ],
                'EbsOptimized': True|False
            },
            'AutoScalingPolicy': {
                'Constraints': {
                    'MinCapacity': 123,
                    'MaxCapacity': 123
                },
                'Rules': [
                    {
                        'Name': 'string',
                        'Description': 'string',
                        'Action': {
                            'Market': 'ON_DEMAND'|'SPOT',
                            'SimpleScalingPolicyConfiguration': {
                                'AdjustmentType': 'CHANGE_IN_CAPACITY'|'PERCENT_CHANGE_IN_CAPACITY'|'EXACT_CAPACITY',
                                'ScalingAdjustment': 123,
                                'CoolDown': 123
                            }
                        },
                        'Trigger': {
                            'CloudWatchAlarmDefinition': {
                                'ComparisonOperator': 'GREATER_THAN_OR_EQUAL'|'GREATER_THAN'|'LESS_THAN'|'LESS_THAN_OR_EQUAL',
                                'EvaluationPeriods': 123,
                                'MetricName': 'string',
                                'Namespace': 'string',
                                'Period': 123,
                                'Statistic': 'SAMPLE_COUNT'|'AVERAGE'|'SUM'|'MINIMUM'|'MAXIMUM',
                                'Threshold': 123.0,
                                'Unit': 'NONE'|'SECONDS'|'MICRO_SECONDS'|'MILLI_SECONDS'|'BYTES'|'KILO_BYTES'|'MEGA_BYTES'|'GIGA_BYTES'|'TERA_BYTES'|'BITS'|'KILO_BITS'|'MEGA_BITS'|'GIGA_BITS'|'TERA_BITS'|'PERCENT'|'COUNT'|'BYTES_PER_SECOND'|'KILO_BYTES_PER_SECOND'|'MEGA_BYTES_PER_SECOND'|'GIGA_BYTES_PER_SECOND'|'TERA_BYTES_PER_SECOND'|'BITS_PER_SECOND'|'KILO_BITS_PER_SECOND'|'MEGA_BITS_PER_SECOND'|'GIGA_BITS_PER_SECOND'|'TERA_BITS_PER_SECOND'|'COUNT_PER_SECOND',
                                'Dimensions': [
                                    {
                                        'Key': 'string',
                                        'Value': 'string'
                                    },
                                ]
                            }
                        }
                    },
                ]
            },
            'CustomAmiId': 'string'
        },
    ],
    JobFlowId='string'
)

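
Most keys in the request are optional. As a minimal sketch, the smallest useful request carries only the three per-group fields marked [REQUIRED] below, plus the job flow ID. The helper name `minimal_request`, the cluster ID `j-EXAMPLE`, and the instance type `m5.xlarge` are illustrative assumptions, not part of the API.

```python
def minimal_request(cluster_id, role, instance_type, count, bid_price=None):
    """Build keyword arguments for add_instance_groups (hypothetical helper).

    Only InstanceRole, InstanceType, and InstanceCount are set for the group.
    When bid_price is given, the group is requested on the Spot market.
    """
    group = {
        "InstanceRole": role,
        "InstanceType": instance_type,
        "InstanceCount": count,
    }
    if bid_price is not None:
        group["Market"] = "SPOT"
        # Either an amount in USD as a string, or the literal "OnDemandPrice".
        group["BidPrice"] = bid_price
    return {"InstanceGroups": [group], "JobFlowId": cluster_id}


# Placeholder identifiers; substitute values from your own cluster.
on_demand = minimal_request("j-EXAMPLE", "TASK", "m5.xlarge", 2)
spot = minimal_request("j-EXAMPLE", "TASK", "m5.xlarge", 2, bid_price="OnDemandPrice")
```

With a real client, the resulting dictionary would be passed as `client.add_instance_groups(**on_demand)`.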

Parameters:

  • InstanceGroups (list) – [REQUIRED]

    Instance groups to add.

    • (dict) –

      Configuration defining a new instance group.

      • Name (string) –

        Friendly name given to the instance group.

      • Market (string) –

        Market type of the Amazon EC2 instances used to create a cluster node.

      • InstanceRole (string) – [REQUIRED]

        The role of the instance group in the cluster.

      • BidPrice (string) –

        If specified, indicates that the instance group uses Spot Instances. This is the maximum price you are willing to pay for Spot Instances. Specify OnDemandPrice to set the amount equal to the On-Demand price, or specify an amount in USD.

      • InstanceType (string) – [REQUIRED]

        The Amazon EC2 instance type for all instances in the instance group.

      • InstanceCount (integer) – [REQUIRED]

        Target number of instances for the instance group.

      • Configurations (list) –


        Amazon EMR releases 4.x or later.

        The list of configurations supplied for an Amazon EMR cluster instance group. You can specify a separate configuration for each instance group (master, core, and task).

        • (dict) –


          Amazon EMR releases 4.x or later.

          An optional configuration specification to be used when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file. For more information, see Configuring Applications.

          • Classification (string) –

            The classification within a configuration.

          • Configurations (list) –

            A list of additional configurations to apply within a configuration object.

          • Properties (dict) –

            A set of properties specified within a configuration classification.

            • (string) –

              • (string) –

      • EbsConfiguration (dict) –

        EBS configurations that will be attached to each Amazon EC2 instance in the instance group.

        • EbsBlockDeviceConfigs (list) –

          An array of Amazon EBS volume specifications attached to a cluster instance.

          • (dict) –

            Configuration of a requested EBS block device for the instance group, together with the number of such volumes attached to every instance.

            • VolumeSpecification (dict) – [REQUIRED]

              EBS volume specifications such as volume type, IOPS, size (GiB) and throughput (MiB/s) that are requested for the EBS volume attached to an Amazon EC2 instance in the cluster.

              • VolumeType (string) – [REQUIRED]

                The volume type. Volume types supported are gp3, gp2, io1, st1, sc1, and standard.

              • Iops (integer) –

                The number of I/O operations per second (IOPS) that the volume supports.

              • SizeInGB (integer) – [REQUIRED]

                The volume size, in gibibytes (GiB). This can be a number from 1 - 1024. If the volume type is EBS-optimized, the minimum value is 10.

              • Throughput (integer) –

                The throughput, in mebibytes per second (MiB/s). This optional parameter can be a number from 125 - 1000 and is valid only for gp3 volumes.

            • VolumesPerInstance (integer) –

              Number of EBS volumes with a specific volume configuration that are associated with every instance in the instance group.

        • EbsOptimized (boolean) –

          Indicates whether an Amazon EBS volume is EBS-optimized.

      • AutoScalingPolicy (dict) –

        An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. The automatic scaling policy defines how an instance group dynamically adds and terminates Amazon EC2 instances in response to the value of a CloudWatch metric. See PutAutoScalingPolicy.

        • Constraints (dict) – [REQUIRED]

          The upper and lower Amazon EC2 instance limits for an automatic scaling policy. Automatic scaling activity will not cause an instance group to grow above or shrink below these limits.

          • MinCapacity (integer) – [REQUIRED]

            The lower boundary of Amazon EC2 instances in an instance group below which scaling activities are not allowed to shrink. Scale-in activities will not terminate instances below this boundary.

          • MaxCapacity (integer) – [REQUIRED]

            The upper boundary of Amazon EC2 instances in an instance group beyond which scaling activities are not allowed to grow. Scale-out activities will not add instances beyond this boundary.

        • Rules (list) – [REQUIRED]

          The scale-in and scale-out rules that comprise the automatic scaling policy.

          • (dict) –

            A scale-in or scale-out rule that defines scaling activity, including the CloudWatch metric alarm that triggers activity, how Amazon EC2 instances are added or removed, and the periodicity of adjustments. The automatic scaling policy for an instance group can comprise one or more automatic scaling rules.

            • Name (string) – [REQUIRED]

              The name used to identify an automatic scaling rule. Rule names must be unique within a scaling policy.

            • Description (string) –

              A friendly, more verbose description of the automatic scaling rule.

            • Action (dict) – [REQUIRED]

              The scaling action to perform when the rule's trigger condition is met.

              • Market (string) –

                Not available for instance groups. Instance groups use the market type specified for the group.

              • SimpleScalingPolicyConfiguration (dict) – [REQUIRED]

                The type of adjustment the automatic scaling activity makes when triggered, and the periodicity of the adjustment.

                • AdjustmentType (string) –

                  The way in which Amazon EC2 instances are added (if ScalingAdjustment is a positive number) or terminated (if ScalingAdjustment is a negative number) each time the scaling activity is triggered. CHANGE_IN_CAPACITY is the default. CHANGE_IN_CAPACITY indicates that the Amazon EC2 instance count increments or decrements by ScalingAdjustment, which should be expressed as an integer. PERCENT_CHANGE_IN_CAPACITY indicates the instance count increments or decrements by the percentage specified by ScalingAdjustment, which should be expressed as an integer. For example, 20 indicates an increase in 20% increments of cluster capacity. EXACT_CAPACITY indicates the scaling activity results in an instance group with the number of Amazon EC2 instances specified by ScalingAdjustment, which should be expressed as a positive integer.

                • ScalingAdjustment (integer) – [REQUIRED]

                  The amount by which to scale in or scale out, based on the specified AdjustmentType. A positive value adds to the instance group’s Amazon EC2 instance count while a negative number removes instances. If AdjustmentType is set to EXACT_CAPACITY, the number should only be a positive integer. If AdjustmentType is set to PERCENT_CHANGE_IN_CAPACITY, the value should express the percentage as an integer. For example, -20 indicates a decrease in 20% increments of cluster capacity.

                • CoolDown (integer) –

                  The amount of time, in seconds, after a scaling activity completes before any further trigger-related scaling activities can start. The default value is 0.

            • Trigger (dict) – [REQUIRED]

              The CloudWatch alarm definition that determines when automatic scaling activity is triggered.

              • CloudWatchAlarmDefinition (dict) – [REQUIRED]

                The definition of a CloudWatch metric alarm. When the defined alarm conditions are met along with other trigger parameters, scaling activity begins.

                • ComparisonOperator (string) – [REQUIRED]

                  Determines how the metric specified by MetricName is compared to the value specified by Threshold.

                • EvaluationPeriods (integer) –

                  The number of periods, in five-minute increments, during which the alarm condition must exist before the alarm triggers automatic scaling activity. The default value is 1.

                • MetricName (string) – [REQUIRED]

                  The name of the CloudWatch metric that is watched to determine an alarm condition.

                • Namespace (string) –

                  The namespace for the CloudWatch metric. The default is AWS/ElasticMapReduce.

                • Period (integer) – [REQUIRED]

                  The period, in seconds, over which the statistic is applied. CloudWatch metrics for Amazon EMR are emitted every five minutes (300 seconds), so if you specify a CloudWatch metric, specify 300.

                • Statistic (string) –

                  The statistic to apply to the metric associated with the alarm. The default is AVERAGE.

                • Threshold (float) – [REQUIRED]

                  The value against which the specified statistic is compared.

                • Unit (string) –

                  The unit of measure associated with the CloudWatch metric being watched. The value specified for Unit must correspond to the units specified in the CloudWatch metric.

                • Dimensions (list) –

                  A list of CloudWatch metric dimensions.

                  • (dict) –

                    A CloudWatch dimension, specified as a Key (known as a Name in CloudWatch) and Value pair. By default, Amazon EMR uses one dimension whose Key is JobFlowId and whose Value is ${emr.clusterId}, a variable representing the cluster ID. This enables the rule to bootstrap when the cluster ID becomes available.

                    • Key (string) –

                      The dimension name.

                    • Value (string) –

                      The dimension value.

      • CustomAmiId (string) –

        The custom AMI ID to use for the provisioned instance group.

  • JobFlowId (string) – [REQUIRED]


    Job flow in which to add the instance groups.
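
The nested structure described above can be assembled as plain Python dictionaries before the call. The following is a sketch under stated assumptions rather than an official helper: the function names `scale_out_rule` and `build_task_group`, the group name, the `gp3` volume choice, and the metric `YARNMemoryAvailablePercentage` with a threshold of 15 are all illustrative. It checks two of the limits documented above locally (SizeInGB from 1 to 1024, MinCapacity not above MaxCapacity) and omits Dimensions so that the documented default applies.

```python
def scale_out_rule(name, metric_name, threshold, adjustment):
    """Build one scaling rule dict (hypothetical helper, not part of boto3)."""
    return {
        "Name": name,
        "Action": {
            "SimpleScalingPolicyConfiguration": {
                "AdjustmentType": "CHANGE_IN_CAPACITY",
                "ScalingAdjustment": adjustment,  # positive adds instances
                "CoolDown": 300,  # seconds to wait after a scaling activity
            }
        },
        "Trigger": {
            "CloudWatchAlarmDefinition": {
                "ComparisonOperator": "LESS_THAN",
                "EvaluationPeriods": 1,
                "MetricName": metric_name,
                "Namespace": "AWS/ElasticMapReduce",
                "Period": 300,  # EMR metrics are emitted every 300 seconds
                "Statistic": "AVERAGE",
                "Threshold": float(threshold),
                "Unit": "PERCENT",
            }
        },
    }


def build_task_group(instance_type, count, volume_gib, min_capacity, max_capacity, rules):
    """Build one InstanceGroups entry (hypothetical helper, not part of boto3)."""
    if not 1 <= volume_gib <= 1024:
        raise ValueError("SizeInGB must be between 1 and 1024")
    if min_capacity > max_capacity:
        raise ValueError("MinCapacity must not exceed MaxCapacity")
    return {
        "Name": "example-task-group",  # illustrative name
        "Market": "ON_DEMAND",
        "InstanceRole": "TASK",
        "InstanceType": instance_type,
        "InstanceCount": count,
        "EbsConfiguration": {
            "EbsBlockDeviceConfigs": [
                {
                    "VolumeSpecification": {"VolumeType": "gp3", "SizeInGB": volume_gib},
                    "VolumesPerInstance": 1,
                }
            ],
            "EbsOptimized": True,
        },
        "AutoScalingPolicy": {
            "Constraints": {"MinCapacity": min_capacity, "MaxCapacity": max_capacity},
            "Rules": rules,
        },
    }


rule = scale_out_rule("scale-out-on-low-memory", "YARNMemoryAvailablePercentage", 15, 1)
group = build_task_group("m5.xlarge", 2, 64, 2, 10, [rule])
```

The resulting `group` would be passed as one element of the InstanceGroups list. Validating locally only catches the limits checked here; the service applies its own validation to the full request.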

Return type:
  dict

Returns:

Response Syntax

{
    'JobFlowId': 'string',
    'InstanceGroupIds': [
        'string',
    ],
    'ClusterArn': 'string'
}

Response Structure

  • (dict) –

    Output from an AddInstanceGroups call.

    • JobFlowId (string) –

      The job flow ID in which the instance groups are added.

    • InstanceGroupIds (list) –

      Instance group IDs of the newly created instance groups.

      • (string) –

    • ClusterArn (string) –

      The Amazon Resource Name (ARN) of the cluster.
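
To illustrate how the response might be consumed, the sketch below wraps the call and returns the new instance group IDs. `FakeEmrClient` is a hypothetical stand-in so the example runs without AWS credentials; it mirrors only the response keys documented above, and the IDs and ARN it returns are made-up placeholders. In real use the client would come from `boto3.client("emr")`.

```python
def add_groups(client, cluster_id, groups):
    """Call add_instance_groups and return the IDs of the new groups."""
    response = client.add_instance_groups(InstanceGroups=groups, JobFlowId=cluster_id)
    return response["InstanceGroupIds"]


class FakeEmrClient:
    """Hypothetical stand-in for an EMR client; returns placeholder values."""

    def add_instance_groups(self, InstanceGroups, JobFlowId):
        return {
            "JobFlowId": JobFlowId,
            # One placeholder ID per requested group.
            "InstanceGroupIds": ["ig-EXAMPLE%d" % i for i in range(len(InstanceGroups))],
            "ClusterArn": "arn:aws:elasticmapreduce:us-east-1:123456789012:cluster/" + JobFlowId,
        }


new_ids = add_groups(
    FakeEmrClient(),
    "j-EXAMPLE",
    [{"InstanceRole": "TASK", "InstanceType": "m5.xlarge", "InstanceCount": 2}],
)
print(new_ids)  # prints ['ig-EXAMPLE0']
```

Accepting the client as a parameter keeps the function easy to test with a stub such as the one above.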