describe_cluster

EMR.Client.describe_cluster(**kwargs)

Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on.

See also: AWS API Documentation

Request Syntax

response = client.describe_cluster(
    ClusterId='string'
)
Parameters:

ClusterId (string) –

[REQUIRED]

The identifier of the cluster to describe.
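
As a minimal sketch of the call described above, the helper below wraps describe_cluster and returns only the cluster state. The helper name get_cluster_state is hypothetical, and the client argument is assumed to be an EMR client created elsewhere (for example with boto3.client('emr')), with credentials and a Region already configured.

```python
from typing import Any


def get_cluster_state(client: Any, cluster_id: str) -> str:
    """Return the State string for one cluster.

    `client` is assumed to be an EMR client; `cluster_id` is the value
    passed as the required ClusterId parameter.
    """
    response = client.describe_cluster(ClusterId=cluster_id)
    # The cluster description sits under the top-level 'Cluster' key.
    return response["Cluster"]["Status"]["State"]
```

Taking the client as an argument keeps the helper easy to exercise with a stand-in object in tests, without contacting the service.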

Return type:

dict

Returns:

Response Syntax

{
    'Cluster': {
        'Id': 'string',
        'Name': 'string',
        'Status': {
            'State': 'STARTING'|'BOOTSTRAPPING'|'RUNNING'|'WAITING'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
            'StateChangeReason': {
                'Code': 'INTERNAL_ERROR'|'VALIDATION_ERROR'|'INSTANCE_FAILURE'|'INSTANCE_FLEET_TIMEOUT'|'BOOTSTRAP_FAILURE'|'USER_REQUEST'|'STEP_FAILURE'|'ALL_STEPS_COMPLETED',
                'Message': 'string'
            },
            'Timeline': {
                'CreationDateTime': datetime(2015, 1, 1),
                'ReadyDateTime': datetime(2015, 1, 1),
                'EndDateTime': datetime(2015, 1, 1)
            },
            'ErrorDetails': [
                {
                    'ErrorCode': 'string',
                    'ErrorData': [
                        {
                            'string': 'string'
                        },
                    ],
                    'ErrorMessage': 'string'
                },
            ]
        },
        'Ec2InstanceAttributes': {
            'Ec2KeyName': 'string',
            'Ec2SubnetId': 'string',
            'RequestedEc2SubnetIds': [
                'string',
            ],
            'Ec2AvailabilityZone': 'string',
            'RequestedEc2AvailabilityZones': [
                'string',
            ],
            'IamInstanceProfile': 'string',
            'EmrManagedMasterSecurityGroup': 'string',
            'EmrManagedSlaveSecurityGroup': 'string',
            'ServiceAccessSecurityGroup': 'string',
            'AdditionalMasterSecurityGroups': [
                'string',
            ],
            'AdditionalSlaveSecurityGroups': [
                'string',
            ]
        },
        'InstanceCollectionType': 'INSTANCE_FLEET'|'INSTANCE_GROUP',
        'LogUri': 'string',
        'LogEncryptionKmsKeyId': 'string',
        'RequestedAmiVersion': 'string',
        'RunningAmiVersion': 'string',
        'ReleaseLabel': 'string',
        'AutoTerminate': True|False,
        'TerminationProtected': True|False,
        'UnhealthyNodeReplacement': True|False,
        'VisibleToAllUsers': True|False,
        'Applications': [
            {
                'Name': 'string',
                'Version': 'string',
                'Args': [
                    'string',
                ],
                'AdditionalInfo': {
                    'string': 'string'
                }
            },
        ],
        'Tags': [
            {
                'Key': 'string',
                'Value': 'string'
            },
        ],
        'ServiceRole': 'string',
        'NormalizedInstanceHours': 123,
        'MasterPublicDnsName': 'string',
        'Configurations': [
            {
                'Classification': 'string',
                'Configurations': {'... recursive ...'},
                'Properties': {
                    'string': 'string'
                }
            },
        ],
        'SecurityConfiguration': 'string',
        'AutoScalingRole': 'string',
        'ScaleDownBehavior': 'TERMINATE_AT_INSTANCE_HOUR'|'TERMINATE_AT_TASK_COMPLETION',
        'CustomAmiId': 'string',
        'EbsRootVolumeSize': 123,
        'RepoUpgradeOnBoot': 'SECURITY'|'NONE',
        'KerberosAttributes': {
            'Realm': 'string',
            'KdcAdminPassword': 'string',
            'CrossRealmTrustPrincipalPassword': 'string',
            'ADDomainJoinUser': 'string',
            'ADDomainJoinPassword': 'string'
        },
        'ClusterArn': 'string',
        'OutpostArn': 'string',
        'StepConcurrencyLevel': 123,
        'PlacementGroups': [
            {
                'InstanceRole': 'MASTER'|'CORE'|'TASK',
                'PlacementStrategy': 'SPREAD'|'PARTITION'|'CLUSTER'|'NONE'
            },
        ],
        'OSReleaseLabel': 'string',
        'EbsRootVolumeIops': 123,
        'EbsRootVolumeThroughput': 123
    }
}
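
The Status portion of the response syntax above can be condensed into a few values, as in the sketch below. The function name summarize_status and the keys of the returned dict are hypothetical. The sketch assumes that keys the service does not return are simply absent from the response, so it reads them with .get(), and that the Timeline values are timezone-aware datetime objects.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def summarize_status(cluster: Dict[str, Any],
                     now: Optional[datetime] = None) -> Dict[str, Any]:
    """Condense response['Cluster']['Status'] into a flat summary dict."""
    status = cluster.get("Status", {})
    timeline = status.get("Timeline", {})
    created = timeline.get("CreationDateTime")
    ready = timeline.get("ReadyDateTime")
    ended = timeline.get("EndDateTime")
    # Assumption: timestamps are timezone-aware, so `now` must be as well.
    now = now or datetime.now(timezone.utc)

    # Time from creation until the cluster was ready to run steps.
    startup = (ready - created).total_seconds() if created and ready else None
    # Time from creation until termination, or until `now` if still alive.
    lifetime = ((ended or now) - created).total_seconds() if created else None

    errors = [
        f"{e.get('ErrorCode', '')}: {e.get('ErrorMessage', '')}"
        for e in status.get("ErrorDetails", [])
    ]
    return {
        "state": status.get("State"),
        "reason_code": status.get("StateChangeReason", {}).get("Code"),
        "startup_seconds": startup,
        "lifetime_seconds": lifetime,
        "errors": errors,
    }
```

It would be called as summarize_status(response['Cluster']). The now parameter exists only so that the lifetime calculation can be made deterministic.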

Response Structure

  • (dict) –

    This output contains the description of the cluster.

    • Cluster (dict) –

      This output contains the details for the requested cluster.

      • Id (string) –

        The unique identifier for the cluster.

      • Name (string) –

        The name of the cluster. This parameter can’t contain the characters <, >, $, |, or ` (backtick).

      • Status (dict) –

        The current status details about the cluster.

        • State (string) –

          The current state of the cluster.

        • StateChangeReason (dict) –

          The reason for the cluster status change.

          • Code (string) –

            The programmatic code for the state change reason.

          • Message (string) –

            The descriptive message for the state change reason.

        • Timeline (dict) –

          A timeline that represents the status of a cluster over the lifetime of the cluster.

          • CreationDateTime (datetime) –

            The creation date and time of the cluster.

          • ReadyDateTime (datetime) –

            The date and time when the cluster was ready to run steps.

          • EndDateTime (datetime) –

            The date and time when the cluster was terminated.

        • ErrorDetails (list) –

          A list of tuples that provides information about the errors that caused a cluster to terminate. This structure can contain up to 10 different ErrorDetail tuples.

          • (dict) –

            A tuple that provides information about an error that caused a cluster to terminate.

            • ErrorCode (string) –

              The name or code associated with the error.

            • ErrorData (list) –

              A list of key-value pairs that provides contextual information about why an error occurred.

              • (dict) –

                • (string) –

                  • (string) –

            • ErrorMessage (string) –

              A message that describes the error.

      • Ec2InstanceAttributes (dict) –

        Provides information about the Amazon EC2 instances in a cluster grouped by category. For example, key name, subnet ID, IAM instance profile, and so on.

        • Ec2KeyName (string) –

          The name of the Amazon EC2 key pair to use when connecting with SSH into the master node as a user named “hadoop”.

        • Ec2SubnetId (string) –

          Set this parameter to the identifier of the Amazon VPC subnet where you want the cluster to launch. If you do not specify this value, and your account supports EC2-Classic, the cluster launches in EC2-Classic.

        • RequestedEc2SubnetIds (list) –

          Applies to clusters configured with the instance fleets option. Specifies the unique identifier of one or more Amazon EC2 subnets in which to launch Amazon EC2 cluster instances. Subnets must exist within the same VPC. Amazon EMR chooses the Amazon EC2 subnet with the best fit from among the list of RequestedEc2SubnetIds, and then launches all cluster instances within that subnet. If this value is not specified, and the account and Region support EC2-Classic networks, the cluster launches instances in the EC2-Classic network and uses RequestedEc2AvailabilityZones instead of this setting. If EC2-Classic is not supported, and no subnet is specified, Amazon EMR chooses the subnet for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.

          • (string) –

        • Ec2AvailabilityZone (string) –

          The Availability Zone in which the cluster will run.

        • RequestedEc2AvailabilityZones (list) –

          Applies to clusters configured with the instance fleets option. Specifies one or more Availability Zones in which to launch Amazon EC2 cluster instances when the EC2-Classic network configuration is supported. Amazon EMR chooses the Availability Zone with the best fit from among the list of RequestedEc2AvailabilityZones, and then launches all cluster instances within that Availability Zone. If you do not specify this value, Amazon EMR chooses the Availability Zone for you. RequestedEc2SubnetIds and RequestedEc2AvailabilityZones cannot be specified together.

          • (string) –

        • IamInstanceProfile (string) –

          The IAM role that was specified when the cluster was launched. The Amazon EC2 instances of the cluster assume this role.

        • EmrManagedMasterSecurityGroup (string) –

          The identifier of the Amazon EC2 security group for the master node.

        • EmrManagedSlaveSecurityGroup (string) –

          The identifier of the Amazon EC2 security group for the core and task nodes.

        • ServiceAccessSecurityGroup (string) –

          The identifier of the Amazon EC2 security group for the Amazon EMR service to access clusters in VPC private subnets.

        • AdditionalMasterSecurityGroups (list) –

          A list of additional Amazon EC2 security group IDs for the master node.

          • (string) –

        • AdditionalSlaveSecurityGroups (list) –

          A list of additional Amazon EC2 security group IDs for the core and task nodes.

          • (string) –

      • InstanceCollectionType (string) –

        Note

        The instance fleet configuration is available only in Amazon EMR releases 4.8.0 and later, excluding 5.0.x versions.

        The instance group configuration of the cluster. A value of INSTANCE_GROUP indicates a uniform instance group configuration. A value of INSTANCE_FLEET indicates an instance fleets configuration.

      • LogUri (string) –

        The path to the Amazon S3 location where logs for this cluster are stored.

      • LogEncryptionKmsKeyId (string) –

        The KMS key used for encrypting log files. This attribute is only available with Amazon EMR 5.30.0 and later, excluding Amazon EMR 6.0.0.

      • RequestedAmiVersion (string) –

        The AMI version requested for this cluster.

      • RunningAmiVersion (string) –

        The AMI version running on this cluster.

      • ReleaseLabel (string) –

        The Amazon EMR release label, which determines the version of open-source application packages installed on the cluster. Release labels are in the form emr-x.x.x, where x.x.x is an Amazon EMR release version such as emr-5.14.0. For more information about Amazon EMR release versions and included application versions and features, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/. The release label applies only to Amazon EMR releases version 4.0 and later. Earlier versions use AmiVersion.

      • AutoTerminate (boolean) –

        Specifies whether the cluster should terminate after completing all steps.

      • TerminationProtected (boolean) –

        Indicates whether Amazon EMR will lock the cluster to prevent the Amazon EC2 instances from being terminated by an API call or user intervention, or in the event of a cluster error.

      • UnhealthyNodeReplacement (boolean) –

        Indicates whether Amazon EMR should gracefully replace Amazon EC2 core instances that have degraded within the cluster.

      • VisibleToAllUsers (boolean) –

        Indicates whether the cluster is visible to IAM principals in the Amazon Web Services account associated with the cluster. When true, IAM principals in the Amazon Web Services account can perform Amazon EMR cluster actions on the cluster that their IAM policies allow. When false, only the IAM principal that created the cluster and the Amazon Web Services account root user can perform Amazon EMR actions, regardless of IAM permissions policies attached to other IAM principals.

        The default value is true if a value is not provided when creating a cluster using the Amazon EMR API RunJobFlow command, the CLI create-cluster command, or the Amazon Web Services Management Console.

      • Applications (list) –

        The applications installed on this cluster.

        • (dict) –

          With Amazon EMR release version 4.0 and later, the only accepted parameter is the application name. To pass arguments to applications, you use configuration classifications specified using configuration JSON objects. For more information, see Configuring Applications.

          With earlier Amazon EMR releases, the application is any Amazon or third-party software that you can add to the cluster. This structure contains a list of strings that indicates the software to use with the cluster and accepts a user argument list. Amazon EMR accepts and forwards the argument list to the corresponding installation script as a bootstrap action argument.

          • Name (string) –

            The name of the application.

          • Version (string) –

            The version of the application.

          • Args (list) –

            Arguments for Amazon EMR to pass to the application.

            • (string) –

          • AdditionalInfo (dict) –

            This option is for advanced users only. This is meta information about third-party applications that third-party vendors use for testing purposes.

            • (string) –

              • (string) –

      • Tags (list) –

        A list of tags associated with a cluster.

        • (dict) –

          A key-value pair containing user-defined metadata that you can associate with an Amazon EMR resource. Tags make it easier to associate clusters in various ways, such as grouping clusters to track your Amazon EMR resource allocation costs. For more information, see Tag Clusters.

          • Key (string) –

            A user-defined key, which is the minimum required information for a valid tag. For more information, see Tag.

          • Value (string) –

            A user-defined value, which is optional in a tag. For more information, see Tag Clusters.

      • ServiceRole (string) –

        The IAM role that Amazon EMR assumes in order to access Amazon Web Services resources on your behalf.

      • NormalizedInstanceHours (integer) –

        An approximation of the cost of the cluster, represented in m1.small/hours. This value is incremented one time for every hour an m1.small instance runs. Larger instances are weighted more, so an Amazon EC2 instance that is roughly four times more expensive would result in the normalized instance hours being incremented by four. This result is only an approximation and does not reflect the actual billing rate.

      • MasterPublicDnsName (string) –

        The DNS name of the master node. If the cluster is on a private subnet, this is the private DNS name. On a public subnet, this is the public DNS name.

      • Configurations (list) –

        Applies only to Amazon EMR releases 4.x and later. The list of configurations that are supplied to the Amazon EMR cluster.

        • (dict) –

          Note

          Amazon EMR releases 4.x or later.

          An optional configuration specification to be used when provisioning cluster instances, which can include configurations for applications and software bundled with Amazon EMR. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file. For more information, see Configuring Applications.

          • Classification (string) –

            The classification within a configuration.

          • Configurations (list) –

            A list of additional configurations to apply within a configuration object.

          • Properties (dict) –

            A set of properties specified within a configuration classification.

            • (string) –

              • (string) –

      • SecurityConfiguration (string) –

        The name of the security configuration applied to the cluster.

      • AutoScalingRole (string) –

        An IAM role for automatic scaling policies. The default role is EMR_AutoScaling_DefaultRole. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate Amazon EC2 instances in an instance group.

      • ScaleDownBehavior (string) –

        The way that individual Amazon EC2 instances terminate when an automatic scale-in activity occurs or an instance group is resized. TERMINATE_AT_INSTANCE_HOUR indicates that Amazon EMR terminates nodes at the instance-hour boundary, regardless of when the request to terminate the instance was submitted. This option is only available with Amazon EMR 5.1.0 and later and is the default for clusters created using that version. TERMINATE_AT_TASK_COMPLETION indicates that Amazon EMR adds nodes to a deny list and drains tasks from nodes before terminating the Amazon EC2 instances, regardless of the instance-hour boundary. With either behavior, Amazon EMR removes the least active nodes first and blocks instance termination if it could lead to HDFS corruption. TERMINATE_AT_TASK_COMPLETION is available only in Amazon EMR releases 4.1.0 and later, and is the default for versions of Amazon EMR earlier than 5.1.0.

      • CustomAmiId (string) –

        Available only in Amazon EMR releases 5.7.0 and later. The ID of a custom Amazon EBS-backed Linux AMI if the cluster uses a custom AMI.

      • EbsRootVolumeSize (integer) –

        The size, in GiB, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 4.x and later.

      • RepoUpgradeOnBoot (string) –

        Applies only when CustomAmiId is used. Specifies the type of updates that the Amazon Linux AMI package repositories apply when an instance boots using the AMI.

      • KerberosAttributes (dict) –

        Attributes for Kerberos configuration when Kerberos authentication is enabled using a security configuration. For more information, see Use Kerberos Authentication in the Amazon EMR Management Guide.

        • Realm (string) –

          The name of the Kerberos realm to which all nodes in a cluster belong. For example, EC2.INTERNAL.

        • KdcAdminPassword (string) –

          The password used within the cluster for the kadmin service on the cluster-dedicated KDC, which maintains Kerberos principals, password policies, and keytabs for the cluster.

        • CrossRealmTrustPrincipalPassword (string) –

          Required only when establishing a cross-realm trust with a KDC in a different realm. The cross-realm principal password, which must be identical across realms.

        • ADDomainJoinUser (string) –

          Required only when establishing a cross-realm trust with an Active Directory domain. A user with sufficient privileges to join resources to the domain.

        • ADDomainJoinPassword (string) –

          The Active Directory password for ADDomainJoinUser.

      • ClusterArn (string) –

        The Amazon Resource Name (ARN) of the cluster.

      • OutpostArn (string) –

        The Amazon Resource Name (ARN) of the Outpost where the cluster is launched.

      • StepConcurrencyLevel (integer) –

        Specifies the number of steps that can be executed concurrently.

      • PlacementGroups (list) –

        Placement group configured for an Amazon EMR cluster.

        • (dict) –

          Placement group configuration for an Amazon EMR cluster. The configuration specifies the placement strategy that can be applied to instance roles during cluster creation.

          To use this configuration, consider attaching managed policy AmazonElasticMapReducePlacementGroupPolicy to the Amazon EMR role.

          • InstanceRole (string) –

            Role of the instance in the cluster.

            Starting with Amazon EMR release 5.23.0, the only supported instance role is MASTER.

          • PlacementStrategy (string) –

            Amazon EC2 Placement Group strategy associated with instance role.

            Starting with Amazon EMR release 5.23.0, the only supported placement strategy is SPREAD for the MASTER instance role.

      • OSReleaseLabel (string) –

        The Amazon Linux release specified in a cluster launch RunJobFlow request. If no Amazon Linux release was specified, the default Amazon Linux release is shown in the response.

      • EbsRootVolumeIops (integer) –

        The IOPS of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.

      • EbsRootVolumeThroughput (integer) –

        The throughput, in MiB/s, of the Amazon EBS root device volume of the Linux AMI that is used for each Amazon EC2 instance. Available in Amazon EMR releases 6.15.0 and later.
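
Several fields in the structure above are lists of small dicts (Tags, Configurations), and Configurations can nest. The sketch below shows one way to reshape them for lookup or logging. The helper names tags_to_dict and flatten_configurations, and the "/"-joined path format, are hypothetical. The sketch assumes that optional keys may be absent from the response.

```python
from typing import Any, Dict, List, Optional, Tuple


def tags_to_dict(cluster: Dict[str, Any]) -> Dict[str, str]:
    """Turn the Tags list of {'Key', 'Value'} dicts into a plain dict."""
    # Value is optional in a tag, so fall back to an empty string.
    return {t["Key"]: t.get("Value", "") for t in cluster.get("Tags", [])}


def flatten_configurations(
    configs: Optional[List[Dict[str, Any]]],
    parent: str = "",
) -> List[Tuple[str, Dict[str, str]]]:
    """Walk nested Configurations depth-first.

    Returns (path, properties) pairs, where path joins the
    Classification names from the outermost level inward with '/'.
    """
    rows: List[Tuple[str, Dict[str, str]]] = []
    for config in configs or []:
        name = config.get("Classification", "")
        path = f"{parent}/{name}" if parent else name
        rows.append((path, dict(config.get("Properties", {}))))
        # Nested entries have the same shape, so recurse into them.
        rows.extend(flatten_configurations(config.get("Configurations"), path))
    return rows
```

For a response, these would be called as tags_to_dict(response['Cluster']) and flatten_configurations(response['Cluster'].get('Configurations')).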

Exceptions