Tidbits | Aug. 30, 2016

Today I Learned – The Vagary of AWS Availability Zones

by Stephen Spencer

In daily parlance, “zone” is used in a variety of contexts:

  1. the neutral zone (Romulan or Klingon--take your pick)
  2. the demilitarized zone
  3. a safe zone
  4. zoned-out
  5. in the zone
  6. the "friend" zone
  7. land/property-use zoning

The commonality these contexts share is the idea of a specific space;  whether physical or metaphysical, virtual or real, a “zone” defines a discrete space for a specific use or set of uses (or non-use in the case of #4).

I hadn’t thought much about whether or not the term ‘availability zone’ was etymologically accurate. 

On some level (Amazon NOCs), I imagine it is precise enough. As for external consumers of AWS resources, I imagine most of them will go through life without giving it a second thought.  Apparently, I was not destined to walk amongst them.

My goal was simple: Using the Python boto3 and botocore modules, I wanted to be able to specify a supernet, an AWS region, a CIDR mask and a VPC id and have it do the math and create the desired subnets that would subsequently be attached to the target region’s availability zones.
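The "do the math" part can be sketched without touching AWS at all. The snippet below is only an illustration: it uses the standard-library `ipaddress` module as a stand-in for the `ipaddr` module used later in this post, and the helper name `split_supernet` is made up for the example.

```python
import ipaddress
from itertools import islice


def split_supernet(supernet, new_prefix, count):
    """Return up to `count` subnets of `supernet` as CIDR strings.

    `split_supernet` is a hypothetical helper name, not part of any library.
    """
    network = ipaddress.ip_network(supernet)
    # subnets() yields the child networks lazily; islice takes the first few.
    return [str(s) for s in islice(network.subnets(new_prefix=new_prefix), count)]


# One /24 for each of five reported AZs, carved out of a /21.
print(split_supernet("10.255.248.0/21", 24, 5))
```

Asking for more subnets than the supernet holds simply returns all of them, so a /21 split into /24s tops out at eight.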

What I learned:

  • the concept of "zone" in AWS-land is fluid.  Something I noticed: the AZ is the only AWS thing that does not have an associated ID (at least not available to mortals)
  • the EC2 API will give you a list of possible availability zones. They may all be active or... not. (e.g., us-east-1 currently has five zones. Four of them exist. The letter associated with the dead zone is chosen by Amazon when you create your account)

When you ask for the list of AZs within us-east-1:

[
    "AvailabilityZones",
    [
        {
            "State": "available",
            "RegionName": "us-east-1",
            "Messages": [],
            "ZoneName": "us-east-1a"
        },
        {
            "State": "available",
            "RegionName": "us-east-1",
            "Messages": [],
            "ZoneName": "us-east-1b"
        },
        {
            "State": "available",
            "RegionName": "us-east-1",
            "Messages": [],
            "ZoneName": "us-east-1c"
        },
        {
            "State": "available",
            "RegionName": "us-east-1",
            "Messages": [],
            "ZoneName": "us-east-1d"
        },
        {
            "State": "available",
            "RegionName": "us-east-1",
            "Messages": [],
            "ZoneName": "us-east-1e"
        }
    ]
]
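Pulling the zone names out of that structure is a one-liner. The sketch below assumes the response is a plain dictionary with an "AvailabilityZones" key, shaped like the listing above; `zone_names` is a hypothetical helper and the sample data is trimmed to two entries.

```python
def zone_names(response, state="available"):
    """Collect the ZoneName of every zone reported in the given state."""
    return sorted(
        zone["ZoneName"]
        for zone in response.get("AvailabilityZones", [])
        if zone.get("State") == state
    )


# Trimmed sample in the same shape as the listing above.
sample = {
    "AvailabilityZones": [
        {"State": "available", "RegionName": "us-east-1",
         "Messages": [], "ZoneName": "us-east-1b"},
        {"State": "available", "RegionName": "us-east-1",
         "Messages": [], "ZoneName": "us-east-1a"},
    ]
}

print(zone_names(sample))
```

Note that every zone reports "available" here, so filtering on State does not reveal which zone will refuse a new subnet.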

Today there are only 4 active zones.  Which one is the Dead Zone(tm)?  The solution: simple brute force!

  1. Get the list of possible AZs for the region
  2. Create a dummy VPC using a moderate sized IPv4 supernet (/21)
  3. Divide the supernet into several subnets -- there are 8 /24 subnets in a /21 network so unless Amazon rolls out a region with more than 8 AZs, we're covered
  4. Start creating subnets on the dummy VPC
  5. The response from Amazon is a dictionary with a ResponseMetadata key.  If response['ResponseMetadata']['HTTPStatusCode'] is 200, you have a live AZ; for a dead one, boto3 raises a ClientError instead
  6. Cache this on a per-account basis.

import boto3
from botocore.exceptions import ClientError
from ipaddr import IPNetwork    # https://github.com/google/ipaddr-py

def verified_az_list(region='us-east-1', key=None, keyid=None, profile=None, supernet='10.255.248.0/21'):
    retval = []
	
    rc_boto = boto3.Session(aws_access_key_id=keyid,         # 1
                            aws_secret_access_key=key,
                            profile_name=profile,
                            region_name=region)

    # build the client from the session so the credentials and region apply
    mc_ec2 = rc_boto.client('ec2')

    # look the zone list up by key rather than relying on dict ordering
    az_dat = mc_ec2.describe_availability_zones()['AvailabilityZones']

    az_list = [t['ZoneName'] for t in az_dat]

    resp = mc_ec2.create_vpc(                                # 2
        CidrBlock=supernet, InstanceTenancy='default'
    )

    if resp['ResponseMetadata']['HTTPStatusCode'] != 200:
        return False

    vpc_id = resp['Vpc']['VpcId']

    subnets = [                 # for list comp. haters: using the ipaddr.IPNetwork.exploded
        subnet.exploded         # property to extract the CIDR string--we only need one
        for subnet in IPNetwork(  # subnet per reported AZ                # 3
            supernet
        ).subnet(
            new_prefix=24       # /24s, matching step 3 above
        )[:len(az_list)]
    ]

    # THE BRUTALITY
    for az, subnet in zip(az_list, subnets):                 # 4
        try:
            subnet_dat = mc_ec2.create_subnet(
                VpcId=vpc_id,
                CidrBlock=subnet,
                AvailabilityZone=az
            )['Subnet']

            retval.append(az)                                # 5
        except ClientError:
            pass    # the dead zone rejects the request
        else:
            # the AZ is live; remove the probe subnet again
            mc_ec2.delete_subnet(SubnetId=subnet_dat['SubnetId'])

    mc_ec2.delete_vpc(VpcId=vpc_id)

    return retval
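
Step 6, the per-account cache, is not part of the function above. Here is one way it might look: a minimal sketch that assumes a small JSON file keyed by account ID and then region. The helper names, the file layout and the account ID in the demo are all made up for illustration; the real account ID could come from something like an STS get_caller_identity() call.

```python
import json
import os
import tempfile


def _read_cache(cache_path):
    """Load the whole cache file; a missing or corrupt file counts as empty."""
    try:
        with open(cache_path) as fh:
            return json.load(fh)
    except (OSError, ValueError):
        return {}


def load_cached_azs(cache_path, account_id, region):
    """Return the cached AZ list for (account, region), or None on a miss."""
    return _read_cache(cache_path).get(account_id, {}).get(region)


def store_cached_azs(cache_path, account_id, region, az_list):
    """Record the verified AZ list for (account, region)."""
    cache = _read_cache(cache_path)
    cache.setdefault(account_id, {})[region] = list(az_list)
    with open(cache_path, "w") as fh:
        json.dump(cache, fh, indent=2)


# Demo with a throwaway file and a placeholder account ID.
cache_file = os.path.join(tempfile.mkdtemp(), "az_cache.json")
store_cached_azs(cache_file, "123456789012", "us-east-1",
                 ["us-east-1a", "us-east-1b"])
print(load_cached_azs(cache_file, "123456789012", "us-east-1"))
```

The idea would be to check the cache first and only fall back to the brute-force probe on a miss, since the probe creates and deletes real resources.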

To be fair...

From Amazon’s very own “What is Amazon EC2?” document:

An Availability Zone is represented by a region code followed by a letter identifier; for example, us-east-1a. To ensure that resources are distributed across the Availability Zones for a region, we independently map Availability Zones to identifiers for each account. For example, your Availability Zone us-east-1a might not be the same location as us-east-1a for another account. There's no way for you to coordinate Availability Zones between accounts.

I used to thoughtlessly cause the Death of Trees by printing off, binding, then storing such manuals on the back of toilets for perusal when entrapped by biological necessity.  I don’t do that anymore--the ascension of tablets and smartphones has made reaching for 3-ring binders filled with pounds of API documentation a rare compulsion.  The result: I don’t think I’d given the EC2 introductory manual more than a passing glance until recently.

P.S.

To the keepers of Amazon’s AWS documentation: your mobile user experience is rotten. Save a forest; give us a nice mobile-friendly doc option!  (PDF -> MOBI doesn’t count! :P )

