AWS Non-default VPC Creation With CloudFormation

Recently I brushed up on a lot of AWS services in preparation for the AWS Solutions Architect certification exam. One of the focus areas for the exam is Virtual Private Cloud (VPC). Looking for ways to get more hands-on experience with VPC, I once again turned to my own personal website, www.robhughes.net.

A while back I transitioned the website from my own hardware to the AWS cloud using CloudFormation and documented the process here.

That approach used the default VPC that is provided for each AWS account. While the default VPC works, it could be more secure. Why is that?

  • The default VPC uses a public subnet in each availability zone (AZ). If no default subnet exists, one is created. An internet gateway is attached to your default VPC, and your default subnet has a route to the internet via the gateway. Each EC2 instance in the public subnet receives a public IP by default. All this is convenient and a bit “magical” (I’ll talk about this more in a bit) but less secure. You may or may not want your EC2 instances to access the internet, and with a public IP your instances can be accessed directly from the internet.
  • The RDS (MySQL) instance used by the website also resides in a public subnet. The database is only used by the web-tier so there is never a need for the database to be accessed directly from the internet.

AWS offers a number of recipes to create more secure applications using VPC. The scenario below was a best fit for my website:

It offers the following advantages over the default VPC:

  • Non-default VPC with public and private subnets.
  • NAT instance in each AZ that allows resources (e.g. EC2 instances) in the private subnet to access the internet, say, to download security updates. A NAT instance is an EC2 instance that implements Port Address Translation (PAT) to block inbound connections from the internet unless the connection was initiated from within the private subnet.
  • A bastion host for SSH access. A bastion host is a dedicated EC2 instance, one per VPC, from which all SSH access to your EC2 instances must be initiated.

While you could create the non-default VPC, public/private subnets, NAT/bastion instances, etc. by hand, that approach is not repeatable. Once again I turned to CloudFormation to automate the creation of the non-default VPC, public/private subnets, and NAT/bastion instances, and also the deployment of the web application into the non-default VPC.

And that’s where things get interesting. CloudFormation is good at configuration-driven deployments. It is not a general-purpose programming language and as such lacks primitives for conditional logic and iteration, both of which would have been handy for creating a VPC stack or application stack across multiple AZs while avoiding one or more of them.

CloudFormation does offer primitives to get a list of all AZs in a region, which at first seems promising. But you may not want to use all AZs. In the US East region there are four AZs (us-east-1b, us-east-1c, us-east-1d, us-east-1e). My web-tier uses the t1.micro instance type, and I found that us-east-1e does not support it. (NOTE: The AZs in a region are mapped to different names for different accounts, so us-east-1e may not be the same AZ for you.) This is one of the areas where the default VPC is a bit magical. When using auto-scaled instances with the default VPC across all the AZs in the US East region, the us-east-1e AZ is “avoided”. My guess is that the autoscaling logic notices the instance type is not available in a given AZ and removes that AZ from consideration.

So it would be great if CloudFormation allowed you to iterate over a list of, say, AZs in a region, or to conditionally skip creating resources, in my case to avoid using AZ us-east-1e. But you can’t do that with CloudFormation.

I did find a nice alternative. Matteo Rinaudo has created a very useful Python script that prompts you for some information and then spits out a CloudFormation script that can be used to create a non-default VPC for an input CIDR range, along with public and private subnets, NAT instances, routing tables, and optionally a bastion host.

I used the Python script to generate an initial cut at the CloudFormation template to create the non-default VPC, which you can find here.
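To illustrate why a general-purpose language helps here, below is a minimal sketch of the idea. It is not Matteo Rinaudo's script, and the function name, logical resource names, and /24 subnet sizing are all assumptions made for this example. It loops over a list of AZs, skips the ones you want to avoid, and emits one public and one private subnet per remaining AZ. Gateways, route tables, NAT instances, and the bastion host are left out to keep it short.

```python
import ipaddress
import json
from itertools import islice


def build_vpc_template(vpc_cidr, azs, excluded_azs=()):
    """Return a CloudFormation template (as a dict) with one public and one
    private subnet per usable AZ. Logical names are made up for this sketch."""
    # The filtering step that a plain template cannot express on its own.
    usable_azs = [az for az in azs if az not in excluded_azs]
    if not usable_azs:
        raise ValueError("no availability zones left after exclusions")

    network = ipaddress.ip_network(vpc_cidr)
    # Carve the VPC range into /24 blocks: two per usable AZ (public, private).
    needed = 2 * len(usable_azs)
    blocks = list(islice(network.subnets(new_prefix=24), needed))
    if len(blocks) < needed:
        raise ValueError("VPC CIDR is too small for the requested subnets")

    resources = {
        "VPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": str(network)},
        }
    }
    for index, az in enumerate(usable_azs):
        for offset, tier in enumerate(("Public", "Private")):
            resources[f"{tier}Subnet{index + 1}"] = {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "VPC"},
                    "AvailabilityZone": az,
                    "CidrBlock": str(blocks[2 * index + offset]),
                },
            }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


template = build_vpc_template(
    "10.0.0.0/16",
    ["us-east-1b", "us-east-1c", "us-east-1d", "us-east-1e"],
    excluded_azs=["us-east-1e"],
)
print(json.dumps(template, indent=2))
```

The generated JSON contains no trace of the excluded AZ, which is exactly the kind of conditional output that is awkward to get from a static template.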

The existing CloudFormation template to create the web application stack requires a number of updates to take advantage of the non-default VPC. Here’s a summary of the changes needed for the application stack to use the VPC resources:

  • Added a CloudFormation parameter, named VPCID, to input the VPC ID. This is used to specify the VPC wherever application stack resources are created in the CloudFormation template. In particular, with a non-default VPC you must only use security groups associated with that VPC. For the application this means the security groups for the load balancer, the web-tier EC2 instances, and the RDS instance must all be linked to the VPC, e.g.

    "LoadBalancerSecurityGroup": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "ELB Security Group with HTTP access on port 80",
        "VpcId": {
          "Ref": "VPCID"
        },
        ...
      }
    }
    
  • Added a CloudFormation parameter, named PublicSubnetIDs, to specify the list of public subnet IDs across the AZs. This list is used by the CloudFormation template to filter which AZs will be used when creating web-tier EC2 instances, e.g.

    "WebServerGroup": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "Properties": {
        "VPCZoneIdentifier": {
          "Ref": "PublicSubnetIDs"
        },
        ...
      }
    }
    
  • Added a CloudFormation parameter, named PrivateSubnetIDs, to specify the list of private subnet IDs across the AZs. Any stack resources that need to reside in a private subnet can take advantage of this list. In my case the multi-AZ RDS instance uses this list to indicate where RDS instances can reside, e.g.

    "DBSubnetGroup": {
      "Type": "AWS::RDS::DBSubnetGroup",
      "Properties": {
        "DBSubnetGroupDescription": "Subnets available for the RDS DB Instance",
        "SubnetIds": {
          "Ref": "PrivateSubnetIDs"
        },
        ...
      }
    }
    
  • Added a CloudFormation parameter, named BastionSecurityGroup, to specify the security group where the bastion host resides. The ID of the bastion host security group is used in the security group ingress rules for the web-tier EC2 instances to restrict SSH access to those instances, e.g.

    "SecurityGroupIngress": [
      {
        "IpProtocol": "tcp",
        "FromPort": "80",
        "ToPort": "80",
        "SourceSecurityGroupId": {
          "Ref": "LoadBalancerSecurityGroup"
        }
      },
      {
        "IpProtocol": "tcp",
        "FromPort": "22",
        "ToPort": "22",
        "SourceSecurityGroupId": {
          "Ref": "BastionSecurityGroup"
        }
      }
    ]
    

By using CloudFormation parameters, information can be provided at stack creation time without hardcoding it in the rest of the CloudFormation template.
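As a rough sketch of what this looks like end to end, the snippet below declares the four parameters and then builds the key/value list that CloudFormation's CreateStack call accepts. The parameter names come from the changes described above; the AWS-specific parameter types are an assumption on my part (plain String and CommaDelimitedList types would serve as well), the helper name is hypothetical, and the IDs are placeholders.

```python
import json

# "Parameters" section for the application stack template.
# Names are from the post; the types and descriptions are assumptions.
parameters = {
    "VPCID": {
        "Type": "AWS::EC2::VPC::Id",
        "Description": "ID of the non-default VPC",
    },
    "PublicSubnetIDs": {
        "Type": "List<AWS::EC2::Subnet::Id>",
        "Description": "Public subnets, one per AZ in use",
    },
    "PrivateSubnetIDs": {
        "Type": "List<AWS::EC2::Subnet::Id>",
        "Description": "Private subnets, one per AZ in use",
    },
    "BastionSecurityGroup": {
        "Type": "AWS::EC2::SecurityGroup::Id",
        "Description": "Security group of the bastion host",
    },
}


def stack_parameters(values):
    """Turn {name: value} into a ParameterKey/ParameterValue list.
    List values are joined with commas, as list-type parameters expect."""
    unknown = set(values) - set(parameters)
    if unknown:
        raise KeyError(f"undeclared parameters: {sorted(unknown)}")
    return [
        {
            "ParameterKey": name,
            "ParameterValue": ",".join(value)
            if isinstance(value, (list, tuple))
            else value,
        }
        for name, value in values.items()
    ]


# Placeholder IDs; substitute the outputs of your own VPC stack.
result = stack_parameters(
    {
        "VPCID": "vpc-11111111",
        "PublicSubnetIDs": ["subnet-aaaaaaaa", "subnet-bbbbbbbb"],
    }
)
print(json.dumps(result, indent=2))
```

Nothing in the template body changes when you point the application stack at a different VPC; only these values do.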

The hardest part of this transition was using CloudFormation to configure the placement of the master/standby RDS instances within the private subnets in the VPC. Both the AWS documentation and community-provided recipes are spotty at best. My guess is that CloudFormation has evolved over time and that the various CloudFormation snippets you find on the internet probably worked at one point. So, at least for now, I have a working CloudFormation template that places the multi-AZ RDS instance within the private subnets of the VPC.
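For orientation, here is a minimal sketch of how the pieces usually fit together, written as a Python dict that serializes to template JSON. This is not a copy of my template: the logical names DBSecurityGroup, WebServerSecurityGroup, and Database, and the DBClass, DBUser, and DBPassword parameters, are assumptions for illustration. The key points are that the DB instance names the subnet group through DBSubnetGroupName and attaches a VPC security group through VPCSecurityGroups.

```python
import json

db_resources = {
    # Subnet group over the private subnets, as shown earlier in the post.
    "DBSubnetGroup": {
        "Type": "AWS::RDS::DBSubnetGroup",
        "Properties": {
            "DBSubnetGroupDescription": "Subnets available for the RDS DB Instance",
            "SubnetIds": {"Ref": "PrivateSubnetIDs"},
        },
    },
    # VPC security group that only admits MySQL traffic from the web tier.
    # "WebServerSecurityGroup" is a hypothetical logical name.
    "DBSecurityGroup": {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": "MySQL access from the web tier only",
            "VpcId": {"Ref": "VPCID"},
            "SecurityGroupIngress": [
                {
                    "IpProtocol": "tcp",
                    "FromPort": "3306",
                    "ToPort": "3306",
                    "SourceSecurityGroupId": {"Ref": "WebServerSecurityGroup"},
                }
            ],
        },
    },
    # Multi-AZ instance placed in the subnet group above.
    # DBClass, DBUser and DBPassword are assumed template parameters.
    "Database": {
        "Type": "AWS::RDS::DBInstance",
        "Properties": {
            "Engine": "MySQL",
            "MultiAZ": True,
            "DBInstanceClass": {"Ref": "DBClass"},
            "AllocatedStorage": "5",
            "MasterUsername": {"Ref": "DBUser"},
            "MasterUserPassword": {"Ref": "DBPassword"},
            "DBSubnetGroupName": {"Ref": "DBSubnetGroup"},
            "VPCSecurityGroups": [
                {"Fn::GetAtt": ["DBSecurityGroup", "GroupId"]}
            ],
        },
    },
}

print(json.dumps(db_resources, indent=2))
```

With MultiAZ enabled, RDS chooses the subnets for the master and standby from the subnet group, so the group needs to cover at least two AZs.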

You can find the updated CloudFormation template for the web application stack here.

If you want to move to a non-default VPC using this AWS scenario you could use the CloudFormation script provided above. It is hardcoded for the us-east region and AZs us-east-1b, 1c, and 1d but you could quickly modify the region and AZs as needed. Alternatively use Matteo Rinaudo’s Python script to generate a custom CloudFormation script to meet your needs.

You don’t need to use CloudFormation to create the non-default VPC. You could use the AWS management console or various APIs. But it did take many iterations to get all the network plumbing just right. With CloudFormation you can quickly tear down an entire VPC stack, including the VPC itself, subnets, routing tables, NAT instances, bastion instances, etc., at the click of a button in the CloudFormation management console. Make a few tweaks and then spin up a new VPC stack. That alone is worth using CloudFormation.

In summary, this blog presents my experience moving to a non-default VPC using multiple AZs and a two-tier architecture using public/private subnets.