Quick and easy BeyondCorp BackOffice access with ALBs, Cognito and GSuite

18. December 2018

For some values of quick and easy.

Overview

LoveCrafts has several services which are currently hosted behind several different VPNs. VPN access is managed via LDAP which is managed by Engineering/DevOps.

Historically, we have not been notified of company leavers in a timely fashion, which is an obvious security hole, as VPN access (can) permit access to privileged resources within our hosting environment.

This includes but is not limited to:

  • Grafana
  • Kibana
  • Jenkins

For a while, we had been discussing a Single Sign-On (SSO) system to manage access to all these disparate systems. We use Google GSuite for corporate mail. Our Human Resources Team manually add and remove people as they join and leave. So it seemed obvious to treat Google as our single source of truth (at least for now).

In June 2018, AWS announced the integration of Cognito and JWT Authorisation within their Application Load Balancers (ALBs). [1]

This would allow any Web based back office services to be put behind a public facing ALB with Cognito Authorisation via GSuite.

This probably equates to 90% of our corporate VPN traffic. Theoretically, we should then be able to restrict the remaining VPN services to emergency SSH/RDP use only. We could limit SSH access as much as possible with other tools, such as SSM Manager Console.

Integrating with GSuite gets LoveCrafts significantly closer to a full SSO.

Caveat Developer: Google GSuite is being used here, but Cognito supports multiple OAuth2 sources, including Amazon, Facebook, OpenID or indeed any OAuth2/SAML provider.

The following code has been reverse engineered from our Puppet managed configuration. I have modified it to work without Puppet, so there may be some inconsistencies in the following examples.

Initial Proof of Concept

To test feasibility, I used a test AWS account and created the following:

  • Cognito User Pool
  • Cognito App Client
  • Application Load Balancer (ALB)
  • Google OAuth2 Client Credentials

The ALB was configured with a separate CNAME to an existing service.

The Google OAuth2 Client credentials were configured and added to the Cognito User Pool in the testing account.

With authentication enabled, all HTTPS access to the ALB was redirected to a Google auth page and then back to the ALB once sign-in was complete.

Transparent access worked fine, and a user was added to the Cognito Pool.

Once authenticated, the user was allowed access to the protected resource; until then, they were repeatedly presented with a Google authentication page.

The Good

Works transparently without having to write any app-specific code. Zero to up and running in ~5mins.

AWS ALB passes the user profile data in an X-Amzn-Oidc-Data HTTP header that the app/nginx etc. can access (although it is base64 encoded JSON).
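As a rough illustration of what "base64 encoded JSON" means here, the sketch below splits a header value into its three dot-separated segments and decodes the first two. It only peeks at the contents and does not verify the signature, so nothing it returns should be trusted. The function names are hypothetical and are not taken from the sidecar described later.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, adding '=' padding if it was stripped."""
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)


def peek_oidc_data(header_value: str) -> dict:
    """Return the UNVERIFIED key/signer details and profile data."""
    header_b64, payload_b64, _signature = header_value.split(".")
    return {
        "header": json.loads(b64url_decode(header_b64)),
        "payload": json.loads(b64url_decode(payload_b64)),
    }
```

Because the signature is ignored, this is only suitable for debugging; the validation problem is covered in the sections that follow.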

The Bad

Any Google account permits access. (This service is designed to allow app developers to pass off user management via Google, Twitter, Facebook or any OAuth2/OpenID platform and store in Cognito.)

The App needs to validate JWT Token to prove the authenticity of the X-Amzn-Oidc-Data HTTP header, which is great because we’re already using an nginx JWT auth library…

The Ugly

Initially, it was relatively trivial to get Nginx to decode the X-Amzn-Oidc-Data Header, extract Username/email/firstname/lastname and pass as separate headers to the downstream app.

However, you need to validate the signature of the JWT token to ensure it’s genuine: that it is still in time (i.e., the session is still valid) and that it hasn’t been spoofed.

Amazon chose to use ES256 signatures for JWT, which the nginx lua library we’ve been using doesn’t support, and I couldn’t find one that supported any Elliptic Curve signatures. Well, there was a Kong version of nginx, but I didn’t want to attempt backporting.

What follows is an explanation of the solution I ended up writing: a python sidecar that handles the JWT validation and user data extraction, with the functionality encapsulated in a new lua module for nginx.

Once/If the nginx lua JWT module improves to support ES crypto, this could be deprecated in favour of a fully lua based module.

For speed of development, I chose to write a python sidecar app to validate the JWT token and return HTTP headers back to nginx. The HTTP status code indicates whether the JWT token validated correctly.

The python app runs under gunicorn. It needs to be run under python3, as again, python2 doesn’t have support for the crypto libraries in use.

Using a python app does also allow you to expand the features and add group memberships from an LDAP service as extra headers for example.

I finally settled on the PyJWT library, as it uses compiled crypto and performed around two orders of magnitude faster than a userland version (typically less than 1ms compared to 150ms+). Speed is critical here, as the JWT token needs to be validated for every single request crossing the ALB.

Basic Implementation

To follow along you will need:

  • A Google GSuite account and developer access
  • An AWS account with an ALB and a Cognito Pool
  • nginx with lua support
  • python3

We’re going to run a python3 sidecar AuthService that validates the JWT token and passes the validated headers back to nginx. Nginx will then forward those headers to your own application behind the ALB. The application does not need to know anything about how the authentication is done and could even be a static site.

Applications such as Grafana and Jenkins can use the Proxy Headers as a trusted identity.

The AuthService sidecar runs locally alongside the Nginx instance and has strictly controlled timeouts. If JWT authorisation is required and the service is down, nginx will serve a 503: Service Unavailable. If the user is authenticated but not in the list of approved domains, nginx will serve a 401: Access denied.

Below shows the standard request path for an initial login to a Cognito ALB.

Data flow diagram showing the interaction between the browser and components

Nginx and AuthServices are the two components we need to build to validate the JWT token.

Keyserver is a publicly accessible location to retrieve the public key of the server that signed the JWT token. The key id is embedded in the X-Amzn-Oidc-Data header. The python app caches the public keys in memory.
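The key lookup and caching could be sketched roughly as follows. The URL pattern is the regional public key endpoint described in the AWS ALB authentication documentation; the class and function names are hypothetical, and the real sidecar may structure this differently. The fetch function is injectable so the cache can be exercised without network access.

```python
import base64
import json
import urllib.request
from typing import Callable, Dict

# Regional endpoint pattern from the AWS ALB authentication documentation.
KEY_URL = "https://public-keys.auth.elb.{region}.amazonaws.com/{kid}"


def fetch_public_key(region: str, kid: str) -> str:
    """Download the PEM public key for a key id."""
    url = KEY_URL.format(region=region, kid=kid)
    with urllib.request.urlopen(url, timeout=2) as resp:
        return resp.read().decode("utf-8")


def key_id_from_token(token: str) -> str:
    """Read the 'kid' field from the (unverified) first JWT segment."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["kid"]


class KeyCache:
    """In-memory cache of public keys, keyed by key id."""

    def __init__(self, region: str,
                 fetch: Callable[[str, str], str] = fetch_public_key):
        self.region = region
        self._fetch = fetch
        self._keys: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, kid: str) -> str:
        """Return the cached key, fetching it once on the first request."""
        if kid in self._keys:
            self.hits += 1
        else:
            self.misses += 1
            self._keys[kid] = self._fetch(self.region, kid)
        return self._keys[kid]
```

A cache like this only pays the network cost when the ALB starts signing with a key id it has not seen before; every other request is a dictionary lookup.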

Creating GSuite OAuth2 Credentials

Log in to the Google Developers Console and create an app to use for authentication.

Create OAuth Client Credentials for your app.

Create a set of web application credentials.

Copy your Client ID and Secret.

Configure Cognito

If you don’t already have a Cognito User Pool create one.

Choose the domain name that Cognito will reserve for you. This is where your users will get directed to log in. (You can use your own domain, but that is beyond the scope of this tutorial.)

Pick your domain prefix.

N.B. The full domain needs to be added to the Google Developer Console as a permitted Callback location for your OAuth Web Client app.

Configure Google as your identity provider. Paste in your Client ID and Secret from Google here.

Configure the ALB Endpoints for the Cognito App Client.

If, for example, your test application is being hosted on testapp.mycorp.com :

  • Your Callback URLs will be https://testapp.mycorp.com,https://testapp.mycorp.com/oauth2/idpresponse
  • The /oauth2/idpresponse url is handled by the ALB internally, and your app will not see these requests.[2]
  • Your Sign out URL will be https://testapp.mycorp.com

You can keep appending more ALBs and endpoints to this config later, comma separated.

Configure ALB

Now we can configure the ALB to force authentication when accessing all or part of our Web app.

On your ALB, select the listeners tab and edit the rules for the HTTPS listener (you can only configure this on an HTTPS listener).

Add the Cognito pool and app client to the ALB authenticate config

The Cognito user pool is from our previous step, and the App client is the client configured within the Cognito User Pool.

I reduced the session timeout to approximately 12 hours, as the default is 7 days.

From this point on, the ALB only ensures that there is a valid session with any Google account, even a personal one. There is no way to restrict which email domains to permit in Cognito.

Configure Nginx

You will need nginx running with lua support and the resty.http lua package available as well as this custom lua script:

nginx-aws-jwt.lua

Our code is configured and managed by Puppet, so you will need to substitute some values with appropriate values (timeouts, valid_domains etc.)

Inside your nginx http block:

 lua_package_path "<>/?.lua;;";

Then inside your server block add the following access_by_lua code to your location block:

location / {
     access_by_lua '
         local jwt = require("nginx-aws-jwt")
         jwt.auth{auth_req=false}
     ';
}

auth_req defaults to true. If true, this will issue a 401: Access denied unless a valid AWS JWT token exists and the user’s email address is in the list of valid_domains, e.g. (mycorp.com, myparentcorp.com).

The false setting, as shown, enables a soft launch and will instrument the backend request with extra headers if a valid JWT token is present and otherwise permit access as normal.

The only other parameter currently supported is valid_domains, which is used as follows:

location / {
    access_by_lua '
        local jwt = require("nginx-aws-jwt")
        jwt.auth{valid_domains="mycorp.com,megacorp.com,myparentcorp.com"}
    ';
}

The above example would permit any users from the three defined GSuite domains access.
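The domain check itself is simple, but it is easy to get subtly wrong with a substring match. A minimal sketch of an exact-match check, assuming the comma-separated format shown above (the function names are hypothetical and not taken from the actual module):

```python
from typing import Set


def parse_valid_domains(value: str) -> Set[str]:
    """Split a comma-separated valid_domains string into a set."""
    return {d.strip().lower() for d in value.split(",") if d.strip()}


def email_domain_allowed(email: str, valid_domains: Set[str]) -> bool:
    """True only when the part after the last '@' matches exactly."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local:
        return False  # no '@' at all, or nothing before it
    return domain.lower() in valid_domains
```

Comparing the whole domain, rather than checking whether the address merely contains or ends with an approved string, means an address such as someone@mycorp.com.attacker.net is rejected.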

Starting the sidecar JWT validator

The python app is tested on python3.6 with the following pip packages

cryptography==2.4.2
gunicorn==19.8.1
PyJWT==1.6.4
requests==2.20.1
statsd==3.3.0

gunicorn was launched using a gunicorn.ini file with the following commands:

#!/bin/bash
PROG="gunicorn-3.6"
INSTANCE="awsjwtauth"
DAEMON=/usr/bin/${PROG}
PID_FILE=/var/run/${PROG%%-*}/${INSTANCE}.pid

APP=app:app
ARGS="--config /etc/gunicorn/gunicorn.ini --env LOG_LEVEL=debug --env REGION=eu-west-1 --env LOGFILE=/var/log/lovecrafts/awsjwtauth/app.log ${APP}"

${DAEMON} --pid ${PID_FILE} ${ARGS}

Confirming it all works

Well, the obvious thing first: hitting the ALB’s DNS name should get you redirected to authenticate with Google and then redirected back to your test application.

In our setup nginx is proxypassing to our test app so we can inspect the headers that the app sees post authentication by running the following on the instance behind the ALB:

$ ngrep -d any -qW byline '' dst port 3000

T 127.0.0.1:60634 -> 127.0.0.1:3000 [AP]
GET /favicon.ico HTTP/1.1.
Host: testapp.example.com.
X-Forwarded-Host: testapp.example.com.
X-Forwarded-Port: 443.
X-Forwarded-Proto: https.
X-Forwarded-For: 123.123.123.123
X-Amzn-Trace-Id: Root=1-12345678-1234567890123456789012345.
X-Amzn-Oidc-Data: <Base64 encoded json key/signer details>.<Base64 encoded json profile data>.<signature>.
X-Amzn-Oidc-Identity: 55cf11c1-1234-1234-1234-68eaaa646dbb.
X-Amzn-Oidc-Accesstoken: <Base64 JWT Token Redacted>
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:63.0) Gecko/20100101 Firefox/63.0.
accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8.
accept-language: en-US,en;q=0.5.
accept-encoding: gzip, deflate, br.
X-LC-Sid: 123412345123457114928da7eab8a01eda6ca38.
X-LC-Rid: 1234123451234582900d3fc4554225bd338edc4.
X-Auth-Family-name: Brockhurst.
X-Auth-Email: bob@mycorp.com.
X-Auth-Given-name: Bob.
X-Auth-Picture: https://lh5.googleusercontent.com/-12345678901/AAAAAAAAAAA/AAAAAAAAAAA/123-123434556/123-1/photo.jpg.

If validation fails or the X-Amzn-Oidc-Data header is not present, the X-Auth-* headers will not be present. This assumes you’ve set auth_req=false, making auth optional.

If auth_req=true, then on a validation failure or a missing X-Amzn-Oidc-Data header nginx will return 401, and no request is made to the proxypass.
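The behaviour described so far can be summarised as a small decision function. This is a model of the rules in Python, not the lua module itself; the function name, the use of 200 to mean "proxy the request to the backend", and the pass-through when the sidecar is down during a soft launch are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def decide(auth_req: bool, sidecar_up: bool,
           identity: Optional[Dict[str, str]]) -> Tuple[int, Dict[str, str]]:
    """Return (status, extra_headers) for one request.

    identity is None when no JWT was sent or validation failed,
    otherwise a dict of validated profile fields.
    """
    if not sidecar_up:
        # Assumption: a soft launch lets traffic through unannotated.
        return (503, {}) if auth_req else (200, {})
    if identity is not None:
        headers = {
            "X-Auth-" + key.replace("_", "-").capitalize(): value
            for key, value in identity.items()
        }
        return 200, headers
    return (401, {}) if auth_req else (200, {})
```

For example, a validated identity of {"email": "bob@mycorp.com", "given_name": "Bob"} yields the X-Auth-Email and X-Auth-Given-name headers seen in the capture above.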

And a quick look at the python app.log

{"@timestamp":"2018-12-04 16:06:04,809", "level":"WARNING", "message":"Unauthorised access by: unknown_user@gmail.com", "lc-rid":"bf8794defb2885e48eb37552e96545b2cfedec98", "lc-sid":"c4dc0ae13deb4696baa4c3920ffcbbdbf25c71df"}
# When a user not in the valid_domains list attempts to access.

{"@timestamp":"2018-12-04 16:07:06,342", "level":"ERROR", "message":"Error Validating JWT Header: Invalid crypto padding", "lc-rid":"d23ecd55123abb4efff0091b37f8f6161b98218c", "lc-sid":"c4dc0ae13deb4696baa4c3920ffcbbdbf25c71df"}
# Several variants on the above, based on signature failures, corrupted/tampered headers etc.

{"@timestamp":"2018-12-04 16:08:02,738", "level":"INFO", "message":"No JWT Header present", "lc-rid":"895d1055e8f821a75e691ea1f25c29f182131030", "lc-sid":"c4dc0ae13deb4696baa4c3920ffcbbdbf25c71df"}
# INFO messaging only in dev, useful for debugging.

Further debug information can be output in the nginx error log by setting the log level to info.

Monitoring

Apart from normal nginx monitoring, the authentication sidecar app generates statsd metrics published to the local statsd collector, prefixed with awsjwtauth.

This includes counts of error conditions and success methods, app restarts etc.

It will also send timing information for its only downstream dependency, the AWS ALB Keyserver service.
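For readers unfamiliar with statsd, the sketch below shows what counter and timer metrics look like on the wire. The sidecar itself uses the statsd pip package listed earlier; this version speaks the plain-text UDP protocol directly so that it has no dependencies. The metric names and the collector address are hypothetical.

```python
import socket


def format_metric(prefix: str, name: str, value: float, kind: str) -> str:
    """Build one statsd line; kind is 'c' (counter) or 'ms' (timer)."""
    return "{}.{}:{}|{}".format(prefix, name, value, kind)


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget a metric line to a local statsd collector over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Count one successful validation and record a key fetch time.
    send_metric(format_metric("awsjwtauth", "jwt.valid", 1, "c"))
    send_metric(format_metric("awsjwtauth", "keyserver.fetch", 98.5, "ms"))
```

Because the transport is UDP, a missing collector does not slow down or break request handling, which matters for a service on the path of every request.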

Example Grafana Dashboard

This dashboard shows that we typically handle the authentication step in the python application in under 1ms. The spikes to approx 100ms are where the ALB has switched the key signer, and so we had to fetch the public key from the signer again. The python app caches the public key in memory (see the cache hit/misses graph).

Other Notes

ALB authentication only works on HTTPS connections. So if you also have an HTTP listener, it should redirect to HTTPS. This can be configured at the ALB Listener for HTTP.

Taking it further

Through this article, I’ve described how to achieve a minimal overhead OAuth2 SSO implementation for securing services that organisations would typically put on an internal network or behind a VPN.

Other features that could be added include using the JWT token in headers to enumerate the Google Groups the user is a member of to restrict access further, looking up group memberships in your own internal systems such as LDAP or Active Directory, validating that the device/browser is secure and up to date before allowing access[3], and probably many more things I haven’t thought of yet.

References

[1] https://aws.amazon.com/blogs/aws/built-in-authentication-in-alb/

[2] https://docs.aws.amazon.com/elasticloadbalancing/latest/application/listener-authenticate-users.html

[3] https://github.com/Netflix-Skunkworks/stethoscope-app

About the Author

Andy ‘Bob’ Brockhurst (@b3cft) is the Head of Infrastructure Architecture and Security at LoveCrafts Collective, a combination of social network, digital marketplace, online media, and e-commerce site to deliver everything makers need to celebrate, share advice and buy supplies for their craft.

Bob has worked with computers for more than 25 years including many years at The BBC and Yahoo!, and is finding it increasingly difficult to explain to his family what he actually does for a living.

About the Editor

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps.


Just add Code: Fun with Terraform Modules and AWS

06. December 2016

Author: Chris Marchesi

Editors: Andrew Langhorn, Anthony Elizondo

This article is going to show you how you can use Terraform, with a little help from Packer and Chef, to deploy a fully-functional sample web application, complete with auto-scaling and load balancing, in under 50 lines of Terraform code.

You will need the sample project to follow along, so make sure you load that up before continuing with reading this article.

The Humble Configuration

Check out the code in the terraform/main.tf file.

It might be hard to believe that this mere smattering of Terraform is setting up:

  • An AWS VPC
  • 2 subnets, each in different availability zones, fully routed
  • An AWS Application Load Balancer
  • A listener for the ALB
  • An AWS Auto Scaling group
  • An ALB target group attached to the ALB
  • Configured security groups for both the ALB and backend instances

So what’s the secret?

Terraform Modules

This example is using a powerful feature of Terraform – the modules feature, providing a semantic and repeatable way to manage AWS infrastructure. The modules hide most of the complexity of setting up a full VPC behind a relatively small set of code, and an even smaller set of changes going forward (generally, to update this application, all that is needed is to update the AMI).

Note that this example is composed entirely of modules – no root module resources exist. That’s not to say that they can’t exist – and in fact one of the secondary examples demonstrates how you can use the outputs of one of the modules to add extra resources on an as-needed basis.

The example is composed of three visible modules, and one module that operates under the hood as a dependency:

  • terraform_aws_vpc, which sets up the VPC and subnets
  • terraform_aws_alb, which sets up the ALB and listener
  • terraform_aws_asg, which configures the Auto Scaling group, and ALB target group for the launched instances
  • terraform_aws_security_group, which is used by the ALB and Auto Scaling modules to set up security groups to restrict traffic flow.

These modules will be explained in detail later in the article.

How Terraform Modules Work

Terraform modules work very similarly to basic Terraform configuration. Each Terraform module is a standalone configuration in its own right and, depending on its prerequisites, can run completely on its own. In fact, a top-level Terraform configuration without any modules being used is still a module – the root module. You sometimes see this mentioned in various parts of the Terraform workflow, such as error messages and the state file.

Module Sources and Versioning

Terraform supports a wide variety of remote sources for modules, such as simple, generic locations like HTTP, or Git, or well-known locations like GitHub, Bitbucket, or Amazon S3.

You don’t even need to put a module in a remote location. In fact, a good habit to get into is if you need to re-use Terraform code in a local project, put that code in a module – that way you can re-use it several times to create the same kind of resources in either the same, or even better, different, environments.

Declaring a module is simple. Let’s look at the VPC module from the example:

The location of the module is specified with the source parameter. The style of the parameter will dictate what kind of behaviour TF will undertake to get the module.

The rest of the options here are module parameters, which translate to variables within the module. Note that any variable that does not have a default value in the module is a required parameter, and Terraform will not start if these are not supplied.

The last item that should be mentioned is regarding versioning. Most module sources that work off of source control have a versioning parameter you can supply to get a revision or tag – with Git and GitHub sources, this is ref, which can translate to most Git references, be it a branch, or tag.

Versioning is a great way to keep things under control. You might find yourself iterating very fast on certain modules as you learn more about Terraform or your internal infrastructure design patterns change – versioning your modules ensures that you don’t need to constantly refactor otherwise stable stacks.

Module Tips and Tricks

Terraform and HCL are works in progress, and there may be some things that seem like they should make sense but don’t necessarily work 100% – yet. There are some things that you might want to keep in mind when you are designing your modules that may reduce the complexity that ultimately gets presented to the user:

Use Data Sources

Terraform 0.7+’s data sources feature can go a long way in reducing the amount of data that needs to go into your module.

In this project, data sources are used for things such as obtaining VPC IDs from subnets (aws_subnet) and getting the security groups assigned to an ALB (using the aws_alb_listener and aws_alb data sources chained together). This allows us to create ALBs based off of subnet ID alone, and attach auto-scaling groups to ALBs knowing only the listener ARN that we need to attach to.

Exploit Zero Values and Defaults

Terraform follows the rules of the language it was created in regarding zero values. Hence, most of the time, supplying an empty parameter is the same as supplying none at all.

This can be advantageous when designing a module to support different kinds of scenarios. For example, the alb module supports TLS via supplying a certificate ARN. Here is the variable declaration:

And here it is referenced in the listener block:

Now, when this module parameter is not supplied, its default value becomes an empty string, which is passed in to aws_alb_listener.alb_listener. This is, most times, exactly the same as if the parameter is not passed in at all. This allows you to not have to worry about this parameter when you just want to use HTTP on this endpoint (the default for the ALB module as a whole).

Pseudo-Conditional Logic

Terraform does not support conditional logic yet, but through creative use of count and interpolation, one can create semi-conditional logic in your resources.

Consider the fact that the terraform_aws_asg module supports the ability to attach the ASG to an ALB, but does not explicitly require it. How can you get away with that, though?

To get the answer, check one of the ALB resources in the module:

Here, we make use of the map interpolation function, nested in a lookup function to provide essentially an if/then/else control structure. This is used to control a resource’s instance count, adding an instance if var.enable_alb is true, and completely removing the resource from the graph otherwise.
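To make the trick concrete, here is a small Python model of the logic. It is not Terraform code, and the expression in the comment is an assumed form of the interpolation rather than a copy of the module’s source: a one-entry map is looked up with the flag as the key, and the lookup’s default supplies the "else" branch.

```python
def lookup(mapping: dict, key: str, default: str) -> str:
    """Model of Terraform's lookup(map, key, default)."""
    return mapping.get(key, default)


def resource_count(enable_alb: str) -> int:
    """Model of a count driven by a string flag.

    Assumed Terraform form:
        lookup(map("true", "1"), var.enable_alb, "0")
    """
    return int(lookup({"true": "1"}, enable_alb, "0"))
```

Only the exact string "true" hits the map and yields a count of 1; any other value, including an empty string, falls through to the default of 0 and the resource is not created.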

This conditional logic does not necessarily need to be limited to count either. Let’s go back to the aws_alb_listener.alb_listener resource in the ALB module, looking at a different parameter:

Here, we are using this trick to supply the correct SSL policy to the listener if the listener protocol is not HTTP. If it is, we supply the zero value, which as mentioned before, makes it as if the value was never supplied.

Module Limitations

Terraform does have some not-necessarily-obvious limitations that you will want to keep in mind when designing both modules and Terraform code in general. Here are a couple:

Count Cannot be Computed

This is a big one that can really get you when you are writing modules. Consider the following scenario that totally did not happen to me even though I knew of such things beforehand 😉

  • An ALB listener is created with aws_alb_listener
  • The arn of this resource is passed as an output
  • That output is used as both the ARN to attach an auto-scaling group to, and the pseudo-conditional in the ALB-related resources’ count parameter

What happens? You get this lovely message:

value of 'count' cannot be computed

Actually, it used to be worse (a strconv error was displayed instead), but luckily that changed recently.

Unfortunately, there is no nice way to work around this right now. Extra parameters need to be supplied or you need to structure your modules in a way that avoids computed values being passed into count directives in your workflow. (This is pretty much exactly why the terraform_aws_asg module has an enable_alb parameter).

Complex Structures and Zero Values

Complex structures are not necessarily good candidates for zero values, even though it may seem like a good idea. But by defining a complex structure in a resource, you are by nature supplying it a non-zero value, even if most of the fields you supply are empty.

Most resources don’t handle this scenario gracefully, so it’s best to avoid using complex structures in a scenario where you may be designing a module for re-use, and expect that you won’t be using the functionality defined by such a structure often.

The Application in Brief

As our focus in this article is on Terraform modules, and not on other parts of the pattern such as using Packer or Chef to build an AMI, we will only touch briefly on the non-Terraform parts of this project, so that we can focus on the Terraform code and the AWS resources that it is setting up.

The Gem

The Ruby gem in this project is a small “hello world” application running with Sinatra. This is self-contained within this project and mainly exists to give us an artifact to put on our base AMI to send to the auto-scaling group.

The server prints out the system’s hostname when fetched. This will allow us to see each node in action as we boot things up.

Packer

The built gem is loaded on to an AMI using Packer, for which the code is contained within packer/ami.json. We use chef-solo as a provisioner, which works off a self-contained cookbook named packer_payload in the cookbooks directory. This allows us a bit more of a higher-level workflow than we would have simply with shell scripts, including the ability to better integration test things and also possibly support multiple build targets.

Note that the Packer configuration takes advantage of a new Packer 0.12.0 feature that allows us to fetch an AMI to use as the base right from Packer. This is the source_ami_filter directive. Before Packer 0.12.0, you would have needed to resort to a helper, such as ubuntu_ami.sh, to get the AMI for you.

The Rakefile

The Rakefile is the build runner. It has tasks for Packer (ami), Terraform (infrastructure), and Test Kitchen (kitchen). It also has prerequisite tasks to stage cookbooks (berks_cookbooks), and Terraform modules (tf_modules). It’s necessary to pre-fetch modules when they are being used in Terraform – normally this is handled by terraform get, but the tf_modules task does this for you.

It also handles some parameterization of Terraform commands, which allows us to specify when we want to perform something else other than an apply in Terraform, or use a different configuration.

All of this is in addition to standard Bundler gem tasks like build, etc. Note that install and release tasks have been explicitly disabled so that you don’t install or release the gem by mistake.

The Terraform Modules

Now that we have that out of the way, we can talk about the fun stuff!

As mentioned at the start of the article, this project has 4 different Terraform modules. Also as mentioned, one of them (the Security Group module) is hidden from the end user, as it is consumed by two of the parent modules to create security groups to work with. This exploits the fact that Terraform can, of course, nest modules within each other, allowing for any level of re-usability when designing a module layout.

The AWS VPC Module

The first module, terraform_aws_vpc, creates not only a VPC, but also public subnets, complete with route tables and internet gateway attachments.

We’ve already hidden a decent amount of complexity just by doing this, but as an added bonus, redundancy is baked right into the module by distributing any network addresses passed in as subnets to the module across all availability zones available in any particular region via the aws_availability_zones data source. This process does not require previous knowledge of the zones available to the account.
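One way such a distribution can work is by indexing into the list of zones with wrap-around, which is how Terraform’s element() function behaves. The Python model below illustrates that idea only; it is an assumption about the approach, not a transcription of the module, and the CIDR blocks and zone names are made up.

```python
from typing import Dict, List


def element(items: List[str], index: int) -> str:
    """Model of Terraform's element(): the index wraps around the list."""
    return items[index % len(items)]


def assign_zones(cidrs: List[str], zones: List[str]) -> Dict[str, str]:
    """Spread subnets across the available zones in round-robin order."""
    return {cidr: element(zones, i) for i, cidr in enumerate(cidrs)}
```

With three subnets and two zones, the third subnet wraps back to the first zone, so the caller never needs to know how many zones a region offers.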

The module passes out pertinent information, such as the VPC ID, the ID of the default network ACL, the created subnet IDs, the availability zones for those subnets as a map, and the ID of the route table created.

The ALB Module

The second module, terraform_aws_alb, allows for the creation of AWS Application Load Balancers. If all you need is the defaults, use of this module is extremely simple, creating an ALB that will answer requests on port 80. A default target group is also created that can be used if you don’t have anything else mapped, but we want to use this with our auto-scaling group.

The Auto Scaling Module

The third module, terraform_aws_asg, is arguably the most complex of the three that we see in the sample configuration, but even at that, its required options are very slim.

The beauty of this module is that, thanks to all the aforementioned logic, you can attach more than one ASG to the same ALB with different path patterns (mentioned below), or not attach it to an ALB at all! This allows this same module to be used for a number of scenarios. This is on top of the plethora of options available to you to tune, such as CPU thresholds, health check details, and session stickiness.

Another thing to note is how the AMI for the launch configuration is being fetched from within this module. We work off the tag that we used within Packer, which is supplied as a module variable. This is then searched for within the module via an aws_ami data source. This means that no code or variables need to change when the AMI is updated – the next Terraform run will pick up the most recent AMI with the tag.

Lastly, this module supports the rolling update mechanism laid out by Paul Hinze in this post oh so long ago now. When a new AMI is detected and the auto-scaling group needs to be updated, Terraform will bring up the new ASG, attach it, wait for it to have minimum capacity, and then bring down the old one.

The Security Group Module

The last module to be mentioned, terraform_aws_security_group, is not shown anywhere in our example, but is actually used by the ALB and ASG modules to create Security Groups.

Not only does it create security groups though – it also allows for the creation of 2 kinds of ICMP allow rules. One for all ICMP, if you so choose, but more importantly, allow rules for ICMP type 3 (destination unreachable) are always created, as this is how path MTU discovery works. Without this, we might end up with unnecessarily degraded performance.

Give it a Shot

After all this talk about the internals of the project and the Terraform code, you might be eager to bring this up and see it working. Let’s do that now.

Assuming you have the project cloned and AWS credentials set appropriately, do the following:

  • Run bundle install --binstubs --path vendor/bundle to load the project’s Ruby dependencies.
  • Run bundle exec rake ami. This builds the AMI.
  • Run bundle exec rake infrastructure. This will deploy the project.

After this is done, Terraform should return an alb_hostname value to you. You can now load this up in your browser. Load it once, then wait about 1 second, then load it again! Or even better, just run the following in a prompt:

while true; do curl http://ALBHOST/; sleep 1; done

And watch the hostname change between the two hosts.

Tearing it Down

Once you are done, you can destroy the project simply by passing a TF_CMD environment variable in to rake with the destroy command:

TF_CMD=destroy bundle exec rake infrastructure

And that’s it! Note that this does not delete the AMI artifact; you will need to do that yourself.

More Fun

Finally, a few items for the road. These are things that are otherwise important to note or should prove to be helpful in realizing how powerful Terraform modules can be.

Tags

You may have noticed the modules have a project_path parameter that is filled out in the example with the path to the project in GitHub. This is something that I think is important for proper AWS resource management.

Several of our resources have machine-generated names or IDs which make them hard to track on their own. Having an easy-to-reference tag alleviates that. Having the tag reference the project that consumes the resource is even better – I don’t think it gets much clearer than that.

SSL/TLS for the ALB

Try this: create a certificate using Certificate Manager, and change the alb module to the following:

Better yet, see the example here. This can be run with the following command:

And destroyed with:

You now have SSL for your ALB! Of course, you will need to point DNS to the ALB (either via external DNS, CNAME records, or Route 53 alias records – the example includes this), but it’s that easy to change the ALB into an SSL load balancer.

Adding a Second ASG

You can also use the ASG module to create two auto-scaling groups.

There is an example for the above here. Again, run it with:

And destroy it with:

You now have two auto-scaling groups, one handling requests for /foo/*, and one handling requests for /bar/*. Give it a go by reloading each URL and see the unique instances you get for each.

Acknowledgments

I would like to take a moment to thank PayByPhone for allowing me to use their existing Terraform modules as the basis for the publicly available ones at https://github.com/paybyphone. Writing this article would have been a lot more painful without them!

Also thanks to my editors, Anthony Elizondo and Andrew Langhorn, for their feedback and help with this article, and the AWS Advent Team for the chance to stand on their soapbox for my 15 minutes! 🙂

About the Author:

Chris Marchesi (@vancluever) is a Systems Engineer working out of Vancouver, BC, Canada. He currently works for PayByPhone, designing tools and patterns to help its engineers and developers work with AWS. He is also a regular contributor to the Terraform project. You can view his work at https://github.com/vancluever, and also his previous articles at https://vancluevertech.com/.

About the Editors:

Andrew Langhorn is a senior consultant at ThoughtWorks. He works with clients large and small on all sorts of infrastructure, security and performance problems. Previously, he was up to no good helping build, manage and operate the infrastructure behind GOV.UK, the simpler, clearer and faster way to access UK Government services and information. He lives in Manchester, England, with his beloved gin collection, blogs at ajlanghorn.com, and is a firm believer that mince pies aren’t to be eaten before December 1st.

Anthony Elizondo is an SRE at Adobe. He enjoys making things, breaking things, and burritos. You can find him at http://twitter.com/complexsplit