IAM Policies, Roles and Profiles and how to keep secrets away from your instances

05. December 2016

Author: Mark Harrison
Editors: Jyrki Puttonen

AWS Identity and Access Management (IAM) is Amazon’s service for controlling access to AWS resources, or more simply, it provides a way for you to decide who has access to what in AWS. This simple description, however, hides the depth and complexity of what is probably one of the most misunderstood of Amazon’s services.

Many of you will have made use of IAM in order to create multiple users in AWS rather than sharing a single root user, but there are many more ways IAM can be useful to you. This article will be focusing on one use of IAM in particular: instance roles. Instance roles allow you to give EC2 instances access to AWS without them needing to store an AWS API key. I’ll be taking you through how to set them up, how to use them with your applications, and some of the things instance roles are useful for.

Throughout this article, I’ll be using terraform to create instances, roles and policies. However, the principles will apply if you use a different provisioning tool or if you use the API directly.

An example

We’re going to start off with a simple terraform configuration that creates a single micro instance in EC2. Here I’ve created a blank directory and a file inside called infrastructure.tf:

When I run terraform apply, terraform creates a running EC2 instance based on the configuration in my infrastructure.tf file. This will be the starting point for us to add IAM roles/policies to.

Let’s say we are writing an application and want to provide access to an S3 bucket. One way would be simply to copy your AWS API keys into the configuration file for your application, but this would give your application full access to your AWS account just as if you had logged in yourself. A better option would be to make a new IAM user, give them just the permissions needed to access the S3 bucket, and create API keys for that user. However, you still have to store the API keys in the application’s configuration file, along with all the hassles of managing secrets that entails.

Instead, what we’re going to do is create a role that allows access to the S3 bucket, and assign it to the instance. First, we’re going to make the S3 bucket:

Then, we’re going to create an AWS IAM policy that grants access to the bucket. A policy is simply a JSON document that lists permissions to things in AWS:

The actual policy document is the JSON bit between the <<EOF and EOF:

There’s quite a bit going on here, but the important parts are the Action and Resource sections. The Action section says what you can do, and in this case we’re saying you can get objects from S3 (in other words, we’re providing read only access to something in S3). The Resource section specifies what you can do it with, and in this case we say you can get S3 objects from anywhere inside the myawsadventapp bucket. If we wanted to provide write access to the bucket we would add another action, s3:PutObject, to the list of actions we allow. We can also change the name of the S3 bucket as needed to provide access to other buckets.
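The policy listing is not reproduced here, so as a rough illustration only, here is a sketch in Python that builds a policy document of the shape described above. The s3_policy function and its allow_write flag are made up for this example; the bucket name is the one used in the text.

```python
import json


def s3_policy(bucket, allow_write=False):
    """Build an IAM policy document granting object access to one bucket."""
    actions = ["s3:GetObject"]  # read-only access
    if allow_write:
        actions.append("s3:PutObject")  # add write access on request
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                # The trailing /* means "every object inside the bucket".
                "Resource": ["arn:aws:s3:::%s/*" % bucket],
            }
        ],
    }


print(json.dumps(s3_policy("myawsadventapp"), indent=2))
```

Passing allow_write=True adds s3:PutObject to the Action list, and changing the bucket argument changes the Resource ARN, which mirrors the two adjustments mentioned above.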

Now that we have the Policy set up to allow access to S3, we need to actually give that set of permissions to the instance itself. To do this, we make a role:

The first part of this is pretty straightforward: we give the role a name. But why is there another Policy JSON document there? This assume role policy specifies who, or what, can become the role. In this case, the policy is just stating that EC2 instances can have the role assigned to them. Generally, when making instance roles, you don’t need to change this.
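For reference, an assume role policy that lets EC2 instances take on a role usually looks something like the document built below. This is a sketch written as a Python dictionary rather than the Terraform used in this article, and the ASSUME_ROLE_POLICY name is only for illustration.

```python
import json

# Trust ("assume role") policy: it names who may take on the role,
# not what the role is allowed to do once assumed.
ASSUME_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(ASSUME_ROLE_POLICY, indent=2))
```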

The policy is already linked to the role (we added a role = section when making the policy). All that remains is to link the role with our instance.

If you were using the AWS web console to make a new instance, assigning a role to it is easy: you just pick the role from the list of roles in the instance details section as you make the instance. However, if you are using terraform, the AWS CLI tools, or some other provisioning tool, then there is one more link in the chain: Instance Profiles.

Instance profiles are simply containers for roles that can be attached directly to instances, and can be thought of as simply an implementation detail. Whenever you make a role, make a matching profile, and then attach the profile to the instance. Here’s the profile to match the role we just created:

Notice how the name of the profile is the same as the name of the role. This is how it works with the AWS web console: AWS creates a profile with the same name as the role behind the scenes. Keeping the name the same makes things easier, and once you have done this you can then completely forget that profiles exist.

Finally, now that the profile has been created, we just edit the instance and assign the profile to it:

And now, with all of the required configuration made, we can go ahead and make the instance:

There is one thing to be aware of: at the time of writing, an instance profile can’t be changed after an instance has been created, so if you were following along and created the instance earlier without adding the instance profile then you have to recreate the instance from scratch. With this toy instance that’s not a problem, but it may be if you’re adding this to existing infrastructure.

Accessing API keys from the instance

Once the terraform run is complete, we can ssh into the instance and see that the instance profile has been applied:

And if we run a slightly different curl command, we can obtain AWS API keys:

Your application can simply look up the keys when it wants to use an AWS API and doesn’t need to store them in a config file or elsewhere. Note that the credentials listed have an expiration time mentioned. The keys change approximately every 6 hours and you will need to look them up again after this time.
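If you do look the keys up yourself, the expiration time is the part to watch. The sketch below assumes the JSON shape returned by the instance metadata service and uses a made-up sample response so it runs anywhere. The parse_credentials and needs_refresh helpers and the five minute safety margin are illustrative choices, not part of any AWS library.

```python
import json
from datetime import datetime, timedelta

# Link-local metadata address; ROLE_NAME stands in for your role's name.
CREDENTIALS_URL = (
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE_NAME"
)


def parse_credentials(body):
    """Turn the JSON returned by the metadata service into a small dict."""
    doc = json.loads(body)
    return {
        "access_key": doc["AccessKeyId"],
        "secret_key": doc["SecretAccessKey"],
        "token": doc["Token"],
        "expires": datetime.strptime(doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ"),
    }


def needs_refresh(creds, now, margin=timedelta(minutes=5)):
    """True once the keys are within `margin` of their expiration time."""
    return now >= creds["expires"] - margin


# Made-up sample response, shaped like the real one.
sample = json.dumps({
    "Code": "Success",
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "example-secret",
    "Token": "example-token",
    "Expiration": "2016-12-05T18:00:00Z",
})
creds = parse_credentials(sample)
print(needs_refresh(creds, datetime(2016, 12, 5, 12, 0, 0)))   # False
print(needs_refresh(creds, datetime(2016, 12, 5, 17, 58, 0)))  # True
```

In a real application you would fetch CREDENTIALS_URL over HTTP from the instance and call needs_refresh before each use, fetching again whenever it returns True.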

To make life easier, most AWS libraries and commands already support instance roles as a method of getting credentials, and will automatically use any credentials that are available without any further configuration. For example, you can just use the aws cli without needing to configure your credentials:

Some things you can do with IAM roles and instance profiles

So far we’ve shown an example of giving instances access to a particular S3 bucket. This is great, but there are some other uses for instance roles:

One good use case is managing EBS volumes. Say you have an autoscaling group (because AWS instances break and autoscaling groups allow AWS to launch replacements for broken instances), but you have state that needs to be stored on instances that you’d like to not disappear every time an instance is recreated. The way you deal with this is to store the stateful data on EBS volumes, and use a script that runs on boot to attach any EBS volume that isn’t currently in use.

Another case where having IAM roles is really handy: If you install Grafana on an AWS instance, the CloudWatch data source supports using IAM roles, and so you can use Grafana to view CloudWatch graphs for your AWS account without needing to set up credentials. To do this, use the following IAM policy:

Finally, a special case of the S3 access policy above is to use the S3 bucket to store secrets. This uses S3 as a trusted store, and you use IAM profiles to determine which instances get access to the secrets. This is the basis of the citadel cookbook for Chef that can be used to manage secrets in AWS.

More information

Hopefully this article has given you a taste for IAM roles and instance profiles and how they can make your life much easier when interacting with the AWS API from EC2 instances. If you want more information on using IAM roles, the AWS Documentation on IAM Roles goes into much more detail and is well worth a read.

About the Author

Mark Harrison is a Systems Administrator on the Chef operations team, where he is responsible for the care and feeding of Hosted Chef as well as maintaining several of Chef’s internal systems. Before coming to Chef, Mark led the operations team at OmniTI, helping clients scale their web architectures and supporting some of the largest infrastructures in the world.

About the Editors

Jyrki Puttonen is Chief Solutions Executive at Symbio Finland (@SymbioFinland) who tries to keep on track what happens in cloud.

Exploring Concurrency in Python & AWS

04. December 2016

From Threads to Lambdas (and lambdas with threads)

Author: Mohit Chawla

Editors: Jesse Davis, Neil Millard

The scope of the current article is to demonstrate multiple approaches to solve a seemingly simple problem of intra-S3 file transfers – using pure Python and a hybrid approach of Python and cloud based constructs, specifically AWS Lambda, with a comparison of the two concurrency approaches.

Problem Background

The problem was to transfer 250 objects daily, each of size 600-800 MB, from one S3 bucket to another. In addition, an initial bulk backup of 1500 objects (6 months of data) had to be taken, totaling 1 TB.

Attempt 1

The easiest way to do this appears to be to loop over all the objects and transfer them one by one:

This had a runtime of 1 hour 45 minutes. Oops.
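The original listing isn’t reproduced here, so what follows is only a minimal sketch of the one-by-one approach. The copy_all name and the copy_one callback are hypothetical. In real use copy_one would wrap an S3 copy call; here a list’s append method stands in for it so the example runs without AWS access.

```python
def copy_all(keys, copy_one):
    """Copy objects one at a time; each copy waits for the previous one."""
    for key in keys:
        copy_one(key)


# Stand-in for a real S3 copy so the sketch runs anywhere.
copied = []
copy_all(["a.dat", "b.dat", "c.dat"], copied.append)
print(copied)
```

Because every copy blocks until the previous one finishes, the total runtime is the sum of all the individual transfer times.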

Attempt 2

Let’s use some threads!

Python offers multiple concurrency methods:

  • asyncio, based on event loops and asynchronous I/O.
  • concurrent.futures, which provides high level abstractions like ThreadPoolExecutor and ProcessPoolExecutor.
  • threading, which provides low level abstractions to build your own solution using threads, semaphores and locks.
  • multiprocessing, which is similar to threading, but for processes.

I used the concurrent.futures module, specifically the ThreadPoolExecutor, which seems to be a good fit for I/O tasks.

Note about the GIL:

CPython, the reference implementation of Python, has a GIL (Global Interpreter Lock), which allows only a single thread to run at a time inside a single Python interpreter. This is not a limitation for an I/O intensive task, such as the one being discussed in this article, because a thread releases the GIL while it waits on I/O. For more details about how it works, see http://www.dabeaz.com/GIL/.

Here’s the code when using the ThreadPoolExecutor:

This code took 1 minute 40 seconds to execute, woo!
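The listing itself is missing from this copy of the article. As a rough idea of the pattern, and not the exact code that was timed, a ThreadPoolExecutor version might look like the sketch below. The copy_all_threaded and fake_copy names and the max_workers value of 20 are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def copy_all_threaded(keys, copy_one, max_workers=20):
    """Run copy_one(key) for every key on a pool of worker threads."""
    done, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(copy_one, key): key for key in keys}
        for future in as_completed(futures):
            key = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                done.append(key)
            except Exception:
                failed.append(key)
    return sorted(done), sorted(failed)


def fake_copy(key):
    # Stand-in for the real S3 copy; fails on one key to show error handling.
    if key == "bad.dat":
        raise IOError("copy failed")


print(copy_all_threaded(["a.dat", "bad.dat", "b.dat"], fake_copy))
# (['a.dat', 'b.dat'], ['bad.dat'])
```

Collecting failures separately means one bad object does not abort the whole batch, and the failed keys can simply be retried.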

Concurrency with Lambda

I was happy with this implementation, until, at an AWS meetup, there was a discussion about using AWS Lambda and SNS for the same thing, and I thought of trying that out.

AWS Lambda is a compute service that lets you run code without provisioning or managing servers. It can be combined with AWS SNS, a push notification service that can deliver and fan out messages to several kinds of endpoints, including email, HTTP and Lambda, which allows components to be decoupled.

To use Lambda and SNS for this problem, a simple pipeline was devised: One Lambda function publishes object names as messages to SNS and another Lambda function is subscribed to SNS for copying the objects.

The following piece of code publishes names of objects to copy to an SNS topic. Note the use of threads to make this faster.

Yep, that’s all the code.
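The original listing is not included in this copy. To give a feel for the shape of the pipeline, here is a sketch of both sides with local stand-ins for SNS and S3. The publish_keys and copy_handler names are hypothetical, and a real Lambda handler takes only (event, context), so the extra copy_one argument would have to be bound some other way (for example with functools.partial).

```python
from concurrent.futures import ThreadPoolExecutor


def publish_keys(keys, publish, max_workers=10):
    """Publish one message per object key, using threads to overlap the calls."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so worker exceptions surface here.
        list(pool.map(publish, keys))


def copy_handler(event, context, copy_one):
    """Consumer side: pull object keys out of an SNS event and copy each one."""
    for record in event["Records"]:
        copy_one(record["Sns"]["Message"])


# Local stand-ins for SNS and S3 so the sketch runs without AWS.
published = []
publish_keys(["a.dat", "b.dat"], published.append)

copied = []
for key in published:
    copy_handler({"Records": [{"Sns": {"Message": key}}]}, None, copied.append)
print(sorted(copied))
```

The event passed to copy_handler mimics the Records / Sns / Message nesting that SNS uses when it invokes a Lambda function.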

Now, you may be asking yourself, how is the copy operation actually concurrent?
The unit of concurrency in AWS Lambda is actually the function invocation. For each published message, the Lambda function is invoked, which means for multiple messages published in parallel, an equivalent number of invocations will be made for the Lambda function. According to AWS, that number for stream based sources is given by:

By default, this is limited to 100 concurrent executions, but can be raised on request.
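As a back-of-the-envelope aid, the rule of thumb I know from the AWS Lambda documentation for event sources that are not stream based (SNS is one of them) is request rate multiplied by function duration. The sketch below assumes that rule; the function name and the numbers are invented for illustration.

```python
def estimated_concurrency(events_per_second, duration_seconds):
    """Estimate concurrent executions as request rate times function duration."""
    return events_per_second * duration_seconds


# Hypothetical numbers: 10 messages per second, 30 seconds per copy.
print(estimated_concurrency(10, 30))  # 300
```

With numbers like these the estimate is well above a limit of 100, which is the kind of situation in which invocations get throttled.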

The execution time for the above code was 2 minutes 40 seconds. This is higher than the pure Python approach, partly because the invocations were throttled by AWS.

I hope you enjoyed reading this article, and if you are an AWS or Python user, hopefully this example will be useful for your own projects.

Note: I gave this as a talk at PyUnconf ’16 in Hamburg; you can see the slides at https://speakerdeck.com/alcy/exploring-concurrency-in-python-and-aws.

About the Author:

Mohit Chawla is a systems engineer, living in Hamburg. He has contributed to open source projects over the last seven years, and has a few projects of his own. Apart from systems engineering, he has a strong interest in data visualization.

server-free pubsub ( and nearly code-free )

02. December 2016

Author: Ed Anderson

Editors: Evan Mouzakitis, Brian O’Rourke


This article will introduce you to creating serverless PubSub microservices by building a simple Slack based word counting service.

Lambda Overview

These PubSub microservices are AWS Lambda based. Lambda is a service that does not require you to manage servers in order to run code. The high level overview is that you define events ( called triggers ) that will cause a packaging of your code ( called a function ) to be invoked. Inside your package ( aka function ), a specific function within a file ( called a handler ) will be called.

If you’re feeling a bit confused by overloaded terminology, you are not alone. For now, here’s the short list:

Lambda term | Common Name                   | Description
Trigger     | AWS Service                   | Component that invokes Lambda
Function    | software package              | Group of files needed to run code (includes libraries)
Handler     | file.function in your package | The filename/function name to execute
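To make the handler idea concrete, here is a minimal sketch of a Python handler. It assumes a file named app.py and a handler setting of app.handler; the return value is made up so there is something to look at.

```python
# app.py: with the handler setting "app.handler", Lambda imports this
# file and calls the function named `handler`.
def handler(event, context):
    """Entry point: `event` is the trigger's payload, `context` is runtime info."""
    return {"received": event}


# Calling it locally the way Lambda would, with a made-up event.
print(handler({"text": "awsadvent some words"}, None))
```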


There are many different types of triggers ( S3, API Gateway, Kinesis streams, and more). See this page for a complete list. Lambdas run in the context of a specific IAM Role. This means that, in addition to features provided by your language of choice ( python, nodejs, java, scala ), you can call from your Lambda to other AWS Services ( like DynamoDB ).

Intro to the PubSub Microservices

These microservices, once built, will count words typed into Slack. The services are:

  1. The first service splits up the user-input into individual words and:
    • increments the counter for each word
    • supplies a response to the user showing the current count of any seen words
    • triggers functions 2 and 3 which execute concurrently
  2. The second service also splits up the user-input into individual words and:
    • adds a count of 10 to each of those words
  3. The third service logs the input it receives.

While you might not have a specific need for a word counter, the concepts demonstrated here can be applied elsewhere. For example, you may have a project where you need to run several things in series, or perhaps you have a single event that needs to trigger concurrent workflows.

For example:

  • Concurrent workflows triggered by a single event:
    • New user joins org, and needs accounts created in several systems
    • Website user is interested in a specific topic, and you want to curate additional content to present to the user
    • There is a software outage, and you need to update several systems ( statuspage, nagios, etc ) at the same time
    • Website clicks need to be tracked in a system used by Operations, and a different system used by Analytics
  • Serial workflows triggered by a single event:
    • New user needs a Google account created, then that Google account needs to be given permission to access another system integrated with Google auth.
    • A new version of software needs to be packaged, then deployed, then activated
    • Cup is inserted to a coffee machine, then the coffee machine dispenses coffee into the cup


  • The API Gateway ( trigger ) will call a Lambda Function that will split whatever text it is given into specific words
    • Upsert a key in a DynamoDB table with the number 1
    • Drop a message on a SNS Topic
  • The SNS Topic ( trigger ) will have two lambda functions attached to it that will
    • Upsert the same keys in the dynamodb with the number 10
    • Log a message to CloudWatchLogs
Figure: Visualization of the different microservices comprising the Slack based word counter


Example code for AWS Advent near-code-free PubSub. Technologies used:

  • Slack ( outgoing webhooks )
  • API Gateway
  • IAM
  • SNS
  • Lambda
  • DynamoDB

Pub/Sub is teh.best.evar* ( *for some values of best )

I came into the world of computing by way of The Operations Path. The Publish-Subscribe Pattern has always been near and dear to my ❤️.

There are a few things about PubSub that I really appreciate as an “infrastructure person”.

  1. Scalability. In terms of the transport layer ( usually a message bus of some kind ), the ability to scale is separate from the publishers and the consumers. In this wonderful thing which is AWS, we as infrastructure admins can get out of this aspect of the business of running PubSub entirely.
  2. Loose Coupling. In the happy path, publishers don’t know anything about what subscribers are doing with the messages they publish. There’s admittedly a little hand-waving here, and folks new to PubSub ( and sometimes those that are experienced ) get rude surprises as messages mutate over time.
  3. Asynchronous. This is not necessarily inherent in the PubSub pattern, but it’s the most common implementation that I’ve seen. There’s quite a lot of pressure that can be absent from Dev Teams, Operations Teams, or DevOps Teams when there is no expectation from the business that systems will retain single millisecond response times.
  4. New Cloud Ways. Once upon a time, we needed to queue messages in PubSub systems ( and you might still have a need for that feature ), but with Lambda, we can also invoke consumers on demand as messages pass through our system. We don’t necessarily have to keep things in the queue at all. Message appears, processing code runs, everybody’s happy.

Yo dawg, I heard you like ️☁️

One of the biggest benefits that we can enjoy from being hosted with AWS is not having to manage stuff. Running your own message bus might be something that separates your business from your competition, but it might also be undifferentiated heavy lifting.

IMO, if AWS can and will handle scaling issues for you ( to say nothing of only paying for the transactions that you use ), then it might be the right choice for you to let them take care of that for you.

I would also like to point out that running these things without servers isn’t quite the same thing as running them in a traditional setup. I ended up redoing this implementation a few times as I kept finding the rough edges of running things serverless. All were ultimately addressable, but I wanted to keep the complexity of this down somewhat.



CloudFormation is pretty well covered by AWS Advent, so we’ll configure this little ditty via the AWS console.


Set up the first Lambda, which will be linked to an outgoing webhook in Slack

Set up the DynamoDB

👇 You can follow the steps below, or view this video 👉 Video to DynamoDB Create

  1. Console
  2. DynamoDB
  3. Create Table
    1. Table Name table
    2. Primary Key word
    3. Create

Set up the First Lambda

This Lambda accepts the input from a Slack outgoing webhook, splits the input into separate words, and adds a count of one to each word. It further returns a JSON response body to the outgoing webhook that displays a message in Slack.

If the Lambda is triggered with the input awsadvent some words, this Lambda will create the following three keys in dynamodb, and give each the value of one.

  • awsadvent = 1
  • some = 1
  • words = 1
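The counting logic can be sketched in a few lines. The version below is not the app.py from the repo: it uses an in-memory Counter as a stand-in for the DynamoDB table, and the count_words name is made up.

```python
from collections import Counter

table = Counter()  # in-memory stand-in for the DynamoDB table


def count_words(text, increment=1):
    """Split the input into words and add `increment` to each word's counter."""
    words = text.split()
    for word in words:
        table[word] += increment
    return {word: table[word] for word in words}


print(count_words("awsadvent some words"))
# {'awsadvent': 1, 'some': 1, 'words': 1}
```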

👇 You can follow the steps below, or view this video 👉 Video to Create the first Lambda

  1. Make the first Lambda, which accepts slack outgoing webhook input, and saves that in DynamoDB
    1. Console
    2. Lambda
    3. Get Started Now
    4. Select Blueprint
      1. Blank Function
    5. Configure Triggers
      1. Click in the empty box
      2. Choose API Gateway
    6. API Name
      1. aws_advent ( This will be the /PATH of your API Call )
    7. Security
      1. Open
    8. Name
      1. aws_advent
    9. Runtime
      1. Python 2.7
    10. Code Entry Type
      1. Inline
      2. It’s included as app.py in this repo. There are more Lambda Packaging Examples here
    11. Environment Variables
      1. DYNAMO_TABLE = table
    12. Handler
      1. app.handler
    13. Role
      1. Create new role from template(s)
      2. Name
        1. aws_advent_lambda_dynamo
    14. Policy Templates
      1. Simple Microservice permissions
    15. Triggers
      1. API Gateway
      2. save the URL

Link it to your favorite slack

👇 You can follow the steps below, or view this video 👉 Video for setting up the slack outbound webhook

  1. Set up an outbound webhook in your favorite Slack team.
  2. Manage
  3. Search
  4. outgoing webhooks
  5. Channel ( optional )
  6. Trigger words
    1. awsadvent
  7. URLs
    1. Your API Gateway Endpoint on the Lambda from above
  8. Customize Name
    1. awsadvent-bot
  9. Go to slack
    1. Join the room
    2. Say the trigger word
    3. You should see 👉 something like this


OK, now we want to do the awesome PubSub stuff

Make the SNS Topic

We’re using a SNS Topic as a broker. The producer ( the aws_advent Lambda ) publishes messages to the SNS Topic. Two other Lambdas will be consumers of the SNS Topic, and they’ll get triggered as new messages come into the Topic.
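If the broker idea is new to you, this toy version may help. It is an in-memory stand-in for an SNS topic, not anything AWS provides, and the Topic class is invented for the example.

```python
class Topic(object):
    """Tiny in-memory stand-in for an SNS topic."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, consumer):
        self.subscribers.append(consumer)

    def publish(self, message):
        # Every subscriber receives its own copy of each message.
        for consumer in self.subscribers:
            consumer(message)


topic = Topic()
multiplied, logged = [], []
topic.subscribe(multiplied.append)  # plays the role of the multiplier Lambda
topic.subscribe(logged.append)      # plays the role of the logger Lambda
topic.publish("awsadvent some words")
print(multiplied == logged == ["awsadvent some words"])  # True
```

The producer only ever talks to the topic, so consumers can be added or removed without touching the producer’s code.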

👇 You can follow the steps below, or view this video 👉 Video for setting up the SNS Topic

  1. Console
  2. SNS
  3. New Topic
  4. Name awsadvent
  5. Note the topic ARN

Add additional permissions to the first Lambda

This permission will allow the first Lambda to talk to the SNS Topic. You also need to set an environment variable on the aws_advent Lambda to have it be able to talk to the SNS Topic.

👇 You can follow the steps below, or view this video 👉 Adding additional IAM Permissions to the aws_lambda role

  1. Give additional IAM permissions on the role for the first lambda
    1. Console
    2. IAM
    3. Roles aws_advent_lambda_dynamo
      1. Permissions
      2. Inline Policies
      3. click here
      4. Policy Name
      5. aws_advent_lambda_dynamo_snspublish

Add the SNS Topic ARN to the aws_advent Lambda

👇 You can follow the steps below, or view this video 👉 Adding a new environment variable to the lambda

There’s a conditional in the aws_advent lambda that will publish to a SNS topic, if the SNS_TOPIC_ARN environment variable is set. Set it, and watch more PubSub magic happen.

  1. Add the SNS_TOPIC_ARN environment variable to the aws_advent lambda
    1. Console
    2. LAMBDA
    3. aws_advent
    4. Scroll down
      1. The SNS Topic ARN from above.
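The conditional amounts to something like the sketch below. This is not the code from app.py: the maybe_publish name and the publish callback are hypothetical, and the ARN is a placeholder rather than a real topic.

```python
import os


def maybe_publish(message, publish):
    """Publish only when SNS_TOPIC_ARN is configured; otherwise do nothing."""
    topic_arn = os.environ.get("SNS_TOPIC_ARN")
    if not topic_arn:
        return False
    publish(topic_arn, message)
    return True


sent = []


def record(arn, msg):
    # Stand-in for a real SNS publish call.
    sent.append((arn, msg))


os.environ.pop("SNS_TOPIC_ARN", None)
print(maybe_publish("hello", record))  # False: variable unset, nothing published

# Placeholder ARN, not a real topic.
os.environ["SNS_TOPIC_ARN"] = "arn:aws:sns:us-east-1:123456789012:awsadvent"
print(maybe_publish("hello", record))  # True
```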

Create a consumer Lambda: aws_advent_sns_multiplier

This microservice increments the values collected by the aws_advent Lambda. In a real world application, I would probably not take the approach of having a second Lambda function update values in a database that are originally input by another Lambda function. It’s useful here to show how work can be done outside of the Request->Response flow for a request. A less contrived example might be that this Lambda checks for words with high counts, to build a leaderboard of words.

This Lambda function will subscribe to the SNS Topic, and it is triggered when a message is delivered to the SNS Topic. In the real world, this Lambda might do something like copy data to a secondary database that internal users can query without impacting the user experience.

👇 You can follow the steps below, or view this video 👉 Creating the sns_multiplier lambda

  1. Console
  2. lambda
  3. Create a Lambda function
  4. Select Blueprint
    1. search sns
    2. sns-message python2.7 runtime
  5. Configure Triggers
    1. SNS topic
      1. awsadvent
      2. click enable trigger
  6. Name
    1. sns_multiplier
  7. Runtime
    1. Python 2.7
  8. Code Entry Type
    1. Inline
      1. It’s included as sns_multiplier.py in this repo.
  9. Handler
    1. sns_multiplier.handler
  10. Role
    1. Create new role from template(s)
  11. Policy Templates
    1. Simple Microservice permissions
  12. Next
  13. Create Function

Go back to slack and test it out.

Now that you have the most interesting parts hooked up together, test it out!

What we’d expect to happen is pictured here 👉 everything working

👇 Writeup is below, or view this video 👉 Watch it work

  • The first time we sent a message, the count of the number of times the words are seen is one. This is provided by our first Lambda
  • The second time we sent a message, the count of the number of times the words are seen is twelve. This is a combination of our first and second Lambdas working together.
    1. The first invocation set the count to current(0) + one, and passed the words off to the SNS topic. The value of each word in the database was set to 1.
    2. After SNS received the message, it ran the sns_multiplier Lambda, which added ten to the value of each word current(1) + 10. The value of each word in the database was set to 11.
    3. The second invocation set the count of each word to current(11) + 1. The value of each word in the database was set to 12.
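The arithmetic above can be replayed locally. This simulation uses an in-memory Counter instead of DynamoDB and calls the multiplier directly instead of going through SNS, so it only illustrates the order of updates; the function names are stand-ins for the two Lambdas.

```python
from collections import Counter

table = Counter()  # stand-in for the DynamoDB table


def first_lambda(text):
    """Add 1 per word, then hand the same text on (as SNS would)."""
    for word in text.split():
        table[word] += 1
    seen = {word: table[word] for word in text.split()}
    sns_multiplier(text)  # in AWS this runs asynchronously, via the SNS topic
    return seen


def sns_multiplier(text):
    """Consumer: add 10 per word."""
    for word in text.split():
        table[word] += 10


print(first_lambda("hello"))  # first message: {'hello': 1}
print(table["hello"])         # after the multiplier has run: 11
print(first_lambda("hello"))  # second message: {'hello': 12}
```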

️️💯💯💯 Now you’re doing pubsub microservices 💯💯💯

Set up the logger Lambda as well

The output of this Lambda will be viewable in the CloudWatch Logs console, and it’s only showing that we could do something else ( anything else, even ) with this microservice implementation.

  1. Console
  2. Lambda
  3. Create a Lambda function
  4. Select Blueprint
    1. search sns
    2. sns-message python2.7 runtime
  5. Configure Triggers
    1. SNS topic
      1. awsadvent
      2. click enable trigger
  6. Name
    1. sns_logger
  7. Runtime
    1. Python 2.7
  8. Code Entry Type
    1. Inline
      1. It’s included as sns_logger.py in this repo.
  9. Handler
    1. sns_logger.handler
  10. Role
    1. Create new role from template(s)
  11. Policy Templates
    1. Simple Microservice permissions
  12. Next
  13. Create Function

In conclusion

PubSub is an awesome model for some types of work, and in AWS with Lambda we can work inside this model relatively simply. Plenty of real-world work depends on the PubSub model.

You might translate this project to things that you do need to do like software deployment, user account management, building leaderboards, etc.

AWS + Lambda == the happy path

It’s ok to lean on AWS for the heavy lifting. As our word counter becomes more popular, we probably won’t have to do anything at all to scale with traffic. Having our code execute on a request-driven basis is a big win from my point of view. “Serverless” computing is a very interesting development in cloud computing. Look for ways to experiment with it; there are plenty of benefits to it ( other than novelty ).

Some benefits you can enjoy via Serverless PubSub in AWS:

  1. Scaling the publishers. Since this used API Gateway to terminate user requests to a Lambda function:
    1. You don’t have idle resources burning money, waiting for traffic
    2. You don’t have to scale because traffic has increased or decreased
  2. Scaling the bus / interconnection. SNS did the following for you:
    1. Scaled to accommodate the volume of traffic we send to it
    2. Provided HA for the bus
    3. Pay-per-transaction. You don’t have to pay for idle resources!
  3. Scaling the consumers. Having lambda functions that trigger on a message being delivered to SNS:
    1. Scaled the lambda invocations to the volume of traffic.
    2. Provides some sense of HA

Lambda and the API Gateway are works in progress.

Lambda is a new technology. If you use it, you will find some rough edges.

The API Gateway is a new technology. If you use it, you will find some rough edges.

Don’t let that dissuade you from trying them out!

I’m open for further discussion on these topics. Find me on twitter @edyesed

About the Author:

Ed Anderson has been working with the internet since the days of gopher and lynx. Ed has worked in healthcare, regional telecom, failed startups, multinational shipping conglomerates, and is currently working at RealSelf.com.

Ed is into dadops,  devops, and chat bots.

Writing in the third person is not Ed’s gift. He’s much more comfortable poking the private cloud bear,  destroying ec2 instances, and writing lambda functions be they use case appropriate or not.

He can be found on Twitter at @edyesed.

About the Editors:

Evan Mouzakitis is a Research Engineer at Datadog. He is passionate about solving problems and helping others. He has written about monitoring many popular technologies, including Lambda, OpenStack, Hadoop, and Kafka.

Brian O’Rourke is the co-founder of RedisGreen, a highly available and highly instrumented Redis service. He has more than a decade of experience building and scaling systems and happy teams, and has been an active AWS user since S3 was a baby.

Deploy your AWS Infrastructure Continuously

01. December 2016

Author: Michael Wittig

Continuously integrating and deploying your source code is the new standard in many successful internet companies. But what about your infrastructure? Can you deploy a change to your infrastructure in an automated way? Can you run automated tests on your infrastructure to ensure that a change has no unintended side effects? In this post I will show you how you can apply the same processes to your AWS infrastructure that you apply to your source code. You will learn how the AWS services CloudFormation, CodePipeline and Lambda can be combined to continuously deploy infrastructure.


You may think: “Source code is text files, but my infrastructure is different. I don’t have a source file for my infrastructure.” Infrastructure as Code as defined by Martin Fowler is a concept that is helping bring software development practices to infrastructure practices.

Infrastructure as code is the approach to defining computing and network infrastructure through source code that can then be treated just like any software system.
– Martin Fowler

AWS CloudFormation is one implementation of Infrastructure as Code. CloudFormation is a high quality and free service offered by AWS. To understand CloudFormation you need to know about templates and stacks. The template is the source code, a textual representation of your infrastructure. The stack is the actual running infrastructure described by the template. So a CloudFormation template is exactly what we need, a plain text file. The CloudFormation service interprets the template and turns it into a running infrastructure.
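To show how small a template can be, here is a sketch that builds one as a Python dictionary and prints it as JSON. The ExampleBucket logical name and the description are placeholders; a real template would describe your actual infrastructure.

```python
import json

# A template is just structured text describing resources.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal illustrative template with a single S3 bucket",
    "Resources": {
        "ExampleBucket": {"Type": "AWS::S3::Bucket"},
    },
}

print(json.dumps(template, indent=2))
```

Handing a document like this to CloudFormation would create a stack containing one S3 bucket; editing the text and updating the stack changes the running infrastructure to match.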

Now, our infrastructure is defined by a text file which is exactly what we need to apply the same processes to it that we have for source code.

The Pipeline

The pipeline to build and deploy is a sequence of steps that are necessary to ship changes to your users. Starting with a change in the code repository and ending in your production environment. The following figure shows a Pipeline that runs inside AWS CodePipeline, the AWS CD service.

AWS CodePipeline - Deploying infrastructure continuously

Whenever a git push is made to a repository hosted on GitHub the pipeline starts to run by fetching the current version of the repository. After that, the pipeline creates or updates itself because the pipeline definition itself is also treated as source code. After that, the up-to-date pipeline creates or updates the test environment. After this step, infrastructure in the test environment looks exactly as it was defined in the template. This is also a good place to deploy the application to the test environment. I’m using Elastic Beanstalk to host the demo application. Now it’s time to check if the infrastructure is still in good shape. We want to make sure that everything runs as it is defined in the tests. The tests may check if a certain port is reachable, if a certain user can log in via SSH, if a certain port is NOT reachable, and so on, and so forth. If the tests are successful, the production environment is adapted to the new template and the new application version is deployed.
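A port check of the kind mentioned above can be very small. The sketch below is one possible way to write it, not the tests from the repository: port_is_reachable is a made-up helper, and the self-check uses a throwaway listener on localhost instead of real infrastructure.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return True
    except (socket.timeout, socket.error):
        return False
    finally:
        sock.close()


# Self-check against a throwaway local listener instead of real infrastructure.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]
open_result = port_is_reachable("127.0.0.1", open_port)
listener.close()
closed_result = port_is_reachable("127.0.0.1", open_port)
print(open_result, closed_result)
```

A test for a port that must NOT be reachable is the same call with the assertion inverted.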


From Source to Deploy Pipeline

CodePipeline has native support for GitHub, CloudFormation, Elastic Beanstalk, and Lambda. So I can use all of these services and tie them together using CodePipeline. You can find the full source code and detailed setup instructions in this GitHub repository: michaelwittig/automation-for-the-people


The following template snippet shows an excerpt of the full pipeline description. Here you see how the pipeline can be configured to check out the GitHub repository and create/update itself:



Infrastructure as Code enables you to apply the same CI & CD processes to infrastructure that you already know from software development. On AWS, you can use CloudFormation to turn a text representation of your infrastructure into a running stack. CodePipeline can be used to orchestrate the deployment process, and you can implement custom logic, such as infrastructure tests, in a programming language that you can run on AWS Lambda. Finally, you can treat your infrastructure as code and deploy each commit into production with confidence.

About the Author

Michael Wittig is author of Amazon Web Services in Action (Manning) and writes frequently about AWS on cloudonaut.io. He helps his clients to gain value from Amazon Web Services. As a software engineer he develops cloud-native real-time web and mobile applications. He migrated the complete IT infrastructure of the first bank in Germany to AWS. He has expertise in distributed system development and architecture, with experience in algorithmic trading and real-time analytics.

welcome to aws advent 2016

05. October 2016 welcome 0

We’re pleased to announce that AWS Advent is returning.

What is the AWS Advent event? Many technology platforms have started a yearly tradition for the month of December: publishing one article per day, written and edited by volunteers, in the style of an advent calendar, a special calendar used to count the days in anticipation of Christmas starting on December 1. The AWS Advent event explores everything around the Amazon Web Services platform.

Examples of past AWS articles:

Please explore the rest of this site for more examples of past topics.

There are a large number of AWS services, and many have never been covered on AWS Advent in previous years. We’re looking for articles that range in audience level from beginners to experts in AWS. Introductory, security, architecture, and design patterns with any of the AWS services are welcome topics.

Interested in being part of AWS Advent 2016? 

Process for submission acceptance

  • Interesting title
  • Fresh point of view, unique, timely topic
  • Points relevant and interesting to the topic
  • Scope of the topic matches the intended audience
  • Availability to pair with editor and other volunteers to polish up submission

People who have volunteered to evaluate submissions will start reviewing without identifying information about the individuals, to focus the evaluation on the content. AWS Advent editors Brandon Burton and Jennifer Davis will evaluate the program for diversity. We will pair folks up with available volunteers to do technical and copy editing.

Process for volunteer acceptance

  • Availability!

Important Dates

  • Blind submission review begins – October 24, 2016
  • Authors and other volunteers rolling submissions start – October 26, 2016
  • Submissions accepted until advent calendar complete.
  • Rough drafts due – 12:00am November 21, 2016
  • Final drafts due – 12:00am November 30, 2016

Please be aware that we are working on a code of conduct for participants of this event. To start, we are borrowing from the Chef Community Guidelines:

  • Be welcoming, inclusive, friendly, and patient.
  • Be considerate.
  • Be respectful.
  • Be professional.
  • Be careful in the words that you choose.
  • When we disagree, let’s all work together to understand why.


Thank you, and we look forward to a great AWS Advent in 2016!

Jennifer Davis, @sigje

Brandon Burton, @solarce

AWS Advent 2014 is a wrap!

AWS Advent 2014: Repeatable Infrastructure with CloudFormation and YAML

Ted Timmons is a long-term devops nerd and works for Stanson Health, a healthcare startup with a fully remote engineering team.

One key goal of a successful devops process – and successful usage of AWS – is to create automated, repeatable processes. It may be acceptable to spin up EC2 instances by hand in the early stage of a project, but it’s important to convert this from a manual experiment to a fully described system before the project reaches production.

There are several great tools to describe the configuration of a single instance (Ansible, Chef, Puppet, Salt), but these tools aren’t well-suited for describing the configuration of an entire system. This is where Amazon’s CloudFormation comes in.

CloudFormation was launched in 2011. It’s fairly daunting to get started with: errors in CloudFormation templates are typically not caught until late in the process, and since it is fed by JSON files, it’s easy to make mistakes. Proper JSON is unwieldy to write by hand (stray commas, unmatched closing blocks), but it’s fairly easy to write YAML and convert it to JSON.

EC2-VPC launch template

Let’s start with a simple CloudFormation template to create an EC2 instance. In this example many things are hardcoded, like the instance type and AMI. This cuts down on the complexity of the example. Still, it’s a nontrivial example that creates a VPC and other resources. The only prerequisite for this example is to create a keypair in the US-West-2 region called “advent2014”.

As you look at this template, notice both the quirks of CloudFormation (especially “Ref” and “Fn::GetAtt”) and the quirks of JSON. Even with some indentation the brackets are complex, and correct comma placement is difficult while editing a template.


Next, let’s convert this JSON example to YAML. There’s a quick converter in this article’s repository. With Python and pip installed, the only other dependency is PyYAML, which you can install with pip.

Since JSON objects (hashes/dicts) don’t have a guaranteed key order, the output order may vary. Here’s what it looks like immediately after conversion:

Only a small amount of reformatting is needed to make this file pleasant: I removed unnecessary quotes, combined some lines, and moved the ‘Type’ line to the top of each resource.
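Moving the 'Type' line to the top of each resource can also be automated before the YAML is written out. The following is a minimal sketch using only the Python standard library, assuming the template is plain CloudFormation JSON with a top-level "Resources" section. The helper names `load_template` and `type_first` are made up for this illustration and are not part of the converter in the repository.

```python
import json
from collections import OrderedDict


def load_template(text):
    """Parse a JSON template, keeping keys in the order they appear in the text."""
    return json.loads(text, object_pairs_hook=OrderedDict)


def type_first(template):
    """Return a copy of the template in which each resource lists 'Type' first."""
    result = OrderedDict(template)
    resources = OrderedDict()
    for name, body in template.get("Resources", {}).items():
        reordered = OrderedDict()
        if "Type" in body:
            reordered["Type"] = body["Type"]
        # Append the remaining keys in their original relative order.
        for key, value in body.items():
            if key != "Type":
                reordered[key] = value
        resources[name] = reordered
    if "Resources" in template:
        result["Resources"] = resources
    return result
```

Whether this order survives in the final file depends on how the YAML dumper handles key ordering, so treat it as a starting point rather than a drop-in step.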

YAML to JSON to CloudFormation

It’s fairly easy to see the advantages of YAML in this case: it has far fewer brackets and quotes and no need for commas. However, we need to convert this back to JSON for CloudFormation to use. Again, the converter is in this article’s repository.

That’s it!

Ansible assembly

If you would like to use Ansible to prepare and publish to CloudFormation, my company shared an Ansible module to compile YAML into a single JSON template. The shared version of the script is entirely undocumented, but it compiles a full directory structure of YAML template snippets into a template. This significantly increases readability. Just place cloudformation_assemble in your library/ folder and use it like any other module.

If there’s interest, I’ll help to document and polish this module so it can be submitted to Ansible. Just fork and send a pull request.


AWS Advent 2014: CloudFormation woes: Keep calm and use Ansible

Today’s post on using Ansible to help you get the most out of CloudFormation comes to us from Soenke Ruempler, who’s helping keep things running smoothly at Jimdo.

No more outdated information, a single source of truth. Describing almost everything as code, isn’t this one of the DevOps dreams? Recent developments have brought this dream even closer. In the era of APIs, tools like Terraform and Ansible have evolved which are able to codify the creation and maintenance of entire “organizational ecosystems”.

This blog post is a brief description of the steps we have taken to come closer to this goal at my employer Jimdo. Before we begin looking at particular implementations, let’s take the helicopter view and have a look at the current state and the problems with it.

Current state

We began to move to AWS in 2011 and have been using CloudFormation from the beginning. While we currently describe almost everything in CloudFormation, there are some legacy pieces which were just “clicked” through the AWS console. In order to have some primitive auditing and documentation for those, we usually document all “clicked” settings with a Jenkins job, which runs Cucumber scenarios that do a live inspection of the settings (by querying the AWS APIs with a read-only user).

While this setup might not look that bad and has a basic level of codification, there are several drawbacks, especially with CloudFormation itself, which we are going to have a look at now.

Problems with the current state

Existing AWS resources cannot be managed by CloudFormation

Maybe you have experienced this same issue: You start off with some new technology or provider and initially use the UI to play around. And suddenly, those clicked spikes are in production. At least this is the story of how we came to AWS at Jimdo 😉

So you might say: “OK, then let’s rebuild the clicked resources into a CloudFormation stack.” Well, the problem is that we didn’t describe basic components like VPC and Subnets as CloudFormation stacks in the first place, and as other production setups rely on those resources, we can no longer change this easily.

Not all AWS features are immediately available in CloudFormation

Here is another issue: The usual AWS feature release process is that a component team releases a new feature (e.g. ElastiCache replica groups), but the CloudFormation part is missing (the CloudFormation team at AWS is a separate team with its own roadmap). And since CloudFormation isn’t open source, we cannot add the missing functionality by ourselves.

So, in order to use those “Non-CloudFormation” features, we used to click the setup as a workaround, and then again document the settings with Cucumber.

But the click-and-document-with-cucumber approach seems to have some drawbacks:

  • It’s not an enforced policy to document, so colleagues might miss the documentation step or see no value in it
  • It might be incomplete as not all clicked settings are documented
  • It encourages a “clicking culture”, which is the exact opposite of what we want to achieve

So we need something that can extend a CloudFormation stack with resources that we cannot (yet) express in CloudFormation. And we need them to be grouped together semantically, as code.

Post processors for CloudFormation stacks

Some resources require post-processing in order to be fully ready. Imagine the creation of an RDS MySQL database with CloudFormation. The database instance is created by CloudFormation, but what about the databases, users, and passwords inside it? This cannot be done with CloudFormation, so we need to work around this as well.

Our current approaches vary from manual steps documented in a wiki to a combination of Puppet and hiera-aws: Puppet, running on some admin node, retrieves RDS instance endpoints by tags and then iterates over them and executes shell scripts. This is a form of post-processing entirely decoupled from the CloudFormation stack, both in terms of time (hourly Puppet run) and in terms of “location” (it’s in another repository). A very complicated way just for the sake of automation.

Inconvenient toolset

Currently we use the AWS CLI tools in a plain way. Some coworkers use the old tools, some use the new ones. And I guess there are even folks with their own wrappers / bash aliases.

A “good” example is the missing feature of changing tags of CloudFormation stacks after creation. So if you forgot to do this in the first place, you’d need to recreate the entire stack! The CLI tools do not automatically add tags to stacks, so this is easily forgotten and should be automated. As a result we need to think of a wrapper around CloudFormation which automates those situations.
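The "never forget the tags" part of such a wrapper can be sketched in a few lines of Python. This is an illustration only, assuming tags in the list-of-`Key`/`Value` shape that the CloudFormation CreateStack API accepts; the function name `with_default_tags` and the example tag names are hypothetical.

```python
def with_default_tags(stack_args, default_tags):
    """Return a copy of stack_args whose 'Tags' include every default tag.

    stack_args:   dict of arguments for a create-stack call (hypothetical wrapper input)
    default_tags: dict of tag name -> value that every stack should carry
    Tags given explicitly in stack_args override the defaults.
    """
    merged = dict(default_tags)
    for tag in stack_args.get("Tags", []):
        merged[tag["Key"]] = tag["Value"]
    result = dict(stack_args)
    # Sort by key so repeated runs produce identical arguments.
    result["Tags"] = [{"Key": key, "Value": merged[key]} for key in sorted(merged)]
    return result
```

A wrapper could pass every set of stack arguments through this function before calling the API, so that a stack is never created without the agreed tags.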

Hardcoded / copy and pasted data

The idea of “single source information” or “single source of truth” is to never have a representation of data saved in more than one location. In the database world, it’s called “database normalization”. This is a very common pattern which should be followed unless you have an excellent excuse.

But if you don’t know better, are under time pressure, or your tooling is still immature, it’s hard to keep the data single-sourced. This usually leads to copying, pasting, and hardcoding data.

Examples regarding AWS are usually resource IDs like subnet IDs, security groups or, in our case, our main VPC ID.

While this may not be an issue at first, it will come back to you in the future, e.g. if you want to roll out your stacks in another AWS region, perform disaster recovery, or have to grep for hardcoded data in several codebases when refactoring.

So we needed something to access information from other CloudFormation stacks and/or otherwise created resources (from the so-called “clicked infrastructure”) without ever referencing IDs, Security Groups, etc. directly.

Possible solutions

Now we have a good picture of what our current problems are and we can actually look for solutions!

My research resulted in 3 possible tools: Ansible, Terraform and Salt.

As of this writing, Ansible seems to be the only currently available tool which can deal with existing CloudFormation stacks out of the box, and it also seems to meet the other criteria at first glance, so I decided to move on with it.

Spiking the solution with Ansible

Describing an existing CloudFormation stack as Ansible Playbook

One of the mentioned problems is the inconvenient CloudFormation CLI tooling: To create/update a stack, you would have to synthesize at least the stack name, template file name, and parameters, which is no fun and error-prone. For example:

With Ansible, we can describe a new or existing CloudFormation stack with a few lines as an Ansible Playbook. Here is one example:

Creating and updating (converging) the CloudFormation stack becomes as straightforward as:

Awesome! We finally have great tooling! The YAML syntax is machine and human readable and our single source of truth from now on.

Extending an existing CloudFormation stack with Ansible

As for added power, it should be easier to implement AWS functionality that’s currently missing from CloudFormation as an Ansible module than a CloudFormation external resource […] and performing other out of band tasks, letting your ticketing system know about a new stack for example, is a lot easier to integrate into Ansible than trying to wrap the cli tools manually.

— Dean Wilson

The above example stack uses the AWS ElastiCache feature of Redis replica groups, which unfortunately isn’t currently supported by CloudFormation. We could only describe the main ElastiCache cluster in CloudFormation. As a workaround, we used to click this missing piece and documented it with Cucumber as explained above.

A short look at the Ansible documentation reveals there is currently no support for ElastiCache replica groups in Ansible either. But a little research shows that we can extend Ansible with custom modules.

So I started spiking my own Ansible module to handle ElastiCache replica groups, inspired by the existing “elasticache” module. This involved the following steps:

  1. Put the module under “library/”, e.g. elasticache_replication_group.py (I published the unfinished skeleton as a Gist for reference)
  2. Add an output to the existing CloudFormation stack which is creating the ElastiCache cluster, in order to return the ID(s) of the cache cluster(s): We need them to create the read replica group(s). Register the output of the cloudformation Ansible task:
  3. Extend the playbook to create the ElastiCache replica group by reusing the output of the cloudformation task:

Pretty awesome: Ansible works as a glue language while staying very readable. Actually it’s possible to read through the playbook and have an idea what’s going on.

Another great thing is that we can even extend core functionality of Ansible without any friction (such as waiting for upstream to accept a commit, building/deploying new packages, etc.), which should increase acceptance of the tool among coworkers even more.

This topic touches another use-case: The possibility to “chain” CloudFormation stacks with Ansible: Reusing Outputs from Stacks as parameters for other stacks. This is especially useful to split big monolithic stacks into smaller ones which as a result can be managed and reused independently (separation of concerns).
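The chaining idea is independent of the tool that performs it. As a minimal sketch, assuming outputs and parameters in the shapes used by the CloudFormation API (`OutputKey`/`OutputValue` and `ParameterKey`/`ParameterValue`), the translation from one stack's outputs to the next stack's parameters could look like this in Python. The function name `outputs_to_parameters` and the mapping argument are hypothetical.

```python
def outputs_to_parameters(outputs, mapping):
    """Translate the outputs of one stack into parameters for another stack.

    outputs: list of {"OutputKey": ..., "OutputValue": ...} dicts
    mapping: dict of output key -> parameter key expected by the next stack
    """
    values = {o["OutputKey"]: o["OutputValue"] for o in outputs}
    missing = sorted(key for key in mapping if key not in values)
    if missing:
        # Fail early instead of creating a stack with incomplete parameters.
        raise KeyError("stack outputs missing: " + ", ".join(missing))
    return [
        {"ParameterKey": mapping[key], "ParameterValue": values[key]}
        for key in sorted(mapping)
    ]
```

Because only the keys named in the mapping are forwarded, each downstream stack declares exactly which upstream values it depends on, and no ID has to be copied by hand.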

Last but not least, it’s now easy to extend the Ansible playbook with post processing tasks (remember the RDS/Database example above).

Describing existing AWS resources as a “Stack”

As mentioned above, one issue with CloudFormation is the lack of a way to import existing infrastructure into a stack. Luckily, Ansible supports most of the AWS functionality, so we can create a playbook to express existing infrastructure as code.

To discover the possibilities, I converted a fraction of our current production VPC/subnet setup into an Ansible playbook:

As you can see, there is not even a hardcoded VPC ID! Ansible identifies the VPC by a Tag-CIDR tuple, which meets our initial requirement of “no hardcoded data”.

To stress this, I changed the aws_region variable to another AWS region, and it was possible to create the basic VPC setup in another region, which is another sign of a successful single source of truth.

Single source information

Now we want to reuse the information of the VPC which we just brought “under control” in the last example. Why should we do this? Well, in order to be fully automated (which is our goal), we cannot afford any hardcoded information.

Let’s start with the VPC ID, which should be one of the most requested IDs. Getting it is relatively easy because we can just extract it from the ec2_vpc module output and assign it as a variable with the set_fact Ansible module:

OK, but we also need to reuse the subnet information – and to avoid hardcoding, we need to address them without using subnet IDs. As we tagged the subnets above, we could use the tuple (name-tag, Availability zone) to identify and group them.

With the awesome help of the folks in the #ansible IRC channel, I managed to extract one subnet by ID and tag from the output:

While this satisfies the single source requirement, it doesn’t seem to scale very well with a bunch of subnets. Imagine you’d have to do this for each subnet (we already have more than 50 at Jimdo).

After some research I found out that it’s possible to add custom filters to Ansible that allow you to manipulate data with Python code:
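As an illustration of the mechanism, a minimal sketch of such a filter plugin might look like the following. Ansible loads filter plugins from Python files that define a `FilterModule` class with a `filters()` method. Everything else here is an assumption made for the example: the filter name `subnets_by_name_and_az`, and the input shape (a list of subnet dicts with the keys `id`, `az` and `resource_tags`).

```python
from collections import defaultdict


def subnets_by_name_and_az(subnets, tag="Name"):
    """Group subnet IDs by (tag value, availability zone).

    Assumed input: a list of dicts with the keys 'id', 'az' and
    'resource_tags'. Returns {name: {az: [subnet_id, ...]}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for subnet in subnets:
        name = subnet.get("resource_tags", {}).get(tag)
        if name is None:
            continue  # untagged subnets cannot be addressed by name
        grouped[name][subnet["az"]].append(subnet["id"])
    # Convert to plain dicts so the result is easy to use in templates.
    return {name: dict(by_az) for name, by_az in grouped.items()}


class FilterModule(object):
    """Entry point Ansible looks for in files under filter_plugins/."""

    def filters(self):
        return {"subnets_by_name_and_az": subnets_by_name_and_az}
```

With a filter like this, one lookup structure covers all subnets at once, instead of one extraction expression per subnet.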

We can now assign the subnets for later usage like this in Ansible:

This is a great way to prepare the subnets for later usage, e.g. in iterations, to create RDS or ElastiCache subnet groups. Actually, almost everything in a VPC needs subnet information.

Those examples should be enough for now to give us confidence that Ansible is a great tool which fits our needs.

Takeaways

As of writing this, Ansible and CloudFormation seem to be a perfect fit for me. The combination turns out to be a solid solution to the following problems:

  • Single source of information / no hardcoded data
  • Combining documentation and “Infrastructure as Code”
  • Powerful wrapper around basic AWS CLI tooling
  • Inception point for other orchestration software (e.g. CloudFormation)
  • Works with existing AWS resources
  • Easy to extend (Modules, Filters, etc: DSL weaknesses can be worked around by hooking in python code)

Next steps / Vision

After spiking the solution, I could imagine the following next steps for us:

  • Write playbooks for all existing stacks and generalize them by extracting common concepts (e.g. common tags)
  • Transform all the tests in Cucumber to Ansible playbooks in order to have a single source
  • Remove hardcoded IDs from existing CloudFormation stacks by parameterizing them via Ansible.
  • Remove AWS Console (write) access to our Production AWS account in order to enforce the “Infrastructure as Code” paradigm
  • Bring more clicked infrastructure / ecosystem under IaC-control by writing more Ansible modules (e.g. GitHub Teams and Users, Fastly services, Heroku Apps, Pingdom checks)
  • Spin up the VPC including some services in another region in order to prove we are fully single-sourced (e.g. no hardcoded IDs) and automated.
  • Try out Ansible Tower for:
    • Regular convergence runs in order to avoid configuration drift and maybe even revert clicked settings (similar to “Simian army” approach)
    • A “single source of Infrastructure updates”
  • Practices like Game Days to actually test Disaster recovery scenarios

I hope this blog post has brought some new thoughts and inspirations to the readers. Happy holidays!