Four ways AWS Lambda makes me happy

09. December 2016

Author: Tal Perry
Editors: Jyrki Puttonen, Bill Weiss

Intro

What is Lambda

Side projects are my way of learning new technology. One that I’ve been anxious to try is AWS Lambda. In this article, I will focus on the things that make Lambda a great service in my opinion.

For the uninitiated, Lambda is a service that allows you to essentially upload a function and AWS will make sure the hardware is there to run it. You pay for the compute time in hundred millisecond increments instead of by the hour, and you can run as many copies of your lambda function as needed.

You can think of Lambda as a natural extension to containers. Containers (like Docker) allow you to easily deploy multiple workloads to a fleet of servers. You no longer deploy to a server, you deploy to the fleet and if there is enough room in the fleet your container runs. Lambda takes this one step further by abstracting away the management of the underlying server fleet and containerization. You just upload code, AWS containerizes it and puts it on their fleet.

Why did I choose Lambda?

My latest side project is SmartScribe, an automated transcription service. SmartScribe transcribes hours of audio in minutes, a feat which requires considerable memory and parallel processing of audio. While a fleet of containers could get the job done, I didn't want to manage a fleet or integrate it with other services, nor did I want to pay for my peak capacity when my baseline usage was far lower. Lambda abstracts away these issues, which made it a very satisfying choice.

How AWS Lambda makes me happy

It’s very cheap

I love to invest my time in side projects: I get to create and learn. Perhaps irrationally, I don't like to put a lot of money into them from the get-go. When I start building a project I want it up all the time so that I can show it around. On the other hand, I know that 98% of the time my resources will not be used.

Serverless infrastructure saves me that 98% by letting me pay per hundred milliseconds of compute instead of by the hour. 98% is a lot of savings by any account.

I don’t have to think about servers

As I mentioned, I like to invest my time in side projects but I don’t like to invest it in maintaining or configuring infrastructure. A thousand little things can go wrong on your server and any one of those will bring your product to a halt. I’m more than happy to never think about another server again.

Here are a few things that have slowed me down before that Lambda has abstracted away:

  1. Having to reconfigure because I forgot to assign an Elastic IP to an instance, so its address went away when I stopped it (to save money)
  2. Worrying about disk space. My processes write to the disk. Were I to use a traditional architecture, I'd have to worry about multiple concurrent processes consuming the entire disk, a subtle and aggravating bug. With Lambda, each function invocation is guaranteed a (small) chunk of tmp space, which reduces my concern.
  3. Running out of memory. This is a fine point, because a single Lambda function can only use 1.5 GB of memory.

Two caveats:

  1. Applications that hold large data sets in memory might not benefit from Lambda. Applications that hold small to medium sized data sets in memory are prime candidates.
  2. The 512 MB of provisioned tmp space is a major bottleneck when writing larger files to disk.

SmartScribe works with fairly large media files, and we need to hold them in memory, with overhead. Even a few concurrent users can easily lead to problems with available memory – even with a swap file (and we hate configuring servers, so we don't want one). Lambda guarantees that every call to my endpoints will receive the requisite amount of memory. That's priceless.

I use Apex to deploy my functions, which happens in one line
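With Apex, that one line is just the deploy command (shown here deploying every function in the project; individual function names can be appended):

apex deploy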

Apex is smart enough to only deploy the functions that have changed. And in that one line, my changes, and only my changes, reach every "server" I have. Compare that to the time it takes to do a blue-green deployment or, heaven forbid, SSHing into your server and pulling the latest changes.

But wait, there is more. Pardon last year's buzzword, but AWS Lambda induces, or at least encourages, a microservice architecture. Since each function exists as its own unit, testing becomes much easier and more isolated, which saves loads of time.

Tight integration with other AWS services

What makes microservices hard is the overhead of orchestration and communications between all of the services in your system. What makes Lambda so convenient is that it integrates with other AWS services, abstracting away that overhead.

Having AWS invoke my functions based on an event in S3 or SNS means that I don't have to create some channel of communication between these services, nor monitor that channel. I think this is what makes Lambda so convenient: the overhead you pay for a scalable, maintainable and simple code base is virtually nullified.
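As an illustration (not SmartScribe's actual code; the bucket layout and the process_audio step are hypothetical), an S3-triggered handler only has to unpack the event AWS hands it:

import boto3

s3 = boto3.client('s3')

def handler(event, context):
    # AWS invokes this function directly when an object lands in the bucket;
    # there is no queue or polling code to write or monitor.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        obj = s3.get_object(Bucket=bucket, Key=key)
        process_audio(obj['Body'].read())  # hypothetical processing step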

The punch line

One of the deep axioms of the world is "Good, Fast, Cheap: choose two." AWS Lambda takes a stab at challenging that axiom.

About the Author:

By day, Tal is a data science researcher at Citi's Innovation Lab Tel Aviv, focusing on NLP. By night he is the founder of SmartScribe, a fully serverless automated transcription service hosted on AWS. Previously, Tal was CTO of Superfly, where he and his team leveraged AWS technologies and good DevOps practices to scale the data pipeline 1000x. Check out his projects and reach out on Twitter @thetalperry.

About the Editors:

Jyrki Puttonen is Chief Solutions Executive at Symbio Finland (@SymbioFinland) who tries to keep track of what happens in the cloud.

Bill Weiss is a senior manager at Puppet in the SRE group. Before his move to Portland to join Puppet he spent six years in Chicago working for Backstop Solutions Group, prior to which he was in New Mexico working for the Department of Energy. He still loves him some hardware, but is accepting that AWS is pretty rad for some things.


Building custom AMIs with Packer

08. December 2016

Author: Andreas Rütten
Editors: Steve Button, Joe Nuspl

Amazon machine images (AMIs) are the basis of every EC2 instance launched. They contain the root volume and thereby define what operating system or application will run on the instance.

There are two types of AMIs:

  • public AMIs are provided by vendors, communities or individuals. They are available on the AWS Marketplace and can be paid or free.
  • private AMIs belong to a specific AWS account. They can be shared with other AWS accounts by granting launch permissions. Usually they are either a copy of a public AMI or created by the account owner.

There are several reasons to create your own AMI:

  • predefine a template of the software that runs on your instances. This provides a major advantage for autoscaling environments. Since most, if not all, of the system configuration has already been done, there is no need to run extensive provisioning steps on boot. This drastically reduces the amount of time from instance start to service ready.
  • provide a base AMI for further usage by others. This can be used to ensure a specific baseline across your entire organization.

What is Packer

Packer is software from the HashiCorp universe, like Vagrant, Terraform or Consul. From a single source, you can create machine and container images for multiple platforms.

For that, Packer has the concepts of builders and provisioners.

Builders exist for major cloud providers (like Amazon EC2, Azure or Google), for container environments (like Docker) and for classic virtualization environments (like QEMU, VirtualBox or VMware). They manage the build environment and perform the actual image creation.

Provisioners, on the other hand, are responsible for installing and configuring all components that will be part of the final image. They can be simple shell commands or fully featured configuration management systems like Puppet, Chef or Ansible.

How to use Packer for building AWS EC2 AMIs

The heart of every Packer build is the template, a JSON file which defines the various steps of each Packer run. Let’s have a look at a very simple Packer template:
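A minimal template along these lines would do it (a reconstruction for illustration; the region, AMI ID and access keys are placeholders):

{
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "YOUR_ACCESS_KEY",
      "secret_key": "YOUR_SECRET_KEY",
      "region": "eu-west-1",
      "source_ami": "ami-xxxxxxxx",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "my-base-image {{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": "setup_things.sh"
    }
  ]
}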

There is just one builder and one simple provisioner. On line 4, we specify the amazon-ebs builder which means Packer will create an EBS-backed AMI by:

  • launching the source AMI
  • running the provisioner
  • stopping the instance
  • creating a snapshot
  • converting the snapshot into a new AMI
  • terminating the instance

As this all occurs under your AWS account, you need to specify your access_key and secret_key. Lines 7-9 specify the region, source AMI and instance type that will be used during the build. ssh_username specifies which user Packer will use to ssh into the build instance. This is specific to the source AMI: Ubuntu-based AMIs use ubuntu, while Amazon Linux-based AMIs use ec2-user.

Packer will create temporary key pairs and security groups to connect the local system to the build instance and run the provisioner. If you are running Packer prior to 0.12.0, watch out for GitHub issue #4057.

The last line defines the name of the resulting AMI. We use the {{timestamp}} function of the Packer template engine which generates the current timestamp as part of the final AMI name.

The provisioner section defines one provisioner of the type shell. The local script “setup_things.sh” will be transferred to the build instance and executed. This is the easiest and most basic way to provision an instance.

A more extensive example

The requirements of a real-world scenario usually call for something more than just executing a simple shell script during provisioning. Let's add some more advanced features to the simple template.

Optional sections

The first thing somebody could add is a description and a variables section to the top of our template, like:
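Sketched out, such a header might look like this (the variable names and defaults are illustrative; the two sections sit alongside the existing builders and provisioners sections):

  "description": "Base AMI for our example service",
  "variables": {
    "aws_region": "eu-west-1",
    "source_ami": "",
    "ssh_username": "ubuntu",
    "vpc_id": "",
    "subnet_id": ""
  },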

The first one is just a simple, optional description of the template, while the second adds some useful functionality: it defines variables which can later be used in the template. Some of them have a default value; others are empty and can be set during the Packer call. Running packer inspect against the template lists them along with their defaults.

Overriding can be done like this:
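For example, setting values on the command line (illustrative values):

packer build -var 'aws_region=us-east-1' -var 'source_ami=ami-xxxxxxxx' template.json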

Multiple builders

The next extension could be to define multiple builders:
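A sketch of such a builders section, reusing the variables defined above (the exact options shown are illustrative):

  "builders": [
    {
      "type": "amazon-ebs",
      "region": "{{user `aws_region`}}",
      "source_ami": "{{user `source_ami`}}",
      "instance_type": "t2.micro",
      "ssh_username": "{{user `ssh_username`}}",
      "vpc_id": "{{user `vpc_id`}}",
      "subnet_id": "{{user `subnet_id`}}",
      "associate_public_ip_address": true,
      "ami_name": "my-base-image {{timestamp}}",
      "ami_description": "Base image built with Packer at {{isotime}}"
    },
    {
      "type": "docker",
      "image": "ubuntu:16.04",
      "pull": true,
      "commit": true
    }
  ],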

The amazon-ebs builder has been extended with some of the previously introduced variables. It is a bit more specific about the build environment used on the AWS side, defining a VPC and a subnet and attaching a public IP address to the build instance, and it also adds a description for the resulting AMI.

The second builder defines a build with Docker. This is quite useful for testing the provisioning part of the template: creating an EC2 instance and then an AMI takes time and resources, while building in a local Docker environment is much faster.

The pull option ensures that the base Docker image is pulled if it isn't already in the local repository, while the commit option is set so that the container will be committed to an image in the local repository after provisioning, instead of exported.

By default, Packer executes all builders that have been defined. This can be useful if you want to build the same image with a different cloud provider or in different AWS regions at the same time. In our example we have a special test builder and the actual AWS builder. The following command tells Packer to use only a specific builder:
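For instance, to run only the AWS build (the builder name defaults to its type):

packer build -only=amazon-ebs template.json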

Provisioner

Provisioners are executed sequentially during the build phase. Using the only option, you can restrict a provisioner to be called only by the corresponding builder.

This is useful if you need different provisioners, or different options for a provisioner. In this example both call the same script to do some general bootstrap actions: one is for the amazon-ebs builder, where we call the script with sudo, and the other is for the docker builder, where we don't need sudo, as being root is the default inside a Docker container.
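A sketch of that pair of entries in the provisioners list (the script name is a stand-in for whatever your bootstrap script is called):

    {
      "type": "shell",
      "script": "bootstrap.sh",
      "execute_command": "{{ .Vars }} sudo -E sh '{{ .Path }}'",
      "only": ["amazon-ebs"]
    },
    {
      "type": "shell",
      "script": "bootstrap.sh",
      "only": ["docker"]
    }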

The script itself is about upgrading all installed packages and installing Ansible to prepare the next provisioner:
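Assuming an Ubuntu base image (an assumption; the original script isn't shown here), it could be as simple as:

#!/bin/bash
set -e
export DEBIAN_FRONTEND=noninteractive
# Upgrade everything that is already installed
apt-get update
apt-get -y upgrade
# Install Ansible for the next provisioner
apt-get -y install software-properties-common
apt-add-repository -y ppa:ansible/ansible
apt-get update
apt-get -y install ansible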

Now a provisioner of the type ansible-local can be used. Packer will copy the defined Ansible Playbook from the local system into the instance and then execute it locally.
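The corresponding entry in the provisioners list is small (the playbook path is an assumption):

    {
      "type": "ansible-local",
      "playbook_file": "ansible/playbook.yml"
    }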

The last one is another simple shell provisioner to do some cleanup:
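For example, kept deliberately small here:

    {
      "type": "shell",
      "inline": ["sudo apt-get clean"],
      "only": ["amazon-ebs"]
    }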

Post-Processors

Post-processors are run right after the image is built. For example to tag the docker image in the local docker repository:
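A docker-tag post-processor would look roughly like this (repository and tag are placeholders):

  "post-processors": [
    {
      "type": "docker-tag",
      "repository": "mycompany/base-image",
      "tag": "latest",
      "only": ["docker"]
    }
  ]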

Or to trigger the next step of a CI/CD pipeline.

Pitfalls

While building AMIs with packer is quite easy in general, there are some pitfalls to be aware of.

The most common is differences between the build system and the instances that will be created from the AMI. It could be anything from a different instance type to running in a different VPC. This means thinking about what can already be done at build time and what is specific to the environment where an EC2 instance is created from the built AMI. Examples include an Apache worker-thread configuration based on the number of available CPU cores, or a VPC-specific endpoint the application communicates with, such as an S3 VPC endpoint or the CloudWatch endpoint where custom metrics are sent.

This can be addressed by running a script or any real configuration management system at first boot time.

Wrapping up

As we have seen, building an AMI with a pre-installed configuration is not that hard. Packer is an easy-to-use and powerful tool for doing it. We have discussed the basic building blocks of a Packer template and some of the more advanced options. Go ahead and check out the great Packer documentation, which explains all this and much more in detail.

All code examples can be found at https://github.com/aruetten/aws-advent-2016-building-amis

About the Author

Andreas Rütten is a Senior Systems Engineer at Smaato, a global real-time advertising platform for mobile publishers & app developers. He is also the leader of the AWS User Group Meetup in Hamburg.

About the Editors

Steve Button is a Linux admin geek / DevOps type who likes messing around with Raspberry Pi, Ruby, and Python. He loves technology and hates technology.

Joe Nuspl is a Portland, OR based DevOps Kung Fu practitioner. He is a senior operations engineer at Workday and one of the DevOpsDays Portland 2016 organizers. Author of the zap Chef community cookbook. Aspiring culinary chef. Occasionally he rambles on http://nvwls.github.io/ or @JoeNuspl on Twitter.


Are you getting the most out of IAM?

07. December 2016

Author: Jon Topper
Editors: Bill Weiss, Alfredo Cambera

Identity Concepts

Identity is everywhere, whether we're talking about your GitHub id, Twitter handle, or email address. A strong notion of identity is important in information systems, particularly where security and compliance are involved, and a good identity system supports access control, trust delegation, and audit trail.

AWS provides a number of services for managing identity, and today we’ll be looking at their main service in this area: IAM – Identity and Access Management.

IAM Concepts

Let’s take a look at the building blocks that IAM provides.

First of all, there’s the root user. This is how you’ll log in when you’ve first set up your AWS account. This identity is permitted to do anything and everything to any resource you create in that account, and – like the unix root user – you should really avoid using it for day to day work.

As well as the root user, IAM supports other users. These are separate identities which will typically be used by the people in your organization. Ideally, you’ll have just one user per person; and only one person will have access to that user’s credentials – sharing usernames and passwords is bad form. Users can have permissions granted to them by the use of policies.

Policies are JSON documents which, when attached to another entity, dictate what those entities are allowed to do.

Just like a unix system, we also have groups. Groups pull together lists of users, and any policies applied to the group are available to the members.

IAM also provides roles. In the standard AWS icon set, an IAM Role is represented as a hard hat. This is fairly appropriate, since other entities can “wear” a role, a little like putting on a hat. You can’t log directly into a role – they can’t have passwords – but users and instances can assume a role, and when they do so, the policies associated with that role dictate what they’re allowed to do.

Finally we have tokens. These are sets of credentials you can hold, either permanent or temporary. If you have a token you can present these to API calls to prove to them who you are.

IAM works across regions, so any IAM entity you create is available everywhere. Unlike other AWS services, IAM itself doesn’t cost anything – though obviously anything created in your account by an IAM user will incur costs in the same way as if you’ve done it yourself.

Basic Example

In a typical example we may have three members of staff: Alice, Bob and Carla. Alice is the person who runs the AWS account, and to stop her using the root account for day to day work, she can create herself an IAM user, and assign it one of the default IAM Policies: AdministratorAccess.

As we said earlier, IAM Policies are JSON documents. The AdministratorAccess policy looks like this:
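As of late 2016, it reads, in full:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}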

The Version number here establishes which version of the JSON policy schema we’re using and this will likely be the same across all of your policies. For the purpose of this discussion it can be ignored.

The Statement list is the interesting bit: here, we're saying that anyone using this policy is permitted to call any Action (the * is a wildcard match) on any Resource. Essentially, the holder of this policy has the same level of access as the root account, which is what Alice wants, because she's in charge.

Bob and Carla are part of Alice’s team. We want them to be able to make changes to most of the AWS account, but we don’t want to let them manipulate users – otherwise they might disable Alice’s account, and she doesn’t want that! We can create a group called PowerUsers to put Bob and Carla in, and assign another default policy to that group, PowerUserAccess, which looks like this:
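The 2016-era version of that managed policy is, essentially:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "NotAction": "iam:*",
      "Resource": "*"
    }
  ]
}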

Here you can see that we’re using a NotAction match in the Statement list. We’re saying that users with this policy are allowed to access all actions that don’t match the iam:* wildcard. When we give this policy to Bob and Carla, they’re no longer able to manipulate users with IAM, either in the console, on the CLI or via API calls.

This, though, presents a problem. Now Bob and Carla can’t make changes to their own users either! They won’t be able to change their passwords, for a start, which isn’t great news.

So, we want to allow PowerUsers to perform certain IAM activities, but only on their own users – we shouldn’t let Bob change Carla’s password. IAM provides us with a way to do that. See, for example, this ManageOwnCredentials policy:
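A sketch based on the examples in the IAM documentation (the account ID is a placeholder, and the list of actions can be trimmed or extended to taste):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:ChangePassword",
        "iam:GetUser",
        "iam:CreateAccessKey",
        "iam:DeleteAccessKey",
        "iam:ListAccessKeys",
        "iam:UpdateAccessKey"
      ],
      "Resource": "arn:aws:iam::123456789012:user/${aws:username}"
    }
  ]
}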

The important part of this policy is the ${aws:username} variable expansion. This is expanded when the policy is evaluated, so when Bob is making calls against the IAM service, that variable is expanded to bob.

There’s a great set of example policies for administering IAM resources in the IAM docs, and these cover a number of other useful scenarios.

Multi-Factor Authentication

You can increase the level of security in your IAM accounts by requiring users to make use of a multi-factor authentication token.

Your password is something that you know. An MFA token adds a possession factor: it’s something that you have. You’re then only granted access to the system when both of these factors are present.

If someone finds out your password, but they don’t have access to your MFA token, they still won’t be able to get into the system.

There are instructions on how to set up MFA tokens in the IAM documentation. For most types of user, a "virtual token" such as the Google Authenticator app is sufficient.

Once this is set up, we can prevent non-MFA users from accessing certain policies by adding this condition to IAM policy statements:
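The condition keys off the aws:MultiFactorAuthPresent context key:

"Condition": {
  "Bool": {
    "aws:MultiFactorAuthPresent": "true"
  }
}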

As an aside, several other services permit the use of MFA tokens (they may refer to it as 2FA) – enabling MFA where available is a good practice to get into. I use it with my Google accounts, with GitHub, Slack, and Dropbox.

Instance Profiles

If your app needs to write to an S3 bucket, or use DynamoDB, or otherwise make AWS API calls, you may have AWS access credentials hard-coded in your application config. There is a better way!

In the Roles section of the IAM console, you can create a new AWS Service Role, and choose the “Amazon EC2” type. On creation of that role, you can attach policy documents to it, and define what that role is allowed to do.

As a real life example, we host application artefacts as package repositories in an S3 bucket. We want our EC2 instances to be able to install these packages, and so we create a policy which allows our instances read-only access to our S3 bucket.

When we create new EC2 instances, we can attach our new role to it. Code running on the instance can then request temporary tokens associated with the new server role.

These tokens are served by the Instance Metadata Service. They can be used to call actions on AWS resources as dictated by the policies attached to the role.
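From the instance, fetching them is a plain HTTP request (the role name at the end of the URL is whatever you called your role):

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/my-app-role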

[Diagram: request flow between the application, the instance metadata service, and S3]

The diagram shows the flow of requests. At step 1, the application connects to the instance metadata service with a request to assume a role. In step 2, the metadata service returns a temporary access token back to the application. In step 3, the application connects to S3 using that token.

The official AWS SDKs are all capable of obtaining credentials from the Metadata Service without you needing to worry about it. Refer to the documentation for details.

The benefit of this approach is that if your application is compromised and your AWS tokens leak out, these can only be used for a short amount of time before they’ll expire, reducing the amount of damage that can be caused in this scenario. With hard-coded credentials you’d have to rotate these yourself.

Cross-Account Role Assumption

One other use of roles is really useful if you use multiple AWS accounts. It's considered best practice to use separate AWS accounts for different environments (e.g. live and test). In our consultancy work, we work with a number of customers who each have four or more accounts, so this is invaluable to us.

In our main account (in this example, account ID 00001), we create a group for our users who are allowed to access customer accounts. We create a policy for that group, AssumeRoleCustomer, that looks like this:
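A sketch of that policy, keeping the toy account IDs used in the prose (real account IDs are 12 digits long):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::00005:role/ScaleFactoryUser"
    }
  ]
}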

In this example, our customer's account is ID 00005, and they have a role in that account called ScaleFactoryUser. This AssumeRoleCustomer policy permits our users to call sts:AssumeRole to take on the ScaleFactoryUser role in the customer's account.

sts:AssumeRole is an API call which will return a temporary token for the role specified in the resource, which we can then use in order to behave as that role.

Of course, the other account (00005) also needs a policy to allow this, and so we set up a Trust Relationship Policy:
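Sketched out, the trust relationship on the customer's ScaleFactoryUser role looks something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::00001:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "Bool": { "aws:MultiFactorAuthPresent": "true" }
      }
    }
  ]
}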

This policy allows any entity in the 00001 account to call sts:AssumeRole in our account, as long as it is using an MFA token (remember we saw that conditional in the earlier example?).

Having set that up, we can now log into our main account in the web console, click our username in the top right of the console and choose "Switch Role". By filling in the account number of the other account (00005) and the name of the role we want to assume (ScaleFactoryUser), the web console calls sts:AssumeRole in the background, and uses that to start accessing the customer account.

Role assumption doesn’t have to be cross-account, by the way. You can also allow users to assume roles in the same account – and this can be used to allow unprivileged users occasional access to superuser privileges, in the same way you might use sudo on a unix system.

Federated Identity

When we're talking about identity, it's important to know the difference between the two "auth"s: authentication and authorization.

Authentication is used to establish who you are. So, when we use a username and password (and optionally an MFA token) to connect to the web console, that’s authentication at work.

This is distinct from Authorization which is used to establish what you can do. IAM policies control this.

In IAM, these two concepts are separate. It is possible to configure an Identity Provider (IdP) which is external to IAM, and use that for authentication. Users authenticated against the external IdP can then be assigned IAM roles, which control the authorization part of the story.

IdPs can speak either SAML or OpenID Connect. Google Apps (or are we calling it G Suite now?) can be set up as a SAML provider, and I followed this blog post with some success. I can now jump straight from my Google account into my AWS console, taking on a role I've called GoogleSSO, without having to give any other credentials.

Wrapping Up

I hope I’ve given you a flavour of some of the things you can do with IAM. If you’re still logging in with the root account, if you’re not using MFA, or if you’re hard-coding credentials in your application config, you should now be armed with the information you need to level up your security practice.

In addition to that, you may benefit from using role assumption, cross-account access, or an external IdP. As a bonus hint, you should also look into CloudTrail logging, so that your Alice can keep an eye on what Bob and Carla are up to!

However you’re spending the rest of this year, I wish you all the best.

About the Author

Jon Topper has been building Linux infrastructure for fifteen years. His UK-based consultancy, The Scale Factory, are a team of DevOps and infrastructure specialists, helping organizations of various sizes design, build, operate and scale their platforms.

About the Editors

Bill is a senior manager at Puppet in the SRE group. Before his move to Portland to join Puppet he spent six years in Chicago working for Backstop Solutions Group, prior to which he was in New Mexico working for the Department of Energy. He still loves him some hardware, but is accepting that AWS is pretty rad for some things.

Alfredo Cambera is a Venezuelan outdoorsman, passionate about DevOps, AWS, automation, Data Visualization, Python and open source technologies. He works as Senior Operations Engineer for a company that offers Mobile Engagement Solutions around the globe.


Just add Code: Fun with Terraform Modules and AWS

06. December 2016

Author: Chris Marchesi

Editors: Andrew Langhorn, Anthony Elizondo

This article is going to show you how you can use Terraform, with a little help from Packer and Chef, to deploy a fully-functional sample web application, complete with auto-scaling and load balancing, in under 50 lines of Terraform code.

You will need the sample project to follow along, so make sure you load that up before continuing with reading this article.

The Humble Configuration

Check out the code in the terraform/main.tf file.

It might be hard to believe that this mere smattering of Terraform sets up:

  • An AWS VPC
  • 2 subnets, each in different availability zones, fully routed
  • An AWS Application Load Balancer
  • A listener for the ALB
  • An AWS Auto Scaling group
  • An ALB target group attached to the ALB
  • Configured security groups for both the ALB and backend instances

So what’s the secret?

Terraform Modules

This example is using a powerful feature of Terraform – the modules feature, providing a semantic and repeatable way to manage AWS infrastructure. The modules hide most of the complexity of setting up a full VPC behind a relatively small set of code, and an even smaller set of changes going forward (generally, to update this application, all that is needed is to update the AMI).

Note that this example is composed entirely of modules – no root module resources exist. That’s not to say that they can’t exist – and in fact one of the secondary examples demonstrates how you can use the outputs of one of the modules to add extra resources on an as-needed basis.

The example is composed of three visible modules, and one module that operates under the hood as a dependency:

  • terraform_aws_vpc, which sets up the VPC and subnets
  • terraform_aws_alb, which sets up the ALB and listener
  • terraform_aws_asg, which configures the Auto Scaling group, and ALB target group for the launched instances
  • terraform_aws_security_group, which is used by the ALB and Auto Scaling modules to set up security groups to restrict traffic flow.

These modules will be explained in detail later in the article.

How Terraform Modules Work

Terraform modules work very similarly to basic Terraform configuration. Each Terraform module is a standalone configuration in its own right and, depending on its prerequisites, can run completely on its own. In fact, a top-level Terraform configuration without any modules being used is still a module – the root module. You sometimes see this mentioned in various parts of the Terraform workflow, such as error messages and the state file.

Module Sources and Versioning

Terraform supports a wide variety of remote sources for modules, such as simple, generic locations like HTTP, or Git, or well-known locations like GitHub, Bitbucket, or Amazon S3.

You don’t even need to put a module in a remote location. In fact, a good habit to get into is if you need to re-use Terraform code in a local project, put that code in a module – that way you can re-use it several times to create the same kind of resources in either the same, or even better, different, environments.

Declaring a module is simple. Let’s look at the VPC module from the example:
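The declaration looks roughly like this (a sketch: the ref, variable names and CIDR values are illustrative rather than copied from the project):

module "vpc" {
  source = "github.com/paybyphone/terraform_aws_vpc?ref=v0.1.0"

  vpc_network_address     = "10.0.0.0/24"
  public_subnet_addresses = ["10.0.0.0/25", "10.0.0.128/25"]
  project_path            = "yourname/yourproject"
}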

The location of the module is specified with the source parameter. The style of this parameter dictates how Terraform will go about fetching the module.

The rest of the options here are module parameters, which translate to variables within the module. Note that any variable that does not have a default value in the module is a required parameter, and Terraform will not start if these are not supplied.

The last item that should be mentioned is versioning. Most module sources that work off of source control have a versioning parameter you can supply to get a specific revision or tag – with Git and GitHub sources this is ref, which can take most Git references, be it a branch or a tag.

Versioning is a great way to keep things under control. You might find yourself iterating very fast on certain modules as you learn more about Terraform or your internal infrastructure design patterns change – versioning your modules ensures that you don’t need to constantly refactor otherwise stable stacks.

Module Tips and Tricks

Terraform and HCL is a work in progress, and there may be some things that seem like they may make sense that don’t necessarily work 100% – yet. There are some things that you might want to keep in mind when you are designing your modules that may reduce the complexity that ultimately gets presented to the user:

Use Data Sources

Terraform 0.7+'s data sources feature can go a long way in reducing the amount of data that needs to go into your module.

In this project, data sources are used for things such as obtaining VPC IDs from subnets (aws_subnet) and getting the security groups assigned to an ALB (using the aws_alb_listener and aws_alb data sources chained together). This allows us to create ALBs based on subnet IDs alone, and to attach auto-scaling groups to ALBs knowing only the listener ARN we need to attach to.

Exploit Zero Values and Defaults

Terraform follows the rules of the language it was created in, Go, regarding zero values. Hence, most of the time, supplying an empty parameter is the same as supplying none at all.

This can be advantageous when designing a module to support different kinds of scenarios. For example, the alb module supports TLS via supplying a certificate ARN. Here is the variable declaration:
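Something along these lines (the variable name is an assumption):

variable "listener_certificate_arn" {
  type        = "string"
  description = "ARN of the certificate to serve on the listener. Leave empty for plain HTTP."
  default     = ""
}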

And here it is referenced in the listener block:
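Sketched out (the surrounding arguments and resource names are illustrative; certificate_arn is the relevant line):

resource "aws_alb_listener" "alb_listener" {
  load_balancer_arn = "${aws_alb.alb.arn}"
  port              = "${var.listener_port}"
  protocol          = "${var.listener_protocol}"
  certificate_arn   = "${var.listener_certificate_arn}"

  default_action {
    target_group_arn = "${aws_alb_target_group.alb_default_target_group.arn}"
    type             = "forward"
  }
}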

Now, when this module parameter is not supplied, its default value becomes an empty string, which is passed in to aws_alb_listener.alb_listener. This is, most of the time, exactly the same as if the parameter had not been passed in at all. That means you don't have to worry about this parameter when you just want to use HTTP on this endpoint (the default for the ALB module as a whole).

Pseudo-Conditional Logic

Terraform does not support conditional logic yet, but through creative use of count and interpolation, one can create semi-conditional logic in your resources.

Consider the fact that the terraform_aws_asg module supports attaching the ASG to an ALB, but does not explicitly require it. How can you get away with that, though?

To get the answer, check one of the ALB resources in the module:
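The pattern looks like this (resource and variable names are illustrative; the count line is the interesting part):

resource "aws_alb_target_group" "alb_target_group" {
  count = "${lookup(map("true", "1"), var.enable_alb, "0")}"

  name     = "${var.alb_target_group_name}"
  port     = "${var.alb_service_port}"
  protocol = "HTTP"
  vpc_id   = "${var.alb_vpc_id}"
}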

Here, we make use of the map interpolation function, nested in a lookup function, to provide essentially an if/then/else control structure. This is used to control a resource's instance count, adding an instance if var.enable_alb is true, and completely removing the resource from the graph otherwise.

This conditional logic does not necessarily need to be limited to count, either. Let's go back to the aws_alb_listener.alb_listener resource in the ALB module, looking at a different parameter:
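Inside the listener resource, the SSL policy can be chosen with the same trick (the non-HTTP value here is just one of the predefined ELB security policies):

ssl_policy = "${lookup(map("HTTP", ""), var.listener_protocol, "ELBSecurityPolicy-2015-05")}"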

Here, we are using this trick to supply the correct SSL policy to the listener if the listener protocol is not HTTP. If it is, we supply the zero value, which as mentioned before, makes it as if the value was never supplied.

Module Limitations

Terraform does have some not-necessarily-obvious limitations that you will want to keep in mind when designing both modules and Terraform code in general. Here are a couple:

Count Cannot be Computed

This is a big one that can really get you when you are writing modules. Consider the following scenario, which totally did not happen to me even though I knew of such things beforehand 😉

  • An ALB listener is created with aws_alb_listener
  • The arn of this resource is passed as an output
  • That output is used as both the ARN to attach an auto-scaling group to, and the pseudo-conditional in the ALB-related resources’ count parameter

What happens? You get this lovely message:

value of 'count' cannot be computed

Actually, it used to be worse (a strconv error was displayed instead), but luckily that changed recently.

Unfortunately, there is no nice way to work around this right now. Extra parameters need to be supplied, or you need to structure your modules in a way that avoids computed values being passed into count directives in your workflow. (This is pretty much exactly why the terraform_aws_asg module has an enable_alb parameter.)

Complex Structures and Zero Values

Complex structures are not necessarily good candidates for zero values, even though it may seem like a good idea. But by defining a complex structure in a resource, you are by nature supplying it a non-zero value, even if most of the fields you supply are empty.

Most resources don’t handle this scenario gracefully, so it’s best to avoid using complex structures in a scenario where you may be designing a module for re-use, and expect that you won’t be using the functionality defined by such a structure often.

The Application in Brief

As our focus in this article is on Terraform modules, and not on other parts of the pattern such as using Packer or Chef to build an AMI, we will only touch briefly on the non-Terraform parts of this project, so that we can concentrate on the Terraform code and the AWS resources it sets up.

The Gem

The Ruby gem in this project is a small “hello world” application running with Sinatra. This is self-contained within this project and mainly exists to give us an artifact to put on our base AMI to send to the auto-scaling group.

The server prints out the system’s hostname when fetched. This will allow us to see each node in action as we boot things up.

Packer

The built gem is loaded onto an AMI using Packer, for which the code is contained within packer/ami.json. We use chef-solo as a provisioner, which works off a self-contained cookbook named packer_payload in the cookbooks directory. This gives us a somewhat higher-level workflow than plain shell scripts, including the ability to integration-test things more easily and, potentially, to support multiple build targets.

Note that the Packer configuration takes advantage of a new Packer 0.12.0 feature that allows us to fetch an AMI to use as the base right from Packer. This is the source_ami_filter directive. Before Packer 0.12.0, you would have needed to resort to a helper, such as ubuntu_ami.sh, to get the AMI for you.
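A source_ami_filter block typically looks like this (similar to the example in the Packer documentation, selecting Canonical's most recent Ubuntu 16.04 HVM EBS image):

"source_ami_filter": {
  "filters": {
    "virtualization-type": "hvm",
    "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*",
    "root-device-type": "ebs"
  },
  "owners": ["099720109477"],
  "most_recent": true
}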

The Rakefile

The Rakefile is the build runner. It has tasks for Packer (ami), Terraform (infrastructure), and Test Kitchen (kitchen). It also has prerequisite tasks to stage cookbooks (berks_cookbooks), and Terraform modules (tf_modules). It’s necessary to pre-fetch modules when they are being used in Terraform – normally this is handled by terraform get, but the tf_modules task does this for you.

It also handles some parameterization of Terraform commands, which allows us to specify when we want to perform something else other than an apply in Terraform, or use a different configuration.

All of this is in addition to standard Bundler gem tasks like build, etc. Note that install and release tasks have been explicitly disabled so that you don’t install or release the gem by mistake.

The Terraform Modules

Now that we have that out of the way, we can talk about the fun stuff!

As mentioned at the start of the article, this project has four different Terraform modules. Also as mentioned, one of them (the security group module) is hidden from the end user, as it is consumed by two of the parent modules to create security groups to work with. This exploits the fact that Terraform can, of course, nest modules within each other, allowing for any level of re-usability when designing a module layout.

The AWS VPC Module

The first module, terraform_aws_vpc, creates not only a VPC, but also public subnets as well, complete with route tables and internet gateway attachments.

We've already hidden a decent amount of complexity just by doing this, but as an added bonus, redundancy is baked right into the module: any network addresses passed in as subnets are distributed across all of the availability zones available in the region in question, via the aws_availability_zones data source. This process does not require previous knowledge of the zones available to the account.

The module passes out pertinent information, such as the VPC ID, the ID of the default network ACL, the created subnet IDs, the availability zones for those subnets as a map, and the ID of the route table created.

The ALB Module

The second module, terraform_aws_alb allows for the creation of AWS Application Load Balancers. If all you need is the defaults, use of this module is extremely simple, creating an ALB that will answer requests on port 80. A default target group is also created that can be used if you don’t have anything else mapped, but we want to use this with our auto-scaling group.

The Auto Scaling Module

The third module, terraform_aws_asg, is arguably the most complex of the three that we see in the sample configuration, but even at that, its required options are very slim.

The beauty of this module is that, thanks to all the aforementioned logic, you can attach more than one ASG to the same ALB with different path patterns (mentioned below), or not attach it to an ALB at all! This allows this same module to be used for a number of scenarios. This is on top of the plethora of options available to you to tune, such as CPU thresholds, health check details, and session stickiness.

Another thing to note is how the AMI for the launch configuration is being fetched from within this module. We work off the tag that we used within Packer, which is supplied as a module variable. This is then searched for within the module via an aws_ami data source. This means that no code or variables need to change when the AMI is updated – the next Terraform run will pick up the most recent AMI with the tag.
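Inside the module, that lookup might look something like this (the tag key and variable name are assumptions):

data "aws_ami" "launch_ami" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "tag:application"
    values = ["${var.image_tag_value}"]
  }
}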

Lastly, this module supports the rolling update mechanism laid out by Paul Hinze in this post oh so long ago now. When a new AMI is detected and the auto-scaling group needs to be updated, Terraform will bring up the new ASG, attach it, wait for it to have minimum capacity, and then bring down the old one.

The Security Group Module

The last module to be mentioned, terraform_aws_security_group, is not shown anywhere in our example, but is actually used by the ALB and ASG modules to create Security Groups.

Not only does it create security groups, though – it also allows for the creation of two kinds of ICMP allow rules: one for all ICMP, if you so choose, but more importantly, allow rules for ICMP type 3 (destination unreachable) are always created, as path MTU discovery depends on them. Without this, we might end up with unnecessarily degraded performance.

Give it a Shot

After all this talk about the internals of the project and the Terraform code, you might be eager to bring this up and see it working. Let’s do that now.

Assuming you have the project cloned and AWS credentials set appropriately, do the following:

  • Run bundle install --binstubs --path vendor/bundle to load the project’s Ruby dependencies.
  • Run bundle exec rake ami. This builds the AMI.
  • Run bundle exec rake infrastructure. This will deploy the project.

After this is done, Terraform should return an alb_hostname value to you. You can now load this up in your browser. Load it once, then wait about a second, then load it again! Or even better, just run the following in a prompt:

while true; do curl http://ALBHOST/; sleep 1; done

And watch the hostname change between the two hosts.

Tearing it Down

Once you are done, you can destroy the project simply by passing a TF_CMD environment variable in to rake with the destroy command:

TF_CMD=destroy bundle exec rake infrastructure

And that’s it! Note that this does not delete the AMI artifact, you will need to do that yourself.

More Fun

Finally, a few items for the road. These are things that are otherwise important to note or should prove to be helpful in realizing how powerful Terraform modules can be.

Tags

You may have noticed the modules have a project_path parameter that is filled out in the example with the path to the project in GitHub. This is something that I think is important for proper AWS resource management.

Several of our resources have machine-generated names or IDs, which makes them hard to track on their own. Having an easy-to-reference tag alleviates that. Having the tag reference the project that consumes the resource is even better – I don't think it gets much clearer than that.

SSL/TLS for the ALB

Try this: create a certificate using Certificate Manager, and change the alb module to the following:

Better yet, see the example here. This can be run with the following command:

And destroyed with:

You now have SSL for your ALB! Of course, you will need to point DNS to the ALB (either via external DNS, CNAME records, or Route 53 alias records – the example includes this), but it’s that easy to change the ALB into an SSL load balancer.

Adding a Second ASG

You can also use the ASG module to create two auto-scaling groups.

There is an example for the above here. Again, run it with:

And destroy it with:

You now have two auto-scaling groups, one handling requests for /foo/*, and one handling requests for /bar/*. Give it a go by reloading each URL and see the unique instances you get for each.

Acknowledgments

I would like to take a moment to thank PayByPhone for allowing me to use their existing Terraform modules as the basis for the publicly available ones at https://github.com/paybyphone. Writing this article would have been a lot more painful without them!

Also thanks to my editors, Anthony Elizondo and Andrew Langhorn, for their feedback and help with this article, and to the AWS Advent Team for the chance to stand on their soapbox for my 15 minutes! 🙂

About the Author:

Chris Marchesi (@vancluever) is a Systems Engineer working out of Vancouver, BC, Canada. He currently works for PayByPhone, designing tools and patterns to help its engineers and developers work with AWS. He is also a regular contributor to the Terraform project. You can view his work at https://github.com/vancluever, and also his previous articles at https://vancluevertech.com/.

About the Editors:

Andrew Langhorn is a senior consultant at ThoughtWorks. He works with clients large and small on all sorts of infrastructure, security and performance problems. Previously, he was up to no good helping build, manage and operate the infrastructure behind GOV.UK, the simpler, clearer and faster way to access UK Government services and information. He lives in Manchester, England, with his beloved gin collection, blogs at ajlanghorn.com, and is a firm believer that mince pies aren’t to be eaten before December 1st.

Anthony Elizondo is a SRE at Adobe. He enjoys making things, breaking things, and burritos. You can find him at http://twitter.com/complexsplit


IAM Policies, Roles and Profiles and how to keep secrets away from your instances

05. December 2016

Author: Mark Harrison
Editors: Jyrki Puttonen

AWS Identity and Access Management (IAM) is Amazon’s service for controlling access to AWS resources, or more simply, it provides a way for you to decide who has access to what in AWS. This simple description however hides the depth and complexity of what is probably one of the most misunderstood of Amazon’s services.

Many of you will have made use of IAM in order to create multiple users in AWS rather than sharing a single root user, but there are many more ways IAM can be useful to you. This article will be focusing on one use of IAM in particular: instance roles. Instance roles allow you to give AWS access to EC2 instances without them needing to store an AWS API key. I’ll be taking you through how to set them up, how to use them with your applications, and some of the things instance roles are useful for.

Throughout this article, I’ll be using terraform to create instances, roles and policies. However, the principles will apply if you use a different provisioning tool or if you use the API directly.

An example

We’re going to start off with a simple terraform configuration that creates a single micro instance in EC2. Here I’ve created a blank directory and a file inside called infrastructure.tf:
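Something like this (the region and AMI ID are placeholders):

provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "example" {
  ami           = "ami-xxxxxxxx"
  instance_type = "t2.micro"
}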

When I run terraform apply, terraform creates a running EC2 instance based on the configuration in my infrastructure.tf file. This will be the starting point for us to add IAM roles/policies to.

Let’s say we are writing an application and want to provide access to an S3 bucket. One way would be simply to copy your AWS API keys into the configuration file for your application, but this would give your application full access to your AWS account just as if you had logged in yourself. A better option would be to make a new IAM user, give them just the permissions needed to access the S3 bucket, and create API keys for that user. However, you still have to store the API keys in the application’s configuration file, along with all the hassles of managing secrets that entails.

Instead, what we’re going to do is create a role that allows access to the S3 bucket, and assign it to the instance. First, we’re going to make the S3 bucket:
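That's a single resource (the bucket name matches the one used in the policy below):

resource "aws_s3_bucket" "myawsadventapp" {
  bucket = "myawsadventapp"
}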

Then, we're going to create an AWS IAM policy that grants access to the bucket. A policy is simply a JSON document that lists permissions to things in AWS:
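Sketched in Terraform (the resource names here are illustrative, and the role being referenced is created in the next step):

resource "aws_iam_role_policy" "s3_read_only" {
  name = "s3-read-only"
  role = "${aws_iam_role.s3_read_only.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::myawsadventapp/*"
    }
  ]
}
EOF
}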

The actual policy document is the JSON bit between the <<EOF and EOF:

There’s quite a bit going on here, but the important parts are the Action and Resource sections. The Action section says what you can do, and in this case we’re saying you can get objects from S3 (in other words, we’re providing read only access to something in S3). The Resource section specifies what you can do it with, and in this case we say you can get S3 objects from anywhere inside the myawsadventapp bucket. If we wanted to provide write access to the bucket we would add another action, s3:PutObject, to the list of actions we allow. We can also change the name of the S3 bucket as needed to provide access to other buckets.

Now that we have the Policy set up to allow access to S3, we need to actually give that set of permissions to the instance itself. To do this, we make a role:
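Again as a sketch; the assume-role policy document here is the standard one for EC2:

resource "aws_iam_role" "s3_read_only" {
  name = "s3-read-only"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}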

The first part of this is pretty straightforward: we give the role a name. But why is there another Policy JSON document there? This assume role policy specifies who, or what, can become the role. In this case, the policy is just stating that EC2 instances can have the role assigned to them. Generally, when making instance roles, you don’t need to change this.

The policy is already linked to the role (we added a role = section when making the policy). All that remains is to link the role with our instance.

If you were using the AWS web console to make a new instance, assigning a role to it is easy, you just pick the role from the list of roles in the instance details section as you make the instance. However, if you are using terraform, the AWS cli tools, or some other provisioning tool, then there is one more link in the chain: Instance Profiles.

Instance profiles are simply containers for roles that can be attached directly to instances, and can be thought of as simply an implementation detail. Whenever you make a role, make a matching profile, and then attach the profile to the instance. Here’s the profile to match the role we just created:
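Keeping the same name as the role (the roles list reflects the Terraform syntax of the time):

resource "aws_iam_instance_profile" "s3_read_only" {
  name  = "s3-read-only"
  roles = ["${aws_iam_role.s3_read_only.name}"]
}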

Notice how the name of the profile is the same as the name of the role. This is how it works with the AWS web console: AWS creates a profile with the same name as the role behind the scenes. Keeping the name the same makes things easier, and once you have done this you can then completely forget that profiles exist.

Finally, now that the profile has been created, we just edit the instance and assign the profile to it:
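The instance resource from the start of the article just gains one extra argument:

resource "aws_instance" "example" {
  ami                  = "ami-xxxxxxxx"
  instance_type        = "t2.micro"
  iam_instance_profile = "${aws_iam_instance_profile.s3_read_only.name}"
}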

And now, with all of the required configuration made, we can go ahead and make the instance:
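terraform apply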

There is one thing to be aware of: an instance profile can’t be changed after an instance has been created, so if you were following along and created the instance earlier without adding the instance profile then you have to recreate the instance from scratch. With this toy instance that’s not a problem, but it may be if you’re adding this to existing infrastructure.

Accessing API keys from the instance

Once the terraform run is complete, we can ssh into the instance and see that the instance profile has been applied:
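For example, asking the instance metadata service which role is attached returns the name we chose above (s3-read-only in this sketch):

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/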

And if we run a slightly different curl command, we can obtain AWS API keys:
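Appending the role name to the same URL returns a JSON document containing AccessKeyId, SecretAccessKey, Token and an Expiration timestamp:

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/s3-read-only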

Your application can simply look up the keys when it wants to use an AWS API and doesn’t need to store them in a config file or elsewhere. Note that the credentials listed have an expiration time mentioned. The keys change approximately every 6 hours and you will need to look them up again after this time.

To make life easier, most AWS libraries and commands already support instance roles as a method of getting credentials, and will automatically use any credentials that are available without any further configuration. For example, you can just use the aws cli without needing to configure your credentials:
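For example (the object name is hypothetical):

aws s3 cp s3://myawsadventapp/application-config.yml .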

Some things you can do with IAM roles and instance profiles

So far we’ve shown an example of giving instances access to a particular S3 bucket. This is great, but there are some other uses for instance roles:

One good use case is managing EBS volumes. Say you have an autoscaling group (because AWS instances break and autoscaling groups allow AWS to launch replacements for broken instances), but you have state that needs to be stored on instances that you’d like to not disappear every time an instance is recreated. The way you deal with this is to store the stateful data on EBS volumes, and use a script that runs on boot to attach any EBS volume that isn’t currently in use.

Another case where having IAM roles is really handy: if you install Grafana on an AWS instance, the CloudWatch data source supports using IAM roles, so you can use Grafana to view CloudWatch graphs for your AWS account without needing to set up credentials. To do this, use an IAM policy along the following lines:
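A minimal read-only CloudWatch policy will do (Grafana's documentation lists the exact set of actions it can use):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricStatistics"
      ],
      "Resource": "*"
    }
  ]
}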

Finally, a special case of the S3 access policy above is to use the S3 bucket to store secrets. This uses S3 as a trusted store, and you use IAM profiles to determine which instances get access to the secrets. This is the basis of the citadel cookbook for Chef that can be used to manage secrets in AWS.

More information

Hopefully this article has given you a taste for IAM roles and instance profiles and how they can make your life much easier when interacting with the AWS API from EC2 instances. If you want more information on using IAM roles, the AWS Documentation on IAM Roles goes into much more detail and is well worth a read.

About the Author

Mark Harrison is a Systems Administrator on the Chef operations team, where he is responsible for the care and feeding of Hosted Chef as well as maintaining several of Chef’s internal systems. Before coming to Chef, Mark led the operations team at OmniTI, helping clients scale their web architectures and supporting some of the largest infrastructures in the world.

About the Editors

Jyrki Puttonen is Chief Solutions Executive at Symbio Finland (@SymbioFinland) who tries to keep on track what happens in cloud.


Exploring Concurrency in Python & AWS

04. December 2016

Exploring Concurrency in Python & AWS

From Threads to Lambdas (and lambdas with threads)

Author: Mohit Chawla

Editors: Jesse Davis, Neil Millard

The scope of the current article is to demonstrate multiple approaches to solve a seemingly simple problem of intra-S3 file transfers – using pure Python and a hybrid approach of Python and cloud based constructs, specifically AWS Lambda, with a comparison of the two concurrency approaches.

Problem Background

The problem was to transfer 250 objects daily, each of size 600-800 MB, from one S3 bucket to another. In addition, an initial bulk backup of 1500 objects (6 months of data) had to be taken, totaling 1 TB.

Attempt 1

The easiest way to do this appears to be to loop over all the objects and transfer them one by one:
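In boto3 that looks roughly like this (bucket names are placeholders, and this is a sketch rather than the original code):

import boto3

s3 = boto3.client('s3')

def copy_all(source_bucket, dest_bucket):
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=source_bucket):
        for obj in page.get('Contents', []):
            # Server-side copy; the object data never leaves S3.
            s3.copy_object(
                Bucket=dest_bucket,
                Key=obj['Key'],
                CopySource={'Bucket': source_bucket, 'Key': obj['Key']},
            )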

This had a runtime of 1 hour 45 minutes. Oops.

Attempt 2

Let's use some threads!

Python offers multiple concurrency methods:

  • asyncio, based on event loops and asynchronous I/O.
  • concurrent.futures, which provides high level abstractions like ThreadPoolExecutor and ProcessPoolExecutor.
  • threading, which provides low level abstractions to build your own solution using threads, semaphores and locks.
  • multiprocessing, which is similar to threading, but for processes.

I used the concurrent.futures module, specifically the ThreadPoolExecutor, which seems to be a good fit for I/O tasks.

Note about the GIL:

Python implements a GIL (Global Interpreter Lock) which limits only a single thread to run at a time, inside a single Python interpreter. This is not a limitation for an I/O intensive task, such as the one being discussed in this article. For more details about how it works, see http://www.dabeaz.com/GIL/.

Here’s the code when using the ThreadPoolExecutor:
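A sketch of that version (the worker count and the way keys are gathered are illustrative):

from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client('s3')

def copy_object(source_bucket, dest_bucket, key):
    s3.copy_object(
        Bucket=dest_bucket,
        Key=key,
        CopySource={'Bucket': source_bucket, 'Key': key},
    )

def copy_all(source_bucket, dest_bucket, keys, workers=20):
    # Each copy is an I/O-bound API call, so threads work well despite the GIL.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [
            executor.submit(copy_object, source_bucket, dest_bucket, key)
            for key in keys
        ]
        for future in futures:
            future.result()  # surface any exceptions raised in the workers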

This code took 1 minute 40 seconds to execute, woo!

Concurrency with Lambda

I was happy with this implementation, until, at an AWS meetup, there was a discussion about using AWS Lambda and SNS for the same thing, and I thought of trying that out.

AWS Lambda is a compute service that lets you run code without provisioning or managing servers. It can be combined with AWS SNS, a push-based message notification service that can deliver and fan out messages to several types of consumers, including e-mail, HTTP endpoints and Lambda, which allows the decoupling of components.

To use Lambda and SNS for this problem, a simple pipeline was devised: One Lambda function publishes object names as messages to SNS and another Lambda function is subscribed to SNS for copying the objects.

The following piece of code publishes names of objects to copy to an SNS topic. Note the use of threads to make this faster.
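A sketch of that publisher (the topic ARN and bucket name are placeholders):

from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client('s3')
sns = boto3.client('sns')

TOPIC_ARN = 'arn:aws:sns:eu-west-1:123456789012:copy-objects'  # placeholder

def publish_key(key):
    # One SNS message per object; each message fans out to the copying Lambda.
    sns.publish(TopicArn=TOPIC_ARN, Message=key)

def handler(event, context):
    paginator = s3.get_paginator('list_objects_v2')
    keys = [obj['Key']
            for page in paginator.paginate(Bucket='source-bucket')
            for obj in page.get('Contents', [])]
    with ThreadPoolExecutor(max_workers=20) as executor:
        list(executor.map(publish_key, keys))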

Yep, that’s all the code.

Now, you may be asking yourself: how is the copy operation actually concurrent?
The unit of concurrency in AWS Lambda is the function invocation. For each published message, the Lambda function is invoked once, which means that for multiple messages published in parallel, an equivalent number of concurrent invocations of the Lambda function will be made. (The AWS documentation describes exactly how concurrency scales for the different event source types.)

By default, this is limited to 100 concurrent executions, but can be raised on request.
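The subscribed copy function can be equally small; a sketch (placeholder bucket names again) might be:

    import boto3

    s3 = boto3.client('s3')

    SOURCE_BUCKET = 'source-bucket'       # placeholder
    DESTINATION_BUCKET = 'backup-bucket'  # placeholder

    def handler(event, context):
        # SNS delivers each published message in event['Records'].
        for record in event['Records']:
            key = record['Sns']['Message']
            s3.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=key,
                CopySource={'Bucket': SOURCE_BUCKET, 'Key': key},
            )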

The execution time for the above code was 2 minutes 40 seconds. This is higher than the pure Python approach, partly because the invocations were throttled by AWS.

I hope you enjoyed reading this article, and if you are an AWS or Python user, hopefully this example will be useful for your own projects.

Note – I gave this as a talk at PyUnconf ’16 in Hamburg, you can see the slides at https://speakerdeck.com/alcy/exploring-concurrency-in-python-and-aws.

About the Author:

Mohit Chawla is a systems engineer, living in Hamburg. He has contributed to open source projects over the last seven years, and has a few projects of his own. Apart from systems engineering, he has a strong interest in data visualization.


server-free pubsub ( and nearly code-free )

02. December 2016 2016 0

Author: Ed Anderson

Editors: Evan Mouzakitis, Brian O’Rourke

Intro

This article will introduce you to creating serverless PubSub microservices by building a simple Slack based word counting service.

Lambda Overview

These PubSub microservices are AWS Lambda based. Lambda is a service that does not require you to manage servers in order to run code. The high level overview is that you define events ( called triggers ) that will cause a packaging of your code ( called a function ) to be invoked. Inside your package ( aka function ), a specific function within a file ( called a handler ) will be called.

If you’re feeling a bit confused by overloaded terminology, you are not alone. For now, here’s the short list:

Lambda term | Common Name                   | Description
Trigger     | AWS Service                   | Component that invokes Lambda
Function    | Software package              | Group of files needed to run code (includes libraries)
Handler     | file.function in your package | The filename/function name to execute

 

There are many different types of triggers ( S3, API Gateway, Kinesis streams, and more). See this page for a complete list. Lambdas run in the context of a specific IAM Role. This means that, in addition to features provided by your language of choice ( python, nodejs, java, scala ), you can call from your Lambda to other AWS Services ( like DynamoDB ).
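To make the terminology concrete, here is a minimal, hypothetical handler. If this code lived in a file named example.py, the Handler setting would be example.handler, and whatever trigger you attach decides when it runs:

    # example.py
    def handler(event, context):
        # 'event' carries the payload from the trigger; 'context' exposes
        # runtime details such as the remaining execution time.
        print(event)
        return {'ok': True}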

Intro to the PubSub Microservices

These microservices, once built, will count words typed into Slack. The services are:

  1. The first service splits up the user-input into individual words and:
    • increments the counter for each word
    • supplies a response to the user showing the current count of any seen words
    • triggers functions 2 and 3 which execute concurrently
  2. The second service also splits up the user-input into individual words and:
    • adds a count of 10 to each of those words
  3. The third service logs the input it receives.

While you might not have a specific need for a word counter, the concepts demonstrated here can be applied elsewhere. For example, you may have a project where you need to run several things in series, or perhaps you have a single event that needs to trigger concurrent workflows.

For example:

  • Concurrent workflows triggered by a single event:
    • New user joins org, and needs accounts created in several systems
    • Website user is interested in a specific topic, and you want to curate additional content to present to the user
    • There is a software outage, and you need to update several systems ( statuspage, nagios, etc ) at the same time
    • Website clicks need to be tracked in a system used by Operations, and a different system used by Analytics
  • Serial workflows triggered by a single event:
    • New user needs a Google account created, then that Google account needs to be given permission to access another system integrated with Google auth.
    • A new version of software needs to be packaged, then deployed, then activated
    • Cup is inserted to a coffee machine, then the coffee machine dispenses coffee into the cup

 

  • The API Gateway ( trigger ) will call a Lambda Function that will split whatever text it is given into specific words
    • Upsert a key in a DynamoDB table with the number 1
    • Drop a message on a SNS Topic
  • The SNS Topic ( trigger ) will have two lambda functions attached to it that will
    • Upsert the same keys in the dynamodb with the number 10
    • Log a message to CloudWatchLogs
Visualization of the different microservices comprising the Slack-based word counter

 

Example code for AWS Advent near-code-free PubSub. Technologies used:

  • Slack ( outgoing webhooks )
  • API Gateway
  • IAM
  • SNS
  • Lambda
  • DynamoDB

Pub/Sub is teh.best.evar* ( *for some values of best )

I came into the world of computing by way of The Operations Path. The Publish-Subscribe Pattern has always been near and dear to my ❤️.

There are a few things about PubSub that I really appreciate as an “infrastructure person”.

  1. Scalability. In terms of the transport layer ( usually a message bus of some kind ), the ability to scale is separate from the publishers and the consumers. In this wonderful thing which is AWS, we as infrastructure admins can get out of this aspect of the business of running PubSub entirely.
  2. Loose Coupling. In the happy path, publishers don’t know anything about what subscribers are doing with the messages they publish. There’s admittedly a little hand-waving here, and folks new to PubSub ( and sometimes those that are experienced ) get rude surprises as messages mutate over time.
  3. Asynchronous. This is not necessarily inherent in the PubSub pattern, but it’s the most common implementation that I’ve seen. There’s quite a lot of pressure that can be absent from Dev Teams, Operations Teams, or DevOps Teams when there is no expectation from the business that systems will retain single millisecond response times.
  4. New Cloud Ways. Once upon a time, we needed to queue messages in PubSub systems ( and you might still have a need for that feature ), but with Lambda, we can also invoke consumers on demand as messages pass through our system. We don’t necessarily have to keep things in the queue at all. Message appears, processing code runs, everybody’s happy.

Yo dawg, I heard you like ️☁️

One of the biggest benefits that we can enjoy from being hosted with AWS is not having to manage stuff. Running your own message bus might be something that separates your business from your competition, but it might also be undifferentiated heavy lifting.

IMO, if AWS can and will handle scaling issues for you ( to say nothing of only paying for the transactions that you use ), then it might be the right choice to let them take care of that for you.

I would also like to point out that running these things without servers isn’t quite the same thing as running them in a traditional setup. I ended up redoing this implementation a few times as I kept finding the rough edges of running things serverless. All were ultimately addressable, but I wanted to keep the complexity of this down somewhat.

WELCOME TO THE FUTURE, FRIENDS

TL;DR GIMMIE SOME EXAMPLES

CloudFormation is pretty well covered by AWS Advent, so we’ll configure this little ditty via the AWS console.

TO THE BATCODE CAVE!

Setup the first lambda, which will be linked to an outgoing webhook in slack

Setup the DynamoDB

👇 You can follow the steps below, or view this video 👉 Video to DynamoDB Create

  1. Console
  2. DynamoDB
  3. Create Table
    1. Table Name table
    2. Primary Key word
    3. Create

Setup the First Lambda

This Lambda accepts the input from a Slack outgoing webhook, splits the input into separate words, and adds a count of one to each word. It further returns a json response body to the outgoing webhook that displays a message in slack.

If the Lambda is triggered with the input awsadvent some words, this Lambda will create the following three keys in dynamodb, and give each the value of one.

  • awsadvent = 1
  • some = 1
  • words = 1
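The real implementation is the app.py referenced in the steps below; as a rough sketch (assuming the API Gateway integration passes the Slack text through as event['text'], and using a hypothetical wordcount attribute for the counter), it might look something like this:

    # Sketch only; the real code is app.py in the linked repo.
    import json
    import os

    import boto3

    dynamodb = boto3.client('dynamodb')
    sns = boto3.client('sns')

    TABLE = os.environ.get('DYNAMO_TABLE', 'table')
    SNS_TOPIC_ARN = os.environ.get('SNS_TOPIC_ARN')  # optional, see further below

    def handler(event, context):
        # Assumption: the Slack outgoing-webhook text arrives as event['text'].
        words = event.get('text', '').split()
        counts = {}

        for word in words:
            # Upsert: add 1 to the existing count, or create the item at 1.
            response = dynamodb.update_item(
                TableName=TABLE,
                Key={'word': {'S': word}},
                UpdateExpression='ADD wordcount :one',
                ExpressionAttributeValues={':one': {'N': '1'}},
                ReturnValues='UPDATED_NEW',
            )
            counts[word] = response['Attributes']['wordcount']['N']

        # If a topic is configured, fan the words out to the other services.
        if SNS_TOPIC_ARN:
            sns.publish(TopicArn=SNS_TOPIC_ARN, Message=json.dumps(words))

        # The JSON body returned here is what Slack posts back in the channel.
        return {'text': 'Current counts: %s' % json.dumps(counts)}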

👇 You can follow the steps below, or view this video 👉 Video to Create the first Lambda

  1. Make the first Lambda, which accepts slack outgoing webook input, and saves that in DynamoDB
    1. Console
    2. Lambda
    3. Get Started Now
    4. Select Blueprint
      1. Blank Function
    5. Configure Triggers
      1. Click in the empty box
      2. Choose API Gateway
    6. API Name
      1. aws_advent ( This will be the /PATH of your API Call )
    7. Security
      1. Open
    8. Name
      1. aws_advent
    9. Runtime
      1. Python 2.7
    10. Code Entry Type
      1. Inline
      2. It’s included as app.py in this repo. There are more Lambda Packaging Examples here
    11. Environment Variables
      1. DYNAMO_TABLE = table
    12. Handler
      1. app.handler
    13. Role
      1. Create new role from template(s)
      2. Name
        1. aws_advent_lambda_dynamo
    14. Policy Templates
      1. Simple Microservice permissions
    15. Triggers
      1. API Gateway
      2. save the URL

Link it to your favorite slack

👇 You can follow the steps below, or view this video 👉 Video for setting up the slack outbound webhook

  1. Setup an outbound webhook in your favorite Slack team.
  2. Manage
  3. Search
  4. outgoing webhooks
  5. Channel ( optional )
  6. Trigger words
    1. awsadvent
    2. URLs
  7.  Your API Gateway Endpoint on the Lambda from above
  8. Customize Name
  9.  awsadvent-bot
  10. Go to slack
    1. Join the room
    2. Say the trigger word
    3. You should see 👉 something like this

☝️☝️ CONGRATS YOU JUST DID CHATOPS ☝️☝️


Ok, now we want to do the awesome PubSub stuff.

Make the SNS Topic

We’re using a SNS Topic as a broker. The producer ( the aws_advent Lambda ) publishes messages to the SNS Topic. Two other Lambdas will be consumers of the SNS Topic, and they’ll get triggered as new messages come into the Topic.

👇 You can follow the steps below, or view this video 👉 Video for setting up the SNS Topic

  1. Console
  2. SNS
  3. New Topic
  4. Name awsadvent
  5. Note the topic ARN

Add additional permissions to the first Lambda

This permission will allow the first Lambda to talk to the SNS Topic. You also need to set an environment variable on the aws_advent Lambda to have it be able to talk to the SNS Topic.

👇 You can follow the steps below, or view this video 👉 Adding additional IAM Permissions to the aws_lambda role

  1. Give additional IAM permissions on the role for the first lambda
    1. Console
    2. IAM
    3. Roles aws_advent_lambda_dynamo
      1. Permissions
      2. Inline Policies
      3. click here
      4. Policy Name
      5. aws_advent_lambda_dynamo_snspublish
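If you would rather script it than click through the console, an equivalent inline policy can be attached with boto3 along these lines (the topic ARN is a placeholder; use the ARN you noted earlier):

    import json

    import boto3

    iam = boto3.client('iam')

    TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:awsadvent'  # placeholder

    policy = {
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': 'sns:Publish',
            'Resource': TOPIC_ARN,
        }],
    }

    iam.put_role_policy(
        RoleName='aws_advent_lambda_dynamo',
        PolicyName='aws_advent_lambda_dynamo_snspublish',
        PolicyDocument=json.dumps(policy),
    )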

Add the SNS Topic ARN to the aws_advent Lambda

👇 You can follow the steps below, or view this video 👉 Adding a new environment variable to the lambda

There’s a conditional in the aws_advent lambda that will publish to a SNS topic, if the SNS_TOPIC_ARN environment variable is set. Set it, and watch more PubSub magic happen.

  1. Add the SNS_TOPIC_ARN environment variable to the aws_advent lambda
    1. Console
    2. LAMBDA
    3. aws_advent
    4. Scroll down
    5. SNS_TOPIC_ARN
      1. The SNS Topic ARN from above.

Create a consumer Lambda: aws_advent_sns_multiplier

This microservice increments the values collected by the aws_advent Lambda. In a real world application, I would probably not take the approach of having a second Lambda function update values in a database that are originally input by another Lambda function. It’s useful here to show how work can be done outside of the Request->Response flow for a request. A less contrived example might be that this Lambda checks for words with high counts, to build a leaderboard of words.

This Lambda function will subscribe to the SNS Topic, and it is triggered when a message is delivered to the SNS Topic. In the real world, this Lambda might do something like copy data to a secondary database that internal users can query without impacting the user experience.
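The real code is the sns_multiplier.py referenced in the steps below; a rough sketch, assuming the first Lambda publishes the word list as a JSON-encoded SNS message and the counter lives in a hypothetical wordcount attribute, might be:

    # Sketch only; the real code is sns_multiplier.py in the linked repo.
    import json
    import os

    import boto3

    dynamodb = boto3.client('dynamodb')
    TABLE = os.environ.get('DYNAMO_TABLE', 'table')

    def handler(event, context):
        for record in event['Records']:
            # Assumption: the publishing Lambda sent the word list as JSON.
            words = json.loads(record['Sns']['Message'])
            for word in words:
                # Add ten to each word's count.
                dynamodb.update_item(
                    TableName=TABLE,
                    Key={'word': {'S': word}},
                    UpdateExpression='ADD wordcount :ten',
                    ExpressionAttributeValues={':ten': {'N': '10'}},
                )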

👇 You can follow the steps below, or view this video 👉 Creating the sns_multiplier lambda

  1. Console
  2. lambda
  3. Create a Lambda function
  4. Select Blueprint
    1. search sns
    2. sns-message python2.7 runtime
  5. Configure Triggers
    1. SNS topic
      1. awsadvent
      2. click enable trigger
  6. Name
    1. sns_multiplier
  7. Runtime
    1. Python 2.7
  8. Code Entry Type
    1. Inline
      1. It’s included as sns_multiplier.py in this repo.
  9. Handler
    1. sns_multiplier.handler
  10. Role
    1. Create new role from template(s)
  11. Policy Templates
    1. Simple Microservice permissions
  12. Next
  13. Create Function

Go back to slack and test it out.

Now that you have the most interesting parts hooked up together, test it out!

What we’d expect to happen is pictured here 👉 everything working

👇 Writeup is below, or view this video 👉 Watch it work

  • The first time we sent a message, the count of the number of times the words are seen is one. This is provided by our first Lambda
  • The second time we sent a message, the count of the number of times the words are seen is twelve. This is a combination of our first and second Lambdas working together.
    1. The first invocation set the count to current(0) + one, and passed the words off to the SNS topic. The value of each word in the database was set to 1.
    2. After SNS received the message, it ran the sns_multiplier Lambda, which added ten to the value of each word: current(1) + 10. The value of each word in the database was set to 11.
    3. The second invocation set the count of each word to current(11) + 1. The value of each word in the database was set to 12.

️️💯💯💯 Now you’re doing pubsub microservices 💯💯💯

Setup the logger Lambda as well

The output of this Lambda will be viewable in the CloudWatch Logs console, and it only shows that we could do something else ( anything else, even ) with this microservice implementation.
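A sketch of the logger is about as small as a Lambda function gets; anything printed from the handler ends up in CloudWatch Logs:

    # Sketch only; the real code is sns_logger.py in the linked repo.
    def handler(event, context):
        for record in event['Records']:
            # stdout from a Lambda invocation is shipped to CloudWatch Logs.
            print(record['Sns']['Message'])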

  1. Console
  2. Lambda
  3. Create a Lambda function
  4. Select Blueprint
    1. search sns
    2. sns-message python2.7 runtime
  5. Configure Triggers
    1. SNS topic
      1. awsadvent
      2. click enable trigger
  6. Name
    1. sns_logger
  7. Runtime
    1. Python 2.7
  8. Code Entry Type
    1. Inline
      1. It’s included as sns_logger.py in this repo.
  9. Handler
    1. sns_logger.handler
  10. Role
    1. Create new role from template(s)
  11. Policy Templates
    1. Simple Microservice permissions
  12. Next
  13. Create Function

In conclusion

PubSub is an awesome model for some types of work, and in AWS with Lambda we can work inside this model relatively simply. Plenty of real-world work depends on the PubSub model.

You might translate this project to things that you actually need to do, like software deployment, user account management, or building leaderboards.

AWS + Lambda == the happy path

It’s ok to lean on AWS for the heavy lifting. As our word counter becomes more popular, we probably won’t have to do anything at all to scale with traffic. Having our code execute on a request-driven basis is a big win from my point of view. “Serverless” computing is a very interesting development in cloud computing. Look for ways to experiment with it, there are plenty of benefits to it ( other than novelty ).

Some benefits you can enjoy via Serverless PubSub in AWS:

  1. Scaling the publishers. Since this used API Gateway to terminate user requests to a Lambda function:
    1. You don’t have idle resources burning money, waiting for traffic
    2. You don’t have to scale because traffic has increased or decreased
  2. Scaling the bus / interconnection. SNS did the following for you:
    1. Scaled to accommodate the volume of traffic we send to it
    2. Provided HA for the bus
    3. Pay-per-transaction. You don’t have to pay for idle resources!
  3. Scaling the consumers. Having lambda functions that trigger on a message being delivered to SNS:
    1. Scaled the lambda invocations to the volume of traffic.
    2. Provides some sense of HA

Lambda and the API Gateway are works in progress.

Lambda is a new technology. If you use it, you will find some rough edges.

The API Gateway is a new technology. If you use it, you will find some rough edges.

Don’t let that dissuade you from trying them out!

I’m open for further discussion on these topics. Find me on twitter @edyesed

About the Author:

Ed Anderson has been working with the internet since the days of gopher and lynx. Ed has worked in healthcare, regional telecom, failed startups, multinational shipping conglomerates, and is currently working at RealSelf.com.

Ed is into dadops,  devops, and chat bots.

Writing in the third person is not Ed’s gift. He’s much more comfortable poking the private cloud bear,  destroying ec2 instances, and writing lambda functions be they use case appropriate or not.

He can be found on Twitter at @edyesed.

About the Editors:

Evan Mouzakitis is a Research Engineer at Datadog. He is passionate about solving problems and helping others. He has written about monitoring many popular technologies, including Lambda, OpenStack, Hadoop, and Kafka.

Brian O’Rourke is the co-founder of RedisGreen, a highly available and highly instrumented Redis service. He has more than a decade of experience building and scaling systems and happy teams, and has been an active AWS user since S3 was a baby.


Deploy your AWS Infrastructure Continuously

01. December 2016 2016 0

Author: Michael Wittig

Continuously integrating and deploying your source code is the new standard in many successful internet companies. But what about your infrastructure? Can you deploy a change to your infrastructure in an automated way? Can you run automated tests on your infrastructure to ensure that a change has no unintended side effects? In this post I will show you how you can apply the same processes to your AWS infrastructure that you apply to your source code. You will learn how the AWS services CloudFormation, CodePipeline and Lambda can be combined to continuously deploy infrastructure.

Precondition

You may think: “Source code is text files, but my infrastructure is different. I don’t have a source file for my infrastructure.” Infrastructure as Code, as defined by Martin Fowler, is a concept that brings software development practices to infrastructure.

Infrastructure as code is the approach to defining computing and network infrastructure through source code that can then be treated just like any software system.
– Martin Fowler

AWS CloudFormation is one implementation of Infrastructure as Code. CloudFormation is a high quality and free service offered by AWS. To understand CloudFormation you need to know about templates and stacks. The template is the source code, a textual representation of your infrastructure. The stack is the actual running infrastructure described by the template. So a CloudFormation template is exactly what we need, a plain text file. The CloudFormation service interprets the template and turns it into a running infrastructure.

Now, our infrastructure is defined by a text file which is exactly what we need to apply the same processes to it that we have for source code.

The Pipeline

The pipeline to build and deploy is a sequence of steps that are necessary to ship changes to your users, starting with a change in the code repository and ending in your production environment. The following figure shows a pipeline that runs inside AWS CodePipeline, the AWS CD service.

AWS CodePipeline - Deploying infrastructure continuously

Whenever a git push is made to a repository hosted on GitHub the pipeline starts to run by fetching the current version of the repository. After that, the pipeline creates or updates itself because the pipeline definition itself is also treated as source code. After that, the up-to-date pipeline creates or updates the test environment. After this step, infrastructure in the test environment looks exactly as it was defined in the template. This is also a good place to deploy the application to the test environment. I’m using Elastic Beanstalk to host the demo application. Now it’s time to check if the infrastructure is still in a good shape. We want to make sure that everything runs as it is defined in the tests. The tests may check if a certain port is reachable, if a certain user can login via SSH, if a certain port is NOT reachable, and so on, and so forth. If the tests are successful, the production environment is adapted to the new template and the new application version is deployed.

Implementation

From Source to Deploy Pipeline

CodePipeline has native support for GitHub, CloudFormation, Elastic Beanstalk, and Lambda. So I can use all the services and tie them together using CodePipeline. You can find the full source code and detailed setup instructions in this GitHub repository: michaelwittig/automation-for-the-people

 

The following template snippet shows an excerpt of the full pipeline description. Here you see how the pipeline can be configured to checkout the GitHub repository and create/update itself:
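As a rough sketch of the shape such a definition takes (the role ARNs, artifact bucket, stack name, template path, and OAuth token below are placeholders; the real template lives in the repository above), a GitHub source action followed by a CloudFormation self-update action looks roughly like this:

    Pipeline:
      Type: 'AWS::CodePipeline::Pipeline'
      Properties:
        RoleArn: 'arn:aws:iam::123456789012:role/pipeline'   # placeholder
        ArtifactStore:
          Type: S3
          Location: 'my-artifact-bucket'                     # placeholder
        Stages:
        - Name: Source
          Actions:
          - Name: FetchSource
            ActionTypeId: {Category: Source, Owner: ThirdParty, Provider: GitHub, Version: '1'}
            Configuration:
              Owner: michaelwittig
              Repo: automation-for-the-people
              Branch: master
              OAuthToken: 'REPLACE_ME'                       # placeholder
            OutputArtifacts:
            - Name: Source
        - Name: PipelineUpdate
          Actions:
          - Name: UpdatePipeline
            ActionTypeId: {Category: Deploy, Owner: AWS, Provider: CloudFormation, Version: '1'}
            Configuration:
              ActionMode: CREATE_UPDATE
              StackName: 'infrastructure-pipeline'           # placeholder
              TemplatePath: 'Source::pipeline.yaml'          # placeholder
              Capabilities: CAPABILITY_IAM
              RoleArn: 'arn:aws:iam::123456789012:role/cfn'  # placeholder
            InputArtifacts:
            - Name: Source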

 

Summary

Infrastructure as Code enables you to apply the same CI & CD processes to infrastructure that you already know from software development. On AWS, you can use CloudFormation to turn a text representation of your infrastructure into a running environment stack. CodePipeline can be used to orchestrate the deployment process and you can implement custom logic, such as infrastructure tests, in a programming language that you can run on AWS Lambda. Finally you can treat your infrastructure as code and deploy each commit with confidence into production.

About the Author

Michael Wittig is the author of Amazon Web Services in Action (Manning) and writes frequently about AWS on cloudonaut.io. He helps his clients gain value from Amazon Web Services. As a software engineer he develops cloud-native real-time web and mobile applications. He migrated the complete IT infrastructure of the first bank in Germany to AWS. He has expertise in distributed system development and architecture, with experience in algorithmic trading and real-time analytics.

welcome to aws advent 2016

05. October 2016 welcome 0

We’re pleased to announce that AWS Advent is returning.

What is the AWS Advent event? Many technology platforms have started a yearly tradition for the month of December revealing an article per day written and edited by volunteers in the style of an advent calendar, a special calendar used to count the days in anticipation of Christmas starting on December 1. The AWS Advent event explores everything around the Amazon Web Services platform.

For examples of past AWS articles and topics, please explore the rest of this site.

There are a large number of AWS services, and many have never been covered on AWS Advent in previous years. We’re looking for articles that range in audience level from beginner to expert in AWS. Introductory, security, architecture, and design patterns with any of the AWS services are welcome topics.

Interested in being part of AWS Advent 2016? 

Process for submission acceptance

  • Interesting title
  • Fresh point of view, unique, timely topic
  • Points relevant and interesting to the topic
  • Scope of the topic matches the intended audience
  • Availability to pair with editor and other volunteers to polish up submission

People who have volunteered to evaluate submissions will start reviewing without identifying information about the individuals, so they can focus on the content. AWS Advent editors Brandon Burton and Jennifer Davis will evaluate the program for diversity. We will pair folks up with available volunteers to do technical and copy editing.

Process for volunteer acceptance

  • Availability!

Important Dates

  • Blind submission review begins – October 24, 2016
  • Authors and other volunteers rolling submissions start – October 26, 2016
  • Submissions are accepted until the advent calendar is complete.
  • Rough drafts due – 12:00am November 21, 2016
  • Final drafts due – 12:00am November 30, 2016

Please be aware that we are working on a code of conduct for participants of this event. To start, we are borrowing from the Chef Community Guidelines:

  • Be welcoming, inclusive, friendly, and patient.
  • Be considerate.
  • Be respectful.
  • Be professional.
  • Be careful in the words that you choose.
  • When we disagree, let’s all work together to understand why.

 

Thank you, and we look forward to a great AWS Advent in 2016!

Jennifer Davis, @sigje

Brandon Burton, @solarce