AWS Advent Day 1 – Kappa: Simplifying AWS Lambda Deployments

Our first AWS Advent post comes to us from Mitch Garnaat, the creator of the AWS Python library boto, who is currently herding clouds and devops over at Scopely. He’s gonna walk us through AWS Lambda and some tooling he built to help use it.

AWS Lambda is an interesting new service from Amazon Web Services. It allows you to write Lambda Functions and associate these functions with events such as new files appearing in an S3 bucket or new records being written to an Amazon Kinesis stream. The details of how the functions get executed and how they are scaled to meet demand are handled completely by the AWS Lambda service. So, as the developer, you don’t have to worry about instances or load balancers or auto scaling groups, etc. It all just happens automatically for you.

Sound too good to be true? Well, there are some caveats. The main one is that the AWS Lambda service is in Preview right now so there are some rough edges. The good news is that AWS has made the service available for testing and evaluation and your input can have a big impact on the future of the service. I encourage you to give it a try.

My first impression of AWS Lambda (aside from the obvious wow factor) is that the process of creating and deploying a Lambda Function is more complicated than I imagined. For example, having a small Javascript function called whenever a record is written to an Amazon Kinesis stream requires quite a few steps.

  • Write the Javascript function (AWS Lambda only supports Javascript right now)
  • Create an IAM Role that will be used to allow the Lambda Function to access any AWS resources it needs when executing.
  • Zip up the Javascript function and any dependencies
  • Upload the zip file to the AWS Lambda service
  • Send test data to the Lambda Function
  • Create an IAM Role that will be used by the service invoking your Lambda Function
  • Retrieve the output of the Lambda Function from Amazon CloudWatch Logs
  • Add an event source to the Lambda Function
  • Monitor the output of the live function

Each of these steps actually requires multiple steps involving different services. For example, the roles are created in IAM but you then need to know the ARN of those roles when uploading the function or adding the event source. The bottom line is that using AWS Lambda at the moment requires a lot of knowledge about other Amazon Web Services.
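
To make just one of the steps above concrete, here is a minimal Python sketch of the packaging step: zipping a function and its dependencies before upload. The function name package_function and the directory layout are made up for illustration; this is not how kappa itself does the packaging.

```python
import os
import zipfile


def package_function(source_dir, zip_path):
    """Zip every file under source_dir, storing paths relative to it.

    Hypothetical helper for illustration only.
    """
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Skip the archive itself if it lives inside source_dir.
                if os.path.abspath(full_path) == zip_abs:
                    continue
                # Relative paths keep the function file at the top level
                # of the archive instead of under a local directory prefix.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

Calling package_function("my_function", "my_function.zip") would produce an archive that is ready to upload. Every other step in the list still has to be handled separately.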

Sounds Like We Need Some Tools

Whenever I’m faced with a task that is complicated, fiddly, and repetitive, my first reaction is always to think about what kind of tool I can create to make it easier. That’s where kappa comes in.

Kappa is a command line tool written in Python. The goal of kappa is to make it easier to deploy Lambda Functions. It tries to handle many of the fiddly details and hopefully lets you focus more on what your Lambda Functions are actually doing.

Getting Kappa

You can install kappa from PyPI but for our purposes the best way is to simply clone the GitHub repo. Once you have cloned it (hopefully inside a virtualenv) simply run:

and you should be all set.

A Simple Kinesis Example

To get an idea of how kappa works, let’s try a simple example of a Lambda Function that gets called each time a new record is written to a Kinesis stream. The function we will write doesn’t really do anything except log some debug output but the possibilities are endless. For example, I have a modified version of this basic function that indexes the payload in the Kinesis record to an ElasticSearch server. So, you can basically send any kind of JSON data into a Kinesis stream and it will get indexed in ElasticSearch. And I don’t have to create any EC2 instances or any other compute resources to make it happen.
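
To give a feel for what a function like this has to work with, here is a small Python sketch that decodes the payloads in a Kinesis-style event. It only runs locally; the function deployed in this example is Javascript. The event shape (a Records list whose entries carry base64-encoded data under a kinesis key) is an assumption based on the Kinesis event format in the Lambda docs, and the sample event is invented.

```python
import base64
import json


def decode_kinesis_records(event):
    """Return the JSON payload carried by each record in a Kinesis-style event."""
    payloads = []
    for record in event.get("Records", []):
        # The record data is base64-encoded, so decode it before parsing.
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw.decode("utf-8")))
    return payloads


# Hypothetical event, trimmed to the fields used above.
encoded = base64.b64encode(json.dumps({"message": "hello"}).encode("utf-8"))
sample_event = {
    "Records": [
        {
            "eventSource": "aws:kinesis",
            "kinesis": {
                "partitionKey": "partitionKey-3",
                "data": encoded.decode("ascii"),
            },
        }
    ]
}

for payload in decode_kinesis_records(sample_event):
    print(payload)
```

Once the payload is decoded, the function can log it, index it, or pass it on to another service.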

The actual example is bundled with the kappa GitHub repo so if you have cloned the repo as described above, simply cd into the samples/kinesis directory to find the example files.

Roles and Policies

First, let’s handle the IAM Roles we will need. We need an execution role and an invocation role. The former represents the permissions granted to our function when it is being executed. The latter represents the permissions granted to whichever service is actually responsible for invoking our function.

IAM roles and policies are complex and rather arcane. The best approach is usually to find a working example similar to what you want and modify it. I stole the policies (and some other things) from the example AWS provides in the Lambda docs. I then repackaged these as a CloudFormation template (see roles.cf in the sample directory).

The benefit of using CloudFormation to handle the IAM roles and policies is that it provides a transactional approach to creating and updating the roles and policies and lets you version control the policies easily in git. Kappa takes care of all of the details of dealing with CloudFormation for you. You should be able to use these roles and policies directly although over time you may need to modify them or further restrict them.
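
For readers who have not seen a role expressed in CloudFormation, the sketch below builds a minimal template as a Python dictionary. It is not the contents of roles.cf. The logical ID ExecRole, the policy name WriteLogs, and the broad logs permissions are placeholders chosen for illustration, and the trust policy assumes the role is assumed by the Lambda service.

```python
import json


def build_roles_template(role_logical_id="ExecRole"):
    """Return a minimal CloudFormation template containing one IAM role."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            role_logical_id: {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    # Trust policy: who is allowed to assume this role.
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [
                            {
                                "Effect": "Allow",
                                "Principal": {"Service": ["lambda.amazonaws.com"]},
                                "Action": ["sts:AssumeRole"],
                            }
                        ],
                    },
                    # Permissions policy: what the role may do once assumed.
                    # Deliberately broad here; restrict it for real use.
                    "Policies": [
                        {
                            "PolicyName": "WriteLogs",
                            "PolicyDocument": {
                                "Version": "2012-10-17",
                                "Statement": [
                                    {
                                        "Effect": "Allow",
                                        "Action": ["logs:*"],
                                        "Resource": "arn:aws:logs:*:*:*",
                                    }
                                ],
                            },
                        }
                    ],
                },
            }
        },
        # Exposing the ARN as an output makes it easy to look up later.
        "Outputs": {
            role_logical_id + "Arn": {
                "Value": {"Fn::GetAtt": [role_logical_id, "Arn"]}
            }
        },
    }


print(json.dumps(build_roles_template(), indent=2))
```

Because the template is plain JSON, it can be diffed and reviewed in git like any other source file.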

Config

Kappa is driven from a YAML config file. There is a sample config file in the sample directory. You will have to make a couple of changes.

  • The profile attribute refers to the profile within your AWS config file (e.g. the one used by botocore and the AWS CLI). These are the credentials that kappa will use.
  • The region attribute refers to the AWS region used. You may need to adjust this.
  • The event_source attribute refers to the source of events driving our Lambda Function. In our case, that should be the ARN of the Kinesis stream you have already created for use in this sample.
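
Before deploying, it can help to sanity-check those three attributes. The Python sketch below works on a flat dictionary, which is a simplification: the real config file is YAML and may nest these keys differently. The helper name check_config is hypothetical, the ARN pattern only covers the standard aws partition, and treating a region mismatch as a problem is an assumption on my part.

```python
import re

# arn:aws:kinesis:<region>:<12-digit account id>:stream/<stream name>
KINESIS_ARN = re.compile(
    r"^arn:aws:kinesis:(?P<region>[a-z0-9-]+):(?P<account>\d{12})"
    r":stream/(?P<name>[A-Za-z0-9_.-]+)$"
)


def check_config(config):
    """Return a list of problems found; an empty list means it looks fine."""
    problems = []
    for key in ("profile", "region", "event_source"):
        if not config.get(key):
            problems.append("missing attribute: " + key)

    arn = config.get("event_source")
    if arn:
        match = KINESIS_ARN.match(arn)
        if match is None:
            problems.append("event_source does not look like a Kinesis stream ARN")
        elif config.get("region") and match.group("region") != config["region"]:
            # Assumption: the stream should live in the configured region.
            problems.append(
                "event_source is in %s but region is %s"
                % (match.group("region"), config["region"])
            )
    return problems


# Placeholder values, not taken from the sample config.
example = {
    "profile": "personal",
    "region": "us-east-1",
    "event_source": "arn:aws:kinesis:us-east-1:123456789012:stream/lambdastream",
}
print(check_config(example))
```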

Make It So

Now we are ready to go. To deploy your sample app:

This will create the stack in CloudFormation containing our IAM policies and roles. It will wait for the stack creation to complete and then it will retrieve the ARN for the execution role from the stack resources. Finally, it will also zip up the Javascript function and upload that to AWS Lambda.

At this point our Lambda Function is available in Lambda but it’s not hooked up to any event sources yet. Before we do that, we can test it out with some sample data. The input.json file in the sample directory contains data similar to what our function will receive and this file is referenced in our config file so we can easily send this test data to our function like this:

This calls the InvokeAsync request of the AWS Lambda service. Our Lambda Function will get called with the test data and we should be able to see the output in the Amazon CloudWatch Logs service. To see the output:

Kappa takes care of finding the right log group name and stream in the CloudWatch Logs service that contains our output. It then prints the most recent log events from that stream. Note that it can take a few minutes for the log output to become available so you might have to make this call a few times before the data shows up.
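
The lookup described above can be sketched roughly as follows. This is not kappa’s code. It assumes a CloudWatch Logs client with the describe_log_streams and get_log_events calls that boto3 exposes, and it assumes the function’s log group is named /aws/lambda/ followed by the function name. The retry count and delay are arbitrary.

```python
import time


def latest_log_messages(logs_client, function_name, limit=10, attempts=5, delay=30):
    """Return recent messages from the newest log stream, retrying while empty.

    A missing log group raises an error from the client; that case is not
    handled in this sketch.
    """
    group = "/aws/lambda/" + function_name  # assumed naming convention
    for attempt in range(attempts):
        # Ask for the single stream with the most recent event.
        streams = logs_client.describe_log_streams(
            logGroupName=group,
            orderBy="LastEventTime",
            descending=True,
            limit=1,
        ).get("logStreams", [])
        if streams:
            events = logs_client.get_log_events(
                logGroupName=group,
                logStreamName=streams[0]["logStreamName"],
                limit=limit,
                startFromHead=False,  # read from the newest events
            ).get("events", [])
            if events:
                return [event["message"] for event in events]
        # Output can lag behind the invocation, so wait and try again.
        if attempt < attempts - 1:
            time.sleep(delay)
    return []
```

With boto3 installed, the client would be created with boto3.client("logs") and passed in as logs_client. Keeping the client as a parameter also makes the function easy to exercise with a stub.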

Assuming that our test looks good, we can now configure our Lambda Function to start getting live events from our Kinesis stream. To do this:

Kappa finds the invocation role we created with CloudFormation earlier, finds the ARN of the Kinesis stream in our config file, and then calls the AddEventSource request of the AWS Lambda service to hook our Lambda Function up to the Kinesis stream.

At this point, you can send some real data to your Kinesis stream and use the kappa tail command to see the output of your function based on those new events.

If you need to make changes to your roles, policies, or to the function itself, just call kappa deploy again and kappa will take care of updating the CloudFormation stack and uploading the new version of your Javascript function.

Next Steps

The kappa tool is very new. I hope it’s useful but I’m sure it will become even more useful with feedback from folks who are actually using it. Don’t be shy! Give it a try and create some issues.

Finally, here are some useful links related to AWS Lambda.