Multi-region Serverless APIs: I’ve got a fever and the only cure is fewer servers

08. December 2018

Meet SAM

Let’s talk about the hottest thing in computers for the past few years. No, not Machine Learning. No, not Kubernetes. No, not big data. Fine, one of the hottest things in computers. Right, serverless!

It’s still an emerging and quickly changing field, but I’d like to take some time to demonstrate how easy it is to make scalable and reliable multi-region APIs using just a few serverless tools and services.

It’s actually deceptively simple. Well, for a “blog post”-level application, anyway.

We’re going to be managing this application using the wonderful AWS Serverless Application Model (SAM) and SAM CLI, far and away the easiest tooling I have ever used for creating and deploying serverless applications. And, in keeping with contemporary practices, it even has a cute little animal mascot.

SAM is a feature of CloudFormation that provides a handful of short-hand resources that get expanded out to their equivalent long-hand CloudFormation resources upon ChangeSet calculation. You can also drop down into regular CloudFormation whenever you need to in order to manage resources and configurations not covered by SAM.

The SAM CLI is a local CLI application for developing, testing, and deploying your SAM applications. It uses Docker under the hood to provide as close to a Lambda execution environment as possible and even allows you to run your APIs locally in an APIGateway-like environment. It’s pretty great, IMO.

So if you’re following along, go ahead and install Docker and the SAM CLI and we can get started.

The SAMple App

Once that’s installed, let’s generate a sample application so we can see what it’s all about. If you’re following along on the terminal, you can run sam init -n hello-sam -r nodejs8.10 to generate a sample node app called hello-sam. You can also see the output in the hello-sam-1 folder in the linked repo if you aren’t at a terminal and just want to read along.

The first thing to notice is the README.md that is full of a huge amount of information about the repo. For the sake of brevity, I’m going to leave learning the basics of SAM and the repo structure you’re looking at as a bit of an exercise for the reader. The README and linked documentation can tell you anything you need to know.

The important thing to know is that hello_world/ contains the code and template.yaml contains a special SAM-flavored CloudFormation template that controls the application. Take some time to familiarize yourself with it if you want to.

SAM Local

So what can SAM do, other than give us very short CFN templates? Well the SAM CLI can do a lot to help you in your local development process. Let’s try it out.

Step 0 is to install your npm dependencies so your function can execute:
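If you’re following along, that’s just the usual npm dance (assuming the default layout, where package.json lives in the function directory):

```bash
cd hello-sam/hello_world
npm install
cd ..
```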

Alright, now let’s have some fun.

There are a few local invocation commands that I won’t cover here because we’re making an API. The real magic with the CLI is that you can run your API locally with sam local start-api. This will inspect your template, identify your API schema, start a local API Gateway, and mount your functions at the correct paths. It’s by no means a perfect replica of running in production, but it actually does a surprisingly great job.

When we start the API, it will mount our function at /hello, following the path specified in the Events attribute of the resource.

Now you can go ahead and curl against the advertised port and path to execute your function.
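By default, sam local start-api listens on port 3000, so assuming the default port that’s:

```bash
curl http://127.0.0.1:3000/hello
```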

Then on the backend, you’ll see it executing your function in Docker:

You can change your code at-will and the next invocation will pick it up. You can also attach a debugger to the process if you aren’t a “debug statement developer.”

Want to try to deploy it? I guess we might as well – it’s easy enough. The only prerequisite is that we need an S3 bucket for our uploaded code artifact. So go ahead and make that – call it whatever you like.
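Assuming your AWS CLI is configured, that’s one command; the bucket name below is a placeholder, and yours must be globally unique:

```bash
aws s3 mb s3://your-sam-artifacts-bucket
```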

Now we’ll run sam package. This will bundle up the code for all of your functions and upload it to S3. It’ll spit out a rendered “deployment template” that has the local CodeUris swapped out for S3 URLs.
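It looks roughly like this, using the bucket from the previous step:

```bash
sam package \
    --template-file template.yaml \
    --s3-bucket your-sam-artifacts-bucket \
    --output-template-file deploy-template.yaml
```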

If you check out deploy-template.yaml, you should see…very few remarkable differences. Maybe some of the properties have been re-ordered or blank lines removed. But the only real difference you should see is that the relative CodeUri: hello_world/ for your function has been resolved to an S3 URL for deployment.

Now let’s go ahead and deploy it!
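sam deploy wraps CloudFormation’s deploy command; since SAM creates an execution role for the function, we need to acknowledge the IAM capability:

```bash
sam deploy \
    --template-file deploy-template.yaml \
    --stack-name hello-sam \
    --capabilities CAPABILITY_IAM
```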

Lastly, let’s find the URL for our API so we can try it out:
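One way (assuming the stack name above) is to ask CloudFormation for the physical ID of the implicit ServerlessRestApi resource, which is the API Gateway API ID:

```bash
aws cloudformation describe-stack-resource \
    --stack-name hello-sam \
    --logical-resource-id ServerlessRestApi \
    --query 'StackResourceDetail.PhysicalResourceId' \
    --output text
```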

Cool, let’s try it:
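The endpoint follows the standard execute-api URL pattern; substitute your own API ID and region:

```bash
curl https://<api-id>.execute-api.us-east-1.amazonaws.com/Prod/hello
```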

Nice work! Now that you know how SAM works, let’s make it do some real work for us.

State and Data

We’re planning on taking this multi-region by the end of this post. Deploying a multi-region application with no state or data is both easy and boring. Let’s do something interesting and add some data to our application. For the purposes of this post, let’s do something simple like storing a per-IP hit counter in DynamoDB.

We’ll go through the steps below, but if you want to jump right to done, check out the hello-sam-2 folder in this repository.

SAM offers a SimpleTable resource that creates a very simple DynamoDB table. This is technically fine for our use-case now, but we’ll need to be able to enable Table Streams in the future to go multi-region. So we’ll need to use the regular DynamoDB::Table resource:
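Here’s a sketch of what that resource might look like. The ip hash key and the fixed table name are my choices for this example; the stream configuration is the important part, since Global Tables require streams with new and old images (and identically-named tables in every region):

```yaml
  HitsTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: HitsTable
      AttributeDefinitions:
        - AttributeName: ip
          AttributeType: S
      KeySchema:
        - AttributeName: ip
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 1
        WriteCapacityUnits: 1
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
```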

We can use environment variables to let our functions know what our table is named instead of hard-coding it in code. Let’s add an environment variable up in the Globals section. This ensures that any functions we may add in the future automatically have access to this as well.

Change your Globals section to look like:
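Something like this, with the Timeout carried over from the generated template and the table name passed by reference:

```yaml
Globals:
  Function:
    Timeout: 3
    Environment:
      Variables:
        TABLE_NAME: !Ref HitsTable
```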

Lastly, we’ll need to give our existing function access to update and read items from the table. We’ll do that by setting the Policies attribute of the resource, which turns into the execution role. We’ll give the function UpdateItem and GetItem. When you’re done, the resource should look like:
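Roughly like this, with the handler and runtime as generated by sam init; an inline policy document is one of the forms SAM’s Policies attribute accepts:

```yaml
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: hello_world/
      Handler: app.lambdaHandler
      Runtime: nodejs8.10
      Policies:
        - Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - dynamodb:UpdateItem
                - dynamodb:GetItem
              Resource: !GetAtt HitsTable.Arn
      Events:
        HelloWorld:
          Type: Api
          Properties:
            Path: /hello
            Method: get
```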

Now let’s have our function start using the table. Crack open hello_world/app.js and replace the content with:
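Here’s a minimal sketch of that handler, using a single atomic UpdateItem to both increment the counter and read back the new value:

```javascript
const AWS = require('aws-sdk');

let tableName = process.env.TABLE_NAME;

const dynamo = new AWS.DynamoDB.DocumentClient();

exports.lambdaHandler = async (event) => {
    // API Gateway proxy events carry the caller's IP in the request context
    const ip = event.requestContext.identity.sourceIp;

    // Atomically increment this IP's counter and read back the new value
    const result = await dynamo.update({
        TableName: tableName,
        Key: { ip: ip },
        UpdateExpression: 'ADD hits :one',
        ExpressionAttributeValues: { ':one': 1 },
        ReturnValues: 'UPDATED_NEW'
    }).promise();

    return {
        statusCode: 200,
        body: JSON.stringify({ ip: ip, hits: result.Attributes.hits })
    };
};
```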

On each request, our function will read the requester’s IP from the event, increment a counter for that IP in the DynamoDB table, and return the total number of hits for that IP to the user.

This’ll probably work in production, but we want to be diligent and test it because we’re responsible, right? Normally, I’d recommend you spin up a DynamoDB-local Docker container, but to keep things simple for the purposes of this post, let’s create a “local dev” table in our AWS account called HitsTableLocal.
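Creating it needs to match whatever key schema you gave the real table; for the ip hash key used above:

```bash
aws dynamodb create-table \
    --table-name HitsTableLocal \
    --attribute-definitions AttributeName=ip,AttributeType=S \
    --key-schema AttributeName=ip,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1
```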

And now let’s update our function to use that table when we’re executing locally. We can use the AWS_SAM_LOCAL environment variable to determine if we’re running locally or not. Toss this at the top of your app.js to select that table when running locally:
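With the let binding from the handler above, the override is just:

```javascript
// SAM CLI sets AWS_SAM_LOCAL=true when running functions locally
if (process.env.AWS_SAM_LOCAL) {
    tableName = 'HitsTableLocal';
}
```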

Now let’s give it a shot! Fire up the app with sam local start-api and let’s do some curls.

Nice! Now let’s deploy it and try it out for real.

Not bad. Not bad at all. Now let’s take this show on the road!

Going Global

Now we’ve got an application, and it even has data that our users expect to be present. Let’s go multi-region! There are a couple of different features that will underpin our ability to do this.

First is the API Gateway Regional Custom Domain. We need to use the same custom domain name in multiple regions; the edge-optimized custom domain won’t cut it for us, since it uses CloudFront. The regional endpoint will work for us, though.

Next, we’ll hook those regional endpoints up to Route53 Latency Records in order to do closest-region routing and automatic failover.

Lastly, we need a way to synchronize our DynamoDB tables between our regions so we can keep those counters up-to-date. That’s where DynamoDB Global Tables come in to do their magic. This will keep identically-named tables in multiple regions in-sync with low latency and high accuracy. It uses DynamoDB Streams under the hood, and ‘last writer wins’ conflict resolution, which probably isn’t perfect but is good enough for most uses.

We’ve got a lot to get through here. I’m going to try to keep this as short and as clear as possible. If you want to jump right to the code, you can find it in the hello-sam-3 directory of the repo.

First things first, let’s add in our regional custom domain and map it to our API. Since we’re going to be using a custom domain name, we’ll need a Route53 Hosted Zone for a domain we control. I’m going to pass the domain name and Hosted Zone Id through via CloudFormation parameters and use them below. When you deploy, you’ll need to supply your own values for these parameters.

Toss this at the top of template.yaml to define the parameters:
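Something like:

```yaml
Parameters:
  DomainName:
    Type: String
    Description: Custom domain name for the API, e.g. api.example.com
  HostedZoneId:
    Type: AWS::Route53::HostedZone::Id
    Description: Hosted Zone that contains the domain
```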

Now we can create our custom domain, provision a TLS certificate for it, and configure the base path mapping to add our API to the custom domain – put this in the Resources section:
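A sketch of those three resources; the logical names are mine, and Stage: Prod matches the implicit stage SAM creates:

```yaml
  ApiCertificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Ref DomainName
      ValidationMethod: DNS

  ApiDomainName:
    Type: AWS::ApiGateway::DomainName
    Properties:
      DomainName: !Ref DomainName
      RegionalCertificateArn: !Ref ApiCertificate
      EndpointConfiguration:
        Types:
          - REGIONAL

  ApiBasePathMapping:
    Type: AWS::ApiGateway::BasePathMapping
    Properties:
      DomainName: !Ref ApiDomainName
      RestApiId: !Ref ServerlessRestApi
      Stage: Prod
```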

That !Ref ServerlessRestApi references the implicit API Gateway REST API that SAM creates from the Api events on our AWS::Serverless::Function.

Next, we want to assign each regional custom domain to a specific Route53 record. This will allow us to perform latency-based routing and regional failover through the use of custom healthchecks. Let’s put in a few more resources:
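Roughly like this: the healthcheck points at /Prod/health on the regional execute-api hostname, and the alias record points at the regional custom domain:

```yaml
  RegionalHealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        Type: HTTPS
        FullyQualifiedDomainName: !Sub ${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com
        ResourcePath: /Prod/health
        RequestInterval: 30
        FailureThreshold: 3

  RegionalRecord:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneId: !Ref HostedZoneId
      Name: !Ref DomainName
      Type: A
      Region: !Ref AWS::Region
      SetIdentifier: !Ref AWS::Region
      HealthCheckId: !Ref RegionalHealthCheck
      AliasTarget:
        DNSName: !GetAtt ApiDomainName.RegionalDomainName
        HostedZoneId: !GetAtt ApiDomainName.RegionalHostedZoneId
```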

The AWS::Route53::Record resource creates a DNS record and assigns it to a specific AWS region. When your users query for your record, they will get the value for the region closest to them. This record also has an AWS::Route53::HealthCheck attached to it. This healthcheck will check your regional endpoint every 30 seconds. If your regional endpoint has gone down, Route53 will stop considering that record when a user queries for your domain name.

Our Route53 Healthcheck is looking at /health on our API, so we’d better implement that if we want our service to stay up. Let’s just drop a stub healthcheck into app.js. For a real application you could perform dependency checks and stuff, but for this we’ll just return a 200:
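A stub like this will do (you’ll also need to add a matching GET /health event to the function in template.yaml so API Gateway routes it):

```javascript
exports.healthHandler = async () => {
    // A real healthcheck might verify DynamoDB access here
    return {
        statusCode: 200,
        body: JSON.stringify({ status: 'ok' })
    };
};
```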

The last piece we unfortunately can’t control directly with CloudFormation; we’ll need to use regular AWS CLI commands. Since Global Tables span regions, that kind of makes sense. But before we can hook up the Global Table, each regional table needs to exist already.

Through the magic of Bash scripts, we can deploy to all of our regions and create the Global Table all in one go!
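Here’s a minimal, non-idempotent sketch of that script, assuming two regions; the bucket names, domain, and zone ID are placeholders, and each artifact bucket must already exist in its region:

```bash
#!/bin/bash
set -e

REGIONS="us-east-1 us-west-2"

# Package and deploy the stack into every region
for region in $REGIONS; do
    sam package --region "$region" \
        --template-file template.yaml \
        --s3-bucket "your-sam-artifacts-$region" \
        --output-template-file "deploy-$region.yaml"
    sam deploy --region "$region" \
        --template-file "deploy-$region.yaml" \
        --stack-name hello-sam \
        --capabilities CAPABILITY_IAM \
        --parameter-overrides DomainName=api.example.com HostedZoneId=YOURZONEID
done

# Once a table exists in every region, stitch them into one Global Table
aws dynamodb create-global-table \
    --global-table-name HitsTable \
    --replication-group RegionName=us-east-1 RegionName=us-west-2 \
    --region us-east-1
```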

For a more idempotent (but more verbose) version of this script, check out hello-sam-3/deploy.sh.

Note: if you’ve never provisioned an ACM Certificate for your domain before, you may need to check your CloudFormation output for the validation CNAMEs.

And…that’s all there is to it. You have your multi-region app!

Let’s try it out

So let’s test it! How about we do this:

  1. Get a few hits in on our home region
  2. Fail the healthcheck in our home region
  3. Send a few hits to the next region Route53 chooses for us
  4. Fail back to our home region
  5. Make sure the counter continues at the number we expect

Cool, we’ve got some data. Let’s failover! The easiest way to do this is just to tell Route53 that up actually means down. Find the healthcheck id for your region using aws route53 list-health-checks and run:
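Route53 healthchecks have an Inverted flag for exactly this kind of game day:

```bash
aws route53 update-health-check \
    --health-check-id <your-healthcheck-id> \
    --inverted
```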

Now let’s wait a minute for it to fail over and give it another shot.

Look at that, another region! And it started counting at 3. That’s awesome, our data was replicated. Okay, let’s fail back, you know the drill:
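Same command, opposite flag:

```bash
aws route53 update-health-check \
    --health-check-id <your-healthcheck-id> \
    --no-inverted
```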

Give it a minute for the healthcheck to become healthy again and fail back. And now let’s hit the service a few more times:

Amazing. Your users will now automatically get routed to not just the nearest region, but the nearest healthy region. And all of the data is automatically replicated between all active regions with very low latency. This grants you a huge amount of redundancy, availability, and resilience to service, network, regional, or application failures.

Now not only can your app scale effortlessly through the use of serverless technologies, it can also fail over automatically so you don’t have to wake up in the middle of the night and find there’s nothing you can do because there’s a network issue that is out of your control – change your region and route around it.

Further Reading

I don’t want to take up too much more of your time, but here’s some further reading if you wish to dive deeper into serverless:

  • A great Medium post by Paul Johnston on Serverless Best Practices
  • SAM has configurations for safe and reliable deployment and rollback using CodeDeploy!
  • AWS built-in tools for serverless monitoring are lackluster at best; you may wish to look into external services like Dashbird or Thundra once you hit production.
  • ServerlessByDesign is a really great web app that allows you to drag, drop, and connect various serverless components to visually design and architect your application. When you’re done, you can export it to a working SAM or Serverless repository!

About the Author

Norm recently joined Chewy.com as a Cloud Engineer to help them start on their Cloud transformation. Previously, he ran the Cloud Engineering team at Cimpress. Find him on twitter @nromdotcom.

About the Editor

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.


Scaling up An Existing Application with Lambda Functions

05. December 2018

What is Serverless?

Serverless infrastructure is the latest buzzword in tech, but from the name itself, it’s not clear what it means. At its core, serverless is the next logical step in infrastructure service providers abstracting infrastructure away from developers so that when you deploy your code, “it just works.”

Serverless doesn’t mean that your code is not running on a server. Rather, it means that you don’t have to worry about what server it’s running on, whether that server has adequate resources to run your code, or if you have to add or remove servers to properly scale your implementation. In addition to abstracting away the specifics of infrastructure, serverless lets you pay only for the time your code is explicitly running (in 100ms increments).

AWS-specific Serverless: Lambda

Lambda, Amazon Web Services’ serverless offering, provides all these benefits, along with the ability to trigger a Lambda function with a wide variety of AWS services. Natively, Lambda functions can be written in:

  • Java
  • Go
  • PowerShell
  • Node.js
  • C#
  • Python

However, as of November 29th, 2018, AWS announced that it’s now possible to use the Runtime API to add any language to that list. Adapters for Erlang, Elixir, COBOL, N|Solid, and PHP are currently in development as of this writing.

What does it cost?

The Free Tier (which does not expire after the 12-month window of some other free tiers) allows up to 1 million requests per month, with the price increasing to $0.20 per million requests thereafter. The free tier also includes 400,000 GB-seconds of compute time.

The rate at which this compute time will be used up depends on how much memory you allocate to your Lambda function on its creation. Any way you look at it, many workflows can operate in the free tier for the entire life of the application, which makes this a very attractive option to add capacity and remove bottlenecks in existing software, as well as quickly spin up new offerings.

What does a Lambda workflow look like?

Using the various triggers that AWS provides, Lambda offers the promise of never having to provision a server again. This promise, while technically achievable, is only fulfilled for certain workflows, and often requires stringing together several services in complex ways.

[Diagram: a complex workflow with no provisioned servers]

However, if you have an existing application and don’t want to rebuild your entire infrastructure on AWS services, can you still integrate serverless into your infrastructure?

Yes.

At its simplest, AWS Lambda functions can be the compute behind a single API endpoint. This means that rather than dealing with the complex dependency graph we see above, existing code can be abstracted into Lambda functions behind an API endpoint, which can then be called from your existing code. This approach allows you to take codepaths that are currently bottlenecks and put them on infrastructure that can run in parallel as well as scale infinitely.

Creating an entire serverless infrastructure can seem daunting. Instead of exploring how you can create an entire serverless infrastructure all at once, we will look at how to eliminate bottlenecks in an existing application using the power that serverless provides.

Case Study: Resolving a Bottleneck With Serverless

In our hypothetical application, users can set up alerts to receive emails when a certain combination of API responses all return true. These various APIs are queried hourly to determine whether alerts need to be sent out. This application is built as a traditional monolithic application, with all the logic and execution happening on a single server.

Important to note is that the application itself doesn’t care about the results of the alerts. It simply needs to dispatch them, and if the various API responses match the specified conditions, the user needs to get an email alert.

What is the Bottleneck?

In this case, as we add more alerts, processing the entire collection of alerts starts to take more and more time. Eventually, we will reach a point where the checking of all the alerts will overrun into the next hour, thus ensuring the application is perpetually behind in checking a user’s alerts.

This is a prime example of functionality that can be moved to a serverless function. Because the application doesn’t care about the result, we can dispatch all of our alert calls asynchronously and take advantage of the auto-scaling and parallelization of AWS Lambda to ensure all of our events are processed in a fraction of the time.

Refactoring the checking of alerts into a Lambda function takes this bottleneck out of our codebase and turns it into an API call. However, there are now a few caveats that we have to resolve if we want to run this as efficiently as possible.

Caveat: Calling an API-invoked Lambda Function Asynchronously

If you’re building this application in a language or library that’s synchronous by default, turning these checks into API calls will just result in API calls each waiting for the previous to finish. While it’s possible you’ll receive a speed boost because your Lambda function is set up to be more powerful than your existing server, you’ll eventually run into a similar problem as we had before.

As we’ve already discussed, all our application needs to care about is that the call was received by API Gateway and Lambda, not whether it finished or what the result of the alert was. This means that if we can get API Gateway to return as soon as it receives the request from our application, we can run through all these requests much more quickly.

In their documentation for integrating API Gateway and Lambda, AWS explains how to do just this:

To support asynchronous invocation of the Lambda function, you must explicitly add the X-Amz-Invocation-Type:Event header to the integration request.

This will make the loop in our application that dispatches the requests run much faster and allow our alert checks to be parallelized as much as possible.
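To make that concrete, here’s a sketch in Node of dispatching the checks in parallel. The endpoint and path are hypothetical, and the API Gateway integration is assumed to carry the X-Amz-Invocation-Type header so each request returns as soon as Lambda accepts the event:

```javascript
const https = require('https');

// Fire one alert check; with the Event invocation type configured on the
// integration, API Gateway responds as soon as Lambda accepts the event.
function dispatchAlertCheck(alertId) {
    return new Promise((resolve, reject) => {
        const req = https.request({
            hostname: 'your-api-id.execute-api.us-east-1.amazonaws.com', // hypothetical
            path: `/prod/alerts/${alertId}/check`,                       // hypothetical
            method: 'POST'
        }, (res) => {
            res.resume();          // drain the (empty) response body
            resolve(res.statusCode);
        });
        req.on('error', reject);
        req.end();
    });
}

// Dispatch every check concurrently instead of one at a time
async function dispatchAll(alertIds) {
    await Promise.all(alertIds.map(dispatchAlertCheck));
}
```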

Caveat: Monitoring

Now that your application is no longer responsible for ensuring these API calls complete successfully, failing APIs will no longer trigger any monitoring you have in place.

Out of the box, Lambda supports CloudWatch monitoring, where you can check if there were any errors in the function execution. You can set up CloudWatch to monitor API Gateway as well. If the existing metrics don’t fit your needs, you can always set up custom metrics in CloudWatch to ensure you’re monitoring everything that makes sense for your application.
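Publishing a custom metric is a one-liner from the CLI (the namespace and metric name here are illustrative), and the SDKs expose the same PutMetricData call:

```bash
aws cloudwatch put-metric-data \
    --namespace "AlertChecks" \
    --metric-name FailedChecks \
    --value 1
```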

By integrating CloudWatch into your existing monitoring solution, you can ensure your serverless functions are firing properly and always available.

Tools for Getting Started

One of the most significant barriers to entry for serverless has traditionally been the lack of tools for local development and the fact that the infrastructure environment is a bit of a black box.

Luckily, AWS has built the SAM (Serverless Application Model) CLI, which uses Docker locally to give you a simulated serverless environment. Once you get the project installed and API Gateway running locally, you can hit the API endpoint from your application and see how your serverless function performs.

This allows you to test your new function’s integration with your application and iron out any bugs before you go through the process of setting up API Gateway on AWS and deploying your function.

Serverless: Up and Running

Once you go through the process of creating a serverless function and getting it up and running, you’ll see just how quickly you can remove bottlenecks from your existing application by leaning on AWS infrastructure.

Serverless isn’t something you have to adopt across your entire stack or adopt all at once. By shifting to a serverless application model piece by piece, you can avoid overcomplicating your workflow while still taking advantage of everything this exciting new technology has to offer.

About the Author

Keanan Koppenhaver @kkoppenhaver, is CTO at Alpha Particle, a digital consultancy that helps plan and execute digital projects that serve anywhere from a few users a month to a few million. He enjoys helping clients build out their developer teams, modernize legacy tech stacks, and better position themselves as technology continues to move forward. He believes that more technology isn’t always the answer, but when it is, it’s important to get it right.

About the Editor

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.


Alexa is checking your list

20. December 2016

Author: Matthew Williams
Editors: Benjamin Marsteau, Scott Francis

Recently I made a kitchen upgrade: I bought an Amazon Dot. Alexa, the voice assistant inside the intelligent puck, now plays a key role in the preparation of meals every day. With both hands full, I can say “Alexa, start a 40-minute timer” and not have to worry about burning the casserole. However, there is a bigger problem coming up that I feel it might also help me with. It is the gift-giving season, and I have been known to get the wrong things. Wouldn’t it be great if I could have Alexa remind me what I need to get for each person on my list? Well, that simple idea took me down a path that has consumed me for a little too long. And since I built it, I figured I would share it with you.

Architecting a Solution

Now it is important to remember that I am a technologist and therefore I am going to go way beyond what’s necessary. [ “anything worth doing is worth overdoing.” — anon. ] Rather than just building the Alexa side of things, I decided to create the entire ecosystem. My wife and I are the first in our families to add Alexa to our household, so that means I need a website for my friends and family to add what they want. And of course, that website needs to talk to a backend server with a REST API to collect the lists into a database. And then Alexa needs to use that same API to read off my lists.

OK, so spin up an EC2 instance and build away, right? I did say I am a technologist, right? That means I have to use the shiniest tools to get the job done. Otherwise, it would just be too easy.

My plan is to use a combination of AWS Lambda to serve the logic of the application, the API Gateway to host the REST endpoints, DynamoDB for saving the data, and another Lambda to respond to Alexa’s queries.

The Plan of Attack

Based on my needs, I think I came up with the ideal plan of attack. I would tackle the problems in the following order:

  1. Build the Backend – The backend includes the logic, API, and database.
    1. Build a Database to Store the Items
    2. Lambda Function to Add an Item
    3. Lambda Function to Delete an Item
    4. Lambda Function to List All Items
    5. Configure the API Gateway
  2. Build the User Interface – The frontend can be simple: show a list, and let folks add and remove from that list.
  3. Get Alexa Talking to the Service – That is why we are here, right?

There are some technologies used that you should understand before beginning. You do not have to know everything about Lambda or the API Gateway or DynamoDB, but let’s go over a few of the essentials.

Lambda Essentials

The purpose of Lambda is to run the functions you write. Configuration is pretty minimal, and you only get charged for the time your functions run (you get a lot of free time). You can do everything from the web console, but after setting up a few functions, you will want another way. See this page for more about AWS Lambda.

API Gateway Essentials

The API Gateway is a service to make it easier to maintain and secure your APIs. Even if I get super popular, I probably won’t get charged much here as it is $3.50 per million API calls. See this page for more about the Amazon API Gateway.

DynamoDB Essentials

DynamoDB is a simple (and super fast) NoSQL database. My application has simple needs, and I am going to need a lot more friends before I reach the 25 GB and 200 million requests per month that are on the free plan. See this page for more about Amazon DynamoDB.

Serverless Framework

Sure, I can go to each service’s console page and configure them, but I find it a lot easier to have it automated and in source control. There are many choices in this category, including the Serverless framework, Apex, and Node Lambda. They all share similar features, so you should review them to see which fits your needs best. I used the Serverless framework for my implementation.

Alexa Skills

When you get your Amazon Echo or Dot home, you interact with Alexa, the voice assistant. The things that she does are Alexa Skills. To build a skill you need to define a list of phrases to recognize, what actions they correspond to, and write the code that performs those actions.

Let’s Start Building

There are three main components that need to be built here: API, Web, and Skill. I chose a different workflow for each of them. The API uses the Serverless framework to define the CloudFormation template, Lambda Functions, IAM Roles, and API Gateway configuration. The Webpage uses a Gulp workflow to compile and preview the site. And the Alexa skill uses a Yeoman generator. Each workflow has its benefits and it was exciting to use each.

If you would like to follow along, you can clone the GitHub repo: https://github.com/DataDog/AWS-Advent-Alexa-Skill-on-Lambda.

Building the Server

The process I went through was:

  1. Install Serverless Framework (npm i -g serverless)
  2. Create the first function (sls create -n <service name> -t aws-nodejs). The top-level concept in Serverless is that of a service. You create a service, then all the Lambda functions, CloudFormation templates, and IAM roles defined in the serverless.yaml file support that service.
  3. Add the resources needed to a CloudFormation template in the serverless.yaml file. For example (a rough sketch of the full file follows this list):
    [code screenshot]
    Refer to the CloudFormation docs and the Serverless Resources docs for more about this section.
  4. Add the IAM Role statements to allow your Lambda access to everything needed. For example:
    [code screenshot]
  5. Add the Lambda functions you want to use in this service. For example:
    [code screenshot]
    The events section lists the triggers that can kick off this function. http means to use the API Gateway. I spent a little time in the API Gateway console and got confused, but these four lines in the serverless.yaml file were all I needed.
  6. Install the serverless-webpack npm package and add it to the YAML file:
    [code screenshot]
    This configuration tells Serverless to use webpack to bundle all your npm modules together in the right way. And if you want to use ECMAScript 2015, this will run Babel to convert back down to a JavaScript version that Lambda can use. You will have to set up your webpack.config.js and .babelrc files to get everything working.
  7. Write the functions. For the function I mentioned earlier, I added the following to my items.js file:
    [code screenshot]
    This function sets the table name in my DynamoDB and then grabs all the rows. No matter what the result is, a response is formatted using this createResponse function:
    [code screenshot]
    Notice the header. Without it, Cross-Origin Resource Sharing will not work, and you will get nothing but 502 errors when you try to consume the API.
  8. Deploy the Service:

    Now I use 99designs’ aws-vault to store my AWS access keys rather than adding them to an rc file that could accidentally find its way up to GitHub. The command I use is sketched just after this list.

    If everything works, it creates the DynamoDB table, configures the API Gateway APIs, and sets up the Lambdas. All I have to do is try them out from a new application or using a tool like Paw or Postman. Then rinse and repeat until everything works.
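The code screenshots don’t reproduce here, so here is a rough, illustrative sketch of how the serverless.yaml pieces from steps 3 through 5 fit together. The service name, table, handler, and runtime are my stand-ins, not the exact code from the repo:

```yaml
service: gift-list

provider:
  name: aws
  runtime: nodejs4.3
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:Scan
        - dynamodb:PutItem
        - dynamodb:DeleteItem
      Resource: arn:aws:dynamodb:*:*:table/GiftList

functions:
  listItems:
    handler: items.list
    events:
      - http:
          path: items
          method: get
          cors: true

resources:
  Resources:
    GiftListTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: GiftList
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 1
          WriteCapacityUnits: 1
```

And the aws-vault-wrapped deploy from step 8 would look something like this, assuming a profile named personal:

```bash
aws-vault exec personal -- sls deploy
```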

Building the Frontend

[Screenshot: the gift list web frontend]

Remember, I am a technologist, not an artist. It works, but I will not be winning any design awards. It is a webpage with a simple table on it that loads up some JavaScript to show my DynamoDB table:

[Code screenshot: frontend JavaScript that loads the list from the API]

Have I raised the technologist card enough times yet? Well, because of that I need to keep to the new stuff, even with the JavaScript features I am using. That means I am writing the code in ECMAScript 2015, so I need to use Babel to convert it to something usable in most browsers. I used Gulp for this stage to keep rebuilding the files and reloading my browser with each change.

Building the Alexa Skill

Now that we have everything else working, it is time to build the Alexa Skill. Again, Amazon has a console for this, which I used for the initial configuration of the Lambda that backs the skill. But then I switched over to using Matt Kruse’s Alexa App framework. What I found especially cool about his framework is that it works with his alexa-app-server, so I can test the skill locally without having to deploy to Amazon.

For this one I went back to pre-ECMAScript 2015 syntax, but I hope that doesn’t mean I lose technologist status in your eyes.

Here is a quick look at a simple Alexa response to read out the gift list:

[Code screenshot: Alexa intent handler that reads the gift list]
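As a rough sketch of what that handler looks like in the alexa-app framework (the intent name, slot, and wording are mine; a real version would fetch the items from the API before answering):

```javascript
var alexa = require('alexa-app');
var app = new alexa.app('giftlist');

app.intent('ReadList', {
    slots: { PERSON: 'AMAZON.US_FIRST_NAME' },
    utterances: ["read {PERSON}'s list"]
}, function(request, response) {
    var person = request.slot('PERSON');
    // A real skill would call the gift list API here and read back the items
    response.say('Here is what is on ' + person + "'s list.");
});
```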

Summary

And now we have an end-to-end solution for working with your gift lists. We built the beginnings of an API to work with gift lists. Then we added a web frontend to allow users to add to the list. And then we added an Alexa skill to read the list while both hands are on a hot pan. Is this overkill? Maybe. Could I have stuck with a pen and a scrap of paper? Well, I guess one could do that. But what kind of technologist would I be then?

About the Author

Matt Williams is the DevOps Evangelist at Datadog. He is passionate about the power of monitoring and metrics to make large-scale systems stable and manageable. So he tours the country speaking and writing about monitoring with Datadog. When he’s not on the road, he’s coding. You can find Matt on Twitter at @Technovangelist.

About the Editors

Benjamin Marsteau is a System administrator | Ops | Dad | and tries to give back to the community as much as it gives him.


Serverless everything: One-button serverless deployment pipeline for a serverless app

14. December 2016

Author: Soenke Ruempler
Editors: Ryan S. Brown

Update: Since AWS recently released CodeBuild, things got much simpler. Please also read my follow-up post AWS CodeBuild: The missing link for deployment pipelines in AWS.

Infrastructure as Code is the new default: With tools like Ansible, Terraform, CloudFormation, and others it is getting more and more common. A multitude of services and tools can be orchestrated with code. The main advantages of automation are reproducibility, fewer human errors, and exact documentation of the steps involved.

With infrastructure expressed as code, it’s not a stretch to also want to codify deployment pipelines. Luckily, AWS has its own service for that named CodePipeline, which in turn can be fully codified and automated by CloudFormation (“Pipelines as Code”).

This article will show you how to create a deploy pipeline for a serverless app with a “one-button” CloudFormation template. The more concrete goals are:

  • Fully serverless: neither the pipeline nor the app itself involves server, VM or container setup/management (and yes, there are still servers, just not managed by us).
  • Demonstrate a fully automated deployment pipeline blueprint with AWS CodePipeline for a serverless app consisting of a sample backend powered by the Serverless framework and a sample frontend powered by “create-react-app”.
  • Provide a one-button quick start for creating deployment pipelines for serverless apps within minutes. Nothing should be run from a developer machine, not even an “inception script”.
  • Show that it is possible to lower complexity by leveraging AWS components so you don’t need to configure/click third party providers (e.g. TravisCi/CircleCi) as pipeline steps.

We will start with a repository consisting of a typical small web application with a front end and a back end. The deployment pipeline described in this article makes some assumptions about the project layout (see the sample project):

  • a frontend/ folder with a package.json which will produce a build into build/ when npm run build is called by the pipeline.
  • a backend/ folder with a serverless.yml. The pipeline will call the serverless deploy (the Serverless framework). It should have at least one http event so that the Serverless framework creates a service endpoint which can then be used in the frontend to call the APIs.

For a start, you can just clone or copy the sample project into your own GitHub account.

As soon as you have your project ready, we can continue to create a deployment pipeline with CloudFormation.

The actual CloudFormation template we will use here to create the deployment pipeline does not reside in the project repository. This allows us to develop/evolve the pipeline and the pipeline code and the projects using the pipeline independent from each other. It is published to an S3 bucket so we can build a one-click launch button. The launch button will direct users to the CloudFormation console with the URL to the template prefilled:

Launch Stack

After you click on the link (you need to be logged in to the AWS Console) and click “Next” to confirm that you want to use the predefined template, some CloudFormation stack parameters have to be specified:

CloudFormation stack parameters

First you need to specify the GitHub Owner/Repository of the project (the one you copied earlier), a branch (usually master) and a GitHub Oauth Token as described in the CodePipeline documentation.

The other parameters specify where to find the Lambda function source code for the deployment steps; we can live with the defaults for now, that’s stuff for another blog post. (Update: the Lambda functions became obsolete with the move to AWS CodeBuild, and so did the template parameters regarding the Lambda source code location.)

The next step of the CloudFormation stack setup allows you to specify advanced settings like tags, notifications, and so on. We can leave these as-is as well.

On the last assistant page you need to acknowledge that CloudFormation will create IAM roles on your behalf:

CloudFormation IAM confirmation

The IAM roles are needed to give the Lambda functions the right permissions to run and to write logs to CloudWatch. Once you press the “Create” button, CloudFormation will create the following AWS resources:

  • An S3 Bucket containing the website assets with website hosting enabled.
  • A deployment pipeline (AWS CodePipeline) consisting of the following steps:
    • Checks out the source code from GitHub and saves it as an artifact.
    • Back end deployment: A Lambda function build step which takes the source artifact, installs and calls the Serverless framework.
    • Front end deployment: Another Lambda function build step which takes the source artifact, runs npm build and deploys the build to the Website S3 bucket

(Update: in the meantime, I replaced the Lambda functions with AWS CodeBuild).

No servers harmed so far, and also no workstations: No error-prone installation steps in READMEs to be followed, no curl | sudo bash or other awkward setup instructions. Also no hardcoded AWS access key pairs anywhere!

A platform team in an organization could provide several of these types of templates for particular use cases, then development teams could get going just by clicking the link.

Ok, back to our example: Once the CloudFormation stack creation is fully finished, the created CodePipeline is going to run for the first time. On the AWS console:

CodePipeline running

As soon as the initial pipeline run is finished:

  • the back end CloudFormation stack has been created by the Serverless framework, depending on what you defined in the backend/serverless.yml configuration file.
  • the front end has been built and put into the website bucket.

To find out the URL of our website hosted in S3, open the resources of the CloudFormation stack and expand the outputs. The WebsiteUrl output will show the actual URL:

CloudFormation Stack output

Click on the URL link and view the website:

Deployed sample website

Voila! We are up and running!

As you might have seen in the picture above, there is some JSON output: it’s actually the result of an HTTP call the front end made against the back end’s hello function, which just echoes back the Lambda event object.

Let’s dig a bit deeper into this detail, as it shows the integration of frontend and backend: to pass the ServiceEndpoint URL to the front end build step, the back end build step exports all CloudFormation outputs of the Serverless-created stack as a CodePipeline build artifact, which the front end build step in turn passes to npm build (in our case via a React-specific environment variable). This is how the API call looks in React:
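The snippet didn’t survive the formatting here, but the shape of it is roughly this; REACT_APP_SERVICE_ENDPOINT is my stand-in for the create-react-app-style variable name:

```javascript
componentDidMount() {
  // REACT_APP_* variables are inlined by create-react-app at build time
  fetch(`${process.env.REACT_APP_SERVICE_ENDPOINT}/hello`)
    .then(response => response.json())
    .then(json => this.setState({ backendResponse: json }));
}
```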

This cross-site request actually works because we specified CORS to be on in the serverless.yml:
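In the Serverless framework that is a per-event flag; a minimal sketch of the relevant bit, with the handler and path assumed:

```yaml
functions:
  hello:
    handler: handler.hello
    events:
      - http:
          path: hello
          method: get
          cors: true
```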

Here is a high-level overview of the created CloudFormation stack:

Overview of the CloudFormation Stack

With the serverless pipeline and serverless project running, change something in your project, commit it, and watch the change propagate through the pipeline!

Additional thoughts:

I want to setup my own S3 bucket with my own CloudFormation templates/blueprints!

In case that you don’t trust me as a template provider, or you want to change the one-button CloudFormation template, you can of course host your own S3 bucket. The scope of doing that is beyond this article but you can start by looking at my CloudFormation template repo.

I want to have testing/staging in the pipeline!

The sample pipeline does not have any testing or staging steps. You can add more steps to the pipeline, for example another Lambda step which calls npm test on your source code.

I need a database/cache/whatever for my app!

No problem, just add additional resources to the serverless.yml configuration file.

Summary

In this blog post I demonstrated a CloudFormation template which bootstraps a serverless deployment pipeline with AWS CodePipeline. This enables rapid application development and deployment, as development teams can use the template in a “one-button” fashion for their projects.

We have deployed a sample project with a deployment pipeline with a front and back end.

AWS gives us all the lego bricks we need to create such pipelines in an automated, codified and (almost) maintenance-free way.

Known issues / caveats

  • I am describing an experiment / working prototype here. Don’t expect high-quality, battle-tested code (esp. the JavaScript parts 🙂). It’s more about the architectural concept. Issues and pull requests to the mentioned projects are welcome 🙂 (Update: luckily I could delete all the JS code with the move to AWS CodeBuild)
  • All deployment steps currently run with full access roles (AdministratorAccess) which is a bad practice but it was not the focus of this article.
  • The website could also be backed by a CloudFront CDN with HTTPS and a custom domain.
  • Beware of the 5-minute execution limit in Lambda functions (e.g. more complex serverless.yml setups might take longer; this could be worked around by moving resource creation out into a CloudFormation pipeline step, which Michael Wittig has blogged about). (Update: this point became invalid with the move to AWS CodeBuild)
  • The build steps are currently not optimized, e.g. installing npm/serverless every time is not necessary. It could use an artifact from an earlier CodePipeline step (Update: this point became invalid with the move to AWS CodeBuild)
  • The CloudFormation stack created by the Serverless framework is currently suffixed with “dev”, because that’s their default environment. The prefix should be omitted or made configurable.

Acknowledgements

Special thanks goes to the folks at Stelligent.

First for their open source work on serverless deploy pipelines with Lambda, especially the “dromedary-serverless” project. I adapted much from the Lambda code.

Second for their “one-button” concept which influenced this article a lot.

About the Author

Along with 18 years of web software development and web operations experience, Soenke Ruempler is an expert in AWS technologies (with 6 years of experience developing and operating on AWS), and in moving on-premise/legacy systems to the Cloud without service interruptions.

His special interests and fields of knowledge are Cloud/AWS, Serverless architectures, Systems Thinking, Toyota Kata (Kaizen), Lean Software Development and Operations, High Performance/Reliability Organizations, Chaos Engineering.

You can find him on Twitter, Github and occasionally blogging on ruempler.eu.

About the Editors

Ryan Brown is a Sr. Software Engineer at Ansible (by Red Hat) and contributor to the Serverless Framework. He’s all about using the best tool for the job, and finds simplicity and automation are a winning combo for running in AWS.


Four ways AWS Lambda makes me happy

09. December 2016

Author: Tal Perry
Editors: Jyrki Puttonen, Bill Weiss

Intro

What is Lambda

Side projects are my way of learning new technology. One that I’ve been anxious to try is AWS Lambda. In this article, I will focus on the things that make Lambda a great service in my opinion.

For the uninitiated, Lambda is a service that allows you to essentially upload a function and AWS will make sure the hardware is there to run it. You pay for the compute time in hundred-millisecond increments instead of by the hour, and you can run as many copies of your Lambda function as needed.

You can think of Lambda as a natural extension to containers. Containers (like Docker) allow you to easily deploy multiple workloads to a fleet of servers. You no longer deploy to a server, you deploy to the fleet and if there is enough room in the fleet your container runs. Lambda takes this one step further by abstracting away the management of the underlying server fleet and containerization. You just upload code, AWS containerizes it and puts it on their fleet.

Why did I choose Lambda?

My latest side project is SmartScribe, an automated transcription service. SmartScribe transcribes hours of audio in minutes, a feat which requires considerable memory and parallel processing of audio. While a fleet of containers could get the job done, I didn’t want to manage a fleet or integrate it with other services, nor did I want to pay for my peak capacity when my baseline usage was far lower. Lambda abstracts away these issues, which made it a very satisfying choice.

How AWS Lambda makes me happy

It’s very cheap

I love to invest my time in side projects; I get to create and learn. Perhaps irrationally, I don’t like to put a lot of money into them from the get-go. When I start building a project, I want it up all the time so that I can show it around. On the other hand, I know that 98% of the time, my resources will not be used.

Serverless infrastructure saves me that 98% by allowing me to pay in hundred-millisecond increments instead of by the hour. 98% is a lot of savings by any account.

I don’t have to think about servers

As I mentioned, I like to invest my time in side projects but I don’t like to invest it in maintaining or configuring infrastructure. A thousand little things can go wrong on your server and any one of those will bring your product to a halt. I’m more than happy to never think about another server again.

Here are a few things that have slowed me down before that Lambda has abstracted away:

  1. Having to reconfigure because I forgot to set the IP address of an instance to elastic and the address went away when I stopped it (to save money)
  2. Worrying about disk space. My processes write to the disk. Were I to use a traditional architecture, I’d have to worry about multiple concurrent processes consuming the entire disk, a subtle and aggravating bug. With Lambda, each function invocation is guaranteed a (small) chunk of tmp space, which reduces my concern.
  3. Running out of memory. This is a fine point, because a single Lambda function can only use 1.5 GB of memory.

Two caveats:

  1. Applications that hold large data sets in memory might not benefit from Lambda. Applications that hold small to medium sized data sets in memory are prime candidates.
  2. 512MB of provisioned tmp space is a major bottleneck to writing larger files to disk.

SmartScribe works with fairly large media files, and we need to store them in memory with overhead. Even a few concurrent users could easily lead to problems with available memory, even with a swap file (and we hate configuring servers, so we don’t want one). Lambda guarantees that every call to my endpoints will receive the requisite amount of memory. That’s priceless.

I use Apex to deploy my functions, which happens in one line
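That one line, for the record, is simply this, run from the project directory:

```bash
apex deploy
```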

Apex is smart enough to only deploy the functions that have changed. And in that one line, my changes and only them reach every “server” I have. Compare that to the time it takes to do a blue green deployment or, heaven forbid, sshing into your server and pulling the latest changes.

But wait, there is more. Pardon last year’s buzzword, but AWS Lambda induces, or at least encourages, a microservice architecture. Since each function exists as its own unit, testing becomes much easier and more isolated, which saves loads of time.

Tight integration with other AWS services

What makes microservices hard is the overhead of orchestration and communications between all of the services in your system. What makes Lambda so convenient is that it integrates with other AWS services, abstracting away that overhead.

Having AWS invoke my functions based on an event in S3 or SNS means that I don’t have to create some channel of communication between these services, nor monitor that channel. I think this is what makes Lambda so convenient: the overhead you pay for a scalable, maintainable, and simple code base is virtually nullified.

The punch line

One of the deep axioms of the world is “Good, Fast, Cheap: Choose two.” AWS Lambda takes a stab at challenging that axiom.

About the Author:

By day, Tal is a data science researcher at Citi’s Innovation Lab Tel Aviv, focusing on NLP. By night he is the founder of SmartScribe, a fully serverless automated transcription service hosted on AWS. Previously Tal was CTO of Superfly, where he and his team leveraged AWS technologies and good DevOps practices to scale the data pipeline 1000x. Check out his projects and reach out on Twitter @thetalperry.

About the Editors:

Jyrki Puttonen is Chief Solutions Executive at Symbio Finland (@SymbioFinland) who tries to keep on track what happens in cloud.

Bill Weiss is a senior manager at Puppet in the SRE group. Before his move to Portland to join Puppet, he spent six years in Chicago working for Backstop Solutions Group, prior to which he was in New Mexico working for the Department of Energy. He still loves him some hardware, but is accepting that AWS is pretty rad for some things.