AWS Direct Connect

Today’s post on AWS Direct Connect is a contribution by Benjamin Krueger, who is a Site Reliability Engineer for Sourcefire, Inc. and is presently working with a highly talented team to build a flexible hybrid cloud infrastructure.

He enjoys a delicious cup of buzzword soup, and isn’t afraid to SOA his cloud with API driven platform metrics. His event streams offer high availability processing characteristics. Turbo Encabulator.

Deck the halls with single-mode fiber

I wish I could have my cake and eat it too.

Whether you are a fan or critic, the world of cloud computing has undeniably changed how many of us build and operate the services we offer. Also undeniable, however, is the fact that the reliability of your access to resources in the cloud is limited by the reliability of all the networks in between. In the networking world, one way that ISPs, carriers, and content providers often side-step this issue is by participating in Internet Exchanges: physical network meet-up points where participants exchange network traffic directly between their respective networks. Another form of this is through direct network peering agreements where two parties maintain a direct physical network connection between each other to exchange traffic.

While the cloud offers lots of benefits, sometimes it just doesn’t make sense to run your entire operation there. You can’t run your own specialized network appliances in the cloud, for example. Perhaps your requirements specify a level of hardware control that can’t be met by anything other than an in-house datacenter. Maybe the cost-benefit of available cloud server instances makes sense for some workloads but not for others. Sure, you can write off the cloud entirely, but wouldn’t it be nice if you could build a hybrid solution and get a network connection direct from your own datacenter or home office to your cloud provider’s network? If you’re an Amazon Web Services customer then you can do this today with AWS Direct Connect. This article won’t be a how-to cookbook but will outline what Direct Connect is and how you can use it to improve the reliability and performance of your infrastructure when taking advantage of the benefits of cloud services.

The AWS Direct Connect service

The AWS Direct Connect service lets you establish a network link, at 1Gb or 10Gb, from your datacenter to one of seven Amazon regional datacenters across the globe. At the highest level, you work with your infrastructure provider to establish a network link between your datacenter and an AWS Direct Connect Location. Direct Connect Locations are like meet-up points. Each is located in physical proximity to an Amazon region and is the point where direct connections are brought into Amazon’s network for that region.

AWS Direct Connect Locations

As an illustration, let’s explore a hypothetical Direct Connect link from a New Jersey datacenter to Amazon’s US-East region. Amazon maintains Direct Connect Locations for their US-East Northern Virginia region at CoreSite in New York City, and seven Equinix data centers in Northern Virginia. Being in New Jersey, it makes sense for us to explore a connection to their CoreSite location in New York. Since you don’t already have a presence in CoreSite, you would make arrangements to rent a cage and colocate. Then you would have to make arrangements, usually with a telco or other network infrastructure provider, to create a link between your datacenter and your gear in CoreSite. At that point, you can begin the process to cross-connect between your CoreSite cage and Amazon’s CoreSite cage.

The example I just outlined has quite a few drawbacks. We need to interface with a lot of companies and sign a lot of contracts. That necessarily means quite a bit of involvement from your executive management and legal counsel. It also requires a significant investment of time and Capex, as well as ongoing Opex. Is there anything we can do to make this process simpler and more cost-effective?

An AWS Direct Connect Layout

As it turns out, there is something we can do. Amazon has established a group of what they call APN Technology and Consulting Partners. That’s quite a mouthful, but it boils down to a group of companies that can manage many of the details involved in the Direct Connect process. In the example layout above, we work with an APN Partner who establishes a link between our datacenter and the Direct Connect Location. They take care of maintaining a presence there, as well as the details involved in interfacing with Amazon’s cage. The end result is a single vendor helping us establish our Direct Connect link to Amazon.

So what’s this gonna cost me?

At the time of this publication, Amazon charges $0.30 per port-hour for 1Gb connections and $2.25 per port-hour for 10Gb connections. Since the ports are always on while you use the service, that works out to approximately $220/mo for 1Gb and $1650/mo for 10Gb. In addition, Amazon charges $0.03 per GB of outbound transfers while inbound transfers are free. That means a Direct Connect link makes the most sense for pushing large quantities of data towards Amazon. This works out well for scenarios where systems in the cloud make small requests to systems in your datacenter which then return larger results.

Costs when dealing with an APN Partner can vary. In my own environment, the vendor costs approximately $3k/mo. The vendor takes care of the connection between our Northern Virginia datacenter and Amazon’s Equinix Direct Connect Location, and we get a single-mode fiber drop straight into our cage. All we have to do is plug it into our router. For more complex links, costs will obviously be higher. You could direct connect your Toronto datacenter to Amazon through CoreSite in New York, but with getting fiber out of your cage, working with a network carrier for the trip between cities, cage rental, and cross-connect charges, don’t be surprised if the bill is significant!

Get your packets runnin’

Once you have a physical path to Amazon, you need to plug it into something. Amazon requires that your router support 802.1Q VLANs, BGP, and BGP MD5 authentication. While this often means using a traditional network router from a company like Cisco or Juniper, you could also build a router using Linux and a BGP implementation like OpenBGPD or Zebra. If you have an ops team, but the idea of BGP makes you shiver, don’t fret. Once you give Amazon some details, they will generate a sample configuration for common Cisco and Juniper routers.

To begin routing traffic to AWS public services over your Direct Connect link, you will need to create a Virtual Interface in Amazon’s configuration console. You only need a few pieces of information to set this up: a VLAN number, a BGP ASN, your router’s peer IP address (which Amazon will provide), Amazon’s peer IP address (also provided), and the network prefixes that you want to advertise. Some of this is straightforward, and some less so. If you do not have a BGP ASN then you can choose an arbitrary number between 64512 and 65534, which is a range of BGP ASNs reserved by IANA similar to RFC1918 address space. The prefix is a public address block which Amazon will know to route over your virtual interface; this could be as small as a /32 for a NAT server that your systems live behind. It should be noted that at this time, Direct Connect does not support IPv6.

Amazon has authored some excellent documentation for most of their AWS services, and the process for creating Virtual Interfaces is no exception. Your configuration may require some subtle changes, and of course you should never promote any system to production status without fully understanding the operational and security consequences of its configuration.

But once you’ve reached that point and your virtual interface is online, Amazon will begin routing your packets over the link, and you now have a direct connection straight into Amazon’s network!

So what does a hybrid infrastructure look like?

In addition to using Direct Connect to access AWS public services, you can also use it to run your own infrastructure on Amazon’s platform. One of the most polished and well-supported ways to do this is by using Amazon’s Virtual Private Cloud service. A VPC environment allows you to take your server instances out of Amazon’s public network. If you are familiar with Amazon’s EC2 platform, you will recall that server instances live on a network of private address space alongside every other EC2 customer. VPC takes that same concept, but puts your instances on one or more private address spaces of your choosing by themselves. Additionally, it offers fine-grained control over which machines get NATed to the public internet, which subnets can speak to each other, and other routing details. Another benefit offered by VPC is the ability for your Direct Connect Virtual Interface to drop straight onto your VPC’s private network. This means that the infrastructure in both your datacenter and Amazon VPC can live entirely on private address space, and communicate directly. Your network traffic never traverses the public internet. In essence, your VPC becomes a remote extension of your own datacenter.

When to use all this?

So what kind of situations can really benefit from this kind of hybrid infrastructure? There are myriad possibilities, but one that might be a common case is to take advantage of Amazon’s flexible infrastructure for service front-ends while utilizing your own hardware and datacenter for IO intensive or sensitive applications. In our hypothetical infrastructure, taking advantage of Amazon’s large bandwidth resources, ability to cope with DDoS, and fast instance provisioning, you bring up new web and application servers as demand requires. This proves to be cost effective, but your database servers are very performance sensitive and do not cope well in Amazon’s shared environment. Additionally, your VP really wants the master copy of your data to be on resources you control. Running your database on Amazon is right out, but using Direct Connect your app servers can connect right to your database in your datacenter. This works well, but all of your read requests are traversing the link and you’d like to eliminate that. So you set up read slaves inside Amazon, and configure your applications to only send writes to the master. Now only writes and your replication stream traverse the link, taking advantage of Amazon’s Direct Connect pricing and free inbound traffic.

How’s it work?

So how well can Direct Connect perform? Here is an example of the latency between the router in my own datacenter in Northern Virginia, and the router on Amazon’s US-East network. This is just about a best-case scenario, of course, and the laws of physics apply.

One millisecond, which is the finest resolution our router reports! Due to a misconfiguration, I don’t presently have throughput stats, but when measured in the past we have been able to match the interface speed that the router is capable of. In other words, Direct Connect performs exactly as you would expect a fiber link between two locations to perform.

Wrapping up

There are caveats to using Direct Connect, especially in a production environment. Being a single line of fiber, your network path is exposed to a few single points of failure. These include your router, the fiber between you and the Direct Connect Location, and the infrastructure between the Direct Connect Location and Amazon’s regional datacenter. Additionally, Amazon does not offer an SLA on Direct Connect at this time and reserves the right to take down interfaces on their end for maintenance. Because of this, Amazon recommends ensuring that you can fail over to your regular internet link or ordering a second Direct Connect link. If your requirements include low latency and high throughput, and failing over to your default internet provider link will not suffice, a second Direct Connect link may be justified.

While I’ve outlined Direct Connect’s benefits for a single organization’s hybrid infrastructure, that certainly isn’t the only group who can take advantage of this service. Hosting companies, for example, might wish to maintain Direct Connect links to Amazon datacenters so that their customers can take advantage of Amazon’s Web Services in a low latency environment. Organizations with a traditional datacenter might use AWS as a low cost disaster recovery option, or as a location for off-site backup storage.

I hope this article has helped illuminate Amazon’s Direct Connect service. Despite a few drawbacks this service is a powerful tool in the system administrator’s toolbox, allowing us to improve the reliability and performance of our infrastructures while taking advantage of the benefits of Amazon’s cloud platform. Hopefully we will soon start seeing similar offerings from other cloud providers. Perhaps there may even be dedicated cloud exchanges in the future, allowing direct communication between providers and letting us leverage the possibility of a truly distributed infrastructure on multiple clouds.


Automating Backups in AWS

In Day 9’s post we learned about some ideas for how to do proper backups when using AWS services.

In today’s post we’ll take a hands-on approach to automating the creation of resources and performing the actions needed to achieve these kinds of backups, using some bash scripts and the Boto Python library for AWS.

Ephemeral Storage to EBS volumes with rsync

Since IO performance is key for many applications and services, it is common to use your EC2 instance’s ephemeral storage and Linux software RAID for your instance’s local data storage. While EBS volumes can have erratic performance, they are useful for providing backup storage that’s not tied to your instance but is still accessible through a filesystem.

The approach we’re going to take is as follows:

  1. Make a software RAID1 from two EBS volumes and mount it as /backups
  2. Make a shell script to rsync /data to /backups
  3. Set the shell script up to run as a cron job

Making the EBS volumes

Adding the EBS volumes to your instance can be done with a simple Boto script.

add-volumes.py
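A minimal boto sketch of what such an add-volumes.py might look like follows; the region, availability zone, instance ID, device names, and volume size are all placeholder assumptions, so adjust them for your environment.

#!/usr/bin/env python
# Hypothetical add-volumes.py: create two EBS volumes and attach them to an
# instance so they can later be mirrored as a RAID1.
import time

import boto.ec2

REGION = 'us-east-1'            # placeholder region
ZONE = 'us-east-1a'             # must match the instance's availability zone
INSTANCE_ID = 'i-12345678'      # placeholder instance ID
SIZE_GB = 100                   # size of each backup volume

conn = boto.ec2.connect_to_region(REGION)

for device in ['/dev/sdf', '/dev/sdg']:
    vol = conn.create_volume(SIZE_GB, ZONE)
    # Wait for the volume to become available before attaching it.
    while vol.update() != 'available':
        time.sleep(5)
    conn.attach_volume(vol.id, INSTANCE_ID, device)
    print('Attached %s as %s' % (vol.id, device))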

Once you’ve run this script you’ll have two new volumes attached as local devices on your EC2 instance.

Making the RAID1

Now you’ll want to make a two volume RAID1 from the EBS volumes and make a filesystem on it.

The following shell script takes care of this for you.

make-raid1-format.sh

Now you have a /backups mount point that you can rsync files and folders to as part of your backup process.

rsync shell script

rsync is the standard tool for efficiently syncing files and directories on Linux servers.

The following shell script will use rsync to make backups for you.

rsync-backups.sh

making a cron job

To make this a cron job that runs once a day, you can add a file like the following, which assumes you put rsync-backups.sh in /usr/local/bin

This cron job will run as root, at 12:15AM in the timezone of the instance.

/etc/cron.d/backups

Data Rotation, Retention, Etc

To improve on how your data is rotated and retained you can explore a number of open source tools, including:

EBS Volumes to S3 with boto-rsync

Now that you’ve got your data backed up to EBS volumes, or you’re using EBS volumes as your primary datastore, you’re going to want to ensure a copy of your data exists elsewhere. This is where S3 is a great fit.

As you’ve seen, rsync is often the key tool in moving data around on and between Linux filesystems, so it makes sense that we’d use an rsync style utility that talks to S3.

For this we’ll look at how we can use boto-rsync.

boto-rsync is a rough adaptation of boto’s s3put script which has been reengineered to more closely mimic rsync. Its goal is to provide a familiar rsync-like wrapper for boto’s S3 and Google Storage interfaces.

By default, the script works recursively and differences between files are checked by comparing file sizes (e.g. rsync’s --recursive and --size-only options). If the file exists on the destination but its size differs from the source, then it will be overwritten (unless the -w option is used).

boto-rsync is simple to use, being as easy as boto-rsync [OPTIONS] /local/path/ s3://bucketname/remote/path/, which assumes you have your AWS keys in ~/.boto or set as environment variables.

boto-rsync has a number of options you’ll be familiar with from rsync and you should consult the README to get more familiar with this.

As you can see, you can easily couple boto-rsync with a cron job and some script to get backups going to S3.
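If you’re curious what boto-rsync is doing under the hood, or you only need to push a handful of files, the underlying boto S3 calls are simple enough to script directly. A rough sketch, assuming a placeholder bucket name and the /backups directory from earlier:

import os

import boto
from boto.s3.key import Key

BUCKET = 'my-backup-bucket'     # placeholder bucket name
LOCAL_DIR = '/backups'          # directory populated by rsync-backups.sh

# Credentials are read from ~/.boto or the AWS_* environment variables,
# just as boto-rsync does.
conn = boto.connect_s3()
bucket = conn.get_bucket(BUCKET)

for root, _dirs, files in os.walk(LOCAL_DIR):
    for name in files:
        path = os.path.join(root, name)
        key = Key(bucket)
        key.key = os.path.relpath(path, LOCAL_DIR)
        key.set_contents_from_filename(path)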

Lifecycle policies for S3 to Glacier

One of the recent features added to S3 was the ability to use lifecycle policies to archive your S3 objects to Glacier.

You can create a lifecycle policy to archive data in an S3 bucket to Glacier very easily with the following boto code.

s3-glacier.py
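A hedged sketch of what such an s3-glacier.py could look like, assuming a placeholder bucket, a backups/ prefix, and a 30-day transition window:

import boto
from boto.s3.lifecycle import Lifecycle, Rule, Transition

BUCKET = 'my-backup-bucket'     # placeholder bucket name

conn = boto.connect_s3()
bucket = conn.get_bucket(BUCKET)

# Transition objects under backups/ to Glacier 30 days after creation.
to_glacier = Transition(days=30, storage_class='GLACIER')
rule = Rule('archive-backups', prefix='backups/', status='Enabled',
            transition=to_glacier)

lifecycle = Lifecycle()
lifecycle.append(rule)
bucket.configure_lifecycle(lifecycle)

print(bucket.get_lifecycle_config())

Once the policy is in place, anything under the backups/ prefix will transition to Glacier automatically as it ages.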

Conclusion

As you can see, there are many options for automating your backups on AWS in comprehensive and flexible ways, and this post is only the tip of the iceberg.


Using ELB and Auto-Scaling

Load balancing is a critical piece of any modern web application infrastructure and Amazon’s Elastic Load Balancer (ELB) service provides an API driven and integrated solution for load balancing when using AWS services. Building on top of Amazon’s CloudWatch monitoring and metrics solution, and easily coupled with ELB, Amazon’s Auto-Scaling service provides you with the ability to dynamically scale parts of your web application infrastructure on the fly, based on performance or user demand.

Elastic Load Balancer

ELB is a software load balancer solution that provides you with public IPs, SSL termination, and the ability to do layer 4 and 7 load balancing, with session stickiness as needed. It is managed through the AWS console, CLI tools, or the ELB API, all while paying by the hour, only for the resources and bandwidth used.

Auto-Scaling

Auto-Scaling lets you define CloudWatch metrics for dynamically scaling EC2 instances up and down, completely automatically. You’re able to utilize On-Demand or Spot instances, inside or outside of your VPC, and it couples easily with ELB to allow auto-scaled instances to begin serving traffic for web applications. It is managed through the AS CLI tools or the AS API, all while paying by the hour, only for the CloudWatch metrics used. You’re also able to use AWS SNS to get alerted as auto-scaling policies take actions.

Getting Started with ELB

ELB is composed of ELB instances. An ELB instance has the following elements:

To get started with ELB, you’ll build an ELB instance through the AWS console (a scripted boto equivalent follows the steps below):

  1. Login to the AWS console
  2. Click Load Balancers
  3. On the DEFINE LOAD BALANCER page, make the following selections:
  4. Enter a name for your load balancer (e.g., MyLoadBalancer).
  5. Leave CreateLB inside set to EC2 because in this example you’ll create your load balancer in Amazon EC2. The default settings require that your Amazon EC2 HTTP servers are active and accepting requests on port 80.
  6. On the CONFIGURE HEALTH CHECK page of the Create a New Load Balancer wizard, set the following configurations:
  7. Leave Ping Protocol set to its default value of HTTP.
  8. Leave Ping Port set to its default value of 80.
  9. In the Ping Path field, replace the default value with a single forward slash (“/”). Elastic Load Balancing sends health check queries to the path you specify in Ping Path. This example uses a single forward slash so that Elastic Load Balancing sends the query to your HTTP server’s default home page, whether that default page is named index.html, default.html, or a different name.
  10. Leave the Advanced Options set to their default values.
  11. On the ADD INSTANCES page, check the boxes in the Select column to add instances to your load balancer.
  12. On the Review page of the Create a New Load Balancer wizard, check your settings. You can make changes to the settings by clicking the edit link for each setting.
  13. Once you’ve made your configuration choices, added your instances, and reviewed your selections, you’re ready to create your load balancer.
  14. After you click the Create button on the REVIEW page, a confirmation window opens. Click Close. When the confirmation window closes, the Load Balancers page opens. Your new load balancer now appears in the list.
  15. You can test your load balancer after you’ve verified that at least one of your EC2 instances is InService. To test your load balancer, copy the DNS Name value that is listed in the Description tab and paste it into the address field of an Internet-connected web browser. If your load balancer is working, you will see the default page of your HTTP server.
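If you’d rather script the steps above than click through the console, here is a rough boto equivalent; the region, availability zone, and instance IDs are placeholder assumptions:

import boto.ec2.elb
from boto.ec2.elb import HealthCheck

REGION = 'us-east-1'                      # placeholder region
INSTANCES = ['i-12345678', 'i-87654321']  # placeholder instance IDs

conn = boto.ec2.elb.connect_to_region(REGION)

# Listener tuple is (LB port, instance port, protocol): plain HTTP on port 80,
# matching the console defaults above.
lb = conn.create_load_balancer('MyLoadBalancer', ['us-east-1a'],
                               [(80, 80, 'http')])

# Equivalent of the CONFIGURE HEALTH CHECK step: ping "/" over HTTP.
hc = HealthCheck(interval=30, target='HTTP:80/',
                 healthy_threshold=3, unhealthy_threshold=5)
lb.configure_health_check(hc)

# Equivalent of the ADD INSTANCES step.
lb.register_instances(INSTANCES)

print(lb.dns_name)

The DNS name it prints is the same value you’d otherwise copy from the Description tab to test the load balancer.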

Now that you’ve created an ELB instance, some of the things you may want to do could include:

Getting Started with Auto-Scaling

Auto-Scaling is built from two things: a launch configuration and an auto-scaling group.

To build an auto-scaling configuration, do the following

  1. Download and Install the AS CLI tools
  2. Create a launch configuration, e.g. as-create-launch-config MyLC --image-id ami-2272864b --instance-type m1.large
  3. Create an Auto-Scaling group, e.g. as-create-auto-scaling-group MyGroup --launch-configuration MyLC --availability-zones us-east-1a --min-size 1 --max-size 1
  4. You can list your auto-scaling group with as-describe-auto-scaling-groups --headers

At this point you have a basic auto-scaling group.
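The same two-step setup can also be scripted with boto instead of the AS CLI tools; a minimal sketch, using the same names, AMI, and sizes as the commands above:

import boto.ec2.autoscale
from boto.ec2.autoscale import AutoScalingGroup, LaunchConfiguration

conn = boto.ec2.autoscale.connect_to_region('us-east-1')  # placeholder region

# Equivalent of as-create-launch-config MyLC ...
lc = LaunchConfiguration(name='MyLC', image_id='ami-2272864b',
                         instance_type='m1.large')
conn.create_launch_configuration(lc)

# Equivalent of as-create-auto-scaling-group MyGroup ...
group = AutoScalingGroup(group_name='MyGroup', launch_config=lc,
                         availability_zones=['us-east-1a'],
                         min_size=1, max_size=1)
conn.create_auto_scaling_group(group)

# Equivalent of as-describe-auto-scaling-groups --headers
print(conn.get_all_groups(names=['MyGroup']))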

To make this useful you’ll probably want to do some of the following

Conclusion

In conclusion, ELB and Auto-Scaling provide a number of options for managing and scaling your web application infrastructure based on traffic growth and user demand, and they are easy to mix and match with other AWS services.


Using IAM to Increase Flexibility and Security

Today’s post is a contribution from Erik Hollensbe, an active member of the Chef and Operations communities online and a practicing Operations Engineer.

AWS IAM (Identity and Access Management) is a tool to apply ACLs to AWS credentials – it’s not much more than that. While this sounds pretty banal, it can be used to solve a number of problems with both the flexibility and security of your network.

Scare Tactics Time

A lot of companies and groups use AWS exclusively, where previously they would have used racks of machines in a data center. Short of having a working proximity card and a large bucket of water, there wasn’t much you were going to be able to do to cause irreparable damage to every component of your company’s network. Presuming you did that, and didn’t kill yourself by electrocution, you still had to evade the DC cameras to get away with it.

That all changes with something like AWS. The master keys to your account can literally be used to destroy everything. Your machines, your disks, your backups, your assets. Everything. While vendor partitioning, off site backups, etc, is an excellent strategy (aside from other, separate gains) to mitigate the long-term damage, it doesn’t change this. Plus since the credentials are shared, it’s not exactly a feat to do this anonymously.

While my intent isn’t to scare you into using IAM, it’s important to understand that in more than a few organizations, not only will many members of your staff have these credentials, but frequently enough they will also live on the servers as part of code deploys, configuration management systems, or one-off scripts. So you don’t even have to work at the company in that situation, you simply need to find a hole to poke open to tear down an entire network.

Security Theatre in a Nutshell

Before I go into how to use IAM to solve these problems, I’m going to balance this out with a little note about security theatre.

Know the problem you’re actually solving. If you’re not clear on what you’re solving, or it’s not a full solution, you’re practicing the worst kind of security theatre, wasting everyone’s time as a result. Good security is as much about recognizing and accepting risk as mitigating it. Some of these solutions may not apply to you and some of them may not be enough. Use your head.

IAM as a tool to mitigate turnover problems

This is the obvious one, so I’ll mention it first. Managing turnover is something that’s familiar to anyone with significant ops experience, whether or not they had any hand in the turnover itself. Machines and networks are expected to be protected from reprisals and a good ops person is thinking about this way ahead of when it happens for the first time.

Just to be clear, no human likes discussing this subject, but it is a necessity and an unfortunate consequence of both business and ops. Ignoring it isn’t a solution.

IAM solves these problems in a number of ways:

  • Each user gets their own account to both log in to the web interface and associated credentials to use.
  • Users are placed in groups which have policies (ACLs). Users individually have policies as well and these can cascade. Policies are just JSON objects, so they’re easy to re-use, keep in source control, etc.

Most users have limited needs and it would be wise to (without engaging in security theatre) assess what those needs are and design policies appropriately. Even if you don’t assign restrictive policies, partitioning access by user makes credential revocation fast and easy, which is exactly what you want and need in an unexpected turnover situation… which is usually the time when it actually matters.

And who watches the watchers? Let’s be honest with ourselves for a second. You may be behind the steering wheel, but you probably aren’t the final word on the route to take, and anyone who thinks they are because they hold the access keys needs more friends in the legal profession. Besides, it’s just not that professional. Protect your network against yourself too. It’s just the right thing to do.

So, here’s the shortest path to solving turnover problems with AWS credentials:

  • Bootstrap IAM – click on IAM in the AWS control panel. Set up an admin group (the setup will offer to create this group for you) and a custom URL for your users to log in to.
  • Set up users for everyone who needs to do anything with AWS. Make them admins. (Admins still can’t modify the owner settings, but they can affect other IAM users.)
  • Tell your most senior technical staff to create a new set of owner credentials, to change the log in password, and to revoke the old credentials.

Now you’re three clicks away (or an API call) from dealing with any fear of employee reprisal short of the CTO, and you have traceable legal recourse in case it took you too long to do that. Congratulations.

IAM as a tool to scope automated access

Turnover is not a subject I enjoy discussing when it comes to security, but it’s the easier introduction. While I think the above is important, it’s arguably the lesser concern.

As our applications and automation tooling like configuration management become more involved and elaborate, we start integrating parts of the AWS API. Whether that’s a web app uploading files to an S3 bucket, a deploy script that starts new EC2 machines, or a chef recipe that allocates new volumes from EBS for a service to use, we become dependent on the API. This is a good thing, of course – the API is really where the win is in using a service like AWS.

However, those credentials have to live somewhere. On disk, in a central configuration store, encrypted, unencrypted, it doesn’t matter. If your automation or app can access it, an attacker that wants it will get it.

Policies let us scope what credentials can do. Does your app syncing assets with S3 and CloudFront need to allocate EBS volumes, or manage Route53 zones? Prrrrrroobably not. If it’s easier to think about this in unix terms, does named need to access the contents of /etc/shadow?

“Well, duh!”, you might say, yet many companies plug owner credentials directly into their S3 or EBS allocation tooling, and then run on EC2 under the same account. We preach loudly about not running as root, but then expose our entire network (not just that machine) to plunder.

Instead, consider assigning policies to different IAM accounts that allow exactly what that tool needs to do, and making those credentials available to that tool. Not only will you mitigate access issues, but it will be clearer when your tooling is trying to do something you didn’t expect it to do by side-effect, just like a service or user on your machine messing with a file you didn’t expect it to.

You can populate these credentials with your favorite configuration management system, or credentials can also be associated with EC2 instances directly, where the metadata is available from an internally-scoped HTTP request.
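For the curious, those instance-associated credentials are just a small JSON document served by the internally-scoped metadata endpoint. A quick Python 2 sketch that reads them (the role name is discovered from the metadata service itself, and the endpoint is only reachable from the instance):

import json
import urllib2

# The EC2 instance metadata service; only reachable from the instance itself.
BASE = 'http://169.254.169.254/latest/meta-data/iam/security-credentials/'

role = urllib2.urlopen(BASE).read().strip()             # name of the attached role
creds = json.loads(urllib2.urlopen(BASE + role).read())

# Temporary keys that AWS rotates automatically; recent boto releases can
# pick these up without any explicit configuration.
print('%s (expires %s)' % (creds['AccessKeyId'], creds['Expiration']))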

Creating a Policy

An IAM policy is just a JSON-formatted file with a simple schema that looks something like this:

{ "Statement": [ { "Sid": "Stmt1355374500916", "Action": [ "ec2:CreateImage" ], "Effect": "Allow", "Resource": "*" } ] }

Some highlights:

  • A Statement is a hash describing a rule.
  • Actions are a 1:1 mapping to AWS API calls. For example, the above statement references the CreateImage API call from the ec2 API.
  • Effect is just how to restrict the Action. Valid values are Allow and Deny.
  • A Resource is an ARN, which is just a qualified namespace. In the EC2 case ARNs have no effect, but you’d use one if you were referring to something like a S3 bucket.

For extended policy documentation, look here.

One of my favorite things about AWS policies is that they’re JSON. This JSON file can be saved in source control and re-used for reference, repeat purposes, or in a DR scenario.
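Because it is just JSON, applying a policy from code is a one-liner with boto. A hedged sketch that attaches the example statement above to a group (the group and policy names are placeholders):

import json

import boto

policy = {
    "Statement": [
        {
            "Sid": "Stmt1355374500916",
            "Action": ["ec2:CreateImage"],
            "Effect": "Allow",
            "Resource": "*",
        }
    ]
}

conn = boto.connect_iam()

# Attach the policy to an existing group so every user in it inherits the rule
# (conn.create_group('image-builders') can create one first).
conn.put_group_policy('image-builders', 'allow-create-image',
                      json.dumps(policy))

# The same document could be attached to a single user instead:
# conn.put_user_policy('deploy-bot', 'allow-create-image', json.dumps(policy))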

AWS itself provides a pretty handy Policy Generator for making this a little easier. You will still want to become familiar with the API calls to write effective policies, but there is also a small collection of example policies to help while you get your feet wet.

Happy Hacking!


AWS EC2 Configuration Management with Chef

Today’s post is a contribution from Joshua Timberman, a Technical Community Manager at Opscode, an avid RPGer and DM extraordinaire, a talented home-brewer, who is always Internet meme and Buzzword compliant.

He shares with us how Chef can help manage your EC2 instances.


In a previous post, we saw some examples about how to get started managing AWS EC2 instances with Puppet and Chef. In this post, we’ll take a deeper look into how to manage EC2 instances with Chef. It is outside the scope of this post to go into great detail about building cookbooks. If you’re looking for more information on working with cookbooks, see the following links

Prerequisites

There are a number of prerequisites for performing the tasks outlined in this post, including:

  • Workstation Setup
  • Authentication Credentials
  • Installing Chef

Workstation Setup

We assume that all commands and work will originate from a local workstation. For example, a company-issued laptop. We’ll take for granted that it is running a supported platform. You’ll need some authentication credentials, and you’ll need to configure knife.

Authentication Credentials

You’ll need the Amazon AWS credentials for your account. You’ll also need to create an SSH key pair to use for your instances. Finally, if you’re using a Chef Server, you’ll need your user key and the “validation” key.

Install Chef

If your local workstation system doesn’t already have Chef installed, Opscode recommends using the “Omnibus package” installers.

Installing the knife-ec2 Plugin

Chef comes with a plugin-based administrative command-line tool called knife. Opscode publishes the knife-ec2 plugin, which extends knife with fog to interact with the EC2 API. This plugin will be used in further examples, and it can be installed as a RubyGem into the “Omnibus” Ruby environment that comes with Chef.

example

sudo /opt/chef/embedded/bin/gem install knife-ec2

If you’re using a different Ruby environment, you’ll need to use the proper gem command.

knife.rb

In order to use knife with your AWS account, it must be configured. The example below uses Opscode Hosted Chef as the Chef Server. It includes the AWS credentials as read in from shell environment variables. This is so the actual credentials aren’t stored in the config file directly.

Normally, the config file lives in ./.chef/knife.rb, where the current directory is a “Chef Repository.” See the knife.rb documentation for more information.

example

The additional commented lines can all be passed to the knife ec2 server create command through its options; see --help for the full list of options.

Launching Instances

Launch instances using knife-ec2’s “server create” command. This command will do the following:

  1. Create an instance in EC2 using the options supplied to the command, and in the knife.rb file.
  2. Wait for the instance to be available on the network, and then wait until SSH is available.
  3. SSH to the instance as the specified user (see command-line options), and perform a “knife bootstrap,” which is a built-in knife plugin that installs Chef and configures it for the Chef Server.
  4. Run chef-client with a specified run list, connecting to the Chef Server configured in knife.rb.

In this example, we’re going to use an Ubuntu 12.04 AMI provided by Canonical in the default region and availability zone (us-east-1, us-east-1d). We’ll use the default instance size (m1.small). We must specify the user that we’ll connect with SSH (-x ubuntu), because it is not the default (root). We also specify the AWS SSH keypair (-S jtimberman). As a simple example, we’ll set up an Apache Web Server with Opscode’s apache2 cookbook with a simple run list (-r 'recipe[apt],recipe[apache2]'), and use the apt cookbook to ensure the APT cache is updated. Then, we specify the security groups so the right firewall rules are opened (-G default,www).

knife ec2 server create -x ubuntu -I ami-9a873ff3 -S jtimberman -G default,www -r 'recipe[apt],recipe[apache2]'

The first thing this command does is talk to the EC2 API and provision a new instance.

The “Bootstrap” Process

What follows will be the output of the knife bootstrap process. That is, it installs Chef, and then runs chef-client with the specified run list.

The registration step is where the “validation” key is used to create a new client key for this instance. On the Chef Server:

On the EC2 instance:

The client.pem file was created by the registration. We can safely delete the validation.pem file now; it is not needed, and there’s actually a recipe for that.

The client.rb looks like this:

The chef_server_url and validation_client_name came from the knife.rb file above. The node name came from the instance ID assigned by EC2. Node names on a Chef Server must be unique; EC2 instance IDs are unique, whereas the FQDN (Chef’s default here) may be recycled from terminated instances.

The ohai directory contains hints for EC2. This is a new feature of the knife-ec2 plugin and ohai, to help better identify cloud instances, since certain environments make it difficult to auto-detect (including EC2 VPC and OpenStack).

Now that the instance has finished its Chef run, it has a corresponding node object on the Chef Server.

In this output, the FQDN is the private internal name, but the IP is the public address. This is so when viewing node data, one can copy/paste the public IP easily.

Managing Instance Lifecycle

There are many strategies out there for managing instance lifecycle in Amazon EC2. They all use different tools and workflows. The knife-ec2 plugin includes a simple “purge” option that will remove the instance from EC2, and if the node name in Chef is the instance ID, will remove the node and API client objects from Chef, too.

Conclusion

AWS EC2 is a wonderful environment to spin up new compute quickly and easily. Chef makes it even easier than ever to configure those instances to do their job. The scope of this post was narrow, to introduce some of the concepts behind the knife-ec2 plugin and how the bootstrap process works, and there’s much more that can be learned.

Head over to the Chef Documentation to read more about how Chef works.

Find cookbooks shared by others on the Chef Community Site.

If you get stuck, the community has great folks available via the IRC channels and mailing lists.


Amazon Elastic Beanstalk

Elastic Beanstalk (EB) is a service which helps you easily manage and deploy your application code into an automated application environment. It handles provisioning AWS resources like EC2 instances, ELB instances, and RDS databases, and lets you focus on writing your code and deploying with a git push style deployment when you’re ready to deploy to development, staging, or production environments.

What does Elastic Beanstalk offer me? FAQ

Getting Started Walkthrough

What Is AWS Elastic Beanstalk and Why Do I Need It?

Pricing

AWS Elastic Beanstalk is free, but the AWS resources that AWS Elastic Beanstalk provides are live (and not running in a sandbox). You will incur the standard usage fees for any resources your environment uses, until you terminate them.

The total charges for the activity we’ll do during this blog post will be minimal (typically less than a dollar). It is possible to do some testing of EB in Free tier by following this guide.

Further Reading on Pricing

Key Concepts

The key concepts when trying to understand and use Elastic Beanstalk are

  • Application
  • Environment
  • Version
  • Environment Configuration
  • Configuration Template

The primary AWS services that Elastic Beanstalk can/will use are

  • Amazon Elastic Compute Cloud (Amazon EC2)
  • Amazon Relational Database Service (Amazon RDS)
  • Amazon Simple Storage Service (Amazon S3)
  • Amazon Simple Notification Service (Amazon SNS)
  • Amazon CloudWatch
  • Elastic Load Balancing
  • Auto Scaling

It’s important to understand each of the main components in Elastic Beanstalk, so let’s explore them in a little more depth.

Application

An AWS Elastic Beanstalk application is a logical collection of AWS Elastic Beanstalk components, including environments, versions, and environment configurations. In AWS Elastic Beanstalk an application is conceptually similar to a folder.

Version

In AWS Elastic Beanstalk, a version refers to a specific, labeled iteration of deployable code. A version points to an Amazon Simple Storage Service (Amazon S3) object that contains the deployable code (e.g., a Java WAR file). A version is part of an application. Applications can have many versions.

Environment

An environment is a version that is deployed onto AWS resources. Each environment runs only a single version; however, you can run the same version or different versions in many environments at the same time. When you create an environment, AWS Elastic Beanstalk provisions the resources needed to run the application version you specified. For more information about the environment and the resources that are created, see Architectural Overview.

Environment Configuration

An environment configuration identifies a collection of parameters and settings that define how an environment and its associated resources behave. When you update an environment’s configuration settings, AWS Elastic Beanstalk automatically applies the changes to existing resources or deletes and deploys new resources (depending on the type of change).

Configuration Template

A configuration template is a starting point for creating unique environment configurations. Configuration templates can be created or modified only by using the AWS Elastic Beanstalk command line utilities or APIs.

Further Reading

Workflow

The typical workflow for using Elastic Beanstalk is that you’ll create one or more environments for a given application. Commonly development, staging, and production environments are created.

As you’re ready to deploy new versions of your application to a given environment, you’ll upload a new version and deploy it to that environment via the AWS console, the CLI tools, an IDE, or an EB API library.

Supported Languages

Elastic Beanstalk currently supports the following languages:

Getting Started

To get started with Elastic Beanstalk, we’ll be using the AWS console.

  1. Login to the console and choose the Elastic Beanstalk service.
  2. Select your application platform, we’ll use Python for this example, then click Start
  3. AWS will begin provisioning you a new application environment. This can take a few minutes since it involves provisioning at least one new EC2 instance. EB is performing a number of steps while you wait, including
  4. Creating an AWS Elastic Beanstalk application named “My First Elastic Beanstalk Application.”
  5. Creating a new application version labeled “Initial Version” that refers to a default sample application file.
  6. Launching an environment named “Default-Environment” that provisions the AWS resources to host the application.
  7. Deploying the “Initial Version” application into the newly created “Default-Environment.”
  8. Once the provisioning is finished you are able to view the default application by expanding the Environment Details and clicking on the URL

At this point we have a deployed EB managed application environment.

Further Reading

Deploying an application

There are two ways to deploy applications to your EB environments

  1. Manually through the AWS console
  2. Using the AWS DevTools, in conjunction with Git or an IDE like Visual Studio or Eclipse.

Manual Deploy

Let’s do a manual update of the application through the console

  1. Since we’re using Python as our example framework, I am using the python sample from the Getting Started walk through
  2. Login to the EB console
  3. Click on the Versions tab
  4. Click Upload New Version
  5. Enter Second Version for the Version Label
  6. Choose the python-secondsample.zip and Upload it
  7. Under Deployment choose Upload, leave the environment set to Default-Environment
  8. Click Upload New Version
  9. You should now see Second Version available on the Versions tab

Now we can deploy the new version of the application to our environment

  1. Check the box next to Second Version
  2. Click the Deploy button
  3. Set Deploy to: to Default-Environment
  4. Click Deploy Version
  5. Below your list of Versions it will now display the Default-Environment.
  6. You can click on the Events tab to watch the progress of this deploy action.
  7. Wait for the “Environment update completed successfully.” event to be logged.

Once the deployment is finished, you can check it by

  1. Clicking on the Overview tab
  2. Expanding your application environment, e.g. Default-Environment
  3. Reviewing the Running Version field.
  4. It should now say Second Version

CLI Deploy

Being able to deploy from the command line and with revision control is ideal. So Amazon has written a set of tools, the AWS DevTools, that integrate with Git to help get this workflow up and running.

Let’s walk through doing a CLI deploy. I am going to assume you already have Git installed.

  1. Get the EB command line tools (AWS DevTools) downloaded and installed
  2. Unzip the python sample into a directory and initialize that directory as a Git repository with git init
  3. Add everything in the directory to the repo with git add * and commit it with git commit -a -m "all the things"
  4. From your Git repository directory, run AWSDevTools-RepositorySetup.sh. You can find AWSDevTools-RepositorySetup.sh in the AWS DevTools/Linux directory. You need to run this script for each Git repository.
  5. Follow the Git setup steps to setup the DevTools with your AWS credentials, application, and environment names.
  6. Edit application.py in your Git repo and add a comment like # I was here
  7. Commit this change with git commit -a -m "I was here"
  8. Push your change to your EB application environment with git aws.push, you can see what this should look like in the example on deploying a PHP app with Git and DevTools
  9. If your push succeeds, you should see the Running Version of your application show the Git SHA1 of your commit.

You can now continue to work and deploy by committing to Git and using git aws.push

Further Reading

Application Environment Configurations

Once you’re familiar and comfortable with deploying applications, you’ll likely want to customize your application environments. Since EB uses EC2 instances running Linux or Windows, you have a certain amount of flexibility in what customizations you can make to the environment.

Customizing Instance Options

You’re able to tweak many options regarding your instances, ELB, Auto-Scaling, and database settings, including:

  • Instance type
  • EC2 security group
  • Key pairs
  • Port and HTTPS options for ELB
  • Auto-Scaling instance settings and AZ preference
  • Setting up your environment to use RDS resources.

To customize these things:

  1. Login to the AWS console
  2. Locate your environment and click on Actions
  3. Make your desired changes to the settings
  4. Click Apply Changes

As mentioned, some changes can be done on the fly, like ELB changes, while others, like changing the instance size or migrating to RDS, require a longer period of time and some application downtime.

Further reading on Environment Customization

Customizing Application Containers

At this time, the following container types support customization

  • Tomcat 6 and 7 (non-legacy container types)
  • Python
  • Ruby 1.8.7 and 1.9.3

Currently, AWS Elastic Beanstalk does not support configuration files for PHP, .NET, and legacy Tomcat 6 and 7 containers.

You’re able to customize a number of things, including

  • Packages
  • Sources
  • Files
  • Users
  • Groups
  • Commands
  • Container_commands
  • Services
  • Option_settings

Further Reading on Application Container Customization

Where to go from here

Now that you’ve used Elastic Beanstalk, seen how to deploy code, and learned how to customize your instances, you may be considering running your own application with Elastic Beanstalk.

Some of the things you’ll want to look into further if you want to deploy your application to EB are:


Bootstrapping Config Management on AWS

When using a cloud computing provider like AWS’s EC2 service, being able to ensure that all of your instances are running the same configuration and being able to know that new instances you create can be quickly configured to meet your needs is critical.

Configuration Management tools are the key to achieving this. In my experience so far, the two most popular open source configuration management tools are PuppetLabs’ Puppet and Opscode’s Chef products. Both are open source, written in Ruby, and you’re able to run your own server and clients without needing to purchase any licensing or support. Both also have vibrant and passionate communities surrounding them. These are the two we will focus on for the purposes of this post.

Getting started with using Puppet or Chef itself and/or building the Puppet or Chef server will not be the focus of this post, but I will provide some good jumping off points to learn more about this. I am going to focus specifically on some techniques for bootstrapping the Puppet and Chef clients onto Linux EC2 instances.

user-data and cloud-init

Before getting into the specifics of bootstrapping each client, let’s take a look at two important concepts/tools for Linux AMIs

user-data

user-data is a piece of instance metadata that is available to your EC2 instances at boot time and during the lifetime of your instance.

At boot time for Ubuntu AMIs and the Amazon Linux AMI, this user-data is passed to cloud-init during the first bootup of the EC2 instance, and cloud-init will read the data and can execute it.

So a common technique for bootstrapping instances is to pass the contents of a shell script to the EC2 API as the user-data, the shell code is executed during boot, as the root user, and your EC2 instance is modified accordingly.

This is the technique we will use to help bootstrap our config management clients.
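Passing the script along is just one more argument to the EC2 RunInstances call. A minimal boto sketch, where the AMI ID, key pair, and security group are placeholder assumptions and the bootstrap script is one of the ones described below:

import boto.ec2

# Read the bootstrap script that cloud-init will run as root on first boot.
with open('puppet-bootstrap.sh') as f:
    user_data = f.read()

conn = boto.ec2.connect_to_region('us-east-1')          # placeholder region
conn.run_instances('ami-xxxxxxxx',                       # placeholder AMI ID
                   key_name='my-keypair',
                   instance_type='m1.small',
                   security_groups=['default'],
                   user_data=user_data)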

cloud-init

cloud-init is the Ubuntu package that handles early initialization of a cloud instance. It is installed in the official Ubuntu images available on EC2 and Amazon also includes it in their Amazon Linux AMI.

It provides a wide variety of built-in functions you can use to customize and configure your instances during bootup, which you send to cloud-init via user-data. It also supports the ability to run arbitrary shell commands.

It’s definitely possible to use cloud-init as a lightweight way to do config management style actions at bootup, but you’re left to build your own tools to make additional modifications to your EC2 instances during their lifecycle.

In our case we’re going to take advantage of user-data and cloud-init to use curl to download a shell script from S3 that takes care of our client bootstrapping, as this technique translates well to any Linux distribution, not just those which include cloud-init. It is also easily re-usable in other cloud provider environments, your own data center, or your home lab/laptop/local dev environment.

Bootstrapping Puppet

To bootstrap Puppet, you’ll need two things

  1. A Puppetmaster where you can sign the certificate the client generates
  2. A shell script, puppet-bootstrap.sh, which installs the Puppet agent and connects it to the puppetmaster

The process of bootstrapping works as follows

  1. You provision an EC2 instance, passing it user-data with the shell script
  2. The EC2 instance runs the puppet-bootstrap.sh script on the instance
  3. The shell script installs the Puppet client, sets the server in puppet.conf, and starts the Puppet service.

puppet-bootstrap.sh

Mitchell Hashimoto of Vagrant fame has recently started an amazing puppet-bootstrap repository on GitHub. Grab the script for your distribution type (RHEL, Debian/Ubuntu, etc.) and save it locally.

Then add the following two lines to the script

echo "server=puppetmaster.you.biz" >> /etc/puppet/puppet.conf echo "listen=true" >> /etc/puppet/puppet.conf

Save the script and pass it in as your user-data.

Client certificate signing

The final step is to sign the client’s certificate on your Puppetmaster.

You can do this on the Puppetmaster with puppet cert sign <nodename> (or puppetca --sign <nodename> on older Puppet releases).

At this point you can give the instance a node definition and begin applying your classes and modules.

Bootstrapping Chef

To bootstrap Chef onto your instances, you’re going to need five things:

  1. A Chef server or Hosted Chef account
  2. A client.rb in an S3 bucket, with what you want your instance default settings to be
  3. Your validation.pem (ORGNAME-validator.pem if you’re using Hosted Chef), in an S3 bucket
  4. A shell script, chef-bootstrap.sh, to install the Omnibus installer and drop your files in place, which you pass in as user-data
  5. An IAM role that includes read access to the above S3 bucket

The process of bootstrapping works as follows

  1. You provision an EC2 instance, passing it the IAM role and user-data with the shell script
  2. The EC2 instance runs the chef-bootstrap.sh script on the instance
  3. The shell script installs the Omnibus Chef client, drops the .pem, and client.rb in place and kicks off the first chef-client run

Creating the IAM Role

To create the IAM role, do the following in the console (a boto sketch follows the list):

  1. Login to the AWS console
  2. Click on the IAM service
  3. Click on Roles, set the Role Name
  4. Click on Create New Role
  5. Select AWS Service Roles, click Select
  6. Select Policy Generator, click Select
  7. In the Edit Permissions options
    1. Set Effect to Allow
    2. Set AWS Service to Amazon S3
    3. For Actions, select ListAllMyBuckets and GetObject
    4. For the ARN, use arn:aws:s3:::BUCKETNAME, e.g. arn:aws:s3:::meowmix
    5. Click Add Statement
  8. Click Continue
  9. You’ll see a JSON Policy Document, review it for correctness, then Click Continue
  10. Click Create Role
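If you prefer to script this instead of clicking through the console, boto’s IAM support can create the role, its policy, and the instance profile that EC2 actually launches with. A rough sketch under those assumptions (the role, policy, and bucket names are placeholders; double-check the method names against your boto version):

import json

import boto

BUCKET = 'meowmix'                       # placeholder, matching the ARN example above
ROLE = 'chef-bootstrap'                  # placeholder role name

# Trust policy that lets EC2 instances assume the role.
assume_role = {
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }]
}

# Access policy equivalent to the console steps above. Note that GetObject
# applies to objects, so you may also want "arn:aws:s3:::%s/*" % BUCKET here.
s3_read = {
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:ListAllMyBuckets", "s3:GetObject"],
        "Resource": "arn:aws:s3:::%s" % BUCKET,
    }]
}

conn = boto.connect_iam()
conn.create_role(ROLE, assume_role_policy_document=json.dumps(assume_role))
conn.put_role_policy(ROLE, 's3-read', json.dumps(s3_read))

# EC2 launches instances with an instance profile, which wraps the role.
conn.create_instance_profile(ROLE)
conn.add_role_to_instance_profile(ROLE, ROLE)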

Files on S3

There are many tools for uploading the files mentioned to S3, including the AWS console. I’ll leave the choice of tool up to the user.

If you’re not familiar with uploading to S3, see the Getting Started Guide.

chef-bootstrap.sh

The chef-bootstrap.sh script is very simple; an example is included in the GitHub repo and shown below. The .pem and client.rb are geared towards Hosted Chef.

client.rb

The client.rb is very simple; an example is included in the GitHub repo and shown below. This client.rb is geared towards Hosted Chef.

At this point you’ll have a new EC2 instance that’s bootstrapped with the latest Omnibus Chef client and is connected to your Chef Server. You can begin applying roles, cookbooks, etc. to your new instance(s) with knife.

In conclusion

You’ve now seen some ideas and two practical applications of automating as much of the configuration management bootstrapping process as is easily possible with Puppet and Chef. These can be easily adapted for other distributions and tools and customized to suit your organization’s needs and constraints.


Options for Automating AWS

As we’ve seen in previous posts, boto and CloudFormation are both options for helping automate your AWS resources, and can even complement each other.

But not everyone will want to use Amazon’s CFN (which we covered in depth in the day 6 post) or a Python library, so I thought we’d explore some of the options for automating your usage of AWS in various programming languages.

Python – boto, libcloud

Python has a few options for libraries. The most actively developed and used ones I’ve seen are boto and libcloud.

Boto

Boto is meant to be a Python library for interacting with AWS services. It mirrors the AWS APIs in a Pythonic fashion and gives you the ability to build tools in Python on top of it, to manipulate and manage your AWS resources.

The project is led by Mitch Garnaat, who is currently a Sr. Engineer at Amazon.
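As a small taste of what working with boto looks like, here is a hedged example that lists your EC2 instances; the region is an assumption, and credentials come from ~/.boto or environment variables:

import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')

# get_all_instances() returns reservations, each of which holds instances.
for reservation in conn.get_all_instances():
    for instance in reservation.instances:
        print('%s  %s  %s' % (instance.id, instance.state, instance.ip_address))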

Boto has a number of tutorials to get you started, including

Currently the following AWS services are supported:

Compute

  • Amazon Elastic Compute Cloud (EC2)
  • Amazon Elastic Map Reduce (EMR)
  • AutoScaling
  • Elastic Load Balancing (ELB)

Content Delivery

  • Amazon CloudFront

Database

  • Amazon Relational Data Service (RDS)
  • Amazon DynamoDB
  • Amazon SimpleDB

Deployment and Management

  • AWS Identity and Access Management (IAM)
  • Amazon CloudWatch
  • AWS Elastic Beanstalk
  • AWS CloudFormation

Application Services

  • Amazon CloudSearch
  • Amazon Simple Workflow Service (SWF)
  • Amazon Simple Queue Service (SQS)
  • Amazon Simple Notification Service (SNS)
  • Amazon Simple Email Service (SES)

Networking

  • Amazon Route53
  • Amazon Virtual Private Cloud (VPC)

Payments and Billing

  • Amazon Flexible Payment Service (FPS)

Storage

  • Amazon Simple Storage Service (S3)
  • Amazon Glacier
  • Amazon Elastic Block Store (EBS)
  • Google Cloud Storage

Workforce

  • Amazon Mechanical Turk

Other

  • Marketplace Web Services

libcloud

libcloud is a mature cloud provider library that is an Apache project. It’s meant to provide a Python interface to multiple cloud providers, with AWS being one of the first it supported and among the most mature of the providers that libcloud supports.

libcloud is organized around four components

  • Compute – libcloud.compute.*
  • Storage – libcloud.storage.*
  • Load balancers – libcloud.loadbalancer.*
  • DNS – libcloud.dns.*

Given the above components and my review of the API docs, libcloud effectively supports the following AWS services

  • EC2
  • S3
  • Route53
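A minimal libcloud example for EC2 looks something like the following; the credentials are placeholders, and provider constants have moved around between libcloud releases, so treat this as a sketch:

from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

# Provider.EC2 selects the EC2 driver (US East in this era of libcloud).
Driver = get_driver(Provider.EC2)
conn = Driver('ACCESS_KEY_ID', 'SECRET_ACCESS_KEY')     # placeholder credentials

for node in conn.list_nodes():
    print('%s  %s  %s' % (node.name, node.state, node.public_ips))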

If you’re interested in learning more about libcloud, take a look at the Getting Started guide and the API documentation

Ruby – fog, aws-sdk gem

The main Ruby options seem to be Fog and the aws-sdk gem.

Fog

Similar to libcloud, Fog’s goal is to be a mature cloud provider library with support for many providers. It provides a Ruby interface to them, with AWS being one of the first it supported and among the most mature of the providers that Fog supports. It’s also used to provide EC2 support for Opscode Chef’s knife.

Fog is organized around four components

  • Compute
  • Storage
  • CDN
  • DNS

Based on a review of the supported services list and the aws library code, Fog currently has support for all the major AWS services.

If you’re interested in learning more about Fog, take a look at the Getting Started tutorial and the source code

aws-sdk

The aws-sdk gem is the official gem from Amazon that’s meant to help Ruby developers integrate AWS services into their applications, with special support for Rails applications in particular.

It currently supports the following AWS services:

  • Amazon Elastic Compute Cloud (EC2)
  • Amazon SimpleDB (SDB)
  • Amazon Simple Storage Service (S3)
  • Amazon Simple Queue Service (SQS)
  • Amazon Simple Notification Service (SNS)

If you’re interested in learning more about the Ruby SDK, see the Getting Started guide and the FAQ.

Java – jclouds, AWS SDK for Java

The Java world has a number of options, including jclouds and the official SDK for Java.

jclouds

jclouds is a Java and Clojure library whose goal is to be a mature cloud provider library with support for many providers. It provides a Java interface to them; AWS was one of the first providers it supported and remains among the most mature that jclouds supports.

jclouds is organized into two main components

  • Compute API
  • Blobstore API

jclouds currently has support for the following AWS services

  • EC2
  • SQS
  • EBS
  • S3
  • CloudWatch

SDK for Java

The SDK for Java is the official Java library from Amazon that’s meant to help Java developers integrate AWS services into their applications.

It currently supports all the AWS services.

If you’re interested in learning more about the Java SDK, see the Getting Started guide and the API documentation.

PHP – AWS SDK for PHP

The only full-featured PHP library I could find was the official SDK for PHP.

The SDK for PHP is the official PHP library from Amazon that’s meant to help PHP developers integrate AWS services into their applications.

It currently supports all the AWS services.

If you’re interested in learning more about the PHP SDK, see the Getting Started guide and the API documentation.

JavaScript – AWS SDK for Node.js, aws-lib

There seem to be two JavaScript options: the AWS SDK for Node.js and aws-lib.

SDK for Node.js

The SDK for Node.js is the official JavaScript library from Amazon, meant to help JavaScript and Node.js developers integrate AWS services into their applications. This SDK is currently considered a developer preview.

It currently supports the following AWS services

  • EC2
  • S3
  • DynamoDB
  • Simple Workflow

If you’re interested in learning more about the Node.js SDK, see the Getting Started guide and the API documentation.

aws-lib

aws-lib is a simple Node.js library to communicate with the Amazon Web Services API.

It currently supports the following services

  • EC2
  • Product Advertising API
  • SimpleDB
  • SQS (Simple Queue Service)
  • SNS (Simple Notification Service)
  • SES (Simple Email Service)
  • ELB (Elastic Load Balancing Service)
  • CW (CloudWatch)
  • IAM (Identity and Access Management)
  • CFN (CloudFormation)
  • STS (Security Token Service)
  • Elastic MapReduce

If you’re interested in learning more about aws-lib, see the Getting started page and read the source code.


AWS Backup Strategies

Inspired by today’s SysAdvent post on Backups for Startups, I wanted to discuss some backup strategies for various AWS services.

As the Backups for Startups post describes:

a backup is an off-line point-in-time snapshot – nothing more and nothing less. A backup is not created for fault tolerance. It is created for disaster recovery.

There are three common backup methods for achieving these point-in-time snapshots:

  • Incremental
  • Differential
  • Full

The post explains each as well as I could, so I’m just going to share how Joseph Kern describes them:

Incremental backups are all data that has changed since the last incremental or full backup. This has benefits of smaller backup sizes, but you must have every incremental backup created since the last full. Think of this like a chain, if one link is broken, you will probably not have a working backup.

Differential backups are all data that has changed since the last full backup. This still benefits by being smaller than a full, while removing the dependency chain needed for pure incremental backups. You will still need the last full backup to completely restore your data.

Full backups are all of your data. This benefits from being a single source restore for your data. These are often quite large.

A traditional scheme uses daily incremental backups with weekly full backups, holding the fulls for two weeks. In this way your longest restore chain is seven media (one weekly full plus six daily incrementals), while your shortest restore chain is a single medium (one weekly full).

Another similar method uses daily differentials with weekly fulls. Your longest chain is just two media (one differential and one full), while your shortest is still just a single full backup.

The article also has some good suggestions on capacity planning and cost estimation, which I suggest you review before implementing the AWS backup strategies we’ll learn in this post.

Let’s explore, at a high level, how we can apply these backup methods to some of the most commonly used AWS services. A future post will provide some hands-on examples of using specific tools and code to do some of these kinds of backups.

Backing up Data with S3 and Glacier

Amazon S3 has been a staple of backups for many organizations for years. People often use S3 even when they don’t use any other AWS services, because it provides a simple and cost-effective way to store data redundantly off-site. A couple of months ago Amazon introduced its Glacier service, which provides archival storage at very low cost, at the expense of slow (multi-hour) retrieval times. Amazon recently integrated S3 and Glacier to provide the best of both worlds through one API.

S3 is composed of two things: buckets and objects. A bucket is a container for objects stored in Amazon S3. Every object is contained in a bucket, and each object is available via a unique HTTP URL. You’re able to manage access to your buckets and objects through a variety of tools, including IAM policies, bucket policies, and ACLs.

As described above, you’re going to want your backup strategy to include full backups, at least weekly, and either incremental or differential backups on at least a daily basis. This will provide you with a number of point-in-time recovery options in the event of a disaster.

Getting data into S3

There are a number of options for getting data into S3 for backup purposes. If you want to roll your own scripts, you can use one of the many AWS libraries to write code that stores your data in S3, performs full, incremental, and differential backups, and purges data older than your retention period (e.g. 30, 60, or 90 days).
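
As a rough illustration of the roll-your-own approach, here's a sketch using boto that uploads a full backup archive to S3 and purges objects older than 30 days. The bucket name, key prefix, and archive path are hypothetical.

    import boto
    from boto.s3.key import Key
    from datetime import datetime, timedelta

    conn = boto.connect_s3()
    bucket = conn.get_bucket('example-backup-bucket')

    # Upload today's full backup under a dated key.
    key = Key(bucket)
    key.key = 'full/%s.tar.gz' % datetime.utcnow().strftime('%Y%m%d')
    key.set_contents_from_filename('/var/backups/full.tar.gz')

    # Purge anything under the full/ prefix older than the retention period.
    cutoff = datetime.utcnow() - timedelta(days=30)
    for k in bucket.list(prefix='full/'):
        # last_modified is an ISO 8601 string, e.g. 2012-12-14T06:00:00.000Z
        modified = datetime.strptime(k.last_modified, '%Y-%m-%dT%H:%M:%S.%fZ')
        if modified < cutoff:
            bucket.delete_key(k.key)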

If you’re using a Unix-like operating system, tools like s3cmd, duplicity (used with its S3 backend), or Amanda Backup (used with its S3 integration) provide a variety of options for using S3 as your backup storage. These tools take care of a lot of the heavy lifting around the full, incremental, and differential dance, as well as purging data beyond your retention period. Each has pros and cons in terms of implementation details and complexity versus ease of use.

If you’re using Windows, tools like Cloudberry S3 Backup Server Edition (a commercial tool), S3Sync (a free tool), or S3.exe (an open source CLI tool) offer similar options, with the same trade-offs between implementation details, complexity, and ease of use.

Managing the amount of data in S3

To implement a cost-effective backup strategy with S3, I recommend that you take advantage of the Glacier integration when creating the lifecycle policies for each of your buckets. This effectively automates moving older data into Glacier and handles purging data beyond your retention period automatically.
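
As a sketch of what that looks like with boto, the snippet below transitions older backups to Glacier and then expires them. The bucket name, prefix, and day counts are hypothetical, and the lifecycle classes vary a bit between boto versions.

    import boto
    from boto.s3.lifecycle import Lifecycle, Rule, Transition

    conn = boto.connect_s3()
    bucket = conn.get_bucket('example-backup-bucket')

    # Move objects under backups/ to Glacier after 30 days and
    # delete them entirely after 90 days.
    to_glacier = Transition(days=30, storage_class='GLACIER')
    rule = Rule(id='archive-and-expire', prefix='backups/', status='Enabled',
                expiration=90, transition=to_glacier)

    lifecycle = Lifecycle()
    lifecycle.append(rule)
    bucket.configure_lifecycle(lifecycle)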

Backing up EC2 instance data

When considering how to back up your EC2 instance data, there are a number of considerations, including the amount of data that needs to be backed up. Ideally things like your application source code, service configurations (e.g. Apache, Postfix, MySQL), and your configuration management code are already stored in a version control system such as Git (and hosted on GitHub or Bitbucket), so they’re effectively backed up already. But that can still leave a lot of application data on file systems and in databases that needs to be backed up.

For this I’d suggest a two-pronged approach using EBS and S3. Since EBS has built-in support for snapshots, I suggest using EBS volumes as a place to store a near-real-time copy of your application data and properly quiesced dumps of your database data, then using snapshots to provide a sensible number of recovery points for quickly restoring data. Getting the data from ephemeral disks or your primary EBS volumes onto these secondary volumes can easily be done with some scripting and tools like rsync or robocopy.
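
A sketch of the snapshot side with boto, assuming a hypothetical secondary volume ID and a simple keep-the-last-seven pruning policy:

    import boto.ec2
    from datetime import datetime

    conn = boto.ec2.connect_to_region('us-east-1')
    volume_id = 'vol-12345678'  # hypothetical secondary EBS volume

    # Take a snapshot of the backup volume.
    conn.create_snapshot(volume_id,
                         'backup %s' % datetime.utcnow().strftime('%Y-%m-%d %H:%M'))

    # Prune older snapshots, keeping the most recent seven recovery points.
    snapshots = conn.get_all_snapshots(owner='self',
                                       filters={'volume-id': volume_id})
    for old in sorted(snapshots, key=lambda s: s.start_time)[:-7]:
        conn.delete_snapshot(old.id)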

Secondly, using one of the previously discussed tools, you should create longer-term archives from the secondary EBS volumes to S3, and optionally use lifecycle policies on your S3 buckets to move data into Glacier for your longest-term archives.

This approach involves some cost and complexity, but will provide you with multiple copies of your data and multiple options for recovery with different speed trade-offs. Specific implementation details are left as an exercise for the reader and some pragmatic examples will be part of a future post.

Backing up RDS data and configs

RDS provides built-in backup and snapshotting features to help protect your data. As discussed in the RDS post, I recommend you deploy RDS instances in a multi-AZ scenario whenever possible, as this reduces the uptime impact of performing backups.

RDS has a built-in automated backup feature that performs daily backups at a scheduled time and retains them for up to 35 days, with the caveat that the backup causes an I/O pause of your RDS instance during the snapshot. These backups are stored on S3 for additional protection against data loss.

RDS also supports user-initiated snapshots at any point in time, with the same caveat that the snapshot causes an I/O pause of your RDS instance, which can be mitigated with a multi-AZ deployment. These snapshots are also stored on S3 for additional protection against data loss.
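
Taking a user-initiated snapshot from a script is straightforward with boto; the instance and snapshot identifiers below are hypothetical.

    import boto.rds
    from datetime import datetime

    conn = boto.rds.connect_to_region('us-east-1')

    # Create a manual snapshot of a hypothetical instance named mydbinstance.
    snapshot_id = 'mydb-%s' % datetime.utcnow().strftime('%Y%m%d%H%M')
    conn.create_dbsnapshot(snapshot_id, 'mydbinstance')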

Additionally, because of how RDS instances do transaction logging, you’re able to do point-in-time restores to any point within the automated backup recovery window.

The only potential downside to these backup and snapshot features is that they’re isolated to the region your RDS instances run in. To keep DR copies of your database data in another region, you’ll need to build your own solution. One relatively low-cost approach is to run a t1.micro in another region with a scheduled job that connects to your main RDS instance, performs a native SQL dump to local storage, and then uploads that dump to S3 storage in the DR region. This kind of solution can have performance and cost implications for large amounts of database data, so consider it carefully before implementing.
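
Here's a rough sketch of what that scheduled job might look like for a MySQL-backed RDS instance, using mysqldump and boto. The RDS endpoint and bucket name are hypothetical, and MySQL credentials are assumed to live in the backup user's ~/.my.cnf.

    import subprocess
    import boto.s3
    from datetime import datetime

    DB_HOST = 'mydb.abc123xyz.us-east-1.rds.amazonaws.com'  # hypothetical RDS endpoint
    DUMP_FILE = '/var/backups/mydb-%s.sql.gz' % datetime.utcnow().strftime('%Y%m%d')

    # Native SQL dump over the network; credentials come from ~/.my.cnf.
    subprocess.check_call(
        'mysqldump -h %s --single-transaction mydb | gzip > %s' % (DB_HOST, DUMP_FILE),
        shell=True)

    # Upload the dump to an S3 bucket in the DR region.
    s3 = boto.s3.connect_to_region('us-west-2')
    bucket = s3.get_bucket('example-dr-db-backups')
    key = bucket.new_key('mydb/%s' % DUMP_FILE.split('/')[-1])
    key.set_contents_from_filename(DUMP_FILE)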

Backing up AWS service configurations

While Amazon has built their services to be highly available and protect your data, it’s always important to ensure you have your own backups of any critical data.

Services like Route53 or Elastic Load Balancing (ELB) don’t store application data, but they do store data critical to rebuilding your application infrastructure in the event of a major failure, or if you’re trying to do disaster recovery in another region.

Since these services are all accessible through HTTP APIs, there are opportunities to roll your own backups of your AWS configuration data.

Route 53

With Route 53, you could get a list of your hosted zones, then the details of each zone, and finally the details of each DNS record. Once you have all this data, you can save it in a text format of your choice and upload it to S3 in another region. A Ruby implementation of this idea is already available.
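
If you'd rather roll your own in Python, a sketch with boto might look like this. The DR bucket name is hypothetical, and the nested dictionary layout follows boto 2's raw Route 53 responses, which can differ between versions.

    import json
    import boto
    import boto.s3

    route53 = boto.connect_route53()
    dump = []

    # Walk every hosted zone and record set in the account.
    zones = route53.get_all_hosted_zones()['ListHostedZonesResponse']['HostedZones']
    for zone in zones:
        zone_id = zone['Id'].replace('/hostedzone/', '')
        records = [{'name': r.name, 'type': r.type, 'ttl': r.ttl,
                    'values': r.resource_records}
                   for r in route53.get_all_rrsets(zone_id)]
        dump.append({'zone': zone['Name'], 'records': records})

    # Store the dump in an S3 bucket in another region.
    s3 = boto.s3.connect_to_region('us-west-2')
    bucket = s3.get_bucket('example-dr-config-backups')
    bucket.new_key('route53.json').set_contents_from_string(json.dumps(dump, indent=2))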

ELB

With ELB, you could get a list of all your load balancer instances, store their configuration in a text format of your choice, and finally upload it to S3 in another region. I did not find any existing implementations with some quick searching, but one could quickly be built using the AWS library of your choice.
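
A minimal boto sketch along those lines; the attribute names follow boto 2's load balancer objects, cover only the basics, and the bucket name is hypothetical.

    import json
    import boto.ec2.elb
    import boto.s3

    elb = boto.ec2.elb.connect_to_region('us-east-1')
    dump = []

    # Capture enough configuration to help recreate each load balancer.
    for lb in elb.get_all_load_balancers():
        dump.append({
            'name': lb.name,
            'dns_name': lb.dns_name,
            'availability_zones': lb.availability_zones,
            'instances': [i.id for i in lb.instances],
            'listeners': [(l.load_balancer_port, l.instance_port, l.protocol)
                          for l in lb.listeners],
        })

    s3 = boto.s3.connect_to_region('us-west-2')
    bucket = s3.get_bucket('example-dr-config-backups')
    bucket.new_key('elb.json').set_contents_from_string(json.dumps(dump, indent=2))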

Summary

In summary, there are a number of great options for building a backup strategy and implementation that meets your organization’s retention, disaster recovery, and cost needs. Most of these options are free and/or open source, and they can be implemented in a highly automated fashion.

In a future post we’ll get hands-on about implementing some of these ideas in an automated fashion with the Boto Python library for AWS.


Amazon Simple Notification Service

Amazon Simple Notification Service (SNS) is a web service that helps you easily publish and deliver notifications to a variety of endpoints in an automated and low-cost fashion. SNS currently supports sending messages to Email, SMS, HTTP/S, and SQS queue endpoints.

You’re able to use SNS through the AWS console, the SNS CLI tools or through the SNS API.

The moving parts

SNS is composed of three main parts

  1. A topic
  2. A subscription
  3. Published messages

A topic is a communication channel used to send messages and subscribe to notifications. Once you create a topic, you’re provided with a topic ARN (Amazon Resource Name), which you’ll use for subscriptions and for publishing messages.

A subscription ties a specific endpoint to a topic. The endpoint can be a web service, an email address, or an SQS queue.

Published messages are generated by publishers, which can be scripts calling the SNS API, users in the AWS console, or the CLI tools. Once a new message is published, Amazon SNS attempts to deliver it to every endpoint subscribed to the topic.

Costs

SNS has a number of cost factors, including API requests, notifications to HTTP/S, notifications to Email, notifications to SMS, and data transferred out of a region.

You can get started with SNS under AWS’s Free Usage tier, though, so you won’t have to pay to play right away.

Using SNS

To get started with using SNS, we’ll walk through making a topic, creating an email subscription, and publishing a message, all through the AWS console.

Making a topic

  1. Login to the AWS Console
  2. Click Create New Topic.
  3. Enter a topic name in the Topic Name field.
  4. Click Create Topic.
  5. Copy the Topic ARN for the next task.

You’re now ready to make a subscription.

Creating an email subscription

  1. In the AWS Console click on My Subscriptions
  2. Click the Create New Subscription button.
  3. In the Topic ARN field, paste the topic ARN you created in the previous task, for example: arn:aws:sns:us-east-1:054794666397:MyTopic.
  4. Select Email in the Protocol drop-down box.
  5. Enter your email address for the notification in the Endpoint field.
  6. Click Subscribe.
  7. Go to your email and open the message from AWS Notifications, and then click the link to confirm your subscription.
  8. You should see a confirmation response from SNS.

You’re now ready to publish a message.

Publishing a message

  1. In the AWS Console click the topic you want to publish to, under My Topics in the Navigation pane.
  2. Click the Publish to Topic button.
  3. Enter a subject line for your message in the Subject field.
  4. Enter a brief message in the Message field.
  5. Click Publish Message.
  6. A confirmation dialog box will appear, click Close to close the confirmation dialog box.
  7. You should get the email shortly.

The SNS documentation has more details on each of these tasks.

Automating SNS

You’ve learned how to manually work with SNS, but as with all AWS services, things are best when automated.

Building on Day 4’s post, Getting Started with Boto, we’ll walk through automating SNS with some boto scripts.

Making a topic

This script will connect to us-west-2 and create a topic named adventtopic.

If the topic is successfully created, it will return the topic ARN. Otherwise, it will log any errors to sns-topic.log.

sns-topic.py
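
The script itself was published separately; here's a minimal sketch of the same idea with boto. The exact layout of the response dictionary can differ slightly between boto versions.

    #!/usr/bin/env python
    # sns-topic.py -- minimal sketch: create an SNS topic and print its ARN.
    import logging
    import boto.sns

    logging.basicConfig(filename='sns-topic.log', level=logging.ERROR)

    try:
        conn = boto.sns.connect_to_region('us-west-2')
        response = conn.create_topic('adventtopic')
        # boto 2 returns the raw response as nested dictionaries.
        arn = response['CreateTopicResponse']['CreateTopicResult']['TopicArn']
        print(arn)
    except Exception:
        logging.exception('Failed to create SNS topic')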

Making an email subscription

This script will connect to us-west-2 and create an email subscription to the topic named adventtopic for the email address you specify.

If the subscription is successfully created, it will return the topic ARN. Otherwise, it will log any errors to sns-topic.log.

  • Note: You’ll need to manually confirm the subscription in your email client before moving on to the script for publishing a message.

sns-email-sub.py
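
Again, a minimal sketch of the idea; the topic ARN and email address are passed on the command line, and the example ARN is a placeholder.

    #!/usr/bin/env python
    # sns-email-sub.py -- minimal sketch: subscribe an email address to a topic.
    import sys
    import logging
    import boto.sns

    logging.basicConfig(filename='sns-topic.log', level=logging.ERROR)

    topic_arn = sys.argv[1]  # e.g. arn:aws:sns:us-west-2:123456789012:adventtopic
    email = sys.argv[2]

    try:
        conn = boto.sns.connect_to_region('us-west-2')
        conn.subscribe(topic_arn, 'email', email)
        print(topic_arn)
    except Exception:
        logging.exception('Failed to create email subscription')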

Publishing a message

This script will connect to us-west-2 and publish a message to the topic named adventtopic with the subject and message body you specify.

If the publication is successfully performed, it will return the topic ARN. Otherwise, it will log any errors to sns-publish.log.

sns-publish.py
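
And one more sketch, taking the topic ARN, subject, and message body as command line arguments:

    #!/usr/bin/env python
    # sns-publish.py -- minimal sketch: publish a message to a topic.
    import sys
    import logging
    import boto.sns

    logging.basicConfig(filename='sns-publish.log', level=logging.ERROR)

    topic_arn, subject, message = sys.argv[1], sys.argv[2], sys.argv[3]

    try:
        conn = boto.sns.connect_to_region('us-west-2')
        conn.publish(topic=topic_arn, message=message, subject=subject)
        print(topic_arn)
    except Exception:
        logging.exception('Failed to publish message')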

Final thoughts

At this point you’ve successfully automated the main use cases for SNS.

As you can see, SNS can be very useful for sending notifications and with a little automation, can quickly become a part of your application infrastructure toolkit.

All the code samples are available in the GitHub repository.