Using Infrastructure as Code as a Poor Man’s DR

09. December 2018

What is DR?

Let’s start by setting the context for what I mean by Disaster Recovery (DR). There are different interpretations of DR and High Availability, with a very thin and moving line between the two. Here I am specifically referring to the ability to recover your infrastructure from a disaster, such as an AWS region becoming unavailable. I am not talking about situations where immediate failover is needed.

What is Infrastructure as Code?

Infrastructure as Code (IaC) has been around for a while now, but many people are only just starting to fully embrace it and see its benefits. Like DR, it is a term that people have taken to mean many different things over the years. Some refer to BASH scripts that generate KVM VMs as IaC. While that is infrastructure created by code in the most literal sense, it is not what I mean here. I am specifically talking about tools such as Terraform that are designed to generate infrastructure from a configuration file.

Why IaC for DR?

Go to any company or organization that does not have a viable DR strategy and ask them why that is the case. Nine times out of ten, the answer will relate to cost. That makes sense: a true DR environment can be very expensive. Additionally, for people who are not technical and have never experienced a true IT disaster, it can be tough to comprehend why any of it is needed. These factors make it very difficult for IT to get approval to put a secondary environment in place. This is where IaC comes in.

If your IaC is properly set up, you can essentially get DR for free. How? If a disaster takes out your infrastructure, you simply re-run your code, and you have your infrastructure back.

But wait, my IaC tool deploys to the AWS region that is down

If your tool is improperly configured, you may not benefit from the DR capabilities of IaC. You need to make sure that you abstract the providers and regions out of the actual infrastructure configuration. This allows you to quickly change the region you are pointing at and re-deploy. For example, in Terraform you would want a separate provider.tf file that has a provider section with the region specified, like this: provider "aws" { region = "eu-west-1" }. This allows you to change a single line and re-deploy your exact infrastructure to another region, as opposed to having the region information embedded in individual .tf files, which unfortunately I see floating around pretty often.
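To take that one step further, the region can live in a variable, so failing over does not require editing any file at all. A minimal sketch (the variable name here is my own choice, not a Terraform convention):

```hcl
# provider.tf -- the only place that knows which region we deploy to.
variable "region" {
  description = "AWS region to deploy into; override it to fail over"
  default     = "eu-west-1"
}

provider "aws" {
  region = var.region
}
```

With this in place, re-deploying the exact same infrastructure into another region is a single command, for example terraform apply -var="region=eu-central-1".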

What if all of AWS (or GCP, or Azure) completely goes down and not just a region?

A complete outage of a service provider is another concern that I hear from time to time. I have a couple of different thoughts about this scenario.

My first thought is that the chances of that happening are so vanishingly small that it is hardly worth thinking about. If all of AWS is down, you likely have bigger problems, like worldwide thermonuclear war. However, as engineers, sysadmins, and assorted other IT professionals, we have a habit of being unable to stop thinking about these extreme cases.

For those situations, you can have standby code. What do I mean by this? I mean that you can develop code that deploys the equivalent of your current infrastructure in another environment, as sketched below. This is obviously time-consuming, and since none of us has a ton of spare time, that is a cost, and one that I personally don’t think is worth it. But it is possible, and it is up to each reader to decide whether it is needed for their environment.
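For what it’s worth, here is a minimal sketch of standby code, assuming Terraform with the Google Cloud provider and a single small web server as the workload (every name and size below is a hypothetical placeholder, not a mapping of any real AWS estate):

```hcl
# standby.tf -- written and reviewed ahead of time, applied only if AWS is unusable.
provider "google" {
  project = "my-dr-project"    # hypothetical GCP project ID
  region  = "europe-west1"
}

# A rough stand-in for a small AWS web server instance.
resource "google_compute_instance" "web" {
  name         = "dr-web-1"
  machine_type = "e2-small"
  zone         = "europe-west1-b"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = "default"    # the default network; a real setup would define its own
  }
}
```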

Ok, I have my infrastructure back, what about my data?

Well, you are still doing backups, right? I am making a case for replacing a dedicated DR environment with code; I am not making a case for throwing basic common sense out the window.

That being said, there are times when restoring data from backups would take an impractically long time relative to a 1-2 hour outage, especially when you can re-deploy your infrastructure from code to another region in minutes.

This is where I advocate a hybrid approach between a complete IaC DR plan and a traditional DR setup. In this type of solution, you would have a replicated database (or other data source) running at all times in the region you plan to use for DR purposes. Then, if disaster strikes, your data is sitting there just waiting for you to deploy the networking and compute resources to access it. A minimal sketch of the always-on piece follows.
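In Terraform, for example, the always-on piece can be as small as a cross-region read replica. Assuming RDS, with the regions, identifiers, and ARN below as placeholders:

```hcl
# A second AWS provider aliased to the DR region.
provider "aws" {
  alias  = "dr"
  region = "eu-central-1"
}

# Cross-region read replica of the primary database (the ARN is a placeholder).
# If disaster strikes, promote the replica and deploy compute next to it.
resource "aws_db_instance" "dr_replica" {
  provider            = aws.dr
  identifier          = "app-db-dr"
  replicate_source_db = "arn:aws:rds:eu-west-1:123456789012:db:app-db"
  instance_class      = "db.t3.medium"
  skip_final_snapshot = true
}
```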

Since this does require keeping some infrastructure running at all times, it does cost some money. However, it will cost far less than having a whole second DR site sitting around waiting, and it may be an easier pill to swallow for the people who have to spend the money.

Conclusion

I hope that after reading this article you understand how IaC can serve as a DR environment in situations where a dedicated one would otherwise not be feasible. I further hope that you see the benefits of this solution in situations where a full disaster recovery environment is possible but perhaps not needed. That money could be better spent elsewhere if you already have IaC in place to cover the worst-case scenario.

What’s next

This article introduces the topic to get readers thinking about Infrastructure as Code as a possible disaster recovery plan and solution. It only begins to scratch the surface of what is possible and of the different considerations that need to be made.

Please visit my website at https://www.adair.tech over the next several weeks, as I will publish follow-up articles there that delve further into the details, tools, and specific plans for accomplishing this. You can also contact me via my website or Twitter to have a one-on-one conversation on the topic and explore your particular use case in more depth.

About the Author

Brad Adair is an experienced IT professional with over a decade of experience in systems engineering and administration, cloud engineering and architecture, and IT management. He is the President of Adair Technology, LLC, a Columbus-based IT consulting firm specializing in AWS architecture and other IT infrastructure consulting, and he is an AWS Certified Solutions Architect. Outside of the office he enjoys sports, politics, Disney World, and spending time with his wife and kids.

About the Editor

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.


Deploy your AWS Infrastructure Continuously

01. December 2016

Author: Michael Wittig

Continuously integrating and deploying your source code is the new standard in many successful internet companies. But what about your infrastructure? Can you deploy a change to your infrastructure in an automated way? Can you run automated tests on your infrastructure to ensure that a change has no unintended side effects? In this post, I will show you how to apply the same processes to your AWS infrastructure that you apply to your source code. You will learn how the AWS services CloudFormation, CodePipeline, and Lambda can be combined to continuously deploy infrastructure.

Precondition

You may think: “Source code is text files, but my infrastructure is different. I don’t have a source file for my infrastructure.” Infrastructure as Code, as defined by Martin Fowler, is a concept that brings software development practices to infrastructure:

Infrastructure as code is the approach to defining computing and network infrastructure through source code that can then be treated just like any software system.
– Martin Fowler

AWS CloudFormation is one implementation of Infrastructure as Code. CloudFormation is a high-quality, free service offered by AWS. To understand CloudFormation you need to know about templates and stacks. The template is the source code: a textual representation of your infrastructure. The stack is the actual running infrastructure described by the template. So a CloudFormation template is exactly what we need: a plain text file. The CloudFormation service interprets the template and turns it into a running infrastructure.
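To make templates and stacks concrete, here is a minimal illustrative template of my own (not part of the pipeline discussed below) describing a single S3 bucket:

```yaml
# template.yaml -- a minimal CloudFormation template (the source code).
AWSTemplateFormatVersion: '2010-09-09'
Description: A tiny piece of infrastructure described as code
Resources:
  ArtifactBucket:            # logical ID used to reference the resource
    Type: AWS::S3::Bucket    # the actual resource the stack will contain
```

Creating a stack from this file, for example with aws cloudformation deploy --template-file template.yaml --stack-name demo, turns the text into a real bucket; deleting the stack removes it again.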

Now our infrastructure is defined by a text file, which is exactly what we need to apply the same processes to it that we have for source code.

The Pipeline

The pipeline to build and deploy is the sequence of steps necessary to ship changes to your users, starting with a change in the code repository and ending in your production environment. The following figure shows a pipeline that runs inside AWS CodePipeline, the AWS continuous delivery service.

[Figure: AWS CodePipeline deploying infrastructure continuously]

Whenever a git push is made to a repository hosted on GitHub, the pipeline starts to run by fetching the current version of the repository. After that, the pipeline creates or updates itself, because the pipeline definition is also treated as source code. Next, the up-to-date pipeline creates or updates the test environment; after this step, the infrastructure in the test environment looks exactly as it was defined in the template. This is also a good place to deploy the application to the test environment. I’m using Elastic Beanstalk to host the demo application. Now it’s time to check whether the infrastructure is still in good shape. We want to make sure that everything runs as the tests define it: the tests may check that a certain port is reachable, that a certain user can log in via SSH, that a certain port is NOT reachable, and so on. If the tests are successful, the production environment is adapted to the new template and the new application version is deployed. One way to model such a test stage is sketched below.
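In CodePipeline, a test step like this can be expressed as an Invoke action that calls a Lambda function. The following CloudFormation YAML fragment is a sketch of mine based on the CodePipeline schema, not taken from the demo repository; the function name and its parameters are placeholders:

```yaml
# Excerpt: a pipeline stage that runs infrastructure tests via Lambda.
- Name: TestInfrastructure
  Actions:
    - Name: CheckEndpoints
      ActionTypeId:
        Category: Invoke        # CodePipeline's action category for Lambda
        Owner: AWS
        Provider: Lambda
        Version: '1'
      Configuration:
        FunctionName: infrastructure-tests                           # placeholder function
        UserParameters: '{"host": "test.example.com", "port": 443}'  # placeholder test input
      RunOrder: 1
```

The Lambda function receives the job details (including UserParameters), runs its checks, and reports success or failure back to CodePipeline, which then decides whether the production stages run.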

Implementation

CodePipeline has native support for GitHub, CloudFormation, Elastic Beanstalk, and Lambda, so I can tie all of these services together using CodePipeline. You can find the full source code and detailed setup instructions in this GitHub repository: michaelwittig/automation-for-the-people

The following template snippet shows an excerpt of the full pipeline description. Here you see how the pipeline can be configured to check out the GitHub repository and create/update itself.
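Reconstructed from the CodePipeline CloudFormation schema rather than copied from the repository (the role names, stack name, and template path are placeholders of mine), the excerpt looks roughly like this:

```yaml
Pipeline:
  Type: AWS::CodePipeline::Pipeline
  Properties:
    RoleArn: !GetAtt PipelineRole.Arn      # assumes an IAM role defined elsewhere
    ArtifactStore:
      Type: S3
      Location: !Ref ArtifactBucket        # assumes an artifact bucket defined elsewhere
    Stages:
      - Name: Source
        Actions:
          - Name: FetchSource              # pulls the current version of the repository
            ActionTypeId:
              Category: Source
              Owner: ThirdParty
              Provider: GitHub
              Version: '1'
            Configuration:
              Owner: michaelwittig
              Repo: automation-for-the-people
              Branch: master
              OAuthToken: !Ref GitHubOAuthToken   # assumes a template parameter
            OutputArtifacts:
              - Name: Source
      - Name: PipelineUpdate
        Actions:
          - Name: SelfUpdate               # the pipeline creates/updates its own stack
            ActionTypeId:
              Category: Deploy
              Owner: AWS
              Provider: CloudFormation
              Version: '1'
            Configuration:
              ActionMode: CREATE_UPDATE
              StackName: infrastructure-pipeline    # placeholder stack name
              TemplatePath: Source::pipeline.yaml   # placeholder path inside the source artifact
              Capabilities: CAPABILITY_IAM
              RoleArn: !GetAtt CloudFormationRole.Arn   # assumes a second role
            InputArtifacts:
              - Name: Source
```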

Summary

Infrastructure as Code enables you to apply the same CI & CD processes to infrastructure that you already know from software development. On AWS, you can use CloudFormation to turn a text representation of your infrastructure into a running stack. CodePipeline can orchestrate the deployment process, and you can implement custom logic, such as infrastructure tests, in a programming language that runs on AWS Lambda. Finally, you can treat your infrastructure as code and deploy each commit with confidence into production.

About the Author

Michael Wittig is the author of Amazon Web Services in Action (Manning) and writes frequently about AWS on cloudonaut.io. He helps his clients gain value from Amazon Web Services. As a software engineer he develops cloud-native, real-time web and mobile applications. He migrated the complete IT infrastructure of the first bank in Germany to AWS. He has expertise in distributed system development and architecture, with experience in algorithmic trading and real-time analytics.