Modular cfn-init Configsets with SparkleFormation

13. December 2016

Author: Michael F. Weinberg
Editors: Andreas Heumaier

This post lays out a modular, programmatic pattern for using CloudFormation Configsets in SparkleFormation codebases. This technique may be beneficial to:

  • Current SparkleFormation users looking to streamline EC2 instance provisioning
  • Current CloudFormation users looking to manage code instead of JSON/YAML files
  • Other AWS users needing an Infrastructure as Code solution for EC2 instance provisioning

Configsets are a CloudFormation-specific EC2 feature that allows you to configure a set of instructions for cfn-init to run upon instance creation. Configsets group collections of specialized resources, providing a simple solution for basic system setup and configuration. An instance can use one or many Configsets, which are executed in a predictable order.

Because cfn-init is triggered on the instance itself, it is an excellent solution for Autoscaling Group instance provisioning, a scenario where external provisioners cannot easily discover underlying instances, or respond to scaling events.

SparkleFormation is a powerful Ruby library for composing CloudFormation templates, as well as orchestration templates for other cloud providers.

The Pattern

Many CloudFormation examples include a set of cfn-init instructions in the instance Metadata using the config key. This is an effective way to configure instances for a single template, but in an infrastructure codebase, doing this for each service template is repetitious and invites divergent approaches to the same problem in different templates. If no config key is provided, cfn-init will automatically attempt to run a Configset named default. Configsets in CloudFormation templates are represented as arrays of config key names. This pattern leverages Ruby’s concat method to construct a default Configset in SparkleFormation’s compilation step. This allows us to use Configsets to manage the instance Metadata in a modular fashion.

To start, any Instance or Launch Config resource should include an empty array as the default Configset in its metadata, like so:
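As a rough illustration of the resulting structure, the sketch below models the resource Metadata with plain Ruby hashes rather than SparkleFormation's DSL. The helper name `base_init_metadata` and the resource hash are assumptions made for this example only.

```ruby
require 'json'

# Plain-Ruby stand-in for the Metadata of an Instance or Launch Config.
# `base_init_metadata` is a hypothetical helper, not a SparkleFormation API.
def base_init_metadata
  {
    'AWS::CloudFormation::Init' => {
      'configSets' => {
        'default' => [] # empty for now; registry entries append names later
      }
    }
  }
end

# Hypothetical resource definition that carries the metadata.
def example_instance
  {
    'Type' => 'AWS::EC2::Instance',
    'Metadata' => base_init_metadata
  }
end

puts JSON.pretty_generate(example_instance)
```

The only point that matters here is that `default` starts life as an empty array, so later code can extend it without checking whether it exists.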

Additionally, the Instance or Launch Config UserData should run the cfn-init command. A best practice is to place this in a SparkleFormation registry entry. A barebones example:
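A minimal sketch of what such a UserData script might contain, built as a plain Ruby string. The helper name `cfn_init_user_data` and the stack, resource, and region values are placeholders assumed for illustration; in a real template these would normally come from CloudFormation references rather than literal strings.

```ruby
# Hypothetical helper that builds a UserData shell script invoking cfn-init.
# It assumes cfn-init is already installed and on the PATH of the instance.
def cfn_init_user_data(stack:, resource:, region:)
  <<~SCRIPT
    #!/bin/bash
    # No --configsets flag is passed, so cfn-init runs the "default" Configset.
    cfn-init -v --stack #{stack} --resource #{resource} --region #{region}
  SCRIPT
end

puts cfn_init_user_data(stack: 'my-stack', resource: 'MyInstance', region: 'us-east-1')
```

Keeping this in one shared place means every template invokes cfn-init the same way.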

With the above code, cfn-init will run the empty default Configset. Using modular registry entries, we can expand this Configset to meet our needs. Each registry file should add the defined configuration to the default Configset, like this:
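The idea can be sketched in plain Ruby, again using hashes in place of SparkleFormation's DSL. The function `add_base_packages`, the config name `base_packages`, and the package list are all hypothetical; the function plays the role of a registry entry that defines one config block and appends its name to the default Configset.

```ruby
# Hypothetical stand-in for a registry entry. `init` models the contents of
# the AWS::CloudFormation::Init metadata key as a plain Hash.
def add_base_packages(init)
  # Define a named config block (here: install two apt packages).
  init['base_packages'] = {
    'packages' => { 'apt' => { 'curl' => [], 'git' => [] } }
  }
  # Register the block by appending its name to the default Configset.
  init['configSets']['default'].concat(['base_packages'])
  init
end

init = { 'configSets' => { 'default' => [] } }
add_base_packages(init)
p init['configSets']['default']
```

Because `concat` mutates the existing array, the entry does not need to know what other entries have already added.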

A registry entry can also include more than one config block:
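A sketch of the same idea with two config blocks, in plain Ruby under the same assumptions as before: `add_app_setup` and the config names `app_user` and `app_dirs` are invented for this example and stand in for a single registry entry.

```ruby
# Hypothetical registry-entry stand-in that contributes two config blocks.
def add_app_setup(init)
  # First block: create a system user.
  init['app_user'] = {
    'users' => { 'app' => { 'homeDir' => '/home/app' } }
  }
  # Second block: run a command to create a working directory.
  init['app_dirs'] = {
    'commands' => { '01_mkdir' => { 'command' => 'mkdir -p /opt/app' } }
  }
  # Append both names at once; they run in the order listed.
  init['configSets']['default'].concat(%w[app_user app_dirs])
  init
end

init = { 'configSets' => { 'default' => [] } }
add_app_setup(init)
p init['configSets']['default']
```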

Calling these registry entries in the template will add them to the default Configset in the order they are called:
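To make the ordering concrete, here is a small plain-Ruby model. `ENTRIES` and `build_init` are hypothetical names; each lambda stands in for a registry entry, and `build_init` stands in for a template that calls them one after another.

```ruby
# Hypothetical registry: each entry defines a config block and appends its
# name to the default Configset of the Hash it is given.
ENTRIES = {
  base_packages: lambda do |init|
    init['base_packages'] = { 'packages' => { 'apt' => { 'curl' => [] } } }
    init['configSets']['default'].concat(['base_packages'])
  end,
  app_setup: lambda do |init|
    init['app_setup'] = { 'commands' => { '01_mkdir' => { 'command' => 'mkdir -p /opt/app' } } }
    init['configSets']['default'].concat(['app_setup'])
  end
}.freeze

# Stand-in for a template: start from an empty default Configset and apply
# the named entries in the order given.
def build_init(*names)
  init = { 'configSets' => { 'default' => [] } }
  names.each { |name| ENTRIES.fetch(name).call(init) }
  init
end

p build_init(:base_packages, :app_setup)['configSets']['default']
# => ["base_packages", "app_setup"]
```

Swapping the two names in the call swaps the order in which cfn-init would process them.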

Note that other approaches to extending the array will also work:

sets.default += [ 'key_to_add' ], sets.default.push('key_to_add'), sets.default << 'key_to_add', etc.
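On ordinary Ruby arrays these four forms produce the same contents, as the short sketch below shows. It uses plain arrays only, and `extension_variants` is a name made up for the example.

```ruby
# Demonstrates four ways to append an element to a plain Ruby Array.
def extension_variants
  with_concat = []
  with_concat.concat(['key_to_add']) # mutates the array in place

  with_plus = []
  with_plus += ['key_to_add']        # builds a new array and reassigns it

  with_push = []
  with_push.push('key_to_add')       # mutates in place

  with_shovel = []
  with_shovel << 'key_to_add'        # mutates in place

  [with_concat, with_plus, with_push, with_shovel]
end

p extension_variants.uniq # => [["key_to_add"]]
```

The one practical difference is that `+=` reassigns rather than mutates, which matters if something else holds a reference to the original array.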

Use Cases

Extending the default Configset rather than setting the config key directly makes it easy to build out cfn-init instructions in a flexible, modular fashion. Modular Configsets, in turn, create opportunities for better Infrastructure as Code workflows. Some examples:

Development Instances

This cfn-init pattern is not a substitute for full-fledged configuration management solutions (Chef, Puppet, Ansible, Salt, etc.), but for experimental or development instances cfn-init can provide just enough configuration management without the increased overhead or complexity of a full CM tool.

I use the Chef users cookbook to manage users across my AWS infrastructure. Consequently, I very rarely make use of AWS EC2 keypairs, but I do need a solution to access an instance without Chef. My preferred solution is to use cfn-init to fetch my public keys from GitHub and add them to the default ubuntu (or ec2-user) user. The registry for this:
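A plain-Ruby sketch of what such an entry could produce. The function `add_github_ssh_user`, the config name `github_ssh_user`, and the example account `octocat` are assumptions for illustration. It relies on GitHub serving a user's public keys at `https://github.com/<user>.keys`; a real template would pass the user name in as a CloudFormation parameter rather than a Ruby string.

```ruby
# Hypothetical registry-entry stand-in: fetch a GitHub user's public keys and
# append them to the login user's authorized_keys file.
def add_github_ssh_user(init, github_user:, login: 'ubuntu')
  keys_url  = "https://github.com/#{github_user}.keys"
  ssh_dir   = "/home/#{login}/.ssh"
  auth_file = "#{ssh_dir}/authorized_keys"

  init['github_ssh_user'] = {
    'commands' => {
      '01_fetch_keys' => {
        # Ensure the directory exists, then append the downloaded keys.
        'command' => "mkdir -p #{ssh_dir} && curl -fsSL #{keys_url} >> #{auth_file}"
      }
    }
  }
  init['configSets']['default'].concat(['github_ssh_user'])
  init
end

init = { 'configSets' => { 'default' => [] } }
add_github_ssh_user(init, github_user: 'octocat')
puts init['github_ssh_user']['commands']['01_fetch_keys']['command']
```

For an Amazon Linux instance, the same call with `login: 'ec2-user'` would target that user's home directory instead.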

In the template, I just set a github_user parameter and include the registry, and I get access to an instance in any region without needing to do any key setup or configuration management.

This could also be paired with a configuration management registry entry, with the GitHub user setup limited to development:
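One way to picture the switch is a plain Ruby conditional evaluated at compile time. This is only a sketch: `default_configset` and the config names `github_ssh_user` and `chef_client` are hypothetical, and the only detail taken from the text is the `development` environment variable.

```ruby
# Pick the contents of the default Configset at compile time, based on the
# `development` environment variable. `env` is injectable to ease testing.
def default_configset(env = ENV)
  if env['development'] == 'true'
    ['github_ssh_user'] # lightweight access for development instances
  else
    ['chef_client']     # hypothetical full configuration management entry
  end
end

p default_configset
```

Because the decision is made in Ruby before the template is rendered, the resulting CloudFormation JSON contains only the branch that was selected.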

Compiling this with the environment variable development=true will include the GitHub Configset; in any other case, it will run the full configuration management.

In addition to being a handy shortcut, this approach is useful for on-boarding other users/teams to an Infrastructure codebase and workflow. Even with no additional automation in place, it encourages system provisioning using a code-based workflow, and provides groundwork on which to layer additional automation.

Incremental Automation Adoption

Extending the development example, a modular Configset pattern is helpful for incrementally introducing automation. Attempting to introduce automation and configuration management to an infrastructure that is actively being architected can be very frustrating: each new component requires not just understanding the component and its initial configuration, but also determining how best to automate and abstract that into code. This can lead to expedient, compromise implementations that add to technical debt, as they aren’t flexible enough to support emergent needs.

An incremental approach can mitigate these issues, while maintaining a focus on code and automation. Well understood components are fully automated, while some emergent features are initially implemented with a mixture of automation and manual experimentation. For example, an engineer approaching a new service might perform some baseline user setup and package installation via an infrastructure codebase, but configure the service manually while determining the ideal configuration. Once that configuration matures, the automation resources necessary to achieve it are included in the codebase.

CloudFormation Configsets are effective options for package installation and are also good for fetching private assets from S3 buckets. An engineer might use a Configset to set up her user on a development instance, along with the baseline package dependencies and a tarball of private assets. By working with the infrastructure codebase from the outset, she has the advantage of knowing that any related AWS components are provisioned and configured as they would be in a production environment, so she can iterate directly on service configuration. As the service matures, the Configset instructions that handled user and package installation may be replaced by more sophisticated configuration management tooling, but this is a simple one-line change in the template.

Organization-Wide Defaults

In organizations where multiple engineers or teams contribute discrete application components in the same infrastructure, adopting standard approaches across the organization is very helpful. Standardization often hinges on common libraries that are easy to include across a variety of contexts. The default Configset pattern makes it easy to share registry entries across an organization, whether in a shared repository or internally published gems. Once an organizational pattern is codified in a registry entry, including it is a single line in the template.

This is especially useful in organizations where certain infrastructure-wide responsibilities are owned by a subset of engineers (e.g. Security or SRE teams). These groups can publish a gem (SparklePack) containing a universal configuration covering their concerns that the wider group of engineers can include by default, essentially offering these in an Infrastructure as a Service model. Monitoring, Security, and Service Discovery are all good examples of the type of universal concerns that can be solved this way.


cfn-init Configsets can be a powerful tool for Infrastructure as Code workflows, especially when used in a modular, programmatic approach. The default Configset pattern in SparkleFormation provides an easy-to-implement, consistent approach to managing Configsets across an organization, either with a single codebase or vendored in as gems/SparklePacks. Teams looking to increase the flexibility of their AWS instance provisioning should consider this pattern, and a programmatic tool such as SparkleFormation.

For working examples, please check out this repo.

About the Author

Michael F. Weinberg is an Infrastructure & Automation specialist, with a strong interest in cocktails and jukeboxes. He currently works at Hired as a Systems Engineer. His open source projects live at