AWS and the New Enterprise WAN
The public cloud’s wholesale transformation of IT includes a shift in enterprise IT requirements for the wide-area network (WAN). The viability of traditional network architectures for interconnecting hundreds or even thousands of remote offices, or branches, is rapidly decreasing as enterprises consume IT as a utility. A more agile, secure, and dynamic WAN is needed. As an industry, we have a name for this emerging networking trend: Software-Defined WAN (SD-WAN). In this article, we explore SD-WAN with a focus on its integration with AWS VPC infrastructure.
Importance of Multi-Cloud
Enterprises may have applications in on-premise data centers, colocation facilities, and infrastructure-as-a-service (IaaS) provider platforms and need to access the information in these locations in a flexible manner. Also, SaaS (Software As A Service) is now the standard option for a wide range of enterprise applications. Yet the growth in SaaS hasn’t always been met by a growth in the infrastructure needed to cope with the resulting increase in network utilization. Older WAN technologies deployed at corporate branches are no longer sufficient for the modern SaaS-enabled workforce. As data stops flowing to and from the data center and starts flowing over the internet, congestion, packet loss, and high latencies are all too common.
The notion that companies can spread a single application across multiple public clouds has had its detractors. Some argue that the only “multi-cloud” approach will be a distribution of applications in a way that caters to the perceived strengths of a given cloud provider. For example, a company might consume AWS’s Lambda for functions-as-a-service while looking to Google Cloud Platform (GCP) for machine learning services. This approach is valid, and we see it regularly in our work; however, we also observe how the rise of Kubernetes is changing the IT roadmaps within the enterprise. We have no doubts: the future is multi-cloud.
Let’s not forget that most enterprises–unlike companies “born in the cloud”–must continue to operate infrastructure on-premise and in third-party colocation facilities. Why will private cloud deployments persist? Isn’t this passé? To answer this, let’s look at a telling quote from Amazon’s Anu Sharma, product manager for the new AWS Outposts hybrid cloud service. She acknowledges “…there are some applications that [customers] cannot move to AWS largely because of physical constraints…” She highlighted latency and its effect on moving data in and out of the cloud. Whether the private cloud is implemented as OpenStack, AWS Outposts, or Azure Stack, private cloud will remain in the picture.
Today’s WAN is Inadequate
Therefore, in an environment in which application placement is diverse, enterprises must figure out how to connect the employees in many physical locations to the tools they need to perform their job functions. The complexity involved in moving bits around these highly heterogeneous environments can be overwhelming.
Traditionally, enterprises paid telecommunication companies premium prices for Multiprotocol Label Switching (MPLS) links or private point-to-point links for connecting remote branches to centralized corporate data centers. Traffic from branch locations was carried over the private connectivity regardless of whether the bits were intended for an internal app, a SaaS application, or an Internet search engine. This added latency to network connections as all traffic was–to use networking parlance–“backhauled” to a small number of corporate locations. Within these corporate data centers, the network was the proverbial long pole in the tent when deploying new applications that the geographically dispersed workforce could access.
Figure 1 depicts the connection of multiple remote branches to a centralized data center. Note that all traffic exiting the branches traverses the expensive MPLS network to reach all destinations–including Internet ones.
Figure 1: Traditional Enterprise WAN
As mentioned earlier, as workloads spread across on-premise and public cloud infrastructure, enterprises need more flexible, secure, and agile means for connecting branch offices, namely SD-WAN. The projects around hybrid cloud connectivity and the modernization of the WAN infrastructure will run in parallel for many years to come. Any effort to move traffic between on-premise and off-premise needs to keep the requirements of the new WAN in mind.
Before defining SD-WAN, let’s examine the underlying access mechanisms at our disposal. We are no longer limited to T1 and other leased-line services for business-grade connectivity. Access might consist of fiber-based Ethernet, business cable, fixed 4G/5G, or satellite. These access types might coexist with MPLS links or serve as a means to replace them. A given branch might have more than one path for exiting the branch. Even considering Internet connectivity by itself, not all types of access are alike. A larger office might have a 1 Gb/s fiber link while a kiosk in the mall might have spotty Wi-Fi.
Does this heterogeneity of WAN access and multiple entry/exit paths sound like a management nightmare? This could very well be the case if we designed and operated the new WAN in the manner of the previous generation WAN.
Enter The Software-Defined WAN
SD-WAN provides an abstraction layer for the WAN that simplifies the management and reduces the cost of wide-area connectivity while realizing application performance improvements.
Let’s compare and contrast the WAN abstraction with the VPC as a data center abstraction. VPC constructs such as subnets, load balancers, and virtual gateways are purely ephemeral, with the ability to appear and later vanish with the stroke of an API call. But how do we abstract a WAN? Fiber and copper are tangible. We want to break the coupling of network capabilities with how the packets are delivered over the various access mechanisms. SD-WAN accomplishes this through the introduction of an overlay network that extends over various connectivity methods. The overlay network is implemented using SD-WAN appliances on either side of the connection.
Figure 2: SD-WAN Overlay
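To make the overlay idea concrete, here is a minimal Python sketch of a logical branch-to-hub tunnel that can ride on any healthy access link. The class names, link names, and the simple "first healthy link wins" rule are assumptions made for illustration; they do not represent any vendor's product or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Underlay:
    """One physical access link at a branch (hypothetical model)."""
    name: str        # e.g. "mpls", "broadband", "lte"
    up: bool = True  # link health as seen by the branch appliance


@dataclass
class OverlayTunnel:
    """Logical branch-to-hub link between two SD-WAN appliances."""
    branch: str
    hub: str
    underlays: List[Underlay] = field(default_factory=list)

    def active_underlay(self) -> Optional[Underlay]:
        """Return the first healthy underlay in preference order, if any."""
        for link in self.underlays:
            if link.up:
                return link
        return None


tunnel = OverlayTunnel(
    branch="branch-1",
    hub="hub-1",
    underlays=[Underlay("mpls"), Underlay("broadband"), Underlay("lte")],
)
first_choice = tunnel.active_underlay()

# Simulate an MPLS outage: the overlay endpoints stay the same,
# only the underlay carrying the tunnel changes.
tunnel.underlays[0].up = False
after_failure = tunnel.active_underlay()
```

The point of the sketch is that applications and policies refer to the tunnel, not to the fiber, cable, or cellular link underneath it. A real appliance would probe link quality continuously rather than rely on a simple up/down flag.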
Is it possible to cut through the hype and describe what SD-WAN can deliver for the enterprise in terms of efficiency, cost reduction, and strategic enablement of new services? Yes, although doing so can be challenging.
To start, SD-WAN may be deployed in many different models. For example, service providers can add SD-WAN to their existing MPLS offering as a “first mile/last mile” technology. On the other hand, enterprises might want to deploy SD-WAN in a full overlay model where they assert full control over the SD-WAN solution and appliances. Even such a model may be deployed in-house or as a managed service. In this post, we focus on the latter: the full overlay model.
In addition, there is much confusion as to what features an SD-WAN solution should contain at a minimum. The SD-WAN space–like any other “hot” technology area–is crowded. Engineers may find it very hard to distinguish between a true SD-WAN solution and an old WAN optimization appliance wrapped in new marketing jargon. To make things even more confusing, each enterprise is approaching SD-WAN from a different problem space. For example, some might consider application performance enhancements as their primary goal while others consider cost reductions as the primary driver.
We believe the following should be present in any chosen SD-WAN solution:
Network agility–not lower costs or better performance–is the main factor for enterprises adopting SD-WAN infrastructure, according to findings in a survey conducted by Cato Networks.
A good SD-WAN solution enables rapid branch deployments with self-provisioning. Bringing a new branch or remote location online should be easy and completed within minutes. The branch appliance, physical or virtual, should just need to be connected to the LAN and WAN links serving the branch, plugged in, and turned on. No specialized IT expertise should be required on-premise at the branch.
It is Flexible
Flexibility can be evaluated in many different contexts. Any SD-WAN solution should be end-to-end and not put any restrictions on where the data should reside. The reason it has to be end-to-end is that your users are in many places. They can be in your branches, or they can be on your campuses, or they can be road warriors connecting to your resources using client VPN software over the public Internet.
Similarly, your data and your applications are everywhere. They’re on-prem, and they are in the public cloud. The hub for the SD-WAN deployment should be deployable as a physical or virtual device in a traditional data center, a virtual device in the corporate private cloud, or a virtual device in the public cloud of choice.
In addition, a true SD-WAN solution should support different topologies. Many enterprises use a hub and spoke or a full mesh topology. Most SD-WAN solutions support these basic topologies. But one could think of many other hybrid topologies, and SD-WAN solutions should not restrict enterprises to one end of the topology spectrum. SD-WAN should provide insertion of network services whether on the branch customer premise equipment (CPE), in the public cloud, or in regional and enterprise data centers, deployed in a wide range of topologies.
In addition, these SD-WAN solutions should provide automation and business-policy abstraction to simplify complex configurations and provide flexibility in traffic routing and policy definitions.
Includes Integrated Security
One could imagine a day in which a traditional firewall device isn’t needed per branch. It is no wonder that on the long list of SD-WAN vendors we find many familiar names from the traditional firewall vendor space. We believe integrating advanced security features into SD-WAN services allows a cleaner, simpler deployment model for the branches.
Even though basic firewalling capabilities in some SD-WAN appliances might be sufficient for some enterprises, given today’s threat landscape most enterprises need and demand advanced firewall capabilities if the only appliance deployed at the branch is an SD-WAN device. Enterprises need to find a solution that delivers advanced security features without compromising desired SD-WAN functionality such as application optimization or fast fail-over.
Optimizes Application Performance
One of SD-WAN’s most desirable features is the ability to choose the best connectivity for each application based not only on performance metrics but also on business metrics such as cost. Let’s take an example of ensuring users are proximal to their applications. In our example, an enterprise has existing servers deployed in both the on-premise data center and the AWS public cloud. Let’s say you have an end user in Arlington, VA. Your on-premise data center is located in Pittsburgh while the AWS deployment region is us-east-1 in Ashburn, VA.
For applications that are housed in the Pittsburgh data center, branches have two possible paths to get to these resources. For applications that require large bandwidth and guaranteed SLAs, the MPLS path should be used. On the other hand, if cost is the primary factor for an application such as bulk file transfers, then perhaps the path through the Internet is the best option.
Similar choices exist for data housed in the AWS cloud. For branch-to-AWS-VPC traffic, we can pick between a Direct Connect (DX) connection to the VPC and pure Internet access. We should be able to optimize based on business requirements. A sensitive application might only be allowed to use the DX connection provided through the data center, while for the rest Internet-based access might suffice. An acceptable SD-WAN solution provides IPsec-based encryption between the branches and the centralized hub location (cloud or data center) when the connection is through the Internet. The configuration of these IPsec tunnels and the routing through them should be performed by the SD-WAN controller and not manually.
Figure 3: Branch with multiple paths to reach applications
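The path choices described above can be expressed as a small policy function. The following Python sketch is illustrative only: the field names, thresholds, latency figures, and per-GB costs are made-up assumptions, and real SD-WAN controllers use their own policy languages and live measurements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    """A candidate path from the branch (all values are illustrative)."""
    name: str
    latency_ms: float
    loss_pct: float
    cost_per_gb: float
    private: bool  # True for MPLS or Direct Connect, False for Internet


@dataclass(frozen=True)
class AppPolicy:
    """Business and performance requirements for one application."""
    name: str
    max_latency_ms: float
    max_loss_pct: float
    private_only: bool = False


def select_path(policy: AppPolicy, paths: List[Path]) -> Optional[Path]:
    """Pick the cheapest path that satisfies the policy, or None."""
    eligible = [
        p for p in paths
        if p.latency_ms <= policy.max_latency_ms
        and p.loss_pct <= policy.max_loss_pct
        and (p.private or not policy.private_only)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: p.cost_per_gb)


# Hypothetical measurements for the two paths out of a branch.
paths = [
    Path("mpls", latency_ms=20, loss_pct=0.1, cost_per_gb=0.50, private=True),
    Path("internet", latency_ms=45, loss_pct=1.0, cost_per_gb=0.05, private=False),
]

sla_app = AppPolicy("erp", max_latency_ms=30, max_loss_pct=0.5)
bulk_app = AppPolicy("bulk-transfer", max_latency_ms=200, max_loss_pct=2.0)
sensitive_app = AppPolicy("payroll", max_latency_ms=200, max_loss_pct=2.0,
                          private_only=True)

sla_choice = select_path(sla_app, paths)
bulk_choice = select_path(bulk_app, paths)
sensitive_choice = select_path(sensitive_app, paths)
```

With these assumed numbers, the SLA-bound application lands on MPLS because the Internet path misses its latency and loss limits, the bulk transfer takes the cheaper Internet path, and the sensitive application is pinned to the private path regardless of cost.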
As described, an SD-WAN hub can reside within an AWS VPC, in effect turning the VPC into another aggregation hub for the remote sites. In AWS, an SD-WAN termination point is an appliance from the Marketplace. For an interesting in-depth look at SD-WAN appliances, check out the AWS-commissioned report by ESG Labs entitled SD-WAN Integration with Amazon Web Services.
There may be different architectures to home an SD-WAN appliance within an AWS architecture, but we would like to explore one we like to call the “edge services VPC.” For the sake of simplicity, we show a single-region deployment with three VPCs.
Figure 4: The Edge Services VPC Design
There are many possible ways to terminate the edge SD-WAN connections into a public cloud. At the most basic level, SD-WAN solutions rely on an SD-WAN gateway, which performs the hub functionality. This appliance, which will be a VM when deployed in the AWS VPC, aggregates all the connections from the SD-WAN branches.
We like the idea of a separate “edge services VPC” dedicated to all the edge connectivity terminations. This VPC would terminate the SD-WAN connections. The connectivity between this VPC and other VPCs can be provided through simple VPC peering if the AWS deployment is small or through a Transit Gateway (TGW) if a larger number of VPCs need access to the Edge Services VPC. Even though it is outside the scope of this article, one could imagine a yet larger deployment with an edge services VPC per region, each connected to other VPCs within that region through a TGW.
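One way to see why peering suits small deployments while a TGW suits larger ones is to count the connections each approach needs. VPC peering is not transitive, so if every VPC must reach every other VPC, each pair needs its own peering connection, whereas with a TGW each VPC needs one attachment. The sketch below is plain arithmetic; the function names are our own and nothing here calls an AWS API.

```python
def peering_connections(vpc_count: int, full_mesh: bool = True) -> int:
    """Peering connections needed among vpc_count VPCs.

    full_mesh=True: every VPC reaches every other VPC (n * (n - 1) / 2).
    full_mesh=False: every VPC peers only with the edge services VPC (n - 1).
    """
    if vpc_count < 2:
        return 0
    if full_mesh:
        return vpc_count * (vpc_count - 1) // 2
    return vpc_count - 1


def tgw_attachments(vpc_count: int) -> int:
    """Transit Gateway attachments needed: one per VPC."""
    return max(vpc_count, 0)


# The three-VPC design discussed in this article.
small_mesh = peering_connections(3)                    # 3
small_hub = peering_connections(3, full_mesh=False)    # 2
small_tgw = tgw_attachments(3)                         # 3

# A hypothetical larger deployment with 20 VPCs.
large_mesh = peering_connections(20)                   # 190
large_tgw = tgw_attachments(20)                        # 20
```

At three VPCs the two approaches are comparable, but the full-mesh peering count grows quadratically while the TGW attachment count grows linearly. The exact crossover point for a given enterprise also depends on cost and routing requirements, which this sketch does not model.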
In this article, we’ve described how multi-cloud and the diversity of WAN connectivity options for enterprise branches have given rise to a flexible, agile, and secure SD-WAN. We believe that enterprise public cloud migrations–while not necessarily dependent on SD-WAN–will occur in the same timelines as the move to SD-WAN as enterprise IT architects recognize that more intelligence is needed in the network path selection process. The details about SD-WAN vendor selection and design will vary. One thing is certain: the enterprise WAN is evolving toward a more software-centric approach to meet the needs of enterprise applications.
About the Authors
Amir Tabdili and Jeff Loughridge have been designing, operating, and engineering large-scale IP infrastructures since the late-1990s. In their current roles as Chief Architect and CTO of Konekti Systems, the two help clients with public cloud networking, SD-WAN, and hybrid IT architectures. You can learn more about Konekti at https://konekti.us.
About the Editor
Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.