In late 2012 I was talking to my friends at Mako Networks, a provider of PCI DSS certified, secure network equipment to telcos and the payment card processing and retail industries (among others), about how they were using virtualization to solve their business requirements. One of the interesting use cases I thought I’d share is how they’re using what they term Lightweight Virtualization to produce a Multi-Tenanted Device (MTD), a virtual multi-tenanted network concentrator. This fits in nicely with their cloud-based central network management system (CMS) and allows their partners to serve thousands of customers without thousands of individual network concentrator devices.
I’ve been using the Mako Networks systems in my primary and DR locations for over 10 years, and they’ve always been rock solid and fast. What really got me interested, though, was how simple the routers were to set up and manage via the cloud-based central management system; they just work, and the built-in reporting is excellent. Delivering complex things like secure networking in a very simple way is of great value in my opinion.
The idea of a multi-tenanted device (MTD) for Mako’s network VPN concentrator appliances is quite similar in concept to how VMware vCloud Director provides multi-tenanted edge gateway services in public cloud environments (i.e. multiple tenants serviced by virtualized edge gateway devices). The difference here is that the MTD is a component of a physical network concentrator appliance used to deliver PCI DSS compliant network services for Mako Networks’ partners to potentially thousands of customers and tens of thousands of individual merchant devices and sites.
I asked Kevin Ptak, Mako’s Communications Manager and Murray Knox, Mako’s R&D Director to write this article. I have added some emphasis and detail in bold italics below.
Disclosure: Mako Networks are not paying to have this article published on my blog; I’m publishing it because I think it will be of interest to my readers. However, one of my companies does have a very minor shareholding (<1%) in their parent company.
Mako Networks provides a managed network service, comprised of (primarily physical) network appliances installed at customer locations and a cloud-based Central Management System. This includes both small-office appliances and central office concentrators to allow a number of branch offices to securely communicate with a central office.
Some time ago, one of our partners approached us with a problem. They wanted to deploy our VPN concentrator service to multiple customers to allow access to our partner’s SaaS offerings. However, their customers weren’t big enough to warrant an individual concentrator each. Additionally, the target market was between 500 and 2,000 customers, and it was not feasible to deploy that many concentrators in their data center. Therefore, our partner requested that we develop a version of the concentrator that could be used by multiple customers: a Multi-Tenanted Device (MTD).
One of the reasons they requested an MTD was to be able to provision and enable a new customer within an eight-hour timeframe. They could not achieve this Service Level Agreement (SLA) using alternative technologies, but their experience with our platform showed that the Mako System makes it very quick and simple to provision a new customer.
The Mako System is designed around the concept of one customer per device. It would be a significant change to our management system and appliances to allow multiple customers per appliance. Therefore, we decided a better approach would be to support multiple virtual appliances within a single physical appliance.
Our initial idea was to use standard virtualization technology and deploy our Mako appliance firmware as a guest on the virtualization platform. This would have provided the required capabilities and could be developed in a relatively short time. However, we quickly found that this would not meet partner requirements to provision new customers within eight hours. Also, initial capacity planning showed the memory requirements for this platform were significant.
We had experience with lightweight virtualization in other areas and decided to investigate the feasibility of using it for the MTD. Lightweight virtualization differs from standard virtualization in that there is a single instance of the operating system that executes and provides services to the individual guests. Each guest is really no more than a set of processes to which the operating system provides a virtual filesystem and network namespace.
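This “guest as a set of processes” model is visible in Linux itself: every process carries a set of namespace handles under /proc/&lt;pid&gt;/ns, and a container is simply a process tree whose handles differ from the host’s. A minimal Python sketch (Linux-only; not part of Mako’s actual tooling) that inspects these handles:

```python
import os

def namespace_ids(pid="self"):
    """Return a mapping of namespace type -> kernel identifier for a process.

    On Linux, /proc/<pid>/ns/* are symlinks like 'net:[4026531840]'; two
    processes share a namespace exactly when these identifiers match.
    """
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in os.listdir(ns_dir)}

ids = namespace_ids()
# A lightweight guest created with unshare()/clone() would show different
# 'net' and 'mnt' identifiers here, while still running on the same kernel.
print(ids["net"])  # e.g. net:[4026531840]
```

Each virtual appliance in an MTD gets its own network (and filesystem) identifier, while every other column of the table — kernel, scheduler, clock — stays shared.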
Lightweight virtualization is useful when you want to deploy a large number of very similar (or identical) applications that you need to keep segregated. It uses less memory and disk than full virtualization, as you don’t need a full copy of the guest operating system for each instance. The host operating system can provide core services (e.g. scheduling, clock synchronization, memory management, logging) allowing each guest to have a smaller memory and CPU footprint.
Lightweight virtualization is now a mainstream technology on the Linux platform in the form of LXC and network namespaces. We created a number of prototypes using this technology to prove that it could support our Mako appliance requirements and that the network namespaces provide the required levels of separation between appliances. After proving the viability of LXC, we then proceeded to design and implement the MTD around this technology.
We had to design a deployment process for the MTD that allowed provisioning of a new customer in a simple, error-free manner without requiring complex network changes. We settled on a process whereby a customer is provisioned and configured in the Mako CMS in the standard way; the customer’s configuration data is then copied to the MTD and supplied to a deployment script, which creates a lightweight container and configures its network interfaces.
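A deployment script of this shape might look like the sketch below. The container naming, config path, and command-line flags are illustrative assumptions (Mako’s actual script is not public, and LXC flags vary by version); with `dry_run=True` it only builds the commands rather than executing them.

```python
import subprocess

def deploy_virtual_mako(customer_id, config_path, dry_run=True):
    """Sketch of an MTD deployment step: create and start one LXC
    container for a newly provisioned customer.

    customer_id/config_path naming is hypothetical; the container's
    network interfaces are assumed to be defined in the LXC config file.
    """
    name = f"mako-{customer_id}"
    commands = [
        # Create the container from the customer's generated config
        ["lxc-create", "-n", name, "-f", config_path],
        # Start it detached; its veth pair attaches to the host bridge
        ["lxc-start", "-n", name, "-d"],
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands

cmds = deploy_virtual_mako("5001", "/etc/lxc/mako-5001.conf")
```

Because the CMS already holds a complete, validated configuration for each customer, the script has no decisions to make on the box — which is what keeps the process error-free and fast enough for an eight-hour provisioning SLA.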
Another issue we encountered was the time required to create up to 500 network interfaces when the system is booted from cold. Early versions of the LXC system took a significant amount of time to create that many interfaces. Eventually, with careful tuning and patching, and aided by the rapid development of the LXC and namespace systems, we got the time down to an acceptable level.
Our standard software update process for appliances was not going to be suitable for the MTD. Normally, when we deploy a software update we supply every component including the operating system to ensure the integrity of the update. However, with the MTD, the host operating system is pre-deployed and supplies many of the capabilities, so we only needed to deploy a limited number of subsystems. Further, each virtual Mako appliance uses identical copies of the firmware. Therefore, we decided that all virtual appliances would share the same instance of the firmware. This significantly reduced storage requirements and meant that only one copy of the firmware needed to be maintained.
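One common way to share a single firmware instance across containers is a read-only bind mount declared in each container’s LXC config. The sketch below uses classic LXC 1.x config keys (`lxc.utsname`, `lxc.mount.entry`); the paths and layout are assumptions for illustration, not Mako’s actual design.

```python
def shared_firmware_config(name, firmware_dir="/srv/mako/firmware"):
    """Generate an LXC config fragment that bind-mounts one shared
    firmware tree read-only into a container.

    'lxc.mount.entry' takes fstab-style fields: source, target (relative
    to the container rootfs), fstype, options. Paths are illustrative.
    """
    return "\n".join([
        f"lxc.utsname = {name}",
        # One read-only copy of the firmware serves every virtual appliance:
        f"lxc.mount.entry = {firmware_dir} opt/firmware none ro,bind 0 0",
    ])

print(shared_firmware_config("mako-5001"))
```

Mounting the shared tree read-only also means an update touches exactly one copy on disk, and no container can diverge from it.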
Another system requirement was high availability: we had to provide a system that was immune to both hardware and software failures. The firmware of our network appliances (or ‘Makos’, as we call them) already has mechanisms for managing software failures. However, in a virtualized environment it could not manage failures of the host operating system or hardware. Therefore, we adopted a two-pronged approach to resiliency: our CPE firmware would handle its own internal failures, and for hardware and host problems we decided that clustering was an appropriate solution. The MTD would be deployed on multiple physical nodes within a cluster, with each node capable of handling the full load of 500 customers when required. (This removed the host OS as a single point of failure, which it would otherwise have been.)
Scaling and Capacity Planning
Capacity planning is an essential stage of deploying a system on a lightweight virtualization platform, just as it is with any other virtualization system.
Overcommitting resources is a standard technique with virtualization. However, it is essential that overcommitting not jeopardize either SLAs or the ability of the guests to function. We have very good performance profile information for our standard Makos, which was helpful in sizing the MTD. We know how much memory, CPU, disk I/O and network I/O each Mako typically uses (even during peaks).
Network response time was an important SLA for us. If a virtual Mako were paged out of memory, performance would suffer dramatically. Therefore we decided to ensure there was enough RAM for an MTD appliance to run with all 500 virtual CPEs in memory at once.
We also had good knowledge of the disk I/O requirements of the Mako in both normal and peak conditions. Multiplying the I/O requirements of a single Mako by 500 showed that standard hard disk technology would not provide the throughput we required, therefore we adopted SSDs.
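This kind of sizing works out as simple multiplication. The per-guest figures below are invented placeholders, not Mako’s actual profile data, but they show why a single spinning disk falls over at 500 guests while commodity SSDs do not:

```python
# Illustrative capacity-planning arithmetic; all per-guest figures are
# made-up placeholders, not Mako's real performance profile.
GUESTS = 500

mem_per_guest_mb = 48      # assumed resident footprint of one virtual CPE
host_overhead_mb = 2048    # assumed host OS + shared services
total_mem_gb = (GUESTS * mem_per_guest_mb + host_overhead_mb) / 1024

iops_per_guest = 2         # assumed steady-state disk I/O per guest
total_iops = GUESTS * iops_per_guest
hdd_iops = 150             # rough figure for a 7,200 RPM disk
ssd_iops = 20000           # rough figure for a SATA SSD of the era

print(f"RAM needed: ~{total_mem_gb:.0f} GB to keep all guests resident")
print(f"Disk: {total_iops} IOPS needed; one HDD ~{hdd_iops}, one SSD ~{ssd_iops}")
```

Even with modest per-guest numbers, the aggregate I/O demand exceeds a single hard disk by nearly an order of magnitude, while the RAM total stays well within a single commodity server.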
Lastly, we developed a series of load and stress tests and tested the MTD thoroughly, up to and beyond its design limits. We tested the MTD multiple times in a wide variety of scenarios until we and our partner were satisfied that it would operate reliably and meet the required SLAs.
Resource capping is used within each lightweight virtualization container to ensure that one container can’t spin out of control, consuming resources and adversely affecting the others. In addition, Mako Networks has advanced monitoring across all of the containers so that it can detect any adverse behaviour and take action before there is a system impact. The practices applied here are the same as would be applied to other business-critical workloads in other virtualization environments, especially when it comes to testing and ensuring SLAs are met.
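In LXC, per-container caps of this kind are expressed as cgroup keys in the container config. The sketch below emits classic cgroup-v1 key names as used by LXC of that era; the specific limit values are illustrative assumptions, not Mako’s actual settings.

```python
def container_limits(name, mem_mb=64, cpu_shares=256):
    """Emit an LXC config fragment capping one container's resources.

    Uses classic cgroup-v1 key names ('memory.limit_in_bytes',
    'cpu.shares'); the values here are illustrative, not Mako's.
    """
    return "\n".join([
        f"# resource caps for {name}",
        # Hard ceiling on resident memory for this guest:
        f"lxc.cgroup.memory.limit_in_bytes = {mem_mb}M",
        # Relative CPU weight when the host is under contention:
        f"lxc.cgroup.cpu.shares = {cpu_shares}",
    ])

print(container_limits("mako-5001"))
```

With a hard memory ceiling and proportional CPU shares on every container, one misbehaving guest degrades only itself rather than the other 499.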
Like standard virtualization, lightweight virtualization is a rapidly evolving and maturing technology. Even so, it allowed us to develop a new product in a relatively short timeframe and offer another level of scalability while retaining the simple management of our core offering. It allows our partners to offer a new service to their customers with a significant reduction in hardware requirements compared to a physical deployment.
I hope you found this of interest. There are quite a few lessons that can be taken from this in terms of developing virtualization architecture strategies for all sorts of business critical workloads. I also found this interesting myself due to Mako Networks’ cloud-based Central Management System (CMS) that controls the thousands of devices deployed globally, and the multi-tenancy device use case for the VPN concentrator. If you found this interesting and want to know more about Mako Networks I would encourage you to visit their web site – www.makonetworks.com. Their system is incredibly cost effective and can provide secure, PCI DSS certified, merchant IP-based payment processing over public networks. Even if you’re not interested in PCI DSS networking for payment processing, their system can be used to architect secure and affordable corporate networks with branch offices, with the bonus of PCI DSS certification being available should you need it in the future. As always, your comments and feedback are welcomed.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster +. Copyright © 2013 – IT Solutions 2000 Ltd and Michael Webster +. All rights reserved. Not to be reproduced for commercial purposes without written permission.