In Part 1 we compared multi-cloud and cross-cloud strategies, identified the reasons to use each, and discussed the complexities they introduce. To properly enable seamless operation between clouds, several technologies and architectures can be used.
Leveraging a multi-cloud management platform
Several vendors provide cloud management platforms (CMPs) that enable management of workloads and data across multiple clouds, with a “single pane of glass” (SPOG) to manage the resources used in all clouds, apply common policies across them and minimize the total operating cost. Leveraging such a solution goes a long way towards making multi-cloud or cross-cloud a viable option. There are multiple comparisons of such products, which I will not repeat here, but one interesting comparison based on user rankings is available at Gartner PeerInsights.
Using a CMP can also reduce the complexity of training your team to manage multiple clouds. They will still need to learn the basic terminology and operation of the various clouds you use, so the issue is not completely eliminated, but much of the day-to-day work will be done through the CMP, so interaction with the individual cloud management consoles will be limited and contained.
If, however, you do not want to settle for a multi-cloud strategy and want to ensure your applications are easily movable between clouds, you can use existing tools to orchestrate the migration or replication of your workloads between clouds, or consider developing your applications in a portable format.
Workload migration/replication solutions
A full workload migration or replication between two sites, or between two clouds, typically leverages tools that were originally designed for disaster recovery (DR). Unlike backup solutions, which protect only the persistent data the application uses, DR solutions handle the full application stack, with its underlying compute resources, networking and storage. They collect information about the VMs in which the application runs (metadata), the application itself (its VMDKs in a VMware environment), the network settings and the persistent data itself. All this information is sent to the remote site so that, in case of a disaster, the whole environment can be recreated and restarted in the other location.

This was “relatively easy” in on-prem to on-prem DR, since the environments were mostly (if not always) the same, i.e. same hypervisor, similar networks etc. However, as DR solutions evolved to support public clouds, they had to support the different environments in which the applications would run, including conversion from one format to another (e.g. from a VMware VM to an AWS AMI). Additionally, they had to translate network definitions, compute instance attributes (e.g. number of CPUs, memory size etc.) and other relevant parameters between the two environments. Thankfully, the introduction of those capabilities opened the opportunity to use DR solutions for migration between clouds as well, since a migration can be implemented as a “one-time” DR failover. Not surprisingly, DR vendors have started promoting their products as migration tools (see, for example, CloudEndure). This is a viable and valid option for those looking for a simpler way to move from one cloud to another, reducing dependency and cloud provider lock-in, but it is not a solution for all situations.
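The attribute-translation step can be illustrated with a minimal sketch: given a source VM’s compute specs, pick the smallest target instance type that can accommodate them. The instance catalog below is illustrative only, not an authoritative list, and real migration tools implement far richer mapping logic.

```python
# Hypothetical sketch: map a source VM's compute attributes to the
# smallest target-cloud instance type that satisfies them. The catalog
# below lists (vCPUs, memory in GiB) for a few example AWS instance
# types and is illustrative, not an authoritative sizing table.
TARGET_CATALOG = {
    "t3.medium":  (2, 4),
    "m5.large":   (2, 8),
    "m5.xlarge":  (4, 16),
    "m5.2xlarge": (8, 32),
}

def pick_instance_type(src_vcpus: int, src_mem_gib: float) -> str:
    """Return the smallest catalog entry covering the source VM's specs."""
    candidates = [
        (mem, cpu, name)
        for name, (cpu, mem) in TARGET_CATALOG.items()
        if cpu >= src_vcpus and mem >= src_mem_gib
    ]
    if not candidates:
        raise ValueError("no target instance type is large enough")
    # Prefer the least memory, then the fewest vCPUs
    return min(candidates)[2]

# A 4-vCPU / 12 GiB VMware VM lands on m5.xlarge in this catalog
print(pick_instance_type(4, 12))
```

A production tool would also translate network definitions, disk formats and security settings, but the same "match source attributes to the closest target equivalent" pattern applies.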
Developing portable applications
DR tools are acceptable solutions for migrating applications designed with a “legacy” mentality, where all the services they leverage are encompassed within the application environment. For example, if the application uses a MySQL database, it is assumed that the database runs on one of the VMs within the same protection group, so it is replicated, converted and restarted together with all the other VMs. When used for migration, such a DR tool can migrate the MySQL database together with the application. However, a growing number of developers use services provided by the cloud providers to avoid installing, managing and maintaining their own databases or other required components. Instead of setting up their own MySQL instance on an EC2 compute instance, and then continuously maintaining, upgrading and patching it, they use a cloud service such as an AWS RDS (Relational Database Service) instance. Unfortunately, once they do that, a simple DR solution will not be able to replicate the RDS data from AWS to another public cloud that uses a different and incompatible service, and cross-cloud replication will not work.
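One mitigating factor is that managed services like RDS for MySQL speak the same wire protocol as self-managed MySQL, so the application code itself can stay portable if the endpoint comes from configuration rather than being hard-coded. A minimal sketch, assuming hypothetical environment variable names (`DB_HOST` etc.); it is the data behind the endpoint, not the code, that still needs a cloud-aware replication strategy:

```python
# Minimal portability sketch: build the database connection string from
# the environment instead of hard-coding a cloud-specific service.
# The variable names (DB_HOST, DB_PORT, DB_NAME) are assumptions for
# illustration, not a standard.
import os

def database_dsn() -> str:
    host = os.environ.get("DB_HOST", "localhost")  # RDS endpoint, a VM, etc.
    port = os.environ.get("DB_PORT", "3306")
    name = os.environ.get("DB_NAME", "appdb")
    return f"mysql://{host}:{port}/{name}"

# The same code runs unchanged whether DB_HOST points at a self-managed
# MySQL VM or a managed MySQL-compatible service in another cloud.
print(database_dsn())
```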
So to enable true operation across multiple clouds, with a true cross-cloud experience, the approach should be different: instead of moving the application from Cloud A to Cloud B, have the application run in both clouds, and ensure the data is synchronized between them. This is easier said than done, since the application must be designed with cross-cloud portability in mind. It should use services that are common between clouds, ensure that the installation process leverages common methods, and, where there is no commonality, be designed to manipulate each cloud in its own unique way. A good way to achieve this is to design the applications as containers running on top of a platform-as-a-service (PaaS) layer (such as Pivotal Cloud Foundry or Red Hat OpenShift) that presents the application with a common set of services regardless of the underlying infrastructure, simplifying the development process while maintaining portability. The important point is that the decision to run on multiple clouds should be taken before the application software is coded, and it should be part of the requirements for the programmers writing that application.
Once the application can run in two clouds, you can choose an operational model, depending on your business requirements and how much time you allow for a move from one cloud to another. The application could be deployed as:
- Active-Active – in this model the application runs on both clouds at the same time, with resources allocated and in use in each. This requires a mechanism to synchronize the data between the instances in both clouds, which should comply with the business requirements and be implemented within the application. While this model is the most expensive in cloud resource consumption, and the most complex to implement due to the need to accommodate a multi-writer situation, it allows for non-stop operation without downtime, and therefore fits the DR requirements of mission/business-critical applications.
- Active-Passive (warm standby) – the application runs and provides service in one cloud, while it runs but does not provide service in the other. A mechanism to synchronize the data between the clouds is still required, but it is simpler to develop since there is only a single “writer” to the data. This model allows a quick restart of the business service from the “standby” cloud, though with a short downtime, and therefore fits the DR needs of applications with medium-to-high business importance. The cost of resources in the “standby” cloud can be reduced by using minimal compute resources while in standby mode and auto-scaling to full capacity when a “failover” occurs.
- Active-Passive (cold standby) – this model is the closest to a legacy DR solution: the application image is available in the standby cloud but is not running, and only the data is replicated. Once a failover is initiated, an orchestration engine deploys and starts the application in the standby cloud, attaches it to the data copy and resumes the service. This model is the least costly but takes the most time to restart, and typically fits Tier 2-3 applications with low-to-medium business criticality, or enables workload migration to prevent vendor lock-in.
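The cold-standby failover sequence can be sketched as an ordered set of orchestration steps. This is a hedged illustration of the flow described above, not any real tool’s interface; the function and its arguments are placeholders for calls into the target cloud’s APIs.

```python
# Hypothetical sketch of a cold-standby failover sequence. Each step
# stands in for a call to the standby cloud's APIs; order matters, since
# traffic must not be redirected before the data copy is attached.
def failover_cold_standby(app_image: str, data_snapshot: str) -> list[str]:
    steps = []
    steps.append(f"deploy {app_image} in standby cloud")        # start compute
    steps.append(f"attach replicated data: {data_snapshot}")    # mount data copy
    steps.append("update DNS / load balancer to standby site")  # redirect traffic
    steps.append("resume service and verify health")            # smoke tests
    return steps

for step in failover_cold_standby("app-v1.4", "snap-2024-01-01"):
    print(step)
```

In the warm-standby model the first two steps are already done, which is exactly why its restart time is shorter and its standing cost higher.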
Hey, but what about Hybrid Cloud?
As we discuss multiple public cloud environments, we cannot ignore yesterday’s word-of-the-day, the hybrid cloud. A hybrid cloud typically refers to a mixed environment of public and private clouds, where workloads and data can reside in both. The private cloud in an enterprise environment is typically based on VMware or Microsoft technologies (and rarely on KVM/OpenStack), while the public cloud could be any of the available clouds. For our discussion it is safe to assume that a hybrid cloud scenario is similar to multi-cloud or cross-cloud (depending on how it is used), since it involves two locations and (typically) two or more virtualization technologies and cloud environments. While the private cloud side may offer fewer services and capabilities than the public cloud, the methodologies described above still apply.
Running your applications on multiple clouds can provide benefits in availability and cost, and can address regulatory requirements. However, you should carefully explore the real needs of your organization, why you want to run on multiple clouds, and how best to address those needs by leveraging one or more of the approaches discussed above.