One of the hottest topics in IT is multi-cloud strategy. It seems like only yesterday people had to be convinced that a public cloud could be a viable alternative for running production workloads, and now organizations are looking into using more than one cloud. According to Enterprise Strategy Group, 81% of organizations use two or more cloud providers, and this trend shows no sign of slowing down. So if you are using a public cloud, should you also adopt a multi-cloud strategy? Maybe you should even have a cross-cloud strategy? And what is the difference between the two?

It should be noted that sometimes (e.g. in Wikipedia) the term multi-cloud is used to describe both multi-cloud and cross-cloud (as they are defined below). However, I believe it is worth drawing a clear distinction between these models, since they present different challenges that lead to different implementations.
Typically, when one talks about a “multi-cloud” strategy, they mean running workloads in more than one cloud. The main reason cloud users give for using multiple clouds is application optimization. Different clouds are optimized for different parameters, based on the positioning determined by the cloud provider. One could be optimized for availability but expensive to use, while another is much cheaper with lower resilience. A certain cloud could be optimized for auto-scaling and elastic resource allocation, while another is more rigid but delivers much better performance. The ability to run each application in the cloud that best fits the requirements of that workload can lead to easier application deployment, cost effectiveness and improved results.
When multiple clouds are used this way, i.e. running different applications on different clouds, it is typically referred to as a multi-cloud strategy. The cloud environments are usually kept separate, and data is rarely transferred between them. In some cases, the applications on each cloud may even be operated by different departments. Nevertheless, running multiple environments in parallel still introduces complexities in managing them consistently:
- Knowledge and experience – the IT organization is required to manage several quite different cloud environments. Learning how to manage workloads and assets in Amazon AWS is very different from doing the same in Microsoft Azure or Google Cloud Platform. It requires learning different terminologies, processes and tools, and means working with multiple vendors, with all the complexity that entails.
- Common policy management – workloads should comply with the corporate IT policies for data protection, availability and security whether they run on-premises or on public clouds. Unfortunately, the public clouds use different terminologies and definitions, and provide different sets of data availability, protection and privacy capabilities. Translating the organization’s policies to each cloud is not always a simple task, and when many applications/VMs are involved, it becomes a major challenge.
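To make the policy-translation problem concrete, here is a minimal sketch of a lookup table that maps one corporate policy to the provider-specific construct that implements it on each cloud. The mappings shown are simplified illustrations, not a complete or authoritative catalog, and a real policy engine would be far more involved:

```python
# Illustrative only: the same corporate policy maps to different,
# provider-specific constructs on each cloud. Entries are simplified
# examples, not a complete catalog of each provider's controls.
POLICY_MAP = {
    "encrypt-data-at-rest": {
        "aws": "S3 default encryption / EBS encryption (KMS keys)",
        "azure": "Storage Service Encryption (Key Vault keys)",
        "gcp": "Cloud Storage default encryption (Cloud KMS keys)",
    },
    "restrict-public-access": {
        "aws": "S3 Block Public Access",
        "azure": "Storage account public access setting",
        "gcp": "Organization Policy: storage.publicAccessPrevention",
    },
}

def controls_for(policy: str, cloud: str) -> str:
    """Look up how one corporate policy is expressed on one cloud."""
    return POLICY_MAP[policy][cloud]

print(controls_for("encrypt-data-at-rest", "azure"))
```

Even this toy table shows why a single corporate policy document does not translate one-to-one: each row must be researched, implemented and maintained separately per provider.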
A different approach to leveraging multiple clouds is to have the same data (and applications) reside in multiple clouds, for any of several reasons: data protection and recovery from a disaster that occurred in one cloud, a regulatory requirement, or the desire to avoid vendor lock-in and be able to quickly switch to another cloud provider. Such a strategy will be referred to below as a cross-cloud strategy (although, as mentioned above, some would confusingly call it multi-cloud as well). This approach is supposed to support flexible movement of applications and data between the clouds and, in theory, enable the same capabilities as those available when two on-premises data centers are used. However, it is not easy to implement: in addition to the complexities of using multiple clouds detailed above, there are further challenges involved:
- Complexity – adapting an on-prem application to simply run on a public cloud is not a trivial task. Making it run cost-effectively on that cloud is an additional effort that requires deep knowledge of the cloud’s services and operating model. Now imagine doing the same for two (or more?) clouds. This increases the effort required, the development and testing time, the size of the team and the overall cost. Beyond the application, the data structure itself may need to change, so simple data replication is insufficient for complex scenarios like disaster recovery.
- Cost – public cloud providers are interested in getting users on board, not in helping them leave. Therefore, while ingress data transfer (copying data into the cloud) is typically free, the cost of egress data transfer (sending data out of the cloud) is high. Sending data out to another cloud, especially if this is done continuously for data protection purposes, can add a significant burden to the business. For example, if one has a 10TB database in AWS with an assumed daily change rate of 10%, replicating the changed data daily to another cloud would cost approximately $30,000 a year in AWS data transfer cost alone (excluding networking costs and resource costs in AWS and in the other cloud).
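The $30,000 figure above can be checked with simple arithmetic. The sketch below assumes an egress price of roughly $0.09/GB for AWS internet data transfer; actual pricing is tiered and varies by region, so treat this as a back-of-the-envelope estimate rather than a quote:

```python
# Back-of-the-envelope check of the annual egress cost figure.
db_size_gb = 10 * 1000        # 10 TB database, expressed in GB
daily_change_rate = 0.10      # 10% of the data changes each day
egress_price_per_gb = 0.09    # assumed AWS internet egress price, USD/GB

daily_egress_gb = db_size_gb * daily_change_rate        # 1,000 GB/day
annual_cost = daily_egress_gb * 365 * egress_price_per_gb
print(f"~${annual_cost:,.0f} per year")                 # ~$32,850 per year
```

At these assumed rates the result is roughly $32,850 per year, consistent with the "approximately $30,000" estimate in the text.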
For those reasons, using a cross-cloud strategy to prevent vendor (cloud provider) lock-in is a tricky subject. If one is running workloads in Cloud A, the process of moving to Cloud B could take months (for planning, application and data refactoring, data transfer, etc.) and cost a lot, so it is not a real threat to Cloud Provider A. To circumvent that complexity, the user would need to continuously maintain workloads in both clouds, ensuring a simpler and faster transition. However, as discussed earlier, this is complex and costly, both in cloud resources and in the personnel needed to handle applications in both clouds. Methods to do this will be discussed in Part 2 of this post, but you should weigh the cost and complexity of maintaining an option (which may never be exercised) to easily move to another vendor.
Similarly, where disaster recovery (DR) is concerned, the simpler and cheaper way to protect your data against a regional disaster is to replicate it to another region within the same cloud. If the US-East region is impacted by a hurricane or a blackout, you could most likely still run in the US-West region, or even in Europe. This is a much simpler implementation: no conversion of application, networking or storage configuration is required, tools provided by the cloud provider can easily transfer the data and orchestrate the failover, and intra-cloud data transfer is much cheaper (roughly 4 times cheaper in AWS) than sending the data to another cloud. Again, you should consider the complexity and cost of running in two clouds just to protect against a total, global failure of your cloud provider. This is a business decision, and depending on your organization’s requirements you may be forced to do it, but I believe most users and organizations would opt for the simpler path. By now, many people consider the large, hyper-scale cloud providers a force of nature, with built-in redundancy and protection mechanisms; to them a total global failure of such a cloud is unthinkable and would be considered force majeure.
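The "roughly 4 times cheaper" claim can also be sanity-checked. Applying the same 1 TB/day replication stream from the earlier example, and assuming approximately $0.02/GB for AWS inter-region transfer versus $0.09/GB for internet egress (both approximations of list pricing, which varies by region and tier):

```python
# Comparing the same daily replication stream (1 TB/day of changed data)
# replicated cross-region within AWS vs. sent out to another cloud.
daily_gb = 1000                # 1 TB/day of changed data, in GB
inter_region_price = 0.02      # assumed AWS inter-region price, USD/GB
internet_egress_price = 0.09   # assumed AWS internet egress price, USD/GB

intra_cloud_annual = daily_gb * 365 * inter_region_price      # ~$7,300/yr
cross_cloud_annual = daily_gb * 365 * internet_egress_price   # ~$32,850/yr
print(f"intra-cloud: ~${intra_cloud_annual:,.0f}/yr, "
      f"cross-cloud: ~${cross_cloud_annual:,.0f}/yr, "
      f"ratio: {cross_cloud_annual / intra_cloud_annual:.1f}x")
```

At these assumed rates, cross-cloud replication costs about 4.5 times more than intra-cloud replication for transfer alone, in line with the figure quoted above.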
I do not dispute that there are situations where regulatory requirements or other legal considerations force organizations to run workloads and continuously store data in more than one cloud. Yes, this happens, although I would argue it is confined to specific industries or verticals; in these cases, the complications above must be addressed.
So what’s next?
A multi-cloud strategy is popular for good reasons, and a cross-cloud strategy is important in certain market segments. However, making either work in a business environment still requires proper planning, additional supporting tools, and possibly the selection of suitable architectures and environments to make it feasible and cost effective. Both can be put to good use to achieve the goals of the users, as will be detailed in Part 2 of this post.