The public cloud has transformed the way we use and consume IT resources. It has also changed the way we develop applications and how they interact with the underlying infrastructure on which they run. While public clouds are attractive to many users and organizations because of the benefits they bring, several myths have emerged about data protection in the cloud, and anyone using public clouds should understand them.
Myth 1 – “If I move my application to the public cloud, I no longer need to protect it myself; that’s what I am paying the cloud provider for.”
Initially, workloads used for test and development were the first to move, benefiting from the cloud’s pay-per-use economic model and resource elasticity. Once those trials were successful and confidence in the public cloud grew, organizations moved additional workloads to the public cloud, including business-important and mission-critical applications, some of them at the core of their business. Nevertheless, moving a workload to the public cloud does not change its business requirements for data protection and service availability. Since the regulatory requirements and the internal business processes and practices still apply, data protection solutions are needed to achieve the required levels of durability and availability. The obvious place to look for this is the public cloud service provider; however, a careful reading of the hyper-scale public clouds’ terms and conditions shows that the responsibility to protect the data lies with the user (i.e., you) and not with the cloud provider (e.g., AWS Customer Agreement section 4.3 and Microsoft Azure Agreement section 2.1).
The public cloud provider may offer tools to perform the protection, but putting them to work in a way that satisfies the requirements of your manager, your organization, and the regulator is your responsibility.
Myth 2 – “Cloud Native applications are designed according to 12-factor methodologies and are resilient, so they do not need data protection.”
Cloud native applications are very common among start-ups, which build them from scratch following the twelve-factor application design principles, and many enterprises with large development teams have adopted the same approach to build their next-generation applications. The twelve-factor methodology makes applications “cloud friendly” in the sense that they behave better in an unstable cloud environment. For example, a software container is required to be stateless, so if it crashes, a new container can be automatically spun up and continue where the former one left off. The common misconception is that with this design there is no need for an additional layer of data protection. In reality, while this is a great approach to making the compute resources more resilient, it does not protect the data in the database, where the application state is stored. In fact, the twelve-factor approach relies on a reliable, persistent datastore or database for maintaining the application data, so protecting it becomes even more important.
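To make the point concrete, here is a minimal sketch of how a twelve-factor-style process externalizes its state; the table, function name, and DATABASE_URL environment variable are illustrative assumptions, and it presumes a PostgreSQL backing service with the psycopg2 driver installed. Losing the container is harmless, but losing the database is not, which is exactly why the datastore still needs protection.

```python
# Sketch of a stateless process that keeps all state in an external
# backing service (twelve-factor: config and backing services come
# from the environment). Nothing is written to the container's disk.
# Assumes: PostgreSQL reachable via DATABASE_URL, psycopg2 installed,
# and an "orders" table with columns (id, payload) - all hypothetical.
import os
import psycopg2

def save_order(order_id: str, payload: str) -> None:
    # If this container crashes, a replacement picks up seamlessly,
    # because the state lives only in the database, not locally.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    try:
        with conn, conn.cursor() as cur:
            cur.execute(
                "INSERT INTO orders (id, payload) VALUES (%s, %s) "
                "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload",
                (order_id, payload),
            )
    finally:
        conn.close()
```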
Myth 3 – “The Public Cloud provider gives me all that I need to easily protect my data.”
Well, this has some truth in it, but for many organizations the “easy” part is still a myth. The public cloud providers offer tools and capabilities to perform the basic functions needed to create and retain copies of the data. However, they typically do not provide a fully designed data protection solution comparable to what data protection vendors offer for on-premises deployments. For example, while you can create snapshots of compute instances or storage volumes, there is no simple way built into the public cloud platform to manage the lifecycle of those snapshots, catalog them, search them, replicate them, and so on. Some operations are available on an individual instance or volume basis, but performing them across a large number of instances, with predefined protection policies and compliance reporting, as many businesses and organizations require, is not easily achievable with the public cloud tools alone. Coding and maintaining scripts for all these operations can be challenging for the cloud administrator or the data protection analyst, who may not be eager to keep maintaining and monitoring them for years to come. Moreover, a majority of organizations are adopting a multi-cloud strategy, meaning they use clouds from several vendors simultaneously; using the providers’ tools directly would require them to duplicate their efforts and manage protection from two (or more) distinct user interfaces.
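As an illustration of the scripting burden described above, here is a minimal sketch using AWS’s boto3 SDK; the backup=daily tag and the seven-day retention period are assumptions chosen for the example, not a standard. Even this toy version omits pagination, error handling, cross-region copies, cataloging, and compliance reporting, and it covers only one cloud.

```python
# Sketch of a do-it-yourself snapshot lifecycle script (AWS boto3).
# The "backup=daily" tag and 7-day retention are illustrative choices.
from datetime import datetime, timedelta, timezone
import boto3

ec2 = boto3.client("ec2")
RETENTION = timedelta(days=7)

# 1. Snapshot every EBS volume tagged for daily backup.
volumes = ec2.describe_volumes(
    Filters=[{"Name": "tag:backup", "Values": ["daily"]}]
)["Volumes"]
for vol in volumes:
    ec2.create_snapshot(
        VolumeId=vol["VolumeId"],
        Description=f"daily backup of {vol['VolumeId']}",
    )

# 2. Expire snapshots created by this script that are past retention.
snapshots = ec2.describe_snapshots(
    OwnerIds=["self"],
    Filters=[{"Name": "description", "Values": ["daily backup of *"]}],
)["Snapshots"]
cutoff = datetime.now(timezone.utc) - RETENTION
for snap in snapshots:
    if snap["StartTime"] < cutoff:
        ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])
```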
Luckily, there are solutions created by vendors to help manage data protection in the public cloud, using the efficient cloud-native tools and building layers of automation and orchestration on top of them. One such solution is Dell EMC Cloud Snapshot Manager, which provides this capability as a service and helps manage snapshots in the public cloud in a way that is simple yet comprehensive for the cloud administrator. Without requiring any software installation, it can automatically discover all the elements in the user’s public cloud environment, assign protection policies based on tagging, and manage the full lifecycle of the snapshots without manual intervention. This helps the user maintain compliance with the various requirements while controlling public cloud costs, e.g., by discarding old snapshots that are no longer needed.
Public clouds are a valuable part of the current and future IT environment; however, it is essential to plan the transition to the public cloud carefully and to ensure that the business requirements for data protection continue to be met. Make sure your data and workloads are available and recoverable, no matter what happens.