In order to increase application Availability, every company, no matter the size, needs to have a secondary site for disaster recovery (DR). Today’s technology provides the ability to power on any workload instantly in the secondary site if any issue occurs in the primary one.
Historically, DR sites have only been a solution targeted at large enterprises for a number of reasons. However, new technologies and service offerings like Disaster Recovery as a Service (DRaaS) make DR sites accessible and cost-effective for companies of any size.
The rise of virtualization
Without any doubt, the main driver for the explosive growth of disaster recovery solutions has been virtualization. Before, replication of data and applications to a secondary site was only possible by leveraging the capabilities of the storage arrays, but not every solution included native replication capabilities. Besides, the replication at the storage layer has two additional pitfalls:
From a technology point of view, storage layer replication requires both sites to adopt the same storage model or brand, because every replication technology is proprietary. This isn’t an issue for large organizations that can create a secondary site with the same size and power as the primary site. However, small and mid-sized organizations don’t have the deep wallets or the interest to invest the same budget in both the primary and secondary sites. They need a more cost-effective solution.
Replication at the storage layer lacks granularity to enable the fine-tuning of the replication policy. A storage array is not aware of the workloads it hosts. If a database and a file server are both stored in the same location, they are replicated using the same policy. Because of this, it’s difficult to create different policies and obtain different RPOs and RTOs for different workloads, unless administrators separate workloads into different silos. However, that causes additional overhead and requires more management, monitoring and IT resources.
Virtualization, on the other hand, makes replication services easily accessible, configurable and usable, and it eliminates these two major pitfalls. With virtualization, it is possible to define specific RTOs and RPOs for each workload because the management unit transitions from the giant, bulky storage array to flexible, individual virtual machines. An IT administrator can prioritize and deprioritize certain workloads. For example, they can configure a replication solution to create copies at the secondary location near-continuously (every 15 minutes) for the important database server but every hour or even every day for a file server. Virtualization has abstracted the hardware layer, making it possible to apply more granular controls to replication, optimize results and utilize IT resources more efficiently.
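To make the per-workload idea concrete, here is a minimal Python sketch of replication policies defined per virtual machine rather than per storage array. The class `ReplicationPolicy`, the function `due_for_replication` and the VM names are hypothetical and only illustrate the scheduling logic; they are not the API of any real replication product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical per-VM policy: the interval acts as the target RPO."""
    vm_name: str
    interval: timedelta


def due_for_replication(policies, last_replicated, now):
    """Return the VM names whose newest replica is at least one interval old.

    A VM with no recorded replica is always considered due.
    """
    due = []
    for policy in policies:
        last = last_replicated.get(policy.vm_name)
        if last is None or now - last >= policy.interval:
            due.append(policy.vm_name)
    return due


# Illustrative values taken from the example in the text:
# 15 minutes for the database server, one hour for the file server.
policies = [
    ReplicationPolicy("db-server", timedelta(minutes=15)),
    ReplicationPolicy("file-server", timedelta(hours=1)),
]

now = datetime(2024, 1, 1, 12, 0)
last_replicated = {
    "db-server": now - timedelta(minutes=20),
    "file-server": now - timedelta(minutes=20),
}

print(due_for_replication(policies, last_replicated, now))  # ['db-server']
```

Both machines were last replicated 20 minutes ago, yet only the database server is due, because each workload carries its own interval. With array-level replication, both would share a single schedule.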
The need for a second site
Disaster recovery is a great solution to increase the Availability of modern data centers, and it does so by leveraging virtualization and state-of-the-art replication technologies to create off-site replicas of virtual machines.
But when end users start to plan for a DR site, they are faced with other problems. First, the capital expenses of building and maintaining the secondary site are a challenge for many organizations. In a second location, owned or rented (for example in a colocation facility), one needs to deploy new hardware and software according to the size of the production environments. Then one also has to configure and manage it, virtually doubling the IT infrastructure efforts. Additionally, because production workloads mostly run at the primary site, the secondary site is rarely used. This drives the cost even higher compared to its value, making it difficult to sell an ROI to upper management.
Historically, only large enterprise organizations had the necessary budget to afford a private secondary site, or they already had more than one office with IT staff in each to leverage the facility or those resources. Today, however, even large organizations are looking for ways to reduce their capital and operational expenditures while still maintaining a DR site.
Disaster Recovery as a Service
This is one of the situations where a cloud-based solution fits perfectly, and it’s the reason why Disaster Recovery as a Service is becoming so popular. By renting resources from a service provider on a pay-as-you-go model, end users get the same result (CPU, RAM, storage and networking resources available for failover operations) without the capital costs and the burden of designing, deploying and managing a DR site day to day. Service providers with expertise in DR handle the complete management of the infrastructure. In return, the service provider offers an SLA (Service Level Agreement) covering the quality of the offered service. The service provider can also offer different SLAs for different workloads, thanks again to virtualization.
This allows end users to focus on the replication activity, plan the different RPO values needed for applications, and define the DR strategy and DR solution using business metrics rather than IT needs.
It’s no wonder DRaaS is in high demand. In fact, as you can see from this Google Trends chart, DRaaS’s popularity has exploded.
DRaaS popularity trends
If you look at another search term, “Disaster Recovery,” you can see that demand for DR alone is actually shrinking:
This is a sign that customers are not searching as often for a generic solution, but they are specifically looking for DRaaS from service providers.
The plethora of solutions available on the market is impressive. Companies of any size looking for a DRaaS solution should evaluate carefully which options come with the service. Let’s review some of them.
1. Ease of use
Ease of use is an often-overlooked DR solution characteristic. People tend to focus more on the great capabilities of a given technology, but if this technology is also extremely complex and hard to set up and use, that should be a giant red flag. The promised ROI will be difficult to achieve and the value added to your business will be limited. On the other hand, an easy-to-use solution can be tested more quickly and adopted faster and with less effort. Most importantly, the end user can benefit from a technology that “just works” during the ongoing consumption of the service without the need to tune it constantly, debug issues and so on.
Another overlooked aspect: During a DR scenario, IT people are often stressed by the issues they are facing, the downtime they are experiencing and the pressure for a quick return to operations from upper management. An easy-to-use solution gives you the ability to focus on the few easy steps required to restart the applications in the secondary site, while a complex solution compounds issues under stressful situations.
Veeam’s cloud technology, VM replication through Veeam Cloud Connect, is easy to use and simple to set up, thanks to connectivity over a single TCP port, protected by a secure, reliable SSL/TLS connection to the service provider of your choice. There’s no need to set up and maintain VPN connections or open multiple ports in customer firewalls. The single tunnel is used for any kind of traffic: replication management traffic, actual VM data transfers and even inter-VM communications during partial failovers. All communications are encapsulated into the single tunnel. Once the connection is established, there’s no need for any additional network configuration.
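From the customer side, a single-port design means only one outbound TCP port has to be reachable. The following Python sketch shows a generic TLS reachability probe under stated assumptions: the functions `make_tls_context` and `probe_gateway` are hypothetical, the host name and port are placeholders to be replaced with the values published by your service provider, and the probe only attempts a standard TLS handshake. It does not speak the replication protocol itself, and whether a given gateway accepts a plain handshake like this is also an assumption.

```python
import socket
import ssl
from typing import Optional


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server certificate."""
    context = ssl.create_default_context()
    # Assumption: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def probe_gateway(host: str, port: int, timeout: float = 5.0) -> Optional[str]:
    """Open one TCP connection, attempt a TLS handshake, return the TLS version.

    Raises OSError (including ssl.SSLError) if the port is unreachable
    or the handshake fails.
    """
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_socket:
        with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
            return tls_socket.version()


# Example call (placeholder host and port, not real values):
# probe_gateway("gateway.provider.example", 443)
```

If the probe fails, the usual suspects are an outbound firewall rule or a proxy blocking that one port, which is a far smaller surface to troubleshoot than a VPN and a set of multiple open ports.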
2. What about networking?
DRaaS is mainly designed to offer replication services, but replication alone is not enough. In fact, one of the biggest pain points of any DR service is NOT replication: It’s networking. Applications have specific network configurations, and many modern services need to be published on the internet in order to be consumed by employees, suppliers and customers. While replicating a virtual machine and powering it on at the DR site is relatively easy, guaranteeing that networking is properly configured and automatically re-programmed when the configuration changes is NOT, and this guarantee is often taken for granted. When looking for a DRaaS solution, end users should ask specific questions to uncover a DRaaS solution’s capabilities in this area. The ability to move applications seamlessly and transparently between the primary and secondary sites is invaluable, because it saves the time otherwise spent reconfiguring applications during a failover.
3. Self-Service
Self-service is paramount to any cloud service, and DRaaS is no exception. How many operations can service providers automate? It may seem like this is only important for service providers, but an automated service means end users can request changes and adjust their DRaaS subscriptions quickly and easily. There is no value in consuming a cloud service if it takes a service provider hours or days to process a change request.
Obviously, self-service is also about how many operations you can execute on your own, without asking the service provider for anything. Ultimately, the service provider is in charge of maintaining the underlying infrastructure, while every operation on a company’s data and workloads can remain with that company.
This is for a simple yet powerful reason: The end user knows exactly how the service should be configured or how it should be reconfigured when needed.
However, self-service also has huge value in reducing downtime during a failover: If anything major happens to the end-user site, the user can leverage the self-service capabilities of the chosen service provider and quickly start a failover operation without losing time opening a support ticket with the service provider.