Regardless of size, every business gets hurt when downtime strikes. Small and medium businesses (SMBs) take a big hit when their systems go down. An ITIC study found that nearly half of SMBs estimate that a single hour of downtime costs as much as $100,000 in lost revenue, end-user productivity, and IT support.
That’s why more and more SMBs are adopting disaster recovery as a service (DRaaS). One study shows 34 percent of companies plan to migrate to DRaaS in 2021.
Cloud-based backup and disaster recovery solutions are often at the top of the list when companies evaluate DRaaS options. They allow businesses to access data anywhere, anytime, with certainty, because the best disaster recovery clouds are highly distributed and fault-tolerant, delivering 99.999+ percent uptime.
This article is intended to help SMBs understand the key components and requirements of a DRaaS solution.
DRaaS – the basics:
Replication: Distributed Backups Maximize Data Protection
Data replication is the process of updating copies of data in multiple places at the same time. Replication serves a single purpose: it makes sure data is available to users when they need it.
Data replication synchronizes a data source (say, primary storage) with backup target databases, so when changes are made to the source data, the backups are quickly updated to match. The target database can hold the same data as the source database, which is full-database replication, or only a subset of it.
For backup and disaster recovery, it makes sense to make full-database replications. At the same time, companies can also reduce their source database workloads for analysis and reporting functions by replicating subsets of source data, say by business department or country, to backup targets.
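The difference between full-database and subset replication can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the in-memory dictionaries stand in for real databases, and the `replicate` function, its `keep` filter, and the sample rows are hypothetical names invented here, not part of any DRaaS product.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, str]


def replicate(source: Dict[int, Row], target: Dict[int, Row],
              keep: Optional[Callable[[Row], bool]] = None) -> int:
    """Bring target in line with source and return the number of changes.

    With keep=None every row is copied (full-database replication).
    With a filter, only matching rows are copied (subset replication).
    """
    wanted = {key: row for key, row in source.items()
              if keep is None or keep(row)}
    changes = 0
    # Copy new or modified rows to the target.
    for key, row in wanted.items():
        if target.get(key) != row:
            target[key] = dict(row)
            changes += 1
    # Remove rows that no longer belong in the target.
    for key in list(target):
        if key not in wanted:
            del target[key]
            changes += 1
    return changes


# Hypothetical sample data.
source = {
    1: {"dept": "sales", "country": "US"},
    2: {"dept": "finance", "country": "DE"},
    3: {"dept": "sales", "country": "DE"},
}
full_copy: Dict[int, Row] = {}
sales_copy: Dict[int, Row] = {}

replicate(source, full_copy)                                        # full copy
replicate(source, sales_copy, keep=lambda r: r["dept"] == "sales")  # subset
```

The full copy is what a recovery would restore from, while the smaller sales-only copy could serve reporting queries without loading the source database.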
Image Management: Managing and Maintaining Backup Images
As companies continue to add more backups over time, they’ll need to manage these accumulated images and the storage space the images consume. Image management solutions with a managed-folder structure allow companies to spend less time configuring settings on backups. But that’s just the start. These solutions can also provide image verification so that backup image files are ready and available for fast, reliable recovery and advanced image verification that delivers regular visual confirmation that backups are working correctly.
To reduce restoration time, the risk of backup file corruption, and the storage space required, image management solutions can automatically consolidate continuous incremental backup image files. Companies can balance storage space and file recovery by setting policies that suit their needs, and they can easily watch over backup jobs in the user interface, with alerts sent when any issues arise.
Image management solutions also allow the management of system resources to enable throttling and concurrent processing. Backups are replicated onto backup targets (local, on-network, and cloud) so companies are always prepared for disaster. In addition, the solution allows the recovery of a server to be pre-staged before disaster strikes, which reduces downtime.
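To see why consolidation shortens restores, consider a minimal sketch in Python. It assumes, purely for illustration, that each backup image is a mapping from block number to block contents and that incrementals are applied oldest first; the `consolidate` and `restore` functions are hypothetical and do not describe how any particular product stores its images.

```python
from typing import Dict, List

Image = Dict[int, bytes]  # block number -> block contents (an assumption)


def consolidate(incrementals: List[Image]) -> Image:
    """Merge a chain of incrementals (oldest first) into one image.

    When several incrementals changed the same block, the newest wins.
    """
    merged: Image = {}
    for image in incrementals:
        merged.update(image)
    return merged


def restore(base: Image, incrementals: List[Image]) -> Image:
    """Rebuild the disk by applying incrementals on top of the base image."""
    disk = dict(base)
    for image in incrementals:
        disk.update(image)
    return disk


# Hypothetical data: a base image and three daily incrementals.
base = {0: b"boot", 1: b"data-v1", 2: b"logs-v1"}
chain = [{1: b"data-v2"}, {2: b"logs-v2"}, {1: b"data-v3", 3: b"new"}]

rollup = consolidate(chain)
```

Restoring from the base plus the single rolled-up image yields the same disk as replaying the whole chain, but there are fewer files to read, store, and verify.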
Failover: The Core Driver for Business Continuity
Failover is a backup operational mode that switches to a standby database, server, or network if the primary system fails or is offline for maintenance. Failover ensures business continuity by seamlessly redirecting requests from the failed or downed mission-critical system to the backup system. The backup systems should mimic the primary operating system environment and be on another device or in the cloud.
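The redirect at the heart of failover can be illustrated with a short Python sketch. It assumes a single primary, a single standby, and a caller-supplied health check; the `FailoverRouter` class and the host names are hypothetical, and a real deployment would rely on the DRaaS provider's own monitoring and switching mechanisms.

```python
from typing import Callable


class FailoverRouter:
    """Send requests to the primary until it fails a health check."""

    def __init__(self, primary: str, standby: str,
                 is_healthy: Callable[[str], bool]) -> None:
        self.primary = primary
        self.standby = standby
        self.is_healthy = is_healthy
        self.active = primary

    def route(self) -> str:
        """Return the system that should receive the next request."""
        # Switch only in one direction here. Returning to the primary is
        # a separate, deliberate failback step, not an automatic flip.
        if self.active == self.primary and not self.is_healthy(self.primary):
            self.active = self.standby
        return self.active


# Hypothetical hosts; `down` simulates which systems are offline.
down = set()
router = FailoverRouter("onsite-server", "cloud-standby",
                        lambda host: host not in down)

first = router.route()       # primary is healthy
down.add("onsite-server")    # simulate an outage
second = router.route()      # requests now go to the standby
```

The router deliberately stays on the standby even after the primary comes back, since returning to the primary involves synchronizing data first.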
With failover capabilities for important servers, backend databases, and networks, companies can count on continuous availability and near-certain reliability. Say the primary onsite server fails. Failover takes over hosting requirements with a single click. Failover also lets companies run maintenance projects, without human oversight, during scheduled software updates. That ensures seamless protection against cybersecurity risks.
Why Failover Matters
While failover integration may seem costly, it’s crucial to bear in mind the incredibly high cost of downtime. Think of failover as a critical safety and security insurance policy. And failover should be an essential part of any disaster recovery plan. From a systems engineering standpoint, the focus should be on minimizing data transfers to reduce bottlenecks while ensuring high-quality synchronization between primary and backup systems.
Failback: Getting Back to Normal
Failback is the follow-on to failover. While failover is switching to a backup source, failback is the process of restoring data to the original resource from a backup. Once the cause of the failover is remedied, the business can resume normal operations. Failback also involves identifying any changes made while the disaster recovery site or virtual machine was running in place of the primary site or virtual machine.
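Identifying what changed while the recovery site was in charge can be sketched as a comparison between two states. The Python below is an illustration under stated assumptions: data is modeled as simple key-value dictionaries, a snapshot was taken at the moment of failover, and the function names `changes_since_failover` and `fail_back` are hypothetical.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]


def changes_since_failover(at_failover: State,
                           dr_now: State) -> Tuple[State, List[str]]:
    """Compare the failover-time snapshot with the recovery site's state.

    Returns the records created or modified on the recovery site, plus
    the keys that were deleted there.
    """
    upserts = {key: value for key, value in dr_now.items()
               if at_failover.get(key) != value}
    deletes = [key for key in at_failover if key not in dr_now]
    return upserts, deletes


def fail_back(primary: State, upserts: State, deletes: List[str]) -> None:
    """Apply the recovery site's changes to the repaired primary."""
    primary.update(upserts)
    for key in deletes:
        primary.pop(key, None)


# Hypothetical data: the primary matches the snapshot taken at failover.
snapshot = {"order-1": "paid", "order-2": "open", "order-3": "open"}
primary = dict(snapshot)
dr_site = {"order-1": "paid", "order-2": "shipped", "order-4": "open"}

upserts, deletes = changes_since_failover(snapshot, dr_site)
fail_back(primary, upserts, deletes)
```

Moving only the changed records back, instead of the entire data set, keeps the amount of data transferred during failback small.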
It’s crucial that the disaster recovery solution can run the company’s workloads and sustain the operations for as long as necessary. That makes failback testing critical as part of the disaster recovery plan. It’s essential to monitor any failback tests closely and document any implementation gaps so they can be closed. Regular failback testing will save critical time when the company needs to get its house back in order.
Companies need to consider several important areas regarding the failback section of their disaster recovery plan. Connectivity is first on the list. If there isn’t a reliable connection or pathway between the primary and backup data, failback likely won’t even be possible. A secure, stable connection allows a failback to be performed without interruption. It also keeps source data and backup target data synchronized, so the potential for data loss is minimized.
Companies also need to ensure that data stored in their disaster recovery site is always secure. If a disaster strikes, it may be impossible to recover quickly. Suppose a failover does occur and the company’s operations are now running from a disaster recovery cloud. In that case, they need to protect the data in that virtual environment by replicating it to their backup targets immediately. That’s why network bandwidth is the next concern. If SMBs don’t have sufficient bandwidth, bottlenecks and delays will interfere with synchronization and hamper recovery.
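A rough calculation shows how bandwidth limits synchronization. The Python sketch below is a back-of-the-envelope estimate only: it assumes decimal units (1 GB = 8,000 megabits), and the `transfer_hours` function and its `utilization` parameter are hypothetical names introduced for this example. The figures in the usage lines are made up.

```python
def transfer_hours(data_gb: float, bandwidth_mbps: float,
                   utilization: float = 1.0) -> float:
    """Estimate the hours needed to move data_gb over a network link.

    utilization is the share of the link actually available for
    replication (1.0 means the whole link, 0.5 means half of it).
    """
    if bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be positive, utilization in (0, 1]")
    megabits = data_gb * 8_000            # 1 GB = 8,000 megabits (decimal)
    seconds = megabits / (bandwidth_mbps * utilization)
    return seconds / 3_600


# Hypothetical example: 500 GB of changed data over a 100 Mbps link.
ideal = transfer_hours(500, 100)                       # about 11.1 hours
shared = transfer_hours(500, 100, utilization=0.5)     # about 22.2 hours
```

Even under ideal conditions, the 500 GB in this example takes roughly 11 hours on a 100 Mbps link, which is why insufficient bandwidth quickly turns into a recovery bottleneck.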
Testing is the most critical element for ensuring failback is successful when businesses need it. That means testing all systems and networks to ensure they are capable of resuming operations after failback. It’s advisable to use an alternate location as the test environment and to apply what the tests reveal to optimize failback strategies.
Whether it’s a natural disaster like a hurricane or a flood, a regional power outage, or even ransomware, there is little doubt about the business case for DRaaS. With DRaaS ensuring business continuity, no matter what happens, recovery from a sitewide disaster is fast and easy to perform from a disaster recovery cloud. Add up the cost to a business in dollars and cents: lost data, lost productivity and reputation damage. Just an hour of downtime could pay for a year—or many years, for that matter—of DRaaS.
Shridar Subramanian is the CMO of Arcserve. He has more than 23 years of experience in information technology. Previously, Shridar was the VP of marketing at Virident Systems, a leading provider of PCI SSDs, and he also was the senior director of marketing at Monosphere Inc., a storage virtualization software company.