In my last blog, “Managed Services, Part 1: Making Safe, Documented, and Reliable Changes,” I defined one of the five cores of support of managed services—change management. In this blog, I will cover backup and recovery.

Have you ever watched your preteen freak out when he discovers that an extensive essay he wrote for school and saved as a Word document on his computer was lost? Now imagine how a CEO or CFO would feel upon discovering that files were lost on his or her computer. There would be more than a freak out. I presume that such a catastrophe could even lead to a dismissal or two.

Anyone who has a computer looks at their files as a precious commodity—in some cases, more valuable than gold. So, any managed services provider worth its salt should have a protocol for backing up and recovering data.

A hosting and managed services provider would ensure, for example, that backup data is fully encrypted to protect it from snooping competitors or disgruntled employees. The host would also ensure that only you and a representative of the hosting company have the key and access to the material. Additionally, the host would distribute data across all the availability zones of the cloud that you’re using, so that it can be easily recovered should something disastrous occur. You also want to make sure that the amount of backup storage in the cloud is customized to your needs. Finally, think about how often you want your data backed up and how long you want it retained; the host should offer retention periods ranging from just a few minutes to a few months.

A hosting company worth partnering with would be aware that a business does not want to keep backup data indefinitely and would make certain that its protocol includes regularly destroying backups that are no longer needed.
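To make the retention idea concrete, here is a minimal sketch in Python of how backups might be sorted into those to keep and those to destroy. The function name, the 30-day retention window, and the sample dates are all assumptions for illustration, not any provider's actual policy.

```python
from datetime import datetime, timedelta


def split_by_retention(backup_times, retention, now):
    """Return (keep, destroy): backups inside and outside the retention window."""
    cutoff = now - retention
    keep = [t for t in backup_times if t >= cutoff]
    destroy = [t for t in backup_times if t < cutoff]
    return keep, destroy


# Hypothetical example: backups taken 0, 1, 7, 30, 45, and 90 days ago.
now = datetime(2024, 1, 31)
backups = [now - timedelta(days=d) for d in (0, 1, 7, 30, 45, 90)]

# Assumed policy: retain backups for 30 days, destroy anything older.
keep, destroy = split_by_retention(backups, timedelta(days=30), now)
print(len(keep), "kept;", len(destroy), "marked for destruction")
```

A real provider would run a check like this on a schedule and securely delete the expired copies; the sketch only shows the decision itself.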

Furthermore, you will want to be aware of the level of backup the host can provide and what recovery mode the host uses if the cloud infrastructure crashes. The host should be clear about the turnaround time for dealing with recovery issues.

Intrigued? Want to know more? To fully understand the process of a managed services provider, we really should look at a best practice for hosting and managed services.

How Companies Ensure Data Backup and Recovery

The first thing to realize is that a hosting managed service provider usually uses a third party to provide the cloud infrastructure for infrastructure as a service (IaaS), which assists in the management process. One example of such a third party is Amazon, which offers Amazon Web Services (AWS) cloud regions to store data. Whoever the third-party cloud provider is, each region of the cloud you use includes two or more independent availability zones, and each of these zones uses one or more data centers. AWS, for example, has data centers in regions across the globe (10 regions at the time of writing).

A backup occurs when a snapshot is made of the data of each tier in the cloud’s Elastic Block Store (EBS) system. The snapshot is taken periodically and stored, still encrypted, in a section of the cloud that makes the files accessible in all availability zones.

The backup service of the system periodically makes encrypted copies of your website and distributes them across all availability zones of the cloud. The process is performed daily, and the backup data is retained for as long as you want it to be. The backup process is efficient and does not disturb the running of any applications.
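The snapshot-and-distribute step described above can be modeled with a short Python sketch. It is an in-memory illustration only: the zone names and function names are hypothetical, the volume data is assumed to be encrypted already (the code performs no encryption), and a real provider would use its cloud platform's own snapshot tooling.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical availability zones; real names depend on the cloud provider.
ZONES = ["zone-a", "zone-b", "zone-c"]


def take_snapshot(encrypted_volume: bytes, taken_at: datetime) -> dict:
    """Record the (already encrypted) volume data together with a checksum."""
    return {
        "taken_at": taken_at,
        "sha256": hashlib.sha256(encrypted_volume).hexdigest(),
        "data": encrypted_volume,
    }


def distribute(snapshot: dict, zones: list) -> dict:
    """Place one copy of the snapshot in every availability zone."""
    return {zone: dict(snapshot) for zone in zones}


def verify(copies: dict) -> bool:
    """Check that every copy still matches the checksum recorded at snapshot time."""
    results = {
        hashlib.sha256(copy["data"]).hexdigest() == copy["sha256"]
        for copy in copies.values()
    }
    return results == {True}


snapshot = take_snapshot(b"<encrypted bytes>", datetime.now(timezone.utc))
copies = distribute(snapshot, ZONES)
print("zones holding a copy:", sorted(copies))
print("all copies intact:", verify(copies))
```

The checksum is there so that a later restore can confirm a copy was not corrupted in storage, which is one reasonable way to back up the claim that the data can be recovered from any zone.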

Five types of disaster-recovery incidents trigger an action when recovery issues arise within the cloud:

  1. Failure of an individual application server or data volume.
  2. Failure of all application servers or data volumes in a solution tier.
  3. Complete failure of an availability zone.
  4. Failure of the elastic load balancer (ELB) with or without an accompanying failure of instances.
  5. Failure of an entire cloud region.

Should incident 1 occur, the architecture is designed on the assumption that the cloud operates redundantly. So if a single server fails, there is probably no effect on cluster operation. The data in the EBS remains intact. Other options to fix the issue include cloning operating systems. The time it takes to recover the data varies, so you should ask your hosting company what that time is in their system. The process will not affect the computer’s operation.

If incident 2 occurs, there could be an effect on the user. The data is recovered from the EBS or from backups. The recovery time may vary. Again, check with your hosting managed service provider as to how much time elapses before the data is restored.

Should incident 3 happen, the redundant architecture of the system would recover the data without impacting the user.  Ask the provider what the recovery time would be on its system.

If incident 4 takes place, the ELB is commonly redundant and will recover without affecting the user. The action taken in incident 3 could also handle the problem.

Finally, if incident 5 happens, additional capacity is temporarily started in a cloud region that is still functioning. Even the total failure of an entire cloud region would have limited impact because the cloud includes a redundant deployment reached through the latency-sensitive Domain Name System (DNS).
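The five incidents and their responses can be summarized as a simple lookup table. The Python sketch below restates the descriptions above; the names are made up for illustration, and a real provider's runbook would carry far more detail, including the recovery times you should ask about.

```python
from enum import Enum
from typing import NamedTuple


class Incident(Enum):
    SERVER_OR_VOLUME = 1    # one application server or data volume fails
    SOLUTION_TIER = 2       # every server or volume in a tier fails
    AVAILABILITY_ZONE = 3   # a whole availability zone fails
    LOAD_BALANCER = 4       # the ELB fails, with or without instances
    CLOUD_REGION = 5        # an entire cloud region fails


class Plan(NamedTuple):
    recovery: str
    user_impact: str


# Summary of the responses described in the text (wording is illustrative).
PLANS = {
    Incident.SERVER_OR_VOLUME: Plan(
        "rely on redundant servers; EBS data stays intact", "none expected"),
    Incident.SOLUTION_TIER: Plan(
        "recover data from the EBS or from backups", "possible"),
    Incident.AVAILABILITY_ZONE: Plan(
        "redundant architecture recovers the data", "none expected"),
    Incident.LOAD_BALANCER: Plan(
        "redundant ELB recovers; fall back to the zone-failure action",
        "none expected"),
    Incident.CLOUD_REGION: Plan(
        "temporarily start capacity in a functioning region", "limited"),
}


def plan_for(incident: Incident) -> Plan:
    """Look up the recovery plan for a given incident type."""
    return PLANS[incident]


for incident in Incident:
    plan = plan_for(incident)
    print(f"{incident.value}. {incident.name}: {plan.recovery} "
          f"(user impact: {plan.user_impact})")
```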

It would also be helpful if your host practices disaster recovery through so-called “war games,” which simulate an issue. Some companies with such a procedure perform the test every two to four weeks.
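As a rough illustration of what such a war game checks, the Python sketch below pretends to take each availability zone offline in turn and reports whether enough servers would survive. The deployment layout, zone names, and the minimum-server threshold are all assumptions; a real drill would fail actual infrastructure rather than entries in a dictionary.

```python
def surviving_servers(deployment, failed_zone):
    """List the servers that remain when one zone is taken offline."""
    return [
        server
        for zone, servers in deployment.items()
        if zone != failed_zone
        for server in servers
    ]


def run_drill(deployment, min_servers=1):
    """Simulate the loss of each zone; True means the service would stay up."""
    return {
        zone: len(surviving_servers(deployment, zone)) >= min_servers
        for zone in deployment
    }


# Hypothetical deployment: application servers spread unevenly across zones.
deployment = {
    "zone-a": ["app-1", "app-2"],
    "zone-b": ["app-3"],
    "zone-c": [],
}

for zone, survives in run_drill(deployment, min_servers=2).items():
    print(f"lose {zone}: {'OK' if survives else 'NOT ENOUGH CAPACITY'}")
```

With a threshold of two servers, the drill flags that losing zone-a would leave only one server running, which is the kind of weakness a regular test is meant to surface before a real failure does.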

Your data is precious. Assure yourself that your managed service provider employs a system that can back up and recover data fully and effectively.