Helping you compute the right future since 1986
 

 

DISASTER PLANNING
Planning for computer failures, natural disasters and human error.

WHAT CONSTITUTES A DISASTER?

It seemed appropriate to focus on disaster planning for our first "focus" of 1997, since January marked the 3rd anniversary of the Northridge earthquake that left us without an office for months. Every business can prepare for this kind of natural disaster, as well as for more common disasters such as equipment failure or human errors like file erasure. Here is a list of the kinds of disasters or situations that can be overcome with careful planning:

- Data loss from human or computer malfunction
- Power outages
- Equipment failure
- Fire, earthquake, or flood
- Robbery

We find that businesses today rely more and more on their computers and networks in their daily routines. Being without your computer for just a few hours can be a major pain, and when a system is down for a few days it can become a major obstacle to doing business. This month, I’d like to focus on ways to avoid, mitigate, or prepare for the inevitable down-time that we all face at one time or another.

HOW LONG CAN YOU AFFORD TO BE DOWN?

This is the most important question to ask yourself when determining a strategy for disaster preparation. If you rely very little on computers and the data they process, you may be able to get by for days or weeks without their services. Other businesses can’t be down for even an hour and therefore have many redundant and backup systems in place which can be used during times of failure or disaster. I’ve broken it down into three simple categories: 1 Hour, 1 Day, or 1 Week.

One Hour

If you feel that one hour is too long to be down, then you’re in this critical category, where extra measures are needed both to prevent failures and to have an immediate plan of action ready should one occur. Measures that should be implemented include battery backup devices on ALL fileservers and workstations, nightly as well as mid-day backups to tape or disk, a backup or standby server for networks in case of a server failure, and duplication of equipment and software installations wherever possible. Duplicate copies of important programs and data can also be maintained on a laptop computer in the event you are forced to leave your offices quickly. Please review items a, b, c, and d below for more details.

One Day

This category probably fits a large percentage of businesses. Being down a full day is definitely a problem but can be managed. Again, common problems can be avoided by utilizing battery backups on servers and workstations, implementing daily backups, and duplicating the installation of those important software applications that you might need to use in a pinch. Other measures that can be taken in a network environment include keeping any "unique" parts on hand for an emergency. Although we stock standard items, some installations cannot be duplicated, especially as equipment ages. Review items a, b, and c for more details.

One Week

If you find yourself in this category, your reliance on computer technology is low. In actuality, many businesses may be between the One Day and One Week categories when push comes to shove. Because almost any equipment can be replaced in a week’s time, very little or no redundancy of hardware is required. However, your data is irreplaceable, so you still need to be concerned about getting good backups – daily if possible. Finally, don’t forget to keep software diskettes in a safe, secure place where they can be located, especially if your backup routine does not include backup of program files. See items a and b for more details.

PLANNING FOR ALL THESE PROBLEMS

Now that you’ve decided which time-frame category you’re in, review the appropriate measures listed below so that the problems they address can be avoided or dealt with appropriately.
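
As a quick reference, the mapping between the three downtime categories and the lettered items can be sketched in a few lines of Python. The section letters and category names come from this article; the function names and the exact 1-hour and 24-hour cut-offs are illustrative assumptions, not a formal rule.

```python
# Sections A-D of this article.
MEASURES = {
    "A": "Backups",
    "B": "Preventive Measures",
    "C": "Duplication",
    "D": "Total Fault Tolerance",
}

# Items each category is asked to review, as stated in the article.
CATEGORY_SECTIONS = {
    "one hour": ["A", "B", "C", "D"],
    "one day": ["A", "B", "C"],
    "one week": ["A", "B"],
}


def category_for(max_downtime_hours):
    """Pick a category from the longest outage (in hours) you can tolerate.

    The 1-hour and 24-hour thresholds are assumptions made for this sketch.
    """
    if max_downtime_hours <= 1:
        return "one hour"
    if max_downtime_hours <= 24:
        return "one day"
    return "one week"


def measures_for(max_downtime_hours):
    """Return the names of the measures to review for a given tolerance."""
    category = category_for(max_downtime_hours)
    return [MEASURES[letter] for letter in CATEGORY_SECTIONS[category]]
```

For example, an office that could tolerate roughly eight hours of downtime would call measures_for(8) and be pointed at Backups, Preventive Measures, and Duplication.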

A - Backups

Backups are the single most important technique for handling disasters. While equipment may fail or the earth may move, you can get back in business eventually, and more easily, if you have important data and/or programs on diskette or tape. We generally recommend backing up all files and doing it on a nightly basis – more often if the importance of the data requires it. It is also critically important that multiple copies be kept. If you’re using a tape backup, rotate tape cartridges and use at least a 3 tape rotation – preferably 6 tapes or more, with one or more tapes stored off-site in the event of a disaster involving your office space. Additional tapes rotated on a monthly basis provide a means to recover lost or damaged files where the loss is not recognized for some time. Don’t forget to keep copies of all important software and have backup copies available when possible – especially for those outdated programs not easily re-purchased!
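
One way to keep a rotation honest is to let the calendar decide which tape goes in the drive. The sketch below is only an illustration of the idea described above: the function name, the tape labels, and the choice to use a monthly tape on the 1st of each month are all assumptions, and a real schedule would also need to account for weekends and for which tapes are currently off-site.

```python
from datetime import date


def tape_for(day, daily_tapes=6, monthly_tapes=12):
    """Return the label of the tape to load for the nightly backup on `day`.

    Hypothetical scheme: a monthly tape on the 1st of each month,
    otherwise cycle through the daily tapes in order.
    """
    if daily_tapes < 3:
        raise ValueError("use at least a 3 tape rotation")
    if day.day == 1:
        # Long-retention tape, reused only after `monthly_tapes` months.
        return f"MONTHLY-{(day.month - 1) % monthly_tapes + 1:02d}"
    # Consecutive calendar days always land on different daily tapes.
    return f"DAILY-{day.toordinal() % daily_tapes + 1}"
```

With six daily tapes, any given tape is overwritten only every sixth night, which leaves several older copies to fall back on if a problem is noticed late.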

B - Preventive Measures

These steps can help you prevent downtime due to power outages and hardware or software malfunctions. If you are running any type of accounting software or database, make sure you utilize battery backup devices on all of your workstations and fileservers. Power outages (even those so brief they go undetected) account for many data corruption problems that result in downtime and in consulting fees to correct them. Good quality power protection is available from about $100 per workstation.

Next, don’t forget about security measures to prevent unauthorized usage and sabotage. Basic measures such as password protection are often overlooked in small offices because of the informal environment -- but disenchanted employees, cleaning crews, or others should be discouraged, even if just with a Windows screen saver password.

Also, annual maintenance of your PC to keep it free of dust and to check for problems with cooling fans can help prevent system failures from overheating. Lastly, regular software maintenance can keep your PC operating at peak efficiency. Checking your system for viruses, hard drive errors, and fragmentation, and cleaning floppy drives, will all help maintain the health of your PC.

Also in this category is the installation of new software and updates – don’t install updates or new products unless you have to! Most problems that we encounter with Windows have to do with updates that cause problems or that interfere with other installed software by altering your environment in some way. It’s always best to test an installation on one PC before installing a product network-wide. In a Windows 95 environment, try to stick with 32-bit Windows applications where possible, most of which have uninstall options. The old saying "if it ain’t broke, don’t fix it" applies here, but eventually, as new products are introduced, you’ll be forced to upgrade other software and/or hardware on your system.

C - Duplication

Duplication is one of the easiest and most cost-effective ways to deal with problems when they arise, especially for those offices that have either a network or multiple standalone computers. You’ve already won half the battle by having more than one computer – you just need to have duplicate copies of your important software programs installed on more than one computer, and backup copies of important data that are accessible either on a workstation tape backup or copied to the local hard drive. In a network environment, the most severe problem is the loss of the fileserver. Often the tape backup drive is also attached to the fileserver, so the backed-up data is not readily accessible in the short term. Therefore, important (not all) data can be copied to local hard drives as part of your nightly or even mid-day backup routine. By keeping several workstations loaded with the software and data that you would need if your server goes down, you can be up and running in a matter of minutes should the need arise. Finally, if your hardware or software is unique in any way, purchase spare parts or make duplicate copies of software diskettes to have on hand. If you have a laptop or notebook computer, keep important programs loaded and ready to go, along with a backup tape, in the event of a natural disaster or building evacuation.
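
The idea of copying important (not all) data from the fileserver to a workstation’s local hard drive can be sketched as a small script that runs alongside the nightly backup. This is a minimal illustration, not a finished tool: the function name and the use of filename patterns to decide what counts as "important" are assumptions, and the paths are supplied by whoever runs it.

```python
import shutil
from pathlib import Path


def copy_important_files(server_dir, local_dir, patterns):
    """Copy files matching `patterns` from the server share to a local folder.

    The folder layout under `server_dir` is preserved under `local_dir`.
    Returns the list of destination paths that were written.
    """
    server_dir, local_dir = Path(server_dir), Path(local_dir)
    copied = []
    for pattern in patterns:
        for src in sorted(server_dir.rglob(pattern)):
            if not src.is_file():
                continue
            dest = local_dir / src.relative_to(server_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also keeps file timestamps
            copied.append(dest)
    return copied
```

Keeping the pattern list short matters here: the goal is a working set that fits comfortably on a workstation, while the full backup still goes to tape.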

D - Total Fault Tolerance

These measures are primarily intended for use in network environments to provide continuous or near-continuous up-time even when hardware components fail. Although fileservers rarely fail, they contain many components that can go down, and any one of them could put you down for a day or more. Power supplies, hard drives, motherboards, add-in cards such as video, and battery backups are all components that can fail. The more fault tolerant components you purchase, the better your chances of staying up! Because most components are quite reliable, it’s hard to recommend specific fault tolerant pieces while ignoring other areas, so you must look at the cost to implement each.

First, fault tolerant power supplies (basically two power supplies run in parallel) can add $300 or more to the cost of your server but keep your server running even if one fails. Mirrored or duplexed hard drives or RAID units increase the cost of your server by 50% or more (depending upon your configuration this can be $400, $800, $1200 or more) and protect you from hard drive failure. To get a truly fault tolerant server, one must purchase two duplicate servers and appropriate software to keep the two synchronized. One such product that works with NetWare is called StandBy Server. This approach more than doubles the total cost of your network, meaning an extra $5000 or more for the added reliability.
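
To compare these options side by side, the minimum figures quoted above can be added up for a given server. The sketch below uses only the article’s 1997 starting prices ($300, 50% of server cost, $5000), each of which is an "or more" figure, so the result is a lower bound. The function name and the example server price are assumptions.

```python
def added_cost(server_cost, redundant_power=False,
               mirrored_drives=False, standby_server=False):
    """Estimate the minimum extra cost (in dollars) of the chosen options."""
    total = 0.0
    if redundant_power:
        total += 300                 # two power supplies run in parallel
    if mirrored_drives:
        total += 0.5 * server_cost   # mirroring, duplexing, or RAID
    if standby_server:
        total += 5000                # duplicate server plus sync software
    return total
```

For a hypothetical $1600 server, added_cost(1600, redundant_power=True, mirrored_drives=True) comes to at least $1100, which is the kind of figure to weigh against what a day of downtime would cost you.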

One thing you must keep in mind is that the preventive and planning measures presented in this article are cumulative. Therefore, you need to decide how long you can be down and what types of disasters you will plan for, and then implement those plans and make them a regular procedure.

Just remember -- the best plans provide no benefit if they are not carried out religiously!

 

Our previous Focus On Solutions covered E-Mail Solutions. If you miss any issues in our Focus On Solutions series, be sure to check our web site at http://home.earthlink.net/~cms for these issues and other useful information.

 

Dave McCann

 



This document published March 3, 1997

 

Email questions or comments to webmaster@classicmicro.com
Copyright © 2003 Classic Micro Systems (818) 786-1979. All rights reserved.
Revised: April 09, 2008.