Businesses today rely more and more on their computers and networks in
their daily routines. Being without your computer for just a few hours can be a
major pain, and when a system is down for a few days, it can become a major obstacle to
doing business. This month, I'd like to focus on ways to avoid, mitigate, or prepare
for the inevitable downtime that we all face at one time or another.
HOW LONG CAN YOU AFFORD TO BE DOWN?
This is the most important question to ask yourself when determining a strategy for
disaster preparation. If you rely very little on computers and the data they process, you
may be able to get by for days or weeks without their services. Other businesses
can't be down for even an hour and therefore have many redundant and backup systems
in place which can be used during times of failure or disaster. I've broken it down
into three simple categories: 1 Hour, 1 Day, or 1 Week.
One Hour
If you feel that one hour is too long to be down, then you're in this critical
category, where extra measures need to be taken both to prevent failures and to have an
immediate plan of action ready should one occur. Measures that should be implemented include
battery backup devices on ALL fileservers and workstations, nightly as well as mid-day
backups to tape or disk, a backup or standby server for networks in case of a server
failure, and duplication of equipment and software installations wherever possible.
Duplicate copies of important programs and data can also be maintained on a laptop
computer in the event you are forced to leave your offices quickly. Please review items a,
b, c, and d below for more details.
One Day
This category probably fits a large percentage of businesses. Being down a full day is
definitely a problem but can be managed. Again, common problems can be avoided by
utilizing battery backups on servers and workstations, implementing daily backups, and
duplicating installation of those important software applications that you might need to
use in a pinch. Other measures that can be taken in a network environment include keeping
any "unique" parts on-hand for an emergency. Although we stock standard items,
some installations cannot be duplicated, especially as equipment ages. Review items a, b,
and c for more details.
One Week
If you find yourself in this category, your reliance on computer technology is low. In
actuality, many businesses may be between the One Day and One Week categories when push
comes to shove. Because almost any equipment can be replaced in a week's time, very little
or no redundancy of hardware is required. However, your data is irreplaceable, so you still
need to be concerned about getting good backups, daily if possible. Finally,
don't forget to keep software diskettes in a safe, secure place where they can
be located, especially if your backup routine does not include backup of program files. See
items a and b for more details.
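As a rough illustration, the three categories above can be written down as a simple lookup. This is only a sketch: the function name and the hour cutoffs (1 hour and 24 hours) are assumptions made for the example, and the item letters simply follow the lists given at the end of each category.

```python
def category_for(max_hours_down):
    """Map tolerable downtime (in hours) to a category and the items to review.

    The cutoffs below are illustrative assumptions, not hard rules.
    """
    if max_hours_down <= 1:
        return "One Hour", ["A", "B", "C", "D"]
    if max_hours_down <= 24:
        return "One Day", ["A", "B", "C"]
    return "One Week", ["A", "B"]


# Example: an office that could get by for about one working day.
name, items = category_for(8)
print(name, "- review items", ", ".join(items))
```

Many offices will fall between two categories, so treat the result as a starting point rather than a verdict.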
PLANNING FOR ALL THESE PROBLEMS
Now that you've decided which time-frame category you're in, you need to
review the measures listed below for your category so that problems can be avoided or
dealt with appropriately.
A - Backups
Backups are the single most important technique for handling disasters. While
equipment may fail or the earth may move, you can get back in business eventually, and more
easily, if you have important data and/or programs on diskette or tape. We generally
recommend backing up all files and doing it on a nightly basis, or more often if
importance requires it. It is also critically important that multiple copies be kept. If
you're using a tape backup, rotate tape cartridges and use at least a 3-tape rotation
(preferably 6 tapes or more), with one or more tapes stored off-site in the event of a
disaster involving your office space. Additional tapes rotated on a monthly basis provide
a means to recover lost or damaged files where the loss is not recognized for some time.
Don't forget to keep copies of all important software and have backup copies
available when possible, especially for those outdated programs not easily
re-purchased!
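To make the rotation idea concrete, here is a minimal sketch of how a tape could be chosen for a given night. The tape labels, the function name, and the choice to use the monthly tape on the first day of each month are all assumptions made for the example; any schedule that keeps several daily tapes plus separate monthly tapes would serve the same purpose.

```python
from datetime import date


def tape_for(day, daily_tapes=6):
    """Return the label of the tape to load for the backup on `day`.

    Labels and the first-of-month rule are illustrative assumptions.
    """
    if daily_tapes < 3:
        raise ValueError("use at least a 3-tape rotation")
    if day.day == 1:
        # First of the month: write to that month's long-term tape.
        return f"MONTHLY-{day.month:02d}"
    # Otherwise cycle through the daily tapes, one per calendar day.
    return f"DAILY-{day.toordinal() % daily_tapes + 1}"


# Example: which tape goes in the drive on March 1, 1997?
print(tape_for(date(1997, 3, 1)))
```

The sketch does not decide which tapes go off-site; a simple habit is to take the most recently written tape home each evening and bring back the oldest.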
B - Preventive Measures
These steps can help you prevent downtime due to power outages and hardware or software
malfunctions. If you are running any type of accounting software or database, make sure
you utilize battery backup devices on all of your workstations and fileservers. Power
outages (even those so brief they go undetected) account for many data corruption problems
that result in downtime and consulting fees to correct them. Good quality power
protection is available from about $100 per workstation. Next, don't forget about
security measures to prevent unauthorized usage and sabotage. Basic measures such as
password protection are often overlooked in small offices because of the informal
environment -- but disenchanted employees, cleaning crews, or others should be discouraged,
even if just with a Windows screen saver password. Also, annual maintenance of your PC to
keep it free of dust and to check for problems with cooling fans can help prevent system
failures from overheating. Lastly, regular software maintenance can keep your PC operating
at peak efficiency. Checking your system for viruses, hard drive errors, and fragmentation,
and cleaning floppy drives, will all help maintain the health of your PC. Also in this category
is the installation of new software and updates -- don't install updates or new
products unless you have to! Most problems that we encounter with Windows have to do with
updates that cause problems or that interfere with other installed software by altering
your environment in some way. It's always best to test installations on one PC first
before installing a product network-wide. In a Windows 95 environment, try to stick with
32-bit Windows applications where possible, most of which have uninstall
options. The old saying "if it ain't broke, don't fix it" applies here,
but eventually, as new products are introduced, you'll be forced to upgrade other
software and/or hardware on your system.
C - Duplication
Duplication is one of the easiest and most cost-effective ways to deal with problems when
they arise, especially for those offices that have either a network or multiple standalone
computers. You've already won half the battle by having more than one computer --
you just need to have duplicate copies of your important software programs installed on
more than one computer and backup copies of important data that are accessible either on a
workstation tape backup or on the local hard drive. In a network environment, the
most severe problem is the loss of the fileserver. Often the tape backup drive is also
attached to the fileserver, so the backed-up data is not readily accessible in the short term.
Therefore, important (not all) data can be copied to local hard drives as part of your
nightly or even mid-day backup routine. By keeping several workstations loaded with
software and data that you would need if your server goes down, you can be up and running
in a matter of minutes should the need arise. Finally, if your hardware or software is
unique in any way, purchase spare parts or make duplicate copies of software diskettes to
have on hand. If you have a laptop/notebook computer, keep important programs loaded
and ready to go, along with a backup tape, in the event of a natural disaster or building
evacuation.
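Copying the important data to a local hard drive can be scripted so it happens with every backup run. The following is a minimal sketch, assuming Python 3.8 or later; the function name and the folder names in the usage comment are hypothetical, and a real routine would also want logging and error handling.

```python
import shutil
from pathlib import Path


def copy_important(server_root, local_root, folders):
    """Copy each named folder from the server share to a local drive.

    Returns the list of folder names that were actually copied.
    """
    copied = []
    for name in folders:
        src = Path(server_root) / name
        dst = Path(local_root) / name
        if not src.is_dir():
            continue  # skip folders that are missing on the server
        # dirs_exist_ok lets a later run overwrite the previous copy.
        shutil.copytree(src, dst, dirs_exist_ok=True)
        copied.append(name)
    return copied


# Hypothetical usage (drive letters and folder names are examples only):
# copy_important("F:/", "C:/standby", ["accounting", "customers"])
```

Only the folders you name are copied, which matches the advice to duplicate important data rather than everything on the server.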
D - Total Fault Tolerance
These measures are primarily intended for use in network environments to provide
continuous or near-continuous up-time even when hardware components fail. Although
fileservers rarely fail, there are many components which can go down, any one of which
could put you down for a day or more. Power supplies, hard drives, motherboards, add-in
cards such as video, and battery backups are all components that can fail. The more fault
tolerant components you purchase, the better your chances of staying up! Because most
components are quite reliable, it's hard to recommend specific fault tolerant pieces
while ignoring other areas, so you must look at the cost to implement each.
First, fault tolerant power supplies (basically two power supplies run in parallel) can
add $300 or more to the cost of your server but keep your server running even if one
fails. Mirrored or duplexed hard drives or RAID units increase the cost of your server by
50% or more (depending upon your configuration this can be $400, $800, $1200 or more) and
protect you from hard drive failure. To get a truly fault tolerant server, one must
purchase two duplicate servers and appropriate software to keep the two synchronized. One
such product that works with NetWare is called StandBy Server. Solutions like this more than
double the total cost of your network, meaning an extra $5000 or more for the added
reliability.
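One way to look at the cost of each option is to add up the figures and compare them with what an outage would cost you. The sketch below uses only the lowest figures quoted above; the function names, the dictionary keys, and the hourly downtime cost in the example are assumptions for illustration, and your own quotes will differ.

```python
# Lowest added-cost figures quoted in this article, in dollars.
MIN_ADDED_COST = {
    "redundant_power_supply": 300,
    "mirrored_drives": 400,
    "standby_server": 5000,
}


def minimum_added_cost(options):
    """Sum the lowest quoted figure for each chosen option."""
    return sum(MIN_ADDED_COST[name] for name in options)


def pays_for_itself(added_cost, cost_per_hour_down, hours_avoided):
    """True if the downtime avoided is worth at least the added cost."""
    return cost_per_hour_down * hours_avoided >= added_cost


# Example: redundant power plus mirrored drives, for an office that
# (hypothetically) loses $100 for every hour the server is down.
cost = minimum_added_cost(["redundant_power_supply", "mirrored_drives"])
print(cost, pays_for_itself(cost, 100, 8))
```

The comparison is deliberately crude, since it ignores how likely each failure is, but it gives a first sense of which pieces are worth pricing.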
One thing you must keep in mind is that the preventive and planning measures
presented in this article are cumulative. Therefore, you need to decide how long you can
be down, what type of disasters you will plan for, and then implement those plans and make
them a regular procedure.
Just remember -- the best plans provide no benefit if they are not carried out
religiously!
Our previous Focus On Solutions covered E-Mail Solutions. If you miss any issues in our
Focus On Solutions series, be sure to check our web-site at http://home.earthlink.net/~cms
for these issues and other useful information.
Dave McCann

This document published March 3, 1997