"Planners Prepare, Survivors React"SM
Disaster preparedness does not prevent disasters. The shift in philosophy over the past few years has been away from prevention and toward planned response. If employees are trained to react to a disaster, they may be able to mitigate its effects (or at least prevent further disasters from happening).
Even diversified systems are not immune from failure or overload.
TCP/IP is a rather handy tool in this respect: planning for networked systems anticipates the survival of the network, even in the case of the wholesale destruction of entire subnetworks. Failures of entire regions can be anticipated and addressed (e.g., Chicago goes up in flames, but New York and LA handle the distributed load usually handled by Chicago). However, the systems in New York and LA then each become exposed to the failure of the other. A subsequent earthquake in LA leaves New York as the only center left to carry the anticipated load for New York and for the failed Chicago and LA centers.
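The cascading scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the load and capacity figures are hypothetical, the `redistribute` helper is invented for this example, and the load of a failed site is assumed to be split evenly among the survivors, which a real network would not necessarily do.

```python
def redistribute(loads, capacity, failed):
    """Spread the normal load of failed sites evenly across surviving sites.

    Returns {site: (new_load, within_capacity)} for each surviving site.
    """
    survivors = [site for site in loads if site not in failed]
    if not survivors:
        raise RuntimeError("no surviving sites")
    orphaned = sum(loads[site] for site in failed)  # load that lost its home
    share = orphaned / len(survivors)               # even split (assumption)
    result = {}
    for site in survivors:
        new_load = loads[site] + share
        result[site] = (new_load, new_load <= capacity[site])
    return result


# Hypothetical figures: each center normally carries 100 units
# and is provisioned for 160.
loads = {"Chicago": 100, "New York": 100, "LA": 100}
capacity = {"Chicago": 160, "New York": 160, "LA": 160}

# Chicago fails: New York and LA absorb its load and stay within capacity.
after_chicago = redistribute(loads, capacity, {"Chicago"})

# LA then fails as well: New York must carry all three loads and is overloaded.
after_both = redistribute(loads, capacity, {"Chicago", "LA"})

print(after_chicago)
print(after_both)
```

With these assumed numbers, a plan that comfortably survives the first failure leaves the last center at nearly twice its provisioned capacity after the second.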
A good disaster preparedness plan may not be able to anticipate or handle multiple systems failures.
For example, in 1999, a tornado struck Salt Lake City at about the same time that downtown Chicago experienced a carrier network failure and an unrelated city power failure. Taken together, the unusual tornado in Salt Lake City, the MCI network failure, and the failure of several generators serving downtown Chicago reveal the complexity of disaster preparedness planning.
Let's examine the plight of a hypothetical business in Salt Lake City with a backup in downtown Chicago. Suppose that the network equipment was at a facility in Salt Lake City and that the equipment was damaged by the tornado. If (under the disaster preparedness escalation procedure) the system had been switched over to downtown Chicago, the risk of loss would have been worsened by the MCI failure and the subsequent power failures. The disaster preparedness team would have had very few options left in its escalation procedure. The laws of probability often collide headlong with Murphy's law. Even dual systems are subject to failure.
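A rough calculation shows why a dual failure that looks negligible on any given day still deserves a place in the plan. The sketch below is illustrative only: the 1% daily failure probabilities are hypothetical, the two sites are assumed to fail independently, and the function names are invented for this example.

```python
def p_all_fail(p_each):
    """Probability that every site fails on the same day, assuming independence."""
    prob = 1.0
    for p in p_each:
        prob *= p
    return prob


def p_at_least_once(p_per_day, days):
    """Probability that the event occurs on at least one of `days` independent days."""
    return 1.0 - (1.0 - p_per_day) ** days


# Hypothetical figures: primary and backup each fail on 1% of days.
p_primary = 0.01
p_backup = 0.01

p_dual_per_day = p_all_fail([p_primary, p_backup])       # about 1 in 10,000
p_dual_in_decade = p_at_least_once(p_dual_per_day, 3650)  # over ten years

print(f"dual failure on a given day: {p_dual_per_day:.6f}")
print(f"at least one dual failure in ten years: {p_dual_in_decade:.2f}")
```

Under these assumptions, a one-in-ten-thousand daily event has roughly a 30% chance of occurring at least once in a decade. If the two sites share a carrier or a power grid, the independence assumption fails and the real figure is higher.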
Both AT&T and MCI have experienced major frame relay outages. In both cases, the carrier shifted blame to the underlying manufacturers (Cisco and Lucent). If the major carriers are not immune to failure, then smaller carriers are at equal or greater risk. No single system is immune from failure.
In Chicago, ComEd had already lost one generator. Two more failed, and the power was turned off in the Loop to save the fourth. Some of the telecommunications buildings in the South Loop were serviced by a secondary power grid. Temperatures were moderate on the day of the dual outages. Results could have been far worse if it had been an exceptionally hot day.
Success can also cause failures:
Planning for success may also mean planning for surges in demand that may overwhelm a business. Victoria's Secret, Encyclopedia Britannica, and countless others were, on at least one occasion, not able to handle the rush of traffic to their services. Network traffic and its behavior are hard to visualize, but an event in Chicago provided a physical image. One year, the Cubs had to play an extra game following the loss of an opponent with a similar record. Within minutes of that loss, people were running through the streets of Chicago to line up at Wrigley Field, a neighborhood Tower Records, Osco, or anyplace else that sold Cubs tickets. The demand for a new product surged, in an instant, from nothing to thousands of people running in the streets in the vain hope of getting one of the limited number of tickets.