30 November 2004

When Systems Fail It's No Accident

Once in a while some beautiful technological achievement fails catastrophically. How does that happen? Usually it is due to a sequence of independent errors, accidents, and misjudgments. When these faults line up, disaster happens.

The explosion aboard the Apollo 13 Service Module that almost cost the lives of three U.S. astronauts in 1970 has been studied extensively, and its many causes are known.

The problem was with oxygen tank 2. The following is mostly verbatim from NASA's web page on the accident and the Apollo 13 Review Board Report. (Skip to the bottom if you're in a hurry.)

1. The oxygen tanks had originally been designed to run off the 28 volt DC power of the command and service modules. However, the tanks were redesigned to also run off the 65 volt DC ground power at Kennedy Space Center. All components were upgraded to accept 65 volts except the heater thermostatic switches, which were overlooked. These switches were supposed to open and turn off the heater when the tank temperature reached 80 degrees F. (Normal temperatures in the tank were -300 to -100 F.)

2. The thermostatic switch discrepancy was not detected by NASA, NR, or Beech in their review of documentation, nor did tests identify the incompatibility of the switches with the ground support equipment at KSC, since neither qualification nor acceptance testing required switch cycling under load as should have been done.

3. The no. 2 oxygen tank used in Apollo 13 had originally been installed in Apollo 10. It was removed from Apollo 10 for modification and during the extraction was dropped 2 inches, slightly jarring an internal fill line. In itself, the displaced fill tube assembly was not particularly serious, but it led to the use of improvised detanking procedures at KSC which almost certainly set the stage for the accident.

4. During pre-flight testing, tank no. 2 would not empty correctly, possibly due to the damaged fill line. (On the ground, the tanks were emptied by forcing oxygen gas into the tank and forcing the liquid oxygen out; in space there was no need to empty the tanks.) The heaters in the tanks were normally to be used only for very short periods to heat the interior slightly, increasing the pressure to keep the oxygen flowing. When the tank would not empty normally, it was decided to use the heater to "boil off" the excess oxygen, requiring 8 hours of 65 volt DC power.

It is believed that the thermostat switches, being designed for a lower voltage, arced as they tried to open when the temperature rose, and welded shut. This allowed the temperature within the tank to rise to over 1000 degrees F in spots.

5. The gauges measuring the temperature inside the tank were designed to measure only to 80 F, so the extreme heating was not noticed. The high temperature emptied the tank, but also resulted in serious damage to the Teflon insulation on the electrical wires to the power fans within the tank.

6. The Teflon insulation was flammable in pure oxygen, given an ignition source.

56 hours into the mission the power fans were turned on within the tank for the third "cryo-stir" of the mission, a procedure to stir the oxygen slush inside the tank so it wouldn't stratify. The exposed fan wires shorted and the Teflon insulation caught fire. This fire rapidly heated and increased the pressure of the oxygen inside the tank, and may have spread along the wires to the electrical conduit in the side of the tank, which weakened and ruptured under the pressure, causing the no. 2 oxygen tank to explode. This damaged the no. 1 tank and parts of the interior of the service module and blew off the bay no. 4 cover.

Astronaut Swigert said, "Okay, Houston, we've had a problem here."

So there you have it:
  • Overlooked upgrading thermostat switches during design modification
  • Poorly designed test procedures failed to reveal the switch problem
  • Tank was dropped and jarred, displacing the fill tube
  • Used untested emptying procedure; thermostat switches failed; tank overheated; insulation was damaged. The tank was now a bomb.
  • Tank temperature gauges only read to 80 F, so overheating was not obvious. Nobody noticed that although the tank temperature had reached the top of the scale, the switches had not opened, as shown by current readings on the control panel.
  • Teflon insulation was flammable in its pure oxygen environment.
If any one of those six things hadn't happened, the accident would not have occurred.

But after that sequence of errors it was only a matter of time before the fan wires shorted, the fire started and the tank exploded. (It was only on the third stir that they shorted, perhaps jostled by the stirring itself. If it had happened earlier the crew would probably have been lost.)
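The "all six had to line up" argument above can be sketched in a few lines of Python. This is only an illustration of the logic, not a model of the spacecraft: the fault labels are my own shorthand for the six bullets, and the function names are made up for the example. The accident is treated as happening only when every fault is present, so removing any single one prevents it.

```python
# Shorthand labels (hypothetical, paraphrasing the six bullets above).
FAULTS = [
    "switches not upgraded for 65 V",
    "tests did not reveal the switch problem",
    "tank jarred, fill tube displaced",
    "untested emptying procedure overheated the tank",
    "temperature gauge only read to 80 F",
    "Teflon insulation flammable in pure oxygen",
]


def accident_occurs(present):
    """Return True only if every contributing fault is present."""
    return all(fault in present for fault in FAULTS)


def outcomes_with_one_fault_removed():
    """For each fault, report the outcome when only that fault is removed."""
    everything = set(FAULTS)
    return {fault: accident_occurs(everything - {fault}) for fault in FAULTS}


if __name__ == "__main__":
    print("all faults present:", accident_occurs(set(FAULTS)))
    for fault, result in outcomes_with_one_fault_removed().items():
        print(f"without '{fault}': accident = {result}")
```

Treating the faults as a simple AND is the assumption here; it mirrors the claim in the text that each of the six was necessary.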

Most big engineering failures (bridge and building collapses, plane crashes, ship sinkings) follow this pattern. A number of little failures add up to one big disaster. More examples at this site. Wikipedia has some further links.
