4.1 Fault Tree

The NASA Accident Investigation Team investigated the accident using "fault trees," a common organizational tool in systems engineering. Fault trees are graphical representations of every conceivable sequence of events that could cause a system to fail. The fault tree's uppermost level illustrates the events that could have directly caused the loss of Columbia by aerodynamic breakup during re-entry. Each subsequent level comprises all individual elements or factors that could cause the failure described immediately above it. In this way, all potential chains of causation that lead ultimately to the loss of Columbia can be diagrammed, and the behavior of every subsystem that was not a precipitating cause can be eliminated from consideration. Figure 4.1-1 depicts the fault tree structure for the Columbia accident investigation. NASA chartered six teams to develop fault trees, one for each of the Shuttle's major components: the Orbiter, Space Shuttle Main Engine, Reusable Solid Rocket Motor, Solid Rocket Booster, External Tank, and Payload. A seventh "systems integration" fault tree team analyzed failure scenarios involving two or more Shuttle components. These interdisciplinary teams included NASA and contractor personnel, as well as outside experts.

Some of the fault trees are very large and intricate. For instance, the Orbiter fault tree, which only considers events on the Orbiter that could have led to the accident, includes 234 elements. In contrast, the Systems Integration fault tree, which deals with interactions among parts of the Shuttle, includes 295 unique multi-element integration faults, 128 Orbiter multi-element faults, and 221 connections to the other Shuttle components. These faults fall into three categories: induced and natural environments (such as structural interface loads and electromechanical effects); integrated vehicle mass properties; and external impacts (such as debris from the External Tank). Because the Systems Integration team considered multi-element faults (that is, scenarios involving several Shuttle components), it frequently worked in tandem with the Component teams.

Figure 4.1-1. Accident investigation fault tree structure.

In the case of the Columbia accident, there could be two plausible explanations for the aerodynamic breakup of the Orbiter: (1) the Orbiter sustained structural damage that undermined attitude control during re-entry; or (2) the Orbiter maneuvered to an attitude in which it was not designed to fly. The former explanation deals with structural damage initiated before launch, during ascent, on orbit, or during re-entry. The latter considers aerodynamic breakup caused by improper attitude or trajectory control by the Orbiter's Flight Control System. Telemetry and other data strongly suggest that improper maneuvering was not a factor. Therefore, most of the fault tree analysis concentrated on structural damage that could have impeded the Orbiter's attitude control, in spite of properly operating guidance, navigation, and flight control systems.

When investigators ruled out a potential cascade of events, as represented by a branch on the fault tree, it was deemed "closed." When evidence proved inconclusive, the item remained "open." Some elements could be dismissed at a high level in the tree, but most required delving into lower levels. An intact Shuttle component or system (for example, a piece of Orbiter debris) often provided the basis for closing an element. Telemetry data can be equally persuasive: it frequently demonstrated that a system operated correctly until the loss of signal, providing strong evidence that the system in question did not contribute to the accident. The same holds true for data obtained from the Modular Auxiliary Data System recorder, which was recovered intact after the accident.
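The open/closed bookkeeping described above can be pictured as a small tree structure. The following Python sketch is illustrative only and is not the tooling used by the investigation teams. The class name `FaultElement`, the rule that an element counts as closed once all of its contributing causes are closed, the placeholder element names, and the closure statuses shown are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class FaultElement:
    """One node in a fault tree: an event that could cause its parent event."""
    name: str
    closed: bool = False  # True once this element has been ruled out directly
    children: List["FaultElement"] = field(default_factory=list)

    def is_closed(self) -> bool:
        """Closed directly, or (assumed rule) closed because every child is closed."""
        if self.closed:
            return True
        return bool(self.children) and all(c.is_closed() for c in self.children)

    def open_paths(self, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
        """Yield each chain from the top event down to a still-open lowest-level element."""
        path = prefix + (self.name,)
        if self.is_closed():
            return  # the whole branch is ruled out
        if not self.children:
            yield path  # an open element with no further breakdown
            return
        for child in self.children:
            yield from child.open_paths(path)


# Hypothetical tree: names and statuses are invented for illustration.
root = FaultElement("Top event", children=[
    FaultElement("Branch dismissed at a high level", closed=True),
    FaultElement("Branch requiring lower-level review", children=[
        FaultElement("Hypothetical cause A", closed=True),
        FaultElement("Hypothetical cause B"),  # evidence inconclusive: stays open
    ]),
])

for chain in root.open_paths():
    print(" -> ".join(chain))
```

In this sketch, the first branch is dismissed without inspecting anything beneath it, while the second can only be closed by working through its lower-level elements, mirroring the two situations described in the text.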

The closeout of particular chains of causation was examined at various stages, culminating in reviews by the NASA Orbiter Vehicle Engineering Working Group and the NASA Accident Investigation Team. After these groups agreed to close an element, their findings were forwarded to the Board for review. At the time of this report's publication, the Board had closed more than one thousand items. A summary of fault tree elements is listed in Figure 4.1-2.

Figure 4.1-2. Summary of fault tree elements reviewed by the Board.

The open elements are grouped by their potential for contributing either directly or indirectly to the accident. The first group contains elements that may have in any way contributed to the accident. Here, "contributed" means that the element may have been an initiating event or a likely cause of the accident. The second group contains elements that could not be closed and may or may not have contributed to the accident. These elements are possible causes or factors in this accident. The third group contains elements that could not be closed, but are unlikely to have contributed to the accident. Appendix D.3 lists all the elements that were closed and thus eliminated from consideration as a cause or factor of this accident.

Some of the element closure efforts will continue after this report is published. Some elements will never be closed, because there are insufficient data and analysis to conclude unconditionally that they did not contribute to the accident. For instance, heavy rain fell on Kennedy Space Center prior to the launch of STS-107. Could this abnormally heavy rainfall have compromised the External Tank bipod foam? Experiments showed that the foam did not tend to absorb rain, but the rain could not be ruled out entirely as having contributed to the accident. Fault tree elements that were not closed as of publication are listed in Appendix D.4.