Conventional systems engineering methodologies do not provide guidance on how to reduce the risks of unexpected events. The term "unexpected event" has recently been coined to stress the need to design systems that can handle unexpected situations. This Wikipedia entry summarizes the contributions of several scientists and methodologists who specialize in human error (notably Hollnagel, Casey, Reason, Dekker and Leveson). Their view is that unexpected events should not be attributed to the person who happened to be there at the time of the mishap, and that they should not be regarded as force majeure. A better approach, which may help mitigate the risks of unexpected events, is to consider the limited human capability to coordinate with the system during the interaction.
In practice, systems cannot always meet all the expectations of their stakeholders:
- The number of all possible system states is too large to handle at design time
- The expectations are based on partial knowledge of the design
- The way the system is being operated changes with time, introducing new exceptional situations.
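The first point can be illustrated with simple arithmetic. The sketch below assumes a hypothetical system made of independent components, each with a small number of discrete states; the function name and the example figures are illustrative and do not come from the cited sources. Under that assumption the number of combined states is the product of the per-component counts, so it grows exponentially with the number of components.

```python
from math import prod


def count_system_states(states_per_component):
    """Return the number of combined states of independent components.

    Each entry is the number of discrete states of one component; the
    combined count is their product (an assumption of independence).
    """
    return prod(states_per_component)


# Hypothetical system: 30 independent two-state (on/off) controls.
switches = [2] * 30
print(count_system_states(switches))  # 1073741824, i.e. 2**30
```

Even this small hypothetical system has over a billion combined states, which suggests why exhaustive review of every state at design time is rarely practical.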
Well-known examples of accidents due to unexpected events include:
Characteristics of unexpected events
The accident investigator Sidney Dekker observed that unexpected events, such as user slips and mode errors, are commonly considered as human errors. If the results are dramatic, they are typically attributed to the user's carelessness. If there is nobody to accuse, they may be regarded as force majeure. Dekker called this "the Old View" of accident investigation.
Recent studies demonstrate that systems engineers can reduce operational risks due to unexpected events by adopting "the New View": considering human limitations, and taking steps to ensure that the system and its users work in a coordinated manner.
Main sources of abnormal system behavior include:
- A fault of a system component
- An operational error (either a slip or a mistake)
- A user's wrong perception of the system state
- An unexpected operational context.
The Old View
Recent accident studies demonstrate that people are tempted to regard the persons who issued the trigger, or who witnessed it, as "bad apples":
- When faced with a human error problem, you may be tempted to ask 'Why didn't they watch out better? How could they not have noticed?'. You think you can solve your human error problem by telling people to be more careful, by reprimanding the miscreants, by issuing a new rule or procedure. These are all expressions of 'The Bad Apple Theory', where you believe your system is basically safe if it were not for those few unreliable people in it. This old view of human error is increasingly outdated and will lead you nowhere.
Mitigating the risks of unexpected events
A new definition of human errors implies that they are the result of the mishap, not its source. The "New View" of mishaps is that they may be due to an organizational failure to prevent them (see James Reason, Organizational models of accidents, 1997). Prevention can be achieved by promoting a safety culture and by employing safety engineering.
Safety culture: James Reason argued that it is the duty of the organization to define the line between acceptable and unacceptable behavior. Sidney Dekker proposed that organizations may do more for safety by promoting a "Just Culture":
- Responses to incidents and accidents that are seen as unjust can impede safety investigations, promote fear rather than mindfulness in people who do safety-critical work, make organizations more bureaucratic rather than more careful, and cultivate professional secrecy, evasion, and self-protection. A just culture is critical for the creation of a safety culture. Without reporting of failures and problems, without openness and information sharing, a safety culture cannot flourish.
Safety engineering: Harel and Weiss proposed that the system may be designed so that mishaps are avoided. This is a multi-disciplinary task: common engineering methodologies and practices do not mitigate the risks of unexpected events, such as use errors or mode errors, which are typically attributed to 'force majeure'. Each of the four sources of abnormal system behavior is addressed by a different discipline:
- Component failure is the concern of traditional safety engineering.
- Slips and mistakes are typically handled by human error analysis.
- Wrong user orientation is the concern of user-centered design.
- Unexpected operational context may be managed using the STAMP (Systems-Theoretic Accident Model and Processes) approach to safety.
A key feature of the STAMP framework is the association of constraints with normal system behavior. According to this model, accidents are due to the improper setting of constraints, or to insufficient means to enforce them on the system. The methods presented in this study are about setting and enforcing such constraints.
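The idea of setting constraints and enforcing them on the system can be sketched in code. The example below is a minimal illustration only: it is not the STAMP method itself, nor an implementation by Leveson or by Harel and Weiss. The names `Constraint` and `GuardedSystem`, and the heater-and-tank scenario, are hypothetical. A constraint is "set" by declaring a predicate over the system state, and "enforced" by rejecting any requested change that would lead to a state violating it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass(frozen=True)
class Constraint:
    """A named condition that every acceptable system state must satisfy."""
    name: str
    holds: Callable[[State], bool]  # returns True when the state is acceptable


class GuardedSystem:
    """Hypothetical system that enforces its constraints on every change."""

    def __init__(self, state: State, constraints: List[Constraint]):
        self.state = dict(state)
        self.constraints = list(constraints)

    def violated(self, state: State) -> List[str]:
        """Return the names of the constraints that the given state breaks."""
        return [c.name for c in self.constraints if not c.holds(state)]

    def apply(self, change: State) -> List[str]:
        """Apply a change only if the resulting state breaks no constraint.

        Returns the names of the violated constraints; an empty list means
        the change was accepted.
        """
        candidate = {**self.state, **change}
        broken = self.violated(candidate)
        if not broken:
            self.state = candidate
        return broken


# Setting a constraint (illustrative scenario, not from the source).
heater_rule = Constraint(
    "heater must be off when the tank is empty",
    lambda s: not (s["heater_on"] and s["tank_level"] == 0),
)

system = GuardedSystem({"heater_on": False, "tank_level": 0}, [heater_rule])
print(system.apply({"heater_on": True}))  # rejected: tank is empty
print(system.apply({"tank_level": 5}))    # accepted
print(system.apply({"heater_on": True}))  # accepted now that the tank is filled
```

In this sketch a mishap would correspond either to a missing or wrongly stated `Constraint` (improper setting) or to a path that changes the state without going through `apply` (insufficient enforcement).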
The paradigm of Extended System Engineering is that system engineers can mitigate such operational risks by considering human limitations when assuring that the system and its operators are coordinated.
References
- Mitigating the Risks of Unexpected Events by Systems Engineering.
- Hollnagel: Why "Human Error" is a Meaningless Concept.
- Casey: Set Phasers on Stun.
- Reason: Managing the Risks of Organizational Accidents.
- Dekker: The Field Guide to Understanding Human Error.
- Nancy Leveson home page.
- Thomas K. Landauer (1996). The Trouble with Computers: Usefulness, Usability, and Productivity.
- A. Harel, R. Kennett and F. Ruggeri (2008). Modeling Web Usability Diagnostics on the basis of Usage Statistics. In: Statistical Methods in eCommerce Research, W. Jank and G. Shmueli (Eds.), Wiley.
- Sheridan & Nadler (2006). Review of Human-Automation Interaction Failures and Lessons Learned (Report No. DOT-VNTSC-NASA-06-01).
- Steven Casey (1993). Set Phasers on Stun, And Other True Tales of Design, Technology and Human Error. Aegean Publishing.
- Donald A. Norman (2007). The Design of Future Things.
- Sidney Dekker (2006). The Field Guide to Understanding Human Error.
- Avi Harel (2008). Standards for Defending Systems against Interaction Faults. INCOSE International Symposium, Utrecht, The Netherlands.
- Hollnagel, E. (1991). The phenotype of erroneous actions: Implications for HCI design. In G. W. R. Weir and J. L. Alty (Eds.), Human-computer interaction and complex systems. Academic Press.
- James Reason (1998). Achieving a safe culture: theory and practice.
- Sidney Dekker (2007). Just culture: balancing safety and accountability.
- Donald A. Norman (1980). Why people make mistakes. Reader's Digest, 117, 103-106.
- Donald A. Norman (1990). The "problem" with automation: Inappropriate feedback and interaction, not "over automation". In D. E. Broadbent, J. Reason, and A. Baddeley (Eds.), Human Factors in Hazardous Situations. Clarendon Press, New York, NY, 137-145.
- Leveson, N. G. (2004). A New Accident Model for Engineering Safer Systems. Safety Science, Vol. 42, No. 4, pp. 237-270.
- A. Zonnenshain, A. Harel (2008). Extended System Engineering - ESE: Integrating Usability Engineering in System Engineering. The 17th International Conference of the Israel Society for Quality, Jerusalem, Israel.
- A. Zonnenshain, A. Harel (2009). Task-oriented System Engineering. INCOSE International Symposium, Singapore.