Technique for human error-rate prediction

The technique for human error-rate prediction (THERP) is a technique used in the field of human reliability assessment (HRA), for the purposes of evaluating the probability of a human error occurring throughout the completion of a specific task. From such analyses measures can then be taken to reduce the likelihood of errors occurring within a system and therefore lead to an improvement in the overall levels of safety. There exist three primary reasons for conducting an HRA; error identification, error quantification and error reduction. As there exist a number of techniques used for such purposes, they can be split into one of two classifications; first generation techniques and second generation techniques. First generation techniques work on the basis of the simple dichotomy of ‘fits/doesn’t fit’ in the matching of the error situation in context with related error identification and quantification and second generation techniques are more theory based in their assessment and quantification of errors. ‘HRA techniques have been utilised in a range of industries including healthcare, engineering, nuclear, transportation and business sector; each technique has varying uses within different disciplines.

THERP models human error probabilities (HEPs) using a fault-tree approach, in a similar way to an engineering risk assessment, but also accounts for performance shaping factors (PSFs) that may influence these probabilities. The probabilities for the human reliability analysis event tree (HRAET), which is the primary tool for assessment, are nominally calculated from the database developed by the authors Swain and Guttman; local data e.g. from simulators or accident reports may however be used instead. The resultant tree portrays a step by step account of the stages involved in a task, in a logical order. The technique is known as a total methodology [1] as it simultaneously manages a number of different activities including task analysis, error identification, representation in form of HRAET and HEP quantification.

Background

The technique for human error rate prediction (THERP) is a first generation methodology, which means that its procedures follow the way conventional reliability analysis models a machine. [7] The technique was developed in the Sandia Laboratories for the US Nuclear Regulatory Commission [2]. Its primary author is Swain, who developed the THERP methodology gradually over a lengthy period of time. [1]. THERP relies on a large human reliability database that contains HEPs, and is based upon both plant data and expert judgments. The technique was the first approach in HRA to come into broad use and is still widely used in a range of applications even beyond its original nuclear setting.

THERP methodology

The methodology for the THERP technique is broken down into 5 main stages:

1. Define the system failures of interest These failures include functions of the system where human error has a greater likelihood of influencing the probability of a fault, and those of interest to the risk assessor; operations in which there may be no interest include those not operationally critical or those for which there already exist safety counter measures.

2. List and analyse the related human operations, and identify human errors that can occur and relevant human error recovery modes This stage of the process necessitates a comprehensive task and human error analysis. The task analysis lists and sequences the discrete elements and information required by task operators. For each step of the task, possible errors are considered by the analyst and precisely defined. The possible errors are then considered by the analyst, for each task step. Such errors can be broken down into the following categories:

Errors of omission – leaving out a step of the task or the whole task itself
Error of commission – this involves several different types of error:
- Errors of selection – error in use of controls or in issuing of commands
- Errors of sequence – required action is carried out in the wrong order
- Errors of timing – task is executed before or after when required
- Errors of quantity – inadequate amount or in excess

The opportunity for error recovery must also be considered as this, if achieved, has the potential to drastically reduce error probability for a task.

The tasks and associated outcomes are input to an HRAET in order to provide a graphical representation of a task’s procedure. The trees’ compatibility with conventional event-tree methodology i.e. including binary decision points at the end of each node, allows it to be evaluated mathematically. An event tree visually displays all events that occur within a system. It starts off with an initiating event, then branches develop as various consequences of the starting event. These are represented in a number of different paths, each associated with a probability of occurrence. As mentioned previously, the tree works on a binary logic, so each event either succeeds or fails. With the addition of the probabilities for the individual events along each path, i.e., branches, the likelihood of the various outcomes can be found. Below is an example of an event tree that represents a system fire:

Therefore, under the condition that all of a task’s sub-tasks are fully represented within a HRAET, and the failure probability for each sub-task is known, this makes it possible to calculate the final reliability for the task.

3. Estimate the relevant error probabilities HEPs for each sub-task are entered into the tree; it is necessary for all failure branches to have a probability otherwise the system will fail to provide a final answer. HRAETs provide the function of breaking down the primary operator tasks into finer steps, which are represented in the form of successes and failures. This tree indicates the order in which the events occur and also considers likely failures that may occur at each of the represented branches. The degree to which each high level task is broken down into lower level tasks is dependent on the availability of HEPs for the successive individual branches. The HEPs may be derived from a range of sources such as: the THERP database; simulation data; historical accident data; expert judgement. PSFs should be incorporated into these HEP calculations; the primary source of guidance for this is the THERP handbook. However the analyst must use their own discretion when deciding the extent to which each of the factors applies to the task

4. Estimate the effects of human error on the system failure events With the completion of the HRA the human contribution to failure can then be assessed in comparison with the results of the overall reliability analysis. This can be completed by inserting the HEPs into the full system’s fault event tree, which allows human factors to be considered within the context of the full system.

5. Recommend changes to the system and recalculate the system failure probabilities Once the human factor contribution is known, sensitivity analysis can be used to identify how certain risks may be improved in the reduction of HEPs. Error recovery paths may be incorporated into the event tree as this will aid the assessor when considering the possible approaches by which the identified errors can be reduced.

Worked example

Context

The following example illustrates how the THERP methodology can be used in practise in the calculation of human error probabilities (HEPs). It is used to determine the HEP for establishing air based ventilation using emergency purge ventilation equipment on in-tank precipitation (ITP) processing tanks 48 and 49 after failure of the nitrogen purge system following a seismic event

Assumptions

In order for the final HEP calculation to be valid, the following assumptions require to be fulfilled:

There exists a seismic event initiator that leads to the establishment of air based ventilation on the ITP processing tanks 48 and 49
It is assumed that both on and offsite power is unavailable within the context and therefore control actions performed by the operator are done so locally, on the tank top
The time available for operations personnel to establish air based ventilation by use of the emergency purge ventilation, following the occurrence of the seismic event, is a duration of 3 days
There is a necessity for an ITP equipment status monitoring procedure to be developed to allow for a consistent method to be adopted for the purposes of evaluating the ITP equipment and component status and selected process parameters for the period of an accident condition
Assumed response times exist for initial diagnosis of the event and for the placement of emergency purge ventilation equipment on the tank top. The former is 10 hours while the latter is 4 hours.
The in-tank precipitation process has associated operational safety requirements (OSR) that identify the precise conditions under which the emergency purge ventilation equipment should be hooked up to the riser
The “tank 48 system” standard operating procedure has certain conditions and actions that must be included within for correct completion to be performed (see file for more details)
A vital component of the emergency purge ventilation equipment unit is a flow indicator; this is required in the event of the emergency purge ventilation equipment being hooked up incorrectly as it would allow for a recovery action
The personnel available to perform the necessary tasks all possess the required skills
Throughout the installation of the emergency purge ventilation equipment, carried out by maintenance personnel, a tank operator must be present to monitor this process.

Method

An initial task analysis was carried out on the off normal procedure and standard operating procedure. This allowed for the operator to align and then initiate the emergency purge ventilation equipment given the loss of the ventilation system. Thereafter, each individual task was analysed from which it was then possible to assign error probabilities and error factors to events that represented operator responses.

A number of the HEPs were adjusted to take account of various identified performance-shaping factors (PSFs)
Upon assessment of characteristics of the task and behaviour of the crew, recovery probabilities were deciphered. Such probabilities are influenced by such factors as task familiarity, alarms and independent checking
Once error probabilities were decided upon for the individual tasks, event trees were then constructed from which calculation formulations were derived. The probability of failure was obtained through the multiplication of each of the failure probabilities along the path under consideration.

HRA event tree for align and start emergency purge ventilation equipment on in-tank precipitation tank 48 or 49 after a seismic event

The summation of each of the failure path probabilities provided the total failure path probability (FT)

Results

Task A: Diagnosis, HEP 6.0E-4 EF=30
Task B: Visual inspection performed shiftly, recovery factor HEP=0.001 EF=3
Task C: Initiate standard operating procedure HEP= .003 EF=3
Task D: Maintainer hook-up emergency purge ventilation equipment HEP=.003 EF=3
Task E: Maintainer 2 hook-up emergency purge, recovery factor CHEP=0.5 EF=2
Task G: Tank operator instructing /verifying hook-up, recovery factor CHEP=0.5 Lower bound = .015 Upper bound = 0.15
Task H: Read flow indicator, recovery factor CHEP= .15 Lower bound= .04 Upper bound = .5
Task I: Diagnosis HEP= 1.0E-5 EF=30
Task J: Analyse LFL using portable LFL analyser, recovery factor CHEP= 0.5 Lower bound = .015 Upper bound =.15

From the various figures and workings, it can be determined that the HEP for establishing air based ventilation using the emergency purge ventilation equipment on In-tank Precipitation processing tanks 48 and 49 after a failure of the nitrogen purge system following a seismic event is 4.2 E-6. This numerical value is judged to be a median value on the lognormal scale. However, it should be noted that this result is only valid given that all the previously stated assumptions are implemented.

Advantages of THERP

It is possible to use THERP at all stages of design. Furthermore THERP is not restricted to the assessment of designs already in place and due to the level of detail in the analysis it can be specifically tailored to the requirements of a particular assessment. [3]
THERP is compatible with Probabilistic Risk Assessments (PRA); the methodology of the technique means that it can be readily integrated with fault tree reliability methodologies. [3]
The THERP process is transparent, structured and provides a logical review of the human factors considered in a risk assessment; this allows the results to be examined in a straightforward manner and assumptions to be challenged. [3]
The technique can be utilised within a wide range of differing human reliability domains and has a high degree of face validity. [3]
It is a unique methodology in the way that it highlights error recovery and it also quantitatively models a dependency relation between the various actions or errors.

Disadvantages of THERP

THERP analysis is very resource intensive, and may require a large amount of effort to produce reliable HEP values. This can be controlled by ensuring an accurate assessment of the level of work required in the analysis of each stage. [3]
The technique does not lend itself to system improvement. Compared to some other Human Reliability Assessment tools such as HEART, THERP is a relatively unsophisticated tool as the range of PSFs considered is generally low and the underlying psychological causes of errors are not identified.
With regards to the consistency of the technique, large discrepancies have been found in practice with regards to different analysts assessment of the risk associated with the same tasks. Such discrepancies may have arisen from either the process mapping of the tasks in question or in the estimation of the HEPs associated with each of the tasks through the use of THERP tables compared to, for example, expert judgement or the application of PSFs. [4, 5, 6].^{Missing reference 6}
The methodology fails to provide guidance to the assessor in how to model the impact of PSFs and the influence of the situation on the errors being assessed.
The THERP HRAETs implicitly assume that each sub-task’s HEP is independent from all others i.e. the HRAET does not update itself in the event that an operator takes a sub-optimal route through the task path. This is reinforced by the HEP being merely reduced by the chance of recovery from a mistake, rather than by introducing alternative (i.e. sub-optimal) “success” routes into the event-tree, which could allow for Bayesian updating of subsequent HEPs.
THERP is a “first generation” HRA tool, and in common with other such tools has been criticised for not taking adequate account of context [7]^{Missing reference}

References

[1] Kirwan, B. (1994) A Guide to Practical Human Reliability Assessment. CRC Press.

[2] Swain, A.D. & Guttmann, H.E., Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications. 1983, NUREG/CR-1278, USNRC.

[3] Humphreys, P. (1995). Human Reliability Assessor’s Guide. Human Factors in Reliability Group.

[4] Kirwan, B. (1996) The validation of three human reliability quantification techniques - THERP, HEART, JHEDI: Part I -- technique descriptions and validation issues. Applied Ergonomics. 27(6) 359-373.

[5] Kirwan, B. (1997) The validation of three human reliability quantification techniques - THERP, HEART, JHEDI: Part II - Results of validation exercise. Applied Ergonomics. 28(1) 17-25.

[7] Hollnagel, E. (2005) Human reliability assessment in context. Nuclear Engineering and Technology. 37(2) 159-166.