Failure mode and effects analysis

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Ncsupimaster (talk | contribs) at 18:05, 12 June 2012. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A failure modes and effects analysis (FMEA) is an inductive failure analysis used in product development, systems engineering, reliability engineering and operations management to analyze potential failure modes within a system and classify them by the severity and likelihood of the failures. A successful FMEA activity helps a team identify potential failure modes based on past experience with similar products or processes, or on common-sense logic, enabling the team to design those failures out of the system with a minimum of effort and resource expenditure, thereby reducing development time and costs. Because it forces a review of functions and functional requirements, it also serves as a form of design review that removes failure-related weaknesses from the design. It is widely used in development and manufacturing industries in various phases of the product life cycle and is increasingly finding use in the service industry. Failure modes are any errors or defects in a process, design, or item, especially those that affect the intended function of the product or process; they can be potential or actual. Effects analysis refers to studying the consequences of those failures at different system levels.

Basic terms[1]

Figure: FMEA cycle.
Failure
The loss of an intended function of a device under stated conditions.
Failure mode
The manner or way in which a failure is observed, in terms of failure of the part function under investigation; it may generally describe the way the failure occurs. It shall at least clearly describe an end failure state of the item or function under consideration that results from the failure mechanism (the cause of the failure mode). For example, a fractured axle or an open electrical contact can be a failure mode.
Failure Cause and/or Mechanism
Defects in requirements, design, process, quality control, handling or part application that are the underlying cause, or sequence of causes, initiating a process (mechanism) that leads to a failure mode over a certain time. A failure mode may have more than one cause. For example, fatigue or corrosion of a beam or contact is a failure mechanism, not a failure mode. The related failure mode (state) under analysis could be "full fracture of structural beam" or "an open electrical contact". The initial cause might have been "improper application of corrosion protection layer (paint)" and/or "(abnormal) vibration input from another failed system".
Failure effect
Immediate consequences of a failure on operation, function or functionality, or status of some item.
Indenture levels
An identifier for item complexity. Complexity increases as levels are closer to one.
Local effect
The failure effect as it applies to the item under analysis.
Next higher level effect
The failure effect as it applies at the next higher indenture level.
End effect
The failure effect at the highest indenture level or total system.
Severity
The consequences of a failure mode. Severity considers the worst potential consequence of a failure, determined by the degree of injury, property damage, system damage and/or time lost to repair the failure.

History

Procedures for conducting FMECA were described in US Armed Forces Military Procedures document MIL-P-1629[2] (1949; revised in 1980 as MIL-STD-1629A).[3] By the early 1960s, contractors for the U.S. National Aeronautics and Space Administration (NASA) were using variations of FMECA or FMEA under a variety of names.[4][5] NASA programs using FMEA variants included Apollo, Viking, Voyager, Magellan, Galileo, and Skylab.[6][7][8] The civil aviation industry was an early adopter of FMEA, with the Society of Automotive Engineers publishing ARP926 in 1967.[9]

During the 1970s, use of FMEA and related techniques spread to other industries. In 1971 NASA prepared a report for the U.S. Geological Survey recommending the use of FMEA in assessment of offshore petroleum exploration.[10] The use of FMEA as part of HACCP on the Apollo space program later carried over into the food industry in general.[11] In the late 1970s the Ford Motor Company introduced FMEA to the automotive industry for safety and regulatory consideration after the Pinto affair. Ford applied the same approach to processes (PFMEA) to consider potential process-induced failures prior to launching production.

Although initially developed by the military, FMEA methodology is now extensively used in a variety of industries including semiconductor processing, food service, plastics, software, and healthcare.[12][13] It is integrated into the Automotive Industry Action Group's (AIAG) Advanced Product Quality Planning (APQP) process to provide risk mitigation, in both product and process development phases. Each potential cause must be considered for its effect on the product or process and, based on the risk, actions are determined and risks revisited after actions are complete. Toyota has taken this one step further with its Design Review Based on Failure Mode (DRBFM) approach. The method is now supported by the American Society for Quality which provides detailed guides on applying the method.[14]

Implementation

In FMEA, failures are prioritized according to how serious their consequences are, how frequently they occur and how easily they can be detected. An FMEA also documents current knowledge and actions about the risks of failures for use in continuous improvement. FMEA is used during the design stage with an aim to avoid future failures (sometimes called DFMEA in that case). Later it is used for process control, before and during ongoing operation of the process. Ideally, FMEA begins during the earliest conceptual stages of design and continues throughout the life of the product or service.

The outcomes of an FMEA development are actions to prevent or reduce the severity or likelihood of failures, starting with the highest-priority ones. It may be used to evaluate risk management priorities for mitigating known threat vulnerabilities. FMEA helps select remedial actions that reduce cumulative impacts of life-cycle consequences (risks) from a systems failure (fault).

It is used in many formal quality systems, such as QS-9000, ISO/TS 16949 and AS9100.

Using FMEA when designing

FMEA is intended to provide an analytical approach to reviewing potential failure modes and their associated causes. FMEA is a recognised tool to help to assess which risks have the greatest concern, and therefore which risks to address in order to prevent problems before they arise. The development of these specifications helps to ensure the product will meet the defined requirements and customer needs.

The pre-work

The process for conducting an FMEA is typically developed in three main phases, in which appropriate actions need to be defined. Before starting with an FMEA, several other techniques are frequently employed to ensure that robustness and history are included in the analysis.

A robustness analysis can be obtained from interface matrices, boundary diagrams, and parameter diagrams. Failures are often found from external 'noise factors' and from shared interfaces with other parts and/or systems.

Typically, a description of the system and its function is developed, considering both intentional and unintentional uses.

A block diagram of the system is often created for inclusion with the FMEA, giving an overview of the major components or process steps and how they are related. These are called logical relations around which the FMEA can be developed.

The primary FMEA document or 'worksheet' lists all of the items or functions of the system in a logical manner, typically based on the block diagram.

Example FMEA Worksheet
Item / Function: Fill tub
Potential failure mode: High level sensor is disconnected
Potential effects of failure: Liquid spills on customer floor
S (severity rating): 8
Potential cause(s): Sensor is exposed at top and can be easily disconnected by user
O (occurrence rating): 2
Current controls: Fill timeout based on time to fill to low level sensor
D (detection rating): 5
CRIT (critical characteristic): N
RPN (risk priority number): 80
Recommended actions: Perform cost analysis of adding additional sensor halfway between low and high level sensors to calculate fill rate at mid-point and determine max fill volume in case high level sensor is disconnected
Responsibility and target completion date: Jane Doe, 15-May-2012
Action taken: (blank)
New S, New O, New D, New RPN: (blank)

NOTE: The example format shown above is not in line with MIL-STD-1629 or civil aerospace practice. The basic terms given in the "Basic terms" section of this page do not appear in this template.
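One row of the example worksheet can be represented as a small record. The following Python sketch is illustrative only: the class name `WorksheetRow` and its field names are assumptions chosen to mirror the column headings, and are not taken from any FMEA standard. The values are those of the "Fill tub" row.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    """One line of an FMEA worksheet (hypothetical field names)."""
    item_function: str
    failure_mode: str
    effect: str
    severity: int      # S, rated 1-10
    cause: str
    occurrence: int    # O, rated 1-10
    controls: str
    detection: int     # D, rated 1-10

    @property
    def rpn(self) -> int:
        # Risk priority number: the product of the three ratings.
        return self.severity * self.occurrence * self.detection


# Values copied from the example worksheet row.
fill_tub = WorksheetRow(
    item_function="Fill tub",
    failure_mode="High level sensor is disconnected",
    effect="Liquid spills on customer floor",
    severity=8,
    cause="Sensor is exposed at top and can be easily disconnected by user",
    occurrence=2,
    controls="Fill timeout based on time to fill to low level sensor",
    detection=5,
)

print(fill_tub.rpn)  # 8 * 2 * 5 = 80, matching the RPN column
```

Deriving the RPN from the three ratings, rather than storing it as a separate field, keeps the row consistent when a rating is revised.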

Step 1: Occurrence

In this step it is necessary to look at the cause of a failure mode and how often it occurs. This can be done by looking at similar products or processes and the failure modes that have been documented for them in the past. A failure cause is regarded as a design weakness. All potential causes of a failure mode should be identified and documented in technical terms. Examples of causes are erroneous algorithms, excessive voltage or improper operating conditions. Each failure mode is given an occurrence ranking (O) from 1 to 10. Actions need to be determined if the occurrence is high (meaning > 4 for non-safety failure modes and > 1 when the severity number from step 2 is 9 or 10). This step is called the detailed development section of the FMEA process. Occurrence can also be defined as a percentage: for example, a non-safety issue that occurs less than 1% of the time may be given a ranking of 1. The exact scale depends on the product and the customer specification.

Rating Meaning
1 No known occurrences on similar products or processes
2/3 Low (relatively few failures)
4/5/6 Moderate (occasional failures)
7/8 High (repeated failures)
9/10 Very high (failure is almost inevitable)

[15]
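The action thresholds described in this step can be written as a small check. This is a minimal sketch, assuming that "non-safety failure modes" are those with a severity below 9; the function name `occurrence_needs_action` is hypothetical.

```python
def occurrence_needs_action(occurrence: int, severity: int) -> bool:
    """Return True if the occurrence ranking calls for action.

    Thresholds follow the text: O > 1 when severity is 9 or 10,
    otherwise O > 4. Treating severity 9-10 as the safety-related
    case is an assumption of this sketch.
    """
    for name, value in (("occurrence", occurrence), ("severity", severity)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    threshold = 1 if severity >= 9 else 4
    return occurrence > threshold


# A moderate-severity failure mode with occurrence 4 is below the threshold,
# while the same occurrence with severity 9 calls for action.
print(occurrence_needs_action(occurrence=4, severity=6))  # False
print(occurrence_needs_action(occurrence=4, severity=9))  # True
```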

Step 2: Severity

Determine all failure modes based on the functional requirements and their effects. Examples of failure modes are electrical short-circuiting, corrosion or deformation. A failure mode in one component can lead to a failure mode in another component; therefore each failure mode should be listed in technical terms and by function. Next, the ultimate effect of each failure mode needs to be considered. A failure effect is defined as the result of a failure mode on the function of the system as perceived by the user. It is therefore convenient to write these effects down in terms of what the user might see or experience. Examples of failure effects are degraded performance, noise or even injury to a user. Each effect is given a severity number (S) from 1 (no danger) to 10 (critical). These numbers help an engineer to prioritize the failure modes and their effects. If the severity of an effect is 9 or 10, actions are considered to change the design by eliminating the failure mode, if possible, or protecting the user from the effect. A severity rating of 9 or 10 is generally reserved for those effects which would cause injury to a user or otherwise result in litigation.

Rating Meaning
1 No effect
2 Very minor (only noticed by discriminating customers)
3 Minor (affects very little of the system, noticed by average customer)
4/5/6 Moderate (most customers are annoyed)
7/8 High (causes a loss of primary function; customers are dissatisfied)
9/10 Very high and hazardous (product becomes inoperative; customers angered; the failure may result in unsafe operation and possible injury)

[15]

Step 3: Detection

When appropriate actions are determined, it is necessary to test their effectiveness. In addition, design verification is needed, and the proper inspection methods need to be chosen. First, an engineer should look at the current controls of the system that prevent failure modes from occurring or that detect the failure before it reaches the customer. Next, one should identify testing, analysis, monitoring and other techniques that can be or have been used on similar systems to detect failures. From these controls an engineer can learn how likely it is for a failure to be identified or detected. Each combination from the previous two steps receives a detection number (D). This ranks the ability of planned tests and inspections to remove defects or detect failure modes in time. The assigned detection number measures the risk that the failure will escape detection: a high detection number indicates that the chances are high that the failure will escape detection, or in other words, that the chances of detection are low.

Rating Meaning
1 Certain - fault will be caught on test
2 Almost Certain
3 High
4/5/6 Moderate
7/8 Low
9/10 Fault will be passed to customer undetected

[15]
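The three rating tables above can be held as simple lookup data, for instance to annotate worksheet entries with the meaning of each number. The sketch below is one possible encoding; the names `SCALES` and `describe` are hypothetical, and the meanings are abbreviated from the tables.

```python
# Rating bands taken from the three tables: (lowest, highest, meaning).
SCALES = {
    "occurrence": [
        (1, 1, "No known occurrences on similar products or processes"),
        (2, 3, "Low (relatively few failures)"),
        (4, 6, "Moderate (occasional failures)"),
        (7, 8, "High (repeated failures)"),
        (9, 10, "Very high (failure is almost inevitable)"),
    ],
    "severity": [
        (1, 1, "No effect"),
        (2, 2, "Very minor"),
        (3, 3, "Minor"),
        (4, 6, "Moderate"),
        (7, 8, "High"),
        (9, 10, "Very high and hazardous"),
    ],
    "detection": [
        (1, 1, "Certain - fault will be caught on test"),
        (2, 2, "Almost certain"),
        (3, 3, "High"),
        (4, 6, "Moderate"),
        (7, 8, "Low"),
        (9, 10, "Fault will be passed to customer undetected"),
    ],
}


def describe(scale: str, rating: int) -> str:
    """Return the table meaning for a rating on the named scale."""
    for low, high, meaning in SCALES[scale]:
        if low <= rating <= high:
            return meaning
    raise ValueError(f"{scale} rating must be between 1 and 10, got {rating}")


print(describe("severity", 8))    # High
print(describe("detection", 5))   # Moderate
```

Note that the detection scale runs in the opposite sense to its labels: a detection rating of 7 or 8 means a low chance of detection and therefore a higher risk.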

After these three basic steps, risk priority numbers (RPN) are calculated.

Risk priority number (RPN)

RPNs play an important part in the choice of an action against failure modes. They are threshold values in the evaluation of these actions.

After ranking the severity, occurrence and detectability, the RPN is calculated by multiplying these three numbers: RPN = S × O × D

This has to be done for the entire process and/or design. Once this is done it is easy to determine the areas of greatest concern. The failure modes that have the highest RPN should be given the highest priority for corrective action. This means it is not always the failure modes with the highest severity numbers that should be treated first: a less severe failure that occurs more often and is harder to detect can have a higher RPN.
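The prioritization described above can be sketched in a few lines of Python. The failure modes and their ratings below are invented for illustration, and the `FailureMode` type is a hypothetical helper rather than part of any FMEA tool.

```python
from typing import NamedTuple


class FailureMode(NamedTuple):
    """A failure mode with its three ratings (illustrative type)."""
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        # RPN = S x O x D
        return self.severity * self.occurrence * self.detection


# Hypothetical ratings, chosen only to show the ordering effect.
modes = [
    FailureMode("Cracked housing", severity=9, occurrence=1, detection=2),
    FailureMode("Loose connector", severity=5, occurrence=6, detection=4),
    FailureMode("Worn seal", severity=6, occurrence=3, detection=3),
]

# Highest RPN first: these get priority for corrective action.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.name}")
```

Here the "Loose connector" entry (RPN 120) ranks above "Cracked housing" (RPN 18) even though its severity is lower, because it occurs more often and is harder to detect.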

After these values are allocated, recommended actions with targets, responsibility and dates of implementation are noted. These actions can include specific inspection, testing or quality procedures, redesign (such as selection of new components), adding more redundancy and limiting environmental stresses or operating range. Once the actions have been implemented in the design/process, the new RPN should be checked, to confirm the improvements. These tests are often put in graphs, for easy visualization. Whenever a design or a process changes, an FMEA should be updated.
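Checking the new RPN after an action amounts to recomputing the product with the revised ratings. In the sketch below the "before" ratings are those of the "Fill tub" worksheet row; the "after" ratings are purely hypothetical and assume that an added sensor improves detection from 5 to 2 while severity and occurrence stay the same.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: RPN = S x O x D."""
    return severity * occurrence * detection


# Ratings from the example worksheet row.
before = rpn(severity=8, occurrence=2, detection=5)

# Hypothetical revised ratings after the recommended action is taken.
after = rpn(severity=8, occurrence=2, detection=2)

# Integer percentage reduction, suitable for plotting or reporting.
reduction_percent = 100 * (before - after) // before

print(before, after, reduction_percent)  # 80 32 60
```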

A few logical but important thoughts come to mind:

  • Try to eliminate the failure mode (some failures are more preventable than others)
  • Minimize the severity of the failure
  • Reduce the occurrence of the failure mode
  • Improve the detection

Timing of FMEA

The FMEA should be updated whenever:

  • A new cycle begins (new product/process)
  • Changes are made to the operating conditions
  • A change is made in the design
  • New regulations are instituted
  • Customer feedback indicates a problem

Uses of FMEA

  • Development of system requirements that minimize the likelihood of failures.
  • Development of methods to design and test systems to ensure that the failures have been eliminated.
  • Evaluation of the requirements of the customer to ensure that those do not give rise to potential failures.
  • Identification of certain design characteristics that contribute to failures, and minimize or eliminate those effects.
  • Tracking and managing potential risks in the design. This helps avoid the same failures in future projects.
  • Ensuring that any failure that could occur will not injure the customer or seriously impact a system.
  • Production of world-class quality products.

Advantages

  • Improve the quality, reliability and safety of a product/process
  • Improve company image and competitiveness
  • Increase user satisfaction
  • Reduce system development timing and cost
  • Collect information to reduce future failures, capture engineering knowledge
  • Reduce the potential for warranty concerns
  • Early identification and elimination of potential failure modes
  • Emphasize problem prevention
  • Minimize late changes and associated cost
  • Catalyst for teamwork and idea exchange between functions
  • Reduce the possibility of the same kind of failure in the future
  • Reduce the impact on the company's profit margin
  • Reduce possible scrap in production

Limitations

Since FMEA depends on the members of the committee that examines product failures, it is limited by their experience of previous failures. If a failure mode cannot be identified, external help is needed from consultants who are aware of the many different types of product failure. FMEA is thus part of a larger system of quality control, where documentation is vital to implementation. General texts and detailed publications are available in forensic engineering and failure analysis. Many national and international standards require that FMEA be used in evaluating product integrity.

If used as a top-down tool, FMEA may only identify major failure modes in a system. Fault tree analysis (FTA) is better suited for "top-down" analysis. When used as a "bottom-up" tool, FMEA can augment or complement FTA and identify many more causes and failure modes resulting in top-level symptoms. It is not able to discover complex failure modes involving multiple failures within a subsystem, or to report expected failure intervals of particular failure modes up to the upper-level subsystem or system.[citation needed]

Additionally, the multiplication of the severity, occurrence and detection rankings may result in rank reversals, where a less serious failure mode receives a higher RPN than a more serious failure mode.[16] The reason for this is that the rankings are ordinal scale numbers, and multiplication is not defined for ordinal numbers. The ordinal rankings only say that one ranking is better or worse than another, but not by how much. For instance, a ranking of "2" may not be twice as severe as a ranking of "1," or an "8" may not be twice as severe as a "4," but multiplication treats them as though they are. See Level of measurement for further discussion.
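A rank reversal is easy to reproduce with arithmetic. The ratings in this sketch are invented for illustration; it also shows that quite different combinations of ratings can collapse to the same RPN.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: RPN = S x O x D."""
    return severity * occurrence * detection


# Hypothetical failure modes: one hazardous but rare and easily detected,
# one a mere nuisance that is more frequent and harder to detect.
hazardous = rpn(severity=10, occurrence=1, detection=2)
nuisance = rpn(severity=3, occurrence=4, detection=4)

# The nuisance outranks the hazardous failure mode: a rank reversal.
print(hazardous, nuisance, nuisance > hazardous)  # 20 48 True

# Three very different risk profiles that share a single RPN value.
same_rpn = {rpn(8, 2, 5), rpn(4, 4, 5), rpn(10, 8, 1)}
print(same_rpn)  # {80}
```

This is why severity ratings of 9 or 10 are commonly reviewed on their own, independently of the RPN ordering, as the severity step already suggests.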

Software

Most FMEAs are created as a spreadsheet. Specialized FMEA software packages exist that offer some advantages over spreadsheets.

Types of FMEA

  • Process: analysis of manufacturing and assembly processes
  • Design: analysis of products prior to production
  • Concept: analysis of systems or subsystems in the early design concept stages
  • Equipment: analysis of machinery and equipment design before purchase
  • Service: analysis of service industry processes before they are released to impact the customer
  • System: analysis of the global system functions
  • Software: analysis of the software functions

References

  1. ^ Langford, J. W. (1995). Logistics: Principles and Applications. McGraw Hill. p. 488.
  2. ^ "MIL-P-1629 - Procedures for performing a failure mode effect and critical analysis". Department of Defense (US). 9 November 1949.
  3. ^ "MIL-STD-1629A - Procedures for performing a failure mode effect and criticality analysis". Department of Defense (USA). 24 November 1980.
  4. ^ Neal, R.A. (1962). Modes of Failure Analysis Summary for the Nerva B-2 Reactor (PDF). Westinghouse Electric Corporation Astronuclear Laboratory. WANL–TNR–042. Retrieved 2010-03-13.
  5. ^ Dill, Robert (1963). State of the Art Reliability Estimate of Saturn V Propulsion Systems (PDF). General Electric Company. RM 63TMP–22. Retrieved 2010-03-13.
  6. ^ Procedure for Failure Mode, Effects and Criticality Analysis (FMECA) (PDF). National Aeronautics and Space Administration. 1966. RA–006–013–1A. Retrieved 2010-03-13.
  7. ^ Failure Modes, Effects, and Criticality Analysis (FMECA) (PDF). National Aeronautics and Space Administration JPL. PD–AD–1307. Retrieved 2010-03-13.
  8. ^ Experimenters' Reference Based Upon Skylab Experiment Management (PDF). National Aeronautics and Space Administration George C. Marshall Space Flight Center. 1974. M–GA–75–1. Retrieved 2011-08-16.
  9. ^ Design Analysis Procedure For Failure Modes, Effects and Criticality Analysis (FMECA). Society of Automotive Engineers. 1967. ARP926.
  10. ^ Dyer, Morris K. (1972). Applicability of NASA Contract Quality Management and Failure Mode Effect Analysis Procedures to the USFS Outer Continental Shelf Oil and Gas Lease Management Program (PDF). National Aeronautics and Space Administration George C. Marshall Space Flight Center. TM X–2567. Retrieved 2011-08-16.
  11. ^ Sperber, William H.; Stier, Richard F. (December 2009-January 2010). "Happy 50th Birthday to HACCP: Retrospective and Prospective". FoodSafety magazine: 42, 44–46.
  12. ^ Quality Associates International's History of FMEA
  13. ^ Fadlovich, Erik (December 31, 2007). "Performing Failure Mode and Effect Analysis". Embedded Technology.
  14. ^ "Failure Mode Effects Analysis (FMEA)". ASQ. Retrieved 2012-02-15.
  15. ^ a b c Otto, Kevin; Wood, Kristin (2001). Product Design - Techniques in Reverse Engineering and New Product Development. Prentice Hall. ISBN 0-13-021271-7.[page needed]
  16. ^ Kmenta, Steven; Ishii, Koshuke (2004). "Scenario-Based Failure Modes and Effects Analysis Using Expected Cost". Journal of Mechanical Design. 126 (6): 1027. doi:10.1115/1.1799614.