Reliability centered maintenance
Reliability Centered Maintenance (RCM) is a process to ensure that assets continue to do what their users require in their present operating context.
It is generally used to achieve improvements in fields such as the establishment of safe minimum levels of maintenance, changes to operating procedures and strategies, and the establishment of capital maintenance regimes and plans. Successful implementation of RCM will lead to increases in cost effectiveness and machine uptime, and to a greater understanding of the level of risk that the organization is managing.
The late John Moubray, in his industry-leading book RCM2, characterized reliability-centered maintenance as a process to establish the safe minimum levels of maintenance. This description echoed statements in the Nowlan and Heap report from United Airlines.
It is defined by the technical standard SAE JA1011, Evaluation Criteria for RCM Processes, which sets out the minimum criteria that any process should meet before it can be called RCM. This starts with the seven questions below, worked through in the order in which they are listed:
- 1. What is the item supposed to do, and what are its associated performance standards?
- 2. In what ways can it fail to provide the required functions?
- 3. What are the events that cause each failure?
- 4. What happens when each failure occurs?
- 5. In what way does each failure matter?
- 6. What systematic task can be performed proactively to prevent, or to diminish to a satisfactory degree, the consequences of the failure?
- 7. What must be done if a suitable preventive task cannot be found?
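The seven questions map naturally onto one row of an analysis worksheet. The following is a minimal sketch of such a row, not a format prescribed by SAE JA1011: the class name `RcmWorksheetEntry`, its field names, and the example pump values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RcmWorksheetEntry:
    """One row of a hypothetical RCM worksheet; fields follow the seven questions."""
    function: str                          # Q1: what the item is supposed to do
    performance_standard: str              # Q1: the standard it must meet
    functional_failure: str                # Q2: how it can fail to provide the function
    failure_mode: str                      # Q3: the event that causes the failure
    failure_effect: str                    # Q4: what happens when the failure occurs
    consequence: str                       # Q5: in what way the failure matters
    proactive_task: Optional[str] = None   # Q6: task that prevents or reduces it
    default_action: Optional[str] = None   # Q7: what to do if no such task exists

    def is_resolved(self) -> bool:
        # A row is complete once question 6 or question 7 has an answer.
        return self.proactive_task is not None or self.default_action is not None


# Illustrative values only; they do not come from any real analysis.
entry = RcmWorksheetEntry(
    function="Pump cooling water",
    performance_standard="at least 800 L/min",
    functional_failure="Delivers less than 800 L/min",
    failure_mode="Impeller worn",
    failure_effect="Low-flow alarm, downstream unit trips",
    consequence="operational",
    proactive_task="Monitor flow trend monthly",
)
print(entry.is_resolved())
```

Keeping questions 6 and 7 as optional fields reflects the order of the questions: a default action is only sought when no suitable proactive task has been found.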
Reliability centered maintenance is an engineering framework that enables the definition of a complete maintenance regime. It regards maintenance as the means to maintain the functions a user may require of machinery in a defined operating context. As a discipline it enables machinery stakeholders to monitor, assess, predict and generally understand the working of their physical assets. This is embodied in the initial part of the RCM process, which is to identify the operating context of the machinery and write a Failure Mode, Effects and Criticality Analysis (FMECA). The second part of the analysis is to apply the "RCM logic", which helps determine the appropriate maintenance tasks for the failure modes identified in the FMECA. Once the logic is complete for all elements in the FMECA, the resulting list of maintenance tasks is "packaged", so that the periodicities of the tasks are rationalised and the tasks can be called up in work packages; it is important not to destroy the applicability of maintenance in this phase. Lastly, RCM is kept live throughout the "in-service" life of machinery, where the effectiveness of the maintenance is kept under constant review and adjusted in light of the experience gained.
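The packaging step can be sketched as follows. This is one possible reading, not a prescribed algorithm: it assumes that "not destroying applicability" means a task must never be scheduled less often than the interval the analysis produced, so each task is moved down to the longest standard package interval that does not exceed its own. The function name `package_tasks`, the task names, and the interval values (in days) are hypothetical.

```python
def package_tasks(task_intervals, package_intervals):
    """Group tasks into work packages without lengthening any task interval.

    task_intervals:    dict mapping task name -> analysed interval (days)
    package_intervals: iterable of available work-package intervals (days)
    """
    packages = {p: [] for p in sorted(package_intervals)}
    for task, interval in task_intervals.items():
        # Only packages that run at least as often as the task requires.
        eligible = [p for p in packages if p <= interval]
        if not eligible:
            raise ValueError(f"no package is frequent enough for {task!r}")
        # Choose the longest eligible interval to avoid over-maintaining.
        packages[max(eligible)].append(task)
    return packages


# Illustrative values only.
plan = package_tasks(
    {"inspect belt": 45, "change oil": 100, "calibrate sensor": 30},
    [30, 90, 365],
)
print(plan)
```

Rounding down rather than to the nearest package costs some extra maintenance effort, but it keeps every task at least as frequent as the interval the analysis justified.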
RCM can be used to create a cost-effective maintenance strategy to address dominant causes of equipment failure. It is a systematic approach to defining a routine maintenance program composed of cost-effective tasks that preserve important functions.
The important functions (of a piece of equipment) to preserve with routine maintenance are identified, their dominant failure modes and causes are determined, and the consequences of failure are ascertained. Levels of criticality are assigned to the consequences of failure. Some functions are not critical and are left to "run to failure", while other functions must be preserved at all costs. Maintenance tasks are selected that address the dominant failure causes. This process directly addresses maintenance-preventable failures. Failures caused by unlikely events, unpredictable acts of nature, etc. will usually receive no action provided their risk (the combination of severity and frequency) is trivial, or at least tolerable. When the risk of such failures is very high, RCM encourages (and sometimes mandates) the user to consider changing something that will reduce the risk to a tolerable level.
The result is a maintenance program that focuses scarce economic resources on those items that would cause the most disruption if they were to fail.
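The risk-based screening described above can be illustrated with a short sketch. It rests on several assumptions that are not part of RCM itself: severity and frequency are scored on hypothetical 1 to 5 scales, risk is taken as their product, and a single hypothetical threshold `tolerable` separates tolerable from intolerable risk. The names `FailureMode`, `risk_score` and `recommend` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int                  # assumed scale: 1 (negligible) to 5 (catastrophic)
    frequency: int                 # assumed scale: 1 (remote) to 5 (frequent)
    maintenance_preventable: bool  # can routine maintenance address the cause?


def risk_score(fm: FailureMode) -> int:
    # Risk as a combination of severity and frequency (here: a simple product).
    return fm.severity * fm.frequency


def recommend(fm: FailureMode, tolerable: int = 6) -> str:
    score = risk_score(fm)
    if score <= tolerable:
        # Tolerable risk: no scheduled maintenance is justified.
        return "run to failure" if fm.maintenance_preventable else "no action"
    if fm.maintenance_preventable:
        return "select maintenance task"
    # Intolerable risk that maintenance cannot address.
    return "consider one-time change"


# Illustrative failure modes only.
modes = [
    FailureMode("bearing wear", 4, 3, True),
    FailureMode("indicator lamp burns out", 1, 3, True),
    FailureMode("lightning strike", 5, 1, False),
    FailureMode("pump room flooding", 5, 3, False),
]
for fm in sorted(modes, key=risk_score, reverse=True):
    print(f"{fm.name}: risk {risk_score(fm)}, {recommend(fm)}")
```

Sorting by risk score before reviewing the list mirrors the idea of focusing limited resources on the items whose failure would be most disruptive.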
RCM emphasizes the use of Predictive Maintenance (PdM) techniques in addition to traditional preventive measures.
The term "reliability-centered maintenance" was first used in public papers authored by Tom Matteson, Stanley Nowlan, Howard Heap, and other senior executives and engineers at United Airlines (UAL) to describe a process used to determine the optimum maintenance requirements for aircraft. Having left United Airlines to pursue a consulting career a few months before the publication of the final Nowlan-Heap report, Matteson received no authorial credit for the work. However, his contributions were substantial and perhaps indispensable to the document as a whole. The US Department of Defense (DOD) sponsored the authoring of both a textbook (by UAL) and an evaluation report (by RAND Corporation) on Reliability-Centered Maintenance, both published in 1978. They brought RCM concepts to the attention of a wider audience. The textbook described efforts by commercial airlines and the US Navy in the 1960s and 1970s to improve the reliability of their new jet, the Boeing 747.
The first generation of jet aircraft had a crash rate that would be considered highly alarming today, and both the Federal Aviation Administration (FAA) and the airlines' senior management felt strong pressure to improve matters. In the early 1960s, with FAA approval the airlines began to conduct a series of intensive engineering studies on in-service aircraft. The studies proved that the fundamental assumption of design engineers and maintenance planners—that every airplane and every major component in the airplane (such as its engines) had a specific "lifetime" of reliable service, after which it had to be replaced (or overhauled) in order to prevent failures—was wrong in nearly every specific example in a complex modern jet airliner.
This was one of many astounding discoveries that have revolutionized the managerial discipline of physical asset management and have been at the base of many developments since this seminal work was published. Among the paradigm shifts inspired by RCM were:
- an understanding that the vast majority of failures are not necessarily linked to the age of the asset (this is often modeled by the "memoryless" exponential probability distribution)
- changing from efforts to predict life expectancies to trying to manage the process of failure
- an understanding of the difference between the requirements of assets from a user perspective, and the design reliability of the asset
- an understanding of the importance of managing assets on condition (often referred to as condition monitoring, condition based maintenance and predictive maintenance)
- an understanding of four basic routine maintenance tasks
- linking levels of tolerable risk to maintenance strategy development
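The "memoryless" property mentioned in the first point can be shown in a few lines. This sketch assumes a constant failure rate; the symbols $\lambda$ (failure rate), $T$ (time to failure), $R$ (survival function), $f$ (probability density) and $h$ (hazard rate) are introduced here for illustration.

```latex
\begin{align}
R(t) &= P(T > t) = e^{-\lambda t}, \\
P(T > s + t \mid T > s) &= \frac{R(s+t)}{R(s)}
  = \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = R(t), \\
h(t) &= \frac{f(t)}{R(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda .
\end{align}
```

Under this model, an item that has already survived for a time $s$ is exactly as likely to survive a further time $t$ as a new item would be, so replacing it purely on the basis of age does not reduce its probability of failure.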
Today RCM is defined in the standard SAE JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes. This sets out the minimum criteria for what is, and for what is not, able to be defined as RCM.
The standard is a watershed event in the ongoing evolution of the discipline of physical asset management. Prior to the development of the standard many processes were labeled as RCM even though they were not true to the intentions and the principles in the original report that defined the term publicly.
Today companies can use this standard to ensure that the processes, services and software they purchase and implement conform with what is defined as RCM, giving them the best chance of achieving the many benefits attributable to rigorous application of RCM.
The RCM process described in the DOD/UAL report recognized three principal risks from equipment failures: threats
- to safety,
- to operations, and
- to the maintenance budget.
Modern RCM gives threats to the environment a separate classification, though most forms manage them in the same way as threats to safety.
RCM offers five principal options among the risk management strategies:
- Predictive maintenance tasks,
- Preventive Restoration or Preventive Replacement maintenance tasks,
- Detective maintenance tasks,
- Run-to-Failure, and
- One-time changes to the "system" (changes to hardware design, to operations, or to other things).
RCM also offers specific criteria to use when selecting a risk management strategy for a system that presents a specific risk when it fails. Some are technical in nature (can the proposed task detect the condition it needs to detect? does the equipment actually wear out with use?). Others are goal-oriented (is it reasonably likely that the proposed task and task frequency will reduce the risk to a tolerable level?). The criteria are often presented in the form of a decision-logic diagram, though this is not intrinsic to the nature of the process.
Identification of Safety Critical Elements (SCE) and maintenance of their associated predefined performance standards is the foundation of asset integrity management.
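A decision-logic diagram of this kind can be approximated in code. The sketch below is a deliberately simplified illustration and does not reproduce any published RCM decision diagram: the ordering of the checks, the `Assessment` fields, and the use of a single `task_reduces_risk` flag for all task types are assumptions made for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    PREDICTIVE = "predictive maintenance task"
    PREVENTIVE = "preventive restoration or replacement task"
    DETECTIVE = "detective maintenance task"
    RUN_TO_FAILURE = "run-to-failure"
    ONE_TIME_CHANGE = "one-time change to the system"


@dataclass
class Assessment:
    condition_detectable: bool      # technical: can a task detect the warning condition?
    wears_out_with_use: bool        # technical: does the equipment actually wear out?
    failure_is_hidden: bool         # assumed criterion: failure not evident in normal use
    task_reduces_risk: bool         # goal: would the task reduce risk to a tolerable level?
    untreated_risk_tolerable: bool  # is the risk tolerable with no task at all?


def select_strategy(a: Assessment) -> Strategy:
    # Assumed ordering: try condition-based tasks first, then age-based tasks,
    # then failure-finding tasks, and only then the two default options.
    if a.condition_detectable and a.task_reduces_risk:
        return Strategy.PREDICTIVE
    if a.wears_out_with_use and a.task_reduces_risk:
        return Strategy.PREVENTIVE
    if a.failure_is_hidden and a.task_reduces_risk:
        return Strategy.DETECTIVE
    if a.untreated_risk_tolerable:
        return Strategy.RUN_TO_FAILURE
    return Strategy.ONE_TIME_CHANGE


# Illustrative assessment: a detectable warning condition and an effective task.
choice = select_strategy(Assessment(True, False, False, True, False))
print(choice.value)
```

Each branch pairs a technical criterion with the goal-oriented one, so a task is only selected when it is both feasible and likely to bring the risk down to a tolerable level.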
After being created by the commercial aviation industry, RCM was adopted by the U.S. military (beginning in the mid-1970s) and by the U.S. commercial nuclear power industry (in the 1980s).
Starting in the late 1980s, an independent initiative led by John Moubray corrected some early flaws in the process and adapted it for use in wider industry. Moubray was also responsible for popularizing the method and for introducing it to much of the industrial community outside the aviation industry (RCM2).
In the two decades since RCM2 was first released, industry has undergone massive change. Increased economic pressures and competition, together with advances in lean thinking and efficiency methods, meant that companies often struggled to find the people required to carry out an RCM initiative.
During this period many methods sprang up that reduced the rigour of the RCM approach. The result was the propagation of many methods that called themselves RCM, yet had little in common with the original concepts. In some cases these were misleading and inefficient, while in other cases they were even dangerous.
Since each initiative is sponsored by one or more consulting firms eager to help clients use it, there is still considerable disagreement about their relative dangers (or merits). There is also a tendency for consulting firms to promote a software package as an alternative methodology in place of the knowledge required to perform analyses.
Although SAE JA1011 is a voluntary standard, it provides a reference for companies looking to implement RCM, so that they can ensure they are getting a process, software package or service that is in line with the original report.
Disney introduced RCM to its parks in 1997, led by Paul Pressler and consultants McKinsey & Company, laying off a large number of maintenance workers and saving large amounts of money. Some people blamed the new cost-conscious maintenance culture for some of the incidents at Disneyland Resort that occurred in the following years.
- Standard To Define RCM (Part 1), Dana Netherton, Maintenance Technology (1998)
- Standard To Define RCM (Part 2), Dana Netherton, Maintenance Technology (1998)
- SAE JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes, Society of Automotive Engineers, 1 Aug 1998
- SAE JA1012, A Guide to the Reliability-Centered Maintenance (RCM) Standard, Society of Automotive Engineers, 1 Jan 2002
- Moubray, John. Reliability-Centered Maintenance. Industrial Press. New York, NY. 1997. ISBN 978-0-8311-3146-3
- MSG-3. Maintenance Program Development Document. Air Transport Association, Washington, D.C. Revision 2, 1993.
- "Nowlan, F. Stanley, and Howard F. Heap. Reliability-Centered Maintenance. Report Number AD-A066579" (PDF). United States Department of Defense. 1978. Archived from the original on 2013-08-01.
- "MIL-HDBK-2173, Department of Defense Handbook: Reliability-Centered Maintenance (RCM) Requirements for Naval Aircraft, Weapons Systems, and Support Equipment (S/S BY NAVAIR 00-25-403)" (PDF). United States Department of Defense. 30 Jan 1998. Archived from the original on 2013-10-07.
- "MIL-P-24534A, Military Specification: Planned Maintenance System, Development of Maintenance Requirement Cards, Maintenance Index Pages, and Associated Documentation" (PDF). Naval Sea Systems Command. 7 May 1985.
- "MIL-STD-2173, Military Standard: Reliability-Centered Maintenance (RCM) Requirements for Naval Aircraft, Weapons Systems, and Support Equipment (S/S By MIL-HDBK-2173)" (PDF). United States Department of Defense. 21 Jan 1986. Archived from the original on 2013-11-06.
- "MIL-STD-3034, Department of Defense Standard Practice: Reliability-Centered Maintenance (RCM) Process" (PDF). United States Department of Defense. 21 Jan 2011.
- "NASA Reliability Centered Maintenance (RCM) Guide for Facilities and Collateral Equipment" (PDF). NASA. Feb 2000.
- "NAVAIR 00-25-403, Guidelines for the Naval Aviation Reliability-Centered Maintenance (RCM) Process" (PDF). Naval Air Systems Command. 1 Jul 2005.
- "NAVSEA S9081-AB-GIB-010, Reliability-Centered Maintenance (RCM) Handbook" (PDF). Naval Sea Systems Command. 18 Apr 2007. Archived from the original on 2013-12-04.
- "TM 5-698-2, Technical Manual: Reliability-Centered Maintenance (RCM) for Command, Control, Communications, Computer, Intelligence, Surveillance, and Reconnaissance (C4ISR) Facilities" (PDF). United States Army. 6 Oct 2006.
- Introduction to Reliability Centered Maintenance (RCM) Part 1. Archived from the original on 2014-06-03.
- Disney Ride Upkeep Assailed, Mike Anton and Kimi Yoshino, Los Angeles Times, 9 Nov 2003. Archived from the original on 2014-05-03.