Cause and effect

05 April 2016

Root cause analysis (RCA) is the basis of sustainable reliability improvement, says Phil Burge, country communication manager at SKF.

With increasing pressure to improve equipment reliability, eliminate unplanned downtime and reduce maintenance costs, equipment owners need to understand not just how problems occur, but why. Doing that calls for a robust root cause analysis process. RCA is based on the theory that every failure stems from three causes: physical or technical causes; human causes such as errors of omission or commission; and latent or organisational causes that stem from the organisation’s systems, operating procedures and decision-making processes. The procedure to identify those causes includes seven basic elements.


Problems don’t need to be as obvious as a sudden, unplanned stoppage. A company might be equally interested in understanding the cause of a quality issue, a capacity constraint or other production issue. Whatever it is, if a problem is perceived as normal, it never gets fixed. So the first step in finding the root cause of an issue is to give it a name and assign a team to run the RCA process.


While the problem under investigation might manifest itself as a failure in a single machine component or process variable, the underlying cause may lie elsewhere entirely, so the investigating team needs to ensure it has a full understanding of the equipment or process under investigation. Graphical tools such as process flow diagrams or spider charts can be used to help visualise the system, aiding the identification and discussion of probable causes.

Collect data

Robust RCA is built on fact-based decision making, so teams need the relevant data at their fingertips. That data can come from a variety of sources, including historical process or quality records, or reports from operators. Inspection of faulty components can reveal a lot about the underlying causes of failure. Bearings, for example, can exhibit surface markings that are a tell-tale indicator of wear caused by a lubrication failure, stray electrical currents or problems with installation.


Multiple causes can lead to the same effect, and identifying the most likely causes, or combination of causes, is a key goal of RCA. The “five whys” technique is a surprisingly powerful approach that allows a team to move quickly back through a problem from end symptom to underlying issue. For more complex problems, the fishbone, or Ishikawa, diagram is a powerful way of graphically connecting multiple possible causes to a single end effect. For really complex systems, companies are increasingly adopting advanced tools such as statistical regression, Bayesian networks or artificial neural networks to find the most likely causes for the issues they observe. SKF, for example, uses a Bayesian network to support bearing failure or damage investigations.
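
To give a feel for how a probabilistic tool ranks candidate causes, here is a minimal Python sketch. It is not SKF's model: it uses a naive Bayes calculation, a much simpler relative of a Bayesian network, and every name and number in it (the causes, the observations, the priors and the likelihoods) is a hypothetical value chosen purely for illustration.

```python
# Illustrative only: all probabilities below are assumed, not measured data.

# Prior belief in each candidate cause before inspecting the bearing.
PRIORS = {
    "lubrication failure": 0.5,
    "stray electrical current": 0.2,
    "installation problem": 0.3,
}

# Assumed P(observation | cause) for each hypothetical observation.
LIKELIHOODS = {
    "fluting on raceway": {
        "lubrication failure": 0.05,
        "stray electrical current": 0.80,
        "installation problem": 0.05,
    },
    "discoloured grease": {
        "lubrication failure": 0.70,
        "stray electrical current": 0.40,
        "installation problem": 0.10,
    },
}


def rank_causes(observations):
    """Return (cause, posterior) pairs, most likely first.

    Treats observations as independent given the cause (naive Bayes).
    """
    scores = dict(PRIORS)
    for obs in observations:
        for cause in scores:
            scores[cause] *= LIKELIHOODS[obs][cause]
    total = sum(scores.values())
    posterior = {cause: score / total for cause, score in scores.items()}
    return sorted(posterior.items(), key=lambda item: item[1], reverse=True)


ranking = rank_causes(["fluting on raceway"])
for cause, probability in ranking:
    print(f"{cause}: {probability:.3f}")
```

With these assumed figures, a single observation is enough to move the least likely cause to the top of the list, which is the point of combining prior experience with inspection evidence. A real Bayesian network would also model dependencies between causes and between observations.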


Typically, eliminating the cause, or causes, of an issue will involve multiple actions across technical, human and organisational aspects of a process. If insufficient lubrication was the root cause of a bearing failure, for example, a sustainable fix may include changes in maintenance and inspection procedures to prevent subsequent failures, together with appropriate training and oversight to ensure the agreed procedures are followed.


Ideally, companies don’t want to wait for another failure to check they have identified the real cause of a problem. The improved understanding of the issue provided by the RCA often creates opportunities for early identification of the conditions that ultimately led to the failure. Installing condition monitoring equipment, such as vibration or temperature sensors, can, for example, allow the first signs of wear or misalignment in rotating equipment to be spotted.
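
As a rough illustration of how sensor data might be used for early warning, the Python sketch below flags the first reading that rises well above a healthy baseline. The three-standard-deviation limit, the function name and the vibration values are all assumptions made for this example; real condition monitoring systems typically use more sophisticated analysis.

```python
from statistics import mean, stdev


def first_alarm(baseline, readings, n_sigma=3.0):
    """Return the index of the first reading above the alarm limit, or None.

    The limit is the baseline mean plus n_sigma sample standard deviations,
    an assumed rule of thumb rather than a recommended setting.
    """
    limit = mean(baseline) + n_sigma * stdev(baseline)
    for index, value in enumerate(readings):
        if value > limit:
            return index
    return None


# Hypothetical vibration velocity readings in mm/s.
baseline_mm_s = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0]  # healthy machine
new_mm_s = [2.1, 2.0, 2.3, 2.9, 3.4]                 # after the fix

alarm_index = first_alarm(baseline_mm_s, new_mm_s)
print(alarm_index)
```

A check like this lets the team confirm that a fix is holding, or catch a recurrence, without waiting for the next breakdown.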


The final – and sometimes ignored – step in the RCA process is to ensure that the organisation learns all it can from the effort. This is an issue for management as much as for the team involved directly in the RCA. The evaluation should consider whether the value of the improvements made is sufficient to justify the cost of implementing them, and whether the same approach should be applied to other components, machines or processes in the business.