Condition Management - CM & Lube Newsletter Articles Oct 08


Condition Management

The performance of any organisation depends on its people, equipment and processes aligning to achieve the desired outcome. In the utilities, manufacturing and mining industries there is a heavy reliance on equipment reliability to achieve business outcomes. Condition Management is a term that has started to be used more broadly in Australia to describe a specific process for achieving equipment reliability improvements. Key proponents of this approach are Wayne Bissett from OneSteel and Rod Bennett from Silcar/ Bluescope.

Condition Management focuses on ‘Management’ of equipment condition as opposed to just ‘Monitoring’ and reporting of its condition. That is, to be ‘in control of’ rather than just an ‘observer of’ equipment condition. Condition Management requires closing the loop on the known causes of failures, poor reliability and poor equipment performance. The approach is based on the fact that people in our maintenance community already know the main failure causes for most equipment. Sharing and using this information broadly to proactively eliminate failure causes has been proven to give dramatic equipment and business performance improvement.

More traditional Reliability Engineering methodologies tend to take a Pareto 80:20 approach to reliability problems. That is, they look for the few reliability hot spots that will make big business improvements when solved. The problem with this approach is that often the hot spots are just ever-moving waves on top of a deep lake of failure causes. Once any problem is solved, there are always 20 more to replace it, without much overall observable improvement. Condition Management’s focus is to try to drain the lake rather than focus on the current waves of reliability issues. Reliability engineers should make a balanced judgement on how much effort should be focused on each of these approaches.

The highest profile example of the effectiveness of the Condition Management approach is the reliability improvement of rotating equipment. Over the last 20 years the Condition Monitoring community has developed a detailed understanding of rotating equipment failure causes and root cause solutions. The root cause solutions are mostly based in the areas of improved equipment specifications, assembly quality control, installation practices and commissioning practices. Condition Monitoring is integrally tied to Condition Management as it provides the final quality control both at commissioning and through the life of the equipment. Some examples of these rotating equipment root cause solutions are seal specification, component tolerances, rotor balancing, bearing assembly, baseplate setup, shaft alignment and reduced lube contamination.

Wayne Bissett’s comment is that “Condition monitoring tells you more about your tradesperson and operator skills than it does about your equipment condition”, as so many condition symptoms and reliability issues can be traced back to the causes discussed in the paragraph above. This is because there are specifications and standards available for many condition parameters that define what “Good Condition” is. If a machine’s condition is “Out of Specification” it is usually not an indicator that it is about to fail, but it is an indicator of an increased likelihood of future failure. These “Out of Specification” issues can often be easily diagnosed to specific root causes. For example, high vibration on a rotating machine might be diagnosed as a likely coupling misalignment issue. Fixing this alignment issue would likely mean that the bearings on the machine might last 3 to 30 times longer, depending on the severity of the misalignment.

In most situations finding equipment condition issues and linking them to likely causes is not hard, as there are lots of good condition monitoring and inspection approaches with condition standards available. The challenge is in doing something to eliminate these problems broadly across your plant. It is about getting the tools & training in place and having the willingness to manage the improvement away from how it has always been. It means moving from an attitude such as “This machine has always vibrated at around 12 mm/sec, it’s normal” to the realisation that it’s not that hard to make 2 mm/sec vibration normal, with the massive improvement in reliability that will result. The experience is that when you put the right monitoring and front line skills & practices in place, most of the background level of failures in your plant vanishes. Once there is management commitment to tackle these standard causes, condition monitoring can give the focus on which machines are to be realigned, which gearboxes will have their oil filtered, which fans are to be balanced etc., and reliability improvements can often be achieved quickly.
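As a minimal sketch of how condition monitoring data might give that focus, the Python below ranks machines by how far their overall vibration sits above a site target. The 2 mm/sec target is simply the figure used in the example above, not a published standard, and the machine names, readings and function name are hypothetical.

```python
# Assumed site target for overall vibration velocity (mm/sec).
# Taken from the 2 mm/sec example in the text; each site would set its own.
TARGET_MM_PER_SEC = 2.0


def vibration_worklist(readings, target=TARGET_MM_PER_SEC):
    """Return (machine, reading) pairs above the target, worst first."""
    over = [(name, value) for name, value in readings.items() if value > target]
    return sorted(over, key=lambda item: item[1], reverse=True)


# Illustrative readings only (mm/sec); not real plant data.
readings = {"Fan 1": 12.0, "Pump 3": 1.8, "Gearbox 7": 4.5}
worklist = vibration_worklist(readings)
for name, value in worklist:
    print(f"{name}: {value} mm/sec")
```

A list like this only says where to look first. Diagnosing whether the cause is misalignment, imbalance or something else still needs the usual analysis.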


Standards & Practices and Setting the Right Culture and Attitude

Let’s take the example of machine alignment and lube cleanliness, which are two root cause solutions for rotating equipment. It’s not good enough to say “these three fans are critical and they have to be aligned well” or “these 5 gearboxes are critical and we will only put clean oil in”. Whatever minimum standard you accept for the least critical item on your plant, this lower standard will naturally creep back into the most critical equipment over time. Unless you set clear and unambiguous standards such as “All drives will be Laser Aligned” and “All gearboxes will be filled with oil through a 3 micron filter”, you will always have confusion about what is required. You should be set up to achieve these improved standards and practices for the critical equipment, so why not do it for all your equipment? This philosophy of implementing improvements and standards broadly results in reliability improvement for all equipment involved. There will always be exceptions to any standard, but once a broad standard is in place, variations tend to be much easier to manage.

If you go visit a maintenance office you will typically find a very busy group of people. It is typically not the most critical equipment they are working on but problems with all the other equipment. As many of these reliability problems have common causes, once some of them have been solved, people have more time to drive further improvements and lock in existing ones.

The experience is that most tradespersons and operators are generally happy, and often enthusiastic, about being given good Standards and Practices information when it is presented to them in an appropriate way, they are given the means to comply and they are specifically asked to comply. People like to do a good job. A good approach for implementation is to develop an on-site front line champion for each improvement technology who can train people and then assess & reassess people’s competency as required. A key issue is getting the support and cooperation of middle and upper management. As an example, a tradesperson may be given training on, say, bearing installation, which requires special tools to do the job properly. Middle management are typically under constant pressure from upper management for cost reductions and often are not given the information, or are sceptical, about how investment in items like special tools would pay off. This is usually very disheartening for the front line people involved and improvement initiatives can die very quickly. It should be one of the roles of a reliability engineer to communicate with management to ensure the right priority is given to items that enable compliance with standards.

One best practice for communicating root cause issues is the 2 page Skills and Practices Flyers championed by Rod Bennett from Silcar/ Bluescope. These documents can be read in 5 minutes and are ideal for communicating and reinforcing standards and practices improvement information. They are short enough to use in on-site toolbox meetings before starting a job where the learning involved is relevant. They are also printed out with workorder information packs on relevant jobs. Real learning occurs when people actually use the information they have been given on the job and develop and perfect the specific skill. A Skills and Practices Flyer gives clear and precise information for use on the job where it is needed (see Maintenance Skills and Practices Flyers).

In many businesses cost reductions are imperative for long term survival. Finding the balance between being ‘thrifty’ as opposed to being ‘over thrifty’ is vital when deciding how you spend your maintenance and operations dollars. It is necessary to look for innovative cost reductions but not at the cost of installing defects into equipment and processes that will cost the business more dollars in the long run. 'Doing a job right the first time' often means you don’t have to do the job again soon and often costs no more than doing it poorly. Doing the ‘job right’ also means knowing what the ‘right job’ is, and that often requires detailed equipment and/or process knowledge. Often the causes of problems are down at the detailed parts level of equipment: how the parts fit together and the forces that are applied. One of the current risks with our aging workforce is that key knowledge will be lost. Often the cause of problems is that people, for any number of reasons, don’t read the procedures, read the manuals, look at the drawings or ask questions when they don’t understand. Organisations that want to be successful need to develop a learning and improvement culture in their people and encourage ownership of, affinity with and sympathy for the equipment and processes they are involved with.

Maintenance Strategies

Another area of opportunity is with Preventive Maintenance (PM) activities. Often plants will have many PM’s that are ineffective and, even worse, invasive, where in attempting to find problems they are actually causing them. In most organisations when problems occur there is a management drive to prevent them reoccurring in the future. The standard response is often to ‘create a PM’, without any reliability analysis, best practice search or much thought about how effective it will really be. These PM’s tend to be off-line downday inspections which generate lots of downday workload and often get in the way of actual repair work. Many organisations have realised that a lot of their current PM’s and inspections are either ineffective or are being done too often. The questions they ask are “What PM’s are we currently doing?”, “Which PM’s are generating benefits and which ones should we stop or change the frequency of?”, “What equipment and failures should we be focusing on?”, “For these items what is best practice for PM’s and root cause solutions?” and “What can we practically and economically do with our current people and service providers?”. The preferred approach is to get front line people, including operators, involved in this analysis and provide them with best practice information and technical support. Operators should have a strong involvement with any analysis as they often have a large influence on failure causes and they are also in a good position to do many monitoring activities. Sometimes after attempting to fix the root causes of reliability problems (it should be a serious attempt) it becomes obvious that the equipment should be replaced with something more suitable. This should be brought to management attention, as sometimes the best practice root cause solution is to change to equipment with proven reliability.

When reviewing PM’s, priority should be given to doing more condition monitoring and in-service checks, as opposed to downday PM’s. For many of the downday inspections and checks there are alternative condition monitoring and in-service checks that are effective. Ideally downdays should be for fixing problems and should only occur when you have something to fix. Not stopping for a downday requires building enough confidence in your CM data to keep operating your plant until you find a problem. Once there is a known problem, you schedule a downday that gives enough planning time to fix the problem properly. Often the current approach is to have regular standard downdays (e.g. weekly) with lots of off-line inspections, and when problems are found there is usually not enough time to fix them. The ideal is to replace all necessary regular off-line inspections with in-service equivalents.

There are lots of techniques to make in-service inspection of issues such as wear easier. An example is wear measurement of steel wheels, which are often monitored off-line by micrometer measurement of diameters. A simple wear indicator mark (like tread indicators on car tyres) is all that is required to make on-line wear checks simple and accurate. Another example is material chute wear liner inspections, which are often done by sticking your head into the chute to visually check for wear, but there is usually no reference by which to judge the wear. Adding some simple tell-tale holes drilled in the back of the liner makes the job quick, simple and practical. These types of checks can be done in service with the availability of low cost video inspection tools or, even simpler, a mirror on a stick. It takes a while for maintenance people to get their heads around the idea of not stopping for downdays unless they have found problems that need to be fixed. Downdays easily become a habit.

With a greater reliance on condition monitoring and measurements, there need to be processes in place to ensure the information collected is appropriate, accurate and consistent and that it will be used in a timely and appropriate manner. Poor management of these processes may cause the data to end up being useless or ineffective for the original purpose intended or, worse, never actually being used. When it comes to measuring parameters the only solution is checking and rechecking to ensure each measurement is correct and consistent with previous measurements, as even the best people can make mistakes. To get the level of attention to detail and a real understanding of what the measurements are trying to achieve, you need to have a high level of ownership by the people doing the work. Getting front line people involved in improvements is one of the best ways to develop ownership.
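The "consistent with previous measurements" check can be partly automated. The sketch below is one possible approach, not a method from the roundtable: it flags a new reading for a recheck when it differs from the median of earlier readings by more than a chosen fraction. The function name and the 50% tolerance are assumptions for illustration.

```python
from statistics import median


def needs_recheck(previous, new_value, tolerance=0.5):
    """Flag a reading that differs from the median of earlier readings
    by more than `tolerance`, expressed as a fraction of that median."""
    if not previous:
        return True  # nothing to compare against, so verify the first reading
    baseline = median(previous)
    if baseline == 0:
        return new_value != 0
    return abs(new_value - baseline) / abs(baseline) > tolerance


# Illustrative vibration history for one measurement point (mm/sec).
history = [2.0, 2.2, 2.1]
print(needs_recheck(history, 2.3))  # small change from the history
print(needs_recheck(history, 6.0))  # large jump, worth measuring again
```

A flagged reading is not necessarily wrong. It may reflect a real change in condition, which is why the response is to measure again rather than to discard the value.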

As with root cause solutions, when setting up condition monitoring, inspections and PM’s you should be setting minimum standards for the use of appropriate monitoring & PM techniques and implementing them broadly. Often one of the largest opportunities is with lubrication practices. Best practice for lubrication and lube contamination reduction is well known, with many organisations having received massive benefits from its broad implementation. Many condition monitoring techniques such as Vibration Analysis (VA) and Oil Analysis (OA) should also be implemented broadly on the most relevant equipment types. Again, it does not cost that much more to monitor all relevant equipment in a plant than to monitor just the critical equipment. Less critical equipment is usually monitored less often. Another underused opportunity to get useful data on equipment problems and condition is the information stored in PLC’s, Process Computers and SCADA systems.

Another key cause of equipment failure problems is with spares. It is important to ensure that in any overhaul of your equipment the required standards, practices and QA procedures will be used to produce reliability similar to new equipment. This requires regular visits to overhaul service providers by experienced people to audit their compliance, and also a system to feed back any reliability problems that occur. There also need to be standards and auditing for the transport and storage of spares, as spares are often stored for long periods. Where no standards are in place it is common for the items to be found defective before use or, worse, to fail soon after commissioning.

Management of Condition Management

The key issues for management of Condition Management involve:

  • Reporting Systems and KPI’s
  • Technical Support Resources
  • Implementation Resources
  • Auditing

The key issue with the management of broadly applied maintenance strategies and root cause solutions is to understand what is under some level of control and what is not being managed. One approach to this is using an Integrated Condition Reporting System. Examples of these systems are from Nick Lee of Loy Yang B (Media:Nick_Lee_-_CM_Reporting_and_Alarms.pdf), from Azeez Ahamat of Snowy Hydro (to be presented at this year’s CM & Lube Forum 2008) and from Paul Gallagher of OneSteel (to be presented at this year’s CM & Lube Forum 2008). Management may observe condition monitoring Techs, NDT Techs, trades inspectors, Lube Techs and operators all monitoring the condition of items and can easily get the impression that everything is being managed. When the actual details of what is being done are analysed, often only 20% of equipment items are being looked at. Hopefully these are the most critical 20%, but it indicates that there are 80% of items that are not being managed, and this is often where the bulk of maintenance costs are being generated.

Take the example of the OneSteel Integrated Condition Reporting System approach. This system is a web based database that is separate from their SAP CMMS. The system records the full plant index data for the plant. A local CM champion or technician records the condition classification of each plant item that has been monitored, inspected or had an approved PM performed. Information from a range of inspection and monitoring technologies can be input into the system for each equipment item. A typical condition classification would be 'OK', 'Out of Specification' (e.g. dirty, hot or running rough), 'Alarm' and 'Severe'. All items that have been given a maintenance strategy of Operate to Failure are identified. Brief comments about faults and workorder notification numbers are also recorded. Any item of plant that has no information recorded against it or has not received an update of its condition in a reasonable time is classified as ‘Condition Unknown’. As plant index information is hierarchical, the OneSteel system can display a pie graph at any level of the plant to show what items are being managed and what items are not. This system is capable of creating KPI's for all levels of the organisation. It can also monitor when the condition of an area of plant is being run down, which is typically very hard to scrutinise. The focus of Condition Management is to reduce the percentage of ‘Condition Unknown’ for all equipment items that have not been selected for 'Operate to Failure'.
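The classification and roll-up logic described above can be sketched in a few lines of Python. This is an illustration only and not the OneSteel implementation: the record layout, the function names and the 90 day limit standing in for "a reasonable time" are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta

# Assumed limit for "a reasonable time" since the last condition update.
MAX_AGE = timedelta(days=90)


def classify(item, today):
    """Return the reporting class for one plant item."""
    if item.get("strategy") == "Operate to Failure":
        return "Operate to Failure"
    updated = item.get("updated")
    if item.get("condition") is None or updated is None or today - updated > MAX_AGE:
        return "Condition Unknown"
    return item["condition"]


def summarise(items, area_prefix, today):
    """Count classes for all items under one level of the plant hierarchy."""
    return Counter(
        classify(item, today)
        for item in items
        if item["path"][:len(area_prefix)] == area_prefix
    )


def percent_unknown(summary):
    """Percentage of 'Condition Unknown', excluding Operate to Failure items."""
    in_scope = sum(n for cls, n in summary.items() if cls != "Operate to Failure")
    if in_scope == 0:
        return 0.0
    return 100.0 * summary["Condition Unknown"] / in_scope


# Hypothetical plant index entries; "path" is the position in the hierarchy.
items = [
    {"path": ("Mill", "Line 1", "Fan 1"), "condition": "OK", "updated": date(2008, 9, 1)},
    {"path": ("Mill", "Line 1", "Pump 2"), "condition": "Alarm", "updated": date(2008, 3, 1)},
    {"path": ("Mill", "Line 2", "Gearbox 3"), "condition": None, "updated": None},
    {"path": ("Mill", "Line 2", "Conveyor 4"), "strategy": "Operate to Failure"},
]
today = date(2008, 10, 1)
summary = summarise(items, ("Mill",), today)
print(dict(summary), round(percent_unknown(summary), 1))
```

Because the summary is computed from a hierarchy prefix, the same call works for a whole site, an area or a single line, which is the property that lets a pie graph be drawn at any level.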

There is often difficulty in maintaining good inspection, CM and Defect Elimination practices over time, once they are initially put in place. It is easy for the initial hard work to be destroyed, especially with changes in front line, planning or management personnel. Constant vigilance and good handover practices at all levels are required. Another issue is the unwillingness of some people to share their knowledge, as people often believe that knowledge gives them power and security. A good option to manage these issues is for each front line specialist to have a higher level technical expert available, who might be internal for a large corporate or external for a smaller organisation. This technical expert should have the role of communicating and sharing best practices to encourage learning and improvements, and can also perform occasional audits to ensure the detailed systems are still being complied with. Once systems and processes are in place the technical expert should only be required for phone and email support, with only an occasional visit to site. A key advantage is that if there is a personnel change on-site at whatever level and there has not been adequate handover, the technical expert can train or communicate the required information to the new personnel.

A final opportunity from Condition Management is the use of overall condition KPI’s. There are a number of condition monitoring parameters that have a strong correlation to equipment reliability. Examples are RMS Velocity Vibration on rotating equipment >600rpm, % moisture in lube oil and ISO contamination levels in lube. KPI best practice is for the average levels across an entire plant area for each of these types of parameters to be calculated and trended. As related Condition Management improvements occur, these average parameter levels will decrease and there is always an equivalent decrease in the maintenance costs.
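As a rough sketch of such a KPI, the Python below averages one condition parameter per plant area and reporting period so that the result can be trended. The area names, periods, values and function name are hypothetical, and a plain arithmetic mean is assumed.

```python
from collections import defaultdict
from statistics import mean


def area_kpi(readings):
    """Average a condition parameter per (area, period).

    `readings` is an iterable of (area, period, value) tuples, for example
    RMS velocity vibration in mm/sec for each monitored machine."""
    grouped = defaultdict(list)
    for area, period, value in readings:
        grouped[(area, period)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


# Illustrative readings only (mm/sec RMS velocity).
readings = [
    ("Area A", "2008-Q2", 6.0),
    ("Area A", "2008-Q2", 4.0),
    ("Area A", "2008-Q3", 3.0),
    ("Area A", "2008-Q3", 2.0),
]
kpi = area_kpi(readings)
for (area, period), value in sorted(kpi.items()):
    print(area, period, value)
```

The same grouping could be applied to % moisture or ISO contamination codes, provided each parameter is averaged separately.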

Article by Peter Todd - Most of the information for this article came from attendee comments from the NSW Industrial Maintenance Roundtable Common Interest Workgroup (CIWG) Meeting on Condition Management, Defect Elimination and Lubrication Cleanliness on August 19 2008
