Two Major Issues with Managing Safety Lifecycle Data

There are many issues with managing Safety Lifecycle data in a real plant environment. The fundamental problem is that the tools available in the plant are traditional and unsuitable for an effective Safety Lifecycle Management program. This is true for Instrumented Systems, and even more so for other functions such as non-instrumented systems or relief systems.

Plants that manage the Safety Lifecycle generally have to resort to manual data gathering and retention procedures that sit outside what management perceives as its core tools. The resulting systems are very inefficient, which makes it easy for them to develop gaps or fall out of date. This is a very common issue in the process industry.

  • Maintenance Management System (MMS)

Many plants run very traditional Maintenance Management Systems that were implemented by cost-driven projects to transition old home-grown systems to large commercially available ones. Implementation teams usually work to a direction along the lines of “If the old system didn’t do it, the new one shouldn’t either”. However, the old systems did not cover instrumentation or newer needs such as Safety Systems.

The system is typically driven by Work Order management, warehouse stock management, and maintenance management of major equipment. Attempts to add Safety Instrumented Systems (SIS), Safety Instrumented Functions (SIF), field instruments and the like are exercises in futility. The system can’t handle the sheer volume, and it really can’t handle items that have many inter-relationships. Furthermore, management typically does not support the effort required to input additional data into the MMS, even when it is known that the system can handle it.

  • Instrument Database

Commonly used commercial Instrument Database applications define such things as instrument data sheets, loop diagrams, and wiring. They are typically used for large engineering projects, yet they offer only rudimentary maintenance functions and cannot support ongoing events. Attempts to force one to fit operational needs usually don’t work very well.

In effect, maintenance data for instrumentation is only as good as individual records. The MMS can be used for Work Orders and warehouse stock management, but not much else. Work Order feedback, when it exists at all, is usually manually entered text and seldom contains useful information about the instrument work performed.

  • Process Hazard Analysis (PHA) Records

The Process Safety Group is usually responsible for facilitating PHAs for the facility. This includes the initial PHA/HAZOP, 5-year revalidations, projects, and in-house Management of Change (MOC) reviews. The group uses a combination of commercial PHA/HAZOP applications, Excel spreadsheets, and both paper and electronic MOC checklists. All of this is typically kept in the group’s records, which are exceptionally hard to use for other purposes. The PHA/HAZOP applications also usually have draconian license restrictions that allow only the Process Safety Group to have access.

Every PHA/HAZOP and MOC checklist is usually kept in a separate file, so finding a particular one takes major effort. Requests for information can be met with a “Who wants to know?” response, causing substantial delays in actually getting the information, if it is ever received.

Sometimes, master lists of the Independent Protection Layers (IPL) identified in the LOPAs do not correlate to actual plant assets, or do not exist at all. Operations and maintenance personnel then have no real knowledge of what the IPLs are. They also tend to lack knowledge of what hazards led to the requirements for the IPLs in the first place.
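One way to catch this kind of mismatch is to cross-check the IPL list against the asset register. The sketch below is illustrative only: the tags, the dictionary layout, and the `find_orphan_ipls` helper are hypothetical and are not taken from any particular PHA or maintenance application.

```python
# Hypothetical sketch: flag IPLs from a LOPA master list that have no
# matching tag in the plant asset register. All tags and field names
# are invented for illustration.

def find_orphan_ipls(ipl_list, asset_tags):
    """Return IPL entries whose tag is not found in the asset register."""
    # Normalize case and whitespace so "lahh-205" matches "LAHH-205".
    known = {tag.strip().upper() for tag in asset_tags}
    return [ipl for ipl in ipl_list if ipl["tag"].strip().upper() not in known]


ipl_list = [
    {"tag": "PSV-101", "hazard": "Overpressure of V-101"},
    {"tag": "LAHH-205", "hazard": "Overfill of T-205"},
    {"tag": "XV-300", "hazard": "Loss of containment at P-300"},
]
asset_tags = ["PSV-101", "lahh-205", "FT-410"]

orphans = find_orphan_ipls(ipl_list, asset_tags)
for ipl in orphans:
    print(f"{ipl['tag']}: no matching asset ({ipl['hazard']})")
```

Keeping the hazard description next to each tag also gives operations and maintenance personnel the reason the IPL exists, not just its name.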

  • Document Management

Some facilities have a centralized document system that seems to work. Being able to access scanned or source files for just about any drawing or document in the facility can be useful. However, there is sometimes an unspoken rule that the document system should contain “engineering data only”. Documents are then stored only by Unit and document type. That would work if that’s all that is needed, but if not, don’t even think about asking for a list of documents associated with a piece of equipment or a Safety Function.
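The missing capability is essentially a cross-reference from a tag to its documents. As a minimal sketch, assuming each document record carries a list of related tags (the `related` field, the document IDs, and the tags are all hypothetical), a lookup by equipment or Safety Function could be built like this:

```python
# Hypothetical sketch: index documents by the tags they relate to, so a
# lookup by equipment or Safety Function is possible in addition to the
# usual Unit / document-type filing. All names are invented.
from collections import defaultdict

documents = [
    {"doc_id": "DS-0012", "unit": "U100", "type": "data sheet",
     "related": ["LT-205", "SIF-205"]},
    {"doc_id": "TP-0044", "unit": "U100", "type": "test procedure",
     "related": ["SIF-205"]},
    {"doc_id": "PID-0003", "unit": "U200", "type": "P&ID",
     "related": ["P-300"]},
]


def build_index(docs):
    """Map each related tag to the list of document IDs that mention it."""
    index = defaultdict(list)
    for doc in docs:
        for tag in doc["related"]:
            index[tag].append(doc["doc_id"])
    return index


index = build_index(documents)
print(index["SIF-205"])
```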

  • Independent Protection Layer (IPL), Safety Instrumented System (SIS) and Safety Instrumented Function (SIF) Management

It is becoming clearer that traditional plant management tools are not able to manage the Safety Lifecycle for Instrumented and non-Instrumented protective functions. Previously, there were no commercially available data management tools, so the effort was reduced to setting up a series of folders on a facility network drive. In an attempt to capture a “dossier” of protective systems, scanned copies of widely dispersed data such as PHA/HAZOP/LOPA documents, SRSs, test procedures, design documents, and data sheets were stored. Other folders were typically created to store operational items such as scanned copies of completed test procedures and Excel spreadsheets of various events.

This process was used in parallel with existing documentation systems because it was the only way that all the relevant information could be collected and made accessible. In theory, these documents were available in other systems, yet finding them would have been a scavenger hunt if they had not been collected separately. The approach was very labor intensive, as manual effort was required to collect all the relevant documents and then file them electronically. It was something of an underground effort, as site management didn’t really appreciate the value of the data. Furthermore, this system’s longevity depended heavily upon not having a poor quarter of financial performance.

Pro Tip

It can be very difficult to manage the Safety Lifecycle within a plant that only has the traditional commercial Process Safety, Maintenance Management, and Documentation applications that you typically find in any general operation. Some facilities have structured their own system by just filing the relevant documents in a parallel network drive folder, but that isn’t a permanent solution. Proper Safety Lifecycle Management requires a separate, purpose-built application.

Read more about Justifying Investment in a Safety Lifecycle Management Platform

Rick Stanley has over 40 years’ experience in Process Control Systems and Process Safety Systems with 32 years spent at ARCO and BP in execution of major projects, corporate standards and plant operation and maintenance. Since retiring from BP in 2011, Rick formed his company, Tehama Control Systems Consulting Services, and has consulted with Mangan Software Solutions (MSS) on the development and use of MSS’s SLM Safety Lifecycle Management software.

Rick has a BS in Chemical Engineering from the University of California, Santa Barbara and is a registered Professional Control Systems Engineer in California and Colorado. Rick has served as a member and chairman of both the API Subcommittee for Pressure Relieving Systems and the API Subcommittee on Instrumentation and Control Systems.

 

Benefits of Effective Bypass Management

ISA-TR84.01.00 and IEC 61511 Ed. 2, Part 1, contain extensive discussions of the design and operating procedures for Safety Instrumented Function (SIF) bypasses. Clause 16.2 describes operational requirements such as:

  • Performing a hazard analysis prior to initiating a bypass
  • Having operational procedures in place for when a protective function has been bypassed
  • Logging of all bypasses

In addition to the need to manage process hazards while a protective function is bypassed, the time a protective function spends in bypass affects the in-service performance of the SIF. While bypassed, the protective function is unavailable, so every hour of bypass time increases the Probability of Failure on Demand (PFD) of the function.

A fault tree excerpt for one SIF illustrates how bypassing the SIF’s shutdown valves for 20 hours in a year can significantly affect the PFD. Without any bypasses, the Risk Reduction Factor (RRF = 1/PFD) of the SIF is 306. The 20 hours of bypass reduces the in-service RRF to 180, or about a 40% reduction in performance.
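The figures quoted above can be reproduced with simple arithmetic. The sketch below assumes that bypass time is treated as extra unavailability added directly to the SIF’s PFD over an 8,760-hour year; that is an approximation chosen for illustration, not necessarily the exact structure of the original fault tree. The function and variable names are hypothetical.

```python
# Worked check of the RRF figures. Assumption: bypass hours are treated
# as extra unavailability added directly to the design PFD, which is a
# reasonable approximation when both terms are small.

HOURS_PER_YEAR = 8760


def rrf_with_bypass(rrf_no_bypass, bypass_hours):
    """Return the in-service RRF after adding bypass unavailability."""
    pfd_design = 1.0 / rrf_no_bypass             # PFD with no bypasses
    pfd_bypass = bypass_hours / HOURS_PER_YEAR   # fraction of year bypassed
    return 1.0 / (pfd_design + pfd_bypass)


rrf = rrf_with_bypass(306, 20)
reduction = 1 - rrf / 306
print(f"In-service RRF: {rrf:.0f}")   # prints "In-service RRF: 180"
print(f"Reduction: {reduction:.0%}")  # prints "Reduction: 41%"
```

Twenty hours is only about 0.23% of a year, yet it adds roughly 0.0023 to a design PFD of about 0.0033, which is why such a small amount of bypass time has such a large effect.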

The Why:

  1. Compliance – The Standards governing the Safety Lifecycle require that bypasses be tracked and define specific information that should be associated with each bypass. This is crucial to ensuring overall safety.
  2. Process Safety Management – Excessive bypassing of protective functions has a substantial impact upon overall process safety. Performance of protective functions can be significantly reduced with even moderate levels of bypass. An effective bypass log will help identify bad actors – most bypasses occur for a reason, and if a function is bypassed frequently, it’s typically for the same repetitive reason.
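As a rough illustration of how a bypass log can surface bad actors, the sketch below totals bypass hours per SIF and flags any function over a limit. The record fields, the tags, and the 20-hour threshold are assumptions made for the example; they are not requirements quoted from the standards.

```python
# Hypothetical bypass log sketch. Field names and the threshold are
# illustrative assumptions, not requirements quoted from the standards.
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BypassRecord:
    sif_tag: str
    reason: str
    start: datetime
    end: datetime

    @property
    def hours(self) -> float:
        """Duration of this bypass in hours."""
        return (self.end - self.start).total_seconds() / 3600


def bad_actors(log, max_hours_per_sif):
    """Return {sif_tag: total_hours} for SIFs over the bypass-hour limit."""
    totals = defaultdict(float)
    for rec in log:
        totals[rec.sif_tag] += rec.hours
    return {tag: hrs for tag, hrs in totals.items() if hrs > max_hours_per_sif}


log = [
    BypassRecord("SIF-101", "transmitter calibration",
                 datetime(2023, 3, 1, 8), datetime(2023, 3, 1, 12)),
    BypassRecord("SIF-101", "transmitter calibration",
                 datetime(2023, 6, 1, 8), datetime(2023, 6, 2, 2)),
    BypassRecord("SIF-205", "valve stroke test",
                 datetime(2023, 4, 10, 9), datetime(2023, 4, 10, 11)),
]

flagged = bad_actors(log, max_hours_per_sif=20)
print(flagged)  # {'SIF-101': 22.0}
```

Recording the reason with each entry matters as much as the duration, since a repeated reason on the same function usually points to the underlying problem.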

The Benefits:

  1. Improves safety and overall availability through transparent and effective safeguard stewardship – Key Performance Indicators support effective process safety management of safety functions and help ensure the designed integrity is not compromised.
  2. Reduces operational risk through effective evaluation and mitigation of occurrences where safety-critical functions or equipment are bypassed – visibility of risk, tracking of active bypasses, and override risk assessments performed prior to bypassing.

What Went Wrong With The Process Safety System? Lessons From The 737 Max Crash

When there is a major accident somewhere, you have to investigate what errors might have been made that contributed to it. Boeing’s problems with the 737 Max crashes highlight a few fundamental issues with process safety systems that need to be examined.

4 Lessons Learned: 

1.)  An extremely important part of the specification of a Process Safety System is to seriously consider the effects of a spurious trip on the overall safety of a process. Safety systems really should be designed to keep you out of trouble, not to put you into it. If a spurious trip could at any point drive a process to an unsafe condition, there needs to be some careful thinking about how that unsafe condition can be avoided. In the 737 Max case, there are indications that operation of the Maneuvering Characteristics Augmentation System (MCAS) at low altitudes may not have been examined as carefully as it should have been.

2.)  The second issue is the lack of a robust system. From reports so far, it appears that the MCAS operated based on only one sensor, which left the system much more exposed to a spurious trip. The failure of the one and only sensor resulted in behavior that drove two planes into the ground. When designing a Process Safety System that could behave unsafely if a spurious trip occurs, a robust design is really important in order to cope with potential errors. A system that can send an airplane down extremely fast deserves more than one sensor. From the reports, this appears to have finally dawned on Boeing’s engineers after two crashes in 5 months.

3.)  This might not be entirely the fault of the engineers who designed the system: a second sensor was known to be available as an option. This suggests that managers may have pushed the less safe design to save some money.

4.)  The last issue is relying on the people who operate the plane rather than on the safety system itself, which leaves room for human error. The design suggests that a robust system was not considered necessary because the pilots were expected to be able to turn off the system if they needed to. This appears to have been successful in some of the reported incidents from US airlines. However, in the actual crashes, reports indicate that the flight crews either could not or did not turn off the system. There is some speculation that their training wasn’t sufficient, but in any case, people under a lot of stress tend to forget things and make mistakes.

People can’t be expected to respond to unexpected events reliably. It’s worse if they haven’t been well trained, or it’s been a long time since they were trained. Expecting operator response comes with a burden to train well and train often.
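The second lesson above, about relying on a single sensor, can be illustrated with a simple cross-check between two redundant sensors. This is a minimal sketch under stated assumptions: the function name, the threshold, and the readings are hypothetical, and it does not describe how MCAS or any real aircraft system is implemented.

```python
# Illustrative sketch only: permit an automatic action when two
# redundant sensors agree within a tolerance, and inhibit it otherwise.
# Names, units, and the threshold are hypothetical.

def action_permitted(sensor_a, sensor_b, max_disagreement):
    """Return True only when both sensor readings agree within tolerance."""
    return abs(sensor_a - sensor_b) <= max_disagreement


# Both sensors healthy: readings agree, so the action is permitted.
print(action_permitted(5.0, 5.4, max_disagreement=2.0))   # True

# One sensor failed high: readings disagree, so the action is inhibited.
print(action_permitted(5.0, 74.5, max_disagreement=2.0))  # False
```

With one sensor, a failed reading is indistinguishable from a real one. With two, a disagreement at least reveals that something is wrong, so a spurious action can be inhibited rather than carried out.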

Summary:

The 737 Max crashes are a stark reminder that a safety system responsible for people’s lives deserves healthy portions of realism, pessimism, and all-around consideration of potential risk. You really need to spend time thinking before deciding that a design is acceptable, even if there are other pressures from management.


Issues With Managing Process Hazard Analysis (PHA) Data

National and local regulations require that all process operations have a formal Hazards Analysis performed on the original installation as well as for all modifications to the facility. Most regulations also require that the Process Hazard Analysis (PHA) of Record be revalidated at regular intervals, such as the 5-year revalidation cycle required in the US.

PHA is a complex tool used throughout the lifecycle of a facility, and two of the biggest issues with it are coordination and consistency (see Figure 1 below). A PHA of Record represents a point in time, but in reality plants are not static. They are very dynamic, with multiple independent modifications in progress. Some modifications are implemented while the plant is in operation, while a backlog of others is scheduled for the next turnaround. That backlog starts accumulating the day the plant is started up after its last turnaround. Every time a plant is modified, some form of PHA is performed. The scope of these modifications can range from a small in-house modification to large projects that expand, de-bottleneck, or fix the process.

Figure 1: Issues to consider (infographic)

So, in facilities, the Process Safety Management (PSM) teams are faced with the almost impossible task of monitoring and collecting all of the completed Hazard Assessments and incorporating them into the PHA of Record as the modifications are implemented. If this hasn’t been done as time goes along, the PSM team then has an even bigger job of collecting all the incremental changes and identifying how they relate to the PHA of Record before they start the revalidation process. All of this is a lot of work and consumes several full-time equivalents just to keep up. Most places don’t have these resources, so they make do as best they can.
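The bookkeeping described above amounts to tracking which implemented changes have not yet been folded into the PHA of Record. A minimal sketch of that check follows; the record fields and MOC identifiers are hypothetical and do not reflect any specific PSM tool.

```python
# Hypothetical sketch: list hazard assessments from implemented changes
# that have not yet been incorporated into the PHA of Record.
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HazardAssessment:
    moc_id: str
    implemented_on: Optional[date]  # None means not yet implemented
    in_pha_of_record: bool


def pending_incorporation(assessments):
    """Implemented changes still missing from the PHA of Record, oldest first."""
    pending = [a for a in assessments
               if a.implemented_on is not None and not a.in_pha_of_record]
    return sorted(pending, key=lambda a: a.implemented_on)


assessments = [
    HazardAssessment("MOC-001", date(2022, 5, 1), True),
    HazardAssessment("MOC-002", date(2023, 2, 15), False),
    HazardAssessment("MOC-003", None, False),   # waiting for turnaround
    HazardAssessment("MOC-004", date(2022, 11, 3), False),
]

backlog = pending_incorporation(assessments)
print([a.moc_id for a in backlog])  # ['MOC-004', 'MOC-002']
```

Changes still waiting for a turnaround are excluded on purpose, since they only affect the PHA of Record once they are actually implemented.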

 
