System Failures

System failures are not always obvious, often because in hindsight we blame ourselves for adverse events. Just because a process performed as it was built does not mean it functions well. The system is the only constant in our work, and system failures are everywhere.

Defining "The System"

We work in an increasingly complex environment. Human behavior is only one small variable; everything else is considered “the system.” This includes processes, the physical environment, staffing models, technology, equipment, training, the organization’s values… everything.

To make meaningful change, we must seek out and address system failures.

Who is responsible?

Leaders are responsible for ensuring identified system failures are addressed, and for not holding staff inappropriately responsible in the meantime.

Everyone gets involved in solving the problem. That is the “robust process improvement” part of the High Reliability triad. 

How do we fix system failures?

The best tool to use depends on the issue you want to tackle. 

Smaller-scale problems can often be handled with simpler solutions, such as those outlined in the general guidance below.

If you have a complex problem, you may need more robust PI tools. The PI department can help you!

General Guidance: How to Address System Failures

Take stock of the current process.

  • Why was it created?
  • Who uses it now? What do they need to do?
  • Who would be affected by a change?
  • Is there any data available to measure current state and change?

Choose a change that best fits the situation. Some solutions are stronger than others. Partner with frontline staff and other departments that would be affected by the change. 

Ask yourself, “How are humans going to make mistakes?” and “Where will they take shortcuts?” Use those answers to inform the design.

Then, implement the change.

We cannot simply implement and walk away; we need to actively monitor and ask for feedback. 

Remember, not hearing any complaints doesn’t mean everything is going well. You may have to adjust your solution, and that’s OK!


Strong

  • Architectural/physical plant changes
  • New devices with usability testing before purchasing
  • Engineering controls, interlocks, forcing functions
  • Simplify the process and remove unnecessary steps
  • Standardize equipment, processes, or care maps
  • Tangible involvement and action by leadership in support of safety

Intermediate

  • Redundancy/back-up systems
  • Increase in staffing/decrease in workload
  • Software enhancements/modifications
  • Eliminate/reduce distractions
  • Checklist/cognitive aid
  • Eliminate look- and sound-alikes
  • Enhanced documentation/communication

Weak*

*If choosing a weak solution, be sure to pair it with something rated Intermediate or Strong (especially if there is a high risk of harm).

  • Double checks
  • Warnings and labels
  • New procedure/memorandum/policy
  • Training
  • Additional study/analysis
Hierarchy of Controls

The Hierarchy of Controls is relevant wherever physical safety is a concern, for example in the workplace or anywhere there are physical hazards.

Share a system solution!

Stories are powerful. 

Please consider sharing a problem you addressed so that others can learn from your experience. 
