Detailed Class Agenda
The class will be taught over Zoom in eight virtual sessions, each four hours long.
Session 1: Safety Fundamentals & New Paradigms (November 4)
Basic Concepts from Systems Engineering
The Hidden Cost of Defects Over Time
Intro to System Safety
History of Safety Engineering
Safety: Old View vs. New View (aka Component View vs. Systems View)
Traditional Safety Engineering
- Losses Caused by Failures
- Methods to Analyze Failures
- Limitations of Failure-based Analysis
- Redundancy & Independence
- Unintended Consequences of Redundancy
- Component View of Safety
- System Safety Order of Precedence
- Learning from Real Accidents: Patterns that Defeat Traditional Safety Efforts
The Next Generation of System Safety
- Losses Caused by Interactions, Requirements Errors, and Other Non-failure Causes
- Anticipating Interactions and Errors
- Modeling Control Loops
- Failure Accidents vs. Interaction Accidents
- Learning from Real Accidents: Recognizing New Patterns
Probability in Safety
- Common Uses of Probability
- Common Probability Restrictions
- Learning from Real Accidents: Probability Insights and Oversights
- Probability-based View vs. Control-based View
Automation and Software: Safety Principles and Lessons Learned
- The Primary Cause of Unsafe Software Behavior
- Failure and Error Concepts Applied to Software
- Rules and Limitations for Applying Probability to Software
- Standard Risk Matrix vs. Software Risk Matrix
- Software Reuse: Case Study
- Software Redundancy: Case Study
- Software Diversity: Case Study
- Software Independence and N-Version Programming (NVP)
- The "Single Hardest Part" of Building a Software System
- Learning from Real Accidents: Independent Systems Controlled by Software
Safety and Reliability
- Common Confusions and Critical Differences
- Tradeoffs
Summary
Session 2: Human/Automation Interactions (November 6)
Terminology: Control Actions, Unsafe Control Actions, Decision Making, Process Model, Feedback, Controlled Process
Learning from Real Accidents: Poor Automation Defeating Human Operators
Statistics: Automation's Counter-intuitive Impact on Safety
Counter-intuitive Impact of Tasks on Human Performance
Learning from Real Accidents: Does Automation Simplify or Complicate Human Tasks?
The Automation Dilemma: Human-Centered vs. Engineering-Centered Design
Learning from Real Accidents: Poor Engineering Decisions That Appeared Reasonable
Modeling Both Humans and Automation Together
Safety Conclusions: Component View vs. Systems View
Hindsight Bias
Restrictions on Using "Failure" in a Human Context
Human Factors: Old View vs. Systems View
Case Study: Human Factors Engineering
Human Factors: Natural Mapping
Examples: Designs that Cause (or Prevent) Human Error
Learning from Accidents: Catching Poor Engineering Assumptions about Human Operators
Human Factors: Management by Exception
Limitations of Management by Exception
Human Factors Experts: Recommendations to Designers
Summary
Session 3: Systems Thinking, Accident Models, and STAMP (November 8)
Three Modes of Thought: Strengths and Weaknesses
- Analytic Reduction (aka Decomposition, or Divide & Conquer)
- Statistics and Probability
- Systems Theory (for Complex Systems)
Underlying Assumptions Under Which Each Mode Is Effective
Pillars of Systems Theory
- Hierarchy
- Emergence
- Communication
- Control / Feedback
Hazard Analysis as a Search Method
- Forward Search Methods
- Backward Search Methods
- Bottom-up Search Methods
- Top-down Search Methods
Introduction to Accident Models
Traditional Accident Models in System Safety
- Chain of Events Model
- Failure Propagation Model
- Swiss Cheese Model
- Functional Failure Propagation Model
Brief Overview of Common Traditional Methods
- Failure Modes and Effects Analysis (FMEA)
- Fault Tree Analysis (FTA)
- Functional Hazard Analysis (FHA)
Systems Theory Accident Model (STAMP)
- Principles from Control Theory
- Classifying Causal Factors Using a Control Loop
- The Control Structure
- Common Types of Control Loops in Safety
System-Theoretic Process Analysis (STPA)
- Overview
- Step 1: Define the Purpose of the Analysis
- Step 2: Model the Control Structure
- Step 3: Identify Unsafe Control Actions
- Step 4: Identify Loss Scenarios
Session 4: System-Theoretic Process Analysis (STPA) (November 13)
STPA Step 1 Artifacts: Stakeholders, Losses, Hazards, Constraints
STPA Step 2 Artifacts: Control Structure, Controllers, Controlled Processes, Control Actions, Feedback, Abstraction
STPA Step 3 Artifacts: Unsafe Control Actions, UCA Bounding, UCA Syntax, Generating Constraints and Requirements, Operator Procedures
STPA Step 4 Artifacts: Loss Scenarios, Causes of UCAs, Control Actions Not Executed or Followed Properly, Human Interactions, Solutions
Real-world Examples From Each Step
Session 5: STPA Examples and Exercises (November 15)
Short Interactive STPA Exercises
Short STPA Examples
Session 6: In-Depth STPA Exercises (November 18)
Group Exercise: Apply STPA Given a Short Description of a System
- Review and Discuss Answers for Each STPA Step
STPA Lessons Learned
- Identifying the "System" in the Hazard Specification
- Deciding What to Model as a Controller in the Control Structure
- Determining Level of Authority / Control
- Distinguishing the Physical and Control Layers
- UCA Context: Specifying Actual State vs. Feedback Mechanism
- Overlapping UCAs
- Using UCAs to Identify New Hazards
- Building Scenarios with Process Model Flaws
- Building Scenarios with Decision Making Flaws
- Evaluation: How Do Your STPA Results Compare to Previous Findings?
Session 7: In-Depth STPA Exercises (November 20)
Group Exercise: Apply STPA Given a System Description
- Review and Discuss Answers for Each STPA Step
Compare Your STPA Results to the Original FMEA and FTA Results
Identifying Differences Across the Outputs from Failure-based Methods and STPA
STPA Lessons Learned
- How to Model a Control Structure Starting from Controlled Processes
- How to Identify UCAs Starting with High-level Terms, Then Define the Terms in More Detail
- Strong vs. Weak UCAs
- Generating Concrete Requirements from UCAs
- Building Scenarios with Process Model Flaws
- Using Scenarios to Ask New Questions
Session 8: STPA in Practice: Lessons Learned (November 22)
Identifying Common Mistakes in STPA
Planning and Implementing STPA Projects
- Team Roles
- The STPA Facilitator Role
- Getting Leadership Buy-in to Apply STPA
- STPA Rollout / Roadmap
- STPA Levels of Certification
- Lessons from Successful and Unsuccessful Projects
- STPA Strengths and Weaknesses
- Two-phase Approach to Developing Solutions from STPA Results
- Planning Vector Checks
- Strategies to Accelerate STPA Learning and Rollout
- Time Data from Past STPA Projects
- Breakdown of Activities That Take the Most / Least Time
- Time Reductions as Experience Accumulates
Results and Conclusions from Past STPA Efforts
Summary