Fault-Tolerance Techniques for Spacecraft Control Computers
Hardcover, English, 2017
2 009 kr
Product information
- Publication date: 2017-04-04
- Dimensions: 145 x 226 x 23 mm
- Weight: 590 g
- Language: English
- Number of pages: 344
- Publisher: John Wiley & Sons Inc
- EAN: 9781119107279
Dr. Yang Mengfei, Professor, Chief Engineer and Chief Commander of China Academy of Space Technology, Beijing, China. Professor Yang Mengfei received his Master's degree in computer application from Beijing Institute of Control Engineering, China Academy of Space Technology, in 1985. He then devoted himself to research on fault-tolerant computing, control of computer technology for space applications, and highly dependable software. In 2005, he received his Ph.D. degree from Tsinghua University. Professor Yang has received numerous awards for his outstanding work and contributions to this sector.
Dr. Hua Gengxin, Professor, Chief Engineer, Beijing Institute of Control Engineering, Beijing, China.
Dr. Feng Yanjun, Senior Engineer, Director, China Academy of Space Technology, Beijing, China.
Dr. Gong Jian, Senior Engineer, Engineer in Charge, Beijing Institute of Control Engineering, Beijing, China.
- Brief Introduction
- Preface
- 1 Introduction
  - 1.1 Fundamental Concepts and Principles of Fault-tolerance Techniques
    - 1.1.1 Fundamental Concepts
    - 1.1.2 Reliability Principles
      - 1.1.2.1 Reliability Metrics
      - 1.1.2.2 Reliability Model
  - 1.2 The Space Environment and Its Hazards for the Spacecraft Control Computer
    - 1.2.1 Introduction to Space Environment
      - 1.2.1.1 Solar Radiation
      - 1.2.1.2 Galactic Cosmic Rays (GCRs)
      - 1.2.1.3 Van Allen Radiation Belt
      - 1.2.1.4 Secondary Radiation
      - 1.2.1.5 Space Surface Charging and Internal Charging
      - 1.2.1.6 Summary of Radiation Environment
      - 1.2.1.7 Other Space Environments
    - 1.2.2 Analysis of Damage Caused by the Space Environment
      - 1.2.2.1 Total Ionization Dose (TID)
      - 1.2.2.2 Single Event Effect (SEE)
      - 1.2.2.3 Internal/surface Charging Damage Effect
      - 1.2.2.4 Displacement Damage Effect
      - 1.2.2.5 Other Damage Effect
  - 1.3 Development Status and Prospects of Fault Tolerance Techniques
  - References
- 2 Fault-Tolerance Architectures and Key Techniques
  - 2.1 Fault-tolerance Architecture
    - 2.1.1 Module-level Redundancy Structures
    - 2.1.2 Backup Fault-tolerance Structures
      - 2.1.2.1 Cold-backup Fault-tolerance Structures
      - 2.1.2.2 Hot-backup Fault-tolerance Structures
    - 2.1.3 Triple-modular Redundancy (TMR) Fault-tolerance Structures
    - 2.1.4 Other Fault-tolerance Structures
  - 2.2 Synchronization Techniques
    - 2.2.1 Clock Synchronization System
      - 2.2.1.1 Basic Concepts and Fault Modes of the Clock Synchronization System
      - 2.2.1.2 Clock Synchronization Algorithm
    - 2.2.2 System Synchronization Method
      - 2.2.2.1 The Real-time Multi-computer System Synchronization Method
      - 2.2.2.2 System Synchronization Method with Interruption
  - 2.3 Fault-tolerance Design with Hardware Redundancy
    - 2.3.1 Universal Logic Model and Flow in Redundancy Design
    - 2.3.2 Scheme Argumentation of Redundancy
      - 2.3.2.1 Determination of Redundancy Scheme
      - 2.3.2.2 Rules Obeyed in the Scheme Argumentation of Redundancy
    - 2.3.3 Redundancy Design and Implementation
      - 2.3.3.1 Basic Requirements
      - 2.3.3.2 FDMU Design
      - 2.3.3.3 CSSU Design
      - 2.3.3.4 IPU Design
      - 2.3.3.5 Power Supply Isolation Protection
      - 2.3.3.6 Testability Design
      - 2.3.3.7 Others
    - 2.3.4 Validation of Redundancy by Analysis
      - 2.3.4.1 Hardware FMEA
      - 2.3.4.2 Redundancy Switching Analysis (RSA)
      - 2.3.4.3 Analysis of the Common Cause of Failure
      - 2.3.4.4 Reliability Analysis and Checking of the Redundancy Power
      - 2.3.4.5 Analysis of the Sneak Circuit in the Redundancy Management Circuit
    - 2.3.5 Validation of Redundancy by Testing
      - 2.3.5.1 Testing by Failure Injection
      - 2.3.5.2 Specific Test for the Power of the Redundancy Circuit
      - 2.3.5.3 Other Things to Note
  - References
- 3 Fault Detection Techniques
  - 3.1 Fault Model
    - 3.1.1 Fault Model Classified by Time
    - 3.1.2 Fault Model Classified by Space
  - 3.2 Fault Detection Techniques
    - 3.2.1 Introduction
    - 3.2.2 Fault Detection Methods for CPUs
      - 3.2.2.1 Fault Detection Methods Used for CPUs
      - 3.2.2.2 Example of CPU Fault Detection
    - 3.2.3 Fault Detection Methods for Memory
      - 3.2.3.1 Fault Detection Method for ROM
      - 3.2.3.2 Fault Detection Methods for RAM
    - 3.2.4 Fault Detection Methods for I/Os
  - References
- 4 Bus Techniques
  - 4.1 Introduction to Space-borne Bus
    - 4.1.1 Fundamental Concepts
    - 4.1.2 Fundamental Terminologies
  - 4.2 The MIL-STD-1553B Bus
    - 4.2.1 Fault Model of the Bus System
      - 4.2.1.1 Bus-level Faults
      - 4.2.1.2 Terminal Level Faults
    - 4.2.2 Redundancy Fault-tolerance Mechanism of the Bus System
      - 4.2.2.1 The Bus-level Fault-tolerance Mechanism
      - 4.2.2.2 The Bus Controller Fault-tolerance Mechanism
      - 4.2.2.3 Fault-tolerance Mechanism of Remote Terminals
  - 4.3 The CAN Bus
    - 4.3.1 The Bus Protocol
    - 4.3.2 Physical Layer Protocol and Fault-tolerance
      - 4.3.2.1 Node Structure
      - 4.3.2.2 Bus Voltage
      - 4.3.2.3 Transceiver and Controller
      - 4.3.2.4 Physical Fault-tolerant Features
    - 4.3.3 Data Link Layer Protocol and Fault-tolerance
      - 4.3.3.1 Communication Process
      - 4.3.3.2 Message Sending
      - 4.3.3.3 The President Mechanism of Bus Access
      - 4.3.3.4 Coding
      - 4.3.3.5 Data Frame
      - 4.3.3.6 Error Detection
  - 4.4 The SpaceWire Bus
    - 4.4.1 Physical Layer Protocol and Fault-tolerance
      - 4.4.1.1 Connector
      - 4.4.1.2 Cable
      - 4.4.1.3 Low Voltage Differential Signal
      - 4.4.1.4 Data Filter (DS) Coding
    - 4.4.2 Data Link Layer Protocol and Fault-tolerance
      - 4.4.2.1 Packet Character
      - 4.4.2.2 Packet Parity Check Strategy
      - 4.4.2.3 Packet Structure
      - 4.4.2.4 Communication Link Control
    - 4.4.3 Networking and Routing
      - 4.4.3.1 Major Technique used by the SpaceWire Network
      - 4.4.3.2 SpaceWire Router
    - 4.4.4 Fault-tolerance Mechanism
  - 4.5 Other Buses
    - 4.5.1 The IEEE 1394 Bus
    - 4.5.2 Ethernet
    - 4.5.3 The I2C Bus
  - References
- 5 Software Fault-Tolerance Techniques
  - 5.1 Software Fault-tolerance Concepts and Principles
    - 5.1.1 Software Faults
    - 5.1.2 Software Fault-tolerance
    - 5.1.3 Software Fault Detection and Voting
    - 5.1.4 Software Fault Isolation
    - 5.1.5 Software Fault Recovery
    - 5.1.6 Classification of Software Fault-tolerance Techniques
  - 5.2 Single-version Software Fault-tolerance Techniques
    - 5.2.1 Checkpoint and Restart
    - 5.2.2 Software-implemented Hardware Fault-tolerance
      - 5.2.2.1 Control Flow Checking by Software Signatures (CFCSS)
      - 5.2.2.2 Error Detection by Duplicated Instructions (EDDI)
    - 5.2.3 Software Crash Trap
  - 5.3 Multiple-version Software Fault-tolerance Techniques
    - 5.3.1 Recovery Blocks (RcB)
    - 5.3.2 N-version Programming (NVP)
    - 5.3.3 Distributed Recovery Blocks (DRB)
    - 5.3.4 N Self-checking Programming (NSCP)
    - 5.3.5 Consensus Recovery Block (CRB)
    - 5.3.6 Acceptance Voting (AV)
    - 5.3.7 Advantage and Disadvantage of Multiple-version Software
  - 5.4 Data Diversity Based Software Fault-tolerance Techniques
    - 5.4.1 Data Re-expression Algorithm (DRA)
    - 5.4.2 Retry Blocks (RtB)
    - 5.4.3 N-copy Programming (NCP)
    - 5.4.4 Two-pass Adjudicators (TPA)
  - References
- 6 Fault-Tolerance Techniques for FPGA
  - 6.1 Effect of the Space Environment on FPGAs
    - 6.1.1 Single Event Transient Effect (SET)
    - 6.1.2 Single Event Upset (SEU)
    - 6.1.3 Single Event Latch-up (SEL)
    - 6.1.4 Single Event Burnout (SEB)
    - 6.1.5 Single Event Gate Rupture (SEGR)
    - 6.1.6 Single Event Functional Interrupt (SEFI)
  - 6.2 Fault Modes of SRAM-based FPGAs
    - 6.2.1 Structure of a SRAM-based FPGA
    - 6.2.2 Faults Classification and Fault Modes Analysis of SRAM-based FPGAs
      - 6.2.2.1 Faults Classification
      - 6.2.2.2 Fault Modes Analysis
  - 6.3 Fault-tolerance Techniques for SRAM-based FPGAs
    - 6.3.1 SRAM-based FPGA Mitigation Techniques
      - 6.3.1.1 The Triple Modular Redundancy (TMR) Design Technique
      - 6.3.1.2 The Inside RAM Protection Technique
      - 6.3.1.3 The Inside Register Protection Technique
      - 6.3.1.4 EDAC Encoding and Decoding Technique
      - 6.3.1.5 Fault Detection Technique Based on DMR and Fault Isolation Technique Based on Tristate Gate
    - 6.3.2 SRAM-based FPGA Reconfiguration Techniques
      - 6.3.2.1 Single Fault Detection and Recovery Technique Based on ICAP+FrameECC
      - 6.3.2.2 Multi-fault Detection and Recovery Technique Based on ICAP Configuration Read-back+RS Coding
      - 6.3.2.3 Dynamic Reconfiguration Technique Based on EAPR
      - 6.3.2.4 Fault Recovery Technique Based on Hardware Checkpoint
      - 6.3.2.5 Summary of Reconfiguration Fault-tolerance Techniques
  - 6.4 Typical Fault-tolerance Design of SRAM-based FPGA
  - 6.5 Fault-tolerance Techniques of Anti-fuse Based FPGA
  - References
- 7 Fault-Injection Techniques
  - 7.1 Basic Concepts
    - 7.1.1 Experimenter
    - 7.1.2 Establishing the Fault Model
    - 7.1.3 Conducting Fault-injection
    - 7.1.4 Target System for Fault-injection
    - 7.1.5 Observing the System’s Behavior
    - 7.1.6 Analyzing Experimental Findings
  - 7.2 Classification of Fault-injection Techniques
    - 7.2.1 Simulated Fault-injection
      - 7.2.1.1 Transistor Switch Level Simulated Fault-injection
      - 7.2.1.2 Logic Level Simulated Fault-injection
      - 7.2.1.3 Functional Level Simulated Fault-injection
    - 7.2.2 Hardware Fault-injection
    - 7.2.3 Software Fault-injection
      - 7.2.3.1 Injection During Compiling
      - 7.2.3.2 Injection During Operation
    - 7.2.4 Physical Fault-injection
    - 7.2.5 Mixed Fault-injection
  - 7.3 Fault-injection System Evaluation and Application
    - 7.3.1 Injection Controllability
    - 7.3.2 Injection Observability
    - 7.3.3 Injection Validity
    - 7.3.4 Fault-injection Application
      - 7.3.4.1 Verifying the Fault Detection Mechanism
      - 7.3.4.2 Fault Effect Domain Analysis
      - 7.3.4.3 Fault Restoration
      - 7.3.4.4 Coverage Estimation
      - 7.3.4.5 Delay Time
      - 7.3.4.6 Generating Fault Dictionary
      - 7.3.4.7 Software Testing
  - 7.4 Fault-injection Platform and Tools
    - 7.4.1 Fault-injection Platform in Electronic Design Automation (EDA) Environment
    - 7.4.2 Computer Bus-based Fault-injection Platform
    - 7.4.3 Serial Accelerator Based Fault-injection Case
    - 7.4.4 Future Development of Fault-injection Technology
  - References
- 8 Intelligent Fault-Tolerance Techniques
  - 8.1 Evolvable Hardware Fault-tolerance
    - 8.1.1 Fundamental Concepts and Principles
    - 8.1.2 Evolutionary Algorithm
      - 8.1.2.1 Encoding Methods
      - 8.1.2.2 Fitness Function Designing
      - 8.1.2.3 Genetic Operators
      - 8.1.2.4 Convergence of Genetic Algorithm
    - 8.1.3 Programmable Devices
      - 8.1.3.1 ROM
      - 8.1.3.2 PAL and GAL
      - 8.1.3.3 FPGA
      - 8.1.3.4 VRC
    - 8.1.4 Evolvable Hardware Fault-tolerance Implementation Methods
      - 8.1.4.1 Modeling and Organization of Hardware Evolutionary Systems
      - 8.1.4.2 Reconfiguration and Its Classification
      - 8.1.4.3 Evolutionary Fault-tolerance Architectures and Methods
      - 8.1.4.4 Evolutionary Fault-tolerance Methods at Various Layers of the Hardware
      - 8.1.4.5 Method Example
  - 8.2 Artificial Immune Hardware Fault-tolerance
    - 8.2.1 Fundamental Concepts and Principles
      - 8.2.1.1 Biological Immune System and Its Mechanism
      - 8.2.1.2 Adaptive Immunity
      - 8.2.1.3 Artificial Immune Systems
      - 8.2.1.4 Fault-tolerance Principle of Immune Systems
    - 8.2.2 Fault-tolerance Methods with Artificial Immune System
      - 8.2.2.1 Artificial Immune Fault-tolerance System Architecture
      - 8.2.2.2 Immune Object
      - 8.2.2.3 Immune Control System
      - 8.2.2.4 Working Process of Artificial Immune Fault-tolerance System
    - 8.2.3 Implementation of Artificial Immune Fault-tolerance
      - 8.2.3.1 Hardware
      - 8.2.3.2 Software
  - References
- Acronyms
- Index