Multi-Processor System-on-Chip 1

Architectures

Hardcover, English, 2021

By Liliana Andrade, Frédéric Rousseau

A Multi-Processor System-on-Chip (MPSoC) is the key component for complex applications. These applications put huge pressure on memory, communication devices and computing units. This book, presented in two volumes (Architectures and Applications), celebrates the 20th anniversary of MPSoC, an interdisciplinary forum that focuses on multi-core and multi-processor hardware and software systems. This interdisciplinarity has enabled MPSoC to bring together experts in these fields from around the world over the last two decades.

Multi-Processor System-on-Chip 1 covers the key components of MPSoC: processors, memory, interconnect and interfaces. It describes advanced features of these components and the technologies used to build efficient MPSoC architectures. All the main components are detailed: memory use and memory technology, communication support and consistency, and specific processor architectures for general purposes or for dedicated applications.

Product information

Publication date: 2021-05-07
Dimensions: 10 x 10 x 10 mm
Weight: 454 g
Format: Hardcover
Language: English
Number of pages: 320
Publisher: ISTE Ltd
ISBN: 9781789450217

Belongs to the following categories

Electronics and communications in Science and technology

Liliana Andrade is Associate Professor at TIMA Lab, Université Grenoble Alpes in France. She received her PhD in Computer Science, Telecommunications and Electronics from Université Pierre et Marie Curie in 2016. Her research interests include system-level modeling and validation of systems-on-chip, and the acceleration of heterogeneous systems simulation.

Frédéric Rousseau is Full Professor at TIMA Lab, Université Grenoble Alpes in France. His research interests concern Multi-Processor Systems-on-Chip design and architecture, prototyping of hardware/software systems including reconfigurable systems, and high-level synthesis for embedded systems.

Foreword
  Ahmed JERRAYA
Acknowledgments
  Liliana ANDRADE and Frédéric ROUSSEAU

Part 1. Processors

Chapter 1. Processors for the Internet of Things
  Pieter VAN DER WOLF and Yankin TANURHAN
  1.1. Introduction
  1.2. Versatile processors for low-power IoT edge devices
    1.2.1. Control processing, DSP and machine learning
    1.2.2. Configurability and extensibility
  1.3. Machine learning inference
    1.3.1. Requirements for low/mid-end machine learning inference
    1.3.2. Processor capabilities for low-power machine learning inference
    1.3.3. A software library for machine learning inference
    1.3.4. Example machine learning applications and benchmarks
  1.4. Conclusion
  1.5. References

Chapter 2. A Qualitative Approach to Many-core Architecture
  Benoît DUPONT DE DINECHIN
  2.1. Introduction
  2.2. Motivations and context
    2.2.1. Many-core processors
    2.2.2. Machine learning inference
    2.2.3. Application requirements
  2.3. The MPPA3 many-core processor
    2.3.1. Global architecture
    2.3.2. Compute cluster
    2.3.3. VLIW core
    2.3.4. Coprocessor
  2.4. The MPPA3 software environments
    2.4.1. High-performance computing
    2.4.2. KaNN code generator
    2.4.3. High-integrity computing
  2.5. Conclusion
  2.6. References

Chapter 3. The Plural Many-core Architecture – High Performance at Low Power
  Ran GINOSAR
  3.1. Introduction
  3.2. Related works
  3.3. Plural many-core architecture
  3.4. Plural programming model
  3.5. Plural hardware scheduler/synchronizer
  3.6. Plural networks-on-chip
    3.6.1. Scheduler NoC
    3.6.2. Shared memory NoC
  3.7. Hardware and software accelerators for the Plural architecture
  3.8. Plural system software
  3.9. Plural software development tools
  3.10. Matrix multiplication algorithm on the Plural architecture
  3.11. Conclusion
  3.12. References

Chapter 4. ASIP-Based Multi-Processor Systems for an Efficient Implementation of CNNs
  Andreas BYTYN, René AHLSDORF and Gerd ASCHEID
  4.1. Introduction
  4.2. Related works
  4.3. ASIP architecture
  4.4. Single-core scaling
  4.5. MPSoC overview
  4.6. NoC parameter exploration
  4.7. Summary and conclusion
  4.8. References

Part 2. Memory

Chapter 5. Tackling the MPSoC Data Locality Challenge
  Sven RHEINDT, Akshay SRIVATSA, Oliver LENKE, Lars NOLTE, Thomas WILD and Andreas HERKERSDORF
  5.1. Motivation
  5.2. MPSoC target platform
  5.3. Related work
  5.4. Coherence-on-demand: region-based cache coherence
    5.4.1. RBCC versus global coherence
    5.4.2. OS extensions for coherence-on-demand
    5.4.3. Coherency region manager
    5.4.4. Experimental evaluations
    5.4.5. RBCC and data placement
  5.5. Near-memory acceleration
    5.5.1. Near-memory synchronization accelerator
    5.5.2. Near-memory queue management accelerator
    5.5.3. Near-memory graph copy accelerator
    5.5.4. Near-cache accelerator
  5.6. The big picture
  5.7. Conclusion
  5.8. Acknowledgments
  5.9. References

Chapter 6. mMPU: Building a Memristor-based General-purpose In-memory Computation Architecture
  Adi ELIAHU, Rotem BEN HUR, Ameer HAJ ALI and Shahar KVATINSKY
  6.1. Introduction
  6.2. MAGIC NOR gate
  6.3. In-memory algorithms for latency reduction
  6.4. Synthesis and in-memory mapping methods
    6.4.1. SIMPLE
    6.4.2. SIMPLER
  6.5. Designing the memory controller
  6.6. Conclusion
  6.7. References

Chapter 7. Removing Load/Store Helpers in Dynamic Binary Translation
  Antoine FARAVELON, Olivier GRUBER and Frédéric PÉTROT
  7.1. Introduction
  7.2. Emulating memory accesses
  7.3. Design of our solution
  7.4. Implementation
    7.4.1. Kernel module
    7.4.2. Dynamic binary translation
    7.4.3. Optimizing our slow path
  7.5. Evaluation
    7.5.1. QEMU emulation performance analysis
    7.5.2. Our performance overview
    7.5.3. Optimized slow path
  7.6. Related works
  7.7. Conclusion
  7.8. References

Chapter 8. Study and Comparison of Hardware Methods for Distributing Memory Bank Accesses in Many-core Architectures
  Arthur VIANES and Frédéric ROUSSEAU
  8.1. Introduction
    8.1.1. Context
    8.1.2. MPSoC architecture
    8.1.3. Interconnect
  8.2. Basics on banked memory
    8.2.1. Banked memory
    8.2.2. Memory bank conflict and granularity
    8.2.3. Efficient use of memory banks: interleaving
  8.3. Overview of software approaches
    8.3.1. Padding
    8.3.2. Static scheduling of memory accesses
    8.3.3. The need for hardware approaches
  8.4. Hardware approaches
    8.4.1. Prime modulus indexing
    8.4.2. Interleaving schemes using hash functions
  8.5. Modeling and experimenting
    8.5.1. Simulator implementation
    8.5.2. Implementation of the Kalray MPPA cluster interconnect
    8.5.3. Objectives and method
    8.5.4. Results and discussion
  8.6. Conclusion
  8.7. References

Part 3. Interconnect and Interfaces

Chapter 9. Network-on-Chip (NoC): The Technology that Enabled Multi-processor Systems-on-Chip (MPSoCs)
  K. Charles JANAC
  9.1. History: transition from buses and crossbars to NoCs
    9.1.1. NoC architecture
    9.1.2. Extending the bus comparison to crossbars
    9.1.3. Bus, crossbar and NoC comparison summary and conclusion
  9.2. NoC configurability
    9.2.1. Human-guided design flow
    9.2.2. Physical placement awareness and NoC architecture design
  9.3. System-level services
    9.3.1. Quality-of-service (QoS) and arbitration
    9.3.2. Hardware debug and performance analysis
    9.3.3. Functional safety and security
  9.4. Hardware cache coherence
    9.4.1. NoC protocols, semantics and messaging
  9.5. Future NoC technology developments
    9.5.1. Topology synthesis and floorplan awareness
    9.5.2. Advanced resilience and functional safety for autonomous vehicles
    9.5.3. Alternatives to von Neumann architectures for SoCs
    9.5.4. Chiplets and multi-die NoC connectivity
    9.5.5. Runtime software automation
    9.5.6. Instrumentation, diagnostics and analytics for performance, safety and security
  9.6. Summary and conclusion
  9.7. References

Chapter 10. Minimum Energy Computing via Supply and Threshold Voltage Scaling
  Jun SHIOMI and Tohru ISHIHARA
  10.1. Introduction
  10.2. Standard-cell-based memory for minimum energy computing
    10.2.1. Overview of low-voltage on-chip memories
    10.2.2. Design strategy for area- and energy-efficient SCMs
    10.2.3. Hybrid memory design towards energy- and area-efficient memory systems
    10.2.4. Body biasing as an alternative to power gating
  10.3. Minimum energy point tracking
    10.3.1. Basic theory
    10.3.2. Algorithms and implementation
    10.3.3. OS-based approach to minimum energy point tracking
  10.4. Conclusion
  10.5. Acknowledgments
  10.6. References

Chapter 11. Maintaining Communication Consistency During Task Migrations in Heterogeneous Reconfigurable Devices
  Arief WICAKSANA, Olivier MULLER, Frédéric ROUSSEAU and Arif SASONGKO
  11.1. Introduction
    11.1.1. Reconfigurable architectures
    11.1.2. Contribution
  11.2. Background
    11.2.1. Definitions
    11.2.2. Problem scenario and technical challenges
  11.3. Related works
    11.3.1. Hardware context switch
    11.3.2. Communication management
  11.4. Proposed communication methodology in hardware context switching
  11.5. Implementation of the communication management on reconfigurable computing architectures
    11.5.1. Reconfigurable channels in FIFO
    11.5.2. Communication infrastructure
  11.6. Experimental results
    11.6.1. Setup
    11.6.2. Experiment scenario
    11.6.3. Resource overhead
    11.6.4. Impact on the total execution time
    11.6.5. Impact on the context extract and restore time
    11.6.6. System responsiveness to context switch requests
    11.6.7. Hardware task migration between heterogeneous FPGAs
  11.7. Conclusion
  11.8. References

List of Authors
Authors Biographies
Index