Author : Hongyang Jia
Book Synopsis: Designing Computing Systems Based on Unconventional Technologies for Hardware Acceleration, by Hongyang Jia
Designing Computing Systems Based on Unconventional Technologies for Hardware Acceleration, written by Hongyang Jia, was released in 2021. Book excerpt:

Hardware specialization is being widely adopted to address the energy and throughput limitations in a range of applications. However, two critical challenges are: (1) degraded programmability; and (2) bottlenecks posed by memory accessing and data movement. This thesis investigates unconventional technologies for computation, enabling unconventional accelerator architectures and associated programmability and physical-design tradeoffs, to overcome these challenges. The work employs co-design at the circuit, architectural, and software levels, applied to custom integrated-circuit (IC) prototypes to validate the cross-layer implications.

First, the challenge of programmability is explored through opportunities enabled by approximate computing. Accelerator programmability is enhanced by adopting the code-synthesis framework of genetic programming (GP), which approximates computations from high-level specifications (input-output pairs) using highly structured models of computation; these, in turn, enable accelerator specialization for energy efficiency. A programmable heterogeneous platform for sensor inference is demonstrated, including: a 130nm CMOS IC, integrating a CPU, fixed-function classification accelerator, and programmable feature-extraction accelerator; and a compiler flow for code generation and approximation-aware model training.

Next, the challenge of memory accessing and data movement is explored through mixed-signal in-memory computing (IMC), which amortizes accessing of raw bits into accessing of a computational result over all bits in a memory column. This fundamentally increases signal dynamic range, introducing an energy/throughput-vs.-SNR tradeoff.
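As a rough illustration of the amortization described above, the sketch below models one memory column in plain Python: instead of reading each stored bit individually, a single access returns the sum of bitwise products between the stored bits and the input bits. The function names and the ideal, noise-free model are assumptions made for illustration only; the excerpt does not describe the actual circuit.

```python
def column_compute(stored_bits, input_bits):
    """One IMC-style column access (idealized): accumulate the bitwise AND
    of stored bits and input bits over every row of the column."""
    if len(stored_bits) != len(input_bits):
        raise ValueError("column and input must have the same number of rows")
    return sum(w & x for w, x in zip(stored_bits, input_bits))


def output_levels(num_rows):
    """Distinct values a single column result can take: 0 through num_rows."""
    return num_rows + 1


def bits_to_resolve(num_rows):
    """Bits a readout would need to distinguish every output level.
    Equals ceil(log2(num_rows + 1)), computed exactly with integers."""
    return num_rows.bit_length()


# Hypothetical 4-row column: one access yields one accumulated result.
result = column_compute([1, 0, 1, 1], [1, 1, 0, 1])
print(result, output_levels(4), bits_to_resolve(4))
```

The point of the sketch is the growth in `output_levels`: a column with more rows packs more levels into one readout, which is why the dynamic range rises and signal-to-noise ratio becomes the quantity traded against energy and throughput.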
A recent approach to high-SNR IMC is exploited to form robust abstractions of the computations, required for architectural integration and software-level interfacing. A programmable heterogeneous processor is demonstrated, including: a 65nm CMOS IC, integrating a CPU, near-memory-computing digital accelerator, and bit-scalable IMC accelerator; and an associated programming model and software libraries for neural-network training and mapping.

Finally, an architecture and application-mapping algorithms are explored to enable scalability of IMC platforms, especially addressing memory-system energy and latency required for virtualization of IMC hardware. An arrayed dataflow architecture is designed with integrated microarchitectural support for efficient and scalable scheduling and execution of computations for diverse neural-network models. A reconfigurable IMC platform is demonstrated, including: a 16nm CMOS IC, integrating a 4x4 array of IMC modules and scalable network-on-chip; and application-mapping algorithms and a toolchain, optimizing energy-efficiency and throughput at the IMC hardware design point.
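The excerpt mentions a bit-scalable IMC accelerator without detailing how multi-bit operands are handled. One common way to obtain bit scalability, sketched here purely as an assumption and not as the thesis's method, is to split operands into binary bit planes, compute a one-bit column sum for each pair of planes, and recombine the partial sums with binary weighting. All names below are hypothetical, and only unsigned integers that fit the stated bit widths are handled.

```python
def to_bit_planes(values, num_bits):
    """Split unsigned integers into bit planes, least significant first."""
    return [[(v >> b) & 1 for v in values] for b in range(num_bits)]


def column_sum(weight_plane, input_plane):
    """Idealized one-bit column operation: sum of bitwise ANDs over rows."""
    return sum(w & x for w, x in zip(weight_plane, input_plane))


def bit_scalable_dot(weights, inputs, weight_bits, input_bits):
    """Dot product built only from one-bit column sums.
    Plane i of the weights and plane j of the inputs contribute
    with binary weight 2**(i + j)."""
    w_planes = to_bit_planes(weights, weight_bits)
    x_planes = to_bit_planes(inputs, input_bits)
    total = 0
    for i, w_plane in enumerate(w_planes):
        for j, x_plane in enumerate(x_planes):
            total += column_sum(w_plane, x_plane) << (i + j)
    return total


# Hypothetical 2-bit example: matches the ordinary dot product.
weights, inputs = [3, 1, 2], [2, 3, 1]
print(bit_scalable_dot(weights, inputs, 2, 2))
print(sum(w * x for w, x in zip(weights, inputs)))
```

In this formulation precision becomes a parameter: raising `weight_bits` or `input_bits` adds more one-bit column operations rather than changing the column itself, so the cost grows with the product of the two bit widths.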