Author : Ho-Cheung Ng
ISBN 13 : 9781361035634
Book Synopsis A Soft Processor Overlay with Tightly-Coupled FPGA Accelerator by : Ho-Cheung Ng
A Soft Processor Overlay with Tightly-Coupled FPGA Accelerator, written by Ho-Cheung Ng, was released on 2017-01-26.

Book excerpt: This dissertation, "A Soft Processor Overlay With Tightly-coupled FPGA Accelerator" by Ho-cheung, Ng, 吳浩彰, was obtained from The University of Hong Kong (Pokfulam, Hong Kong) and is being sold pursuant to Creative Commons: Attribution 3.0 Hong Kong License. The content of this dissertation has not been altered in any way. We have altered the formatting in order to facilitate the ease of printing and reading of the dissertation. All rights not granted by the above license are retained by the author.

Abstract: FPGA overlays have shown the potential to improve designers' productivity by balancing flexibility and ease of configuration of the underlying fabric while maintaining the considerable overall performance promised by FPGAs. To truly facilitate full application acceleration, it is often necessary to also include a highly efficient processor that integrates and collaborates with the accelerators while maintaining the benefits of being implemented within the same overlay framework. This thesis presents an open-source soft processor that is tightly coupled with an FPGA accelerator as part of an overlay framework. RISC-V is chosen as the instruction set for its openness and simplicity, and the soft processor is designed as a 4-stage pipeline to balance resource consumption and performance when implemented on FPGAs. The processor is generically implemented so as to promote design portability and compatibility across different FPGA platforms. Experiments show that integrated software-hardware applications using the proposed tightly-coupled architecture achieve performance comparable to hardware-only accelerators, while the proposed architecture provides additional run-time flexibility.
The processor can be synthesized to both low-end and high-performance FPGA families from different vendors, reaching its highest frequency of 268.67 MHz on a Virtex-7 device. Synthesis results for the soft processor also show improved FPGA resource consumption and efficiency compared to existing RISC-V designs. In addition, this thesis presents an FPGA-centric approach that allows gateware to directly access the virtual memory space as part of the executing process without involving the CPU. It allows efficient access to memory in heterogeneous systems and complements the traditional software-centric approach by providing a simplified memory access model that improves designers' productivity and the portability of high-level compilation tools. In this approach, a caching address translation buffer was implemented alongside the user FPGA gateware to provide runtime mapping between virtual and physical memory addresses. It coordinates with the OS running on the CPU to update address translations and to maintain memory consistency. The system was implemented on a commercial off-the-shelf FPGA add-on card to demonstrate the viability of such an approach in low-cost systems. An experiment with a 2D stencil computing application implemented with this FPGA-centric approach shows a reasonable performance improvement over a typical software-centric implementation, while the number of context switches between FPGA and CPU in both kernel and user mode was significantly reduced, freeing the CPU for other concurrent user tasks.
Subjects: Field programmable gate arrays