We present TAPAS, a high-level synthesis (HLS) toolchain for generating parallel hardware accelerators from programs with dynamic parallelism. TAPAS is built on top of Tapir, which embeds fork-join parallelism into the compiler's intermediate representation (IR). TAPAS uses the compiler IR to identify parallelism patterns and to synthesize the hardware logic that manages them. It provides first-class architectural support for spawning, coordinating, and synchronizing tasks during accelerator execution. We demonstrate that dynamic tasks enable TAPAS to generate flexible architectures for concurrent programs with heterogeneous and nested parallelism, without being restricted to a specific pattern. Our evaluation on an Intel FPGA SoC demonstrates that TAPAS can generate accelerators that scale effectively. The accelerators can tap into the available concurrency and deliver well over 20× the performance per watt of a typical desktop microprocessor. We find that TAPAS enables lightweight tasks that can be spawned in a single clock cycle, which allows accelerators to exploit fine-grained parallelism.
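As a rough illustration of the nested fork-join pattern the abstract refers to, the sketch below shows a recursive reduction in which every level spawns a child task, continues with its own work, and then synchronizes. It is a software analogy only, not TAPAS input or output: the function name `parallel_sum`, the `grain` parameter, and the use of Python threads are assumptions made for this example.

```python
import threading


def parallel_sum(values, lo, hi, grain=4):
    """Sum values[lo:hi] using nested fork-join tasks (illustrative only)."""
    # Base case: the range is small enough to sum serially.
    if hi - lo <= grain:
        return sum(values[lo:hi])

    mid = (lo + hi) // 2
    result = {}

    def child():
        # The spawned task may itself spawn further tasks (nested parallelism).
        result["left"] = parallel_sum(values, lo, mid, grain)

    task = threading.Thread(target=child)
    task.start()                                   # "spawn": fork a child task
    right = parallel_sum(values, mid, hi, grain)   # parent continues with its half
    task.join()                                    # "sync": wait for the child
    return result["left"] + right


if __name__ == "__main__":
    data = list(range(100))
    print(parallel_sum(data, 0, len(data)))
```

The number of tasks here depends on the input size and is only known at run time, which is the sense in which the parallelism is dynamic. In software each spawn carries thread-creation overhead, so a coarse `grain` is usually needed; the abstract's claim of single-cycle spawning is what would make a much finer grain practical in hardware.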
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Shriraman, Arrvindh