On-Chip Network-Enabled Many-Core Architectures for Computational Biology Applications
Large-scale integration of multiple cores on a single chip is the current answer to the challenge of attaining higher computation throughput while keeping power consumption within acceptable limits. Network-on-Chip (NoC) is an emerging paradigm that can efficiently support the integration of a massive number of cores on a chip by decoupling the on-chip computation and communication infrastructure, thereby overcoming the scalability issues faced by conventional buses.
Many scientific computing disciplines, such as computational biology, have seen a significant increase in the availability of parallel algorithms and high-performance computing (HPC) tools, owing to the high runtime complexity and/or data-intensive nature of the underlying computation. Software-only solutions are likely to be inadequate, creating the need for hardware accelerators. This dissertation explores the design and development of highly optimized NoC-based hardware accelerators for a particular class of biocomputing applications, namely phylogeny reconstruction, which is important for evolutionary inference in computational biology.
This dissertation focuses on two computationally distinct phylogeny reconstruction approaches to demonstrate that NoC-based many-core platforms can deliver orders-of-magnitude reductions in time-to-solution compared with existing approaches. The Maximum Parsimony (MP) phylogeny reconstruction problem can be reduced to solving numerous instances of the classical Traveling Salesman Problem (TSP). Computing these TSP instances, whose solution typically relies on branch-and-bound heuristics, accounts for 99% of the total software runtime. This dissertation presents the design of many-core systems with a core-level pipelined micro-parallel architecture and different interconnection topologies to achieve significant speedup and energy efficiency.
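To make the TSP kernel mentioned above concrete, the following is a minimal software sketch of branch-and-bound search on a single small instance. It is an illustration only, not the dissertation's hardware design or its bounding scheme: the function name `tsp_branch_and_bound`, the cheapest-outgoing-edge lower bound, and the example distance matrix are all assumptions chosen for brevity, and the sketch assumes at least two cities.

```python
import math


def tsp_branch_and_bound(dist):
    """Return (best_cost, best_tour) for a square distance matrix (n >= 2).

    Hypothetical helper for illustration; the bound used here is a simple
    one and is not taken from the dissertation.
    """
    n = len(dist)
    # Cheapest edge leaving each city. Every city still needs one outgoing
    # edge, so summing these values never overestimates the remaining cost.
    min_out = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]
    best = {"cost": math.inf, "tour": None}

    def search(path, visited, cost):
        last = path[-1]
        if len(path) == n:
            total = cost + dist[last][path[0]]  # close the tour
            if total < best["cost"]:
                best["cost"], best["tour"] = total, path[:]
            return
        # Lower bound: cost so far, plus the cheapest exit from the current
        # city and from every city not yet visited.
        bound = cost + min_out[last] + sum(
            min_out[c] for c in range(n) if c not in visited
        )
        if bound >= best["cost"]:
            return  # prune: this branch cannot beat the best known tour
        # Try nearer cities first so a good tour is found early.
        candidates = sorted(
            (c for c in range(n) if c not in visited),
            key=lambda c: dist[last][c],
        )
        for nxt in candidates:
            path.append(nxt)
            visited.add(nxt)
            search(path, visited, cost + dist[last][nxt])
            path.pop()
            visited.remove(nxt)

    search([0], {0}, 0)
    return best["cost"], best["tour"]


# Example instance (assumed values, four cities).
example = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
cost, tour = tsp_branch_and_bound(example)
```

Because each TSP instance is searched independently, many such searches can in principle run concurrently, which is the kind of workload a many-core platform is suited to.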
In Maximum Likelihood (ML) phylogeny reconstruction, the improved quality of results comes at a higher computational cost, as this approach involves optimization over a multi-dimensional, continuous real-valued space. We present NoC-based hardware accelerators that target the function kernels accounting for the bulk of the runtime. These platforms combine novel ideas and approaches, such as space-filling Hilbert curves, parallelized core allocation schemes, and 3-D integration. We also explore the use of long-range on-chip wireless links on existing regular topologies to reduce network diameter, thereby reducing the average communication latency between cores. These platforms have the potential to serve a broader class of throughput-oriented HPC applications.
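The abstract does not say how the space-filling Hilbert curves are used. One common use, assumed here purely for illustration, is to order the nodes of a 2-D mesh so that consecutive indices are always physically adjacent, which helps keep a contiguous block of allocated cores close together. The sketch below uses the standard index-to-coordinate conversion; the names `hilbert_d2xy` and `hilbert_order` are hypothetical, and the mesh side is assumed to be a power of two.

```python
def hilbert_d2xy(side, d):
    """Map Hilbert index d (0 <= d < side*side) to mesh coordinates (x, y).

    `side` is the mesh width and must be a power of two (assumption).
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current sub-square so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(side):
    """Return every mesh coordinate in Hilbert-curve order."""
    return [hilbert_d2xy(side, d) for d in range(side * side)]


# Example: walk an 8x8 mesh; each step moves to a neighbouring node.
order = hilbert_order(8)
```

Under this ordering, assigning a task the index range `[a, b)` yields cores that form a compact region of the mesh rather than a long thin strip, as a row-major ordering would.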