CoopMC: Algorithm-Architecture Co-Optimization for Markov Chain Monte Carlo Accelerators
Published: 17 May 2022
Bayesian machine learning suits applications that must make high-risk decisions from limited, noisy, or unlabeled data, as it offers strong data efficiency and uncertainty estimation. Building on previous efforts, this work presents CoopMC, an algorithm-architecture co-optimization for building more efficient MCMC-based Bayesian inference accelerators. CoopMC uses dynamic normalization (DyNorm), LUT-based exponential kernels (TableExp), and log-domain kernel fusion (LogFusion) to reduce computational precision, shrinking ALU area by 7.5× with no noticeable loss in model performance. In addition, a tree-based Gibbs sampler (TreeSampler) improves hardware runtime from O(N) to O(log N), an 8.7× speedup, and delivers 1.9× better area efficiency than the prior state-of-the-art Gibbs sampling architecture. These methods were tested on 10 diverse workloads spanning 3 types of Bayesian models, demonstrating applicability across many Bayesian algorithms. In an end-to-end case study, the combined optimizations achieve a 33% logic-area reduction, a 62% power reduction, and a 1.53× speedup over previous state-of-the-art end-to-end MCMC accelerators.
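The abstract does not spell out TableExp's datapath; as a rough illustration of the general LUT-based-exponential idea, the Python sketch below replaces exp() with a small precomputed table over a clamped input range. The table size, input range, and function name are assumptions for illustration, not details from the paper.

```python
import math

# Illustrative sketch of a LUT-based exponential. TableExp itself is a
# hardware kernel; the table size and input range here are assumptions.
TABLE_BITS = 8                      # 256-entry table
X_MIN, X_MAX = -16.0, 0.0           # MCMC acceptance ratios take exp of negatives
STEP = (X_MAX - X_MIN) / (1 << TABLE_BITS)
EXP_LUT = [math.exp(X_MIN + i * STEP) for i in range(1 << TABLE_BITS)]

def table_exp(x):
    """Approximate exp(x) by indexing a precomputed table.

    Inputs below X_MIN saturate to ~0 and inputs above X_MAX clamp to 1,
    which is safe when the result only feeds an acceptance probability.
    """
    if x >= X_MAX:
        return 1.0
    if x < X_MIN:
        return 0.0
    return EXP_LUT[int((x - X_MIN) / STEP)]

print(table_exp(-1.0), math.exp(-1.0))  # compare LUT output vs exact
```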
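Likewise, the O(N)-to-O(log N) claim for TreeSampler corresponds to a standard trick: keep the unnormalized conditional weights in a binary tree of partial sums, then draw a category by walking from the root to a leaf. A minimal software sketch under that assumption follows; the names are illustrative, and the paper describes a hardware sampler rather than this code.

```python
import random

def build_sum_tree(weights):
    """Build a binary tree of partial sums over unnormalized weights.

    Leaves hold the weights; each internal node stores the sum of its two
    children, so the root holds the total mass. This avoids an explicit
    O(N) normalization pass before every draw.
    """
    n = 1
    while n < len(weights):
        n *= 2
    tree = [0.0] * (2 * n)
    tree[n:n + len(weights)] = weights
    for i in range(n - 1, 0, -1):
        tree[i] = tree[2 * i] + tree[2 * i + 1]
    return tree, n

def sample_categorical_tree(tree, n):
    """Draw one index in O(log N): descend from the root, going left or
    right according to where a uniform draw over the total mass falls."""
    u = random.random() * tree[1]   # tree[1] is the root (total weight)
    i = 1
    while i < n:                    # walk down until a leaf is reached
        left = tree[2 * i]
        if u < left:
            i = 2 * i
        else:
            u -= left
            i = 2 * i + 1
    return i - n                    # convert leaf position to category index

# Example: one Gibbs-style draw from unnormalized conditional weights.
weights = [0.1, 2.0, 0.5, 1.4]
tree, n = build_sum_tree(weights)
print(sample_categorical_tree(tree, n))
```

In a Gibbs step over a discrete variable, the weights would be the unnormalized conditional probabilities; updating one leaf and its O(log N) ancestors keeps the tree consistent between draws, which is what makes the per-sample cost logarithmic rather than linear.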





Harvard Innovation Labs
125 Western Ave
Boston, MA 02163