Comparing the performance of stochastic simulation on GPUs and OpenMP

Weijun Xiao, Peng Li, David J. Lilja

Research output: Contribution to journal › Article

Abstract

Since stochastic computing performs operations using streams of bits that represent probability values instead of deterministic values, it can tolerate a large number of failures in a noisy system. However, the simulation of a stochastic implementation is extremely time-consuming. In this paper, we investigate two approaches to speed up the stochastic simulation: a GPU-based simulation and an OpenMP-based simulation. To compare these two approaches, we start with several basic stochastic computing elements (SCEs) and then use the stochastic implementation of a frame difference-based image segmentation algorithm as a case study to conduct extensive experiments. Measured results show that the GPU-based simulation with 448 processing elements can achieve up to a 119x speedup over the single-threaded CPU simulation and a 17x speedup over the OpenMP-based simulation with eight processor cores. In addition, we present several performance optimisations for the GPU-based simulation that significantly benefit the performance of stochastic simulation.
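As an illustration of what a basic stochastic computing element does, the following is a minimal Python sketch, not the authors' simulator. It assumes the common unipolar encoding (each bit is 1 with probability equal to the encoded value) and uses the textbook AND-gate multiplier as the example SCE; the abstract does not say which SCEs the paper evaluates. All function names, the stream length, and the 1% bit-flip rate are hypothetical choices made for this sketch.

```python
import random

def to_stream(p, length, rng):
    """Encode probability p as a bit stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def from_stream(bits):
    """Decode a bit stream back to a probability (the fraction of 1 bits)."""
    return sum(bits) / len(bits)

def stochastic_multiply(a_bits, b_bits):
    """Bitwise AND of two independent streams estimates the product of their values."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def flip_bits(bits, error_rate, rng):
    """Model a noisy system by flipping each bit independently with probability error_rate."""
    return [b ^ 1 if rng.random() < error_rate else b for b in bits]

rng = random.Random(0)      # fixed seed so the sketch is repeatable
n = 100_000                 # stream length (assumed; longer streams give finer precision)

a = to_stream(0.5, n, rng)
b = to_stream(0.4, n, rng)

product_bits = stochastic_multiply(a, b)
product = from_stream(product_bits)                     # estimate of 0.5 * 0.4
noisy = from_stream(flip_bits(product_bits, 0.01, rng)) # same stream with 1% of bits flipped

print(f"clean estimate: {product:.4f}")
print(f"noisy estimate: {noisy:.4f}")
```

Two things are visible even in this toy version. A 1% bit-flip rate moves the decoded value only slightly, which is the fault tolerance the abstract refers to. And every arithmetic result costs one operation per bit of a long stream, which is why simulating a stochastic design is slow on a single thread. Because each bit position is processed independently, that per-bit work is a natural candidate for the data-parallel GPU and OpenMP approaches the paper compares.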

Original language: English (US)
Pages (from-to): 34-46
Number of pages: 13
Journal: International Journal of Computational Science and Engineering
Volume: 8
Issue number: 1
State: Published - Feb 25 2013

Keywords

  • Fault-tolerance
  • GPU computing
  • Image processing
  • Parallel computing
  • Stochastic computing