Sparse fast Fourier transform on GPUs and multi-core CPUs

Jiaxi Hu, Zhaosen Wang, Qiyuan Qiu, Weijun Xiao, David J Lilja

Research output: Contribution to journalConference articlepeer-review

9 Scopus citations


Given an N-point sequence, finding its k largest components in the frequency domain is a problem of great interest. This problem, which is usually referred to as a sparse Fourier Transform, was recently brought back on stage by a newly proposed algorithm called the sFFT. In this paper, we present a parallel implementation of sFFT on both multi-core CPUs and GPUs using a human voice signal as a case study. Using this example, an estimate of k for the 3dB cutoff points was conducted through concrete experiments. In addition, three optimization strategies are presented in this paper. We demonstrate that the multi-core-based sFFT achieves speedups of up to three times a single-threaded sFFT while a GPU-based version achieves up to ten times speedup. For large scale cases, the GPU-based sFFT also shows its considerable advantages, which is about 40 times speedup compared to the latest out-of-card FFT implementations [2].

Original languageEnglish (US)
Article number6374775
Pages (from-to)83-91
Number of pages9
JournalProceedings - Symposium on Computer Architecture and High Performance Computing
StatePublished - Dec 1 2012
Event24th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2012 - New York, NY, United States
Duration: Oct 24 2012Oct 26 2012


  • GPUs
  • Multi-core CPUs
  • Sparse FFT
  • performance speedup


Dive into the research topics of 'Sparse fast Fourier transform on GPUs and multi-core CPUs'. Together they form a unique fingerprint.

Cite this