Speaker identification over narrowband VoIP networks

Hemant A. Patil, Aaron E. Cohen, Keshab K. Parhi

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citation

Abstract

Automatic Speaker Recognition (ASR) has been an active area of research for the past four decades, with speech collected mostly in research laboratory environments. However, given the growing applications and possible misuse of Voice over Internet Protocol (VoIP) networks, robust ASR systems are needed over VoIP networks, especially in the context of internet security and law enforcement. There has been little systematic study of how VoIP artifacts (such as the speech codec, packet loss, packet reordering, network jitter, and foreign crosstalk or echo) affect the performance of an ASR system. This chapter investigates each of these VoIP issues individually and examines its effect on the performance of the ASR system. A narrowband 2.4 kbps mixed-excitation linear prediction (MELP) codec is used over the VoIP network.

Original language: English (US)
Title of host publication: Forensic Speaker Recognition
Subtitle of host publication: Law Enforcement and Counter-Terrorism
Publisher: Springer New York
Pages: 125-151
Number of pages: 27
ISBN (Electronic): 9781461402633
ISBN (Print): 9781461402626
State: Published - Jan 1 2012

  • Cite this

    Patil, H. A., Cohen, A. E., & Parhi, K. K. (2012). Speaker identification over narrowband VoIP networks. In Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism (pp. 125-151). Springer New York. https://doi.org/10.1007/9781461402633_6