TY - GEN
T1 - Improved approach for calculating model parameters in speaker recognition using Gaussian mixture models
AU - Metkar, Prashant
AU - Cohen, Aaron
AU - Parhi, Keshab
PY - 2010
Y1 - 2010
N2 - In speaker identification, most of the computation is due to the distance or likelihood calculation between the feature vectors of the test signal and the speaker models in the database. The time required to identify a speaker is a function of the number of feature vectors, their dimensionality, and the number of speakers in the database. In this paper, we focus on optimizing the performance of a Gaussian mixture model (GMM) based speaker identification system. An improved approach for model parameter calculation is presented. The advantage of the proposed approach lies in a significant reduction in computation time compared with an approach that uses the expectation maximization (EM) algorithm to calculate the model parameter values. The approach is based on forming clusters and assigning weights to them depending upon the number of mixtures used for modeling the speaker. The reduction in computation time depends upon how many mixtures are used for training the speaker model.
AB - In speaker identification, most of the computation is due to the distance or likelihood calculation between the feature vectors of the test signal and the speaker models in the database. The time required to identify a speaker is a function of the number of feature vectors, their dimensionality, and the number of speakers in the database. In this paper, we focus on optimizing the performance of a Gaussian mixture model (GMM) based speaker identification system. An improved approach for model parameter calculation is presented. The advantage of the proposed approach lies in a significant reduction in computation time compared with an approach that uses the expectation maximization (EM) algorithm to calculate the model parameter values. The approach is based on forming clusters and assigning weights to them depending upon the number of mixtures used for modeling the speaker. The reduction in computation time depends upon how many mixtures are used for training the speaker model.
UR - http://www.scopus.com/inward/record.url?scp=79958004260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79958004260&partnerID=8YFLogxK
U2 - 10.1109/ACSSC.2010.5757624
DO - 10.1109/ACSSC.2010.5757624
M3 - Conference contribution
AN - SCOPUS:79958004260
SN - 9781424497218
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 567
EP - 570
BT - Conference Record of the 44th Asilomar Conference on Signals, Systems and Computers, Asilomar 2010
T2 - 44th Asilomar Conference on Signals, Systems and Computers, Asilomar 2010
Y2 - 7 November 2010 through 10 November 2010
ER -