Research Article

Improving Speaker Identification Performance by Combining Vocal Tract Features

by S. Selva Nidhyananthan, R. Shantha Selva Kumari, G. Jaffino
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 3 - Number 1
Year of Publication: 2012
Authors: S. Selva Nidhyananthan, R. Shantha Selva Kumari, G. Jaffino
http:/ijais12-450433

S. Selva Nidhyananthan, R. Shantha Selva Kumari, G. Jaffino. Improving Speaker Identification Performance by Combining Vocal Tract Features. International Journal of Applied Information Systems. 3, 1 (July 2012), 27-33. DOI=http:/ijais12-450433

@article{ http:/ijais12-450433,
author = { S. Selva Nidhyananthan and R. Shantha Selva Kumari and G. Jaffino },
title = { Improving Speaker Identification Performance by Combining Vocal Tract Features },
journal = { International Journal of Applied Information Systems },
issue_date = { July 2012 },
volume = { 3 },
number = { 1 },
month = { July },
year = { 2012 },
issn = { 2249-0868 },
pages = { 27-33 },
numpages = {9},
url = { https://www.ijais.org/archives/volume3/number1/197-0433/ },
doi = { http:/ijais12-450433 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T10:45:17.835907+05:30
%A S. Selva Nidhyananthan
%A R. Shantha Selva Kumari
%A G. Jaffino
%T Improving Speaker Identification Performance by Combining Vocal Tract Features
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 3
%N 1
%P 27-33
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes fusion and addition techniques for combining vocal tract features, namely Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC), in speaker identification. Feature extraction plays an important role as the front-end processing block of the Speaker Identification (SI) process. Mel frequency features capture the spectral characteristics of speech, such as the formant frequencies and their bandwidths, and this feature estimation method leads to robust recognition performance. The Dynamic Mel frequency features capture the dynamic behavior of the human vocal tract using pitch frequency. The work focuses on increasing identification accuracy on databases containing short speech signals. Experimental evaluation is carried out on the TIMIT database with 630 speakers using a Gaussian Mixture Model (GMM).
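
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact DMFCC definition, so the dynamic features here are the common regression-based delta coefficients (an assumption), fusion is taken to mean frame-wise concatenation (an assumption), and the function names `delta`, `fuse`, `gmm_avg_loglik`, and `identify` are hypothetical. The static MFCC frames and the trained GMM parameters are assumed to be available already.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based delta coefficients for a (frames, coeffs) array.

    Assumption: stands in for the paper's DMFCC, whose exact
    definition is not given in the abstract.
    """
    n_frames = features.shape[0]
    # Repeat the first and last frames so every frame has 'width' neighbours.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k: width + k + n_frames]    # frames t + k
        behind = padded[width - k: width - k + n_frames]   # frames t - k
        out += k * (ahead - behind)
    return out / denom


def fuse(static, dynamic):
    """Fuse two feature streams by frame-wise concatenation (assumed)."""
    return np.hstack([static, dynamic])


def gmm_avg_loglik(x, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    x: (T, D); weights: (M,); means, variances: (M, D).
    """
    diff = x[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )
    log_comp = log_comp + np.log(weights)                         # (T, M)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return frame_ll.mean()


def identify(x, models):
    """Return the speaker whose GMM gives the highest average log-likelihood.

    models: dict mapping speaker name -> (weights, means, variances).
    """
    scores = {name: gmm_avg_loglik(x, *params) for name, params in models.items()}
    return max(scores, key=scores.get)
```

Given a `(frames, coeffs)` MFCC array `mfcc` for a test utterance, `identify(fuse(mfcc, delta(mfcc)), models)` would return the best-scoring enrolled speaker, provided each model in `models` was trained on fused features of the same dimension.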

Index Terms

Computer Science
Information Sciences

Keywords

DMFCC, MFCC, GMM, Feature Extraction, Speaker Identification