Research Article

Neural Network on the Performance of Bangla Automatic Speech Recognition

by Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida, Mohammad Nurul Huda
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Number 24
Year of Publication: 2019
Authors: Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida, Mohammad Nurul Huda
10.5120/ijais2019451822

Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida, Mohammad Nurul Huda. Neural Network on the Performance of Bangla Automatic Speech Recognition. International Journal of Applied Information Systems. 12, 24 (October 2019), 7-11. DOI=10.5120/ijais2019451822

@article{ 10.5120/ijais2019451822,
author = { Qamrun Nahar Eity and Md. Khairul Hasan and G. M. Monjur Morshed Mrida and Mohammad Nurul Huda },
title = { Neural Network on the Performance of Bangla Automatic Speech Recognition },
journal = { International Journal of Applied Information Systems },
issue_date = { October 2019 },
volume = { 12 },
number = { 24 },
month = { October },
year = { 2019 },
issn = { 2249-0868 },
pages = { 7-11 },
numpages = {9},
url = { https://www.ijais.org/archives/volume12/number24/1065-2019451822/ },
doi = { 10.5120/ijais2019451822 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T19:09:57.810076+05:30
%A Qamrun Nahar Eity
%A Md. Khairul Hasan
%A G. M. Monjur Morshed Mrida
%A Mohammad Nurul Huda
%T Neural Network on the Performance of Bangla Automatic Speech Recognition
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 24
%P 7-11
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper evaluates several Bangla (also widely known as Bengali) automatic speech recognition (ASR) systems based on local features (LFs), in order to observe the effect of a multilayer neural network (MLN) on their performance. The ASR systems use 3000 sentences uttered by 30 speakers from a wide area of Bangladesh, where Bangla is spoken as a native language. In the experiments, LFs are first extracted from the input speech and fed into an MLN to obtain phoneme probabilities for all the Bengali phonemes considered in this study. These phoneme probabilities are then modified by taking either their logarithm or their normal values, and the resulting values are input to a hidden Markov model (HMM) based classifier to obtain the word correct rate (WCR), word accuracy (WA), and sentence correct rate (SCR). The study shows that the ASR method that incorporates an MLN in its architecture improves word recognition accuracy with fewer components in the HMMs.
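
The logarithm step and the three reported scores can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define WCR, WA, or SCR, so the formulas below assume the common HTK-style conventions (correct rate ignores insertions, accuracy penalizes them). The function names, the probability floor, and all numbers are hypothetical and chosen only for illustration.

```python
import math

def to_log_probs(posteriors, floor=1e-10):
    """Take the natural logarithm of one frame of phoneme probabilities.

    The floor is an assumption of this sketch; it avoids log(0) when the
    network assigns zero probability to a phoneme.
    """
    return [math.log(max(p, floor)) for p in posteriors]

def word_correct_rate(n_words, deletions, substitutions):
    """Assumed definition: percentage of reference words recognized correctly."""
    return 100.0 * (n_words - deletions - substitutions) / n_words

def word_accuracy(n_words, deletions, substitutions, insertions):
    """Assumed definition: like the correct rate, but insertions also count as errors."""
    return 100.0 * (n_words - deletions - substitutions - insertions) / n_words

def sentence_correct_rate(n_sentences, n_correct_sentences):
    """Assumed definition: percentage of sentences recognized without any error."""
    return 100.0 * n_correct_sentences / n_sentences

# Hypothetical MLN output for one frame over four phonemes.
frame = [0.7, 0.2, 0.1, 0.0]
log_frame = to_log_probs(frame)

# Hypothetical error counts for a small test set.
wcr = word_correct_rate(n_words=100, deletions=3, substitutions=7)
wa = word_accuracy(n_words=100, deletions=3, substitutions=7, insertions=2)
scr = sentence_correct_rate(n_sentences=20, n_correct_sentences=15)
```

Under these assumed definitions, WA can never exceed WCR, because WA subtracts insertion errors in addition to deletions and substitutions.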

Index Terms

Computer Science
Information Sciences

Keywords

Local features; multilayer neural network; boost up; logarithm; normalization; hidden Markov model; automatic speech recognition