Neural Network on the Performance of Bangla Automatic Speech Recognition

Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida, Mohammad Nurul Huda in Artificial Intelligence

International Journal of Applied Information Systems
Year of Publication: 2019
Publisher: Foundation of Computer Science (FCS), NY, USA
Authors: Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida, Mohammad Nurul Huda
DOI: 10.5120/ijais2019451822
  1. Qamrun Nahar Eity, Md. Khairul Hasan, G. M. Monjur Morshed Mrida and Mohammad Nurul Huda. Neural Network on the Performance of Bangla Automatic Speech Recognition. International Journal of Applied Information Systems 12(24):7-11, October 2019.

    @article{10.5120/ijais2019451822,
    	author = "Qamrun Nahar Eity and Md. Khairul Hasan and G. M. Monjur Morshed Mrida and Mohammad Nurul Huda",
    	title = "Neural Network on the Performance of Bangla Automatic Speech Recognition",
    	journal = "International Journal of Applied Information Systems",
    	issue_date = "October, 2019",
    	volume = 12,
    	number = 24,
    	month = "October",
    	year = 2019,
    	issn = "2249-0868",
    	pages = "7-11",
    	url = "http://www.ijais.org/archives/volume12/number24/1065-2019451822",
    	doi = "10.5120/ijais2019451822",
    	publisher = "Foundation of Computer Science (FCS), NY, USA",
    	address = "New York, USA"
    }
    

Abstract

This paper evaluates the performance of several Bangla (also widely known as Bengali) automatic speech recognition (ASR) systems based on local features (LFs), in order to observe the effect of a multilayer neural network (MLN) on recognition performance. The systems use 3000 sentences uttered by 30 speakers from a wide area of Bangladesh, where Bangla is the native language. In the experiments, LFs are first extracted from the input speech and fed into an MLN to obtain phoneme probabilities for all the Bengali phonemes considered in this study. These phoneme probabilities are then modified by taking either their logarithm or their normal values, and the resulting values are input to a hidden Markov model (HMM) based classifier to obtain the word correct rate (WCR), word accuracy (WA), and sentence correct rate (SCR). The study shows that an ASR method that incorporates an MLN in its architecture improves word recognition accuracy with fewer components in the HMMs.
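The abstract reports three scores (WCR, WA, SCR) without defining them on this page. The following is a minimal sketch of how such scores are conventionally computed, assuming the definitions commonly used in HMM toolkits: WCR = H/N, WA = (H - I)/N, and SCR = the share of sentences recognized without any error, where N is the number of reference words, H the number of correctly recognized words, and I the number of insertions. The function names, the unit-cost alignment, and the transliterated example sentences are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    n: int = 0      # number of reference words (N)
    hits: int = 0   # correctly recognized words (H)
    subs: int = 0   # substitutions (S)
    dels: int = 0   # deletions (D)
    ins: int = 0    # insertions (I)


def align_counts(ref, hyp):
    """Align two word lists with unit-cost edit distance and count edit types."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i
    for j in range(1, cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the bottom-right corner, preferring the diagonal move.
    c = Counts(n=len(ref))
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            if ref[i - 1] == hyp[j - 1]:
                c.hits += 1
            else:
                c.subs += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            c.dels += 1
            i -= 1
        else:
            c.ins += 1
            j -= 1
    return c


def score(pairs):
    """Return (WCR, WA, SCR) in percent for (reference, hypothesis) sentence pairs."""
    total = Counts()
    correct_sentences = 0
    for ref, hyp in pairs:
        c = align_counts(ref.split(), hyp.split())
        total.n += c.n
        total.hits += c.hits
        total.subs += c.subs
        total.dels += c.dels
        total.ins += c.ins
        if c.subs + c.dels + c.ins == 0:
            correct_sentences += 1
    wcr = 100.0 * total.hits / total.n
    wa = 100.0 * (total.hits - total.ins) / total.n
    scr = 100.0 * correct_sentences / len(pairs)
    return wcr, wa, scr


# Hypothetical transliterated sentences, used only to exercise the functions.
example_pairs = [
    ("ami bhat khai", "ami bhat khai"),      # exact match
    ("tumi kothay jao", "tumi jao"),         # one deletion
    ("se boi pore", "se ekta boi pore"),     # one insertion
    ("ami jabo", "ami khabo"),               # one substitution
]
wcr, wa, scr = score(example_pairs)
print(f"WCR={wcr:.2f}%  WA={wa:.2f}%  SCR={scr:.2f}%")
```

Because WA also penalizes insertions, it can never exceed WCR on the same data. Toolkits may weight substitutions, deletions, and insertions differently during alignment, so exact counts can differ from this unit-cost sketch when several alignments tie.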


Keywords

Local features; multilayer neural network; boost up; logarithm; normalization; hidden Markov model; automatic speech recognition