Seybold Report ISSN: 1533-9211

SPEECH DIGIT RECOGNITION USING DEEP LEARNING


Nivesh, Shobha Bhatt


Vol 18, No 5 (2023) | Licensing: CC 4.0 | Pg no: 223-235 | Published on: 25-05-2023



Abstract
Recent advances in speech recognition technology are changing the way people interact with computers and other devices. Most research in this area has focused on English, which has held back the development of comparable capabilities for languages such as Hindi, despite its large user base. Recognising spoken Hindi digits is important because it improves voice-controlled technology and increases accessibility for Hindi speakers. This research presents an innovative technique for Hindi spoken digit recognition based on convolutional neural networks (CNNs). CNNs have performed remarkably well in computer vision tasks and have also shown potential in speech recognition. By utilising the patterns and characteristics inherent in spoken digit audio signals, CNNs can learn to classify Hindi digits with high accuracy. A carefully curated dataset of recordings of 300 individuals saying the Hindi digits 0 to 9 was created to train and evaluate the CNN model. The dataset covers a range of speakers and accounts for variation in age, gender, and regional accent to support the accuracy and generality of the proposed model. The model achieves an accuracy of 96.4% in spoken digit recognition.


Keywords:
Automatic Speech Recognition (ASR), Convolutional Neural Network (CNN), Spectrogram Extraction, Epochs, Confusion Matrix.
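
The keywords name spectrogram extraction and the confusion matrix as parts of the pipeline, but the abstract does not give their details. The following is a minimal sketch of both steps, not the authors' implementation: the function names, the Hann window, and the `frame_len` and `hop` values are assumptions chosen for illustration, and the CNN classifier itself is omitted.

```python
import cmath
import math


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Short-time magnitude spectrogram (Hann window, naive DFT).

    frame_len and hop are illustrative values, not taken from the paper.
    Returns a list of frames, each holding frame_len // 2 + 1 magnitudes.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to the current frame before transforming it.
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ])
    return frames


def confusion_matrix(true_labels, predicted_labels, n_classes=10):
    """Rows index the true digit, columns the predicted digit."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

In a pipeline of this kind, each recording would be converted to a spectrogram, the spectrogram would be fed to the CNN as a two-dimensional input, and the predicted digits on a held-out set would be summarised with `confusion_matrix` and `accuracy`.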


