IJFANS International Journal of Food and Nutritional Sciences

ISSN: Print 2319-1775, Online 2320-7876

Speech Emotion Recognition


Mr. M China Pentu Saheb, P Sai Srujana, P Lalitha Rani, M Siva Jyothi
doi: 10.48047/IJFANS/V11/I12/203

Abstract

Emotions are among the most direct ways for people to communicate their thoughts and actions to others. Recognizing emotions from a single speaker's voice is an increasingly important technology, because it can give useful insight into a person's thoughts. The process of extracting emotions from human speech is called Speech Emotion Recognition (SER). We used the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset to recognize emotions from speech. Emotions are classified from speech features such as Mel-Frequency Cepstral Coefficients (MFCC) and the Mel spectrogram. A Multilayer Perceptron (MLP) classifier achieved an accuracy of 68.33%, and a Convolutional Neural Network combined with Long Short-Term Memory (CNN-LSTM) achieved an accuracy of 80.64%.
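The two features named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only the Python standard library, a synthetic 440 Hz tone in place of a RAVDESS recording, and assumed settings (8000 Hz sample rate, 256-sample frame, 20 mel filters, 13 coefficients). All function names are hypothetical. In practice an audio library would normally compute these features over many frames of a real recording.

```python
import cmath
import math


def hz_to_mel(f):
    # Common HTK-style mel scale formula.
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive DFT (fine for short frames)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centre frequencies are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft  # centre frequency of DFT bin k
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return (log mel energies, MFCCs) for a single frame."""
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # One column of a log mel spectrogram: log energy in each mel band.
    log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
               for row in bank]
    # Unnormalised DCT-II of the log mel energies gives the cepstral coefficients.
    coeffs = [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                  for m, e in enumerate(log_mel))
              for c in range(n_coeffs)]
    return log_mel, coeffs


# Assumed demo input: one frame of a pure 440 Hz tone.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
log_mel, coeffs = mfcc(frame, sample_rate)
print(len(log_mel), len(coeffs))
```

Stacking the `log_mel` vectors of successive frames gives a mel spectrogram, and the `coeffs` vectors (or their averages over time) are the kind of fixed-length feature that a classifier such as an MLP can take as input.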
