IJFANS International Journal of Food and Nutritional Sciences

Speech Emotion Recognition

Keywords:

CNN LSTM, Mel, MFCC, MLP, SER

Mr. M China Pentu Saheb, P Sai Srujana, P Lalitha Rani, M Siva Jyothi
doi: 10.48047/IJFANS/V11/I12/203

Abstract

Emotions are one of the main ways people convey their thoughts and intentions to others. Recognizing emotions from a speaker's voice is an increasingly important capability, because it offers insight into what a person is thinking. The process of extracting emotions from human speech is called Speech Emotion Recognition (SER). We used the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset to recognize emotions from speech. Emotions are identified from speech features such as Mel-Frequency Cepstral Coefficients (MFCC) and the Mel spectrogram. A Multilayer Perceptron (MLP) classifier trained on these features reached an accuracy of 68.33%, and a Convolutional Neural Network with Long Short-Term Memory (CNN LSTM) reached an accuracy of 80.64%.
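The two features named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the frame size, hop length, number of Mel bands, number of coefficients, and the averaging over frames are all assumptions chosen for illustration, and a synthetic tone stands in for a RAVDESS recording.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-Mel mapping
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centres are equally spaced on the Mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Windowed short-time power spectrum projected onto the Mel filterbank
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return power @ mel_filterbank(sr, n_fft, n_mels).T  # (n_frames, n_mels)

def mfcc(mel_spec, n_mfcc=13):
    # Log Mel energies followed by an unnormalised DCT-II along the Mel axis
    log_mel = np.log(mel_spec + 1e-10)
    n_mels = log_mel.shape[1]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct_basis.T  # (n_frames, n_mfcc)

# Synthetic one-second 440 Hz tone (assumption: stands in for a speech clip)
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

mel = mel_spectrogram(tone, sr)
coeffs = mfcc(mel)

# One fixed-length vector per clip: per-frame features averaged over time
features = np.concatenate([coeffs.mean(axis=0), mel.mean(axis=0)])
```

A fixed-length vector like `features` is the kind of input an MLP classifier expects, while a sequence model such as a CNN LSTM would more plausibly consume the per-frame arrays `mel` or `coeffs` directly. In practice an audio library would usually replace the hand-written functions here.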

Issue

Volume 11, Issue 12 (2022)
