An Ensemble Approach to Identify Hindi Speech Emotions

Dipika Ramchandani

Speech is the vocalized form of human communication, and speech processing has become an essential technology in several fields. One of its applications is extracting the emotional state of a speaker from speech. We propose an approach for recognizing the emotions in Hindi speech given as input. The input dataset is a self-designed corpus of Hindi sentences spoken by politicians and dialogues from movies and advertisements. The approach identifies the emotion in a Hindi sentence (happy, sad, angry, fear, or surprise) using features extracted from the speaker's voice signal, where feature extraction refers to retaining only the useful information in the speech. The input audio is converted from speech to text, and the text is given to the text classifiers (Naïve Bayes and K-NN). To classify emotions, we employ the statistical model SVM (with RBF, sigmoid, and polynomial kernels) together with Naïve Bayes and K-NN, and we also apply an ensemble learning method. The output is the predicted emotion of the test tuple. Each classifier is evaluated with its confusion matrix, which gives the number of correctly classified samples and the classifier's accuracy. To verify the correctness of the proposed approach, the results are validated using the recurrence plot technique and significance tests such as ANOVA and the Kruskal–Wallis test, which analyze the similarity among instances of the same emotion.
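As an illustration of the evaluation described in the abstract, the sketch below combines the labels predicted by several classifiers and scores the result with a confusion matrix. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say how the ensemble combines its members, so hard majority voting is assumed here, and the label lists (`y_true`, `svm_rbf`, `naive_bayes`, `knn`) are made-up placeholder data rather than results from the corpus.

```python
from collections import Counter

# Emotion labels named in the abstract.
EMOTIONS = ["happy", "sad", "angry", "fear", "surprise"]


def majority_vote(predictions):
    """Combine per-classifier label lists by hard voting (assumed scheme).

    Ties are broken in favor of the classifier listed first.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true emotions, columns are predicted emotions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Correctly classified samples (the diagonal) over all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical placeholder predictions, not data from the paper.
y_true = ["happy", "sad", "angry", "fear", "surprise", "happy"]
svm_rbf = ["happy", "sad", "angry", "sad", "surprise", "happy"]
naive_bayes = ["happy", "happy", "angry", "fear", "surprise", "sad"]
knn = ["sad", "sad", "angry", "fear", "happy", "happy"]

ensemble = majority_vote([svm_rbf, naive_bayes, knn])
cm = confusion_matrix(y_true, ensemble, EMOTIONS)

for name, pred in [("SVM (RBF)", svm_rbf), ("Naive Bayes", naive_bayes),
                   ("K-NN", knn), ("Ensemble", ensemble)]:
    acc = accuracy(confusion_matrix(y_true, pred, EMOTIONS))
    print(f"{name}: accuracy = {acc:.2f}")
```

In this toy data each classifier errs on different samples, so the vote corrects the individual mistakes; with real predictions the gain depends on how correlated the classifiers' errors are.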

Volume 11 | Issue 5

Pages: 553-576