Need of Boosted GMM in Speech Emotion Recognition System Implemented Using Gaussian Mixture Model

Prof. A. A. Chaudhari, Dr. M. A. Pund, Dr. G. R. Bamnote, Prof. S. V. Pattalwar

Abstract


Speech emotion recognition is an important problem in human-machine interaction. Automatic recognition of human emotion in speech aims at identifying the underlying emotional state of a speaker from the speech signal. Gaussian mixture models (GMMs) combined with the minimum-error-rate classifier (i.e., the Bayes optimal classifier) are widespread and effective tools for speech emotion recognition. Typically, GMMs are used to model the class-conditional distributions of acoustic features, and their parameters are estimated by the expectation-maximization (EM) algorithm on a training data set. In this paper, we introduce a boosting algorithm for reliably and accurately estimating the class-conditional GMMs. The resulting algorithm is called the Boosted-GMM algorithm. Our speech emotion recognition experiments show that emotion recognition rates are effectively and significantly improved by the Boosted-GMM algorithm compared with the EM-GMM algorithm. This work is based on recognizing human emotions from the speech signal. Emotion recognition from a speaker's speech is difficult for several reasons. Acoustic variability is introduced by differences in sentences, speakers, speaking styles, and speaking rates. The same utterance may express different emotions in different portions, and it is very difficult to separate these portions of the utterance. A further problem is that emotional expression depends on the speaker and on his or her culture and environment: as culture and environment change, speaking style also changes, which is another challenge for a speech emotion recognition system.
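The abstract names the baseline but gives no implementation details, so the following is only a minimal sketch of the EM-GMM classifier it describes: one diagonal-covariance GMM is fit per emotion class by EM, and a test feature vector is assigned to the class whose GMM yields the highest log-likelihood (the minimum-error-rate decision under equal priors). The function names and the synthetic 2-D "acoustic features" are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fit_gmm(X, k, iters=100, seed=0):
    """Fit a diagonal-covariance GMM to feature matrix X (n, d) via EM.

    Returns (weights, means, variances). A sketch of the EM-GMM baseline,
    not the paper's actual implementation.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    varis = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: per-sample, per-component log joint density log p(x, z)
        logp = (np.log(w)
                - 0.5 * np.log(2.0 * np.pi * varis).sum(axis=1)
                - 0.5 * (((X[:, None, :] - means) ** 2) / varis).sum(axis=2))
        logp -= logp.max(axis=1, keepdims=True)   # stabilize before exp
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)   # posterior responsibilities
        # M-step: re-estimate weights, means, variances from responsibilities
        nk = resp.sum(axis=0) + 1e-12
        w = nk / n
        means = (resp.T @ X) / nk[:, None]
        varis = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return w, means, varis

def gmm_loglik(X, params):
    """Log-likelihood of each row of X under a fitted GMM (logsumexp over components)."""
    w, means, varis = params
    logp = (np.log(w)
            - 0.5 * np.log(2.0 * np.pi * varis).sum(axis=1)
            - 0.5 * (((X[:, None, :] - means) ** 2) / varis).sum(axis=2))
    m = logp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True))).ravel()

# Minimum-error-rate decision: pick the class-conditional GMM with the
# highest likelihood.  Synthetic 2-D features stand in for real acoustic
# features such as MFCCs.
rng = np.random.default_rng(1)
angry = rng.normal(0.0, 1.0, size=(300, 2))    # class 0 training features
neutral = rng.normal(5.0, 1.0, size=(300, 2))  # class 1 training features
models = [fit_gmm(angry, k=2), fit_gmm(neutral, k=2)]
test = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
                  rng.normal(5.0, 1.0, (50, 2))])
scores = np.stack([gmm_loglik(test, m) for m in models], axis=1)
pred = scores.argmax(axis=1)
print("accuracy:", (pred == np.r_[np.zeros(50, int), np.ones(50, int)]).mean())
```

The Boosted-GMM algorithm proposed in the paper would replace the single EM fit per class with a boosted ensemble of GMMs; its exact construction is not specified in the abstract, so it is not sketched here.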

Human beings normally use their essential abilities to communicate better among themselves as well as between human and machine. During this interaction, human beings have emotions that they want to convey to their communication partner, and that partner may be a human or a machine. This dissertation work is based on recognizing human emotions from the speech signal. This chapter introduces speech emotion recognition in terms of the problem overview and the need for such a system. Emotional speech recognition aims at automatically identifying the emotional or physical state of a human being from his or her voice. Although emotion detection from speech is a relatively new field of research, it has several potential applications. In human-computer or human-human interaction systems, emotion recognition could provide users with improved services by adapting to their emotions. The body of work on detecting emotion in speech is quite limited. Currently, researchers are still debating which features influence the recognition of emotion in speech. There is also considerable uncertainty about the best algorithm for classifying emotion, and about which emotions to group together.

 

