A Review Paper on Recognition of Human Emotions using Speech Signals

Ms. Manpreet Kaur, Ms. Veenu Kansal, Dr. Simranjit Singh


Human speech conveys much more than linguistic meaning. Various aspects of the speech signal carry a great deal of information about the speaker's emotions. Studying the emotional content of speech helps indicate an individual's stress level, which can be identified by processing the speech after it has been recorded. Emotion recognition is a challenging problem in speech processing and has numerous applications. This paper presents a survey of essential speech parameters that can be used to infer emotions from human speech, and reviews numerous strategies for classifying speech according to distinct emotions.


