Emotion recognition

From Wikipedia, the free encyclopedia

Emotion recognition is the process of identifying human emotion, most typically from facial expressions but also from verbal expressions. Humans perform this task largely automatically, and computational methodologies have also been developed to carry it out.


Humans show universal consistency in recognising emotions, but also a great deal of variability between individuals in their abilities to do so. This has been a major topic of study in psychology.


This process leverages techniques from multiple areas, such as signal processing, machine learning, and computer vision. Different methodologies and techniques may be employed to interpret emotion, such as Bayesian networks,[1] Gaussian mixture models,[2] and hidden Markov models.[3]
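As a minimal sketch of how hidden Markov models can be used for this task, the toy example below trains nothing: it simply scores a sequence of quantized observations under one hand-made discrete HMM per emotion class (via the forward algorithm) and picks the class with the highest likelihood. All probabilities and the two-class setup are illustrative assumptions, not values from any cited system.

```python
# One discrete HMM per emotion class; classify a sequence of quantized
# acoustic observations by maximum likelihood under each model.

def forward_likelihood(obs, start_p, trans_p, emit_p):
    """Forward algorithm: P(obs | model) for a discrete HMM."""
    states = range(len(start_p))
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in states]
    for o in obs[1:]:
        alpha = [
            sum(alpha[prev] * trans_p[prev][s] for prev in states) * emit_p[s][o]
            for s in states
        ]
    return sum(alpha)

# Two hypothetical 2-state models over 3 observation symbols (0, 1, 2).
# Each entry is (initial probs, transition matrix, emission matrix).
models = {
    "happy": ([0.6, 0.4],
              [[0.7, 0.3], [0.4, 0.6]],
              [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
    "sad":   ([0.5, 0.5],
              [[0.6, 0.4], [0.3, 0.7]],
              [[0.1, 0.2, 0.7], [0.2, 0.6, 0.2]]),
}

def classify(obs):
    """Return the emotion whose HMM assigns the observations the highest likelihood."""
    return max(models, key=lambda m: forward_likelihood(obs, *models[m]))
```

In a real speech-emotion system the observations would be quantized acoustic features and the model parameters would be estimated from labelled recordings (e.g. with Baum–Welch), rather than written by hand.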


The task of emotion recognition often involves the analysis of human expressions in multimodal forms such as texts, audio, or video.[4] Different emotion types are detected through the integration of information from facial expressions, body movement and gestures, and speech.[5] The technology is said to contribute to the emergence of the so-called emotional or emotive Internet.[6]

Existing approaches to classifying emotion types can generally be grouped into three main categories: knowledge-based techniques, statistical methods, and hybrid approaches.[7]

Knowledge-based Techniques[edit]

Knowledge-based techniques (sometimes referred to as lexicon-based techniques) utilize domain knowledge and the semantic and syntactic characteristics of language in order to detect certain emotion types.[8] In this approach, it is common to use knowledge-based resources during the emotion classification process, such as WordNet, SenticNet,[9] ConceptNet, and EmotiNet,[10] to name a few.[11] One of the advantages of this approach is the accessibility and economy brought about by the large availability of such knowledge-based resources.[7] A limitation of this technique, on the other hand, is its inability to handle concept nuances and complex linguistic rules.[7]

Knowledge-based techniques can be mainly classified into two categories: dictionary-based and corpus-based approaches.[8] Dictionary-based approaches find opinion or emotion seed words in a dictionary and search for their synonyms and antonyms to expand the initial list of opinions or emotions.[12] Corpus-based approaches, on the other hand, start with a seed list of opinion or emotion words and expand the database by finding other words with context-specific characteristics in a large corpus.[12] While corpus-based approaches take context into account, their performance still varies across domains, since a word in one domain can have a different orientation in another.[13]
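The dictionary-based expansion described above can be sketched in a few lines: start from labelled seed words and propagate the label along synonym links while flipping it along antonym links. The tiny synonym/antonym dictionary here is hand-made for illustration; a real system would follow links in a resource such as WordNet.

```python
# Toy dictionary-based lexicon expansion: synonyms inherit a seed word's
# polarity label, antonyms receive the opposite label.

SYNONYMS = {
    "happy": ["joyful", "glad"],
    "joyful": ["elated"],
    "sad": ["unhappy", "gloomy"],
}
ANTONYMS = {
    "happy": ["sad"],
    "sad": ["happy"],
}
OPPOSITE = {"positive": "negative", "negative": "positive"}

def expand_lexicon(seeds):
    """seeds: {word: "positive"/"negative"} -> expanded lexicon dict."""
    lexicon = dict(seeds)
    frontier = list(seeds)
    while frontier:
        word = frontier.pop()
        label = lexicon[word]
        for syn in SYNONYMS.get(word, []):
            if syn not in lexicon:
                lexicon[syn] = label            # synonyms keep the label
                frontier.append(syn)
        for ant in ANTONYMS.get(word, []):
            if ant not in lexicon:
                lexicon[ant] = OPPOSITE[label]  # antonyms flip it
                frontier.append(ant)
    return lexicon
```

Starting from the single seed `{"happy": "positive"}`, the expansion reaches "joyful", "glad", and "elated" as positive, and "sad", "unhappy", and "gloomy" as negative.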

Statistical Methods[edit]

Statistical methods commonly involve the use of different supervised machine learning algorithms, in which a large set of annotated data is fed into the algorithms for the system to learn and predict the appropriate emotion types.[7] This approach normally involves two sets of data: the training set and the testing set, where the former is used to learn the attributes of the data, while the latter is used to validate the performance of the machine learning algorithm.[14] Machine learning algorithms generally provide more reasonable classification accuracy compared to other approaches, but one of the challenges in achieving good results in the classification process is the need for a sufficiently large training set.[7][14]
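To make the train/test workflow above concrete, here is a toy supervised classifier: a pure-Python Naive Bayes with Laplace smoothing, trained on a hand-made labelled "training set" and queried on unseen sentences. The sentences and labels are invented for illustration; real systems train on large annotated corpora and typically use library implementations.

```python
# Minimal multinomial Naive Bayes for text emotion classification.
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (text, label) -> (class counts, per-class word counts, vocab)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for w in text.lower().split():
            word_counts[label][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest (smoothed) log posterior."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    def log_score(label):
        score = math.log(class_counts[label] / total)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / denom)  # Laplace smoothing
        return score
    return max(class_counts, key=log_score)

train_set = [
    ("i feel great today", "joy"),
    ("what a wonderful happy day", "joy"),
    ("i am so sad and lonely", "sadness"),
    ("this is a terrible sad loss", "sadness"),
]
model = train(train_set)
```

A held-out "testing set" would be passed through `predict` and compared against its annotations to estimate accuracy, exactly the validation role described above.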

Some of the most commonly used machine learning algorithms include Support Vector Machines (SVM), Naive Bayes, and Maximum Entropy.[15] Deep learning, a branch of machine learning based on multi-layer artificial neural networks, is also widely employed in emotion recognition.[16][17][18] Well-known deep learning architectures include variants of the Artificial Neural Network (ANN) such as the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and the Extreme Learning Machine (ELM).[15] The popularity of deep learning approaches in the domain of emotion recognition may be mainly attributed to their success in related applications such as computer vision, speech recognition, and Natural Language Processing (NLP).[15]

Hybrid Approaches[edit]

Hybrid approaches in emotion recognition are essentially a combination of knowledge-based techniques and statistical methods, which exploit complementary characteristics of both.[7] Some of the works that have applied an ensemble of knowledge-driven linguistic elements and statistical methods include sentic computing and iFeel, both of which have adopted the concept-level knowledge-based resource SenticNet.[19][20] The role of such knowledge-based resources in the implementation of hybrid approaches is highly important in the emotion classification process.[11] Since hybrid techniques gain from the benefits offered by both knowledge-based and statistical approaches, they tend to have better classification performance than knowledge-based or statistical methods employed independently.[8] A downside of using hybrid techniques, however, is the computational complexity during the classification process.[11]
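One simple way to realise the combination described above is to blend a knowledge-based lexicon score with the output of a statistical model. Both components below are stand-ins, a toy lexicon and a dummy classifier, and the weighting scheme is an assumption for illustration; this does not describe any particular published system such as sentic computing or iFeel.

```python
# Hybrid polarity scoring: weighted blend of a lexicon-based score and
# a (mocked) statistical classifier's score, both in [-1, 1].

LEXICON = {"happy": 1.0, "love": 0.8, "sad": -1.0, "angry": -0.7}

def lexicon_score(text):
    """Mean polarity of known words; 0.0 if no word is in the lexicon."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def statistical_score(text):
    """Placeholder for a trained model's polarity output in [-1, 1]."""
    exclam = text.count("!")
    return max(-1.0, min(1.0, 0.2 * exclam))  # toy heuristic, not a real model

def hybrid_score(text, weight=0.6):
    """Weighted blend; higher weight favours the knowledge-based component."""
    return weight * lexicon_score(text) + (1 - weight) * statistical_score(text)
```

The blend lets the statistical component decide when the lexicon is silent (no known words), while the lexicon anchors the score when its coverage applies, which is the complementary behaviour hybrid approaches aim for.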


Data is an integral part of the existing approaches to emotion recognition, and in most cases it is a challenge to obtain the annotated data necessary to train machine learning algorithms.[12] While most publicly available data are not annotated, annotated datasets do exist for emotion recognition research.[14] For the task of classifying different emotion types from multimodal sources in the form of texts, audio, videos, or physiological signals, the following datasets are available:

  1. HUMAINE: provides natural clips with emotion words and context labels in multiple modalities[21]
  2. Belfast database: provides clips with a wide range of emotions from TV programs and interview recordings[22]
  3. SEMAINE: provides audiovisual recordings between a person and a virtual agent and contains emotion annotations such as angry, happy, fear, disgust, sadness, contempt, and amusement[23]
  4. IEMOCAP: provides recordings of dyadic sessions between actors and contains emotion annotations such as happiness, anger, sadness, frustration, and neutral state[24]
  5. eNTERFACE: provides audiovisual recordings of subjects from seven nationalities and contains emotion annotations such as happiness, anger, sadness, surprise, disgust, and fear[25]
  6. DEAP: provides electroencephalography (EEG), electrocardiography (ECG), and face video recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips[26]
  7. DREAMER: provides electroencephalography (EEG) and electrocardiography (ECG) recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips[27]


Computer programmers often use Paul Ekman's Facial Action Coding System as a guide.
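In FACS-guided systems, detected facial action units (AUs) are matched against prototype AU combinations for basic emotions. The sketch below uses commonly cited EMFACS-style prototypes (e.g. AU6 + AU12 for happiness), but the exact mappings and the matching rule are simplified assumptions, not the full coding system.

```python
# Map a frame's detected facial action units (AUs) to a basic emotion
# by checking whether a prototype AU set is fully present.

EMOTION_PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "anger":     {4, 5, 7, 23},  # brow lowerer + lid raiser/tightener + lip tightener
}

def match_emotion(detected_aus):
    """Return the first emotion whose prototype AUs are all present, else None."""
    detected = set(detected_aus)
    for emotion, prototype in EMOTION_PROTOTYPES.items():
        if prototype <= detected:
            return emotion
    return None
```

Production systems score AU intensities continuously and handle overlapping prototypes rather than using an all-or-nothing subset test, but the lookup structure is the same in spirit.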

Emotion recognition is used for a variety of purposes. Affectiva uses it to help advertisers and content creators sell their products more effectively.[28] Affectiva also makes a Q-sensor that gauges the emotions of autistic children. Emotient was a startup company which used artificial intelligence to predict "attitudes and actions based on facial expressions".[29] Apple indicated its intention to buy Emotient in January 2016.[29] nViso provides real-time emotion recognition for web and mobile applications through a real-time API.[30] Visage Technologies AB offers emotion estimation as part of its Visage SDK for marketing, scientific research, and similar purposes.[31] Eyeris is an emotion recognition company that works with embedded system manufacturers, including car makers and social robotics companies, on integrating its face analytics and emotion recognition software, as well as with video content creators to help them measure the perceived effectiveness of their short- and long-form video creative.[32][33] Emotion recognition and emotion analysis are being studied by companies and universities around the world.

Emotion is universal[edit]

Over the decades, emotional intelligence has attracted increasing attention because of its contribution to meaningful outcomes in society. Psychologists and other stakeholders in the field have generally adopted the ability model to explain various concepts of emotional intelligence. The ability model is widely used because it gives experts the opportunity to investigate emotions and the way they affect every domain of society. The model defines emotional intelligence as an individual's ability to incorporate emotional knowledge to improve thought. Further, the ability model divides emotional intelligence into three main branches: emotion understanding, emotion perception, and emotion regulation. The cascading model outlines that emotion perception leads to emotion among people, while emotion awareness allows a person to understand the way emotions function. By noting emotions as they unfold, individuals gain an understanding of the causes and effects of emotions and the way emotions evolve over time. In addition, emotion perception expedites emotion understanding among human beings. Despite the growing literature on emotional intelligence, numerous questions remain concerning the construct of emotion. One area that remains controversial is whether emotion affects all people across cultures in the same way. Emotions play a significant role in the lives of human beings, particularly among infants, who need constant attention that assists them to grow holistically in their environment.

Over the years, the evolutionary perspective on emotions has focused on a subgroup of emotions that are shared not only amongst human beings but also with other species. According to the evolutionary approach, emotions are characterised by distinct signals and are designed to solve various adaptive challenges. Overall, particular events usually evoke certain emotions, followed by specific behaviour and subjective feelings, in people across different cultures (Shao, Doucet, & Caruso, 2015).[full citation needed] The culture-specific perspective, on the other hand, contends that emotions are not only remnants of our evolutionary past but can also be explained in physiological terms. Multiple empirical studies argue that emotion is universal while also linking it to cultural differences among societies. For example, Elfenbein and Ambady (2012) conducted a study concluding that emotional expressions were evident in cross-cultural meetings: people from different cultures were examined and expressed judgements that in turn evoked the same degree of emotion across all participants. Further, a different study supported the relationship between emotions and cultural practices among different people worldwide. For instance, Matsumoto et al. (2008) argued that emotion reappraisal and emotion expression existed in 23 nations, illuminating the fact that cultures which practised hierarchy and social order tended to suppress emotions; reappraisal and suppression in turn correlated with the cultures under study. Taken together, this research indicates that the social constructivist approach and the evolutionary perspective on emotions are connected.

Facial expression is widely regarded as a globally understood signal triggered by an underlying event. The majority of psychologists hold that the human face is important to understanding a person's emotion, and argue that a person can understand the mood of another individual through analysing facial expression (Mandal & Ambady, 2004). For over three decades, psychologists such as Carroll Izard and Maurice Merleau-Ponty have examined the relationship between expression and emotion. Merleau-Ponty posits that emotional expressions such as hate, anger, and love are not mental facts concealed behind the consciousness of a human being; rather, he argues, the emotions are evident on a person's face and not in his or her consciousness (Al-Shawaf, Conroy-Beam, Asao, & Buss, 2016). Understanding facial expression may seem a commonplace phenomenon, but it has emerged as one of the most important topics in the history of the psychology of emotion. Most recently, research on facial expressions has led to the emergence of fresh concepts, new techniques, and innovative findings; scientists are formulating alternative accounts while older versions attract renewed interest. Darwin's arguments suggest that emotions have evolved to the point of acting as a communication tool between individuals; thus, emotions convey important information across cultures around the world. The social constructivists argue that emotional knowledge is only expressed in a symbolic manner (Vytal & Hamann, 2010). According to this argument, the meaning of emotional words consists of a set of rules that itemize the types of people, actions, and situations to which an emotional remark applies. In this context, the rules of emotion allow people to communicate with each other without evoking negative feelings in the other party.
If a person feels that the rules embodied in the social constructivist theory have been violated, he or she has the moral right to experience emotion (Mesquita, 2011). Further, the prototype approach contends that emotions can be communicated through non-verbal emotion scripts; the prototype theory encompasses typical reactions, eliciting situations, and self-control techniques. The theory establishes that human beings have the capacity to express their thoughts non-verbally without being misunderstood by the other party.

Leading psychologist Paul Ekman and other practitioners posit that the facial displays commonly known as the "Big Six" are recognized in every culture: both the illiterate and the literate understand these displays regardless of who conveys them, whether a member of their own culture or a person from a different cultural background. Psychology generally categorizes basic emotion into six distinct types, namely happiness, anger, sadness, disgust, fear, and surprise. Numerous psychologists describe happiness as an emotion triggered when good things happen in people's lives. The feeling entails pleasantness accompanied by moderate levels of arousal; more importantly, happiness often occurs with particular facial expressions such as a smile or a laugh. Psychology regards anger as a feeling caused by something or someone that the person feels has intentionally wronged him or her. Psychologists argue that anger can be a useful feeling, providing a way to communicate negative feelings that in turn motivates the person to find solutions to specific problems. The feeling of sadness is deeply rooted in an individual's life; it mostly occurs when a person loses a loved one, fails to meet a specific goal, or loses self-control. People who have undergone a phase of sadness often exhibit certain characteristics, such as being quiet and regularly withdrawing from close friends and family members. Disgust, on the other hand, is a feeling evoked by revulsion or rejection; in most scenarios, people experience it after consuming something contaminated or unpleasant. For instance, mushrooms are considered a delicacy in some cultures but may be revolting to others. Notably, individuals experience fear in response to a particular event that occurred in the past, is occurring in the present, or may occur in the future (Russell & Dols, 1997).
People respond to fear differently: some may experience shock, lose their appetite, or lose self-control. Surprise is an emotional response that expresses wonder, astonishment, and amazement. The experience varies between people, since it can evoke a positive or a negative reaction; some psychologists argue that surprise helps people fulfil the curiosities that enable them to learn. Newly born infants display a narrow range of emotions, yet even at this age they express various kinds of emotional behaviour: for example, distress when they are lonely, in pain, seeking attention, or in need of food. Toddlers exhibit varied emotions over their development; they look attentive, listen to sounds, and respond when tickled (Herba, Landau, Russell, Ecker & Phillips, 2006). Furthermore, they exhibit positive emotions such as contentment and joy; when picked up, fed, or changed, they smile, show a relaxed body posture, and generally appear content. In the life of an infant, primary emotions appear in the first six months. In the middle of the second year, the infant starts to experience self-consciousness, which in turn leads to the first group of self-conscious emotions. By the fourth year, the same children exhibit a wide range of human emotions; for instance, they feel ashamed when they fail to accomplish a particular task and feel pride when they succeed. Infants express facial features of emotion from birth, and they understand and express emotion at different phases of infancy. By the fourth month, joy emerges: the infant recognizes smiles from caring family members, shows excitement when held by familiar faces, and may display anger when confronted with strangers. Also at this point, infants show disgust, especially when trying to spit foul-smelling or unpleasant food or objects from their mouths.
Fear emerges around 6–8 months of an infant's life, though it only fully emerges at around 18 months. In this context, the infant becomes fearful when an unfamiliar face approaches, and the emotion is accompanied by constant crying as the baby tries to escape from the stranger. Theory of mind explains the way infants understand other people's intentions; for example, the baby may direct selective attention to particular objects while ignoring others. Theory of mind also explains the development of social-emotional skills among infants in their second year. As the child grows, he or she develops joint attention, a mental phenomenon whereby, although the child and the caregiver share a common focus, the baby has developed the capability to observe the adult's intention and attention towards a specific object within the vicinity. Similarly, attachment theory explains the different phases a toddler undergoes, starting before birth, when biological mothers develop initial feelings towards their unborn infants (Trawick-Smith & Smith, 2014). The National Research Council and Institute of Medicine (NRCIM) describe newly born infants as babies who develop emotions over time and are ready to learn. At the attachment stage, the theory contends that babies learn how to seek attention from the mother or caregivers, who in turn respond by offering the baby food or comfort. Importantly, at this stage infants learn how to handle stressful scenarios by interacting with the mother or other close family members, which ultimately assists the toddler in controlling his or her emotions, or self-soothing. The quality of caregiving in the early stages either helps or hampers the baby's ability to control inner emotions; for example, when the mother consistently responds to the baby's signals, the toddler starts to enjoy social interactions, even with strangers.
Prosocial behaviour manifests itself in children as young as 18 months; at this age, infants are able to alert the caregiver to an unseen scenario or point to an out-of-reach item. Children between ages three and four develop more complex prosocial behaviour, responding to other people's negative situations by helping and comforting them. Through the emotional and social competencies developed in the first three years of life, babies acquire certain traits that help them safeguard themselves against unfamiliar faces. Babies in this category are often attached to their parents; therefore, attempts by strangers to seek the toddler's attention will likely result in rejection, and the infant will most likely cry as a self-protection mechanism. However, as babies develop, they become less attached to their caregivers or mothers, which in turn builds the self-confidence that allows them to accommodate and embrace new faces in their lives.


  1. ^ Miyakoshi, Yoshihiro, and Shohei Kato. "Facial Emotion Detection Considering Partial Occlusion Of Face Using Baysian Network". Computers and Informatics (2011): 96–101.
  2. ^ Hari Krishna Vydana, P. Phani Kumar, K. Sri Rama Krishna and Anil Kumar Vuppala. "Improved emotion recognition using GMM-UBMs". 2015 International Conference on Signal Processing and Communication Engineering Systems
  3. ^ B. Schuller, G. Rigoll M. Lang. "Hidden Markov model-based speech emotion recognition". ICME '03. Proceedings. 2003 International Conference on Multimedia and Expo, 2003.
  4. ^ Poria, Soujanya; Cambria, Erik; Bajpai, Rajiv; Hussain, Amir (September 2017). "A review of affective computing: From unimodal analysis to multimodal fusion". Information Fusion. 37: 98–125. doi:10.1016/j.inffus.2017.02.003.
  5. ^ Caridakis, George; Castellano, Ginevra; Kessous, Loic; Raouzaiou, Amaryllis; Malatesta, Lori; Asteriadis, Stelios; Karpouzis, Kostas (19 September 2007). "Multimodal emotion recognition from expressive faces, body gestures and speech". IFIP The International Federation for Information Processing. Springer US: 375–388. doi:10.1007/978-0-387-74161-1_41.
  6. ^ Price. "Tapping Into The Emotional Internet". TechCrunch. Retrieved 2018-12-12.
  7. ^ a b c d e f Cambria, Erik (March 2016). "Affective Computing and Sentiment Analysis". IEEE Intelligent Systems. 31 (2): 102–107. doi:10.1109/MIS.2016.31.
  8. ^ a b c Rani, Meesala Shobha; S, Sumathy (26 September 2017). "Perspectives of the performance metrics in lexicon and hybrid based approaches: a review". International Journal of Engineering & Technology. 6 (4): 108. doi:10.14419/ijet.v6i4.8295.
  9. ^ Cambria, Erik; Poria, Soujanya; Bajpai, Rajiv; Schuller, Bjoern (2016). "SenticNet 4: A Semantic Resource for Sentiment Analysis Based on Conceptual Primitives". Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers.
  10. ^ Balahur, Alexandra; Hermida, JesúS M.; Montoyo, AndréS (1 November 2012). "Detecting implicit expressions of emotion in text: A comparative analysis". Decision Support Systems. 53 (4): 742–753. doi:10.1016/j.dss.2012.05.024. ISSN 0167-9236.
  11. ^ a b c Medhat, Walaa; Hassan, Ahmed; Korashy, Hoda (December 2014). "Sentiment analysis algorithms and applications: A survey". Ain Shams Engineering Journal. 5 (4): 1093–1113. doi:10.1016/j.asej.2014.04.011.
  12. ^ a b c Madhoushi, Zohreh; Hamdan, Abdul Razak; Zainudin, Suhaila (2015). "Sentiment analysis techniques in recent works - IEEE Conference Publication". ieeexplore.ieee.org. doi:10.1109/SAI.2015.7237157.
  13. ^ Hemmatian, Fatemeh; Sohrabi, Mohammad Karim (18 December 2017). "A survey on classification techniques for opinion mining and sentiment analysis". Artificial Intelligence Review. doi:10.1007/s10462-017-9599-6.
  14. ^ a b c Sharef, Nurfadhlina Mohd; Zin, Harnani Mat; Nadali, Samaneh (1 March 2016). "Overview and Future Opportunities of Sentiment Analysis Approaches for Big Data". Journal of Computer Science. 12 (3): 153–168. doi:10.3844/jcssp.2016.153.168.
  15. ^ a b c Sun, Shiliang; Luo, Chen; Chen, Junyu (July 2017). "A review of natural language processing techniques for opinion mining systems". Information Fusion. 36: 10–25. doi:10.1016/j.inffus.2016.10.004.
  16. ^ Majumder, Navonil; Poria, Soujanya; Gelbukh, Alexander; Cambria, Erik (March 2017). "Deep Learning-Based Document Modeling for Personality Detection from Text". IEEE Intelligent Systems. 32 (2): 74–79. doi:10.1109/MIS.2017.23.
  17. ^ Mahendhiran, P. D.; Kannimuthu, S. (May 2018). "Deep Learning Techniques for Polarity Classification in Multimodal Sentiment Analysis". International Journal of Information Technology & Decision Making. 17 (03): 883–910. doi:10.1142/S0219622018500128.
  18. ^ Yu, Hongliang; Gui, Liangke; Madaio, Michael; Ogan, Amy; Cassell, Justine; Morency, Louis-Philippe (23 October 2017). "Temporally Selective Attention Model for Social and Affective State Recognition in Multimedia Content". ACM: 1743–1751. doi:10.1145/3123266.3123413.
  19. ^ Cambria, Erik; Hussain, Amir (2015). Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Springer Publishing Company, Incorporated. ISBN 3319236539.
  20. ^ Araújo, Matheus; Gonçalves, Pollyanna; Cha, Meeyoung; Benevenuto, Fabrício (7 April 2014). "iFeel: a system that compares and combines sentiment analysis methods". ACM: 75–78. doi:10.1145/2567948.2577013.
  21. ^ Petta, Paolo; Pelachaud, Catherine; Cowie, Roddy, eds. (2011). Emotion-Oriented Systems: The Humaine Handbook. Berlin: Springer. ISBN 978-3-642-15184-2.
  22. ^ Douglas-Cowie, Ellen; Campbell, Nick; Cowie, Roddy; Roach, Peter (1 April 2003). "Emotional speech: towards a new generation of databases". Speech Communication. 40 (1–2): 33–60. doi:10.1016/S0167-6393(02)00070-5. ISSN 0167-6393.
  23. ^ McKeown, G.; Valstar, M.; Cowie, R.; Pantic, M.; Schroder, M. (January 2012). "The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent". IEEE Transactions on Affective Computing. 3 (1): 5–17. doi:10.1109/T-AFFC.2011.20.
  24. ^ Busso, Carlos; Bulut, Murtaza; Lee, Chi-Chun; Kazemzadeh, Abe; Mower, Emily; Kim, Samuel; Chang, Jeannette N.; Lee, Sungbok; Narayanan, Shrikanth S. (5 November 2008). "IEMOCAP: interactive emotional dyadic motion capture database". Language Resources and Evaluation. 42 (4): 335–359. doi:10.1007/s10579-008-9076-6. ISSN 1574-020X.
  25. ^ Martin, O.; Kotsia, I.; Macq, B.; Pitas, I. (3 April 2006). "The eNTERFACE'05 Audio-Visual Emotion Database". IEEE Computer Society: 8. doi:10.1109/ICDEW.2006.145.
  26. ^ Koelstra, Sander; Muhl, Christian; Soleymani, Mohammad; Lee, Jong-Seok; Yazdani, Ashkan; Ebrahimi, Touradj; Pun, Thierry; Nijholt, Anton; Patras, Ioannis (January 2012). "DEAP: A Database for Emotion Analysis Using Physiological Signals". IEEE Transactions on Affective Computing. 3 (1): 18–31. doi:10.1109/T-AFFC.2011.15. ISSN 1949-3045.
  27. ^ Katsigiannis, Stamos; Ramzan, Naeem (January 2018). "DREAMER: A Database for Emotion Recognition Through EEG and ECG Signals From Wireless Low-cost Off-the-Shelf Devices". IEEE Journal of Biomedical and Health Informatics. 22 (1): 98–107. doi:10.1109/JBHI.2017.2688239. ISSN 2168-2194.
  28. ^ "Affectiva".
  29. ^ a b DeMuth Jr., Chris (8 January 2016). "Apple Reads Your Mind". M&A Daily. Seeking Alpha. Retrieved 9 January 2016.
  30. ^ "nViso". nViso.ch.
  31. ^ "Visage Technologies".
  32. ^ "Feeling sad, angry? Your future car will know".
  33. ^ "Cars May Soon Warn Drivers Before They Nod Off".