Method and System for Detecting Emotions in Text

IP.com Disclosure Number: IPCOM000240190D
Publication Date: 2015-Jan-11
Document File: 6 page(s) / 241K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system are disclosed for detecting emotions in text.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 41% of the total text.


Most messaging applications currently include some form of emotion detection in text, generally based on dictionaries and bag-of-words models. However, dictionaries can be noisy, and their entries are context dependent.

Disclosed is a method and system for detecting emotions in text. The method and system detects emotions by adding four kinds of constraints: topical, user, temporal, and social. The topical constraint is that documents on the same topic tend to express similar emotions. The user constraint is that a single user's emotions do not vary much. The temporal constraint is that a user's emotions over a short time period need to be similar. The social constraint is that the members of a user's network should not exhibit emotions that contrast with the user's. Applying the four constraints, together with their statistical significance, yields logical emotion categories for the text.
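This excerpt does not give the mathematical form of the four constraints. One common way to express "linked items should carry similar emotions" is a pairwise squared-difference penalty over per-document emotion scores. The sketch below assumes that form; the function name `similarity_penalty`, the pair lists, and the toy scores are all hypothetical and are not taken from the disclosure.

```python
import numpy as np

def similarity_penalty(scores, pairs, weight=1.0):
    """Sum of squared differences between the emotion-score rows of linked documents."""
    total = 0.0
    for i, j in pairs:
        diff = scores[i] - scores[j]
        total += float(diff @ diff)
    return weight * total

# Rows are documents, columns are emotion categories (toy scores, assumed).
scores = np.array([
    [0.9, 0.1],  # document 0
    [0.8, 0.2],  # document 1: same topic as document 0, similar emotions
    [0.1, 0.9],  # document 2: same user as document 0, contrasting emotions
])

topical_pairs = [(0, 1)]  # documents linked by topic
user_pairs = [(0, 2)]     # documents linked by author

topical_cost = similarity_penalty(scores, topical_pairs)
user_cost = similarity_penalty(scores, user_pairs)
```

Under this assumed penalty, the topical pair is cheap because its two rows agree, while the user pair is expensive because its two rows contrast. A penalty of this kind would push a model toward scores that respect the stated constraints.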

Consider a scenario where D is a latent word-emotion matrix and S is a latent document-emotion matrix. The two matrices contain a score for each word or document per emotion category. The method and system classifies emotions based on non-negative matrix factorization (NMF). Further, consider that X is drawn from the product of D and S with independent and identically distributed Gaussian noise. This leads to the following problem formulation.


$$\min_{D \ge 0,\; S \ge 0} \; \lVert X - D S^{\top} \rVert_F^2 \qquad (1)$$

where ||M||_F denotes the Frobenius norm of a matrix M. Eq. 1 captures the essence of the method and system. The problem formulation is extended with several novel constraints: unknown factors, emotion lexicon and ordering, topic constraints, and emotion constraints.
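The disclosure does not say which solver is used for the factorization. As a minimal sketch, the classic multiplicative-update rule for NMF can minimize the squared Frobenius norm of X minus the product of D and the transpose of S. The transpose is an assumption: it follows from both matrices having one column per emotion. The function names, the iteration count, and the small constant `eps` are also assumptions made for illustration.

```python
import numpy as np

def nmf_frobenius(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative X (words x documents) as D @ S.T with D, S >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    D = rng.random((m, k)) + eps  # word-emotion scores, one column per emotion
    S = rng.random((n, k)) + eps  # document-emotion scores, one column per emotion
    for _ in range(n_iter):
        # Multiplicative updates: each factor is rescaled by a non-negative
        # ratio, so non-negativity is preserved without explicit projection.
        D *= (X @ S) / (D @ (S.T @ S) + eps)
        S *= (X.T @ D) / (S @ (D.T @ D) + eps)
    return D, S

def frobenius_loss(X, D, S):
    """Squared Frobenius norm of the reconstruction error."""
    return np.linalg.norm(X - D @ S.T, "fro") ** 2
```

A call such as `D, S = nmf_frobenius(X, k=4)` returns one row of emotion scores per word in `D` and one per document in `S`. Which column corresponds to which emotion is arbitrary in this plain factorization.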

Unknown factors: Several unaccounted-for factors can lead to the inclusion of a word (or feature) in a document. For example, popular words have a higher probability of being mentioned in a document than rare words. Similarly, a large document has a higher probability of mentioning several words than a terse document. To account for such factors, multivariate half-normally distributed random variables are considered, as follows:


(word-level)

(document-level)

The word-level and document-level variables are half-normal because each must satisfy a constraint on its values (a half-normal variable is non-negative by construction). Incorporating the unknown factors leads to the following problem formulation.


(2)

The minimization problem posed by Eq. 2 is a generalization of Eq. 1: fixing the additional terms of Eq. 2 at suitable values gives the solution to Eq. 1.
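The following sketch illustrates the unknown-factor idea numerically. Eq. 2 is not reproduced in this extraction, so the additive form used here (a per-word offset `a` and a per-document offset `b` added to the product of D and the transpose of S) is an assumption. The names `a`, `b`, `sample_half_normal`, and `loss_with_offsets` and the scale 0.5 are likewise assumptions, not the disclosure's notation.

```python
import numpy as np

def sample_half_normal(size, scale, rng):
    # |N(0, scale^2)| is half-normal, so every draw is non-negative.
    return np.abs(rng.normal(0.0, scale, size=size))

def loss_with_offsets(X, D, S, a, b):
    # ASSUMED additive form: X ~ D @ S.T + a[:, None] + b[None, :]
    residual = X - D @ S.T - a[:, None] - b[None, :]
    return np.linalg.norm(residual, "fro") ** 2

rng = np.random.default_rng(0)
m, n, k = 5, 4, 2              # words, documents, emotion categories (toy sizes)
D = rng.random((m, k))         # word-emotion scores
S = rng.random((n, k))         # document-emotion scores
a = sample_half_normal(m, 0.5, rng)  # word-level offsets, e.g. word popularity
b = sample_half_normal(n, 0.5, rng)  # document-level offsets, e.g. document length
X = D @ S.T + a[:, None] + b[None, :]

# With both offsets fixed at zero, the assumed objective is the plain loss.
plain_loss = np.linalg.norm(X - D @ S.T, "fro") ** 2
zero_offset_loss = loss_with_offsets(X, D, S, np.zeros(m), np.zeros(n))
```

In this assumed form, fixing both offsets at zero reduces the objective to the plain factorization loss. That is consistent with the statement that Eq. 2 generalizes Eq. 1.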

Emotion Lexicon and Ordering: The mapping of emotions to the columns of the latent emotion matrices D and S is unknown. To fix the mapping, a word-emotion lexicon is used, for which the mapping of columns to emotions is known and fixed. An additional benefit of the emotion lexicon is that it enables encoding of prior knowledge about the word-emotion categories. The emotional constraint is added to the model. Here, indicates the first rows of matrix . Firstly, rows of X are ordered to first represent the emotion lexic...