Sentiment analysis for reviews and microtexts based on lexico-syntactic knowledge

We describe two methods to perform sentiment analysis on both long and short texts written in Spanish. We first present an unsupervised method based on dependency parsing which calculates the semantic orientation (SO) of the sentences in order to classify the polarity. We then propose a hybrid approach which uses the computed SO and lexico-syntactic knowledge as features for a supervised classifier. Experimental results show the utility of employing syntactic information to classify the polarity in both types of texts, and the importance of defining mechanisms to adapt the system to a specific domain and social medium.


INTRODUCTION
With the advent of Web 2.0 and the rise of blogs, forums and social networks, users express their views about various topics on these sites. They discuss current issues and praise, compare or complain about products, services and even people. The economic benefits that can be derived from this knowledge are obvious, so the market has begun to demand solutions to analyse this enormous flow of opinions. In this respect, sentiment analysis (SA) is a growing field of research focussed on the automatic processing of subjective information, where one of the main tasks is polarity classification, i.e., determining whether the opinion expressed is positive, negative, neutral or mixed. There is no standard set of polarity categories, but most studies perform a binary (positive, negative) or a ternary (positive, negative, neutral) classification, although there is also related research which takes more categories into account.
The polarity classification task has been tackled in the last decade from two different perspectives: supervised machine learning (ML) approaches and unsupervised semantic-based methods. ML solutions involve building classifiers from a collection of annotated texts (Pang et al. (2002)), where each text is usually represented as a bag-of-words. It is also common to include some linguistic-related processing for preparing features (Bakliwal et al. (2012)). The main drawback of this approach is that it is highly domain dependent (Taboada et al. (2011)). On the other hand, semantic-based approaches (Turney (2002)) involve the use of dictionaries where different kinds of words are tagged with their semantic orientation (SO). They have been applied successfully in many contexts, but their performance is not optimal because different application domains and social media have many specific subjective elements, which results in low recall for the opinion lexicons (Zhang et al. (2011)).
Traditionally, SA research has focussed on long texts. For example, Taboada et al. (2011) propose a lexicon-based method which deals with phenomena such as intensification, negation or irrealis. With a similar aim, Abbasi et al. (2008) describe an approach which takes stylistic and syntactic components as features for a supervised classifier. However, the recent success of microblogging social networks, such as Twitter, has increased interest in monitoring short texts. In this line, Bakliwal et al. (2012) proposed a sentiment scoring algorithm which uses prior information to classify the polarity of tweets, and Sidorov et al. (2012) explore different parameter settings for a supervised classifier.
In this context, this paper proposes an unsupervised and a supervised approach which are able to perform binary polarity classification over reviews and short texts. In both cases we adopt an NLP perspective which takes into account lexical information and syntactic relations between words. The unsupervised approach treats relevant linguistic phenomena, such as intensification, subordinate adversative clauses or negation, and then calculates the SO of the text. The ML approach uses lexico-syntactic knowledge and the information provided by our unsupervised system as features for a classifier.
The methods proposed in this article have been tested with the following corpora:

NATURAL LANGUAGE PROCESSING TASKS
In order to employ linguistic knowledge in SA, we first need to apply natural language processing (NLP) to the texts. As a first step, all texts were preprocessed as follows:
• Unification of compound expressions. There are many compound expressions in Spanish, like 'sin embargo' ('however') or 'en absoluto' ('not at all'), that must usually be interpreted as single units of meaning. To find them, we use a dictionary of compound expressions extracted from the Ancora corpus (Taulé et al. (2008)). If the pre-processing algorithm identifies one of these groups of words, it unifies them into a single token ('en absoluto' becomes 'en_absoluto').
• Normalisation of punctuation marks. People do not usually respect punctuation rules in web reviews, which is a handicap for the rest of the processing. To resolve this, pre-processing homogenises the representation of all punctuation marks by adding blanks where required.
• URL normalisation: Web addresses are replaced with the string 'URL'.
• Hashtags ('#'): If a hashtag appears at the beginning or the end of a tweet, the complete hashtag is eliminated. Otherwise we only delete the '#'.
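The pre-processing steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-entry compound dictionary, the regular expressions, the set of punctuation marks and the underscore used to join compounds are all hypothetical choices, not the resources or rules actually used in the system.

```python
import re

# Hypothetical two-entry dictionary; the paper extracts its list from Ancora.
COMPOUNDS = {("sin", "embargo"), ("en", "absoluto")}

URL_RE = re.compile(r"https?://\S+|www\.\S+")  # assumed URL pattern


def normalise_urls(text):
    """Replace every web address with the string 'URL'."""
    return URL_RE.sub("URL", text)


def handle_hashtags(text):
    """Drop a hashtag at the start or end of the text; elsewhere delete only '#'."""
    text = text.strip()
    text = re.sub(r"^#\w+\s*", "", text)
    text = re.sub(r"\s*#\w+$", "", text)
    return re.sub(r"#(\w+)", r"\1", text)


def space_punctuation(text):
    """Surround punctuation marks with blanks, then collapse repeated blanks."""
    text = re.sub(r"([.,;:!?¡¿()])", r" \1 ", text)
    return re.sub(r"\s+", " ", text).strip()


def unify_compounds(tokens):
    """Join known two-word compound expressions into a single token."""
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(t.lower() for t in tokens[i:i + 2])
        if pair in COMPOUNDS:
            out.append("_".join(tokens[i:i + 2]))
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def preprocess(text):
    # URLs are replaced first so that punctuation spacing cannot split them.
    text = normalise_urls(text)
    text = handle_hashtags(text)
    text = space_punctuation(text)
    return unify_compounds(text.split())


print(preprocess("#cine No me gusta en absoluto,mira http://t.co/abc #fail"))
```

The order of the steps is a design choice of this sketch: normalising URLs before spacing punctuation keeps the dots and slashes of an address from being split into separate tokens.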
As a second step, Part-of-Speech (PoS) tagging is performed by running the Brill tagger (Brill (1992)), using Ancora as the training corpus. A challenge for Spanish PoS tagging is that accents are commonly omitted by people writing web reviews. As a result, taggers trained on regular corpora are not able to correctly tag pairs of words that rely on diacritical accents to differentiate their meanings. To improve performance, we have expanded the training set: we cloned each sentence to obtain its equivalent without any acute accent.
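A minimal sketch of the training-set expansion is given below, assuming that sentences are stored as lists of (word, tag) pairs; the tag names in the example are placeholders rather than the Ancora tagset. Only the acute accent is removed, so 'ñ' and 'ü' are preserved.

```python
import unicodedata

ACUTE = "\u0301"  # Unicode combining acute accent


def strip_acute(text):
    """Remove acute accents only, leaving 'ñ' and 'ü' untouched."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def expand_training_set(sentences):
    """Append to each tagged sentence a clone without acute accents."""
    expanded = []
    for sentence in sentences:
        expanded.append(sentence)
        expanded.append([(strip_acute(word), tag) for word, tag in sentence])
    return expanded


# Placeholder tags, for illustration only.
corpus = [[("él", "PRON"), ("está", "VERB"), ("aquí", "ADV")]]
for sentence in expand_training_set(corpus):
    print(sentence)
```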
Once these steps have been performed, we use dependency parsing (Gómez-Rodríguez et al. (2008)). As a result, we obtain a dependency tree for each sentence, consisting of a set of head/dependent binary relations, called dependencies, between words. Each dependency has a label with a given dependency type, which denotes the existing syntactic relation between head and dependent. To simplify computational implementation, an artificial ROOT node is added as the first word of each sentence. Figure 1 shows an example of this type of analysis for the sentence 'Esa película no es muy buena', which translates to 'That movie is not too good'.
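One possible in-memory representation of such a tree is sketched below: each word stores the index of its head and the dependency type, with ROOT at position 0. The `Node` class and `children` helper are hypothetical, and the attachments and labels chosen for the example sentence are an assumption made for illustration; they are not copied from Figure 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    index: int
    form: str
    head: Optional[int]  # index of the head word; None for the artificial ROOT
    deprel: str          # dependency type linking this word to its head


def children(tree: List[Node], head: int) -> List[Node]:
    """Return the direct dependents of the node at position `head`."""
    return [node for node in tree if node.head == head]


# 'Esa película no es muy buena'; attachments and labels are assumed.
TREE = [
    Node(0, "ROOT", None, "root"),
    Node(1, "Esa", 2, "spec"),
    Node(2, "película", 4, "suj"),
    Node(3, "no", 4, "mod"),
    Node(4, "es", 0, "sentence"),
    Node(5, "muy", 6, "spec"),
    Node(6, "buena", 4, "atr"),
]

print([node.form for node in children(TREE, 4)])
```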

AN UNSUPERVISED SYSTEM BASED ON THE SEMANTIC ORIENTATION OF THE SENTENCES (US)
Most unsupervised SA systems are lexicon-based solutions that cannot interpret the syntactic structure of texts. In order to overcome this limitation, it is common to implement heuristics that simulate a comprehension of negation, intensification and other linguistic constructions, but these often fail, given the complexity of human language. As an alternative, in this section we propose an unsupervised method based on dependency parsing for determining the SO of reviews. We use the SODictionariesV1.11Spa (Brooke et al. (2009)) as our opinion lexicon. It is a collection of subjective words where each one is associated with an SO between +5 and -5, according to its generic perception (e.g. 'happiness' has an SO of +5, 'killer' has -5 and 'good' is associated with a value of +2).

Treatment of intensification
An intensifier is a word or an expression which plays the role of a valence shifter in a sentence. There are two types: amplifiers and downtoners. The former increase the SO of one or more tokens, such as 'muy' ('very'), whereas the latter decrease it, such as 'en absoluto' ('not at all') or 'poco' ('little'). The SODictionariesV1.11Spa include a specific dictionary for intensifiers, where each intensifier has an associated percentage, positive if it is an amplifier and negative if it is a downtoner. We use syntactic dependencies to identify the scope of an intensifier: whenever an adverb is a dependent of specifier (spec, espec) or adjunct (cc, sadv) type, we take that word as a valence shifter and its head as the exact scope to be shifted. For example, in Figure 1, the term 'muy' would modify the SO of 'buena' by +25%, according to its value in SODictionariesV1.11Spa.
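The scope rule for intensifiers can be illustrated with the following sketch. The +25% for 'muy' comes from the text; assigning +2 to 'buena' (the value quoted for 'good'), the token representation, the tag name 'adverb' and the multiplicative combination of several intensifiers are assumptions of this example.

```python
SO_LEXICON = {"buena": 2.0}    # assumed: the +2 quoted for 'good'
INTENSIFIERS = {"muy": 0.25}   # +25%, as stated in the text
INTENSIFIER_DEPRELS = {"spec", "espec", "cc", "sadv"}


def intensified_so(tokens):
    """Return {token index: SO} after applying intensifiers to their heads."""
    so = {i: SO_LEXICON[t["form"]]
          for i, t in enumerate(tokens) if t["form"] in SO_LEXICON}
    for t in tokens:
        is_shifter = (t["pos"] == "adverb"
                      and t["deprel"] in INTENSIFIER_DEPRELS
                      and t["form"] in INTENSIFIERS)
        # The head of the intensifier is taken as the exact scope.
        if is_shifter and t["head"] in so:
            so[t["head"]] *= 1 + INTENSIFIERS[t["form"]]
    return so


# 'Esa película no es muy buena' with ROOT at index 0 (labels assumed).
SENTENCE = [
    {"form": "ROOT", "pos": "root", "head": None, "deprel": "root"},
    {"form": "Esa", "pos": "determiner", "head": 2, "deprel": "spec"},
    {"form": "película", "pos": "noun", "head": 4, "deprel": "suj"},
    {"form": "no", "pos": "adverb", "head": 4, "deprel": "mod"},
    {"form": "es", "pos": "verb", "head": 0, "deprel": "sentence"},
    {"form": "muy", "pos": "adverb", "head": 6, "deprel": "spec"},
    {"form": "buena", "pos": "adjective", "head": 4, "deprel": "atr"},
]

print(intensified_so(SENTENCE))  # {6: 2.5}
```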

Treatment of subordinate adversative clauses
A subordinate adversative clause expresses an event or fact that is the opposite of that of the main clause. In an SA context, we hypothesise that this type of construction is a way of restricting, excluding or amplifying the sentiment reflected by both the main and subordinate clauses. We consider subordinate adversative clauses as a special case of intensification, but involving clauses, not individual terms. In this respect, we distinguish two different types of adversative conjunctions, as pointed out in Campos (1993), Chapter 3. The first type, restrictives ('but', 'while', . . .), increase the sentiment of the subordinate clause and decrease the SO of the main clause. The second type, exclusives ('but rather', 'but on the other hand', . . .), completely ignore the sentiment reflected in the main clause. In this way, our approach is able to coherently calculate the sentiment of sentences such as 'The actor acted badly but the movie was great', where the sentiment of 'The actor acted badly' is partially diminished by the subordinate adversative clause. The modifier percentages of restrictive conjunctions were established by an empirical process over the SFU Spanish Review Corpus: the SO of the subordinate clause is increased by 40% and the SO of the main clause is decreased by 25%.
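As a rough illustration, the treatment of adversative clauses can be seen as a weighted combination of the two clause scores. The sketch assumes that the 25% reduction applies to the main clause and the 40% increase to the subordinate clause, consistent with the example 'The actor acted badly but the movie was great'. Summing the two weighted scores, and leaving the subordinate clause unweighted for exclusive conjunctions, are further assumptions; the clause scores in the example are invented.

```python
# (weight of main clause, weight of subordinate clause)
WEIGHTS = {
    "restrictive": (0.75, 1.40),  # main clause -25%, subordinate clause +40%
    "exclusive": (0.0, 1.0),      # main clause ignored; 1.0 is an assumption
}


def adversative_so(main_so, subordinate_so, kind):
    """Combine the SO of both clauses according to the conjunction type."""
    main_weight, subordinate_weight = WEIGHTS[kind]
    return main_weight * main_so + subordinate_weight * subordinate_so


# Invented clause scores: 'The actor acted badly' = -2, 'the movie was great' = +3.
print(adversative_so(-2.0, 3.0, "restrictive"))
print(adversative_so(-2.0, 3.0, "exclusive"))
```

With these invented scores the unweighted sum would be +1, whereas the restrictive weighting yields roughly +2.7, so the sentence is pushed further towards the polarity of the subordinate clause.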

Treatment of negation
The most common and simple way to negate a sequence of tokens in Spanish is the adverb 'no' ('no'/'not'), but other terms such as 'sin' ('without') or 'nunca' ('never') are frequently employed. However, some types of Spanish sentences require double negatives to express negation. In this respect, words like 'nada' ('nothing'), 'ninguno' ('none') or 'nadie' ('nobody') are commonly preceded by 'no'. Moreover, the boundary between a negation term and a downtoner is blurred. Tokens like 'apenas' ('barely') or 'casi' ('almost') could easily be classified in either of these two categories. We have chosen to consider this type of expression as intensifiers, and therefore we only treat explicitly as negators the adverbs 'no', 'nunca' and 'sin', which cover a great number of negative sentences. Our treatment of negation consists of two basic steps: 1) identify the scope of a negation term and 2) modify the semantic orientation of the affected tokens.

Scope identification
The procedure for identifying the scope of a negation depends on the adverb used in the phrase. The syntactic structure used in Ancora for representing the adverb 'sin' ensures that its child node is the scope of negation, without needing to analyse the dependency type. However, we cannot assume the same for the negators 'no' and 'nunca'. They are normally represented as leaf nodes and the candidate scope of negation always involves a head node or a collection of sibling nodes, so we require a more complex algorithm for their treatment.
We follow a procedure based on Jia et al. (2009), but we have adapted it to profit from the additional information provided by the syntactic structure of the sentence. We use dependency types to directly extract the scope of negation. When a token has a negator 'no' ('not') or 'nunca' ('never') as a child node through a dependency of type 'neg' or 'mod', we try a collection of syntactic heuristic rules in the following order; only the first matching rule is applied:
1. Subjective parent rule: Whenever the parent node of a negation term has sentiment, only that node is negated. For example, in the sentence 'he does not praise my work', the negation 'not' depends on 'praise', which is included as a subjective word in the SO dictionaries, so we consider this term as the scope of the negation.
2. Subject complement/Direct object rule: Whenever a branch at the same level as a negation node is labelled with a dependency of type subject complement (atr) ('the meal is not good') or direct object (cd) ('the meal does not look good'), our sentiment analyser negates that branch.
3. Adjunct rule: Whenever a negation term has an adjunct branch (cc) at the same level, the sentiment of that branch is shifted. If there is more than one adjunct, only the first one is negated. For example, in the sentence 'he does not work efficiently on Fridays', our method takes the manner adjunct ('efficiently') as the scope of the negation, because it is the nearest to the negation.
4. Default rule: If none of the previous rules matches, we consider as scope the sibling branches of the negator.

Polarity flip
We follow a shift negation algorithm where the SO value is shifted toward the opposite polarity by a fixed amount: following Taboada et al. (2011), we have chosen a shift value of 4 for the adverbs 'no' ('not') and 'nunca' ('never'). For the adverb 'sin' ('without'), based on our experimental setup, we have chosen a value of 3.5. We hypothesise that this kind of negation is less potent, given that its scope is fairly local. Experimental results showed a slight improvement in accuracy when following this strategy.
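The scope heuristics and the polarity shift can be sketched together as follows. The shift values (4 for 'no' and 'nunca', 3.5 for 'sin') and the rule order come from the text; the token representation, the choice of the first matching branch in word order, and the example labels are assumptions. The functions return only the head of each negated branch; how the SO of a whole branch is aggregated is not modelled here.

```python
SHIFT = {"no": 4.0, "nunca": 4.0, "sin": 3.5}


def shift(so, amount):
    """Move an SO value toward the opposite polarity by a fixed amount."""
    if so > 0:
        return so - amount
    if so < 0:
        return so + amount
    return so


def negation_scope(tokens, neg, subjective):
    """Return the indices heading the branches negated by tokens[neg].

    tokens: list of dicts with 'form', 'head' and 'deprel';
    subjective: set of indices of tokens that carry an SO.
    """
    form = tokens[neg]["form"].lower()
    if form == "sin":  # the child node of 'sin' is always its scope
        return [i for i, t in enumerate(tokens) if t["head"] == neg]
    if tokens[neg]["deprel"] not in {"neg", "mod"}:
        return []
    parent = tokens[neg]["head"]
    if parent in subjective:  # 1. subjective parent rule
        return [parent]
    siblings = [i for i, t in enumerate(tokens)
                if t["head"] == parent and i != neg]
    # 2. subject complement / direct object rule, then 3. adjunct rule
    for labels in ({"atr", "cd"}, {"cc"}):
        matches = [i for i in siblings if tokens[i]["deprel"] in labels]
        if matches:
            return matches[:1]  # assumed: first match in word order
    return siblings  # 4. default rule


# 'Esa película no es muy buena' with ROOT at index 0 (labels assumed).
TOKENS = [
    {"form": "ROOT", "head": None, "deprel": "root"},
    {"form": "Esa", "head": 2, "deprel": "spec"},
    {"form": "película", "head": 4, "deprel": "suj"},
    {"form": "no", "head": 4, "deprel": "mod"},
    {"form": "es", "head": 0, "deprel": "sentence"},
    {"form": "muy", "head": 6, "deprel": "spec"},
    {"form": "buena", "head": 4, "deprel": "atr"},
]

scope = negation_scope(TOKENS, 3, subjective={6})
print(scope, shift(2.5, SHIFT["no"]))  # [6] -1.5
```

In the example, 'es' carries no sentiment, so the first rule does not fire and the subject complement branch headed by 'buena' is selected. Taking 2.5 as the already intensified SO of that branch, the shift of 4 yields -1.5.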

A SUPERVISED SYSTEM BASED ON LINGUISTIC FEATURES
We now propose a hybrid system which combines lexical, syntactic and semantic knowledge with ML techniques. In particular, linguistic features are used to feed an SMO, an implementation of SVM presented in Platt (1999) and included by default in the WEKA data mining software.

Base supervised system (BSS)
We include the SO obtained by our unsupervised system, and the number of positive and negative words in a text, as features for a supervised classifier. We use the SODictionariesV1.11Spa to determine which words are opinionated.
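A minimal sketch of the resulting three-dimensional feature vector is shown below. The toy lexicon reuses the three example entries quoted earlier for the SO dictionaries; the function name, the token list and the SO value passed in are hypothetical, and the SO of the text is assumed to have been computed beforehand by the unsupervised system.

```python
# Toy lexicon built from the example entries quoted for the SO dictionaries.
LEXICON = {"happiness": 5, "killer": -5, "good": 2}


def bss_features(tokens, text_so, lexicon=LEXICON):
    """Return [SO of the text, number of positive words, number of negative words]."""
    positive = sum(1 for token in tokens if lexicon.get(token, 0) > 0)
    negative = sum(1 for token in tokens if lexicon.get(token, 0) < 0)
    return [text_so, positive, negative]


# The SO value 4.0 is an invented output of the unsupervised system.
print(bss_features(["good", "happiness", "killer", "movie"], text_so=4.0))
```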

Lexico-syntactic features (LSF)
The use of PoS-tagging information in polarity classification tasks is a widely discussed issue. Pak and Paroubek (2010) suggest that certain PoS tags, such as adjectives or personal pronouns, are more frequent in subjective texts. We observed a similar tendency in the training sets employed in this paper. Table 1 shows a selection of relevant tag frequencies. In the same way, we hypothesise that dependency types are also useful for classifying the polarity of tweets. Table 2 shows the frequency of some dependency types on the HOpinion and TASS 2012 training sets.

Specific domain features (SDF)
Each domain and social medium has some specific elements that denote polarity (implicitly or explicitly). We rank candidate terms by measuring their information gain with respect to the class, employing the attribute selection tools provided by WEKA and the respective training set. We extracted around 14,000 discriminating terms for the TASS 2012 corpus and about 40,000 in the case of HOpinion. However, we saw in both cases that only a few hundred terms were needed to achieve the best performance. Table 3 compares the rank, between HOpinion and the TASS 2012 corpus, of some discriminating terms.
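To make the ranking criterion concrete, the sketch below computes the information gain of a term from its presence or absence in each document. It is a stand-alone illustration that stands in for WEKA's attribute selection and is not a reproduction of it; the function names and the four-document corpus are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting on the presence of `term`."""
    with_term = [label for doc, label in zip(docs, labels) if term in doc]
    without_term = [label for doc, label in zip(docs, labels) if term not in doc]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


def rank_terms(docs, labels):
    """Sort the vocabulary by decreasing information gain (ties alphabetically)."""
    vocabulary = set().union(*docs)
    return sorted(vocabulary,
                  key=lambda term: (-information_gain(docs, labels, term), term))


# Invented toy corpus: each document is the set of terms it contains.
DOCS = [{"great", "hotel"}, {"great", "room"}, {"dirty", "hotel"}, {"dirty", "room"}]
LABELS = ["pos", "pos", "neg", "neg"]
print(rank_terms(DOCS, LABELS))
```

In the toy corpus, 'great' and 'dirty' separate the classes perfectly and rank first, while 'hotel' and 'room' appear equally often in both classes and carry no information.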

EXPERIMENTAL RESULTS
Tables 4 and 5 show the performance on the HOpinion and TASS 2012 test sets. In both corpora, the unsupervised system (US) obtains good accuracy, but it performs worse on negative texts, especially on the HOpinion corpus. This tendency to favour positive classifications is an issue widely discussed in the literature (Brooke et al. (2009)). Adding the SO and the total number of positive and negative words (+BSS) has a positive effect on the performance, which suggests that the information provided by our unsupervised approach is useful for a supervised classifier. The incorporation of PoS-tag and syntactic information (+LSF) improves the classification performance on positive and negative texts. This reinforces the idea that certain PoS tags and syntactic functions appear more frequently depending on the polarity of the review. The accuracy obtained by our final approach (+SDF) suggests that, although generic opinion lexicons and the morphosyntactic structure are helpful to classify the sentiment, we need to incorporate domain semantic knowledge to optimise the performance. Moreover, this final version is able to partially counteract the bias towards positive classifications present in the other versions. Finally, we compare our methods with a pure ML approach. We trained an SMO (keeping the WEKA default configuration) which takes as features the bag of words of a text. The texts are preprocessed as indicated in Section 2 (Natural language processing tasks) and lemmatised, and we then use binary occurrence as the weighting factor for tweets and total occurrence for reviews.
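The two weighting schemes used for the bag-of-words baseline differ only in whether repeated words are counted. A minimal sketch, with a hypothetical function name and an invented vocabulary and token list:

```python
from collections import Counter


def bow_features(tokens, vocabulary, binary):
    """Bag-of-words vector: binary occurrence (tweets) or total occurrence (reviews)."""
    counts = Counter(tokens)
    if binary:
        return [1 if counts[term] else 0 for term in vocabulary]
    return [counts[term] for term in vocabulary]


VOCABULARY = ["bad", "good", "ugly"]
TOKENS = ["good", "good", "bad"]
print(bow_features(TOKENS, VOCABULARY, binary=True))   # [1, 1, 0]
print(bow_features(TOKENS, VOCABULARY, binary=False))  # [1, 2, 0]
```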

CONCLUSIONS AND FUTURE WORK
We describe an unsupervised and a hybrid method based on linguistic knowledge. Experimental results suggest that both approaches satisfactorily perform sentiment analysis on reviews and microtexts, which reinforces the utility of employing lexico-syntactic knowledge in order to classify the polarity of opinions.
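In Tables 4 and 5, P (precision) is the number of true positives divided by the sum of the true positives and false positives, and R (recall) is the number of true positives divided by the sum of the true positives and false negatives. Fp and Fn refer to the F-measure for positive and negative opinions, respectively.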
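In symbols, writing TP, FP and FN for the number of true positives, false positives and false negatives of a given class, the measures read as follows. The balanced harmonic-mean form of the F-measure is assumed here, since the text does not state which variant was used.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F = \frac{2 \, P \, R}{P + R}
```

Under this assumption, Fp and Fn are obtained by evaluating F with the positive and the negative class, respectively, as the target class.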
As future work, we would like to incorporate more linguistic phenomena in our unsupervised method, such as irrealis or the subjunctive mood, as other systems do (Taboada et al. (2011)). In the same line, we think that expanding our treatment of negation to include more negation terms would have a positive effect. With regard to the supervised system, we would like to explore more thoroughly the use of syntactic knowledge as features for the classifier.

Figure 1: Dependency parsing for a Spanish sentence

Table 2: Dependency type frequencies in the training set

To treat the domain-specific elements discussed under 'Specific domain features (SDF)', we have developed an automatic mechanism that enriches and adapts semantic knowledge to a particular field. The goal is to create a ranked list of words that helps distinguish between the different polarities, and to use each word of that list as a feature for the classifier. We use binary occurrence as the weighting factor in the case of tweets, because we hypothesise that each word usually appears at most once in a tweet, and total occurrence in the case of long texts.

Table 3: Ranking of some discriminating terms on the training set of the TASS 2012 corpus

Table 4: Results on the test set of the HOpinion 2012

Table 5: Results on the test set of the TASS 2012