He Zhiyong: An Improved Sentiment Analysis Algorithm for Chinese News
News source: IR Lab       Published: 2016/3/28 11:18:44

An Improved Sentiment Analysis Algorithm for Chinese News

Yu Huangfu, Guoshi Wu, Yu Su

Jing Li, Pengfei Sun, Jie Hu

School of Software Engineering, Beijing University of Posts and Telecommunications, 100876, Beijing, China; Communication and Technical Bureau, Xinhua News Agency, 100803, Beijing, China

In recent years, news sentiment analysis has become a hotspot in natural language processing, and it remains a challenging problem. Current article-level sentiment analysis methods fall mainly into two categories: approaches based on semantic orientation and approaches based on machine learning.

Based on the characteristics of news, the authors' work is as follows:

① News sentiment analysis is divided into title sentiment analysis and text sentiment analysis.

② A rule set is applied to the title to recognize news with neutral sentiment.

③ For the text, subjective sentences are classified and subject words are recognized.

④ The sentiment tendency of the news is analyzed.

The method built on this work is called Improved Sentiment Analysis (ISA).

The other method is called General Sentiment Analysis (GSA).

Details are as follows:

1) Analysis of Neutral News

◆ Some news carries no sentiment; it is filtered out so that it does not distort the sentiment analysis result.

◆ Method: rule set

2) Subjective Sentence Recognition

◆ A subjective sentence usually carries some sentiment.

◆ The recognition algorithm is based on the N-POS model and the Bayes classification algorithm.

◆ Formula

● CHI statistics method:

● Bayes text classification method:

● Explanation

➢ Prior probability:

➢ Posterior probability:

➢ Subjective weight:
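
The formulas themselves do not appear in this post. For reference, the textbook forms of the CHI statistic and the Bayes classifier are sketched here. They are an assumption about what the authors used, not a transcription of the paper, and the symbols A, B, C, D, N, c, s and t_i are introduced only for illustration. The subjective weight is specific to the paper and is not reconstructed.

```latex
% CHI (chi-square) statistic of feature t with respect to class c
% A: sentences in c that contain t      B: sentences outside c that contain t
% C: sentences in c without t           D: sentences outside c without t
% N = A + B + C + D
\chi^2(t, c) = \frac{N\,(AD - CB)^2}{(A+C)(B+D)(A+B)(C+D)}

% Bayes classification of a sentence s with features t_1, ..., t_n,
% assuming the features are conditionally independent given the class
P(c \mid s) = \frac{P(c)\,\prod_{i=1}^{n} P(t_i \mid c)}{P(s)}
\qquad
\hat{c} = \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(t_i \mid c)
```

In this notation P(c) is the prior probability of the class (subjective or objective) and P(c | s) is the posterior probability after observing the sentence.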

3) Subject Word Recognition

For each subjective sentence, find the subject word that appears earliest in the sentence according to the subject-word dictionary, and record it.
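
The lookup described here can be sketched as a scan over the dictionary that keeps the match with the smallest position. This is a minimal illustration, not the authors' code: the function name, the plain substring matching, and the longest-match tie-break are assumptions.

```python
from typing import Optional, Set, Tuple


def find_first_subject_word(
    sentence: str, subject_words: Set[str]
) -> Optional[Tuple[str, int]]:
    """Return (word, position) for the dictionary word that occurs earliest.

    Hypothetical helper: matching is plain substring search, and ties on
    position are broken in favour of the longer word (an assumption).
    Returns None when no dictionary word occurs in the sentence.
    """
    best: Optional[Tuple[str, int]] = None
    for word in subject_words:
        if not word:
            continue  # ignore empty dictionary entries
        pos = sentence.find(word)
        if pos == -1:
            continue  # word does not occur in this sentence
        if (
            best is None
            or pos < best[1]
            or (pos == best[1] and len(word) > len(best[0]))
        ):
            best = (word, pos)
    return best
```

A production version would more likely match against segmented tokens than raw substrings, which avoids hits that straddle word boundaries.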

 

4) Calculating Title Sentiment Value

The title value takes into account sentiment words, and negation words that modify sentiment words, both found by dictionary matching.

Formula:
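
The formula is not shown in this post. As a rough illustration of dictionary matching with negation handling, the sketch below sums the polarity of matched sentiment words and flips the sign when an odd number of negation words appears just before one. The function name, the `window` parameter, the polarity values, and the sign-flip rule are all assumptions for illustration, not the authors' formula. The title is assumed to be segmented into words already.

```python
from typing import Dict, List, Set


def title_sentiment(
    tokens: List[str],
    polarity: Dict[str, float],
    negations: Set[str],
    window: int = 2,
) -> float:
    """Sum dictionary polarities over a segmented title.

    Hypothetical scoring rule: a sentiment word's polarity is negated when
    an odd number of negation words occurs within `window` tokens before it.
    """
    score = 0.0
    for i, token in enumerate(tokens):
        if token not in polarity:
            continue  # not a sentiment word
        value = polarity[token]
        # Count negation words in the few tokens preceding the sentiment word.
        start = max(0, i - window)
        n_neg = sum(1 for t in tokens[start:i] if t in negations)
        if n_neg % 2 == 1:
            value = -value  # an odd number of negations reverses the polarity
        score += value
    return score
```

A positive return value suggests a positive title and a negative one a negative title; how the paper scales or thresholds this value is not stated here.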

5) Calculating Text Sentiment Value

The text value takes into account subjective words, sentiment words and negation words.

Formula:

6) Weight Algorithm

Logistic regression hypothesis function:

7) Conclusion

                  

If the result is larger than 0.5, the news is positive; otherwise the news is negative.
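
The weighting and decision steps can be sketched with the standard logistic regression hypothesis function h(x) = 1 / (1 + e^(-z)), where z is a weighted sum of the input features plus a bias. The post does not show which features or weights the authors used, so the feature vector (for example a title value and a text value) and the `theta` values below are assumptions; in practice the weights would be fitted on annotated news.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def classify_news(
    features: Sequence[float], theta: Sequence[float], threshold: float = 0.5
) -> str:
    """Label news as positive or negative from its sentiment features.

    theta[0] is the bias term and theta[1:] are the feature weights
    (hypothetical values, not taken from the paper).
    """
    if len(theta) != len(features) + 1:
        raise ValueError("theta must hold one bias plus one weight per feature")
    z = theta[0] + sum(w * x for w, x in zip(theta[1:], features))
    # Strictly larger than the threshold means positive, as in the rule above.
    return "positive" if sigmoid(z) > threshold else "negative"
```

With `theta = [0.0, 1.0, 1.0]`, for instance, the title and text values are weighted equally and a result of exactly 0.5 falls on the negative side.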

Experiment and Data

◆ Data:

News of Xinhua News Agency

◆ Data preprocessing:

● News items were selected at random and annotated manually:

   Neutral sentiment news     144

   Positive sentiment news    300

   Negative sentiment news     46

◆ Result:

[Image: result tables for ISA and GSA (blob.png)]

Summary:

As can be seen from the two tables above, the Improved Sentiment Analysis (ISA) method is more accurate than the General Sentiment Analysis (GSA) method.