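- 04-13 Li Honglei: A sequence labeling method based on transfer learning and deep neural networks
- 04-07 Liu Xiaoxia: From node embedding to community embedding
- 03-17 Qian Lingfei: An LSTM-based neural network method for Chinese word segmentation
- 12-23 Xu Bo: A two-stage feature selection method for efficient supervised query expansion
- 11-04 Li Honglei: Reinforcement Learning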
An Improved Sentiment Analysis Algorithm for Chinese News
Yu Huangfu, Guoshi Wu, Yu Su
Jing Li, Pengfei Sun, Jie Hu
School of Software Engineering, Beijing University of Posts and Telecommunications, 100876, Beijing, China; Communication and Technical Bureau, Xinhua News Agency, 100803, Beijing, China
In recent years, news sentiment analysis has become a hotspot in natural language processing, and it remains a challenging problem. Current article-level sentiment analysis methods fall into two main categories: approaches based on semantic orientation and approaches based on machine learning.
Based on the characteristics of news, the authors' work is as follows:
① News sentiment analysis is divided into title sentiment analysis and text sentiment analysis.
② A rule set is applied to the title to recognize neutral-sentiment news.
③ For the text, subjective sentences are classified and subject words are recognized.
④ The sentiment tendency of the news is analyzed.
The method built on this work is called Improved Sentiment Analysis (ISA).
The method it is compared against is called General Sentiment Analysis (GSA).
Details are as follows:
1) Analysis of Neutral News
- Some news carries no sentiment; it is filtered out so that it does not mislead the sentiment analysis result.
- Method: rule set
2) Subjective Sentence Recognition
- A subjective sentence usually carries a certain sentiment.
- The recognition algorithm is based on the N-POS model and the Bayes classification algorithm.
  - CHI statistics method:
  - Bayes text classification method:
    Prior probability:
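The slide formulas for this step are not reproduced here. As a rough illustration only, the sketch below uses the standard textbook CHI (chi-square) statistic to score a feature against a class, and a multinomial naive Bayes classifier with add-one smoothing. The function and class names, the smoothing choice, and the POS n-gram feature strings (such as `"d_a"`) are assumptions made for this example, not the authors' exact formulation.

```python
import math
from collections import Counter, defaultdict


def chi_square(a, b, c, d):
    """Textbook CHI statistic of one feature for one class.

    a: in-class sentences containing the feature
    b: out-of-class sentences containing the feature
    c: in-class sentences without the feature
    d: out-of-class sentences without the feature
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for features, label in zip(docs, labels):
            self.feature_counts[label].update(features)
            self.vocab.update(features)
        self.total = len(labels)
        return self

    def predict(self, features):
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior: share of training sentences that belong to this class.
            score = math.log(count / self.total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features:
                if f in self.vocab:  # features never seen in training are skipped
                    score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical POS n-gram features; the tag names are illustrative only.
train_docs = [["d_a", "a_n"], ["d_a"], ["n_v"], ["n_v", "v_n"]]
train_labels = ["subjective", "subjective", "objective", "objective"]
model = NaiveBayes().fit(train_docs, train_labels)
```

In a pipeline like this, features would typically be ranked by `chi_square` first, and only the top-scoring ones passed to the classifier.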
3) Subject Word Recognition
For each subjective sentence, find the subject word that appears earliest in the sentence according to the subject-word dictionary, and record it.
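The dictionary lookup described above can be sketched as follows. This is a minimal illustration that assumes plain substring matching; the function name `first_subject_word` and the tie-breaking rule (prefer the longer entry when two entries start at the same position) are assumptions, since the slides do not specify them.

```python
def first_subject_word(sentence, subject_words):
    """Return (word, position) for the dictionary entry that occurs earliest
    in the sentence, or None if no entry occurs at all."""
    best = None
    for word in subject_words:
        pos = sentence.find(word)
        if pos == -1:
            continue
        earlier = best is None or pos < best[1]
        # Assumed tie-break: same start position, longer entry wins.
        longer_at_same_pos = (
            best is not None and pos == best[1] and len(word) > len(best[0])
        )
        if earlier or longer_at_same_pos:
            best = (word, pos)
    return best
```

Substring matching suits unsegmented Chinese text; for pre-tokenized input, the same idea applies with a scan over the token list instead.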
4) Calculating Title Sentiment Value
Sentiment words, and the negation words that modify them, are identified by dictionary matching.
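One plausible way to turn the dictionary matches into a title score is sketched below. The slides do not give the scoring formula, so everything here is an assumption: the function name, the polarity values in the lexicon, the two-token look-back window, and the rule that each preceding negation word flips the sign once.

```python
def title_sentiment(tokens, sentiment_lexicon, negation_words, window=2):
    """Sum the polarity of each sentiment word, flipping its sign once for
    every negation word among the `window` tokens just before it."""
    score = 0.0
    for i, token in enumerate(tokens):
        polarity = sentiment_lexicon.get(token)
        if polarity is None:
            continue  # not a sentiment word
        preceding = tokens[max(0, i - window):i]
        negations = sum(1 for t in preceding if t in negation_words)
        score += polarity * (-1) ** negations
    return score


# Hypothetical dictionaries, for illustration only.
lexicon = {"good": 1.0, "bad": -1.0}
negations = {"not", "no"}
```

With these toy dictionaries, `title_sentiment(["not", "good"], lexicon, negations)` is negative, and a double negation cancels out.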
5) Calculating Text Sentiment Value
Subjective words, sentiment words, and negation words are considered.
6) Weight Algorithm
Logistic regression hypothesis function:
If the result is larger than 0.5, the news is positive; otherwise the news is negative.
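The decision rule above can be sketched with the standard logistic (sigmoid) hypothesis, h(x) = 1 / (1 + e^(-z)), where z is a weighted sum of the inputs. Treating the title and text sentiment values as the two inputs is an assumption about how the weighting step combines them, and the weights, bias, and function names below are hypothetical placeholders rather than fitted parameters.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify_news(title_score, text_score, weights, bias=0.0):
    """Combine the two sentiment values and apply the 0.5 threshold."""
    z = bias + weights[0] * title_score + weights[1] * text_score
    p = sigmoid(z)
    return ("positive" if p > 0.5 else "negative"), p
```

In practice the weights would be learned from the annotated news rather than set by hand.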
Experiment and Data
News of Xinhua News Agency
- Data preprocessing:
  - News items were randomly selected and manually annotated:
    Neutral sentiment news: 144
    Positive sentiment news: 300
    Negative sentiment news: 46
As can be seen from the above two tables, the Improved Sentiment Analysis (ISA) method is more accurate than the General Sentiment Analysis (GSA) method.