An Improved Sentiment Analysis Algorithm for Chinese News
Yu Huangfu, Guoshi Wu, Yu Su
Jing Li, Pengfei Sun, Jie Hu
School of Software Engineering, Beijing University of Posts and Telecommunications, 100876, Beijing, China; Communication and Technical Bureau, Xinhua News Agency, 100803, Beijing, China
In recent years, news sentiment analysis has become a hot topic in natural language processing, and it remains a challenging problem. Current article-level sentiment analysis methods fall into two main categories: approaches based on semantic orientation and approaches based on machine learning.
Based on the characteristics of news, the authors' work is as follows:
① Dividing news sentiment analysis into title sentiment analysis and text sentiment analysis
② Applying a rule set to the title to recognize news with neutral sentiment
③ For the text, classifying subjective sentences and recognizing the subject word
④ Analyzing the sentiment tendency of the news
The method built on this work is called Improved Sentiment Analysis (ISA).
The method it is compared against is called General Sentiment Analysis (GSA).
Details are as follows:
1) Analysis of Neutral News
- Some news carries no sentiment; it is filtered out so that it does not mislead the sentiment analysis result.
- Method: rule set
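The rules themselves are not listed in these notes. As a minimal sketch only, the filter could be written as below, assuming two hypothetical rules: an announcement-style title pattern, and the absence of any sentiment word in the title. The lexicon, the patterns, and the function name `is_neutral_title` are illustrative assumptions rather than the authors' rule set, and English tokens stand in for segmented Chinese words.

```python
import re

# Hypothetical sentiment lexicon; the dictionary used in the paper is not given here.
SENTIMENT_WORDS = {"praise", "growth", "success", "crisis", "death", "scandal"}

# Hypothetical pattern rules for announcement-style titles.
NEUTRAL_PATTERNS = [
    re.compile(r"^(notice|schedule|list) of\b", re.IGNORECASE),
]

def is_neutral_title(title: str) -> bool:
    """Return True if any (assumed) neutral rule fires for the title."""
    # Rule 1: the title matches an announcement-style pattern.
    if any(p.search(title) for p in NEUTRAL_PATTERNS):
        return True
    # Rule 2: the title contains no word from the sentiment lexicon.
    tokens = re.findall(r"[a-z]+", title.lower())
    return not any(t in SENTIMENT_WORDS for t in tokens)
```

News whose title is flagged this way would be labeled neutral and skipped by the later steps.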
2) Subjective Sentence Recognition
- A subjective sentence usually carries a certain sentiment.
- The recognition algorithm is based on the N-POS model and the Bayes classification algorithm.
  - CHI statistics method:
  - Bayes text classification method:
    Prior probability:
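The two components named above can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact formulation: "N-POS" is read as n-grams over part-of-speech tags, the CHI statistic is the textbook form χ²(t, c) = N(AD − CB)² / ((A+C)(B+D)(A+B)(C+D)) computed from a 2×2 contingency table, and the classifier is a multinomial naive Bayes with add-one smoothing. The names `chi_square`, `pos_ngrams`, and `NaiveBayes` are hypothetical.

```python
import math
from collections import Counter

def chi_square(a: int, b: int, c: int, d: int) -> float:
    """CHI statistic of feature t for a class.
    a: in-class samples containing t,     b: out-of-class samples containing t,
    c: in-class samples without t,        d: out-of-class samples without t."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def pos_ngrams(tags, n=2):
    """Turn a POS tag sequence into n-gram features (assumed reading of N-POS)."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = {c: Counter() for c in self.class_counts}
        for feats, label in zip(samples, labels):
            self.feat_counts[label].update(feats)
        self.vocab = {f for cnt in self.feat_counts.values() for f in cnt}
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_samples in self.class_counts.items():
            # Prior probability of the class, estimated from class frequencies.
            logp = math.log(n_samples / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for f in feats:
                # Smoothed conditional probability of the feature given the class.
                logp += math.log((self.feat_counts[label][f] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label
```

In such a setup the CHI score would typically be used to keep only the most class-discriminative POS n-grams before training the classifier.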
3) Subject Word Recognition
For each subjective sentence, find the subject word that occurs earliest in the sentence according to the subject-word dictionary, and record it.
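A minimal sketch of this lookup, assuming the sentence is already segmented into tokens and the subject-word dictionary is available as a set; the function name `find_subject_word` and the sample words are hypothetical.

```python
from typing import Iterable, Optional, Set

def find_subject_word(tokens: Iterable[str], subject_words: Set[str]) -> Optional[str]:
    """Return the first token that appears in the subject-word dictionary, or None."""
    for token in tokens:
        if token in subject_words:
            return token
    return None

# Hypothetical dictionary and sentence, for illustration only.
subject_dict = {"government", "company", "market"}
sentence = ["yesterday", "the", "company", "told", "the", "market"]
subject = find_subject_word(sentence, subject_dict)  # "company"
```

Scanning left to right and stopping at the first hit is what selects the earliest position.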
4) Calculating Title Sentiment Value
Sentiment words, and the negation words that modify them, are identified by dictionary matching.
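The scoring formula is not given in these notes. One plausible sketch, assuming each sentiment word has a signed score in a lexicon and that an odd number of negation words within a small window before it flips the sign, is shown below. The window size, the lexicon values, and the name `sentiment_value` are assumptions for illustration.

```python
def sentiment_value(tokens, lexicon, negations, window=2):
    """Sum signed lexicon scores, flipping a score when an odd number of
    negation words appears in the `window` tokens before the sentiment word."""
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in lexicon:
            continue
        score = lexicon[token]
        preceding = tokens[max(0, i - window):i]
        n_negations = sum(1 for t in preceding if t in negations)
        if n_negations % 2 == 1:
            score = -score
        total += score
    return total

# Hypothetical dictionaries, for illustration only.
LEXICON = {"good": 1.0, "bad": -1.0}
NEGATIONS = {"not", "no"}
```

For example, `sentiment_value(["not", "good"], LEXICON, NEGATIONS)` yields a negative value because the negation flips the positive word.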
5) Calculating Text Sentiment Value
Subject words, sentiment words, and negation words are taken into account.
6) Weight Algorithm
Logistic regression hypothesis function:
If the result is larger than 0.5, the news is positive; otherwise the news is negative.
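The standard logistic regression hypothesis is h_θ(x) = 1 / (1 + e^(−θᵀx)). The sketch below applies it under the assumption that the feature vector consists of a bias term, the title sentiment value, and the text sentiment value; the exact features and the learned weights are not given in these notes, so `theta` here is a made-up example.

```python
import math

def hypothesis(theta, x):
    """Logistic regression hypothesis: sigmoid of the dot product theta . x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def classify(title_value, text_value, theta):
    """Combine title and text sentiment values (assumed features) into a label."""
    x = [1.0, title_value, text_value]  # leading 1.0 is the bias term
    return "positive" if hypothesis(theta, x) > 0.5 else "negative"

# Made-up weights for illustration; real weights would be fitted on annotated news.
theta = [0.0, 1.0, 1.0]
label = classify(1.0, 2.0, theta)  # "positive"
```

The weights determine how much the title counts relative to the body text, which is why this step is described as a weight algorithm.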
Experiment and Data
News of Xinhua News Agency
- Data preprocessing:
  - Randomly selected news articles were manually annotated:
    Neutral sentiment news: 144
    Positive sentiment news: 300
    Negative sentiment news: 46
As can be seen from the above two tables, the Improved Sentiment Analysis (ISA) method is more accurate than the General Sentiment Analysis (GSA) method.