Corresponding institution:
[Yang, Xiao-Hua] U;Univ South China, Sch Comp Sci & Technol, Hengyang 421001, Peoples R China.
Conference name:
The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)
Conference date:
2016-12-02
Conference location:
Kunming
Conference organizer:
Kunming Univ Sci & Technol
Proceedings title:
Proceedings of the 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL 2016)
Keywords:
Feature selection;Information gain;Relative document frequency distribution;Low-frequency characteristic
Abstract:
Feature selection plays an important role in text categorization. To address drawbacks of the traditional information gain (IG) approach and of recently improved variants, an improved IG feature selection method based on relative document frequency distribution is proposed. The method reduces the impact of unbalanced data sets and low-frequency features, and takes into account both the frequency distribution of a feature within a category and its relative document frequency distribution across categories. Experimental results on the NLPCC-ICCPOL 2016 task of stance detection in Chinese microblogs show that the improved method outperforms both the traditional IG approach and another improved method in feature selection.
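For reference, the traditional IG baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard information gain of a term with respect to class labels (class entropy minus the class entropy conditioned on the term's presence or absence). It is not the improved method of the paper, whose weighting by relative document frequency distribution is not specified in this record. The function names and the toy documents are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def class_entropy(indices, labels, classes):
    """Entropy of the class distribution over the documents in `indices`."""
    if not indices:
        return 0.0
    m = len(indices)
    return entropy([sum(1 for i in indices if labels[i] == c) / m for c in classes])


def information_gain(docs, labels, term):
    """Traditional IG of `term`: H(C) - [P(t) H(C|t) + P(not t) H(C|not t)].

    docs   : list of sets of terms, one set per document
    labels : list of class labels, aligned with docs
    """
    n = len(docs)
    classes = sorted(set(labels))
    with_t = [i for i in range(n) if term in docs[i]]
    without_t = [i for i in range(n) if term not in docs[i]]
    prior = class_entropy(list(range(n)), labels, classes)
    conditional = (len(with_t) / n) * class_entropy(with_t, labels, classes) \
        + (len(without_t) / n) * class_entropy(without_t, labels, classes)
    return prior - conditional


# Hypothetical toy corpus, for illustration only.
toy_docs = [{"a", "b"}, {"a"}, {"c"}, {"c", "b"}]
toy_labels = ["pos", "pos", "neg", "neg"]
ranked = sorted({"a", "b", "c"},
                key=lambda t: information_gain(toy_docs, toy_labels, t),
                reverse=True)
```

In this toy corpus a term that occurs in only one class receives the full one bit of gain, while a term spread evenly over both classes receives none, so ranking by this score places the evenly spread term last.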
Authors:
Chunping, Ouyang;Yongbin, Liu;Shuqing, Zhang;Xiaohua, Yang
Journal:
International Journal of Database Theory and Application, 2015, 8(6): 1-12, ISSN: 2005-4270
Corresponding author:
Yongbin, Liu(qingbinliu@163.com)
Author affiliations:
[Yongbin, Liu; Shuqing, Zhang; Chunping, Ouyang; Xiaohua, Yang] School of Computer Science and Technology, University of South China, Hunan Hengyang, China
Authors:
Chunping, Ouyang;Lingyun, Luo;Shuqing, Zhang;Xiaohua, Yang
Journal:
International Journal of Database Theory and Application, 2014, 7(6): 191-202, ISSN: 2005-4270
Author affiliations:
[Lingyun, Luo; Shuqing, Zhang; Chunping, Ouyang; Xiaohua, Yang] School of Computer Science and Technology, University of South China, Hunan Hengyang, China
Abstract:
Currently, most sentiment analysis of microblogs focuses on coarse-grained sentiment, but fine-grained sentiment better reflects public opinion on topics of social focus. Therefore, a hybrid strategy combining Naïve Bayes and two-layer CRFs is put forward and applied to the fine-grained sentiment analysis of Chinese microblogs. First, microblogs are classified into two types, sentiment and non-sentiment, using the Naïve Bayes classification algorithm. Then the first-layer CRFs model is built for topic emotional sentences. Finally, the CRFs algorithm is used again to perform multi-class classification, assigning each sentence a specific sentiment category. Experimental results show that the combination of Naïve Bayes and CRFs performs well in sentiment identification, and that coupling this combination with CRFs-based emotional sentence extraction is advantageous.
Journal:
International Journal of Multimedia and Ubiquitous Engineering, 2014, 9(11): 385-396, ISSN: 1975-0080
Corresponding author:
Wen, Zhou
Author affiliations:
[Yu, Ying; Ouyang, Chunping; Liu, Zhiming; Wen, Zhou; Yang, Xiaohua] School of Computer Science and Technology, University of South China, Hunan Hengyang, China
Corresponding institution:
School of Computer Science and Technology, University of South China, Hunan Hengyang, China
Abstract:
We studied the electronic communication of knowledge users collaborating on a community and found that their work and interactions were mediated by the use of tag. Drawing on these, we found social ta
Abstract:
Using the genre perspective, we studied the electronic communication of knowledge users collaborating in a movie community and found that their work and interactions were mediated by the use of genres. Drawing on these findings, we develop the concept of genre repertoire to designate the set of genres enacted by groups, organizations, or communities to accomplish their work. Automatic classification of discourse by genre in social information sharing, transfer, and knowledge communication provides a higher level of service quality. By investigating user behavior in the movie community, we studied the relationship between the intertextuality of discourse genres and user behavior. We represent each genre as a vector and measure discourse genre intertextuality intensity by vector distance. For discourse whose genre is unknown, genre intertextuality is calculated from user behavior. The results show that the stickiness of various user behaviors in the movie community and discourse genre intertextuality intensity have potential common features.
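The idea of measuring intertextuality intensity by the distance between genre vectors can be illustrated with a small sketch. The abstract does not state which distance the authors use or how genre vectors are built, so the Euclidean and cosine distances below, the function names, and the example vectors are all assumptions chosen for illustration.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical genre vectors: each component is an assumed weight of one
# genre in a piece of discourse (not data from the paper).
review_a = [0.7, 0.2, 0.1]
review_b = [0.6, 0.3, 0.1]
review_c = [0.0, 0.1, 0.9]

near = cosine_distance(review_a, review_b)  # similar genre mix
far = cosine_distance(review_a, review_c)   # different genre mix
```

Under either measure, a smaller distance would correspond to a closer genre relationship between two pieces of discourse. Whether the paper maps small distances to high or low intertextuality intensity is not stated in this record.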