Vol 7, No 7 (2016), Electrical, Electronics and Computer Engineering

A Social Network Newsworthiness Filter Based on Topic Analysis

Chaluemwut Noyunsan, Tatpong Katanyukul, Yuqing Wu, Kanda Runapongsa Saikaew



Assessing the trustworthiness of social media posts is increasingly
important as the number of online users and activities grows. Currently
deployed assessment systems measure post trustworthiness as credibility.
However, they measure the credibility of all posts indiscriminately. The
credibility concept was intended for news-type posts. Labeling other types
of posts with credibility scores may confuse users. Previous notable works
envisioned filtering out non-newsworthy posts before credibility assessment as
a key factor towards a more efficient credibility system. Thus, we propose
a topic-based supervised learning approach that uses Term
Frequency-Inverse Document Frequency (TF-IDF) and cosine similarity to
filter out posts that do not need credibility assessment. Our
experimental results show that users agree with about 70% of the proposed
filtering suggestions. Such results support the notion of newsworthiness,
introduced in the pioneering work of credibility assessment. The topic-based
supervised learning approach is shown to provide a viable social network

Keywords: Credibility measurement; Social media analysis; Topic analysis
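
The abstract names TF-IDF and cosine similarity as the building blocks of the filter but does not give implementation details. The following is a minimal sketch of how such a filter could work, not the authors' implementation. It assumes whitespace tokenization, a smoothed IDF, and a nearest-labeled-post decision rule; the function names (`fit_idf`, `tfidf`, `cosine`, `is_newsworthy`) and the toy posts are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system
    # would likely use a language-specific tokenizer.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency computed over the labeled posts.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(text, idf):
    # Sparse TF-IDF vector; terms never seen in the labeled posts are dropped.
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_newsworthy(post, labeled, idf):
    # Assumption: the post takes the label of its most similar labeled post,
    # and a post with no vocabulary overlap is treated as not newsworthy.
    vec = tfidf(post, idf)
    best_label, best_sim = False, 0.0
    for text, label in labeled:
        sim = cosine(vec, tfidf(text, idf))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


# Hypothetical labeled posts: True means the post needs credibility assessment.
labeled = [
    ("earthquake hits city officials confirm damage", True),
    ("government announces new flood warning for the region", True),
    ("had a great lunch with friends today", False),
    ("so tired after the gym this morning", False),
]
idf = fit_idf([text for text, _ in labeled])

print(is_newsworthy("officials confirm flood damage in the city", labeled, idf))
print(is_newsworthy("lunch with friends at the gym", labeled, idf))
```

Posts for which `is_newsworthy` returns `False` would be skipped before credibility scoring. The nearest-post rule is only one possible design; comparing a post against one aggregated vector per topic is an equally plausible reading of "topic-based".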


