WSEAS Transactions on Computers


Print ISSN: 1109-2750
E-ISSN: 2224-2872

Volume 16, 2017



Detection and Tracking of Real-World Events from Online Social Media User Data Using Hierarchical Agglomerative Clustering Based System

AUTHORS: Amjad Jumaah Frhan

ABSTRACT: Event detection and tracking has long been an efficient automation strategy. Detecting significant real-world events from a given database or document collection using past knowledge has attracted considerable research interest in recent years. A large body of research focuses on using social media data such as updates, status messages, and shared pictures to identify the occurrence of events. The most popular events of environmental, political, cultural, or everyday importance are detected and tracked for various applications all over the world. However, detecting common events from social media content requires efficient strategies, because the size of the content and the number of users are large, which leads to a large volume of data to process. To overcome the limitations of existing event detection schemes, this paper presents a new approach named Event WebClickviz. The model visualizes the user data and then analyses the similarity between data items to detect events. Event detection is first treated as a clustering problem, since clustering algorithms yield the best results. Named Entity Recognition with Topical PageRank is employed to extract the key terms in the texts, while temporal sequences of real values are estimated to build the event sequences. Features are extracted by applying the concept of sentiment analysis using term frequency–inverse document frequency (TF-IDF). Based on these features, the content is clustered using the Hierarchical Agglomerative Clustering algorithm. Events are thus detected with high efficiency and visualized more effectively using the proposed model. The simulation results support the performance of the proposed Event WebClickviz.

KEYWORDS: Event detection, visualization, Named Entity recognition, Topical PageRank, Hierarchical Agglomerative clustering, term frequency–inverse document frequency, online social networks, WebClickviz, clickstream
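
The clustering step named in the abstract (TF-IDF features grouped by hierarchical agglomerative clustering) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the whitespace tokenizer, the smoothed IDF formula, cosine similarity, average linkage, the 0.2 stopping threshold, the sample posts, and the function names are all assumptions made for illustration, and the Named Entity Recognition, Topical PageRank, and sentiment analysis stages are omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight strictly positive.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def agglomerative_clusters(vectors, threshold):
    """Bottom-up clustering with average linkage (assumed criterion).

    Starts with one cluster per document and repeatedly merges the most
    similar pair until the best similarity falls below `threshold`.
    """
    clusters = [[i] for i in range(len(vectors))]
    sim = [[cosine(u, v) for v in vectors] for u in vectors]

    def linkage(a, b):
        # Mean pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best_score, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = linkage(clusters[x], clusters[y])
                if score > best_score:
                    best_score, best_pair = score, (x, y)
        if best_score < threshold:
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical posts standing in for social media content.
posts = [
    "earthquake hits city centre",
    "strong earthquake shakes city",
    "election results announced tonight",
    "election results delayed tonight",
]
clusters = agglomerative_clusters(tfidf_vectors(posts), threshold=0.2)
print(clusters)
```

Each resulting cluster is a list of post indices that the sketch treats as one candidate event. Stopping at a similarity threshold rather than at a fixed cluster count is a design choice made here because the number of events is not known in advance.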

WSEAS Transactions on Computers, ISSN / E-ISSN: 1109-2750 / 2224-2872, Volume 16, 2017, Art. #41, pp. 355-365


Copyright © 2017 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0
