WSEAS Transactions on Computers


Print ISSN: 1109-2750
E-ISSN: 2224-2872

Volume 16, 2017


A Model of Website Usage Visualization Estimated on Clickstream Data with Apache Flume Using Improved Markov Chain Approximation

AUTHORS: Amjad Jumaah Frhan

ABSTRACT: Visualizing website clickstream data is a pivotal process because it helps identify user preferences. It involves gathering, investigating, and reporting on the web pages that users view. Clickstream visualization is primarily employed by organizations that focus on learning user preferences and improving their products or services to maximize user satisfaction. Most existing visualization tools fall short of helping organizations achieve this goal. The Markov chain model is a commonly used method for developing data visualization tools. However, issues such as occlusion and the inability to provide a clear visualization display make these tools unreliable. This paper develops a visualization tool named WebClickviz that resolves these issues by improving the Markov chain modelling. A heuristic method based on the Kolmogorov–Smirnov distance and the maximum likelihood estimator is introduced to improve the clarity of the visualization display. These concepts are applied between the underlying distribution states to minimize the Markov distribution. The proposed model, WebClickviz, is implemented on Hadoop with Apache Flume, a distributed service for collecting and aggregating log data. Experiments conducted on an evaluation dataset show that the proposed model outperforms existing models, achieving higher visualization accuracy.

KEYWORDS: Clickstream data, Data Visualization, Hadoop, WebClickviz, Apache Flume, Markov chain, Kolmogorov–Smirnov distance, maximum likelihood estimator, heuristic approximation
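
ILLUSTRATION: The abstract names two building blocks, a maximum likelihood estimator for the Markov chain and the Kolmogorov–Smirnov distance between state distributions, without giving their formulas. The sketch below shows the textbook versions of both, not the paper's improved heuristic. It assumes a first-order chain over page names. The function names, the sample sessions, and the fixed ordering of states are hypothetical and chosen only for illustration. Note that the Kolmogorov–Smirnov distance on categorical pages depends on the ordering chosen for the states.

```python
from collections import defaultdict


def estimate_transitions(sessions):
    """Maximum likelihood estimate of first-order transition probabilities.

    For each observed page a, P(b | a) = count(a -> b) / count(a -> any).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count every consecutive pair of pages within one session.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1

    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


def ks_distance(p, q, states):
    """Largest absolute gap between two cumulative distributions.

    `states` fixes the order in which probabilities are accumulated
    (an assumption: pages have no natural order).
    """
    cum_p = cum_q = 0.0
    worst = 0.0
    for state in states:
        cum_p += p.get(state, 0.0)
        cum_q += q.get(state, 0.0)
        worst = max(worst, abs(cum_p - cum_q))
    return worst


# Hypothetical clickstream sessions (ordered page visits per user).
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "sports"],
]

transitions = estimate_transitions(sessions)
states = ["home", "news", "sports", "weather"]

# Compare the next-page distributions of two source pages.
distance = ks_distance(transitions["home"], transitions["news"], states)
print(transitions["home"])
print(distance)
```

A small distance between the next-page distributions of two pages suggests they lead users to similar destinations, which is one plausible way such a measure could be used to merge or simplify states before drawing them.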

WSEAS Transactions on Computers, ISSN / E-ISSN: 1109-2750 / 2224-2872, Volume 16, 2017, Art. #12, pp. 104-115


Copyright © 2017 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0
