WSEAS Transactions on Information Science and Applications


Print ISSN: 1790-0832
E-ISSN: 2224-3402

Volume 14, 2017


A Model for Web Workload Generation Based on Content Classification

AUTHORS: Carlos Marcelo Pedroso, Keiko Veronica Ono Fonseca

ABSTRACT: Web server performance is tightly bound to the workload the server has to support. Understanding the nature of the server workload is therefore particularly important in capacity planning and overload control of Web servers. Web performance analysis can be done a priori with a synthetically generated Web system workload; however, the results depend on the accuracy of this workload. In this paper, we propose and describe a workload-generation model based on group classification of Web server files according to their contents. This model, henceforth referred to as SURGE-CC (SURGE Content Classification), is an extension of the SURGE (Scalable URL Reference Generator) model. SURGE-CC is simple, easy to understand and, most important of all, can be readily customized for specific applications. The parameter settings in our model allow the influence of Web server contents on output load to be investigated from both a qualitative and a quantitative point of view. Results from a workload-generation tool based on our implementation of the model show the dependence of the workload on the nature of the server contents, the ability of the model to generate self-similar traffic, and the accuracy of the synthetic workload. The model was validated by careful statistical analysis of massive data from several servers, by computational simulations, and by comparison with results found in the literature. We point out some future applications of the SURGE-CC model and discuss the new lines of investigation opened by the novelty of our approach.

KEYWORDS: Performance, modeling, world wide web, workload generation

WSEAS Transactions on Information Science and Applications, ISSN / E-ISSN: 1790-0832 / 2224-3402, Volume 14, 2017, Art. #7, pp. 49-63


Copyright © 2017 Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0
