
Digital Library

of the European Council for Modelling and Simulation

 

Title:

Efficiency Analysis Of Resource Request Patterns In Classification Of Web Robots And Humans

Authors:

Grazyna Suchacka, Igor Motyka

Published in:

 

 

 

(2018). ECMS 2018 Proceedings, edited by: Lars Nolle, Alexandra Burger, Christoph Tholen, Jens Werner, Jens Wellhausen. European Council for Modeling and Simulation. doi: 10.7148/2018-0005

 

ISSN: 2522-2422 (ONLINE)

ISSN: 2522-2414 (PRINT)

ISSN: 2522-2430 (CD-ROM)

 

32nd European Conference on Modelling and Simulation,

Wilhelmshaven, Germany, May 22nd – May 25th, 2018

 

 

Citation format:

Grazyna Suchacka, Igor Motyka (2018). Efficiency Analysis Of Resource Request Patterns In Classification Of Web Robots And Humans. ECMS 2018 Proceedings, edited by: Lars Nolle, Alexandra Burger, Christoph Tholen, Jens Werner, Jens Wellhausen. European Council for Modeling and Simulation. doi: 10.7148/2018-0475

DOI:

https://doi.org/10.7148/2018-0475

Abstract:

The paper addresses the problem of classifying Web traffic generated by robots and humans on e-commerce websites. Owing to the continuing proliferation and specialization of bots, a large body of research has characterized and sought to recognize their traffic. In particular, several approaches to classifying bot and human sessions on websites have been proposed in the literature. In this paper we verify and discuss the efficiency of one such recently proposed approach, which exploits differences in the resource request patterns of bots and humans. We reconstructed Web sessions from actual HTTP log data for three different e-commerce sites, which vary in traffic intensity and in the proportion of bot sessions in the overall traffic. Two heuristic procedures for labeling sessions as driven by a bot or a human were proposed and implemented. Resource request patterns for both session classes, under both session labeling procedures, were analyzed, and their potential to differentiate between bot and human sessions was investigated. The results show that the broader session labeling procedure captures more bot sessions and that resource request patterns are a good discriminant of bots and humans on e-commerce sites.

Full text: