|
Digital
Library of the European Council for Modelling and Simulation |
Title: |
Efficiency Analysis Of
Resource Request Patterns In Classification Of Web Robots And Humans |
Authors: |
Grazyna Suchacka,
Igor Motyka |
Published in: |
(2018). ECMS 2018
Proceedings Edited by: Lars Nolle, Alexandra
Burger, Christoph Tholen,
Jens Werner, Jens Wellhausen European Council for
Modeling and Simulation. doi:
10.7148/2018-0005 ISSN:
2522-2422 (ONLINE) ISSN:
2522-2414 (PRINT) ISSN:
2522-2430 (CD-ROM) 32nd European Conference on Modelling and Simulation, Wilhelmshaven, Germany, May 22nd
– May 25th, 2018 |
Citation
format: |
Grazyna Suchacka, Igor Motyka (2018). Efficiency Analysis Of Resource Request
Patterns In Classification Of Web Robots And Humans,
ECMS 2018 Proceedings Edited by: Lars Nolle,
Alexandra Burger, Christoph Tholen,
Jens Werner, Jens Wellhausen European Council for
Modeling and Simulation. doi:
10.7148/2018-0475 |
DOI: |
https://doi.org/10.7148/2018-0475 |
Abstract: |
The paper deals with the problem of classification
of Web traffic generated by robots and humans on e-commerce websites. Due to
the still-growing proliferation and specialization of bots, a large body of
research into characterization and recognition of their traffic has been
conducted so far. In particular, some approaches to classify bot and human sessions on websites have been proposed in
the literature. In this paper we verify and discuss the efficiency of one such
recently proposed approach, which uses differences in resource request
patterns of bots and humans. We reconstructed Web sessions from actual HTTP
log data for three different e-commerce sites, varying in traffic
intensity and in the proportion of bot sessions in the
overall traffic. Two heuristic procedures for labeling
sessions as driven by a bot or a human were
proposed and implemented. Resource request patterns for both session classes,
using both session labeling procedures, were
analyzed and their potential to differentiate between bot
and human sessions was investigated. Results show that the broader session labeling procedure allows one to capture more bot sessions and that resource request patterns are a
good discriminant of bots and humans on e-commerce
sites. |
Full
text: |