Digital Library of the European Council for Modelling and Simulation |
Title: |
Improving Clustering Of Web Bot And Human Sessions By Applying Principal Component Analysis |
Authors: |
Grazyna Suchacka |
Published in: |
(2019). ECMS 2019 Proceedings Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco, European Council for Modeling and Simulation.
DOI: http://doi.org/10.7148/2019
ISSN: 2522-2422 (ONLINE)
ISSN: 2522-2414 (PRINT)
ISSN: 2522-2430 (CD-ROM)
33rd International ECMS Conference on Modelling and Simulation, Caserta, Italy, June 11th – June 14th, 2019 |
Citation format: |
Grazyna Suchacka (2019). Improving Clustering Of Web Bot And Human Sessions By Applying Principal Component Analysis, ECMS 2019 Proceedings Edited by: Mauro Iacono, Francesco Palmieri, Marco Gribaudo, Massimo Ficco, European Council for Modeling and Simulation. doi: 10.7148/2019-0434 |
DOI: |
https://doi.org/10.7148/2019-0434 |
Abstract: |
The paper addresses the problem of modeling Web sessions of bots and legitimate users (humans) as feature vectors for use as input to classification models. So far, many different features for discriminating between bots’ and humans’ navigational patterns have been considered in session models, but very few studies have been devoted to feature selection and dimensionality reduction in the context of bot detection. We propose applying Principal Component Analysis (PCA) to develop improved session models based on predictor variables that are efficient discriminants of Web bots. The proposed models are used in session clustering, whose performance is evaluated in terms of the purity of the generated clusters. The efficiency of the proposed approach is experimentally verified using real server log data. Results show that PCA may be very efficient in dimensionality reduction and feature selection for session classification aimed at distinguishing Web robots. |
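The abstract does not say how cluster purity is computed. As a minimal sketch, assuming the commonly used definition (the share of sessions that belong to the majority class of their cluster), it could be calculated as follows. The function name `cluster_purity` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Return the fraction of sessions that fall in the majority class of their cluster.

    Assumption: this is the standard purity measure; the paper's exact
    formula is not given in the abstract.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one session is required")

    # Count how often each true label occurs inside each cluster.
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        counts[cluster][label] += 1

    # Add up the size of the majority class in every cluster.
    majority_total = sum(max(c.values()) for c in counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six sessions assigned to two clusters.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["bot", "bot", "human", "human", "human", "human"]
print(cluster_purity(clusters, labels))
```

A purity of 1.0 means every cluster contains sessions of a single class only. Because purity tends to rise as the number of clusters grows, it is best compared across session models that use the same number of clusters.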
Full text: |