Detection of unsolicited web browsing with clustering and statistical analysis

Chwalinski, Pawel (2014) Detection of unsolicited web browsing with clustering and statistical analysis. PhD thesis, Middlesex University. [Thesis]

PDF - Final accepted version (with author's formatting)
Download (1MB) | Preview


Unsolicited web browsing denotes illegitimate accessing or processing web content. The harmful activity varies from extracting e-mail information to downloading entire website for duplication. In addition, computer criminals prevent legitimate users from gaining access to websites by implementing a denial of service attack with high-volume legitimate traffic. These offences are accomplished by preprogrammed machines that avoid rate-dependent intrusion detection systems. Therefore, it is assumed in this thesis that the only difference between a legitimate and malicious web session is in the intention rather than physical characteristics or network-layer information. As a result, the main aim of this research has been to provide a method of malicious intention detection. This has been accomplished by two-fold process. Initially, to discover most recent and popular transitions of lawful users, a clustering method has been introduced based on entropy minimisation. In principle, by following popular transitions among the web objects, the legitimate users are placed in low-entropy clusters, as opposed to the undesired hosts whose transitions are uncommon, and lead to placement in high-entropy clusters. In addition, by comparing distributions of sequences of requests generated by the actual and malicious users across the clusters, it is possible to discover whether or not a website is under attack. Secondly, a set of statistical measurements have been tested to detect the actual intention of browsing hosts. The intention classification based on Bayes factors and likelihood analysis have provided the best results. The combined approach has been validated against actual web traces (i.e. datasets), and generated promising results.

Item Type: Thesis (PhD)
Research Areas: A. > School of Science and Technology
Item ID: 14411
Depositing User: Users 3197 not found.
Date Deposited: 20 Feb 2015 14:09
Last Modified: 29 Nov 2022 23:54

Actions (login required)

View Item View Item


Activity Overview
6 month trend
6 month trend

Additional statistics are available via IRStats2.