Modelling Human Behaviour Based on Similarity Measurements Between Event Sequences
Abstract
Behavioural patterns of individuals can be identified from sets of event sequences. The metadata available for these events can be processed into a weighted format that makes comparisons between sequences more meaningful. Identifying users' behavioural patterns is important in a number of areas, such as cybersecurity. This work examines the properties a cybersecurity dataset might contain and demonstrates the effectiveness of the approach on a dataset with those properties. Building on an existing sequence comparison method, the Damerau-Levenshtein distance, this work develops a pipeline of steps that transforms the metadata and integrates the resulting weighted format into the sequence comparison calculation. One of the most significant transformations applied to the metadata in this pipeline is based on previous work by Brand; it reduces the impact of high-popularity pairwise relationships. The pipeline is shown to incorporate the metadata information into the resulting distance values, producing meaningful changes that demonstrate the benefit of these extra steps.
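As an illustration of how a weighted format could enter a Damerau-Levenshtein style calculation, the following is a minimal sketch, not the pipeline developed in this work. It uses the optimal string alignment variant of the distance and lets the substitution cost come from a pairwise weight table. The function names, the event labels, and the weight values are all hypothetical, and the transformation based on Brand's work is not reproduced here.

```python
from typing import Callable, Hashable, Optional, Sequence


def weighted_osa_distance(
    a: Sequence[Hashable],
    b: Sequence[Hashable],
    sub_cost: Optional[Callable[[Hashable, Hashable], float]] = None,
) -> float:
    """Optimal string alignment variant of Damerau-Levenshtein distance.

    Insertions, deletions and adjacent transpositions cost 1.0.
    Substitutions cost sub_cost(x, y), which defaults to 0.0 for equal
    events and 1.0 otherwise.
    """
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0

    n, m = len(a), len(b)
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
            )
            # Adjacent transposition, e.g. "ab" -> "ba".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[n][m]


# Hypothetical pairwise weights: a lower value means the two events are
# treated as more alike. These numbers are assumptions for illustration only.
PAIR_WEIGHTS = {frozenset(("read", "write")): 0.4}


def metadata_sub_cost(x: Hashable, y: Hashable) -> float:
    """Substitution cost looked up from the hypothetical weight table."""
    if x == y:
        return 0.0
    return PAIR_WEIGHTS.get(frozenset((x, y)), 1.0)


seq_a = ["login", "read", "logout"]
seq_b = ["login", "write", "logout"]
unweighted = weighted_osa_distance(seq_a, seq_b)
weighted = weighted_osa_distance(seq_a, seq_b, metadata_sub_cost)
```

In this sketch the two sequences differ by one substitution, so the unweighted distance is one full edit, while the weighted distance takes the smaller cost assigned to the `read`/`write` pair. This is the kind of change in distance values that metadata-derived weights can introduce.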

