An interactive human centered data science approach towards crime pattern analysis

Qazi, Nadeem and Wong, B. L. William (2019) An interactive human centered data science approach towards crime pattern analysis. Information Processing & Management, 56 (6), 102066. ISSN 0306-4573 [Article] (doi:10.1016/j.ipm.2019.102066)

PDF - Final accepted version (with author's formatting)
Available under License Creative Commons Attribution-NonCommercial-NoDerivatives 4.0.


Traditional machine learning systems lack a pathway for humans to integrate their domain knowledge into the underlying machine learning algorithms. Using such systems in domains where decisions can have serious consequences (e.g. medical decision-making and crime analysis) requires incorporating human experts' domain knowledge. The challenge, however, is how to combine domain expert knowledge with machine learning algorithms effectively to develop models that support better decision making.

In crime analysis, the key challenge is to identify plausible linkages in unstructured crime reports for hypothesis formulation. Crime analysts painstakingly perform time-consuming searches of many different structured and unstructured databases to collate these associations, without any proper visualization support. To tackle these challenges and facilitate crime analysis, in this paper we examine unstructured crime reports through text mining to extract plausible associations. Specifically, we present an associative-questioning-based search model to elicit multi-level associations among crime entities. We couple this model with partition clustering to develop an interactive, human-assisted knowledge discovery and data mining scheme.

The proposed human-centered knowledge discovery and data mining scheme for crime text mining is able to extract plausible associations between crimes, identify crime patterns, group similar crimes, and elicit co-offender networks and suspect lists based on spatio-temporal and behavioral similarity. These similarities are quantified by calculating cosine, Jaccard, and Euclidean distances. Additionally, each suspect in the plausible suspect list is ranked by a similarity score. These associations are then visualized in a two-dimensional re-configurable crime cluster space along with a bipartite knowledge graph. The proposed scheme also examines the grand challenge of integrating effective human interaction with machine learning algorithms through a visualization feedback loop. It allows analysts to feed in their domain knowledge, including choosing similarity functions for identifying associations, dynamic feature selection for interactive clustering of crimes, and assigning weights to each component of the crime pattern to rank suspects for an unsolved crime.
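
As an illustration of how analyst-assigned weights and the three named measures might combine into a suspect ranking, the following is a minimal sketch and not the authors' implementation. The feature layout (`spatial`, `temporal`, `behavioral`), the function names, the conversion of Euclidean distance into a similarity, and the sample values are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def jaccard_similarity(a, b):
    """Jaccard similarity between two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def euclidean_similarity(a, b):
    """Map Euclidean distance into (0, 1]; this mapping is an assumption."""
    return 1.0 / (1.0 + math.dist(a, b))

def rank_suspects(unsolved, suspects, weights):
    """Rank suspects by a weighted average of per-component similarities.

    `weights` holds one analyst-chosen weight per crime pattern component.
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    ranked = []
    for name, pattern in suspects.items():
        sims = {
            "spatial": euclidean_similarity(unsolved["spatial"], pattern["spatial"]),
            "temporal": cosine_similarity(unsolved["temporal"], pattern["temporal"]),
            "behavioral": jaccard_similarity(unsolved["behavioral"], pattern["behavioral"]),
        }
        score = sum(weights[key] * sims[key] for key in sims) / total
        ranked.append((name, score))
    # Highest similarity score first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical data: coordinates, a time-of-day vector, and behavior tags.
unsolved = {"spatial": (0.0, 0.0), "temporal": (1.0, 0.0),
            "behavioral": {"rear window", "night"}}
suspects = {
    "S1": {"spatial": (0.0, 0.0), "temporal": (1.0, 0.0),
           "behavioral": {"rear window", "night"}},
    "S2": {"spatial": (3.0, 4.0), "temporal": (0.0, 1.0),
           "behavioral": {"front door"}},
}
weights = {"spatial": 1.0, "temporal": 1.0, "behavioral": 1.0}
ranking = rank_suspects(unsolved, suspects, weights)
```

Changing the values in `weights` and re-running `rank_suspects` mimics, in a very simplified way, the feedback loop in which the analyst adjusts the importance of each component.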

We demonstrate the proposed scheme through a case study using an anonymized burglary dataset. The scheme is found to facilitate human reasoning and analytic discourse for intelligence analysis.

Item Type: Article
Research Areas: A. > School of Science and Technology > Computer Science
Item ID: 31440
Notes on copyright: © 2019 Elsevier Ltd. This author's accepted manuscript version is made available under the CC-BY-NC-ND 4.0 license
Depositing User: William Wong
Date Deposited: 27 Nov 2020 10:45
Last Modified: 29 Nov 2022 18:47
