Investigating data repair steps for EHR Big Data

Juddoo, Suraj (2022) Investigating data repair steps for EHR Big Data. 2022 3rd International Conference on Next Generation Computing Applications (NextComp). In: IEEE Nextcomp 2022, 06-07 Oct 2022, Flic-en-Flac, Mauritius. [Conference or Workshop Item] (doi:10.1109/nextcomp55567.2022.9932167)


This paper builds on previous research with the aim of optimizing data quality methodologies for Big Data systems, with a focus on Electronic Health Records. The optimization targets organisations aiming to follow a data-centric data quality strategy. One of the most important stages of a data quality lifecycle involves correcting the dirty data that has been detected. Little is known about the performance of existing data repair algorithms and tools in a Big Data context. This study performs a systematic review of data repair algorithms and tools, then takes an experiment-based approach to evaluate those algorithms and tools, comparing them with a prototype built on the results of a previous study. While some algorithms and tools were marginally better than others, no algorithm or tool proved fully adequate in the Big Data context. Recommendations are therefore given for the improvements that data repair algorithms and tools need for Big Data.

Item Type: Conference or Workshop Item (Paper)
Research Areas: A. > School of Science and Technology
Item ID: 36781
Depositing User: Jisc Publications Router
Date Deposited: 21 Nov 2022 12:48
Last Modified: 17 Apr 2023 15:42
