Investigating data repair steps for EHR Big Data

Juddoo, Suraj (2022) Investigating data repair steps for EHR Big Data. 2022 3rd International Conference on Next Generation Computing Applications (NextComp). In: IEEE Nextcomp 2022, 06-07 Oct 2022, Flic-en-Flac, Mauritius. [Conference or Workshop Item] (doi:10.1109/nextcomp55567.2022.9932167)

Abstract

This paper builds on previous research aimed at optimizing data quality methodologies for Big Data systems, with a focus on Electronic Health Records (EHR). The optimization targets organisations that aim to follow a data-centric data quality strategy. One of the most important stages of a data quality lifecycle is the correction of detected dirty data. However, little is known about the performance of existing data repair algorithms and tools in a Big Data context. This study performs a systematic review of data repair algorithms and tools, then takes an experiment-based approach to evaluate those algorithms and tools, comparing them with a prototype built on the results of a previous study. While some algorithms and tools were marginally better than others, none proved fully adequate in the Big Data context. Recommendations are therefore given for the improvements that data repair algorithms and tools need for Big Data.

Item Type: Conference or Workshop Item (Paper)
Research Areas: A. > School of Science and Technology
Item ID: 36781
Depositing User: Jisc Publications Router
Date Deposited: 21 Nov 2022 12:48
Last Modified: 17 Apr 2023 15:42
URI: https://eprints.mdx.ac.uk/id/eprint/36781
