Fusion of colour contrasted images for early detection of oesophageal squamous cell dysplasia from endoscopic videos in real time

Gao, Xiaohong W. (ORCID: https://orcid.org/0000-0002-8103-6624), Taylor, Stephen, Pang, Wei, Hui, Rui, Lu, Xin, Braden, Barbara and GI Investigators, Oxford (2023) Fusion of colour contrasted images for early detection of oesophageal squamous cell dysplasia from endoscopic videos in real time. Information Fusion, 92. pp. 64-79. ISSN 1566-2535 [Article] (doi:10.1016/j.inffus.2022.11.023)

PDF - Published version (with publisher's formatting)
Available under License Creative Commons Attribution 4.0.

PDF - Final accepted version (with author's formatting)
Restricted to Repository staff and depositor only
Available under License Creative Commons Attribution-NonCommercial-NoDerivatives 4.0.



Standard white light (WL) endoscopy often misses precancerous oesophageal changes because they differ only subtly from the surrounding normal mucosa. While deep learning (DL) based decision support systems help to a large extent, they face two challenges: limited annotated data sets and insufficient generalisation. This paper aims to fuse a DL system with human perception by exploiting computational enhancement of colour contrast. Instead of employing conventional data augmentation techniques that alter the RGB values of an image, this study employs a human colour appearance model, CIECAM, to enhance the colours of an image. When tested on a frame of an endoscopic video, the developed system first generates a contrast-enhanced version of the frame, then processes the original and enhanced images one after the other to create initial segmentation masks. Finally, fusion is applied to the combined list of masks obtained from both images, using non-maximum suppression (NMS), to determine the final bounding boxes, segments and class labels, which are rendered on the original video frame. The system is built upon the real-time instance segmentation network Yolact. In comparison with the same system without fusion, the sensitivity and specificity for detecting early-stage oesophageal cancer, i.e. low-grade dysplasia (LGD), increased from 75% and 88% to 83% and 97%, respectively. The video processing/playback speed is 33.46 frames per second. The main contributions are the alleviation of the data source dependency of existing deep learning systems and the fusion of human perception for data augmentation.
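
The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each pass (original frame and contrast-enhanced frame) yields a list of detections with a bounding box, a confidence score and a class label, and it pools the two lists before applying plain greedy, class-agnostic NMS on boxes. The names `Detection`, `iou` and `fuse_detections`, the 0.5 overlap threshold and the sample values are all hypothetical; the published system operates on Yolact outputs, which also include segmentation masks.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2), hypothetical layout
Detection = Tuple[Box, float, str]        # (box, confidence score, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(original: List[Detection],
                    enhanced: List[Detection],
                    iou_threshold: float = 0.5) -> List[Detection]:
    """Pool detections from both images, then apply greedy NMS.

    Assumption: a detection is dropped when it overlaps an already kept,
    higher-scoring detection by more than iou_threshold.
    """
    pooled = sorted(original + enhanced, key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        if all(iou(det[0], k[0]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Illustrative values only: one region found in both passes, one only in the
# enhanced pass.
from_original = [((10.0, 10.0, 50.0, 50.0), 0.70, "LGD")]
from_enhanced = [((12.0, 12.0, 52.0, 52.0), 0.85, "LGD"),
                 ((100.0, 100.0, 140.0, 140.0), 0.60, "LGD")]
fused = fuse_detections(from_original, from_enhanced)
```

In this sketch the two overlapping boxes collapse to the higher-scoring one, while the region detected only in the enhanced image survives, which is the sense in which the second pass can add detections rather than merely confirm them.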

Item Type: Article
Research Areas: A. > School of Science and Technology > Computer Science > Artificial Intelligence group
Item ID: 36822
Notes on copyright: Published version: Copyright © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Depositing User: Xiaohong Gao
Date Deposited: 23 Nov 2022 12:01
Last Modified: 12 May 2023 12:17
URI: https://eprints.mdx.ac.uk/id/eprint/36822
