Impact of image resolution on deep learning performance in endoscopy image classification: An experimental study using a large dataset of endoscopic images

Recent trials have evaluated the efficacy of deep convolutional neural network (CNN)-based AI
systems to improve lesion detection and characterization in endoscopy. Impressive results
have been achieved, but many medical studies use a very low image resolution to save computing
resources at the cost of losing detail. Currently, no conventions relating image resolution to performance exist,
and monitoring the performance of various CNN architectures as a function of image resolution provides
insights into how subtleties of different lesions on endoscopy affect performance. This can help
set standards for image or video characteristics for future CNN-based models in gastrointestinal (GI)
endoscopy. This study examines the performance of CNNs on the HyperKvasir dataset, consisting of
10,662 images covering 23 different findings. We evaluate two CNN models for endoscopic image
classification under quality distortions, with image resolutions ranging from 32 x 32 to 512 x 512 pixels.
Performance is evaluated using two-fold cross-validation, with F1-score, Matthews correlation
coefficient (MCC), precision, and sensitivity as metrics. Performance increased
with image resolution for all findings in the dataset. The maximum MCC was achieved at an image
resolution of 512 x 512 pixels for classification of the entire dataset, including all subclasses. The
highest performance, an MCC of 0.9002, was observed when the models were trained and tested on
the highest resolution. We explore different resolutions and their effect on
CNNs and show that image resolution has a clear influence on performance, which
calls for standards in the field.
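
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how a multi-class MCC, precision, sensitivity, and F1-score can be computed from true and predicted labels. It is an illustration under stated assumptions, not the evaluation code used in the study: the function names are hypothetical, the MCC uses the standard multi-class generalization, and macro averaging over classes is assumed because the abstract does not state the averaging scheme.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (standard generalization)."""
    s = len(y_true)                                    # total number of samples
    c = sum(t == p for t, p in zip(y_true, y_pred))    # correctly classified samples
    t_k = Counter(y_true)                              # how often each class truly occurs
    p_k = Counter(y_pred)                              # how often each class is predicted
    labels = set(t_k) | set(p_k)
    cov_tp = c * s - sum(t_k[k] * p_k[k] for k in labels)
    cov_tt = s * s - sum(t_k[k] ** 2 for k in labels)
    cov_pp = s * s - sum(p_k[k] ** 2 for k in labels)
    denom = sqrt(cov_tt * cov_pp)
    # Degenerate case (a single class in truth or prediction): return 0 by convention.
    return cov_tp / denom if denom else 0.0


def macro_precision_sensitivity_f1(y_true, y_pred):
    """Macro-averaged precision, sensitivity (recall), and F1 (averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, sensitivities, f1s = [], [], []
    for k in labels:
        tp = sum(t == k and p == k for t, p in zip(y_true, y_pred))
        fp = sum(t != k and p == k for t, p in zip(y_true, y_pred))
        fn = sum(t == k and p != k for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        sens = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        precisions.append(prec)
        sensitivities.append(sens)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(sensitivities) / n, sum(f1s) / n


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
mcc = multiclass_mcc(y_true, y_pred)
precision, sensitivity, f1 = macro_precision_sensitivity_f1(y_true, y_pred)
```

In a resolution study of this kind, such metrics would be computed once per fold and per image resolution, so that the resulting scores can be compared across resolutions.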