Colonoscopy image classification using self-supervised visual feature learning



  • Nguyen Chi Thanh (Corresponding Author) Institute of Information Technology, Academy of Military Science and Technology



Self-Supervised Visual Feature Learning; Transfer Learning; Colonoscopy Image Classification; Polyp Recognition.


Colonoscopy image classification is the task of predicting whether a colonoscopy image contains polyps. It is an important input to automatic polyp detection systems. Deep neural networks have recently been widely used for this task because they extract features automatically and achieve high accuracy. However, training these networks requires a large amount of manually annotated data, which is expensive to acquire and limited by the availability of endoscopy specialists. To overcome this challenge, we propose a novel method for training colonoscopy image classification networks using self-supervised visual feature learning. We adapt image denoising as a pretext task for learning visual features from an unlabeled colonoscopy image dataset: noise is added to each image to form the input, and the original image serves as the label. The pretext network is trained on an unlabeled dataset of 8,500 colonoscopy images collected from the PACS system of Hospital 103. The feature extractor of the pretext network, trained in this self-supervised way, is then used for colonoscopy image classification, and a small labeled subset of the public colonoscopy image dataset Kvasir is used to fine-tune the classifier. Our experiments demonstrate that the proposed method achieves higher classification accuracy than a classifier trained from scratch, especially when the training dataset is small. With only 200 annotated images for training, the proposed method improves accuracy from 72.16% to 93.15% compared with the baseline classifier.
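The denoising pretext task described in the abstract pairs a corrupted image with its clean original, so no manual annotation is needed. The following is a minimal sketch of how such training pairs and a reconstruction loss might be formed. The Gaussian noise model, the value of `sigma`, the mean squared error loss, and the names `make_denoising_pair` and `reconstruction_loss` are all illustrative assumptions; the abstract does not specify them, and the random array only stands in for a real colonoscopy frame.

```python
import numpy as np


def make_denoising_pair(image, sigma=0.1, rng=None):
    """Return (noisy_input, clean_target) for a denoising pretext task.

    `image` is a float array with values in [0, 1]. Gaussian noise with
    standard deviation `sigma` is an assumption made for this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    clean = np.asarray(image, dtype=np.float32)
    noise = rng.normal(loc=0.0, scale=sigma, size=clean.shape)
    # Keep the corrupted image inside the valid pixel range.
    noisy = np.clip(clean + noise, 0.0, 1.0).astype(np.float32)
    return noisy, clean


def reconstruction_loss(prediction, target):
    """Mean squared error between a network output and the clean image."""
    return float(np.mean((prediction - target) ** 2))


# Stand-in for one unlabeled frame (random pixels, not real data).
rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3), dtype=np.float32)
noisy, clean = make_denoising_pair(frame, sigma=0.1, rng=rng)
```

In a full pipeline, a network would take `noisy` as input and be trained to minimize the loss against `clean`; its feature-extracting layers would then be reused and fine-tuned on the small labeled set.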






How to Cite

Nguyen Chi Thanh. “Colonoscopy Image Classification Using Self-Supervised Visual Feature Learning”. Journal of Military Science and Technology, no. CSCE5, Dec. 2021, pp. 3-13, doi:10.54939/1859-1043.j.mst.CSCE5.2021.3-13.