Professor, Florida Institute of Technology, Indian Harbour Beach, Florida, United States
Introduction: Advances in computational power and machine learning have made many complex big-data problems tractable in recent years. Medical image classification is one such field, where machine learning, especially neural-network-based techniques, has shown significant promise. Optical coherence tomography (OCT) is a high-resolution, non-invasive tomographic imaging modality that relies on the intrinsic scattering properties of biological tissues to generate imaging contrast. Cross-sectional images of tissue are obtained with near-infrared light, which penetrates several hundred microns into the tissue. OCT is widely used in clinical settings to identify retinal diseases. An efficient machine-learning method for classifying retinal diseases from OCT images would let medical professionals detect disease faster and intervene sooner. In this paper, OCT images from a benchmark dataset named ‘OCT2017’ [1] are classified with simple convolutional neural networks (CNNs) of up to three convolution layers, after Gaussian noise is added to the training images. Several advanced methods already achieve high accuracy on this task [2], but they are challenging to implement and computationally expensive. The methods presented here reach a classification accuracy of up to 90% by focusing primarily on data manipulation, which reduces the need for high computational power.
Materials and Methods: ‘OCT2017’ is a publicly available dataset [1] containing 108,312 OCT images across four conditions: choroidal neovascularization (CNV, 37,206 images), diabetic macular edema (DME, 11,349 images), drusen bodies (DRUSEN, 8,617 images), and healthy retina (NORMAL, 51,140 images), as shown in Figure 1. To balance the class distribution, 8,000 images are selected from each class. The data is split 80% to 20% for training and testing, and 20% of the training data is held out to validate model performance during training. Each image is resized to 512×512 pixels before splitting so that the feature distribution remains consistent between the training and testing phases. Gaussian noise is added to each training image before it is passed to the CNNs, while the testing images are left unaltered, as shown in Figure 2. Three CNNs are considered, with 1, 2, and 3 convolution layers; each layer uses 9×9 filters, and the number of filters varies between layers. Each convolution layer is followed by a max-pooling layer with the same dimensions as the filter. The output of the final max-pooling layer is flattened and passed to a dense layer with 512 neurons. Multi-layer dense neural networks (DNN) and recurrent neural networks (RNN) will also be considered to compare their classification performance with the CNN-based deep learning model.
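The split sizes and the noise step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the noise standard deviation `sigma`, the [0, 1] intensity scaling, and the helper names `split_counts` and `add_gaussian_noise` are assumptions, since the abstract does not report them.

```python
import numpy as np

IMAGES_PER_CLASS = 8_000   # balanced selection stated in the abstract
NUM_CLASSES = 4            # CNV, DME, DRUSEN, NORMAL


def split_counts(total, test_frac=0.20, val_frac=0.20):
    """Return (train, validation, test) sizes for an 80/20 split,
    with val_frac of the training portion held out for validation."""
    test = round(total * test_frac)
    train_full = total - test
    val = round(train_full * val_frac)
    return train_full - val, val, test


def add_gaussian_noise(image, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to an image scaled to [0, 1].

    sigma is an assumed value; the abstract does not state the noise level.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    # Clip so the result stays a valid intensity image.
    return np.clip(noisy, 0.0, 1.0)


total = IMAGES_PER_CLASS * NUM_CLASSES
train, val, test = split_counts(total)
print(train, val, test)  # 20480 5120 6400

# Random stand-in for one resized 512x512 training image (not real OCT data).
rng = np.random.default_rng(0)
image = rng.random((512, 512))
noisy = add_gaussian_noise(image, rng=rng)
```

Under these assumptions, noise would be applied only to the 20,480 training images; the 6,400 test images would be passed to the network unchanged.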
Results and Discussions: As shown in Table 1, models trained with Gaussian noise added to the training images are markedly more accurate than those trained without it. For the CNN with one convolution layer, classification accuracy rose from 69% to 79%. The 2-layer CNN improved similarly, from 73% to 85%, and the 3-layer CNN improved from 75% to 90%. Precision and recall values are very similar, which indicates that the method is not biased toward any particular class. The results of this method will be further compared with other machine learning methods such as dense neural networks (DNN) and recurrent neural networks (RNN).
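The bias check mentioned above compares per-class precision and recall, which can be read off a confusion matrix. The sketch below shows one way to compute them; the function name `per_class_precision_recall` is hypothetical, and the counts in `toy` are invented for illustration and are not results from this work.

```python
import numpy as np

CLASSES = ["CNV", "DME", "DRUSEN", "NORMAL"]


def per_class_precision_recall(confusion):
    """confusion[i, j] = number of images of true class i predicted as class j."""
    confusion = np.asarray(confusion, dtype=float)
    true_pos = np.diag(confusion)
    precision = true_pos / confusion.sum(axis=0)  # column sums: predicted counts
    recall = true_pos / confusion.sum(axis=1)     # row sums: true counts
    return precision, recall


# Made-up counts for illustration only; NOT the values behind Table 1.
toy = np.array([
    [90,  4,  4,  2],
    [ 5, 88,  3,  4],
    [ 6,  3, 87,  4],
    [ 1,  3,  2, 94],
])

precision, recall = per_class_precision_recall(toy)
accuracy = np.trace(toy) / toy.sum()

for name, p, r in zip(CLASSES, precision, recall):
    print(f"{name:7s} precision={p:.3f} recall={r:.3f}")
print(f"accuracy={accuracy:.4f}")
```

If one class showed much higher recall than precision (or the reverse), that would suggest the classifier over- or under-predicts that class; similar values across all four classes are consistent with an unbiased classifier.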
Conclusions: In this work, retinal OCT images are classified into four predefined classes with 90% accuracy using a simple 3-layer CNN with Gaussian noise added to the training images. Adding Gaussian noise shows strong promise for improving classification accuracy. Accuracy can be improved further by increasing the number of training images, tuning the CNN hyperparameters, and adding further data preprocessing steps. This method requires minimal computational resources and should be significantly easier to implement at the user level.
References: 1. Kermany, Daniel S., et al. "Identifying medical diagnoses and treatable diseases by image-based deep learning." Cell 172.5 (2018): 1122-1131. 2. He, Jingzhen, et al. "An interpretable transformer network for the retinal disease classification using optical coherence tomography." Scientific Reports 13.1 (2023): 3637.