In this tutorial, we will be exploring how we can use Computer Vision and Machine Learning to detect Coronavirus (COVID-19) cases on chest X-ray images with Python.
More and more people are being diagnosed with Coronavirus every day in different parts of the world. The inventory of testing kits is decreasing rapidly, and there is a constant need for new kits to be manufactured. How nice would it be if we could find a reliable testing mechanism to act as an alternative way of testing for the Coronavirus?
In this Machine Learning tutorial, we will be using chest X-rays to build a deep learning model capable of detecting the Coronavirus in a manner that is similar to how radiologists detect various lung diseases. Credit for this tutorial goes to the following kernel that we found on Kaggle: Covid-19 Detection from Lung X-rays.
Please note that this method of testing is made for educational purposes and is not at all recommended in practice. We hope that this tutorial will help data science aspirants as a starting point for their research.
Building a Convolutional Neural Network (CNN) to detect coronavirus
We will be building our own Convolutional Neural Network (CNN) architecture for detecting the Coronavirus (COVID-19) using Keras and TensorFlow. If you do not know what a CNN is, we would suggest you go through this free course before going through this tutorial: ‘Convolutional Neural Network Theoretical Course’.
1. Importing necessary libraries
First, let us import some essential Python libraries for building our model, pre-processing training images, and so on.
# Importing Keras libraries
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing.image import load_img, img_to_array
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Activation, Dropout, BatchNormalization
from keras.layers import AvgPool2D, MaxPool2D, Flatten, Dense
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.optimizers import RMSprop

# Importing TensorFlow
import tensorflow as tf

# Importing confusion matrix function from scikit-learn
from sklearn.metrics import confusion_matrix

# Importing common Python libraries
import os
import numpy as np
import pandas as pd
import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline
2. Importing the dataset
For this tutorial, we will be using the COVID-19 chest X-ray images dataset available on Kaggle. You can download the dataset from here.
To follow along with this tutorial, please download the dataset and keep it in your working folder.
# The directory of the dataset
DATASET_DIR = "covid-19-x-ray-10000-images/dataset"

# Getting all the chest X-ray images of COVID-19 negative patients
normal_images = []
for img_path in glob.glob(DATASET_DIR + '/normal/*'):
    normal_images.append(mpimg.imread(img_path))

# Plotting chest X-ray image of a COVID-19 negative patient
fig = plt.figure()
fig.suptitle('normal')
plt.imshow(normal_images[0], cmap='gray')

# Getting all the chest X-ray images of COVID-19 positive patients
covid_images = []
for img_path in glob.glob(DATASET_DIR + '/covid/*'):
    covid_images.append(mpimg.imread(img_path))

# Plotting chest X-ray image of a COVID-19 positive patient
fig = plt.figure()
fig.suptitle('covid')
plt.imshow(covid_images[0], cmap='gray')
Now, let us see how many images in the dataset are of COVID-19 negative patients and how many images are of COVID-19 positive patients.
print(f"COVID-19 negative patients: {len(normal_images)}")
print(f"COVID-19 positive patients: {len(covid_images)}")

COVID-19 negative patients: 28
COVID-19 positive patients: 70
So, we are working with a rather small dataset that is also quite imbalanced, i.e., the two classes contain very different numbers of images (28 versus 70).
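One common way to compensate for this kind of imbalance is to weight each class inversely to its frequency. The sketch below is not part of the original kernel: balanced_class_weights is a hypothetical helper, and the class indices assume that Keras numbers the class folders alphabetically (covid = 0, normal = 1).

```python
def balanced_class_weights(counts):
    """Return one weight per class, inversely proportional to its frequency.

    weight = total_samples / (number_of_classes * samples_in_class)
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Image counts reported above; the index order is an assumption
# (alphabetical folder names: 0 = covid, 1 = normal)
counts = {0: 70, 1: 28}
class_weights = balanced_class_weights(counts)
print(class_weights)
```

A dictionary like this could be passed through the class_weight argument when fitting a Keras model, so that mistakes on the rarer class cost more. Whether it helps on a dataset this small would need to be checked experimentally.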
3. Initializing the model
We will be building our own Convolutional Neural Network (CNN) model for this tutorial. First, let us define some parameters for the model.
# Width, height and color channels of input image
IMG_W = 150
IMG_H = 150
CHANNELS = 3

# Shape of input image
INPUT_SHAPE = (IMG_W, IMG_H, CHANNELS)

# Number of classes to classify (negative and positive)
NB_CLASSES = 2

# Number of epochs and batch size
EPOCHS = 48
BATCH_SIZE = 6
Next, let us initialize our model with an architecture that consists of 5 groups of hidden layers (labelled Hidden Layer 1 to 5 in the code, with 7 convolutional layers in total) followed by a fully connected classifier. Note that this architecture was chosen by trial and error, and you can experiment with other architectures as well.
# Creating a sequential model
model = Sequential()

# Hidden Layer 1
model.add(Conv2D(32, (3, 3), input_shape=INPUT_SHAPE))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Hidden Layer 2
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Hidden Layer 3
model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(Conv2D(250, (3, 3)))
model.add(Activation("relu"))

# Hidden Layer 4
model.add(Conv2D(128, (3, 3)))
model.add(Activation("relu"))
model.add(AvgPool2D(2, 2))
model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(AvgPool2D(2, 2))

# Hidden Layer 5
model.add(Conv2D(256, (2, 2)))
model.add(Activation("relu"))
model.add(MaxPool2D(2, 2))

# Fully Connected Layer
model.add(Flatten())
model.add(Dense(32))
model.add(Dropout(0.25))
model.add(Dense(1))
model.add(Activation("sigmoid"))

# Compiling the model
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

# Looking at the model summary
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 148, 148, 32)      896
_________________________________________________________________
activation_1 (Activation)    (None, 148, 148, 32)      0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 74, 74, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 72, 72, 32)        9248
_________________________________________________________________
activation_2 (Activation)    (None, 72, 72, 32)        0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 36, 36, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 34, 34, 64)        18496
_________________________________________________________________
activation_3 (Activation)    (None, 34, 34, 64)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 32, 32, 250)       144250
_________________________________________________________________
activation_4 (Activation)    (None, 32, 32, 250)       0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 30, 30, 128)       288128
_________________________________________________________________
activation_5 (Activation)    (None, 30, 30, 128)       0
_________________________________________________________________
average_pooling2d_1 (Average (None, 15, 15, 128)       0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 13, 13, 64)        73792
_________________________________________________________________
activation_6 (Activation)    (None, 13, 13, 64)        0
_________________________________________________________________
average_pooling2d_2 (Average (None, 6, 6, 64)          0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 5, 5, 256)         65792
_________________________________________________________________
activation_7 (Activation)    (None, 5, 5, 256)         0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 256)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 1024)              0
_________________________________________________________________
dense_1 (Dense)              (None, 32)                32800
_________________________________________________________________
dropout_1 (Dropout)          (None, 32)                0
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 33
_________________________________________________________________
activation_8 (Activation)    (None, 1)                 0
=================================================================
Total params: 633,435
Trainable params: 633,435
Non-trainable params: 0
_________________________________________________________________
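The output shapes and parameter counts in the summary can be reproduced by hand, which is a useful sanity check when you change the architecture. The sketch below assumes the Keras defaults used by the model (no padding, stride 1 for convolutions, non-overlapping 2x2 pooling); the helper names are made up for this illustration.

```python
def conv_out(size, kernel):
    """Spatial size after an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping pooling (any remainder is dropped)."""
    return size // pool


def conv_params(kernel, in_ch, out_ch):
    """kernel * kernel * in_ch weights plus one bias, for each output filter."""
    return (kernel * kernel * in_ch + 1) * out_ch


def dense_params(in_units, out_units):
    """in_units weights plus one bias, for each output unit."""
    return (in_units + 1) * out_units


# (kind, kernel or pool size, output channels) in the order used by the model
layers = [
    ("conv", 3, 32), ("pool", 2, None),
    ("conv", 3, 32), ("pool", 2, None),
    ("conv", 3, 64), ("conv", 3, 250),
    ("conv", 3, 128), ("pool", 2, None),
    ("conv", 3, 64), ("pool", 2, None),
    ("conv", 2, 256), ("pool", 2, None),
]

size, channels, total = 150, 3, 0
for kind, k, out_ch in layers:
    if kind == "conv":
        total += conv_params(k, channels, out_ch)
        size, channels = conv_out(size, k), out_ch
    else:
        size = pool_out(size, k)

# Flatten, then Dense(32) and Dense(1); Dropout and Activation add no parameters
flat = size * size * channels
total += dense_params(flat, 32) + dense_params(32, 1)
print(size, flat, total)
```

The final spatial size (2), the flattened length (1024) and the total (633,435) match the summary above.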
Great! We have initialized our CNN model for training.
4. Training the model
As the dataset is very small, we will be using some image augmentation techniques (shearing, zooming, flipping, etc.) to make the dataset more varied. Then, we will be training the model on the augmented images as well as the real images.
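To get a feel for what two of these transformations do to the pixel data, here is a minimal NumPy sketch of rescaling and horizontal flipping on a tiny made-up "image". It only illustrates the idea; it is not how Keras implements augmentation internally, and the function names are hypothetical.

```python
import numpy as np


def rescale(image, factor=1.0 / 255):
    """Map 8-bit pixel values from [0, 255] to [0, 1]."""
    return image.astype(np.float64) * factor


def horizontal_flip(image):
    """Mirror the image left to right by reversing the column order."""
    return image[:, ::-1]


# A made-up 2x3 grayscale image, for illustration only
image = np.array([[0, 51, 102],
                  [153, 204, 255]], dtype=np.uint8)

scaled = rescale(image)
flipped = horizontal_flip(scaled)
print(flipped)
```

In the tutorial itself, these transformations (together with shearing and zooming) are applied randomly by Keras while the batches are generated, as the next code block shows.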
# Initializing the training data generator
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   validation_split=0.3)

# Choosing the training directory for data generator
train_generator = train_datagen.flow_from_directory(
    DATASET_DIR,
    target_size=(IMG_H, IMG_W),
    batch_size=BATCH_SIZE,
    class_mode='binary',
    subset='training')

# Choosing the validation directory for data generator
validation_generator = train_datagen.flow_from_directory(
    DATASET_DIR,
    target_size=(IMG_H, IMG_W),
    batch_size=BATCH_SIZE,
    class_mode='binary',
    shuffle=False,
    subset='validation')

# Fitting the model (fit_generator is deprecated in newer Keras releases,
# where model.fit accepts generators directly)
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE,
    epochs=EPOCHS)
Found 69 images belonging to 2 classes.
Found 29 images belonging to 2 classes.
Epoch 1/48
11/11 [==============================] - 5s 431ms/step - loss: 0.9489 - accuracy: 0.6667 - val_loss: 0.6989 - val_accuracy: 0.8750
Epoch 2/48
11/11 [==============================] - 5s 449ms/step - loss: 0.6878 - accuracy: 0.6515 - val_loss: 0.4423 - val_accuracy: 0.7826
Epoch 3/48
11/11 [==============================] - 5s 413ms/step - loss: 0.7247 - accuracy: 0.7333 - val_loss: 0.3371 - val_accuracy: 0.6522
Epoch 4/48
11/11 [==============================] - 4s 392ms/step - loss: 0.6387 - accuracy: 0.7460 - val_loss: 0.3243 - val_accuracy: 0.6522
Epoch 5/48
11/11 [==============================] - 4s 370ms/step - loss: 0.6985 - accuracy: 0.7143 - val_loss: 0.9496 - val_accuracy: 0.6522
Epoch 6/48
11/11 [==============================] - 5s 435ms/step - loss: 0.6414 - accuracy: 0.6190 - val_loss: 0.7891 - val_accuracy: 0.8750
Epoch 7/48
11/11 [==============================] - 5s 447ms/step - loss: 0.5649 - accuracy: 0.7302 - val_loss: 0.5828 - val_accuracy: 0.7826
Epoch 8/48
11/11 [==============================] - 4s 401ms/step - loss: 0.7349 - accuracy: 0.6818 - val_loss: 0.5026 - val_accuracy: 0.6522
Epoch 9/48
11/11 [==============================] - 4s 391ms/step - loss: 0.4979 - accuracy: 0.7778 - val_loss: 0.0903 - val_accuracy: 0.6522
Epoch 10/48
11/11 [==============================] - 4s 369ms/step - loss: 0.6699 - accuracy: 0.6508 - val_loss: 1.2273 - val_accuracy: 0.6957
Epoch 11/48
11/11 [==============================] - 4s 387ms/step - loss: 1.0450 - accuracy: 0.7937 - val_loss: 0.5215 - val_accuracy: 0.9167
Epoch 12/48
11/11 [==============================] - 5s 412ms/step - loss: 0.4828 - accuracy: 0.8413 - val_loss: 0.1677 - val_accuracy: 0.8696
...
Epoch 46/48
11/11 [==============================] - 4s 396ms/step - loss: 0.0094 - accuracy: 1.0000 - val_loss: 1.7849 - val_accuracy: 0.9167
Epoch 47/48
11/11 [==============================] - 5s 455ms/step - loss: 0.1794 - accuracy: 0.9524 - val_loss: 1.5018e-06 - val_accuracy: 1.0000
Epoch 48/48
11/11 [==============================] - 4s 383ms/step - loss: 0.2225 - accuracy: 0.9365 - val_loss: 2.9316e-05 - val_accuracy: 0.9565
We have trained our model for 48 epochs and our final validation accuracy is 0.9565.
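The image counts and the "11/11" step counter in the log follow from simple arithmetic on the dataset size, the validation split and the batch size. The sketch below reproduces them under one assumption: that the validation share of each class is truncated to a whole number, which matches the counts printed in the log. The split_counts helper is hypothetical and only mirrors that assumption.

```python
def split_counts(class_sizes, validation_split):
    """Split each class separately, truncating the validation share."""
    val = {name: int(n * validation_split) for name, n in class_sizes.items()}
    train = {name: n - val[name] for name, n in class_sizes.items()}
    return train, val


class_sizes = {"covid": 70, "normal": 28}
train, val = split_counts(class_sizes, 0.3)

n_train = sum(train.values())
n_val = sum(val.values())

batch_size = 6
steps_per_epoch = n_train // batch_size   # full batches per training epoch
validation_steps = n_val // batch_size    # full batches per validation pass

print(n_train, n_val, steps_per_epoch, validation_steps)
```

Because 29 is not a multiple of 6, four validation steps cover at most 24 images per pass, so the reported validation accuracy is computed on a subset of the validation images. This would explain values such as 0.8750 (21 out of 24).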
5. Visualizing the model results
Let us visualize the training and validation results of the model.
# Plotting the training accuracy and validation accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()

# Plotting the training loss and validation loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
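Looking at the charts, we can see how the loss has decreased and the accuracy has increased over the epochs for both our training and validation datasets, although the training log shows that the validation loss fluctuates noticeably from one epoch to the next, which is to be expected with so few validation images.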
6. Evaluating the model using confusion matrix
It is time for us to evaluate the model using a confusion matrix.
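# Getting some predictions using the validation generator
pred = model.predict(validation_generator)

# NOTE: pred has shape (n_images, 1) because the model ends in a single
# sigmoid unit, so argmax over axis 1 returns 0 for every image
predicted_class_indices = np.argmax(pred, axis=1)

# Mapping class indices back to class names
labels = validation_generator.class_indices
labels2 = dict((v, k) for k, v in labels.items())
predictions = [labels2[k] for k in predicted_class_indices]

# Creating the confusion matrix (predictions are passed first, so rows are
# predicted classes and columns are true classes)
true_class_indices = validation_generator.classes
cf = confusion_matrix(predicted_class_indices, true_class_indices)
print(cf)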
array([[21, 8],
       [ 0, 0]])
Our model has predicted 29 of the images as COVID-19 positive where 21 cases are actually COVID-19 positive (True Positive) and 8 cases are actually COVID-19 negative (False Positive).
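All of our predictions come out as COVID-19 positive because of how the predictions are decoded, not because of what the model has learned: the network ends in a single sigmoid unit, so np.argmax(pred, axis=1) returns index 0 (the covid class) for every image. This is also why the matrix does not agree with the validation accuracy reported during training. Thresholding the sigmoid output at 0.5 would recover the model's actual predictions. Separately, because our starting dataset is very imbalanced, accuracy alone is not the right metric for judging the model; we should also be looking at metrics such as precision and recall. However, that would make this tutorial more complicated and harder to digest in a single go.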
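For readers who want to go one step further, here is a minimal, dependency-free sketch of how sigmoid outputs can be turned into class labels and scored with precision and recall. The probabilities and labels are made up for illustration, the helper names are hypothetical, and the class order (0 = covid, 1 = normal) assumes alphabetical folder names; none of this comes from the original kernel.

```python
def threshold_predictions(probabilities, threshold=0.5):
    """Turn sigmoid outputs (one value per image) into class indices 0 or 1."""
    return [1 if p >= threshold else 0 for p in probabilities]


def confusion_counts(y_true, y_pred, positive=0):
    """Count TP, FP, FN and TN, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up data: true labels and the sigmoid output, i.e. P(class 1 = normal)
y_true = [0, 0, 0, 0, 1, 1]
probabilities = [0.10, 0.20, 0.70, 0.05, 0.90, 0.30]

y_pred = threshold_predictions(probabilities)
tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive=0)
precision, recall = precision_recall(tp, fp, fn)
print(y_pred, (tp, fp, fn, tn), precision, recall)
```

On the real model, the same idea would mean thresholding the output of model.predict and comparing the result against the true classes of the validation generator.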
In Conclusion
There you have it! The goal of this tutorial was to give students a starting point for using Machine Learning to detect the Coronavirus (COVID-19). Again, this tutorial is just for educational purposes and is not aimed at building a production-ready coronavirus detection model.