AI could aid health professionals in the mass prediction of COVID-19 using only chest X-ray images

Viridiana Romero Martinez
Published in DataDrivenInvestor
7 min read · Mar 30, 2020


What if we could use only chest X-ray images to screen patients with diminished lung capacity and/or COVID-19? Artificial intelligence could aid health professionals in the mass prediction of viral infections.

COVID-19: Situation in numbers

According to the World Health Organization (WHO), these are the latest numbers for the novel coronavirus:

We begin with China, where, by general consensus, the virus originated and the first reported case of COVID-19 occurred. Faced with an unknown assailant attacking its people, the country has been widely applauded for its lightning-quick response to what was not yet a pandemic. The current totals show 82,230 confirmed cases (152 new) and 3,301 deaths, of which only 3 are recent.

Another country that launched an impressive response and seemingly contained the spread early is South Korea, with only 144 deaths (5 of them recent) out of 9,478 confirmed infections.

The next large-scale effects of the virus occurred in Italy and Spain. Italy has been confronted with a total of 86,498 cases and 9,136 deaths. What is alarming is that 5,959 of those cases and 971 of those deaths are labeled as recent, which suggests the country has yet to reach its peak in transmission. Spain has similarly struggled, with 4,858 deaths out of 64,059 confirmed cases.

Soon after, the virus reached the United States of America. The country already tops the list of confirmed cases with 113,000, and the figures are even more staggering when its hot spots are isolated: New York alone has reported 52,318 cases with 728 deaths. It stands to reason that the country is vulnerable: it is made up of many widely spread population centers, and its citizens’ wealth makes travel easier, which increases the likelihood of contagion. The numbers indicate the US is still early in its curve and might face devastating results.

A common expression in my country of Mexico is that “if the United States sneezes, we catch a cold.” Applying this colloquialism to the pandemic, troubling times are probably ahead for my country.

COVID-19 in Mexico

I’m from Mexico, and we’ve been late to respond compared with other countries. From the virus’s origins in China to its spread across Italy and Spain, there were lessons along the way that were simply ignored. Given the sizeable advantage of time, a concerted government response could have flattened the curve and reduced worldwide infection and the strain on our medical systems. Instead, our country’s leadership has publicly voiced its skepticism about the virus, seemingly instructing its people to disregard social distancing guidelines from medical professionals and to rely instead on their faith and on time to heal all wounds.

Sure enough, the people showed their leader the same confidence they had during his run for the presidency. Seeing chaos unfold elsewhere, they responded with only slight adherence to basic safety protocols and kept attending social functions and, of course, their jobs. If your elected officials, starting at the very top, are shaking hands and gathering crowds, the learning curve of an entire society becomes steeper. Inevitably, when the danger of the virus seeps into every area of daily life, we will look back at our communal non-response as the reason we have limited access to resources, possibly irreparable damage to our economy, and staggering death totals.

Inspired by the different ways AI can help tackle this problem the whole world is experiencing right now, I will create a deep learning model that detects COVID-19 in chest X-ray images using fastai and transfer learning.

This is merely educational, not a solution for the medical sector. I hope this encourages you to expand your use of AI and apply it to other situations that require alternative solutions.

Outline:

  1. Getting the dataset
  2. Creating the model
  3. Making predictions
  4. Recommendations

1.- Getting the dataset: open database of COVID-19 cases with chest X-ray or CT images

The dataset was taken from Kaggle: CoronaHack - Chest X-Ray-Dataset. It was inspired by the open database of COVID-19 cases with chest X-ray or CT images shared on GitHub by Joseph Paul Cohen, whose goal is to use these images to develop AI-based approaches to predict and understand the infection.

My model has a binary purpose, so I will keep only two classes: Normal and COVID-19. The final dataset has 2 directories:

  1. Train- 2,083 images of shape (300,400,1) of Normal (880) and COVID-19 (60)
  2. Val- 988 images of variable shapes of Normal (450) and COVID-19 (9)
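
Before building anything, it can help to confirm how many images each class folder actually holds. The snippet below is only a minimal sketch and was not part of my notebook: the helper name count_images_per_class and the list of image extensions are assumptions for illustration, and it expects one subfolder per class inside each split folder.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust it to match the dataset.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split folder (e.g. train or val)."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            # Count only files whose extension looks like an image.
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling count_images_per_class on a split folder returns a dictionary that maps each class folder name to its image count, which makes the class imbalance easy to spot.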

2.- Creating the model

I will use a popular framework called fastai. As you may know, it’s an open source deep learning library built on top of PyTorch. It’s very effective because it simplifies training fast and accurate neural networks. It’s based on research into deep learning best practices undertaken at fast.ai.

Importing libraries

Let’s import the libraries needed:

from fastai import *
from fastai.vision import *
from fastai.metrics import error_rate

Now, let’s define the image path and the databunch. The folder contains train and val subfolders, each with COVID-19 and Normal samples, so the classes are easy to detect from the folder names.

corona_images_path = 'covid-dataset/'

Data Augmentation

I will use the standard set of transforms from vision.transform for data augmentation and pass it to the databunch:

tfms = get_transforms()
data = ImageDataBunch.from_folder(corona_images_path, train='train', valid='val', ds_tfms=tfms, size=128, bs=10)

If you want to see a preview of the images I’m working with:

data.show_batch(rows = 3)

I’m not a health professional, so I had to do some research to find the most important features in chest X-ray images for detecting whether a patient has COVID-19. What I found is that patients’ lungs fill with a sticky mucus that prevents them from inhaling because there is no space left for air.

You can see this in the images released by the Radiological Society of North America, which show what radiologists call ground-glass opacity: the partial filling of air spaces.

The fastai library includes several pretrained models from torchvision, so now I’m going to pick resnet50 for this task:

arch = models.resnet50

With the pretrained architecture chosen, I pass the databunch and the architecture to cnn_learner to create a learner that is ready for fine-tuning:

learn = cnn_learner(data, arch, metrics=[error_rate, accuracy])

Error rate and accuracy are the metrics I chose to track the model’s health.

Now it’s time to train the model! With the learner created, let’s call fit_one_cycle to train for 8 epochs with a maximum learning rate of 1e-2:

learn.fit_one_cycle(8,1e-2)
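
The second argument above is the peak learning rate: under a one-cycle policy the rate warms up to that value and then anneals back down. Here is a rough sketch of that shape in plain Python. The cosine annealing, the 30% warm-up, the starting rate of lr_max / 25 and the final divisor are assumptions chosen for illustration, so fastai’s actual schedule may differ in its details.

```python
import math


def cosine_anneal(start, end, pct):
    """Move from start to end along a half cosine as pct goes from 0 to 1."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(step, total_steps, lr_max=1e-2, div_factor=25.0,
                 pct_start=0.3, final_div=1e4):
    """Illustrative one-cycle learning rate for a given training step."""
    warmup_steps = int(total_steps * pct_start)
    lr_start = lr_max / div_factor   # assumed starting rate
    lr_end = lr_max / final_div      # assumed final rate
    if step < warmup_steps:
        # Warm-up phase: climb from lr_start to lr_max.
        return cosine_anneal(lr_start, lr_max, step / warmup_steps)
    # Annealing phase: decay from lr_max to lr_end.
    remaining = total_steps - warmup_steps
    return cosine_anneal(lr_max, lr_end, (step - warmup_steps) / remaining)


# Print the learning rate at a few points of a hypothetical 100-step run.
for step in (0, 15, 30, 65, 99):
    print(step, f"{one_cycle_lr(step, 100):.6f}")
```

The idea is that the low starting rate keeps early updates stable, the peak speeds up training, and the long decay lets the weights settle.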

The model performed really well: it reached 99.56% accuracy, with training and validation losses of 0.05 and 0.01, respectively.

Let’s plot the metrics:

As you can see, the results are great. Now let’s take a look at the confusion matrix:

Let’s see which is the most confused class:

interp_resnet50 = ClassificationInterpretation.from_learner(learn)
interp_resnet50.most_confused()

There are two false negatives: two actual COVID-19 cases predicted as Normal.

[('COVID-19', 'Normal', 2)]
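
Because the classes are so imbalanced, accuracy alone hides how the model does on the COVID-19 class. As a rough sketch, the counts below are reconstructed from figures quoted in this post (450 Normal and 9 COVID-19 validation images, with 2 COVID-19 cases predicted as Normal). The assumption that no Normal image was misclassified is inferred from most_confused() returning a single entry, and the function name binary_metrics is just illustrative.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity (recall) and specificity from confusion counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of actual COVID-19 cases detected
        "specificity": tn / (tn + fp),  # share of actual Normal cases detected
    }


# COVID-19 is treated as the positive class; the counts are assumptions (see above).
metrics = binary_metrics(tp=7, fn=2, fp=0, tn=450)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the accuracy comes out at about 99.56%, in line with the training output, while sensitivity is only about 78% (7 of 9 COVID-19 cases), which is the number that matters most for a screening tool.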

Let’s plot some test images to see their predictions:

3.- Making predictions

Let’s try our model on images it hasn’t seen during training:

Patient with COVID-19

img = open_image('covid-dataset/Val/COVID-19/8.jpeg')
img.show()
learn.predict(img)

The results are:

(Category COVID-19, tensor(0), tensor([0.9879, 0.0121]))

Normal case

img = open_image('covid-dataset/Val/Normal/102.jpeg')
img.show()
learn.predict(img)

Results are:

(Category Normal, tensor(1), tensor([0.0387, 0.9613]))
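
Each result is a tuple with the predicted category, its class index, and the per-class probabilities in class order. Here is a minimal sketch of reading that output, using plain Python lists in place of tensors. The class order ['COVID-19', 'Normal'] is inferred from the indices shown above, and summarize_prediction is a hypothetical helper, not a fastai function.

```python
def summarize_prediction(classes, probabilities):
    """Return (label, confidence) for the most probable class."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return classes[best_index], probabilities[best_index]


# Assumed class order, matching tensor(0) and tensor(1) in the results above.
classes = ["COVID-19", "Normal"]

# Probabilities copied from the two predictions above.
print(summarize_prediction(classes, [0.9879, 0.0121]))  # ('COVID-19', 0.9879)
print(summarize_prediction(classes, [0.0387, 0.9613]))  # ('Normal', 0.9613)
```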

4.- Recommendations

  • Keep growing the dataset with more COVID-19 samples to build more accurate models; right now we don’t have enough data.
  • Right now, this AI solution may not be ideal because of the small amount of collected data, so I believe health professionals have better ways to detect viral infections like COVID-19.

I hope you’ve enjoyed the post and that it encourages you to train your own model. Please remember to stay at home to flatten the curve!

References

All figures verified with the World Health Organization’s site on Covid-19:


Data Solutions Infrastructure Manager at Novartis. AI and Machine Learning enthusiast. Data Driven Investor writer. Healthy lifestyle lover ♥