Training a CNN to detect Pneumonia

Alishba Imran
Published in DataDrivenInvestor
6 min read · Feb 12, 2019


I remember the day so well. My grandfather started getting random coughs and began having trouble breathing. He was getting weaker and weaker as the days went by. Everyone in our family was worried, so we rushed him to the nearest doctor’s office. It was a very long and hectic process, and the doctors in Pakistan were unable to diagnose him.

Let me say that again for the people in the back… they were unable to diagnose him. THAT’S CRAZY! I couldn’t believe it. I had always heard how much our research and healthcare methods have developed and improved in the last decade, but these doctors couldn’t even diagnose my grandfather. My grandfather has always been a healthy, strong man, and knowing that he couldn’t even get diagnosed and receive proper care upset me.

My grandfather

Luckily, we went to a few other doctors until he was diagnosed with pneumonia. But this is beside the point, because the fact is:

  1. not all doctors have the proper tools to diagnose patients.
  2. or their diagnoses are not very accurate.

It’s not only my grandfather: 2 BILLION people per year suffer from pneumonia! Pneumonia is an infection in the lungs that can be caused by bacteria, viruses, or fungi. Chest X-rays are currently the best method for diagnosing pneumonia.

BUT… access is still limited: almost two-thirds of the world’s population lacks access to radiology diagnostics. It is also much more difficult to make clinical diagnoses with chest X-rays than with other imaging modalities such as CT or MRI. This leads to inaccurate results.

pneumonia vs. normal lung

AND… because of this, pneumonia is responsible for more than 1 million hospitalizations and 50,000 deaths per year in the US alone.

Pneumonia is a common disease we have fought against for thousands of years. It’s about time we put an end to this. Automating this detection task would greatly improve the efficiency of radiologists.

So… Where to Start?

Solution: Convolutional Neural Network (CNN)

I developed a Convolutional Neural Network (CNN) that is able to detect whether a patient has pneumonia, both bacterial and viral, based on an X-ray image of their chest.

Analyzing the X-Rays

I started by analyzing a few x-rays through a CNN. Compared to traditional neural networks, Convolutional Neural Networks are a lot more efficient at processing image data.

In convolutions, instead of going pixel by pixel, we use “filters” to analyze portions of an image. This approach is so powerful that it can reduce the number of operations from the hundreds of millions in a simple network to fewer than ten million. Using CNNs, I classified all of my data.
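To get a feel for why filters are cheaper, here is a minimal sketch that counts the parameters of one fully connected layer versus one convolutional layer on the same input. Parameter count is used as a rough proxy for cost. The input size (224×224 RGB), the 1,000-unit dense layer, and the 64 filters of size 3×3 are illustrative assumptions, not measurements from the model in this article.

```python
def dense_params(height, width, channels, units):
    """Weights + biases when every pixel value connects to every unit."""
    return height * width * channels * units + units


def conv_params(kernel, channels, filters):
    """Weights + biases when each filter is a small kernel shared across the image."""
    return kernel * kernel * channels * filters + filters


# Illustrative sizes only (assumptions, not taken from the article's model).
dense = dense_params(224, 224, 3, units=1000)
conv = conv_params(kernel=3, channels=3, filters=64)

print(f"fully connected layer: {dense:,} parameters")
print(f"convolutional layer:   {conv:,} parameters")
```

Because the same small filter slides over the whole image, the convolutional layer needs only a tiny fraction of the weights of the fully connected one.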

Organizing our Data

I used Kaggle’s dataset of chest X-rays. The dataset is organized into 3 folders (train, test, val), each containing a subfolder per image category (Pneumonia/Normal). There are 5,863 X-ray images (JPEG) in 2 categories: Pneumonia or Normal.

Detecting pneumonia in an x-ray scan is a simple binary classification problem: either we detect pneumonia, or we don’t. I represented this with numbers: 0 meant normal and 1 meant pneumonia. Using Python, I created a data frame with one row per image, labeled 0 or 1 depending on its folder, and shuffled the rows together:

0: Normal Patient X-Rays

1: Pneumonia affected X-ray images
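The labeling step can be sketched roughly as follows. This is not the exact code used for the article: the folder names `NORMAL` and `PNEUMONIA`, the `.jpeg` extension, and the helper name `load_labeled_paths` are assumptions, and it uses only the standard library instead of a data frame library.

```python
import random
from pathlib import Path

# Assumed subfolder names; adjust them to match the actual dataset layout.
LABELS = {"NORMAL": 0, "PNEUMONIA": 1}


def load_labeled_paths(split_dir, seed=0):
    """Return shuffled rows of {"image", "label"} for one split (e.g. train)."""
    rows = []
    for folder, label in LABELS.items():
        # Assumed file extension; sorted so the shuffle is reproducible.
        for image_path in sorted(Path(split_dir, folder).glob("*.jpeg")):
            rows.append({"image": str(image_path), "label": label})
    random.Random(seed).shuffle(rows)  # mix normal and pneumonia rows
    return rows
```

The returned list of dictionaries can be passed to `pandas.DataFrame(rows)` if a data frame is preferred.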

Classify Images: VGG19 Model in Keras

I implemented transfer learning, using the first 16 layers of a pre-trained Oxford VGG19 network, to identify the image classes. Researchers from the Oxford Visual Geometry Group (VGG) participated in the ILSVRC challenge, where their convolutional neural network (CNN) models ranked among the top entries in the image classification and localization tasks. That makes VGG a good starting point for image classification with our data.

I loaded the VGG model through the Keras deep learning library, since Keras provides an Applications interface for using pre-trained models. Using this interface, I created a VGG model with the pre-trained weights provided by the Oxford group and used it directly as a model for classifying images.

How it works:

  1. Get a Sample X-Ray
  2. Load the VGG Model
  3. Load and Prepare X-Ray
  4. Make a prediction

You can learn more about how this model works here.
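The four steps above might look something like the sketch below in Keras. It is one possible reading of the setup, not the article's actual code: it treats the VGG19 convolutional base (`include_top=False`) as the reused layers, and the single sigmoid output layer, the 224×224 input size, and the helper names are all assumptions. It requires TensorFlow, and the first call downloads the ImageNet weights.

```python
def build_pneumonia_model(weights="imagenet", input_shape=(224, 224, 3)):
    """VGG19 convolutional base (frozen) plus a small binary head."""
    # Imported here so the helpers below can be used without TensorFlow.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG19

    base = VGG19(weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # keep the pre-trained filters fixed

    return models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),  # assumed head: P(pneumonia)
    ])


def prepare_xray(path, target_size=(224, 224)):
    """Load one image file and shape it the way VGG19 expects."""
    import numpy as np
    from tensorflow.keras.applications.vgg19 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    pixels = img_to_array(load_img(path, target_size=target_size))
    return preprocess_input(np.expand_dims(pixels, axis=0))  # add batch axis


def to_label(probability, threshold=0.5):
    """Map a sigmoid output to 0 (normal) or 1 (pneumonia)."""
    return int(probability >= threshold)
```

With a model from `build_pneumonia_model()` that has been trained on the labeled x-rays, a prediction for one file would be `to_label(model.predict(prepare_xray(path))[0][0])`, where `path` points at a sample x-ray. Until the new output layer is trained, its predictions are meaningless.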

Training and Testing the Network

My network starts off with some random weights and zero biases. These “parameters” influence the network’s decision on whether an image is a pneumonia scan or not. Once the input data has passed through the network using these weights and biases, an activation function returns either a 0 (normal) or a 1 (pneumonia detected).

Then I compared each prediction with the actual answer, using a loss function to calculate the margin of error on that prediction. Next I ran backpropagation, which adjusts the network’s weights and biases to make better predictions. I kept iterating until I got the most accurate result.
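The same loop of predict, measure the error, and adjust can be shown on a much smaller scale. The sketch below trains a single sigmoid neuron with gradient descent on binary cross-entropy loss. It is a toy stand-in for the idea, not the network from this article: the data, learning rate, and number of epochs are made up for illustration.

```python
import math
import random


def sigmoid(z):
    """Activation function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, lr=0.5, seed=0):
    """Fit one sigmoid neuron with gradient descent on binary cross-entropy."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in features[0]]  # random weights
    bias = 0.0                                               # zero bias
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the loss w.r.t. the pre-activation
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 when the neuron's output is at least 0.5, else 0."""
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)


# Toy, made-up data: two features per "image", label 1 when they are large.
X = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]]
y = [0, 0, 1, 1]
w, b = train(X, y)
print([predict(w, b, x) for x in X])
```

A deep network repeats this same update across many layers, with backpropagation passing the error back from the output to each earlier layer.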

VGG16 accuracy: 88%

Let’s test it!

When I was testing, I gave the trained model more input data (x-rays) whose labels it had not seen. In other words, the network made predictions without adjusting its parameters, so I could see how accurate it is. Once I finished testing, I saw the accuracy rate:

My model was 88% accurate! Not bad for a first try. If I keep iterating, I can totally make this close to 100% accurate.
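Accuracy here simply means the share of test images whose predicted label matches the true label. A minimal sketch, using made-up labels instead of the real test results:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(int(p == t) for p, t in zip(predictions, labels))
    return correct / len(labels)


# Made-up example: 8 test images, 7 of them predicted correctly.
true_labels = [0, 1, 1, 0, 1, 1, 0, 1]
predicted = [0, 1, 1, 0, 1, 0, 0, 1]
print(f"accuracy: {accuracy(predicted, true_labels):.1%}")
```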

A Bright Future: Deep Learning

This pneumonia detector might not be a market-ready product yet, but it makes me excited to see how easy it is to get started. With even more iterations, data and layers, I am optimistic that we can have a close to 100% accurate product.

People like my grandfather will never have to suffer from going undiagnosed or misdiagnosed. This will help us diagnose earlier and create treatments that can help save lives. That’s what is so exciting about deep learning today: the barriers have all been diminished.

I am a Blockchain, VR and Machine Learning developer. If you want to stay up to date with my progress feel free to follow me on LinkedIn, and Medium!

If you enjoyed reading this article, please press the👏 button, and don’t forget to share!


Machine learning and hardware developer working on accelerating problems in robotics and renewable energy!