Powering Anomaly Detection for Industry 4.0

Build, track, and organize production-grade anomaly detection models

Abby Morgan
DataDrivenInvestor


What is Industry 4.0?

You’ve probably heard the buzz: Industry 4.0 is revolutionizing the way companies manufacture, develop and distribute their products — but what exactly is Industry 4.0?

To understand the Fourth Industrial Revolution, it helps to remember the first three. The First Industrial Revolution began at the end of the 18th century and focused on mechanizing industrial processes. The Second Industrial Revolution introduced electrification, and the Third Industrial Revolution championed automation.

The Ford assembly line in 1913; source: Wikimedia commons/public domain

Each of these periods focused on reducing human intervention in industry, but what’s left once factories are mechanized, electrified, and automated? The Fourth Industrial Revolution is all about harnessing the power of data and leveraging it to simulate cognition in industry. More than anything, Industry 4.0 is a paradigm shift in the way we organize and manage industrial processes to make the most of cyber-physical systems. These smart manufacturing processes include artificial intelligence, machine learning, cloud- and edge-computing, industrial IoT, distributed computing, augmented reality, and much, much more.

Industrial IoT; image from Connected

What is Anomaly Detection?

One of the most popular ways that artificial intelligence is being incorporated into industrial manufacturing is through automated defect detection or anomaly detection. Anomaly detection is the process of identifying anomalous items in a stream of input data and is a critical component of quality assurance in any production line.

Traditional manual defect detection methods are not only expensive and time-consuming but can also often be ineffective, as not all anomalies are visible to the human eye. However, with the advent of machine learning, computer vision can be leveraged to facilitate human operator work — or even completely automate these processes!

Challenges of Anomaly Detection

Anomaly detection faces a number of unique challenges. It’s often difficult to obtain a large amount of anomalous data, making traditional supervised learning techniques impractical. The resulting class imbalance also means that popular evaluation metrics like accuracy aren’t meaningful. Furthermore, the difference between a normal sample and an anomalous one can be microscopic, and not all types of anomalies are predefined at the outset of an experiment. Fortunately, to address each of these challenges and more, a team of artificial intelligence researchers at Intel has developed a cutting-edge, easy-to-implement, open-source package called Anomalib.

What is Anomalib?

Anomalib is an open-source deep learning library that makes it easy to benchmark different anomaly detection algorithms on both public and custom datasets, all by simply modifying a config file. It’s a comprehensive, end-to-end solution that includes cutting-edge algorithms, relevant evaluation methods, prediction visualizations, hyperparameter optimization, and inference deployment code with Intel’s OpenVINO Toolkit.

Anomaly detection graphics from a Comet + Anomalib experiment by Sid Mehta; used with permission.

So, how does it work? Anomalib uses unsupervised ML techniques to learn an implicit representation of normality with AutoEncoders, GANs, or a combination of both. During inference, new samples are compared against the embeddings of normal samples to determine whether or not they are anomalous. In this way, Anomalib allows you to save your sparse anomalous data for testing purposes only.
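To build intuition for this approach, here is a deliberately simplified sketch of the memory-bank idea, with random vectors standing in for learned image embeddings. This is an illustration only, not Anomalib's actual implementation:

```python
import math
import random

random.seed(0)

def embed(mean, dim=32):
    """Stand-in for a feature extractor: a random vector centered on `mean`."""
    return [random.gauss(mean, 1.0) for _ in range(dim)]

# "Training": a memory bank built from embeddings of NORMAL samples only.
memory_bank = [embed(0.0) for _ in range(200)]

def anomaly_score(sample, bank):
    """Distance to the nearest normal embedding; higher = more anomalous."""
    return min(math.dist(sample, ref) for ref in bank)

# A sample resembling the normal data scores low; a shifted sample scores high.
normal_score = anomaly_score(embed(0.0), memory_bank)
anomalous_score = anomaly_score(embed(5.0), memory_bank)
print(normal_score < anomalous_score)  # True
```

Thresholding such a score decides whether a sample is flagged as anomalous, which is why no anomalous examples are needed at training time.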

Anomalib currently supports ten cutting-edge anomaly detection models, including FastFlow, PaDiM, PatchCore, and CFlow, and is continuously updated with the latest state-of-the-art algorithms. You can train models on custom data or access public datasets like MVTec and BeanTech through the API. Best of all, Anomalib makes end-to-end anomaly detection possible straight out of the box, without additional GPUs or extremely long training times.

Anomalib can do more than just support proof-of-concept projects, however. In the real world, machine learning is a highly iterative process, and the details of all these iterations can get confusing fast. To optimize your model and get it production-ready, you’ll need to log, manage, and version these details in an experiment tracking tool. Anomalib recently announced an integration with Comet, and in this article, we’ll explore how to use the two together to power production-grade anomaly detection for Industry 4.0!

Anomaly detection of hazelnuts from the MVTec dataset; image from Anomalib.

Anomalib + Comet

Comet is a powerful tool that allows you to manage and version your training data, track and compare training runs, and monitor your models in production — all in one platform. And it’s now fully integrated with Anomalib for experiment management, benchmarking, and hyperparameter optimization!

The Comet + Anomalib integration offers the following features:

  • Automatically log experiment- and project-level metrics and features, including system metrics, hyperparameters, graph definitions, evaluation metrics, and more; custom logging is also supported.
  • Organize your project-level dashboard with Comet’s custom panels for an overview tailored to your team’s specific needs.
  • Compare images across different experiments and steps with Image Panels; search for individual images and showcase selected images across individual experiment runs.
  • Log benchmarking results to Comet as a means to track model drift.
  • Isolate the best hyperparameters with HPO powered by the Comet Optimizer.
Auto-logged experiment-level charts in Comet; image by author.

Logging

In single-experiment view, you’ll find that appropriate evaluation metrics are automatically calculated and logged for you, in both tabular and chart form. By definition, anomaly detection is a problem of class imbalance, which makes traditional ML metrics like accuracy misleading, as they are designed around an assumption of balanced class distribution.

Experiment-level graphics of a broken bottle from the MVTec dataset in Comet; image by author.

Instead, Anomalib + Comet calculates the F1 score and AUROC at both the image and pixel level. The F1 score combines precision and recall into a single metric by taking their harmonic mean, accounting for the precision-recall tradeoff. AUROC, the area under the ROC curve, describes how well a model can distinguish between classes across all classification thresholds, and is another very important metric for evaluating classification problems with class imbalance.

F1 = 2TP / (2TP + FP + FN), where TP, FP, and FN are the counts of true positives, false positives, and false negatives, respectively.
Pixel-level AUROC and F1 Scores, as auto-logged in Comet; image by author.
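A quick toy example makes the problem with accuracy concrete: a model that never flags an anomaly can still score near-perfect accuracy on imbalanced data, while the F1 score, computed directly from true/false positive and negative counts, immediately exposes it. The numbers below are made up for illustration:

```python
# Toy illustration of why accuracy misleads on imbalanced data:
# 990 normal samples, 10 anomalies, and a model that always predicts "normal".
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

print(accuracy)  # 0.99 -- the useless model looks excellent
print(f1)        # 0.0  -- but it never catches a single anomaly
```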

Scrolling through the left-hand sidebar of the single-experiment view, you’ll find that many more metrics and features are automatically logged for you as well. Basic contextual information like source code, system metrics, installed packages, and output are all logged. Experiment-specific hyperparameters, metrics, and graph definitions are also auto-logged, and with a simple edit to the Anomalib config file, you can log images and other graphics to Comet as well. Keeping track of all these details is essential for producing production-ready models and monitoring them for concept and data drift.

Auto-logged experiment-level hyperparameter tracking in Comet; GIF by author.

In the project-level Panels view, you’ll see charts automatically populated with performance metrics for a bird’s-eye view of your project across experiment runs. Add Comet’s new Image Panels to this view to visualize specific prediction images across different experiments, as shown below:

Project-level line charts in Comet; image by author.
Project-level image panel in Comet; image by author.

Lastly, it can be really important to see how things are shaping up between two specific experiment runs. Comet allows you to diff selected runs for a more cross-sectional view of your project and to compare specific metrics and parameters:

Diffing two selected experiment runs in Comet; GIF by author.

Benchmarking

Anomalib also includes a benchmarking script for comparing results across different combinations of models, their parameters, and dataset categories. You can log model performance and throughput to Comet as a means to track model drift, or export the results to a CSV file. Check out the full documentation here; once your configuration is set up, it takes one simple command to run your benchmarking:

python tools/benchmarking/benchmark.py \
--config <relative-or-absolute-path>/<paramfile>.yaml

Hyperparameter Optimization

Anomalib also supports HPO with the Comet Optimizer, making it easier to isolate the right combination of hyperparameters. See Anomalib’s HPO docs here, or for details on other possible configurations with Comet’s Optimizer, see here.

At the top of your Comet Optimizer report, you’ll find each of your hyperparameters ranked by the evaluation metric of your choice, ordered from the largest magnitude of Spearman correlation coefficient to the smallest. In the snapshot below, learning rate had the largest correlation coefficient with the model’s F1 score.

Comet Optimizer report; image by author.
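The Spearman coefficient used for this ranking is simply the correlation of the ranks of the two variables. Here is a plain-Python sketch of how it is computed, using made-up hyperparameter values and scores (these numbers are hypothetical, not taken from the actual report):

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(x, y):
    """Spearman correlation via the classic 1 - 6*sum(d^2)/(n(n^2-1)) formula."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical sweep: learning rates tried and the F1 score each run achieved.
learning_rates = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2]
f1_scores = [0.62, 0.71, 0.78, 0.74, 0.55]

print(round(spearman(learning_rates, f1_scores), 2))  # -0.1
```

Because it works on ranks rather than raw values, the coefficient captures monotonic relationships between a hyperparameter and the evaluation metric, even when the relationship isn’t linear.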

You’ll also find all of your experiment runs ranked by your evaluation metric, with the additional option to toggle the ranking based on any of the model’s parameters. Lastly, your evaluation metrics are plotted across all experiment runs in line plots, and you can add any of Comet’s publicly available custom panels!

Getting Started

It takes just four quick steps to start tracking your Anomalib projects in Comet. Feel free to follow along with this Colab tutorial.

Or, to take a sneak peek at the finished result, head over to this completed Anomalib project, courtesy of Comet’s own ML Growth Engineer, Sid Mehta.

0. Setup and Installation

Clone the Anomalib repo into your environment and install the necessary dependencies:

git clone https://github.com/openvinotoolkit/anomalib.git
cd anomalib
pip install . -q

1. Configure Comet Credentials

If you don’t already have a Comet account, you can sign up for free here. Make sure to grab your API key from your account settings so you can configure your Comet credentials in any of several ways. For the sake of simplicity, we’ll set them directly through environment variables here:

export COMET_API_KEY=<Your-Comet-API-Key>
export COMET_PROJECT_NAME=<Your-Comet-Project-Name> # defaults to the name of your dataset

2. Modify the Anomalib config File

Next, we’ll need to modify our Anomalib config file to enable logging. The easiest way to do this is to open the existing configuration file at anomalib/anomalib/models/<model-of-your-choice>/config.yaml and adjust the following parameters:

visualization:
  show_images: true
  save_images: true
  log_images: true
  mode: full # options: ["full", "simple"]
logging:
  logger: comet
  log_graph: true

Alternatively, you can also copy the default config template for your particular model into a new yaml file and adjust the parameters as needed. In this Colab notebook, we’ve also demonstrated how to use pyyaml to write a config file in an interactive environment. Note that each model supported by Anomalib has a different config file structure.
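As one way to do this, here is a minimal sketch of writing such a config with pyyaml. Only the visualization and logging sections from above are shown; a real Anomalib config contains many more sections (dataset, model, trainer, and so on), and the filename here is just an example:

```python
import yaml  # pip install pyyaml

# An illustrative subset of an Anomalib config: only the visualization and
# logging sections are shown; real configs include dataset, model, trainer, ...
config = {
    "visualization": {
        "show_images": True,
        "save_images": True,
        "log_images": True,
        "mode": "full",  # options: ["full", "simple"]
    },
    "logging": {
        "logger": "comet",
        "log_graph": True,
    },
}

with open("custom_config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```

Passing `sort_keys=False` preserves the section order from the dict, which keeps the generated file readable alongside the default templates.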

3. Training

By default, python tools/train.py runs the PaDiM model on the bottle category from the MVTec AD (CC BY-NC-SA 4.0) dataset.

To use a different algorithm, just switch out the model name in the config file path to another supported algorithm. To use a custom dataset, just update the relevant Anomalib config file accordingly with the path to your dataset.

python tools/train.py \
--config anomalib/models/<specific-model-name>/<config-file>.yaml

Now just head over to the Comet UI to check out your results!

Conclusion

In the real world, machine learning is a highly iterative process, so you’ll likely have many more training runs to visualize and organize. Comet makes it easy to track these runs, share them with members of your team, and collaborate within your organization. By pairing Anomalib with Comet, you can take advantage of all the cutting-edge algorithms of Anomalib, and create production-worthy, maintainable models for your next Industry 4.0 project.

Thanks for making it all the way to the end of this article, and we hope you found this tutorial useful! Feel free to drop any questions or feedback in the comments below, and stay tuned for more content!

This article was originally published on Comet here.


