What is Unsupervised Machine Learning?

Artificial intelligence (AI) and machine learning (ML) are gaining ground in every industry, including process manufacturing. ML tools and platforms rely on models that learn from data how to recognize patterns. The models then use those patterns to predict future outcomes, detect anomalies, and classify data.

There are two main ways for computers to “learn.” The first is through labeled examples, or supervised learning, where the learner receives both the data and the expected label of each example. The learner extracts the rules, patterns or models that relate the given label to each example.

The second way is called unsupervised learning, where the learner receives only the data, without any additional information about what it is expected to learn. The learner then looks for the main groups, patterns and similarities in the data without external guidance. Both of these methods, and combinations of methods, rely on statistical and mathematical principles and require massive amounts of data.

For example, if you wanted to use unsupervised machine learning to train an ML tool to spot product that hasn’t fully dried when it leaves an oven, you would feed it a large number of unlabeled examples of product at different stages of dryness. The model would have to detect the recurring similarities among examples, and keep refining those groupings until it can reliably recognize product that is still moist.
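As a rough illustration of how a learner can find groups without labels, here is a minimal sketch of a two-cluster k-means loop on one-dimensional data. The moisture readings, the function name `two_means`, and the choice of k-means itself are assumptions made for this example; the article does not name a specific algorithm.

```python
# Hypothetical moisture readings (% water by weight) for product leaving an oven.
# No labels are supplied: the algorithm is never told which items are "dry" or "moist".
readings = [2.1, 2.4, 1.9, 2.2, 7.8, 2.0, 8.3, 2.3, 7.5, 2.5]

def two_means(values, iterations=20):
    """Split 1-D values into two groups with a basic k-means loop (k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    if low == high:
        # All values identical: there is nothing to separate.
        return list(values), []
    for _ in range(iterations):
        # Assign each value to the nearer of the two centers.
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of the values assigned to it.
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low_group, high_group

dry_like, moist_like = two_means(readings)
print("lower-moisture group:", dry_like)
print("higher-moisture group:", moist_like)
```

The algorithm only separates the readings into two groups. Deciding that the higher-moisture group means "still moist" is an interpretation a person applies afterward.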

When is Unsupervised Machine Learning Relevant for Process Manufacturing Plants?

Unsupervised machine learning is often more relevant for process plants than supervised learning, because process plants are rich in data, but have few of the recurring historical examples needed for supervised learning.

In theory, it’s possible to use unsupervised learning to teach a machine learning tool how to recognize the “normal” state of a plant, and then have it generate an alert whenever it detects an anomaly. But in practice, this is extremely difficult. Because there are so many moving parts in a plant, “normal” is constantly changing. The number of possible states the plant can be in at any given time is larger than the number of atoms in the universe, so the model keeps encountering states it has never seen before. As a result, unsupervised learning models end up registering anomalies more or less constantly.
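The shifting-baseline problem can be sketched with a simple detector that learns "normal" from one operating mode and then sees another. The temperature values, the product A and product B scenario, and the three-standard-deviation threshold below are all assumptions chosen for illustration, not data from a real plant.

```python
from statistics import mean, stdev

# Hypothetical reactor temperatures (deg C) logged while the plant ran product A.
training = [150.2, 149.8, 150.5, 150.1, 149.9, 150.3, 150.0, 149.7]
baseline_mean = mean(training)
baseline_std = stdev(training)

def is_anomaly(value, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the baseline."""
    return abs(value - baseline_mean) / baseline_std > threshold

# The plant switches to product B, which legitimately runs about 5 deg C hotter.
product_b = [155.1, 154.8, 155.3, 155.0]
alerts = [is_anomaly(v) for v in product_b]
print(alerts)
```

Every product B reading is flagged even though nothing is wrong, because the learned baseline describes only one of the plant's many legitimate operating modes.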

Some process plants try to deal with the problem by focusing on just a few primary assets. Such a narrow focus means they might miss problems that develop in other parts of the plant. It also misses contradictions between the behavior of the monitored asset and the behavior of other parts of the plant, especially those that influence the asset. For example, a monitored pump that is not running is in a perfectly normal inactive state, but that state would not be considered normal if you knew there was flow in the pipe feeding into it.
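The pump example can be written as a small sketch comparing an asset-only check with one that also looks at the feeding pipe. The `Snapshot` fields, the function names, and the 0.5 m3/h flow threshold are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    pump_running: bool
    inlet_flow_m3h: float  # flow in the pipe feeding the pump (hypothetical tag)

def asset_only_check(snapshot):
    """Judge the pump in isolation: running and stopped are both normal states."""
    return "normal"

def contextual_check(snapshot, min_flow=0.5):
    """Judge the pump together with the pipe that feeds it."""
    if not snapshot.pump_running and snapshot.inlet_flow_m3h > min_flow:
        # A stopped pump with liquid flowing into it is a contradiction.
        return "anomaly"
    return "normal"

stopped_with_flow = Snapshot(pump_running=False, inlet_flow_m3h=12.0)
print(asset_only_check(stopped_with_flow))
print(contextual_check(stopped_with_flow))
```

The asset-only check cannot see the contradiction because the information that exposes it lives on a different asset.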

Other process plants try to deal with the problem by focusing on just a few primary problems. This supervised approach requires collecting many examples of each specific problem, in all of its variations, which takes a lot of time and effort. It also ends up identifying only those specific problems, leaving all others, especially new ones, undetected.

How Can Process Plants Apply Unsupervised Machine Learning?

Although regular unsupervised machine learning isn’t scalable for a modern process plant, there is a way to make unsupervised ML tools effective.

By applying context to the data sets, in the form of a domain map, the unsupervised learning solution can understand the relationships among the plant’s many data points. Once it has this context, it can model the entire plant and recognize true anomalies, so as to generate accurate, targeted alerts. The resulting predictive analytics, predictive maintenance, and predictive monitoring solutions can help process plants increase productivity, reduce costs, extend machinery lifetime, and increase resilience and agility.
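The article does not specify what a domain map looks like internally. One possible reading, sketched below purely as an assumption, is a graph that records which tags influence which others. The tag names, the dictionary layout, and the rule "alert only on deviating tags that have no deviating upstream tag" are all hypothetical.

```python
# Hypothetical domain map: each tag lists the upstream tags that influence it.
domain_map = {
    "oven_outlet_moisture": ["oven_temperature", "belt_speed"],
    "oven_temperature": ["burner_gas_flow"],
    "belt_speed": [],
    "burner_gas_flow": [],
}

def upstream_tags(tag, dmap):
    """Return every tag upstream of `tag`, following links transitively."""
    seen = []
    stack = list(dmap.get(tag, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(dmap.get(current, []))
    return seen

def targeted_alerts(deviating, dmap):
    """Keep only deviating tags with no deviating upstream tag (likely origins)."""
    deviating = set(deviating)
    return sorted(
        tag for tag in deviating
        if not deviating.intersection(upstream_tags(tag, dmap))
    )

# Three tags deviate at once, but two of them are downstream consequences.
deviating_now = ["oven_outlet_moisture", "oven_temperature", "burner_gas_flow"]
print(targeted_alerts(deviating_now, domain_map))
```

With the relationships known, three simultaneous deviations collapse into a single alert on the tag most likely to be the origin, instead of three separate alarms.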