Production is at the height of automation today, yet wear and tear on the mechanical parts of production equipment can cause production to be suspended on one or more downstream machines. Monitoring the condition of machines and predicting their failures can save manufacturing companies considerable sums and bring automation to this area as well.
Machine learning algorithms allow us to inspect machines non-invasively
To monitor machine health and predict failures, we analyze the sounds the machines emit. This is a non-invasive and promising way to detect potential failures in time. A few companies already employ experts with trained ears who listen to the machines to diagnose them. Unfortunately, these experts are costly, and their capabilities are limited (hearing range, the need to sleep at night, etc.).
To compensate for this lack of manpower and on-site capabilities, at Neuron soundware we use sensors with a broad frequency range to record machine sounds, and machine learning (ML) techniques to automate and improve the early detection of potential failures from those sounds, with no diagnostician needed on the premises.
Machine learning algorithms still have limits that must be considered
Letting machine learning algorithms detect failures brings several challenges. Most ML research focuses on tasks with many well-labeled samples, where the goal is to classify a new sample into one of the predefined classes. Unfortunately, these methods are of very limited use in predictive maintenance, because a customer typically gives us very few recorded failures, or none at all, at the time we deploy an ML model for a new machine.
In these cases, we can use anomaly detection methods, which tell us whether a newly observed sound is similar to the nominal sounds (the sounds considered to be those of a healthy machine) or is a new sound, possibly caused by an early stage of machine failure.
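As a minimal sketch of the idea, the following detector is fitted on healthy samples only and flags anything that lies too far from them. It is an illustration under stated assumptions, not the model we deploy: the two-dimensional feature vectors, the `fit_detector` and `is_anomalous` helpers, and the `margin` parameter are all hypothetical, and the standard library stands in for a real feature-extraction pipeline.

```python
import math
import random


def fit_detector(healthy, margin=1.5):
    """Fit a centroid-and-radius detector on feature vectors of healthy sounds only."""
    n, dims = len(healthy), len(healthy[0])
    mean = [sum(v[d] for v in healthy) / n for d in range(dims)]
    # Per-feature standard deviation; fall back to 1.0 for constant features.
    std = [
        math.sqrt(sum((v[d] - mean[d]) ** 2 for v in healthy) / n) or 1.0
        for d in range(dims)
    ]

    def score(v):
        # Distance from the healthy centroid, measured in standard deviations.
        return math.sqrt(sum(((v[d] - mean[d]) / std[d]) ** 2 for d in range(dims)))

    # The "healthy area": everything within `margin` times the farthest healthy sample.
    radius = margin * max(score(v) for v in healthy)
    return score, radius


# Hypothetical two-dimensional features of 200 healthy recordings (assumption).
rng = random.Random(0)
healthy = [[rng.gauss(1.0, 0.1), rng.gauss(5.0, 0.2)] for _ in range(200)]
score, radius = fit_detector(healthy)


def is_anomalous(features):
    """True when the sample falls outside the healthy area."""
    return score(features) > radius


print(is_anomalous([1.02, 5.1]))  # a sample close to the healthy cluster
print(is_anomalous([2.0, 7.0]))   # a sample far away from it
```

Note that `margin` has to be chosen without any broken-machine sounds to check it against, so we cannot tell whether the healthy area is too tight or too loose.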
The biggest challenge for a machine learning trainer: the missing sounds of broken machines
When new anomaly detection models are developed, their performance is hard to evaluate without the sounds of broken machines. In this article, we describe how we address the problem of these missing sounds so that we can create models that are actually useful. In principle, several methods can generate the missing sounds, which also allows us to use classification methods as anomaly detectors.
[Regular anomaly detection. Given a few samples representing a healthy machine, we can create a model that distinguishes anomalies: all samples within the blue area are considered healthy, and samples outside it are considered anomalous.]
What do we do when we need to detect an unknown failure?
Solution no.1: Model-Based Solution
The first possible solution is to create an exact mathematical simulation of the machine's sound emission process. Such a simulation can then also be used to generate the sounds of different failures with high fidelity.
Obviously, this approach is challenging and does not scale, because each machine must be analyzed in detail and given its own detailed model. Moreover, the mathematical software needed is expensive, and the modeling requires domain experts. This approach is therefore used where the effort pays off, for example in nuclear power plants to simulate turbines and their faulty states.
[ Retrained classifier with data generated from the model-based approach. Generated samples (of both healthy and broken machines) are depicted inside circles. The solid line represents the newly trained model, while the dashed line shows the baseline anomaly detector trained on observed healthy data (see the figure above). ]
Solution no.2: Data Augmentation
An easier and more scalable approach is to use existing sounds that could be similar to the sounds of broken engines (e.g., the sound of a drill, cracking, squeaking) and combine those with the sound of the healthy engines. The advantage of this approach is that we can control the ratio of the normal and anomalous sound, which allows us to control the model sensitivity.
The disadvantages are that (i) artificially combining two sounds may not produce a realistic combination, (ii) we are limited to the anomalous sounds available in our database, and (iii) the resulting augmented sounds may differ from the sounds of engines that are really broken.
[ Retrained classifier with data augmentation – other sounds. Augmented data are in circles, the intensity of the other sound is illustrated by the size of the symbol. ]
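The mixing step with a controlled ratio can be sketched as follows. This is a simplified illustration rather than our production code: the `rms` and `mix_at_ratio` helpers are hypothetical, the ratio is assumed to be defined on RMS levels, and synthetic sine tones stand in for real recordings (plain Python lists are used instead of audio buffers to keep the example self-contained).

```python
import math


def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_ratio(nominal, anomaly, ratio):
    """Add `anomaly` to `nominal` so that rms(added part) == ratio * rms(nominal)."""
    if not anomaly:
        raise ValueError("anomaly clip is empty")
    # Loop the anomaly clip if it is shorter than the nominal recording.
    repeats = -(-len(nominal) // len(anomaly))  # ceiling division
    looped = (anomaly * repeats)[: len(nominal)]
    level = rms(looped)
    if level == 0:
        raise ValueError("anomaly clip is silent")
    gain = ratio * rms(nominal) / level
    return [n + gain * a for n, a in zip(nominal, looped)]


# Synthetic stand-ins (assumptions): a 50 Hz hum as the healthy engine
# and a short 1200 Hz tone as the "squeak" to be mixed in.
RATE = 8000
nominal = [math.sin(2 * math.pi * 50 * t / RATE) for t in range(RATE)]
squeak = [math.sin(2 * math.pi * 1200 * t / RATE) for t in range(RATE // 4)]

# The squeak is added at half the RMS level of the healthy sound.
mixed = mix_at_ratio(nominal, squeak, ratio=0.5)
```

A small `ratio` produces subtle anomalies and trains a more sensitive model, while a large `ratio` produces obvious ones.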
Solution no.3: Audio Transfer from Different Datasets
The previous method relies on having suitable sounds to augment with (drill, cracking, squeaking), and the result may not be very realistic. We can overcome these issues by collecting the sounds of real failures. Suppose we are training a model for a brand-new 1 MW gas generator for which we have only healthy sound. Since we have already recorded similar generators during normal operation and before their various failures, we can extract the knowledge of how the sound changes before a failure. Although each generator type sounds different, the effects of failures on the sound are often very similar. Once we extract the sound of the failure, we can combine it with the sound of a healthy generator to get a very realistic sound of its failure.
[ Retrained classifier with data augmentation using sounds of failures. Sounds of other machines are in circles and dotted arrows represent how the sound of a healthy machine changed when the machine got broken. When we apply it to the observed healthy sounds, we can create sounds of the broken machine – red minus signs. ]
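One simple way to picture the dotted arrows is as difference vectors in a feature space. The sketch below takes the average change from healthy to broken on a reference machine and applies it to the healthy samples of a new machine. It is only an illustration under assumptions: the three-band energy features, the helper names, the `strength` parameter, and the choice of a purely additive shift are all hypothetical, and a real transfer may need a richer model of how the sound changes.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equally long feature vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def failure_delta(ref_healthy, ref_broken):
    """Average change in features when the reference machine breaks (the arrow)."""
    healthy_mean = mean_vector(ref_healthy)
    broken_mean = mean_vector(ref_broken)
    return [b - h for h, b in zip(healthy_mean, broken_mean)]


def transfer_failure(target_healthy, delta, strength=1.0):
    """Shift each healthy sample of the new machine along the failure direction."""
    return [[x + strength * d for x, d in zip(v, delta)] for v in target_healthy]


# Hypothetical band-energy features (low, mid, high) of a reference generator.
ref_healthy = [[1.0, 0.5, 0.1], [1.2, 0.4, 0.1]]
ref_broken = [[1.1, 0.5, 0.6], [1.1, 0.6, 0.8]]

# Healthy features of the new generator, which has no recorded failures yet.
target_healthy = [[2.0, 1.0, 0.2]]

delta = failure_delta(ref_healthy, ref_broken)
synthetic_broken = transfer_failure(target_healthy, delta)
print(synthetic_broken)
```

The synthetic samples can then be labeled as broken and used together with the observed healthy samples to train a classifier.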
We are currently building a database of failure sounds for different types of machines and using it to train new models.
Fortunately, there are ways to tackle the missing sounds of failures and hence detect the anomalies
ML researchers often struggle with a lack of data and/or labels. To some extent, however, data can be generated and used to improve ML performance. The disadvantage of generating artificial samples is that they can mislead the learning models or introduce biases. But don’t forget that “All models are wrong, but some are useful”.
How we proceeded in the case of compressors in the automotive industry
The Model-Based Solution is the traditional method we used at the very beginning. We now use Data Augmentation in all standard projects (generators, engines, motors, etc.). We augment the data with extra drill “anomaly sounds” to validate the anomaly detector and make sure it works correctly.
In the case of a compressor in the automotive industry, we recorded nominal data. The life expectancy of the compressor was 1 year, and so far it has not suffered any fatal failure. We therefore augmented the nominal sounds with additional sounds from Google AudioSet. The model then detected the intensity of the augmented sound mixed with the nominal sound. In this way we verified that the model works as intended and should detect a real anomaly when it happens.
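This kind of validation can be pictured as a sweep: mix the extra sound into the nominal recording at increasing intensity and check that the anomaly score rises with it. The sketch below is a toy version under assumptions. The `high_freq_energy` score is a crude stand-in for a trained model, the `sensitivity_sweep` helper and the threshold of 2.0 are hypothetical, and a synthetic tone replaces an AudioSet clip.

```python
import math


def high_freq_energy(signal):
    """Mean squared first difference: a crude proxy for high-frequency energy."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:])) / (len(signal) - 1)


def sensitivity_sweep(nominal, anomaly, gains, threshold):
    """Mix `anomaly` into `nominal` at each gain and report (gain, score, detected)."""
    baseline = high_freq_energy(nominal)
    results = []
    for gain in gains:
        mixed = [n + gain * a for n, a in zip(nominal, anomaly)]
        score = high_freq_energy(mixed) / baseline  # 1.0 means "sounds nominal"
        results.append((gain, score, score > threshold))
    return results


# Synthetic stand-ins (assumptions): a 50 Hz hum for the compressor
# and a 1500 Hz tone for the added anomalous sound.
RATE = 8000
nominal = [math.sin(2 * math.pi * 50 * t / RATE) for t in range(RATE)]
drill = [math.sin(2 * math.pi * 1500 * t / RATE) for t in range(RATE)]

results = sensitivity_sweep(
    nominal, drill, gains=[0.0, 0.01, 0.05, 0.1, 0.2], threshold=2.0
)
for gain, score, detected in results:
    print(f"gain={gain:.2f} score={score:.2f} detected={detected}")
```

The lowest gain at which the detector fires gives a rough measure of its sensitivity. A score that does not grow with the gain would suggest the model is not responding to the added sound at all.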
Our customer now monitors the condition of the machine continuously. They can check its condition from anywhere or listen to how it sounded, for example, last night. If the sound changes suspiciously, they receive a notification and can inspect the device in time.
We see the future of our machine learning in audio transferred from datasets of different machine failures. The goal is to augment the nominal sound so that we can detect particular anomalies on the machine.
Join us on this exciting journey. We will be very happy to discuss how we can implement our technology in your case and guard your machines.