Many customers ask us how artificial intelligence detects audio anomalies. We have prepared an article with an illustrative example: escalator monitoring.
Neuron Soundware helps to digitize data from machines and then understand it. In this case, it first converts what a human would hear into a visual form. Based on these graphical values, the system then looks for anomalous conditions in the data and sends smart alerts to customers.
This works in a similar way to the warning lights in a car when the oil is running low, a door is open, or the seatbelts are not fastened while driving.
An escalator operator needs the same: to see what condition each piece of equipment is in and whether there are any unwanted symptoms that could indicate an impending breakdown.
The basic input for processing a machine’s audio is digitized data. The raw audio recordings need to be processed and prepared for artificial intelligence.
The basic measured quantity is the energy of the sound waves coming from the measured object. This sound energy (vibration) is represented by a graph where the y-axis shows the amount of energy (signal magnitude, or loudness) and the x-axis shows how that energy changes over time. On rotating machines the signal is most often periodic; simplistically, it can be pictured as a sinusoidal waveform.
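To make the idea concrete, here is a minimal sketch in Python. The 25 Hz shaft tone, the sampling rate, and the window length are illustrative assumptions, not values from a real escalator:

```python
import numpy as np

# A synthetic "rotating machine" signal: an assumed 25 Hz shaft tone
# plus background noise stands in for a real microphone recording.
fs = 8000                       # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)   # two seconds of samples
x = np.sin(2 * np.pi * 25 * t) + 0.1 * np.random.randn(t.size)

# The y-axis quantity described above is the signal magnitude over time;
# a simple energy measure is the RMS level in short windows.
window = fs // 10               # 100 ms windows
rms = [np.sqrt(np.mean(chunk ** 2))
       for chunk in np.split(x, x.size // window)]
print(f"mean RMS level: {np.mean(rms):.3f}")
```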
In the case of a particular escalator recording, for example, we display the sound as acoustic pressure as follows:
Subsequently, these sound signals are mathematically converted into the so-called frequency spectrum. Frequency spectra are better suited to detailed AI analysis than the time-domain data alone. After conversion from the time domain, the escalator’s frequency spectrum looks like this:
Different parts of the device have different sound and vibration characteristics, generated by the mechanical movement of the individual components. Because each mechanical disturbance manifests at its own characteristic frequencies, a deeper frequency analysis can divide and separate the resulting sound sample into the individual disturbances.
These manifestations are illustrated by sine-wave sounds. Each component “sounds” slightly different, as different faults manifest themselves at different frequencies.
Since machines are complex, the measured signals are far from an idealized sine wave. The measured sound sample therefore has to be decomposed into individual sub-signals, and this is where the Fast Fourier Transform (FFT) is best suited.
The Fourier transform decomposes the signal into individual sinusoids and translates them into a frequency spectrum. In the graph, this reduces the signal to small “teeth” (peaks) in the frequency spectrum, which are much easier to work with.
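As a rough sketch of this step, the snippet below decomposes a signal built from two assumed component tones (25 Hz and 120 Hz, chosen only for illustration) and recovers them as peaks in the spectrum:

```python
import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
# Assumed example: two mechanical components at 25 Hz and 120 Hz.
x = np.sin(2 * np.pi * 25 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

# FFT: decompose the time signal into sinusoids ("teeth" in the spectrum).
spectrum = np.abs(np.fft.rfft(x)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)

# The strongest peaks correspond to the dominant sinusoids.
peaks = freqs[np.argsort(spectrum)[-2:]]
print(f"dominant frequencies: {sorted(peaks)} Hz")   # -> [25.0, 120.0]
```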
However, a real-world machine does not generate an ideally regular signal during operation. We therefore compute a large number of fast Fourier transforms and, to analyse their evolution over time, use a special display method called a spectrogram.
The individual colours in the figure indicate the signal intensity (amplitude); the y-axis indicates the frequency and the x-axis the evolution over time. This puts the sound into visual and data form, and at a glance we can better see how the sound and vibration manifestations of the machine change over time.
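A spectrogram of this kind can be sketched with scipy; the fault tone appearing halfway through is an assumed example, not escalator data:

```python
import numpy as np
from scipy import signal as sig

fs = 8000
t = np.arange(0, 4.0, 1 / fs)
# Assumed example: a steady 25 Hz tone that develops a 300 Hz "fault"
# tone halfway through, mimicking a change in the machine's sound.
x = np.sin(2 * np.pi * 25 * t)
x[t >= 2.0] += 0.8 * np.sin(2 * np.pi * 300 * t[t >= 2.0])

# Many short FFTs stacked over time: Sxx[i, j] is the intensity (the
# colour in the figure) at frequency freqs[i] and time times[j].
freqs, times, Sxx = sig.spectrogram(x, fs=fs, nperseg=1024)
print(Sxx.shape)   # (frequency bins, time frames)
```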
In simplified form, the processing of the sound from the time-domain sample to the frequency spectrum looks like this:
Or in the case of escalators:
Artificial intelligence, in our terms, means clever algorithms that look for patterns in data and evaluate them.
In reality, in simple terms, it looks something like this: the algorithms break the spectrogram down into its individual parts and compare these with existing sound and vibration samples in the database, for example with the nominal machine manifestations (when the machine is running fine), or with known so-called fault libraries.
Thanks to such data comparison, we can then recognize whether an undesirable or faulty sound or vibration manifestation is developing on a machine component. This can be seen in the figure below, which shows examples of defined frequencies for different types of faults.
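In spirit, the comparison against a fault library might look like the sketch below. The fault names, frequencies, and thresholds are invented for illustration; the real libraries and matching logic are more elaborate:

```python
import numpy as np

# Hypothetical fault library: characteristic frequencies in Hz.
FAULT_LIBRARY_HZ = {
    "bearing outer race": 157.0,
    "gear mesh wear": 480.0,
    "belt slip": 12.5,
}

def matched_faults(freqs, spectrum, tolerance_hz=2.0, threshold=0.1):
    """Return faults whose characteristic frequency shows
    above-threshold energy in the measured spectrum."""
    hits = []
    for name, f0 in FAULT_LIBRARY_HZ.items():
        band = (freqs > f0 - tolerance_hz) & (freqs < f0 + tolerance_hz)
        if band.any() and spectrum[band].max() > threshold:
            hits.append(name)
    return hits

# Demo with a 1 Hz resolution spectrum carrying energy at 157 Hz.
freqs = np.fft.rfftfreq(8000, 1 / 8000)
spectrum = np.zeros_like(freqs)
spectrum[157] = 0.4
print(matched_faults(freqs, spectrum))   # -> ['bearing outer race']
```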
In the example of a broken electric motor driving the escalator it would look like this:
In the first two figures we see a distinct change at the end of the trend. In nGuard this is represented graphically by the anomaly score, i.e. the level of deviation of the sounds from normal, exceeding the threshold at which the AI evaluates the sound as anomalous.
To illustrate this, we present the data processing by artificial intelligence in the following diagram, which uses self-driving cars as an analogy.
These also need to collect a huge amount of data: what a traffic sign looks like, what a traffic light looks like, what a turn looks like, and so on. The same applies to artificial intelligence monitoring machines: the more sound data it has about the machines, the better it can evaluate the actual sound a machine is making.
We use millions of anonymized recordings of machines around the world to evaluate machine status. In addition, we collect data from a specific machine and its components.
Artificial intelligence helps us to express sound mathematically in the form of a vector. This numerical expression of sound in vector form is by far the most difficult discipline in sound analysis. The vector space in the visualization below can be imagined as a “space with arrows”.
We construct this “arrow space”, a vector-expressed set of all possible machine sounds, based on a database of machine sounds, a database of machine-specific sounds, and the sounds of individual components. See figure below:
When evaluating the machine sound, a mathematical calculation always compares the current sound vector with the set of possible sounds and evaluates whether it matches the nominal sound, or whether it is a different vector and the sound therefore differs.
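The real embedding is produced by a trained neural network; as a stand-in, the sketch below reduces a spectrum to a crude band-energy vector and measures its distance to a set of nominal vectors:

```python
import numpy as np

def embed(spectrum, n_bands=8):
    """Crude stand-in embedding: normalized average energy in a few
    frequency bands (the real system uses a learned neural network)."""
    bands = np.array_split(spectrum, n_bands)
    v = np.array([b.mean() for b in bands])
    return v / (np.linalg.norm(v) + 1e-12)

def distance_to_nominal(current, nominal_set):
    """Distance from the current sound vector to the closest
    nominal ("green arrow") vector."""
    return min(np.linalg.norm(current - n) for n in nominal_set)

# Demo with random spectra standing in for recorded ones.
nominal_set = [embed(np.random.rand(512)) for _ in range(5)]
current = embed(np.random.rand(512))
print(f"distance to nominal: {distance_to_nominal(current, nominal_set):.3f}")
```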
The following figures then show an example of the whole chain, from collecting the sound energy to comparing vectors and evaluating the anomaly score.
In each row of the figure we see:
We can then trace the evolution of the sound over time (from left to right) as follows:
The last plot, the anomaly score, expresses the distance from the nominal state in vector space. For our purposes, we use a value of 0.5 as the standard distance from the nominal sound: anything within 0.5 lies inside the green envelope (a small distance from the green, nominal arrow), while anything above 0.5 lies outside the envelope and therefore indicates an anomalous condition.
This figure thus shows what the analysis of an anomaly score looks like.
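Expressed as code, the envelope check is a simple threshold; the scores below are made up, and 0.5 is the boundary described above:

```python
# 0.5 is the boundary of the green envelope described above.
ANOMALY_THRESHOLD = 0.5

def classify(scores):
    """Label each anomaly score as nominal or anomalous."""
    return ["anomalous" if s > ANOMALY_THRESHOLD else "nominal"
            for s in scores]

print(classify([0.10, 0.30, 0.45, 0.62, 0.90]))
# -> ['nominal', 'nominal', 'nominal', 'anomalous', 'anomalous']
```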
Traditional vibrodiagnostics generates a frequency spectrum from the sounds and then evaluates the individual frequencies. Based on mechanical knowledge and the vibration manifestations of individual machines, vibrodiagnostics can then estimate the approximate cause of the vibration.
AI diagnostics goes further and expresses the frequency spectrum mathematically using numerical vectors. Subsequent analysis of these vectors allows a finer and more detailed analysis of the sounds and has wider applications: for example, detecting sounds on non-rotating machines, or sounds that are difficult to identify using vibrodiagnostics (creaks, rustles, etc.).
Neuron Soundware was among the top 5 in the DCASE competition, a worldwide competition in sound analysis using artificial intelligence. The contestants receive recordings of machines without any faults and train a model (a computational algorithm that works with the sounds) on them. The teams are then given sounds with and without a malfunction, and the goal is to distinguish these sounds from each other.
The quality of the results, and therefore the ranking of the competing teams, depends on how each group of machine learning engineers obtains the vector that represents the sound, i.e. how well they can translate the sound into a numerical representation.
The Neuron Soundware team was ranked among the Top 5 teams in the world in the 2020 DCASE international competition in the application of artificial intelligence to predictive machine maintenance. In 2021, the Neuron Soundware team defended this ranking.
Artificial intelligence in the case of Neuron Soundware can thus be imagined as a set of algorithms comparing the sound and vibration manifestations of machines over time. Thanks to its rich knowledge of machines and their components, the AI can evaluate what type of fault is present and what needs to be done to fix it.