How artificial intelligence by Neuron Soundware works

Many customers ask us how artificial intelligence works in detecting audio anomalies. We have prepared an article with an illustrative example of escalator monitoring.

The technical diagnostics industry is facing a shortage of human experts. Thanks to advances in technology, human experts can be replaced by artificial intelligence, which can process very large amounts of data in a relatively short time.

It starts with the digitisation of sound

Neuron Soundware helps to digitise data from machines and then understand it. In this case, it first converts what a human would hear into a visual form. The system then looks for anomalous conditions in this graphical data and sends smart alerts to customers when it finds them.

This works much like the warning lights in a car that come on when the oil is running low, a door is open, or a seatbelt is not fastened while driving.

An escalator operator needs the same thing: to see what condition each piece of equipment is in and whether there are any unwanted symptoms that could indicate an impending breakdown.

Collecting sounds from machines is done in 3 steps

  1. First, we install the sensors at the locations on the machine where we expect components to fail most often.
  2. The sensors send audio signals via a cable to the nEdge IoT (Internet of Things) device. This device is either connected to the internet or processes the audio signals right at the point of machine operation, which is called on-edge processing.
  3. In the nEdge IoT device, the sound signals are converted into digital form, i.e. a form that a computer can process. This audio data is then evaluated by machine learning algorithms running on a microcomputer housed in the nEdge “box”.

[Figure: Neuron Soundware solution for predictive maintenance]

By comparing healthy and current machine data, the nGuard software application then displays actionable information for machine maintenance. These are the equivalent of the lights we know from the dashboard of a car: either the machine is fine, or there’s something wrong with it, for example a part needs lubrication or replacement. This evaluation is done by clever algorithms based on general mechanical knowledge of the machine, knowledge of the machine’s operating history, and its accumulated nominal sounds when running without failure. In addition, any feedback on the machine’s operation that the operator provides to the Neuron Soundware specialists helps to refine these algorithms.

How audio digitisation works – in more detail

The basic input for processing a machine’s audio is digitised data. This data has to be extracted from the machine’s sound and prepared for the artificial intelligence.

The basic measured quantity is the energy of the sound waves coming from the measured object. The sound energy (vibration) is represented by a graph, where the y-axis shows the amount of energy (signal magnitude/loudness) and the x-axis shows the time over which the energy/loudness/vibration changes. On rotating machines, the signal most often repeats periodically. In simplified terms, it can be pictured as a sinusoidal waveform.

[Figure: sound wave graph]

For a particular escalator recording, for example, we display the sound as acoustic pressure as follows:

[Figure: sound displayed as acoustic pressure]

These sound signals are then mathematically converted into a so-called frequency spectrum. Frequency spectra are better suited to detailed AI analysis than the time-course data alone. After conversion from the timeline, the frequency spectrum of the escalator looks like this:

[Figure: sound signals converted into a frequency spectrum]

Conversion from timeline to frequency spectrum


Different parts of the device have different sound and vibration characteristics, which are generated by the mechanical movement of the individual components. Because different mechanical faults show up at different frequencies, a deeper frequency analysis can separate the resulting sound sample into the contributions of individual faults.

These manifestations can be illustrated with sine waves. Each component “sounds” slightly different, as the different faults manifest themselves at different frequencies.

Since the machines are complex, the measured signals are very far from the idealized sine wave. Therefore, it is necessary to decompose the measured sound sample into individual sub-signals and this is where the Fast Fourier Transform (FFT) is best suited. 

The Fourier transform decomposes the signal into individual sinusoids and translates these into a frequency spectrum. On the graph, the signal then appears as just a few narrow “teeth” (peaks) in the frequency spectrum, and these teeth are much easier to work with.
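
As a rough illustration of this decomposition, the sketch below builds a signal from two sine waves and recovers their frequencies with a small radix-2 FFT. The sample rate, the 50 Hz and 120 Hz components, and the helper names are assumptions chosen for the example; they are not values from a real machine or from Neuron Soundware's implementation.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequencies(samples, sample_rate, count):
    """Return the `count` strongest frequencies in Hz, sorted ascending."""
    n = len(samples)
    spectrum = fft(samples)
    # For a real-valued signal only the first half of the spectrum is unique.
    magnitudes = [abs(c) for c in spectrum[: n // 2]]
    by_strength = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)
    return sorted(k * sample_rate / n for k in by_strength[:count])


SAMPLE_RATE = 1024  # Hz (hypothetical)
N = 1024            # one second of signal, so each frequency bin is 1 Hz wide

# Hypothetical "machine sound": a 50 Hz component plus a weaker 120 Hz component.
signal = [
    math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    for t in range(N)
]

peaks = dominant_frequencies(signal, SAMPLE_RATE, 2)
print(peaks)  # [50.0, 120.0]
```

The two "teeth" in the spectrum sit exactly at the frequencies of the two sine waves, even though the summed waveform itself no longer looks like a clean sinusoid.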

Frequency analysis allows us to perform technical sound diagnostics

However, when we look at a real-world machine, we find that the signal it generates during operation is not perfectly regular. We therefore compute a large number of fast Fourier transforms over successive short segments of the signal. To analyse how they develop over time, we use a special type of display called a spectrogram.

Spectrogram - combines frequency analysis and time

The individual colours in the figure indicate the signal intensity (amplitude). The y-axis indicates the frequency and the x-axis the evolution over time. This puts the sound into visual and data form, in which changes in the machine’s sound and vibration over time are much easier to see.
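
A minimal sketch of how such a spectrogram can be computed is shown below: the signal is cut into short frames, each frame is windowed and transformed, and the spectra are stacked along the time axis. For brevity it uses a plain discrete Fourier transform rather than an optimised FFT. The sample rate, frame size, and test signal (a hum that jumps from 64 Hz to 128 Hz) are assumptions made up for the example.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude spectrum of one frame (plain DFT, unique half only)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(samples, frame_size, hop):
    """Return one magnitude spectrum per Hann-windowed frame (time-ordered)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size) for i in range(frame_size)]
    columns = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        columns.append(dft_magnitudes(frame))
    return columns


SAMPLE_RATE = 512  # Hz (hypothetical)
FRAME = 64         # samples per frame, i.e. 8 Hz per frequency bin

# One second of a 64 Hz hum followed by one second at 128 Hz,
# standing in for a change in the machine's sound.
signal = [math.sin(2 * math.pi * 64 * t / SAMPLE_RATE) for t in range(512)]
signal += [math.sin(2 * math.pi * 128 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(signal, FRAME, hop=64)

# Strongest frequency in each time slice (the brightest point of each column).
loudest_hz = [
    max(range(len(column)), key=lambda k: column[k]) * SAMPLE_RATE / FRAME
    for column in spec
]
print(loudest_hz[0], loudest_hz[-1])  # 64.0 128.0
```

Each entry of `spec` corresponds to one vertical slice of the spectrogram; the jump in `loudest_hz` halfway through is the kind of change over time that a single frequency spectrum of the whole recording would blur together.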

In simplified form, the processing of the sound from a time sample to a frequency spectrum looks like this:

Or in the case of escalators:

How AI processes and evaluates data

Artificial intelligence, in our terms, means clever algorithms that look for patterns in data and evaluate them. 

In simple terms, it works something like this: the algorithms break down the spectrogram into its individual parts and compare these to existing sound and vibration samples in the database, for example to the nominal machine manifestations (when the machine is running fine) or to known so-called fault libraries.

Thanks to this comparison, we can recognise whether an undesirable sound or vibration manifestation, i.e. one that points to a malfunction, is developing on a machine component. This can be seen in the figure below, which shows examples of defined frequencies for different types of faults.
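
To make the idea of a fault library more concrete, here is a deliberately simplified sketch that matches observed spectrum peaks against characteristic fault frequencies. The fault names, frequencies, tolerance, and function name are all hypothetical and chosen only for illustration; a real system compares far richer patterns than single frequencies.

```python
# Hypothetical fault library: one characteristic frequency (Hz) per fault type.
FAULT_LIBRARY = {
    "bearing wear": 87.0,
    "chain misalignment": 142.0,
    "motor imbalance": 25.0,
}


def match_faults(peak_frequencies, library, tolerance_hz=2.0):
    """Return the sorted names of faults whose frequency lies near an observed peak."""
    matches = set()
    for name, fault_hz in library.items():
        if any(abs(peak - fault_hz) <= tolerance_hz for peak in peak_frequencies):
            matches.add(name)
    return sorted(matches)


# Made-up peaks, as they might be read off a frequency spectrum.
observed_peaks = [24.5, 50.0, 141.2]
suspected = match_faults(observed_peaks, FAULT_LIBRARY)
print(suspected)  # ['chain misalignment', 'motor imbalance']
```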

In the example of a broken electric motor driving the escalator it would look like this: 

  • The first picture shows the sound manifestations.
  • The second picture shows a spectrogram.
  • The third picture shows the anomaly score in the nGuard portal.

In the first two figures we see a distinct change at the end of the trend. In nGuard, this shows up as the anomaly score exceeding its limit, i.e. the level of deviation from the normal sound beyond which the AI evaluates the sound as anomalous.

To illustrate this, the following diagram presents data processing by artificial intelligence from the perspective of self-driving cars.

These also need to collect a huge amount of data: what a traffic sign looks like, what a traffic light looks like, what a turn looks like, and so on. The same applies to artificial intelligence monitoring machines. The more “sound” data it has about the machines, the better it can evaluate the actual sound the machine is making.

We use millions of anonymized recordings of machines around the world to evaluate machine status. In addition, we collect data from a specific machine and its components.

How artificial intelligence models work for anomaly detection

Artificial intelligence helps us to express sound mathematically in the form of a vector. It is the numerical expression of sound in vector form that is by far the most difficult discipline in sound analysis. We can imagine the vector space from the visualization below as a “space with arrows”. 

We construct this “arrow space”, a vector-expressed set of all possible machine sounds, based on a database of machine sounds, a database of machine-specific sounds, and the sounds of individual components. See figure below:

When evaluating the machine sound, a mathematical calculation always compares the current sound vector with the set of possible sounds. It evaluates whether the current vector matches the nominal sound, or whether it is a different vector and the machine sound therefore differs from nominal.
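
The comparison described above can be sketched as a distance calculation between vectors. The example below uses cosine distance and tiny three-dimensional vectors purely as assumptions for illustration; the text does not state which distance measure or vector size is actually used, and the function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def distance_from_nominal(current, nominal_vectors):
    """Distance from the current sound vector to the closest nominal vector."""
    return min(cosine_distance(current, reference) for reference in nominal_vectors)


# Hypothetical sound vectors; real ones would have many more dimensions.
nominal = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
healthy = [0.95, 0.05, 0.0]   # points almost the same way as the nominal "arrows"
faulty = [0.0, 0.0, 1.0]      # points in a completely different direction

healthy_distance = distance_from_nominal(healthy, nominal)
faulty_distance = distance_from_nominal(faulty, nominal)
print(healthy_distance < 0.01, faulty_distance)
```

In this picture, a healthy machine produces an arrow that stays close to the nominal arrows, while a developing fault pushes the arrow away from them.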

The following figures show an example of the whole chain, from collecting the sound energy, through comparison, to the evaluation of the anomaly score.

In each row we see:

  1. A graphical representation of the recorded machine sound and its loudness over time
  2. A spectrogram that graphically displays the sound as a frequency spectrum
  3. The individual parts of the timeline converted into a mathematical expression as vectors
  4. Anomaly score, i.e. a graph that shows the deviation of the sound from its nominal value over time

We can then trace the evolution of the sound over time (from left to right) as follows:

  1. Figure: nominal (normal) machine sound
  2. Figure: nominal machine sound under load
  3. Figure: deflected chain track
  4. Figure: chain track damage
  5. Figure: chain track sound after repair

The last plot shows the anomaly score, i.e. the distance in vector space from the nominal state. For our purposes, we use a value of 0.5 as the standard limit on the distance from the nominal sound. Anything up to 0.5 lies within the green envelope (a small distance from the green, i.e. nominal, arrow), and 0.5 itself is the boundary of that envelope. Anything above 0.5 lies outside the envelope and therefore indicates an anomalous condition.

This figure therefore shows what the analysis of an anomaly score looks like.
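
The 0.5 rule can be written down in a few lines. The sketch below applies it to a made-up series of scores; only the 0.5 boundary comes from the text, while the score values and function names are hypothetical.

```python
THRESHOLD = 0.5  # boundary of the green envelope, as stated in the text


def classify(score, threshold=THRESHOLD):
    """Scores up to and including the threshold count as nominal."""
    return "anomalous" if score > threshold else "nominal"


def first_alert_index(scores, threshold=THRESHOLD):
    """Index of the first score above the threshold, or None if there is none."""
    for index, score in enumerate(scores):
        if score > threshold:
            return index
    return None


# Made-up scores over time: normal running, a developing fault, then a repair.
scores = [0.12, 0.18, 0.31, 0.50, 0.64, 0.81, 0.15]

labels = [classify(score) for score in scores]
alert_at = first_alert_index(scores)
print(labels)
print(alert_at)  # 4
```

Note that a score of exactly 0.5 is still treated as nominal here, matching the statement that 0.5 is the boundary of the envelope and only values above it are anomalous.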

The difference between AI fault detection and traditional vibrodiagnostics

Traditional vibrodiagnostics generates a frequency spectrum from the sounds and then evaluates the individual frequencies. Based on mechanical knowledge and the vibration manifestations of individual machines, vibrodiagnostics can then estimate the approximate cause of the vibration.

AI diagnostics goes further and expresses the frequency spectrum mathematically as numerical vectors. Subsequent analysis of these vectors allows a finer and more detailed analysis of the sounds and has wider applications, for example detecting sounds on non-rotating machines and sounds that are difficult to identify using vibrodiagnostics (creaks, rustles, etc.).

Neuron Soundware has won awards for this method in competitions worldwide

Neuron Soundware was among the top 5 in the DCASE competition. DCASE is a worldwide competition in sound analysis using artificial intelligence. The contestants first receive machine sounds without any faults and use them to train a model (i.e. a computational algorithm that works with the sounds). The teams are then given sounds with and without a malfunction, and the goal is to distinguish these sounds from each other.

The quality of the results, and therefore the ranking of the competing teams, depends on how each group of machine learning engineers obtains the vector that represents the sound, i.e. how well it can translate the sound into a numerical representation.

The Neuron Soundware team was ranked among the Top 5 teams in the world in the 2020 DCASE international competition in the application of artificial intelligence to predictive machine maintenance. In 2021, the Neuron Soundware team defended this ranking.


Artificial intelligence in the case of Neuron Soundware can thus be imagined as a set of algorithms comparing the sound and vibration manifestations of machines over time. Thanks to its rich knowledge of the manifestations of machines and their components, AI can evaluate what type of fault it is and what needs to be done to fix it.