Acoustic imaging, like the closely related array imaging used in sonar and radio astronomy, has long played an integral part in scientific exploration and in the development of defence systems.
Most commercial acoustic cameras recover the sound intensity field by linearly combining the correlated microphone recordings, a conventional approach known as the Delay-And-Sum (DAS) beamformer.
A beamformer probes the acoustic intensity along each angular direction to detect patterns. Acoustic images obtained this way are cheap to compute but are blurred by the beam shape of the microphone array.
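To make the delay-and-sum idea concrete, here is a minimal NumPy sketch, assuming a narrowband far-field model and a small circular array in a plane. The function names (`steering_matrix`, `das_image`), the array geometry and the simulated source are illustrative assumptions, not details taken from the DeepWave paper or from any commercial camera.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_matrix(mic_xy, angles, freq_hz):
    """Far-field steering vectors, shape (n_mics, n_angles).

    mic_xy : (n_mics, 2) microphone coordinates in metres (assumed planar array)
    angles : (n_angles,) look directions in radians
    """
    wavelength = SPEED_OF_SOUND / freq_hz
    directions = np.stack([np.cos(angles), np.sin(angles)])  # (2, n_angles)
    return np.exp(2j * np.pi / wavelength * (mic_xy @ directions))


def das_image(recordings, mic_xy, angles, freq_hz):
    """Delay-and-sum intensity for each look direction.

    recordings : (n_mics, n_snapshots) complex narrowband snapshots
    """
    n_mics, n_snapshots = recordings.shape
    # Correlate the microphone recordings (sample covariance matrix).
    covariance = recordings @ recordings.conj().T / n_snapshots
    A = steering_matrix(mic_xy, angles, freq_hz)
    # Linearly combine the correlations: a^H R a for every direction a.
    intensity = np.einsum("ma,mn,na->a", A.conj(), covariance, A).real
    return intensity / n_mics**2


# Hypothetical setup: 8 microphones on a 5 cm circle, one source at 60 degrees.
rng = np.random.default_rng(0)
mic_angles = 2 * np.pi * np.arange(8) / 8
mic_xy = 0.05 * np.column_stack([np.cos(mic_angles), np.sin(mic_angles)])
freq_hz = 3000.0
angles = np.deg2rad(np.arange(360))

source = steering_matrix(mic_xy, np.deg2rad([60.0]), freq_hz)[:, 0]
signal = rng.standard_normal(256) + 1j * rng.standard_normal(256)
noise = 0.1 * (rng.standard_normal((8, 256)) + 1j * rng.standard_normal((8, 256)))
recordings = np.outer(source, signal) + noise

image = das_image(recordings, mic_xy, angles, freq_hz)
print("brightest direction (degrees):", int(np.argmax(image)))
```

Plotting `image` against the angle for such a compact array shows a single broad lobe around the source rather than a sharp peak, which is the blurring by the array's beam shape described above.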
To make these acoustic imaging systems more accurate, researchers from Switzerland have developed a recurrent neural network-based model, dubbed DeepWave, that captures acoustic patterns with higher resolution.
Why Do We Need Neural Networks For This
Before we discuss how DeepWave fares against existing techniques, it is important to know what motivated the researchers to pursue algorithm-based solutions.
One immediate challenge in acoustic imaging, wrote the researchers, is the angular resolution. Acoustic cameras are often deployed in confined environments, requiring them to be as compact and portable as possible, which further limits the achievable angular resolution.
Over time, the wide adoption of compressed sensing techniques in imaging sciences has inspired algorithmic solutions to the acoustic imaging problem.
However, these gradient-based solutions failed to replace the conventional systems for commercial usage.
Now, with the rise of deep learning models, the quest for an accurate acoustic camera has been revived. And, this time, the researchers are confident that they have found the answer with the help of recurrent neural networks.
Why RNNs And Not CNNs
Though CNNs enjoy the status of being one of the most widely used architectures across many machine learning applications, they falter in the presence of more complex image reconstruction problems where the input data may not consist of an image, as is the case in biomedical imagery, interferometry, or acoustic imaging.
Moreover, the authors observed that standard convolutional architectures cannot handle images defined on non-Euclidean domains, such as the spherical maps produced by omnidirectional acoustic cameras. This is where recurrent networks have proven to be useful.
A cascade of recurrent layers with trainable parameters, a variant of RNN that was proposed by Yann LeCun and his peers, was good at learning shortcuts in the reconstruction space, allowing it to achieve a prescribed reconstruction accuracy faster than gradient-based iterative methods.
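The link between iterative solvers and recurrent layers can be illustrated with a minimal NumPy sketch, assuming a generic sparse-recovery problem solved by iterative soft-thresholding (ISTA). Each loop iteration plays the role of one recurrent layer that filters, biases and thresholds its input. The class name `UnrolledISTA` and the toy problem are hypothetical, the weights are only initialised from the solver and never trained here, and DeepWave's own layers differ in their details.

```python
import numpy as np


def soft_threshold(x, theta):
    """Shrink every entry of x towards zero by theta (sparsifying activation)."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)


def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1, the cost that ISTA minimises."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))


class UnrolledISTA:
    """A fixed number of ISTA iterations viewed as a recurrent network.

    W, S and theta are the parameters a learned variant would train.
    In this sketch they keep their ISTA initial values, so the forward
    pass is exactly n_layers iterations of the classical solver.
    """

    def __init__(self, A, lam, n_layers):
        lipschitz = np.linalg.norm(A, 2) ** 2  # largest singular value squared
        self.W = A.T / lipschitz                                # input weights
        self.S = np.eye(A.shape[1]) - A.T @ A / lipschitz       # recurrent weights
        self.theta = lam / lipschitz                            # threshold
        self.n_layers = n_layers

    def forward(self, y):
        bias = self.W @ y  # the same input is fed to every layer
        x = np.zeros(self.S.shape[0])
        for _ in range(self.n_layers):
            x = soft_threshold(bias + self.S @ x, self.theta)
        return x


# Toy problem (hypothetical): recover a 3-sparse vector from 20 measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.0, -2.0, 1.5]
y = A @ x_true

for depth in (5, 50, 500):
    x_hat = UnrolledISTA(A, lam=0.1, n_layers=depth).forward(y)
    print(depth, "layers -> objective", lasso_objective(A, y, x_hat, 0.1))
```

With untrained weights the network needs many layers to drive the objective down. The appeal of training `W`, `S` and `theta` on example data is that a much shallower cascade can then reach a comparable accuracy, which is the kind of shortcut the paragraph above refers to.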
With techniques like pruning, the recurrent networks got even smaller with fewer parameters.
DeepWave is a first-of-its-kind recurrent neural network, developed for real-time, high-resolution acoustic imaging.
To validate this model, a conference room was set up where 8 people were gathered around a big table and spoke either in turns or simultaneously, with at most 3 concurrent speakers.
Recordings of this conversation were collected at sound frequencies ranging from 1.5 to 4.5 kHz and were mapped to true colours corresponding to lower intensities.
In the above figure, the researchers illustrate how the angular resolution differs significantly between the conventional technique on the left and DeepWave on the right. The model on the right pinpoints the sources of sound: the speakers, highlighted in blue, are matched accurately according to their acoustic intensities, whereas the DAS image lights up all around the imaging sphere.
Real-data experiments show that DeepWave has a computational speed similar to that of the state-of-the-art delay-and-sum imager, with vastly superior resolution.
DeepWave outperformed DAS by approximately 27% in resolution and 20% in contrast.
According to the authors, the DeepWave paper accomplished the following:
- Rendering high-resolution spherical maps of real-life sound intensity fields in milliseconds.
- Training on the acoustic imaging problem with a typical training time of less than an hour on a general-purpose CPU.
- Directly processing the raw microphone recordings, unlike most state-of-the-art neural-network architectures.
- Showing that the DeepWave model could be adapted to neighbouring fields such as radio astronomy, radar and sonar.
Read the original paper here.