When building audio classification models, whether for voice-control applications or ambient noise monitoring, the quality and diversity of the dataset are key to training a robust machine learning model. Collecting and labeling audio data in varied and challenging noise environments can be both time-consuming and resource-intensive. In particular, the Signal-to-Noise Ratio (SNR) of audio inputs can greatly influence model performance. To address this challenge, we introduce a new synthetic data generation block designed to accelerate the creation of robust audio datasets by mixing audio samples with noise files at specified SNR levels.
Understanding the Audio Mix Noise Generator Block
The Audio Mix Noise Generator block is available on GitHub. It uses the audiomentations library to blend clean audio files with noise files at specific SNR targets. The block speeds up the generation of realistic audio samples in diverse noise environments directly from Edge Impulse Studio.
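To make "mixing at a target SNR" concrete, here is a minimal sketch of the underlying arithmetic in plain Python. It is not the block's source code or the audiomentations implementation; the function names `rms` and `mix_at_snr` are hypothetical, and it assumes SNR is defined as the ratio of RMS amplitudes, in dB, between the clean signal and the noise.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Add `noise` to `clean`, scaled so the clean-to-noise RMS ratio is `snr_db`.

    Assumption: SNR (dB) = 20 * log10(rms(clean) / rms(scaled noise)).
    """
    if not clean or not noise:
        raise ValueError("clean and noise must be non-empty")

    # Repeat the noise if it is shorter than the clean clip, then trim it.
    repeats = -(-len(clean) // len(noise))  # ceiling division
    noise = (list(noise) * repeats)[: len(clean)]

    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise is silent; cannot scale it to a target SNR")

    # Noise level required to hit the requested SNR, and the gain to reach it.
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms

    return [c + gain * n for c, n in zip(clean, noise)]


# Example: a 440 Hz tone (1 s at 16 kHz) mixed with uniform noise at 5 dB SNR.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
mixed = mix_at_snr(clean, noise, snr_db=5.0)

# Recover the achieved SNR from the added noise component; it should be close to 5.0.
added_noise = [m - c for m, c in zip(mixed, clean)]
achieved_snr_db = 20 * math.log10(rms(clean) / rms(added_noise))
```

Lower SNR values give noisier, harder samples, so sweeping `snr_db` is one way to probe how a model degrades as conditions worsen.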
Using the Audio Mix Noise Generator block involves three steps:
- Upload Audio Data: Add audio and noise files to your organization bucket, with a dedicated folder for each (see our organization hub documentation for more details).
- Define Parameters and Generate Data: Select the path to your audio files and adjust the block's key parameters. In particular, you can fine-tune the SNR level and labeling method. If you have multiple noise files, one is picked at random for each audio sample.
- Review and Validate: New audio samples are automatically generated, enabling you to review the dataset in Edge Impulse Studio for immediate feedback on data quality. You can also decide to discard all or some of the generated samples directly from the synthetic data tab.
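The random pairing of audio and noise files described above can be sketched as follows. This is an illustration under stated assumptions, not the block's actual code: `plan_mixes`, the dictionary keys, and the file names are all hypothetical.

```python
import random


def plan_mixes(audio_files, noise_files, snr_db, seed=None):
    """Pair each audio file with one randomly chosen noise file.

    Returns one plan entry per audio file. The keys are hypothetical and only
    mirror the kind of information the post says is kept as metadata.
    """
    if not noise_files:
        raise ValueError("at least one noise file is required")
    rng = random.Random(seed)  # a seed makes the pairing reproducible
    return [
        {"audio": audio, "noise": rng.choice(noise_files), "snr_db": snr_db}
        for audio in audio_files
    ]


# Hypothetical file names, for illustration only.
audio_files = ["audio/yes_01.wav", "audio/yes_02.wav", "audio/no_01.wav"]
noise_files = ["noise/cafe.wav", "noise/street.wav"]

plan = plan_mixes(audio_files, noise_files, snr_db=5.0, seed=42)
for entry in plan:
    print(entry["audio"], "+", entry["noise"], "at", entry["snr_db"], "dB")
```

Fixing the seed is a design choice that makes a generated dataset reproducible; leaving it as `None` gives a different pairing on every run.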
While generating audio samples, the block also adds metadata to the synthesized data, including the generation date, SNR values, and references to the original audio and noise files. This is particularly useful for examining subgroup metrics (e.g., accuracy and F1-score on data with SNR between 3 and 5 dB) and for improving coverage of real-world corner cases.
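As an illustration of subgroup metrics, the sketch below filters evaluation results by SNR and computes accuracy and F1-score for that slice. The record layout (`snr_db`, `label`, `predicted`) and the function name are assumptions made for this example, not the metadata schema used by the block or by Edge Impulse Studio.

```python
def subgroup_metrics(records, snr_min, snr_max, positive_label):
    """Accuracy and F1-score for records whose SNR lies in [snr_min, snr_max].

    F1 is computed for `positive_label` against all other labels.
    Returns None when no record falls in the range.
    """
    subset = [r for r in records if snr_min <= r["snr_db"] <= snr_max]
    if not subset:
        return None

    correct = sum(r["label"] == r["predicted"] for r in subset)
    tp = sum(r["label"] == positive_label and r["predicted"] == positive_label for r in subset)
    fp = sum(r["label"] != positive_label and r["predicted"] == positive_label for r in subset)
    fn = sum(r["label"] == positive_label and r["predicted"] != positive_label for r in subset)

    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return {"n": len(subset), "accuracy": correct / len(subset), "f1": f1}


# Hypothetical evaluation results, one entry per synthesized sample.
records = [
    {"snr_db": 3.0, "label": "yes", "predicted": "yes"},
    {"snr_db": 4.0, "label": "yes", "predicted": "no"},
    {"snr_db": 4.5, "label": "no", "predicted": "no"},
    {"snr_db": 5.0, "label": "no", "predicted": "yes"},
    {"snr_db": 10.0, "label": "yes", "predicted": "yes"},
    {"snr_db": 0.0, "label": "no", "predicted": "yes"},
]

metrics = subgroup_metrics(records, snr_min=3.0, snr_max=5.0, positive_label="yes")
print(metrics)  # {'n': 4, 'accuracy': 0.5, 'f1': 0.5}
```

Comparing these numbers across several SNR ranges shows where performance starts to drop, which indicates which noise levels need more synthetic samples.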
Using the Audio Mix Noise Generator Block in Edge Impulse (Enterprise Only)
To start synthesizing audio data in Edge Impulse, follow these steps:
- Clone the Block Repository: Head to our GitHub to clone the source code.
- Install Edge Impulse CLI: Ensure you have the latest version of the Edge Impulse Command Line Interface installed.
- Initialize the Transformation Block: Use edge-impulse-blocks init to initialize a new block with custom settings. You will need your Edge Impulse credentials.
- Push Your Block: After setting up your parameters, push the new block to Edge Impulse using edge-impulse-blocks push.
- Set Up a Project: In your Enterprise project, go to Data acquisition > Synthetic data, and select the Audio Mix Noise Generator Block.
- Set Block Parameters: Configure parameters such as SNR and labeling method.
- Generate Data: Click “Generate data” to generate audio samples. You’ll see the synthesized audio files displayed in your dataset panel for quick evaluation. These will automatically be added to your project!
This setup allows you to enhance your audio dataset within minutes, tailored to your exact needs in terms of background noise and SNR targets. For further customization, you can modify the code and redeploy the block with the Edge Impulse CLI.
Recap of Audio Mix Noise Generator Block Benefits
The Audio Mix Noise Generator Block lets you synthesize a wide range of audio samples across multiple acoustic environments, which enhances model robustness. Because it is fully integrated into Edge Impulse Studio, it also speeds up the creation of a tailored audio dataset.
By leveraging this transformation block, embedded engineers and data scientists can expedite the development of high-performance audio models that remain reliable across diverse real-world scenarios.
Be sure to sign up for a free Enterprise Trial to test this and many other advanced features.