Did you know? You can use the power of the crowd to collect a robust keyword spotting dataset with Edge Impulse! This is especially useful in a STEM classroom or at a technical conference, and it can help reduce the bias your keyword spotting model may otherwise show toward differences in its users’ voices, such as tone, volume, pitch, and accent.
Edge Impulse’s Keyword Collector web application records your audience’s keyword samples and automatically splits the full recording into one-second sub-samples after identifying when each keyword starts and ends via audio signal processing techniques.
Check out our CTO Jan Jongboom and Founding tinyML Engineer Dan Situnayake collecting their workshop participants’ keywords live during the 2021 tinyML Summit (starting at the one-hour mark)!
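The exact signal processing used by the Keyword Collector is not described here, so the following is only a rough sketch of one common approach (a simple energy threshold), not Edge Impulse’s actual implementation. The function name, frame size, threshold, and the choice to center each window on the detected keyword are all assumptions made for illustration.

```python
def find_keyword_windows(samples, sample_rate, frame_ms=20, threshold=0.1, window_s=1.0):
    """Return (start, end) sample indices of fixed-length windows centered on loud regions.

    `samples` is a sequence of floats in the range -1.0 to 1.0.
    All parameter defaults are illustrative assumptions.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    window_len = int(sample_rate * window_s)
    if len(samples) < window_len:
        return []  # recording is too short to hold even one window

    # Mark each short frame as active when its mean absolute amplitude exceeds the threshold.
    active = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        active.append(sum(abs(x) for x in frame) / len(frame) > threshold)

    # Merge consecutive active frames into (start, end) regions, measured in samples.
    regions = []
    region_start = None
    for i, is_active in enumerate(active + [False]):  # trailing False closes an open region
        if is_active and region_start is None:
            region_start = i * frame_len
        elif not is_active and region_start is not None:
            regions.append((region_start, min(i * frame_len, len(samples))))
            region_start = None

    # Center a fixed-length window on each region, clamped to the recording bounds.
    windows = []
    for start, end in regions:
        center = (start + end) // 2
        win_start = max(0, min(center - window_len // 2, len(samples) - window_len))
        windows.append((win_start, win_start + window_len))
    return windows
```

A plain threshold like this can split a single keyword in two if the speaker pauses mid-word, which is one reason production tools typically use more careful processing.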
How do I get started?
From the Edge Impulse Studio, create a new project. Add some noise/unknown audio samples and a few samples of the keyword you would like to spot (for example "Hello World" or "Edge Impulse"), each one second long and all recorded at the same sampling frequency.
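If your starting samples are WAV files on your computer, a quick check before uploading can catch files with the wrong length or sampling frequency. This is a minimal sketch using Python’s standard `wave` module; the helper names and the 0.05 second tolerance are assumptions, not part of Edge Impulse’s tooling.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / rate


def check_samples(paths, expected_rate_hz, expected_seconds=1.0, tolerance=0.05):
    """List (path, reason) for files whose sample rate or duration is not as expected."""
    problems = []
    for path in paths:
        rate, seconds = wav_info(path)
        if rate != expected_rate_hz:
            problems.append((path, f"sample rate is {rate} Hz"))
        elif abs(seconds - expected_seconds) > tolerance:
            problems.append((path, f"duration is {seconds:.2f} s"))
    return problems
```

For example, `check_samples(["hello_01.wav", "noise_01.wav"], expected_rate_hz=16000)` returns an empty list when both (hypothetical) files are one second long at 16 kHz.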
Then, create and copy a new Edge Impulse “Ingestion” API key for your project, which you’ll find under Dashboard > Keys > Add new API key > Role: Ingestion.
Save this new Ingestion API key somewhere on your computer, then construct the following URL:
https://smartphone.edgeimpulse.com/keyword.html?apiKey=ei_XXX&sampleLength=30000&keyword=helloworld&frequency=11000
Where you replace:
- ei_XXX with your Edge Impulse project’s Ingestion API key.
- 30000 with the total length of audio each person should record, in milliseconds (here 30000 = 30 seconds).
- helloworld with the keyword people should say.
- 11000 with your desired audio sampling frequency in Hz (here 11000 = 11 kHz).
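If you prefer to build the link programmatically, the substitution above can be sketched in a few lines of Python. The function name `build_collector_url` is hypothetical; the base URL and parameter names are taken from the example URL, and `ei_XXX` remains a placeholder for your own key.

```python
from urllib.parse import urlencode

# Base URL of the keyword collector, as given in the example URL.
BASE_URL = "https://smartphone.edgeimpulse.com/keyword.html"


def build_collector_url(api_key, keyword, sample_length_ms=30000, frequency_hz=11000):
    """Return a keyword collector link with URL-encoded query parameters."""
    params = {
        "apiKey": api_key,                 # your project's Ingestion API key
        "sampleLength": sample_length_ms,  # total recording length in milliseconds
        "keyword": keyword,                # the keyword people should say
        "frequency": frequency_hz,         # audio sampling frequency in Hz
    }
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_collector_url("ei_XXX", "helloworld"))
```

Using `urlencode` rather than plain string concatenation keeps the link valid if the key or keyword contains characters that need escaping.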
You can then share this link with your colleagues or audience as a QR code (which you can generate here), making it easy to include in a presentation, for example.
When you are finished with your crowdsourced data collection, you can cut off access to your keyword collector link and QR code by revoking the Ingestion API key under your project’s Dashboard > Keys > ⋮ > Revoke.
Start collecting your crowd’s keyword samples
Now, send the QR code you generated above and the instructions below to your audience or colleagues:
1. Grab your phone, open up the Camera app and scan the QR code by pointing your camera at your computer screen, then click on the link that appears on your phone.
2. Your phone’s web browser should open and connect successfully to the keyword collector application:
3. Click Get started! and give your phone’s web browser access to your microphone by clicking Allow:
4. Wait a couple of seconds for audio recording to begin.
5. Once audio recording begins, repeat the designated keyword (e.g. "Hello World") over and over into your phone’s microphone, leaving a short pause between each repetition:
6. Your keyword samples should now be successfully uploaded to the crowdsourced Edge Impulse Keyword Spotting project!
Deploying the crowdsourced keyword spotting model to the edge
Now, follow our Responding to your voice tutorial to build and train your new crowdsourced keyword spotting model (and optionally deploy it to an embedded device of your choice).
Once you have trained your model, you can also follow the instructions in the Continuous Audio Classification blog post to perform keyword spotting on the edge directly through the web browser on your phone, without writing any code!