Imagine if you had to go through the day relying on only one of your senses. What might it be like if you had to drive across town using only your sense of vision? You would be able to see where you were going, but you would not be able to hear the sirens of emergency vehicles or the horns of other cars on the road, not to mention that you would not be able to feel the steering wheel in your hands or the accelerator under your foot.
While someone might learn to get by like this with practice, it is far from an ideal situation. Yet this is essentially what unimodal machine learning models have to do. They must learn about the world from a single source of data, whether it be text, images, audio, or something else. To get around this handicap and build more robust and accurate models, developers are increasingly working to create multimodal models that base their predictions on multiple sources of data.
As you might expect, these models tend to be larger and require more resources, so they typically run on powerful computing hardware in the cloud. Running algorithms such as these on discrete embedded hardware platforms is considered by many to be out of reach. But is that necessarily the case? Machine learning enthusiast Solomon Githu says that thanks to recent advances in machine learning optimization techniques and processing hardware, it is not.
Githu does not just blindly assert that this is the case. Rather, he has put his Arduino where his mouth is and proved it. With the help of Edge Impulse’s tools and an Arduino Nano 33 BLE Sense development board, Githu has developed an accurate fire detection system that leverages both visual and temperature data to make a prediction. While Githu hopes that this device will be an important contribution to fire detection, his primary goal for this project was to demonstrate how sensor fusion is possible in edge machine learning applications.
The device itself is very simple — it consists of an Arduino Nano 33 BLE Sense board with an nRF52840 microcontroller and 256 KB of RAM. He paired this with the image sensor found in an Arduino Tiny Machine Learning Kit, and the temperature sensor that is already onboard the Nano 33 BLE Sense. After assembling the system, Githu used it to collect data to train a machine learning algorithm. Playing with fire is a good way to get burned, so for demonstration purposes, Githu decided to use a candle to get images of flames and a nearby oven to generate heat.
With the data collection squared away, Githu got busy building and training a model. When it comes to sensor fusion applications, this is tricky business, but the project write-up walks us through all of the necessary steps. First, he created a notebook in Google Colab, defining a custom, multi-input convolutional neural network architecture. The architecture utilizes a technique called tensor slicing, which makes it possible to work with subsections of tensors. That enabled Githu to feed both image and temperature data into the model through a single input. The data then flows through separate branches of the network, and the model ultimately makes a prediction based upon both types of sensor data.
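The write-up has the full details, but a minimal sketch of this tensor-slicing approach in TensorFlow/Keras might look like the following. Note that the image resolution, layer widths, and output size here are assumptions for illustration, not the exact values from Githu's notebook:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed dimensions: a 32x32 grayscale image flattened to 1,024 values,
# followed by a single temperature reading.
IMG_SIDE = 32
IMG_LEN = IMG_SIDE * IMG_SIDE  # 1,024 image values
TEMP_LEN = 1                   # 1 temperature value

# One flat input tensor carries both modalities end to end.
inputs = layers.Input(shape=(IMG_LEN + TEMP_LEN,), name="fused_input")

# Tensor slicing: split the flat tensor back into its two modalities.
image = layers.Reshape((IMG_SIDE, IMG_SIDE, 1))(inputs[:, :IMG_LEN])
temp = inputs[:, IMG_LEN:]

# Image branch: a small convolutional stack.
x = layers.Conv2D(8, 3, activation="relu")(image)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)

# Temperature branch: a tiny dense layer.
t = layers.Dense(4, activation="relu")(temp)

# Fuse the branches and classify: fire vs. no fire.
merged = layers.Concatenate()([x, t])
outputs = layers.Dense(2, activation="softmax")(merged)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The appeal of the single flat input is that the pipeline stays compatible with tooling that expects one input tensor, while slicing inside the graph restores the two modalities so each can flow through a branch suited to its structure.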
Githu connected the notebook to an Edge Impulse project via an API key. This interface allows users to train, test, and profile models with the Edge Impulse Python SDK, and it made it easy for Githu to iterate on different model architectures, input data preprocessing options, and hyperparameter values to eke out as much performance as possible.
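Getting connected takes only a couple of lines with the edgeimpulse Python package. The key below is a placeholder, and the device-listing call is included to help find the identifier for your target hardware:

```python
import edgeimpulse as ei

# Placeholder key; the real one lives in the Edge Impulse project's
# dashboard under the Keys tab.
ei.API_KEY = "ei_0123456789abcdef"

# List the hardware targets the SDK can estimate performance for.
print(ei.model.list_profile_devices())
```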
Once the final design was decided on, Githu kicked off the training process. When it finished, 100 percent of the samples in the validation dataset were classified correctly. This is an excellent result, but Githu suggests that the small dataset of only 120 samples used for this proof of concept was not very diverse, so the features may have been quite easy for the algorithm to identify. A much larger and more diverse dataset would certainly be needed before deploying an application in the real world, but the remainder of the development process would be virtually identical.
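Training itself is a standard Keras call. With a sketch like the one above, it might look like this, where the fused arrays are hypothetical stand-ins for the real dataset (each row is a flattened image with its temperature reading appended, paired with a one-hot label):

```python
# X_train / X_val: float arrays of shape (num_samples, 1025), each row a
# flattened 32x32 image plus one temperature value. y_*: one-hot labels.
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=30,      # assumed; tune alongside the architecture
    batch_size=8,
)
```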
Next, the Edge Impulse Python SDK was leveraged again to profile the classification pipeline. This makes it simple to see the estimated RAM and ROM consumption, as well as the inference times associated with executing the model, on a wide variety of hardware platforms. Since this project uses an Arduino Nano 33 BLE Sense, Githu checked the resource utilization levels for that board and found that they fit well within the constraints of the device, and that inference times were acceptable for the application.
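Profiling is a single call. The "cortex-m4f-80mhz" target below is an assumption, a generic profile close to the Nano 33 BLE Sense's 64 MHz Cortex-M4F; the device-listing call shown earlier reveals the exact identifiers available:

```python
# Estimate RAM, flash, and inference time for the target MCU.
try:
    profile = ei.model.profile(model=model, device="cortex-m4f-80mhz")
    print(profile.summary())
except Exception as e:
    print(f"Could not profile model: {e}")
```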
At this stage, the model could also be saved to Edge Impulse via the Bring Your Own Model feature, which in turn makes deployment to the physical hardware a snap. Using the Deployment tab in Edge Impulse, the complete classification pipeline was packaged into a single Arduino library that can be opened in the Arduino IDE and flashed to the device. This option also makes it possible to customize the code, which would be useful if you wanted to, for example, trigger some type of alarm when a fire is detected.
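The same packaging can also be driven from the notebook through the Python SDK. The following sketch writes the Arduino library to a zip file (the output filename is arbitrary):

```python
# Package the trained model as an Arduino library via Bring Your Own Model.
try:
    deploy_bytes = ei.model.deploy(
        model=model,
        model_output_type=ei.model.output_type.Classification(),
        deploy_target="arduino",
    )
    with open("fire_detection_arduino.zip", "wb") as f:
        f.write(deploy_bytes.getvalue())
except Exception as e:
    print(f"Could not deploy model: {e}")
```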
During deployment, there is also an option to utilize the EON Compiler. This produces more efficient, hardware-optimized C++ source code, which Githu noted reduced both memory utilization and inference times.
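When deploying through the Python SDK, the inference engine can be selected with a keyword argument. Assuming the SDK's "tflite-eon" engine identifier, the deploy call above needs only one extra parameter:

```python
# Same deploy call, but requesting EON Compiler output instead of the
# stock TensorFlow Lite for Microcontrollers interpreter.
deploy_bytes = ei.model.deploy(
    model=model,
    model_output_type=ei.model.output_type.Classification(),
    deploy_target="arduino",
    engine="tflite-eon",
)
```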
Diving into sensor fusion on a tiny hardware platform can be a scary prospect, but if you follow the steps laid out in Githu’s project there is nothing to fear. Fire detection or otherwise, applying the same basic principles will get you over the finish line in no time at all.