Webinar video: TinyML for industrial use cases with STMicroelectronics and Avnet
Together with Avnet and STMicroelectronics we hosted a webinar on TinyML for industrial use cases! If you missed it, you can watch the video above. It consists of a half-hour introduction to TinyML, STM32Cube.AI and industrial use cases, followed by a 20-minute demo where we explore two ML models, and 10 minutes of Q&A.
We've also taken the liberty of writing out the most frequently asked questions in this blog post.
Are there tutorials available for the things you showed during the webinar?
Yes! Here they are:
- Continuous gestures: https://docs.edgeimpulse.com/docs/continuous-motion-recognition
- Recognize sounds from audio: https://docs.edgeimpulse.com/docs/audio-classification
- Adding sight to your sensors: https://docs.edgeimpulse.com/docs/image-classification
Can multiple sensor streams go into a real-time classifier?
Yes, this is called sensor fusion, and you can include data from multiple sensors in the same neural network. You do this by combining data from multiple sensors in one labeled data sample. Then you can map the raw axes to different DSP blocks, and finally combine them again in a single neural network. This is very straightforward in Edge Impulse as long as you make sure the data is combined.
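As a rough illustration of what "combining data from multiple sensors in one sample" can look like, here is a minimal Python sketch. It assumes two streams sampled at the same rate; the function name `fuse_samples`, the microphone-level stream and the example values are hypothetical and not part of any Edge Impulse API.

```python
from typing import List, Sequence


def fuse_samples(accel: Sequence[Sequence[float]],
                 mic_level: Sequence[float]) -> List[List[float]]:
    """Merge two equally sampled streams into rows of [accX, accY, accZ, mic]."""
    if len(accel) != len(mic_level):
        raise ValueError("streams must contain the same number of readings")
    # Each output row is one time step with all axes side by side.
    return [list(a) + [m] for a, m in zip(accel, mic_level)]


# Hypothetical readings: two time steps from an accelerometer and a microphone.
accel = [(0.1, 0.2, 9.8), (0.0, 0.3, 9.7)]
mic = [0.52, 0.48]

fused = fuse_samples(accel, mic)
for row in fused:
    # One comma-separated line per time step.
    print(",".join(f"{value:.2f}" for value in row))
```

Each row then carries every axis for a single time step, so the axes can be routed to different DSP blocks later on.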
To capture data from other sensors you can either modify the firmware for the ST IoT Discovery Kit (https://github.com/edgeimpulse/firmware-st-b-l475e-iot01a) or use the Data forwarder (https://docs.edgeimpulse.com/docs/cli-data-forwarder).
Do I need a USB cable, or can you also capture data from the BLE and WiFi modules on the ST IoT Discovery Kit?
The ST IoT Discovery board does support WiFi and the Edge Impulse client can optionally make use of it. When you connect the board to your computer for the first time it'll ask you if you want to set up a WiFi connection, and after that you can capture data wirelessly. To use BLE we'd need a mobile app as well, which we don't have today.
Can complicated sensor fusion, e.g. HRV and EDA, be done on a single edge processor?
Yes, no problem at all. We have experience with complex HRV derived algorithms, and they can be combined with EDA in the same model if that makes sense for the application.
To run TinyML on edge devices, the device has to monitor inputs and run the algorithm continuously. This requires more resources, so devices cannot be put in low power modes. How does this affect the applicability of running TinyML models on edge devices?
Indeed, inference takes energy while running, but we very rarely run it continuously. Inference on modern MCUs can be extremely fast. Our typical motion inference takes less than 10 ms for 1 s of sensor data, so you can collect data in a low power mode, then switch to a high power mode for a very brief time. Most applications also use triggering to turn on sensors to save power.
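To make the duty-cycle argument concrete, here is a small back-of-the-envelope calculation in Python. The 10 ms per 1 s figure comes from the answer above; the current values (20 mA active, 0.5 mA in low power mode) are made-up placeholders, so substitute the numbers from your own MCU's datasheet.

```python
def average_current_ma(active_ms: float, period_ms: float,
                       active_ma: float, sleep_ma: float) -> float:
    """Time-weighted average current for a device that wakes up once per period."""
    duty = active_ms / period_ms
    return duty * active_ma + (1.0 - duty) * sleep_ma


# 10 ms of inference for every 1000 ms of sensor data is a 1% duty cycle.
duty_cycle = 10 / 1000

# Hypothetical currents: 20 mA while running inference, 0.5 mA while sampling.
avg = average_current_ma(active_ms=10, period_ms=1000,
                         active_ma=20.0, sleep_ma=0.5)

print(f"duty cycle: {duty_cycle:.0%}")
print(f"average current: {avg:.3f} mA")
```

With these assumed numbers the average stays close to the low power current, because the MCU spends 99% of its time there.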
What is the key difference between FPGAs and microcontrollers for AI?
MCUs are general purpose processors, which execute software instructions. ML targeting an MCU is a software implementation that executes the ops. FPGAs can be used for just about anything, for example ML acceleration. Usually this still requires an MCU or CPU to control the overall ML algorithm, using the FPGA for math. FPGAs are more expensive, use considerably more power and are more powerful, so the two don’t usually compete.
Can you run algorithms in realtime on board X? What are the minimum requirements for these models?
Yes, Edge Impulse guides you to create algorithms that run in realtime on the board that we showed today. Typically you can run any vibration model on a Cortex-M0+, audio models on a Cortex-M4F, and vision models on a Cortex-M7. STM32Cube.AI and the ST FP-AI-VISION1 function pack can also predict latency and memory requirements per model.
I have short data samples that are hard to capture (like 'washing machine finished'). How could I sample this data?
Labeling shorter events like this is difficult. Our strategy is to be ready to capture a larger audio sample and edit it down. We are working on some new features to help automate the process of finding the meaningful events from sensor or audio training data to make this easier.
How does STM32Cube.AI compare to TensorFlow Lite for microcontrollers? Are you using CMSIS-NN?
TensorFlow Lite ships with an interpreter, so it has some runtime overhead; STM32Cube.AI applies optimization techniques to minimize the required RAM and ROM. STM32Cube.AI embeds CMSIS-NN and additional optimized kernels to ensure all silicon acceleration is used in an optimal manner.
For Edge Impulse we are agnostic to the actual neural network inferencing library. For STM32 we'll use STM32Cube.AI, for other Arm targets we'll use TFLite + CMSIS-NN, and for other targets we'll take what's fastest for those platforms. Additionally we use either CMSIS-DSP (if on an Arm core) or external DSPs to speed up signal processing and classical ML algorithms. They are just as important as the neural network!
What's Edge Impulse's pricing model? Is it free to use, or only coupled with certain hardware targets?
Edge Impulse is free to use on any hardware, and compatible with commercial use (no royalties and all exported code is open source under the Apache 2.0 license). There is an enterprise subscription available for customers going into production, with multiple engineers or large datasets.
Can I use spatial filtering on my images?
You can plug in custom processing blocks (https://docs.edgeimpulse.com/docs/custom-blocks) that do this. We currently use these blocks to reduce colour depth, but it's easy to plug in blocks that do more complex manipulations on image data.
Is fixed point inference (int8) the way forward for TinyML, or will there be coexistence with floating point (float32)?
Int8 is probably the future on MCUs. There's still a large drop in accuracy on complex models (e.g. vision) but quantization-aware training helps a lot. For now you can always select float32 in both Edge Impulse and STM32Cube.AI if you have hardware that's powerful enough, and the deployment tab in Edge Impulse shows guidance on latency and memory vs. accuracy on both models.
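For readers new to int8 inference, the sketch below shows a generic affine quantization scheme in Python, where a real value is approximated as `scale * (q - zero_point)`. This is an illustration of the general idea only, not the exact procedure used by Edge Impulse or STM32Cube.AI, and the value range of 0.0 to 2.55 is a hypothetical example.

```python
def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a float to int8, clamping values that fall outside the range."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate float value from an int8 code."""
    return (q - zero_point) * scale


# Hypothetical range of values observed for one tensor.
lo, hi = 0.0, 2.55
scale = (hi - lo) / 255              # size of one int8 step in float units
zero_point = round(-128 - lo / scale)  # int8 code that represents `lo`

for x in (0.0, 0.123, 1.0, 2.55):
    q = quantize(x, scale, zero_point)
    restored = dequantize(q, scale, zero_point)
    print(f"{x:.3f} -> {q:4d} -> {restored:.3f}")
```

The rounding error per value is at most half a step (`scale / 2`), which is why accuracy can drop on models that are sensitive to small differences and why quantization-aware training helps.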
Where can we learn more about TinyML?
The tutorials at the top of the screen are a great start, but you can also get the TinyML book by Dan Situnayake and Pete Warden here: https://tinymlbook.com.
For complicated classifications (especially images), the NN is getting really big. What’s the memory footprint necessary?
For image data we try to guide you to first select the model size. We've trained a number of base models with different size requirements, and then can use transfer learning to retrain part of the network with your data. This works very well even on small datasets because the network has learned to generalize well already during the initial training. We can do realtime vision on color images in <512K of RAM, and can go lower when reducing color depth to grayscale only or to more optimized models.
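To see why reducing color depth lowers memory use, here is a small Python calculation of the input buffer size alone. The 96x96 resolution and one byte per value are assumptions for illustration; the total footprint also includes weights and intermediate activations, which this sketch does not model.

```python
def input_buffer_bytes(width: int, height: int, channels: int,
                       bytes_per_value: int = 1) -> int:
    """Size of a raw image input buffer in bytes."""
    return width * height * channels * bytes_per_value


# Hypothetical 96x96 input with one byte per channel value.
rgb_bytes = input_buffer_bytes(96, 96, channels=3)
gray_bytes = input_buffer_bytes(96, 96, channels=1)

print(f"RGB input:       {rgb_bytes} bytes")   # 27648
print(f"grayscale input: {gray_bytes} bytes")  # 9216
```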
Will you be supporting the upcoming Helium architecture (Arm Cortex-M55)? Are you excited about this (latency and energy consumption wise)?
Definitely. ST is working closely with Arm on these new architectures and is currently extending its MCU family with acceleration for AI. You can get in touch with Raphael at ST if you want to know more.
And from Edge Impulse's side, absolutely! I think having coprocessors in silicon will help tremendously in lowering latency and power consumption, and thus we can do much more on device. We see this already with other silicon vendors that ship specialized neural network accelerators, so having this more widely available in an Arm core would be great.
Can you train models on this board?
No, you can only run inferencing. I think this will stay like this for a long time.
What kind of computer vision applications can be built?
The sky is sort of the limit. Edge Impulse does mostly image classification at this point, but ST has also done object detection on MCU. OpenMV even has a toolkit (based on ST silicon) that brings the power of full OpenCV to your MCU, so there's a lot to be found here!
Can you run TinyML models on a Raspberry Pi?
Yes, absolutely. The Raspberry Pi is super powerful, so you can run these models there. In Edge Impulse just head to Deployment, and export as a C++ library or WebAssembly package. These can be compiled (C++) on the Pi, or just invoked from Node.js (WebAssembly).
How can you protect against incorrect classifications?
My suggestion would be to always use a 'sanity checking' step next to your ML model. That could, for example, be an anomaly detection block as we added in the webinar. ML models are not very resilient against adversarial inputs (whether they occur on purpose or by accident), so you need to double-check what you're seeing.
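One way to picture such a sanity check is a small gate that only accepts a classification when the anomaly score is low and the classifier is confident. The Python sketch below is an assumption-laden illustration: the function name `accept_prediction`, the thresholds and the example scores are all hypothetical and need tuning for a real application.

```python
from typing import Dict


def accept_prediction(scores: Dict[str, float], anomaly_score: float,
                      min_confidence: float = 0.8,
                      max_anomaly: float = 0.3) -> str:
    """Return the predicted label, or 'anomaly' / 'uncertain' if a check fails."""
    # Reject data that looks unlike anything seen during training.
    if anomaly_score > max_anomaly:
        return "anomaly"
    label = max(scores, key=scores.get)
    # Reject predictions where no class clearly wins.
    if scores[label] < min_confidence:
        return "uncertain"
    return label


# Hypothetical classifier outputs and anomaly scores.
print(accept_prediction({"idle": 0.05, "wave": 0.95}, anomaly_score=0.1))
print(accept_prediction({"idle": 0.05, "wave": 0.95}, anomaly_score=2.4))
print(accept_prediction({"idle": 0.45, "wave": 0.55}, anomaly_score=0.1))
```

Checking the anomaly score first means that a confident but out-of-distribution prediction is still rejected.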
What will you suggest to improve the accuracy of speech recognition on microcontrollers?
- Add a lot of examples of the words that you're trying to recognize.
- Add more data (especially speech that does not contain the word you're listening for).
- Add a lot of noise.
Then during classification never trust a single window, but slide over the data and classify multiple times. That makes the result much more robust. We're adding some things (like artificial noise, changing pitch and shift of samples, and this smoothing of windows) into Edge Impulse in the coming weeks.
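The advice to never trust a single window can be sketched as a simple moving average over the most recent classifier outputs. This Python example is a minimal illustration under stated assumptions, not the smoothing that Edge Impulse ships: the class name `WindowSmoother`, the history size, the threshold and the score stream are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Optional


class WindowSmoother:
    """Average scores over the last `size` windows before reporting a label."""

    def __init__(self, size: int = 3, threshold: float = 0.6) -> None:
        self.history: Deque[Dict[str, float]] = deque(maxlen=size)
        self.threshold = threshold

    def update(self, scores: Dict[str, float]) -> Optional[str]:
        self.history.append(scores)
        # Stay silent until enough windows have been collected.
        if len(self.history) < self.history.maxlen:
            return None
        count = len(self.history)
        averaged = {label: sum(s[label] for s in self.history) / count
                    for label in scores}
        best = max(averaged, key=averaged.get)
        return best if averaged[best] >= self.threshold else None


# Hypothetical stream with one spurious "yes" in the middle of background noise.
stream = [
    {"yes": 0.10, "noise": 0.90},
    {"yes": 0.10, "noise": 0.90},
    {"yes": 0.95, "noise": 0.05},
    {"yes": 0.10, "noise": 0.90},
    {"yes": 0.10, "noise": 0.90},
]

smoother = WindowSmoother(size=3, threshold=0.6)
results = [smoother.update(window) for window in stream]
print(results)
```

In this stream the single spurious "yes" window never becomes the reported label, because its two neighbours outvote it in the average.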
Are we limited to just CNNs or can you also run other neural network architectures?
No, you're free to choose the architecture. We mostly use fully-connected networks and CNNs at this point, but you can click the three dots on the neural network page, select Switch to expert mode and you'll have the full Keras API available.
Is Edge Impulse dependent on ST microcontrollers, or can it run on other Cortex-M processors?
Edge Impulse runs on anything with a modern C++ compiler, but we can load additional optimizations when targeting STM32.
Can we use more than one model on the same MCU?
Yes, nothing is stopping you from doing that (except compute power). You just call a single function to invoke each ML algorithm, like you'd call any other function.
Any other questions? Ask them on the forums.