How to Generate Hours of Synthetic Data in Minutes with MATLAB

MATLAB is a well-known tool many of us encountered when studying engineering. It’s often used for tasks like data visualization and simulation. However, it’s also a powerful platform for synthetic data generation, especially for time-series and continuous motion scenarios.

In this post, I will explore how MATLAB’s comprehensive toolboxes and MATLAB Online environment can be leveraged to create realistic motion data, add noise, and prepare 8 hours of labeled data for Edge Impulse without installing MATLAB locally. Read our tutorial for the complete steps.

Figure: 8 hours of synthetic data

Why generate synthetic data?

Getting enough real data can be time-consuming when building edge AI applications, especially those involving motion (e.g., gestures, wave patterns, or machine movement).

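Synthetic data lets you generate labeled samples on demand instead. The example below goes from 8 minutes of manually collected data to 8 hours.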

Basic example: Continuous motion recognition

Consider one of our common projects, “continuous motion,” which uses several motion classes such as “idle,” “snake,” and “up-down.” To recreate it in MATLAB and go from 8 minutes of manually collected data to 8 hours, we start by:

  1. Defining the time series length and sampling rate.
  2. Combining waveforms (sine, square, sawtooth) to simulate different motions.
  3. Adding realistic noise using functions like randn to imitate sensor interference.

Figure: Continuous motion recreated in MATLAB
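
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the script from the tutorial: the sampling rate, sample length, frequencies, amplitudes, and noise levels are all assumed values, and the square and sawtooth waves are built from core functions so the sketch runs without extra toolboxes (the Signal Processing Toolbox functions `square` and `sawtooth` could be used instead).

```matlab
% Sketch only: sampling rate, duration, frequencies, amplitudes and noise
% levels are illustrative assumptions, not values from the tutorial script.
rng(42);                       % fixed seed so the noise is reproducible

% 1. Time series length and sampling rate
fs       = 100;                % sampling rate in Hz (assumed)
duration = 10;                 % seconds per sample (assumed)
n        = fs * duration;      % number of rows per sample
t        = (0:n-1)' / fs;      % column vector of timestamps in seconds

% 2. Base waveforms from core functions, each spanning -1 to 1
sineWave   = @(f) sin(2*pi*f*t);
squareWave = @(f) 2*(mod(f*t, 1) < 0.5) - 1;
sawWave    = @(f) 2*mod(f*t, 1) - 1;

% 3. Gaussian noise on all three axes to imitate sensor interference
noise = @(sigma) sigma * randn(n, 3);

g = 9.81;                      % gravity on the z axis, m/s^2

% Each class is an n-by-3 matrix of [x, y, z] acceleration.
idle   = [zeros(n,1), zeros(n,1), g*ones(n,1)] + noise(0.05);
snake  = [3*sineWave(0.5), 1.5*sawWave(0.25), g*ones(n,1)] + noise(0.2);
updown = [0.5*squareWave(1), zeros(n,1), g + 4*sineWave(1)] + noise(0.2);
```

Repeating this with slightly different frequencies, amplitudes, and noise levels per sample is one way to build up hours of varied data from a few lines of code.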

Once the data is exported and downloaded to your computer, we can move on to the next step.

Importing to your Edge Impulse project

Once you’ve generated your motion signals and added labels:

  1. Save data as CSV — where each class is separated or labeled appropriately.
  2. Go to Edge Impulse Studio > Data Acquisition > CSV Wizard.
  3. Upload the CSV files to create labeled time-series samples in Edge Impulse.

This workflow will easily get you started importing your MATLAB or other CSV-based datasets to Edge Impulse.
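
As a rough sketch of the CSV step, the snippet below writes one labeled sample to disk. The column names (`timestamp`, `accX`, `accY`, `accZ`), the millisecond timestamps, the file name pattern, and the placeholder signal are assumptions for illustration; the CSV Wizard is where you tell Edge Impulse how your columns should be interpreted.

```matlab
% Sketch only: column names, file naming and the signal are assumptions.
rng(0);
fs    = 100;                              % sampling rate in Hz (assumed)
n     = 200;                              % rows in this sample
label = 'idle';                           % class label for this sample

timestamp = (0:n-1)' * (1000 / fs);       % timestamps in milliseconds
acc       = 0.05 * randn(n, 3);           % placeholder 3-axis "idle" signal

% One row per reading: timestamp followed by the three axes.
T = array2table([timestamp, acc], ...
    'VariableNames', {'timestamp', 'accX', 'accY', 'accZ'});

% Put the label in the file name so each class stays clearly separated.
outFile = fullfile(tempdir, sprintf('%s.sample1.csv', label));
writetable(T, outFile);
```

Writing one file per sample, with the class in the file name, keeps the classes separated as described in step 1.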

Figure: Continuous motion with 8 hours of data

Read on in our tutorial for a deeper look at the MATLAB script.

Now let's think about scaling up to bigger datasets for more robust training, or applying custom DSP, another area where MATLAB excels. DSP (digital signal processing) is the analysis and transformation of sampled signals; here we use it to simulate real-world variations, so the synthetic sensor data mimics different operating conditions. With MATLAB you can loop through parameter variations such as rotational speed, fault severity, and sensor noise, creating a large, diverse dataset in a fraction of the time it would take to acquire real data.
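
A parameter sweep like this could look roughly like the sketch below. The signal model is a deliberately simple toy (a shaft sinusoid plus decaying impact bursts), and every value in it, including `faultOrder` and `resonanceHz`, is a hypothetical choice for illustration rather than something taken from the MATLAB guide.

```matlab
% Sketch only: toy vibration model with hypothetical parameter values.
rng(1);
fs = 10000;                      % sampling rate in Hz (assumed)
n  = fs * 1;                     % one second per case
t  = (0:n-1)' / fs;

speedsHz    = [10 25 50];        % shaft rotational speeds to sweep
severities  = [0 0.5 1.0];       % fault amplitude, 0 means healthy
noiseStd    = [0.05 0.2];        % sensor noise levels
faultOrder  = 3.5;               % impacts per shaft revolution (hypothetical)
resonanceHz = 2000;              % frequency excited by each impact (hypothetical)

nCases  = numel(speedsHz) * numel(severities) * numel(noiseStd);
signals = zeros(n, nCases);      % one column per generated signal
params  = zeros(nCases, 3);      % [speed, severity, noise] for each column

k = 0;
for s = speedsHz
    for a = severities
        for sigma = noiseStd
            k = k + 1;
            shaft   = sin(2*pi*s*t);                 % rotation component
            tau     = mod(t, 1 / (faultOrder * s));  % time since last impact
            impacts = a * exp(-800*tau) .* sin(2*pi*resonanceHz*tau);
            signals(:, k) = shaft + impacts + sigma * randn(n, 1);
            params(k, :)  = [s, a, sigma];
        end
    end
end
```

Keeping the parameters next to each signal makes it straightforward to derive labels later, for example treating a severity of 0 as "healthy" and anything above it as "faulty".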

Real-world example: Rolling element bearing fault diagnosis

Figure: Bearing wear fault detection dataset creation with MATLAB

If you are looking for more advanced DSP algorithms, see the MATLAB guide on “Rolling Element Bearing Fault Diagnosis.” It covers techniques for identifying faults in rolling element bearings from acceleration signals, even when strong masking signals from other machine components are present.

Many of the extracted features are already available via our Spectral features block. However, if you wish to use MATLAB's dedicated bearing wear analysis or other MATLAB-specific DSP, you can do so with our sample MATLAB custom processing block; follow the README in its dedicated Git repository.

For more information, see the full MATLAB article or review our public project on bearing wear analysis for a sample bearing wear dataset.

This dataset can be cloned or uploaded to Edge Impulse and used to train models for detecting wear and potential failures.

Summary

MATLAB is more than just a scientific computing tool; it’s a powerful platform for DSP and synthetic data generation, and a solid base for ML work. I encourage you to explore it as a data source for your Edge Impulse project.

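By using MATLAB, you can generate labeled time-series data from combined waveforms, add sensor-like noise, sweep parameters such as rotational speed and fault severity, and apply custom DSP before training in Edge Impulse.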

Ready to explore the full potential of MATLAB in AI? Check out our MATLAB tutorial and start building your own industrial edge AI application.

There have been many recent advancements, and I am just beginning to re-explore the MATLAB toolbox. If you have any interesting use cases or experience with MATLAB, please share in the comments below.
