We are excited to announce support for additional image dataset annotation formats in the Edge Impulse Studio Uploader and in the CLI Uploader.
Both the Studio Uploader and the CLI Uploader let you upload your existing data samples and datasets to Edge Impulse Studio and manage them there. With the addition of these new formats, you now have more flexibility when working with your image datasets.
Our ingestion service currently supports various file types, including .cbor, .json, .csv, .wav, .jpg, .png, .mp4, and .avi.
These file types cover a wide range of data formats, making it convenient to upload different types of data samples and datasets.
In addition to file types, the Uploader also supports a variety of dataset annotation formats. The new annotation formats provide enhanced capabilities for image labeling and object detection tasks.
Here are the new image dataset annotation formats available:
COCO JSON: The COCO JSON format is widely used for object detection tasks. It provides detailed information about labeled objects and their corresponding bounding boxes, allowing you to precisely define and annotate objects within your image datasets.
Pascal VOC XML: The Pascal VOC XML format is another popular format primarily used for object detection tasks. It follows an XML-based structure and provides labeling information for objects in images.
YOLO TXT: The YOLO TXT format is compatible with most YOLO object detection frameworks. It offers a text file structure that defines bounding boxes and class labels for objects in images.
Open Images CSV: The Open Images CSV format is specific to the Open Images dataset, which is a large collection of annotated images. It follows a CSV structure and provides labeling information for objects in images, including class labels, bounding box coordinates, and additional attributes.
Plain CSV: The Plain CSV format allows you to annotate your image datasets using a simple CSV file. Each row in the CSV file represents an image, and the columns contain information such as the image file path, label, and bounding box coordinates if applicable.
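As a rough illustration of how these formats differ, the sketch below normalizes bounding boxes from three of them into one pixel-based representation. It relies on the publicly documented conventions of each format: YOLO TXT uses normalized center coordinates, COCO JSON uses a pixel [x, y, width, height] list, and Pascal VOC XML uses pixel corner coordinates. The PixelBox type and the function names are hypothetical helpers for this example only; they are not part of the Uploader, which handles this conversion for you.

```python
from typing import NamedTuple, Sequence, Tuple


class PixelBox(NamedTuple):
    """Axis-aligned box in pixels: top-left corner plus width and height.

    Hypothetical helper type for this example, not an Edge Impulse structure.
    """
    x: float
    y: float
    width: float
    height: float


def from_yolo(line: str, image_width: int, image_height: int) -> Tuple[int, PixelBox]:
    """Parse one YOLO TXT line: '<class_id> <x_center> <y_center> <width> <height>'.

    All four coordinates are normalized to 0..1 relative to the image size.
    """
    class_id, xc, yc, w, h = line.split()
    box_w = float(w) * image_width
    box_h = float(h) * image_height
    # Convert the box center to the top-left corner.
    x = float(xc) * image_width - box_w / 2
    y = float(yc) * image_height - box_h / 2
    return int(class_id), PixelBox(x, y, box_w, box_h)


def from_coco(bbox: Sequence[float]) -> PixelBox:
    """COCO JSON stores 'bbox' as [x, y, width, height] in pixels (top-left corner)."""
    x, y, w, h = bbox
    return PixelBox(x, y, w, h)


def from_pascal_voc(xmin: float, ymin: float, xmax: float, ymax: float) -> PixelBox:
    """Pascal VOC XML stores the two corners: xmin, ymin, xmax, ymax in pixels."""
    return PixelBox(xmin, ymin, xmax - xmin, ymax - ymin)
```

For example, from_yolo("0 0.5 0.5 0.25 0.5", 640, 480) describes a box centered in a 640x480 image that is a quarter of the image wide and half as tall.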
These new dataset annotation formats expand the capabilities of the Uploader, enabling you to work with diverse labeling and object detection requirements. Whether you are training models for image classification, object detection, or visual relationship detection, these formats provide the necessary structure and labeling information for your datasets.
Note that as of today, Edge Impulse only supports bounding boxes for object labels. Segmentation masks or multi-point polygons are not supported.
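If your existing annotations are polygons, one possible workaround is to reduce each polygon to its enclosing bounding box before uploading. The sketch below shows that reduction; the function name and the returned dictionary keys are assumptions made for this example and do not reflect a specific Edge Impulse schema.

```python
from typing import Dict, Iterable, Tuple


def polygon_to_bbox(points: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Return the smallest axis-aligned box that contains every polygon vertex.

    `points` is a sequence of (x, y) pixel coordinates. The output keys
    (x, y, width, height) are illustrative, not an official format.
    """
    pts = list(points)
    if not pts:
        raise ValueError("polygon has no points")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x_min, y_min = min(xs), min(ys)
    return {
        "x": x_min,
        "y": y_min,
        "width": max(xs) - x_min,
        "height": max(ys) - y_min,
    }
```

Note that this discards the shape information: the resulting box can cover background pixels that the original polygon excluded.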
To take advantage of these new annotation formats, simply navigate to the Data acquisition page in Edge Impulse Studio and access the Uploader. From there, you can select the desired format when uploading your image datasets. The Uploader will automatically detect the format whenever possible, ensuring a smooth and efficient data ingestion process.
You can also use these formats with our CLI Uploader. It will try to detect the dataset annotation format automatically; if it cannot, the uploader will output the list of formats.
edge-impulse-uploader --directory {path} --dataset-format {name}
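If you prefer to script the upload, the command above can be wrapped in a few lines of Python. This is only a sketch: it uses just the two flags shown above, the helper names are hypothetical, and the format name must be one of the values the uploader itself lists, since they are not spelled out here.

```python
import shutil
import subprocess
from typing import List, Optional


def build_upload_command(directory: str, dataset_format: Optional[str] = None) -> List[str]:
    """Build the argument list for edge-impulse-uploader.

    When dataset_format is None, the flag is omitted so the uploader can
    attempt automatic detection.
    """
    cmd = ["edge-impulse-uploader", "--directory", directory]
    if dataset_format is not None:
        cmd += ["--dataset-format", dataset_format]
    return cmd


def upload(directory: str, dataset_format: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the uploader, failing early if the CLI is not installed."""
    if shutil.which("edge-impulse-uploader") is None:
        raise FileNotFoundError("edge-impulse-uploader was not found on PATH")
    # check=True raises CalledProcessError if the uploader exits with an error.
    return subprocess.run(build_upload_command(directory, dataset_format), check=True)
```

Keeping the command construction separate from the call makes it easy to log or inspect the exact arguments before anything is uploaded.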
Want to try it yourself? We have put together “Cubes on a conveyor belt” datasets, available in the Edge Impulse Object Detection, COCO JSON, Open Images CSV, Pascal VOC, Plain CSV, and YOLOv5 TXT formats.
We understand that every project is unique, and sometimes the available annotation formats may not fully meet your requirements. In such cases, you can explore the Transformation blocks feature in Edge Impulse Studio. Transformation blocks allow you to parse your data samples and create custom datasets that are fully compatible with Edge Impulse. This offers a high degree of flexibility when working with unique data formats and structures.
With the release of these new dataset annotation formats, we aim to empower you with more options and possibilities when working with your image datasets in Edge Impulse Studio. We are committed to continuously enhancing our platform and providing you with the tools you need to succeed in your machine-learning projects.
If you have any questions or need assistance, our documentation provides detailed information on using the Uploader and the dataset annotation formats. Additionally, our developer relations team and our community are always ready to support and collaborate with you on your machine-learning journey through our forum.