AutoTrain documentation

Object Detection

You are viewing the main version, which requires installation from source. If you'd like a regular pip install, check out the latest stable version (v0.8.24).

Object detection is a form of supervised learning in which a model is trained to locate and classify objects within images. AutoTrain simplifies the process: you can train a state-of-the-art object detection model by uploading labeled example images.

Preparing your data

To ensure your object detection model trains effectively, follow these guidelines for preparing your data:

Organizing Images

Prepare a zip file containing your images and metadata.jsonl.

Archive.zip
├── 0001.png
├── 0002.png
├── 0003.png
├── .
├── .
├── .
└── metadata.jsonl

Example for metadata.jsonl:

{"file_name": "0001.png", "objects": {"bbox": [[302.0, 109.0, 73.0, 52.0]], "category": [0]}}
{"file_name": "0002.png", "objects": {"bbox": [[810.0, 100.0, 57.0, 28.0]], "category": [1]}}
{"file_name": "0003.png", "objects": {"bbox": [[160.0, 31.0, 248.0, 616.0], [741.0, 68.0, 202.0, 401.0]], "category": [2, 2]}}

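Bounding boxes must be in COCO format: [x, y, width, height], where x and y are the pixel coordinates of the box's top-left corner. Each entry in category corresponds to the box at the same position in bbox.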
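Annotation tools often export boxes as corner coordinates rather than COCO format. The following is a minimal sketch of converting such boxes and writing metadata.jsonl. It assumes the input boxes are [x_min, y_min, x_max, y_max]; the helper names xyxy_to_coco, make_record, and write_metadata are hypothetical and not part of AutoTrain.

```python
import json
from pathlib import Path


def xyxy_to_coco(box):
    """Convert [x_min, y_min, x_max, y_max] to COCO [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    if x_max <= x_min or y_max <= y_min:
        raise ValueError(f"degenerate box: {box}")
    return [float(x_min), float(y_min), float(x_max - x_min), float(y_max - y_min)]


def make_record(file_name, corner_boxes, categories):
    """Build one metadata.jsonl record; one category id per bounding box."""
    if len(corner_boxes) != len(categories):
        raise ValueError("need exactly one category per bounding box")
    return {
        "file_name": file_name,
        "objects": {
            "bbox": [xyxy_to_coco(b) for b in corner_boxes],
            "category": list(categories),
        },
    }


def write_metadata(records, path):
    """Write records as JSON Lines: one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Hypothetical annotation: a box spanning (302, 109) to (375, 161), class 0.
record = make_record("0001.png", [[302, 109, 375, 161]], [0])
print(json.dumps(record))
```

The printed record matches the first line of the metadata.jsonl example above. Calling write_metadata(records, "metadata.jsonl") with one record per image would produce the full file.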

Image Requirements

  • Format: All images must be JPEG (.jpeg or .jpg) or PNG (.png) files.

  • Quantity: Include at least 5 images per split so the model has examples to learn from.

  • Exclusivity: The zip file must contain only the images and metadata.jsonl, with no other files and no nested folders.

When the zip file (for example, train.zip) is decompressed, it must produce no folders: only the images and metadata.jsonl.
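The requirements above can be checked locally before uploading. The sketch below uses only the Python standard library; the function names build_flat_zip and check_zip are hypothetical, and the case-insensitive extension match is an assumption rather than documented AutoTrain behavior.

```python
import zipfile
from pathlib import Path, PurePosixPath

ALLOWED_IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}
MIN_IMAGES = 5  # minimum number of images per split


def build_flat_zip(source_dir, zip_path):
    """Zip the top-level files of source_dir with no folder prefix.

    zip_path should live outside source_dir so the archive does not
    include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(Path(source_dir).iterdir()):
            if path.is_file():
                # arcname=path.name drops the directory part of the path
                zf.write(path, arcname=path.name)


def check_zip(zip_path):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    images = 0
    for name in names:
        if "/" in name:
            problems.append(f"nested folder entry: {name}")
        elif name == "metadata.jsonl":
            continue
        elif PurePosixPath(name).suffix.lower() in ALLOWED_IMAGE_SUFFIXES:
            # Assumption: extensions are matched case-insensitively
            images += 1
        else:
            problems.append(f"unexpected file: {name}")
    if "metadata.jsonl" not in names:
        problems.append("metadata.jsonl is missing")
    if images < MIN_IMAGES:
        problems.append(f"found {images} images, need at least {MIN_IMAGES}")
    return problems
```

For example, build_flat_zip("my_images", "train.zip") followed by check_zip("train.zip") returns an empty list when the archive satisfies the checks, where my_images is a placeholder directory name. The check only inspects file names, so it does not verify that each file is a valid image or that metadata.jsonl references every image.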
