
A Python tool for managing YOLO datasets for YOLOv5, YOLOv8, YOLOv11, and other Ultralytics-supported models. It streamlines tasks like dataset combination, data augmentation, class removal, and annotation visualization, and it supports both bounding box and segmentation formats.


alireza-py/YoloDataHelper


Welcome to YoloDataHelper 💡

YoloDataHelper is a small Python utility for processing YOLO (You Only Look Once) datasets. It merges datasets, augments data, removes classes, visualizes annotations, and performs other operations that make YOLO datasets easier for developers and researchers to work with.

πŸ› οΈ Features //

1. Dataset Combination

  • Combine multiple YOLO datasets while properly aligning classes and adjusting label IDs.
  • Retain the original structure of the datasets and generate a unified data.yaml file.

2. Data Augmentation

  • Apply various transformations to YOLO dataset images, such as:
    • Hue, saturation, and brightness adjustments.
    • Contrast enhancement.
    • Adding random noise.
    • Color jittering.
  • Generate augmented images with updated labels.

3. Class Removal

  • Remove specific classes from the dataset and their associated images and labels.
  • Automatically adjust class IDs and update the data.yaml file accordingly.
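The README gives no usage example for class removal. As a minimal sketch of the ID adjustment described above, and not the tool's actual implementation, the hypothetical helper `remove_class` below drops the annotations of one class from a label file's lines and shifts higher class IDs down so they stay contiguous. Removing the associated images is not covered here.

```python
def remove_class(label_lines, names, class_to_remove):
    """Drop annotations of one class and shift higher class IDs down by one."""
    removed_id = names.index(class_to_remove)
    new_names = [n for n in names if n != class_to_remove]
    new_lines = []
    for line in label_lines:
        class_id, *coords = line.split()
        class_id = int(class_id)
        if class_id == removed_id:
            continue  # annotation belongs to the removed class
        if class_id > removed_id:
            class_id -= 1  # keep the remaining IDs contiguous
        new_lines.append(" ".join([str(class_id), *coords]))
    return new_lines, new_names


lines = ["0 0.5 0.5 0.2 0.2", "1 0.3 0.3 0.1 0.1", "2 0.7 0.7 0.2 0.2"]
new_lines, new_names = remove_class(lines, ["cat", "dog", "bird"], "dog")
print(new_lines)  # ['0 0.5 0.5 0.2 0.2', '1 0.7 0.7 0.2 0.2']
print(new_names)  # ['cat', 'bird']
```

The returned name list is what the `names` and `nc` entries of data.yaml would need to reflect afterwards.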

4. Annotation Visualization

  • Display bounding boxes or segmentation masks over images for easy verification.
  • Save annotated images to a specified output directory.

5. Classes Equalization

  • Balance the number of images per class to ensure a uniform distribution.
  • Adjust the dataset to prevent class imbalance issues.

6. Dataset Validation

  • Ensure the presence of the necessary directories (train, valid, test) and their subfolders (images, labels).
  • Automatically create any missing directories if they don't exist.

7. Resize Options

  • Compression: Resize by compressing images.
  • Advanced_compression: Resize using advanced compression.
  • Crop: Resize by cropping images.
  • Advanced_crop: Resize using advanced cropping.

📦 Installation //

Clone the Repository

To get started, first clone the repository:

git clone https://github.com/alireza-py/YoloDataHelper.git
cd YoloDataHelper

Install Dependencies

Install the necessary dependencies using pip:

pip install -r requirements.txt

🚀 How to Use //

Run Directly

To use the tool as a standalone application, run the main.py file:

python main.py

1. Dataset Combination

Combine multiple YOLO datasets into a unified dataset:

from YoloDatasetsTools import DatasetProcessor

datasets = ["path/to/dataset1", "path/to/dataset2"]
output_path = "path/to/combined_dataset"

processor = DatasetProcessor(output_path)
processor.combine_datasets(datasets)
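How `combine_datasets` aligns classes internally is not documented here. As a rough sketch of the general idea, and not the tool's actual implementation, a merge could build one unified class list and rewrite the class ID at the start of each label line. The helper names `build_class_map` and `remap_label_line` are hypothetical.

```python
def build_class_map(names_per_dataset):
    """Merge several class-name lists into one list plus per-dataset ID maps."""
    unified = []
    id_maps = []
    for names in names_per_dataset:
        mapping = {}
        for old_id, name in enumerate(names):
            if name not in unified:
                unified.append(name)
            mapping[old_id] = unified.index(name)
        id_maps.append(mapping)
    return unified, id_maps


def remap_label_line(line, mapping):
    """Rewrite the leading class ID of one YOLO label line."""
    class_id, *coords = line.split()
    return " ".join([str(mapping[int(class_id)]), *coords])


unified, id_maps = build_class_map([["cat", "dog"], ["dog", "bird"]])
print(unified)  # ['cat', 'dog', 'bird']
print(remap_label_line("0 0.5 0.5 0.2 0.2", id_maps[1]))  # 1 0.5 0.5 0.2 0.2
```

In this sketch, classes that share a name across datasets collapse into a single ID, which is one plausible reading of "properly aligning classes".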

2. Data Augmentation

Apply data augmentation to a dataset:

from YoloDatasetsTools import DatasetProcessor

output_path = "path/to/augmented_dataset"
augmentation_params = {
     'hue': (-10, 10),
     'saturation': (0.7, 1.3),
     'brightness': (0.7, 1.3),
     'contrast': (0.8, 1.2),
     'noise': (10, 50),
     'color_jitter': (0.9, 1.1)
}
processor = DatasetProcessor(output_path, augmentation_params=augmentation_params, multiplier=3)

processor.process_folder(input_folder="path/to/dataset")
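The `(low, high)` tuples read like sampling ranges and `multiplier` like the number of augmented copies per image, although the README does not state this. Under that assumption, the following sketch draws one parameter set per copy. `sample_params` is a hypothetical helper, not part of the tool.

```python
import random


def sample_params(ranges, multiplier, seed=None):
    """Draw `multiplier` parameter sets, each value uniform within its range."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(multiplier)
    ]


ranges = {"hue": (-10, 10), "brightness": (0.7, 1.3)}
for params in sample_params(ranges, multiplier=3, seed=0):
    print(params)  # one dict of sampled values per augmented copy
```

Passing a seed makes the sampled values reproducible, which helps when comparing augmentation runs.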

4. Annotation Visualization

Visualize bounding boxes or segmentation masks:

from YoloDatasetsTools import DatasetProcessor

output_path = "path/to/visualized_dataset"
processor = DatasetProcessor(output_path)

processor.visualize_annotations(dataset_folder="path/to/dataset")
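Drawing a box requires pixel coordinates, while YOLO bounding-box labels store a class ID followed by a normalized center, width, and height. The sketch below shows that conversion only. `yolo_to_pixel_box` is a hypothetical helper and is not taken from the tool's code.

```python
def yolo_to_pixel_box(line, img_width, img_height):
    """Convert 'class cx cy w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    class_id, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_width, float(w) * img_width
    cy, h = float(cy) * img_height, float(h) * img_height
    x1, y1 = round(cx - w / 2), round(cy - h / 2)
    x2, y2 = round(cx + w / 2), round(cy + h / 2)
    return int(class_id), x1, y1, x2, y2


print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))  # (0, 240, 120, 400, 360)
```

The resulting corners can be passed to any drawing routine, for example a rectangle call in an image library.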

5. Classes Equalization

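# DatasetCleaner is assumed to be exported by YoloDatasetsTools,
# like DatasetProcessor in the other examples.
from YoloDatasetsTools import DatasetCleaner

cleaner = DatasetCleaner(dataset_folder="path/to/dataset")

cleaner.classes_equalization(subset=["train", "valid", "test"])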
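The README does not say how the balancing is done. Whatever the strategy, it has to start from a per-class image count, so the sketch below shows only that step. `images_per_class` is a hypothetical helper, and using the rarest class as the target is just one possible choice.

```python
from collections import Counter


def images_per_class(labels_by_image):
    """Count, for each class ID, how many images contain at least one instance."""
    counts = Counter()
    for lines in labels_by_image.values():
        # A set ensures each image counts at most once per class.
        counts.update({int(line.split()[0]) for line in lines if line.strip()})
    return counts


labels = {
    "a.txt": ["0 0.5 0.5 0.2 0.2", "0 0.1 0.1 0.1 0.1"],
    "b.txt": ["0 0.4 0.4 0.2 0.2", "1 0.6 0.6 0.2 0.2"],
    "c.txt": ["0 0.3 0.3 0.2 0.2"],
}
counts = images_per_class(labels)
target = min(counts.values())  # one possible target: size of the rarest class
print(dict(counts), target)
```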

6. Directory Validation

Ensure required directories (train, valid, test) and their subfolders exist:

from YoloDatasetsTools import DatasetProcessor

dataset_path = "path/to/dataset"
processor = DatasetProcessor(dataset_path)

processor.ensure_dataset(dataset_path)
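For readers who want to see what such a check amounts to, here is a minimal sketch using only the standard library. It is an illustration under the directory layout this README describes, not the code behind `ensure_dataset`, and `ensure_layout` is a hypothetical name.

```python
from pathlib import Path


def ensure_layout(root, subsets=("train", "valid", "test"), parts=("images", "labels")):
    """Create any missing <subset>/<part> folders and return the ones created."""
    created = []
    for subset in subsets:
        for part in parts:
            folder = Path(root) / subset / part
            if not folder.is_dir():
                folder.mkdir(parents=True, exist_ok=True)
                created.append(folder)
    return created
```

Calling `ensure_layout("path/to/dataset")` on an empty folder would create six directories, and a second call would return an empty list.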

7. Resize Options

from YoloDatasetsTools import DatasetProcessor

dataset_path = "path/to/dataset"
output_path = "path/to/output_dataset"
size = (720, 340)
mode = "advance_crop"

processor = DatasetProcessor(dataset_path)

processor.process_resize_and_crop(
    dataset_path,
    output_path,
    size,
    mode,
)
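The README does not define what each mode does to the image or its labels. As an illustration only, the sketch below assumes that a crop-style resize scales the image until it covers the target and then takes a centered crop, and that `size` is `(width, height)`. Both helpers are hypothetical. It shows why box centers may need remapping after a crop, whereas a plain stretch leaves normalized coordinates unchanged.

```python
def cover_crop_geometry(src_size, dst_size):
    """Return (scale, x_offset, y_offset) for scale-to-cover plus a centered crop."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = max(dst_w / src_w, dst_h / src_h)  # smallest scale that covers the target
    x_offset = (src_w * scale - dst_w) / 2
    y_offset = (src_h * scale - dst_h) / 2
    return scale, x_offset, y_offset


def remap_center(cx, cy, src_size, dst_size):
    """Map a normalized box center from the source image into the cropped image."""
    scale, x_off, y_off = cover_crop_geometry(src_size, dst_size)
    new_cx = (cx * src_size[0] * scale - x_off) / dst_size[0]
    new_cy = (cy * src_size[1] * scale - y_off) / dst_size[1]
    return new_cx, new_cy


print(cover_crop_geometry((1440, 1360), (720, 340)))  # (0.5, 0.0, 170.0)
print(remap_center(0.5, 0.5, (1440, 1360), (720, 340)))  # (0.5, 0.5)
```

A remapped center that falls outside the range 0 to 1 belongs to a box whose center was cropped away, which a real implementation would have to clip or discard.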

📚 Directory Structure //

This tool assumes the following directory structure for YOLO datasets:

dataset/
  ├── train/
  │   ├── images/
  │   └── labels/
  ├── valid/
  │   ├── images/
  │   └── labels/
  ├── test/
  │   ├── images/
  │   └── labels/
  └── data.yaml

The data.yaml file should include:

  • train, val, and test: Paths to the respective datasets.
  • nc: The number of classes.
  • names: A list of class names.
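To make the expected keys concrete, the sketch below renders a minimal data.yaml as text from a list of class names. The relative paths are assumptions based on the directory structure above, and `render_data_yaml` is a hypothetical helper rather than part of the tool.

```python
def render_data_yaml(names, train="train/images", val="valid/images", test="test/images"):
    """Build the text of a minimal data.yaml from a list of class names."""
    lines = [
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        f"nc: {len(names)}",
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


print(render_data_yaml(["cat", "dog"]))
```

Deriving `nc` from the length of `names` keeps the two fields consistent by construction.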

💥 Contributing //

Contributions are welcome! If you'd like to contribute to YoloDataHelper, you can:

  • Fork the repository.
  • Create a new branch for your feature or bug fix.
  • Commit your changes and push the branch.
  • Open a pull request with a description of the changes.
  • If you encounter any issues, feel free to open an issue in the repository.
