Dataset Management

The Dataset tab provides tools for importing, organizing, labeling, and splitting image data used for model training, validation, and testing.


1. Image Import

Access the Dataset tab to manage training data.

Import Methods

Method | Description
Drag & Drop | Import image files directly into the workspace
Import Files / Import Folders | Load unlabeled image data from the local filesystem
Import Dataset | Load a pre-labeled dataset in a supported annotation format
Camera Tool | Capture images directly from connected cameras (multi-camera supported)
Spectrogram Generator | Convert audio or CSV files into spectrogram images

Supported Annotation Formats

Format | Description
YOLO | .txt annotation files with normalized bounding box coordinates
COCO | JSON annotation file with bounding boxes and segmentation masks
Pascal VOC | XML annotation files per image
Classification | Directory-per-class folder structure
PNG Mask (Semantic) | Per-image PNG masks where pixel values encode class IDs (for semantic segmentation datasets)
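
To illustrate what "normalized bounding box coordinates" means for YOLO files, the following minimal sketch converts one annotation line to pixel coordinates. It assumes the common YOLO line layout `class_id x_center y_center width height`, with all four box values given as fractions of the image size; the function name `yolo_to_pixel_box` is hypothetical and not part of ONE AI.

```python
def yolo_to_pixel_box(line, image_width, image_height):
    """Convert one YOLO annotation line to (class_id, x_min, y_min, x_max, y_max) in pixels.

    Assumes the common layout: class_id x_center y_center width height,
    where the four box values are normalized to the range 0..1.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return class_id, x_min, y_min, x_max, y_max


# A box centered in a 640x480 image, a quarter as wide and half as tall as the image.
print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
# -> (0, 240.0, 120.0, 400.0, 360.0)
```

Because the coordinates are relative, the same annotation file stays valid if the image is resized.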

Example Datasets

Try one of these demo datasets to get started quickly:

Camera Tool Integration

The integrated Camera Tool captures images directly from connected cameras. Images are automatically timestamped, and multi-camera simultaneous capture is supported.

Spectrogram Generator

Converts audio or CSV files into spectrogram images for audio/time-series classification tasks.

Option | Description
Target split | Train, test, or validation directory
Sampling rate | Must match the source data's sampling rate
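
Why the sampling rate has to match can be seen from how a frequency axis is computed: bin k of an n-sample window corresponds to k × sampling_rate / n Hz. The sketch below is not the Spectrogram Generator's implementation; it uses a plain discrete Fourier transform from the Python standard library, and the helper `dominant_frequency` is a hypothetical name chosen for illustration.

```python
import cmath
import math


def dominant_frequency(samples, sampling_rate):
    """Return the frequency in Hz of the strongest DFT bin below the Nyquist limit."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):
        # Naive DFT coefficient for bin k (fine for a short illustrative window).
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    # Bin index to Hz: this is where the sampling rate enters.
    return best_bin * sampling_rate / n


true_rate = 400  # Hz, the rate the signal was actually recorded at
signal = [math.sin(2 * math.pi * 50 * i / true_rate) for i in range(64)]  # 50 Hz tone

print(dominant_frequency(signal, true_rate))  # -> 50.0
print(dominant_frequency(signal, 800))        # -> 100.0 (wrong rate, wrong frequency axis)
```

With a mismatched rate the same samples are placed at the wrong frequencies, so spectrograms from differently configured sources would no longer be comparable.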

Synthetic Training Data

When real training data is scarce (edge cases, rare objects, privacy constraints), synthetic data generation via Rendered.ai provides a scalable alternative.


2. Label Modes

Choose a label mode based on your task:

Mode | Annotation | Use Case
Classes | One or more class labels per image | Quality control, sorting, counting
Objects | Bounding boxes with class labels | Locating and identifying multiple objects
Segmentation | Pixel-precise masks per object | Surface defects, medical imaging, precise region delineation

Converting Labels to Classification

Switching the label mode from Objects or Segmentation to Classes converts the dataset at training time (annotation files on disk are not modified). For finer control, keep the original label mode and select a Prediction Type or Segmentation Type in the Model Settings tab:

Prediction Type | Behavior
All Present Object-Classes | Selects all classes present in the image
Class with Largest Combined Object Area | Selects the class with the largest total bounding box area
Class with Most Objects | Selects the most frequently occurring class
At Least One Object? (Y/N) | Binary: any object present or not
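
The four prediction types can be read as simple reductions over an image's bounding boxes. The sketch below is one possible interpretation, not ONE AI's implementation: the string keys, the alphabetical tie-breaking, and the choice to sum box areas without subtracting overlaps are all assumptions made for illustration.

```python
from collections import Counter


def to_classification(boxes, prediction_type):
    """Reduce object annotations to an image-level label.

    boxes: list of (class_name, x_min, y_min, x_max, y_max).
    prediction_type: hypothetical keys mirroring the four table rows.
    """
    if prediction_type == "at_least_one_object":
        return len(boxes) > 0
    if prediction_type == "all_present":
        return sorted({box[0] for box in boxes})
    if not boxes:
        return None  # no object, so no single class can be selected
    if prediction_type == "most_objects":
        counts = Counter(box[0] for box in boxes)
        # Ties resolve to the alphabetically first class (an assumption).
        return max(sorted(counts), key=counts.get)
    if prediction_type == "largest_area":
        areas = Counter()
        for name, x_min, y_min, x_max, y_max in boxes:
            areas[name] += (x_max - x_min) * (y_max - y_min)
        return max(sorted(areas), key=areas.get)
    raise ValueError(f"unknown prediction type: {prediction_type}")


boxes = [
    ("scratch", 0, 0, 10, 10),
    ("scratch", 20, 20, 30, 30),
    ("dent", 0, 0, 50, 50),
]
print(to_classification(boxes, "all_present"))          # -> ['dent', 'scratch']
print(to_classification(boxes, "most_objects"))         # -> scratch
print(to_classification(boxes, "largest_area"))         # -> dent
print(to_classification(boxes, "at_least_one_object"))  # -> True
```

The example shows why the choice matters: the same image is labeled "scratch" by object count but "dent" by combined area.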

3. Dataset Splitting

Images are organized into three splits:

Split | Purpose | Recommendations
Training (~70%) | Primary data the model learns from | Minimum 50 images per class
Validation | Monitors performance on unseen data during training; labels required | 20–30% of training set if no separate images available
Test | Final evaluation after training; should represent deployment conditions | Labels optional but recommended for quantitative metrics
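
The "minimum 50 images per class" recommendation can be checked before training. A minimal sketch, assuming one class label per training image; the helper name `underrepresented_classes` is hypothetical and the check runs outside ONE AI.

```python
from collections import Counter


def underrepresented_classes(labels, minimum=50):
    """Return {class_name: image_count} for classes with fewer than `minimum` images.

    labels: one class name per training image.
    """
    counts = Counter(labels)
    return {name: count for name, count in sorted(counts.items()) if count < minimum}


training_labels = ["ok"] * 60 + ["defect"] * 12
print(underrepresented_classes(training_labels))  # -> {'defect': 12}
```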

Validation Set Settings

If no separate validation images are available, enable Use Validation Split to automatically partition the training set:

Dataset Size | Recommended Split
Small | 30%
Standard | 20%
Large | 10%
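
Conceptually, a validation split holds back a fixed fraction of the training images. The source does not say how ONE AI selects those images, so the sketch below assumes a seeded random shuffle; `split_training_set` and its parameters are hypothetical names.

```python
import random


def split_training_set(image_paths, validation_fraction=0.2, seed=0):
    """Return (training, validation) lists, holding back `validation_fraction` of the images."""
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(image_paths)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


paths = [f"img_{i:03d}.png" for i in range(100)]
train, validation = split_training_set(paths, validation_fraction=0.2)
print(len(train), len(validation))  # -> 80 20
```

A larger fraction on small datasets keeps the validation estimate from resting on only a handful of images.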

Test Set Settings

Note: If no separate test set is available, the validation set can be used for final evaluation. Because ONE AI uses the validation set only for early stopping (not hyperparameter tuning), results will be reasonably representative.


4. Labeling

Open the Labels tab to define class labels. Each label can be assigned a unique color for visual distinction in the annotation tool.

Classification

Select or deselect class labels for each image using the label checklist. Multiple labels can be assigned simultaneously for multi-label classification tasks.

Object Detection

Draw bounding boxes around objects and assign a class label to each box.

Shortcut | Tool | Description
R | Rectangle | Draw a bounding box around an object
C | Cursor | Select, move, and resize existing boxes
Delete | Delete | Remove the selected bounding box

Segmentation

Create pixel-precise masks to delineate object boundaries.

Shortcut | Tool | Description
B | Brush | Paint mask regions directly onto the image
S | SmartFill | Draw a rectangle; SAM automatically fills the enclosed mask
E | Eraser | Remove parts of existing masks

Control | Range | Description
Brush size | 4–120 px | Adjustable via slider
Opacity | 0–100% | Controls mask overlay transparency

5. AI-Assisted Annotation

Speed up labeling by using a trained model or Segment Anything (SAM) to predict annotations automatically.

Using Your Own ONE AI Model

Auto labeling works for all three annotation modes:

Mode | Output
Classification | Predicted class labels
Object Detection | Predicted bounding boxes with class labels
Segmentation | Predicted pixel masks per class

Requirements: An exported ONNX model (downloaded via the Exports tab).

Workflow:

  1. Open the Dataset tab → click an image to open the annotation tool
  2. Click the arrow in the top-right corner → select your ONNX model → click +
  3. Configure Minimum Confidence threshold for predictions
  4. Click the AI button to apply predictions to the current image
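
The effect of the Minimum Confidence setting in step 3 can be pictured as a filter over the model's raw predictions. This is a sketch of the idea only; the dictionary layout and the inclusive comparison are assumptions, not ONE AI's internal format.

```python
def filter_by_confidence(predictions, minimum_confidence):
    """Keep only predictions whose confidence reaches the threshold.

    predictions: list of dicts with a "confidence" key in the range 0..1 (assumed layout).
    """
    return [p for p in predictions if p["confidence"] >= minimum_confidence]


raw = [
    {"label": "bolt", "confidence": 0.92},
    {"label": "nut", "confidence": 0.41},
    {"label": "bolt", "confidence": 0.67},
]
print([p["label"] for p in filter_by_confidence(raw, 0.6)])  # -> ['bolt', 'bolt']
```

A higher threshold yields fewer but more reliable suggestions; a lower one yields more candidates that need manual correction.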

For object detection, an IoU merge threshold prevents duplicate detections when using multiple models. For segmentation, results from multiple models are automatically fused.
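
Intersection over Union (IoU) is the overlap area of two boxes divided by the area of their union. The sketch below shows how an IoU threshold can suppress duplicates when detections from several models are combined. It uses greedy non-maximum suppression as a stand-in technique; the source does not state which merge strategy ONE AI applies, and the function names and tuple layout are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - intersection
    return intersection / union if union > 0 else 0.0


def merge_detections(detections, iou_threshold=0.5):
    """Drop same-class detections that overlap a higher-confidence one.

    detections: list of (class_name, confidence, box) pooled from all models.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        is_duplicate = any(
            det[0] == other[0] and iou(det[2], other[2]) >= iou_threshold for other in kept
        )
        if not is_duplicate:
            kept.append(det)
    return kept


model_a = [("bolt", 0.9, (10, 10, 50, 50))]
model_b = [("bolt", 0.8, (12, 12, 52, 52)), ("nut", 0.7, (100, 100, 120, 120))]
print(merge_detections(model_a + model_b))
# -> [('bolt', 0.9, (10, 10, 50, 50)), ('nut', 0.7, (100, 100, 120, 120))]
```

The two "bolt" boxes overlap with an IoU of about 0.82, so only the higher-confidence one survives at a threshold of 0.5.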

Predictions can be discarded with the Reset button or manually corrected.

Using Segment Anything (SAM)

Segment Anything provides AI-assisted annotation using text prompts — no prior training required. SAM is available in both Object Detection and Segmentation modes:

  • Object Detection — click Auto-Segment with SAM to generate bounding boxes
  • Segmentation — use the SmartFill tool (S) to draw a rectangle and let SAM predict the mask, or click Auto-Segment with SAM for full-image segmentation

See the Segment Anything guide for detailed setup and usage instructions.

Iterative Annotation

Annotate a subset manually → train a model → use that model to assist annotation of remaining data. Repeat to progressively improve prediction quality.
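
The loop can be sketched in code. Everything here is hypothetical scaffolding: `annotate_manually`, `train`, and `review` stand in for the manual labeling, training, and correction steps performed in the ONE AI interface, and are passed in as plain Python callables so the sketch stays runnable.

```python
def iterative_annotation(unlabeled, annotate_manually, train, review, batch_size=2):
    """Label images in batches, retraining after each batch.

    annotate_manually(image) -> label          (first batch, no model yet)
    train(labeled_dict) -> model               (model is a callable: image -> prediction)
    review(image, prediction) -> label         (human check of a model suggestion)
    """
    labeled = {}
    remaining = list(unlabeled)
    model = None
    while remaining:
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for image in batch:
            if model is None:
                labeled[image] = annotate_manually(image)
            else:
                labeled[image] = review(image, model(image))
        # Retrain on everything labeled so far before the next batch.
        model = train(labeled)
    return labeled, model


# Stand-in callables for demonstration only.
labels, final_model = iterative_annotation(
    ["a.png", "b.png", "c.png", "d.png", "e.png"],
    annotate_manually=lambda image: "cat",
    train=lambda labeled: (lambda image: "cat"),
    review=lambda image, prediction: prediction,
)
print(len(labels))  # -> 5
```

Only the first batch is labeled from scratch; every later batch starts from model suggestions that a person confirms or corrects.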
