Know When the Dataset Is Ready
The single checklist that decides whether to train now or fix the dataset first.
There's a moment in every project when the team is itching to train. Sometimes that moment is right; sometimes the dataset still has issues that will waste a week of GPU time. The Dataset Readiness Checklist exists to make the call objective.
Run a single readiness checklist on your dataset and decide: yes, fine-tune now, or no, fix X first.
- Eight items. Each must be a yes or a documented exception.
- Any "no" is a fix-first item, not a tune-with-hyperparameters item.
- Document the checklist outcome alongside the QC report.
- Re-run the checklist before every retrain.
Hands-on
The Dataset Readiness Checklist

Eight items. If all eight are yes (or have a documented exception), the dataset is ready to fine-tune. If any are no, fix them before training; fixing the data is almost always cheaper than tuning around it. The checklist gates the same set of decisions that the data collection and annotation guide and the preprocessing annotated data guide each cover individually.
| # | Check | Pass criteria |
|---|---|---|
| 1 | Clear objective | Business goal, vision task, class list, success metric — all written down. |
| 2 | Documented classes | Each class has a one-paragraph definition with positive / negative examples. |
| 3 | Representative images | Dataset spans your scenarios; ≥ 50% from real production sources. |
| 4 | Sufficient volume | ≥ 1500 images / class and ≥ 10,000 labeled instances / class (or documented exception). |
| 5 | Consistent annotations | Labeling guide written; calibration round done; inter-rater agreement ≥ 90%. |
| 6 | QC pass complete | All seven QC checks (lesson 5) pass; QC report written. |
| 7 | Clean splits | 70/15/15 or equivalent; no scene / camera / day leakage; class proportions stratified. |
| 8 | Background images | 0–10% of dataset is unlabeled "no objects" frames. |
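One way to make the gate mechanical is to keep the checklist as data and refuse to train on any bare "no" or undocumented exception. A minimal sketch; the item keys, statuses, and the exception reason below are all illustrative, not a real project's:

```python
# readiness.py -- a minimal sketch of the checklist as data.
# Statuses: "yes", "no", or "exception" (which requires a documented reason).
CHECKLIST = {
    "clear_objective":        {"status": "yes"},
    "documented_classes":     {"status": "yes"},
    "representative_images":  {"status": "yes"},
    "sufficient_volume":      {"status": "exception",
                               "reason": "rare class at 900 images; collection ongoing"},
    "consistent_annotations": {"status": "yes"},
    "qc_pass_complete":       {"status": "yes"},
    "clean_splits":           {"status": "yes"},
    "background_images":      {"status": "yes"},
}

def ready(checklist: dict) -> bool:
    for item, entry in checklist.items():
        if entry["status"] == "no":
            print(f"FIX FIRST: {item}")
            return False
        if entry["status"] == "exception" and not entry.get("reason"):
            print(f"UNDOCUMENTED EXCEPTION: {item}")
            return False
    return True

if __name__ == "__main__":
    print("ready to fine-tune" if ready(CHECKLIST) else "fix the dataset first")
```

The decision flow below shows where each failing item sends you.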
```mermaid
graph TD
A[Run readiness checklist]
A --> B{All 8 items<br/>green?}
B -- yes --> C[Fine-tune YOLO26<br/>from pretrained weights]
B -- no --> D[Identify failing item]
D --> E{Which item?}
E -- 1–2 --> F[Fix objective<br/>+ class definitions]
E -- 3–4 --> G[Collect more /<br/>more diverse data]
E -- 5 --> H[Re-label or<br/>recalibrate annotators]
E -- 6 --> I[Re-run 7-check QC]
E -- 7 --> J[Re-split by<br/>scene / camera / day]
E -- 8 --> K[Add 0–10%<br/>background images]
F --> A
G --> A
H --> A
I --> A
J --> A
K --> A
style B fill:#FF9800,color:#fff
style C fill:#4CAF50,color:#fff
style D fill:#F44336,color:#fff
```
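Item 8 is the easiest to verify mechanically: in YOLO-format datasets a background frame is an image whose label file is missing or empty, so the fraction can be counted directly. A quick sketch, assuming the usual images/ and labels/ layout (the paths are illustrative):

```python
from pathlib import Path

# Assumed layout: images/ and labels/ siblings, YOLO-format .txt labels.
IMAGES = Path("dataset/images/train")
LABELS = Path("dataset/labels/train")

images = [p for p in IMAGES.iterdir() if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]
background = 0
for img in images:
    label = LABELS / (img.stem + ".txt")
    # A missing or empty label file means a background ("no objects") frame.
    if not label.exists() or label.stat().st_size == 0:
        background += 1

frac = background / len(images)
print(f"background frames: {background}/{len(images)} ({frac:.1%})")
assert frac <= 0.10, "item 8 fails: background fraction above 10%"
```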
What the checklist replaces
Without this checklist, the next steps in many projects are anything but checking the data:
- "Let's try a bigger model." — won't fix label noise.
- "Let's tune the learning rate." — won't fix class imbalance.
- "Let's train for more epochs." — will overfit faster.
- "Let's add more augmentation." — won't add scenarios that aren't there.
The checklist short-circuits all of these. If checklist item N is failing, the next action is fixing N — not tuning around it.
Common "almost ready" patterns
Three patterns that show up over and over:
Pattern 1: Volume but no variance
```
Items 1–3 ✅
Item 4 (volume) ✅  ← 1500+ images
Item 5 (consistency) ✅
Item 6 (QC) ❌  ← no edge cases, single-camera capture
```

Fix: collect from more cameras / sites / times before training. The 5,000 images you have aren't worth as much as 1,500 diverse ones.
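A quick way to catch this pattern before training is to audit where the images came from. A sketch, assuming a hypothetical filename convention that encodes the camera id (e.g. cam03_000123.jpg); adapt the key function to wherever your capture metadata actually lives:

```python
from collections import Counter
from pathlib import Path

# Assumption: the filename prefix encodes the source camera, e.g. "cam03_000123.jpg".
def camera_id(path: Path) -> str:
    return path.stem.split("_")[0]

counts = Counter(camera_id(p) for p in Path("dataset/images/train").glob("*.jpg"))
total = sum(counts.values())
for cam, n in counts.most_common():
    print(f"{cam}: {n} images ({n / total:.1%})")
# One camera dominating (say > 60%) is the "volume but no variance" smell.
```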
Pattern 2: Variance but no consistency
```
Items 1–4 ✅
Item 5 (consistent annotations) ❌  ← inter-rater agreement 72%
```

Fix: write the labeling guide, run a calibration round, re-label or audit a sample. Train on inconsistent labels and the model learns the inconsistency.
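Inter-rater agreement for detection can be measured several ways; one simple sketch is greedy box matching between two annotators, counting a pair as agreement when classes match and boxes overlap. The 0.5 IoU threshold here is an assumption, not the course's definition:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def box_agreement(boxes_a, boxes_b, iou_thr=0.5):
    """Greedy box-level agreement between two annotators on one image.

    Each annotation is (class_name, (x1, y1, x2, y2)). A pair counts as
    agreement when classes match and IoU >= iou_thr; unmatched boxes on
    either side count against the score.
    """
    matched, used = 0, set()
    for cls_a, box_a in boxes_a:
        for j, (cls_b, box_b) in enumerate(boxes_b):
            if j in used or cls_b != cls_a:
                continue
            if iou(box_a, box_b) >= iou_thr:
                matched += 1
                used.add(j)
                break
    total = len(boxes_a) + len(boxes_b)
    return (2 * matched) / total if total else 1.0

# Example: annotator B missed the person, so agreement is 2/3.
a = [("car", (10, 10, 50, 50)), ("person", (60, 10, 80, 60))]
b = [("car", (12, 11, 51, 49))]
print(f"agreement: {box_agreement(a, b):.0%}")
```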
Pattern 3: Quiet leakage
```
Items 1–6 ✅
Item 7 (clean splits) ❌  ← random per-image split on video data
```

Fix: re-split by scene / camera / day. Your val mAP will drop; that drop is reality.
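One way to implement the fix is a group-aware split, so every scene / camera / day lands in exactly one of train / val / test. A sketch using scikit-learn's GroupShuffleSplit; the filename convention in the example is hypothetical:

```python
from sklearn.model_selection import GroupShuffleSplit

def leakage_free_split(images, groups, seed=0):
    """Roughly 70/15/15 split that keeps each group (scene / camera / day)
    in exactly one of train / val / test. Proportions are per group, not
    per image, so expect some drift; stratifying class proportions on top
    of this needs extra care."""
    outer = GroupShuffleSplit(n_splits=1, test_size=0.30, random_state=seed)
    train_idx, rest_idx = next(outer.split(images, groups=groups))
    rest_groups = [groups[i] for i in rest_idx]
    inner = GroupShuffleSplit(n_splits=1, test_size=0.50, random_state=seed)
    val_rel, test_rel = next(inner.split(rest_idx, groups=rest_groups))
    return train_idx, rest_idx[val_rel], rest_idx[test_rel]

# Example: group = camera id parsed from an assumed "camNN_frame.jpg" naming.
images = [f"cam{c:02d}_{i:04d}.jpg" for c in range(10) for i in range(100)]
groups = [name.split("_")[0] for name in images]
train, val, test = leakage_free_split(images, groups)
# No camera appears in both train and val.
assert not ({groups[i] for i in train} & {groups[i] for i in val})
```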
When to ship despite a "no"
Sometimes shipping with a known gap is the right call — a v1 deployment, a prototype, a customer demo. In those cases:
- Document which checklist item failed.
- Document the expected impact (e.g. "no nighttime data; expect a ≥ 5-point mAP drop after dark").
- Collect the missing data while v1 runs (one way to record the gap is sketched below).
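A lightweight way to make those three points durable is an append-only exception log kept next to the QC report. The record format and file name here are illustrative, not a prescribed convention:

```python
import datetime
import json

# Hypothetical exception record; adapt fields to your project's needs.
exception = {
    "checklist_item": 3,  # representative images
    "gap": "no nighttime data",
    "expected_impact": ">= 5-point mAP drop after dark",
    "mitigation": "collect night footage while v1 runs; retrain as v1.1",
    "recorded": datetime.date.today().isoformat(),
}
with open("readiness_exceptions.jsonl", "a") as f:
    f.write(json.dumps(exception) + "\n")
```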
The checklist isn't a gate so much as a forcing function: if we ship now, what's the bug we're shipping with? Knowing the answer is the difference between a managed risk and a surprise.
Re-run the checklist every retrain
A dataset that was ready for v1 isn't automatically ready for v3. New classes, new collection, new annotators — each can break a previously-passing item. The checklist is cheap to re-run; the consequences of skipping it are not.
The checklist is a 10-minute action that prevents week-long GPU bills.
Run the eight-item checklist on your dataset. Score yes / no / exception for each. If anything is no, fix that first, before the first training run. You're done when:
- Every item is yes or has a documented exception.
- The checklist outcome is committed to your project wiki / repo.
- The team agrees the dataset is ready (or knows what's failing).
Ready. Time to fine-tune the first YOLO26 model on your data.