Know When the Dataset Is Ready
The single checklist that decides whether to train now or fix the dataset first.
There's a moment in every project when the team is itching to train. Sometimes that moment is right; sometimes the dataset still has issues that will waste a week of GPU time. The Dataset Readiness Checklist exists to make the call objective.
Run a single readiness checklist on your dataset and decide: yes, fine-tune now, or no, fix X first.
- Eight items. Each must be a yes or a documented exception.
- Any "no" is a fix-first item, not a tune-with-hyperparameters item.
- Document the checklist outcome alongside the QC report.
- Re-run the checklist before every retrain.
Hands-on
The Dataset Readiness Checklist

Eight items. If all eight are yes (or have a documented exception), the dataset is ready to fine-tune. If any are no, fix them before training; fixing the data is almost always cheaper than tuning around it. The checklist gates the same set of decisions that the data collection and annotation guide and the preprocessing annotated data guide each cover individually.
| # | Check | Pass criteria |
|---|---|---|
| 1 | Clear objective | Business goal, vision task, class list, success metric — all written down. |
| 2 | Documented classes | Each class has a one-paragraph definition with positive / negative examples. |
| 3 | Representative images | Dataset spans your scenarios; ≥ 50% from real production sources. |
| 4 | Sufficient volume | ≥ 1500 images / class and ≥ 10,000 labeled instances / class (or documented exception). |
| 5 | Consistent annotations | Labeling guide written; calibration round done; inter-rater agreement ≥ 90%. |
| 6 | QC pass complete | All seven QC checks (lesson 5) pass; QC report written. |
| 7 | Clean splits | 70/15/15 or equivalent; no scene / camera / day leakage; class proportions stratified. |
| 8 | Background images | 0–10% of dataset is unlabeled "no objects" frames. |
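One way to make the gate mechanical is to keep the checklist as data and refuse to train on any bare "no" or undocumented exception. A minimal sketch; the item keys, statuses, and the exception reason below are all illustrative, not a real project's:

```python
# readiness.py -- a minimal sketch of the checklist as data.
# Statuses: "yes", "no", or "exception" (which requires a documented reason).
CHECKLIST = {
    "clear_objective":        {"status": "yes"},
    "documented_classes":     {"status": "yes"},
    "representative_images":  {"status": "yes"},
    "sufficient_volume":      {"status": "exception",
                               "reason": "rare class at 900 images; collection ongoing"},
    "consistent_annotations": {"status": "yes"},
    "qc_pass_complete":       {"status": "yes"},
    "clean_splits":           {"status": "yes"},
    "background_images":      {"status": "yes"},
}

def ready(checklist: dict) -> bool:
    for item, entry in checklist.items():
        if entry["status"] == "no":
            print(f"FIX FIRST: {item}")
            return False
        if entry["status"] == "exception" and not entry.get("reason"):
            print(f"UNDOCUMENTED EXCEPTION: {item}")
            return False
    return True

if __name__ == "__main__":
    print("ready to fine-tune" if ready(CHECKLIST) else "fix the dataset first")
```

The decision flow below shows where each failing item sends you.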
```mermaid
graph TD
A[Run readiness checklist]
A --> B{All 8 items<br/>green?}
B -- yes --> C[Fine-tune YOLO26<br/>from pretrained weights]
B -- no --> D[Identify failing item]
D --> E{Which item?}
E -- 1–2 --> F[Fix objective<br/>+ class definitions]
E -- 3–4 --> G[Collect more /<br/>more diverse data]
E -- 5 --> H[Re-label or<br/>recalibrate annotators]
E -- 6 --> I[Re-run 7-check QC]
E -- 7 --> J[Re-split by<br/>scene / camera / day]
E -- 8 --> K[Add 0–10%<br/>background images]
F --> A
G --> A
H --> A
I --> A
J --> A
K --> A
style B fill:#FF9800,color:#fff
style C fill:#4CAF50,color:#fff
style D fill:#F44336,color:#fff
```
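Item 8 is the easiest to verify mechanically: in YOLO-format datasets a background frame is an image whose label file is missing or empty, so the fraction can be counted directly. A quick sketch, assuming the usual images/ and labels/ layout (the paths are illustrative):

```python
from pathlib import Path

# Assumed layout: images/ and labels/ siblings, YOLO-format .txt labels.
IMAGES = Path("dataset/images/train")
LABELS = Path("dataset/labels/train")

images = [p for p in IMAGES.iterdir() if p.suffix.lower() in {".jpg", ".jpeg", ".png"}]
background = 0
for img in images:
    label = LABELS / (img.stem + ".txt")
    # A missing or empty label file means a background ("no objects") frame.
    if not label.exists() or label.stat().st_size == 0:
        background += 1

frac = background / len(images)
print(f"background frames: {background}/{len(images)} ({frac:.1%})")
assert frac <= 0.10, "item 8 fails: background fraction above 10%"
```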
What the checklist replaces
Without this checklist, the next steps in many projects are anything but checking the data:
- "Let's try a bigger model." — won't fix label noise.
- "Let's tune the learning rate." — won't fix class imbalance.
- "Let's train for more epochs." — will overfit faster.
- "Let's add more augmentation." — won't add scenarios that aren't there.
The checklist short-circuits all of these. If checklist item N is failing, the next action is fixing N — not tuning around it.
Common "almost ready" patterns
Three patterns that show up over and over:
Pattern 1: Volume but no variance
```
Items 1–3 ✅
Item 4 (volume) ✅  ← 1500+ images
Item 5 (consistency) ✅
Item 6 (QC) ❌  ← no edge cases, single-camera capture
```

Fix: collect from more cameras / sites / times before training. The 5,000 images you have aren't worth as much as 1,500 diverse ones.
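A quick way to catch this pattern before training is to audit where the images came from. A sketch, assuming a hypothetical filename convention that encodes the camera id (e.g. cam03_000123.jpg); adapt the key function to wherever your capture metadata actually lives:

```python
from collections import Counter
from pathlib import Path

# Assumption: the filename prefix encodes the source camera, e.g. "cam03_000123.jpg".
def camera_id(path: Path) -> str:
    return path.stem.split("_")[0]

counts = Counter(camera_id(p) for p in Path("dataset/images/train").glob("*.jpg"))
total = sum(counts.values())
for cam, n in counts.most_common():
    print(f"{cam}: {n} images ({n / total:.1%})")
# One camera dominating (say > 60%) is the "volume but no variance" smell.
```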
Pattern 2: Variance but no consistency
```
Items 1–4 ✅
Item 5 (consistent annotations) ❌  ← inter-rater agreement 72%
```

Fix: write the labeling guide, run a calibration round, re-label or audit a sample. Train on inconsistent labels and the model learns the inconsistency.
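Inter-rater agreement for detection can be measured several ways; one simple sketch is greedy box matching between two annotators, counting a pair as agreement when classes match and boxes overlap. The 0.5 IoU threshold here is an assumption, not the course's definition:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def box_agreement(boxes_a, boxes_b, iou_thr=0.5):
    """Greedy box-level agreement between two annotators on one image.

    Each annotation is (class_name, (x1, y1, x2, y2)). A pair counts as
    agreement when classes match and IoU >= iou_thr; unmatched boxes on
    either side count against the score.
    """
    matched, used = 0, set()
    for cls_a, box_a in boxes_a:
        for j, (cls_b, box_b) in enumerate(boxes_b):
            if j in used or cls_b != cls_a:
                continue
            if iou(box_a, box_b) >= iou_thr:
                matched += 1
                used.add(j)
                break
    total = len(boxes_a) + len(boxes_b)
    return (2 * matched) / total if total else 1.0

# Example: annotator B missed the person, so agreement is 2/3.
a = [("car", (10, 10, 50, 50)), ("person", (60, 10, 80, 60))]
b = [("car", (12, 11, 51, 49))]
print(f"agreement: {box_agreement(a, b):.0%}")
```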
Pattern 3: Quiet leakage
```
Items 1–6 ✅
Item 7 (clean splits) ❌  ← random per-image split on video data
```

Fix: re-split by scene / camera / day. Your val mAP will drop; that drop is reality.
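One way to implement the fix is a group-aware split, so every scene / camera / day lands in exactly one of train / val / test. A sketch using scikit-learn's GroupShuffleSplit; the filename convention in the example is hypothetical:

```python
from sklearn.model_selection import GroupShuffleSplit

def leakage_free_split(images, groups, seed=0):
    """Roughly 70/15/15 split that keeps each group (scene / camera / day)
    in exactly one of train / val / test. Proportions are per group, not
    per image, so expect some drift; stratifying class proportions on top
    of this needs extra care."""
    outer = GroupShuffleSplit(n_splits=1, test_size=0.30, random_state=seed)
    train_idx, rest_idx = next(outer.split(images, groups=groups))
    rest_groups = [groups[i] for i in rest_idx]
    inner = GroupShuffleSplit(n_splits=1, test_size=0.50, random_state=seed)
    val_rel, test_rel = next(inner.split(rest_idx, groups=rest_groups))
    return train_idx, rest_idx[val_rel], rest_idx[test_rel]

# Example: group = camera id parsed from an assumed "camNN_frame.jpg" naming.
images = [f"cam{c:02d}_{i:04d}.jpg" for c in range(10) for i in range(100)]
groups = [name.split("_")[0] for name in images]
train, val, test = leakage_free_split(images, groups)
# No camera appears in both train and val.
assert not ({groups[i] for i in train} & {groups[i] for i in val})
```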
When to ship despite a "no"
Sometimes shipping with a known gap is the right call — a v1 deployment, a prototype, a customer demo. In those cases:
- Document which checklist item failed.
- Document the expected impact (e.g. "no nighttime data; expect a ≥ 5-point mAP drop after dark").
- Collect the missing data while v1 runs (one way to record the gap is sketched below).
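A lightweight way to make those three points durable is an append-only exception log kept next to the QC report. The record format and file name here are illustrative, not a prescribed convention:

```python
import datetime
import json

# Hypothetical exception record; adapt fields to your project's needs.
exception = {
    "checklist_item": 3,  # representative images
    "gap": "no nighttime data",
    "expected_impact": ">= 5-point mAP drop after dark",
    "mitigation": "collect night footage while v1 runs; retrain as v1.1",
    "recorded": datetime.date.today().isoformat(),
}
with open("readiness_exceptions.jsonl", "a") as f:
    f.write(json.dumps(exception) + "\n")
```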
The checklist isn't a gate so much as a forcing function: if we ship now, what's the bug we're shipping with? Knowing the answer is the difference between a managed risk and a surprise.
Re-run the checklist every retrain
A dataset that was ready for v1 isn't automatically ready for v3. New classes, new collection, new annotators — each can break a previously-passing item. The checklist is cheap to re-run; the consequences of skipping it are not.
The checklist is a 10-minute action that prevents week-long GPU bills.
Run the eight-item checklist on your dataset. Score yes / no / exception for each. If anything is no, fix that first, before the first training run. You're done when:
- Every item is yes or has a documented exception.
- The checklist outcome is committed to your project wiki / repo.
- The team agrees the dataset is ready (or knows what's failing).
Ready. Time to fine-tune the first YOLO26 model on your data.