Building High-Performance YOLO Datasets · Train and Iterate · Lesson 9/10
Lesson · Beginner

First Fine-Tune and the Iteration Loop

Use pretrained YOLO26 weights, train with defaults, and let the validation results tell you what to fix in the dataset.

Fine-tuning means starting from pretrained weights instead of random initialization — it converges faster and needs less data. With a ready dataset, the first run is cheap and informative. The point isn't to win on the first try; it's to surface what the dataset still needs.

Outcome

Fine-tune a pretrained Ultralytics YOLO model on your dataset, read the validation artifacts honestly, and use them to plan dataset improvements.

Fast Track
If you already know your way around, here's the short version.
  1. Always start from pretrained weights — pick the variant that matches your task: yolo26n.pt (detect), yolo26n-seg.pt (segment), yolo26n-pose.pt (pose), yolo26n-obb.pt (OBB), yolo26n-cls.pt (classify).

  2. Train with defaults first — establish a baseline.

  3. Read the per-class AP and confusion matrix; let them point at dataset issues.

  4. Iterate on data before tuning hyperparameters.

Hands-on

The minimal first run

[Image: Ultralytics Platform training charts and metrics]

Once the readiness checklist passes, the first training run is three lines. Pick the YOLO26 variant that matches your task:

| Task | Pretrained checkpoint |
| --- | --- |
| Detect (default below) | yolo26n.pt |
| Segment | yolo26n-seg.pt |
| Pose | yolo26n-pose.pt |
| OBB | yolo26n-obb.pt |
| Classify | yolo26n-cls.pt |

from ultralytics import YOLO

model = YOLO("yolo26n.pt")     # pretrained on COCO — fine-tune from here
results = model.train(
    data="data.yaml",
    epochs=100,
    imgsz=640,
)

For other tasks, swap in the matching checkpoint above (e.g. YOLO("yolo26n-seg.pt") for segmentation). That's the whole script. The fine-tuning guide covers what's actually happening:

  • Backbone and neck weights transfer from pretrained features (COCO for detect / segment / pose, ImageNet for classify, DOTAv1 for OBB).
  • The task-specific head is partially reinitialized to match your class count.
  • Training adapts the model to your classes in a fraction of the time training-from-scratch would take.
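
The head reinitialization comes down to a class-count mismatch. As a sketch (assuming PyYAML and a standard data.yaml with a `names` list), you can see why your head almost never matches the pretrained one:

```python
# Sketch: the detect head is built for 80 COCO classes; your data.yaml
# almost never has exactly 80, so the head must be partially reinitialized.
import yaml

COCO_DETECT_CLASSES = 80

def head_reinit_needed(data_yaml_text: str) -> bool:
    names = yaml.safe_load(data_yaml_text)["names"]
    return len(names) != COCO_DETECT_CLASSES
```

For example, `head_reinit_needed("names: [helmet, vest, person]")` returns True: three custom classes, not eighty.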

Why fine-tune (and not train from scratch)

| | Fine-tuning | From scratch |
| --- | --- | --- |
| Starting weights | Pretrained on COCO (80 classes) | Random |
| Convergence | Faster; backbone is already trained | Slower; all layers learn from zero |
| Data requirements | Lower | Higher |
| When to use | Custom classes with natural images | Domains fundamentally different from COCO (medical, satellite, radar) |

For 95% of enterprise projects, fine-tuning from the matching nano checkpoint is the right starting point (yolo26n.pt for detect, yolo26n-seg.pt for segment, etc.) — even if you eventually upgrade to m or x sizes. Start small, get a baseline, then scale.

Train with defaults first

The Ultralytics tips guidance is unambiguous: establish a baseline before changing anything. Defaults are tuned across thousands of datasets; the chance that your first instinct beats them on lap one is low. If you touch anything at all, it should be one of these three knobs:

| Argument | Default | When to change |
| --- | --- | --- |
| epochs | 100 | Lower for tiny datasets to avoid overfitting; higher (200–300) for very large ones |
| imgsz | 640 | Bump to 1024/1280 if many of your objects are small in frame |
| batch | -1 (auto) | Set manually if you're hitting OOM or want batch-norm stability |

Skip everything else for the first run.
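
For a later run (never the first), adjusting those three knobs might look like this. The values are illustrative assumptions, not recommendations:

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.train(
    data="data.yaml",
    epochs=200,   # assumed: very large dataset
    imgsz=1280,   # assumed: many small objects in frame
    batch=8,      # assumed: set manually after an OOM with auto batch
)
```

Everything not listed stays at its default.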

Reading the run

After training finishes, you'll have a directory like runs/detect/train/ containing:

| File | What it tells you |
| --- | --- |
| weights/best.pt | Checkpoint with the best validation mAP |
| results.png | Loss and metric curves over training |
| confusion_matrix_normalized.png | Where classes get confused |
| PR_curve.png | Per-class precision/recall |
| val_batch*_pred.jpg | Predictions rendered on a validation batch |

What healthy looks like at 30+ epochs:

  • Train and val loss both descending (not diverging — that's overfitting).
  • Val mAP@0.5:0.95 climbing past 0.4 (varies by task).
  • Per-class AP not too unbalanced.
  • Sample predictions visually correct.

If any of those don't hold, the performance metrics guide is the right next read.
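
The first check (train/val divergence) can be turned into a small helper. A sketch, assuming you've parsed the train and val loss columns out of the run's results.csv; the window size is an arbitrary choice:

```python
# Sketch of the overfitting check: val loss rising while train loss
# is still falling over the last `window` epochs.
def diverging(train_loss, val_loss, window=10):
    t, v = train_loss[-window:], val_loss[-window:]
    return t[-1] < t[0] and v[-1] > v[0]
```

If it returns True, stop tuning and look at the dataset split sizes and label quality first.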

The dataset-improvement loop

The most important habit: when val results disappoint, look at the data first.

                   ┌──────────────┐
                   │   train run  │
                   └──────┬───────┘
                          ▼
                  ┌──────────────────┐
                  │ confusion matrix │
                  │   per-class AP   │
                  └────────┬─────────┘
                           ▼
              ┌─────────────────────────┐
   yes        │  is the bottleneck a    │       no
   ┌──────────│  specific class /       │──────────┐
   │          │  scenario / labeling?   │          │
   ▼          └─────────────────────────┘          ▼
 collect more                              tune hyperparameters,
 / re-label /                              try a larger model,
 fix the spec                              or augmentation

Almost every "model is bad" diagnosis ends in dataset fixes. The fine-tuning guide has a troubleshooting matrix worth bookmarking.

A typical first iteration loop:

  1. Train with defaults. Get a baseline mAP per class.
  2. Find the worst class on the confusion matrix.
  3. Audit 50 examples of that class. Usually you'll find under-representation, mislabeled instances, or unusable image quality.
  4. Fix the dataset (collect, re-label, drop bad images).
  5. Retrain — same hyperparameters, just better data.
  6. Compare. The improvement should be much bigger than any tuning would have produced.
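
Step 2 (finding the worst class) doesn't have to be eyeballed from the matrix. A sketch, assuming a class-id-to-name mapping like `model.names` and a per-class AP sequence like `metrics.box.maps`:

```python
# Sketch: rank classes by AP ascending; the lowest are your audit candidates.
def worst_classes(names, aps, k=3):
    ranked = sorted(names, key=lambda cls_id: aps[cls_id])
    return [(names[cls_id], aps[cls_id]) for cls_id in ranked[:k]]
```

Feed the top of that list into step 3's 50-example audit.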

When to reach for hyperparameters

After 2–3 dataset iterations, when the dataset is genuinely strong, then hyperparameters can buy a few mAP points. Read the fine-tuning guide for layer freezing, two-stage fine-tuning, and optimizer choice; the YOLO performance metrics guide covers the metrics you'll use to compare runs.
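
As a sketch of the layer-freezing and two-stage ideas from the fine-tuning guide, using the `freeze` and `lr0` train arguments; the 30/70 epoch split, the 10-layer freeze, and the learning rate here are assumptions for illustration, not recommendations:

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Stage 1: freeze the first 10 layers (the backbone stem) and train the head.
model.train(data="data.yaml", epochs=30, freeze=10)

# Stage 2: unfreeze everything and fine-tune with a lower initial learning rate.
model.train(data="data.yaml", epochs=70, lr0=0.001)
```

Compare both stages against the plain-defaults baseline before keeping either.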

But don't get there before the dataset is right. Most teams skip the dataset loop and tune for weeks, when half a day of data work would have gotten them further.

Where to go next

Once your fine-tune produces a model that meets the success metric:

  • Productionize it. The Academy's Train your first YOLO model and Ultralytics YOLO in Production courses cover export (ONNX, TensorRT, OpenVINO, CoreML), tracking, and multi-stream serving.
  • Deploy it. Build with Ultralytics Platform covers managed cloud training, dedicated endpoints, and monitoring.
  • Plan the retrain. Production data drifts. The retraining loop in Build with Ultralytics Platform (lessons 9–10) shows the operational shape.
Try It

Run the three-line fine-tune on your dataset. After it finishes, open the confusion matrix and the per-class AP. Pick the worst class. Audit 50 examples of it. Write down what you find — that's the next dataset improvement, not the next hyperparameter tweak.

Done When
You've finished the lesson when all of these are true.
  • Your first fine-tune completes and produces a best.pt.

  • You've identified the worst-performing class from the confusion matrix.

  • You've audited that class and have a concrete dataset action.

  • You haven't touched any hyperparameter yet — you're iterating on data.

Solution
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
results = model.train(
    data="data.yaml",
    epochs=100,
    imgsz=640,
)

# Inspect val results
metrics = model.val()
print(f"mAP@0.5:0.95 = {metrics.box.map:.3f}")

# Per-class AP — this is what tells you what to fix in the data
for cls_id, cls_name in model.names.items():
    ap = metrics.box.maps[cls_id]
    print(f"  {cls_name:>15s}  AP={ap:.3f}")
What's next

One more lesson: a reusable client checklist for every future dataset.