Build with Ultralytics Platform · Deploy and Monitor · Lesson 10/10 · Intermediate

From Prototype to Pipeline

The repeating shape of a CV team's quarter — and the artifacts that make it sustainable.

After the first deployment, the project enters a steady state. Every month or so the same cycle runs: drift hits, you collect data, retrain, validate, and redeploy. Done well (and the Ultralytics customers page is full of teams running this loop quietly), this is calm and repeatable. Done badly, it's a panic on a dashboard. The difference lies mostly in the artifacts you set up before you're under pressure.

Outcome

Set up the artifacts and rituals that make a CV project sustainable for a year — not just shipped once.

Fast Track
If you already know your way around, here's the short version.
  1. Holdout set, refreshed quarterly.
  2. Drift dashboards with named owners.
  3. Versioned datasets, versioned deployments, traceable through to runs.
  4. A weekly review ritual.

Hands-on

The artifacts


Five things every long-running CV project should have. Platform makes most of them automatic — but they need someone explicitly responsible.

1. The holdout set

200–500 labeled frames that look like real production traffic. Refresh the set quarterly. Accuracy on this set is the single most reliable signal for model monitoring.
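One way to draw such a set reproducibly is a seeded sample spread across cameras. This is a minimal sketch, assuming frames are identified by path and tagged with a camera ID; the function name `sample_holdout`, the `(camera_id, path)` tuple layout, and the default size of 300 are hypothetical choices for illustration, not Platform features.

```python
import random
from collections import defaultdict


def sample_holdout(frames, size=300, seed=0):
    """Pick up to `size` frames, spread as evenly as possible across cameras.

    frames: iterable of (camera_id, frame_path) tuples (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_camera = defaultdict(list)
    for camera_id, path in frames:
        by_camera[camera_id].append(path)

    # Shuffle each camera's frames once, visiting cameras in a stable order.
    pools = []
    for camera_id in sorted(by_camera):
        paths = sorted(by_camera[camera_id])
        rng.shuffle(paths)
        pools.append(paths)

    # Round-robin across cameras until the target size is reached
    # or every pool is empty.
    holdout = []
    while len(holdout) < size and any(pools):
        for pool in pools:
            if pool and len(holdout) < size:
                holdout.append(pool.pop())
    return holdout
```

The round-robin keeps a busy camera from crowding out a quiet one, which matters when drift tends to show up on a single camera first.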

2. The drift dashboard

Confidence histogram, object size, detection volume — three signals, plotted weekly. Owner: someone who looks at it.
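The three signals can be computed from logged detections with very little code. The sketch below assumes each detection is logged as a dict with `conf`, `w`, and `h` (normalized box width and height); that record layout and the function name `weekly_drift_signals` are assumptions for illustration.

```python
from statistics import median


def weekly_drift_signals(detections, n_frames, bins=10):
    """Summarize one week of detections into the three dashboard signals.

    detections: list of dicts with 'conf' in [0, 1] and normalized 'w', 'h'
                (a hypothetical logging format).
    n_frames:   number of frames processed that week.
    """
    # 1. Confidence histogram: counts per equal-width bin over [0, 1].
    hist = [0] * bins
    for det in detections:
        idx = min(int(det["conf"] * bins), bins - 1)  # conf == 1.0 lands in the last bin
        hist[idx] += 1

    # 2. Object size: median normalized box area.
    areas = [det["w"] * det["h"] for det in detections]
    median_area = median(areas) if areas else 0.0

    # 3. Detection volume: mean detections per frame.
    per_frame = len(detections) / n_frames if n_frames else 0.0

    return {
        "conf_hist": hist,
        "median_area": median_area,
        "detections_per_frame": per_frame,
    }
```

Plotting these week over week is usually enough: a histogram sliding toward low confidence, shrinking boxes, or a sudden change in volume are each worth a look from the owner.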

3. The MLOps retraining runbook

A document, not a culture. When drift fires, what does the on-call do? Step by step. Half a page is plenty.

# Retraining runbook

## Trigger
- Holdout mAP drops by > 1.0 point for 3 consecutive weeks, OR
- Confidence-distribution KS test p < 0.01 AND per-class AP regression on holdout.

## Steps
1. Collect the last 30 days of production frames outside Platform.
2. Pre-filter with current best.pt (event-driven sampling) and upload only survivors at 1 fps into a new Platform dataset.
3. Auto-annotate locally with current best.pt @ conf=0.5, import the labels, and route them to the review queue.
4. Review until drift-class edit rate stabilizes (~200 frames/class typical).
5. Promote new annotations into a new dataset version (v_(n+1)).
6. Train from v_n best.pt with lr0=0.001 for 50 epochs.
7. Validate against holdout AND drift-flagged frames.
8. Deploy with 10%/50%/100% rollout over 24 hours.
9. Update holdout with 50 newly labeled drift-frame examples.

## Owner
- ML lead is the DRI.
- On-call runs the playbook; ML lead reviews the v_(n+1) deployment before 100%.
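
The trigger section is mechanical enough to encode as a check the on-call can run. This is a minimal sketch with several assumptions: weekly holdout mAP is recorded in points (0 to 100) and compared against a fixed baseline; "drops by > 1.0 point for 3 consecutive weeks" is read as each of the last three weeks sitting more than 1.0 point below that baseline; and "per-class AP regression" is read as any class falling below its baseline AP. The function name and argument layout are hypothetical, and the KS p-value is taken as an input computed elsewhere (for example with `scipy.stats.ks_2samp`).

```python
def should_retrain(weekly_map, baseline_map, ks_p_value, per_class_ap, baseline_ap,
                   drop_points=1.0, weeks=3, alpha=0.01):
    """Return (fire, reason) for the runbook's two trigger conditions."""
    # Condition 1: holdout mAP more than `drop_points` below baseline
    # in each of the last `weeks` weekly measurements.
    recent = weekly_map[-weeks:]
    sustained_drop = (
        len(recent) == weeks
        and all(baseline_map - m > drop_points for m in recent)
    )
    if sustained_drop:
        return True, "holdout mAP drop"

    # Condition 2: confidence distribution shifted (KS p < alpha) AND
    # at least one class regressed on the holdout (assumed reading).
    regressed = sorted(
        name for name, ap in per_class_ap.items()
        if ap < baseline_ap.get(name, ap)
    )
    if ks_p_value < alpha and regressed:
        return True, "confidence drift + AP regression: " + ", ".join(regressed)

    return False, "no trigger"
```

Returning the reason alongside the decision gives the weekly review something concrete to read when the trigger fires.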

4. Versioned everything

Every deployment traces back to a training run and a dataset version.

Platform does this automatically. The discipline is to use the trace — when accuracy drops, reach for the deployment log first, not the source code.
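
To make "use the trace" concrete, the sketch below shows what such a record might contain and how a first-pass diff between two deployments could read. `DeploymentRecord`, its fields, and `diff_deployments` are hypothetical illustrations, not a Platform API; Platform keeps its own record.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DeploymentRecord:
    deployment_id: str
    run_id: str            # training run that produced the weights
    dataset_version: str   # dataset version the run trained on
    weights_sha256: str    # hash of the deployed checkpoint


def diff_deployments(old, new):
    """Return the traced fields that changed between two deployments."""
    changes = {}
    for f in fields(DeploymentRecord):
        if f.name == "deployment_id":
            continue  # always differs, so it carries no information
        before, after = getattr(old, f.name), getattr(new, f.name)
        if before != after:
            changes[f.name] = (before, after)
    return changes
```

When accuracy drops after a rollout, a diff like this narrows the question to "which of these changes caused it" before anyone opens the source code.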

5. The weekly review

15 minutes, on a recurring calendar invite. Two questions:

  1. Anything red on the dashboards?
  2. Did anything change in the deployment, and if so, why?

Most weeks the answers are "no" and "no" and the meeting takes 4 minutes. The discipline is the calendar, not the meeting.

The shape of a quarter

A typical CV quarter on Platform looks like this:

   Week 1    : Drift alert. Sample, label, retrain. Deploy v_(n+1).
   Week 2-3  : Quiet. Dashboards green.
   Week 4    : Add a new class. Schema migration on dataset, retrain v_(n+2).
   Week 5-7  : Quiet.
   Week 8    : Camera fleet expansion. New region, new dataRegion concerns. Migrate.
   Week 9-11 : Quiet.
   Week 12   : Quarterly holdout refresh + review.

Two retrains, one schema change, one infrastructure change, one review per quarter. That's the steady state of a healthy CV project.

Where projects get stuck

Patterns that mean a project is in trouble:

  • No holdout set. "We can't tell if it's getting better." Build one immediately and revisit the model monitoring guide.
  • No drift dashboard. "Production seems fine?" Seems is doing too much work — see the continual learning glossary entry for the loop you're missing.
  • No naming convention. "Which checkpoint is in prod?" If you can't answer in 30 seconds, the deployment isn't traceable. The Platform quickstart shows the run-naming pattern that scales.
  • Retraining is ad hoc. Every retrain is an event; the runbook prevents it from being a crisis. Fold hyperparameter tuning into the runbook so each iteration is also a controlled experiment.
  • One person knows everything. Bus factor 1. Documentation isn't optional.

Each of these is fixable in a sprint. The project that has all five fixed by month 2 is the one still healthy in month 12. For self-hosted serving along the way, Triton Inference Server is the canonical handoff target.

You've finished the course

You now have:

  • A trained, validated model.
  • An exported, optimized runtime.
  • Real-time tracking and pipelines.
  • A multi-stream architecture.
  • A managed Platform deployment.
  • Monitoring and a retraining runbook.

That's a complete production CV system, end to end. The next time someone asks you to "ship a CV model," you have a checklist for the whole quarter — and the artifacts to keep it healthy.

Try It

Write your project's retraining runbook in the format above. Half a page. Include the trigger, the steps, and the owner. The exercise of writing it is half the value.

Commit
git add -A && git commit -m "docs(academy): completed Build with Ultralytics Platform"
Done When
You've finished the lesson when all of these are true.
  • Your project has a holdout set, a drift dashboard, and a retraining runbook.
  • Each deployment is traceable to a run + dataset version.
  • You have a recurring weekly review on someone's calendar.

What's next

Course complete — take the final exam to earn the Build with Ultralytics Platform certificate.