Practitioner pathway · Intermediate · ~5 hours · 9 lessons · Final exam · Certificate

From best.pt to a real, observable, optimized system

Ultralytics YOLO in Production

Take a trained Ultralytics YOLO model out of the lab and into production — runtimes, optimization, tracking, multi-stream inference, and the observability you need to keep it healthy.

By Ultralytics Academy

What you'll learn
Pick a runtime, optimize the model for it, run multiple streams in parallel, build practical pipelines (counting, heatmaps), and observe the system in production.
  • Pick a deployment target and the matching runtime — ONNX Runtime, TensorRT, OpenVINO, CoreML.

  • Optimize for latency with FP16, INT8, and dynamic batch sizing.

  • Track objects across frames with ByteTrack or BoT-SORT and use IDs in pipelines.

  • Build counting, heatmap, and speed-estimation solutions on top of detection.

  • Run multiple camera streams concurrently without dropping frames.

  • Observe accuracy, latency, and drift in production.
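To make the latency objective concrete, here is a minimal sketch of a rolling latency window that reports nearest-rank percentiles. It uses only the Python standard library. The class name `LatencyWindow`, its methods, and the default window size are hypothetical choices for illustration, not part of the Ultralytics API.

```python
import math
from collections import deque


class LatencyWindow:
    """Keeps the most recent per-frame latencies (hypothetical helper)."""

    def __init__(self, size=500):
        # deque with maxlen silently drops the oldest sample when full
        self.samples = deque(maxlen=size)

    def add(self, ms):
        """Record one latency measurement in milliseconds."""
        self.samples.append(ms)

    def percentile(self, q):
        """Nearest-rank percentile for q in (0, 100]."""
        if not self.samples:
            raise ValueError("no samples recorded yet")
        ordered = sorted(self.samples)
        # Nearest-rank: smallest value with at least q% of samples at or below it
        rank = max(1, math.ceil(q * len(ordered) / 100))
        return ordered[rank - 1]


window = LatencyWindow(size=100)
for ms in range(1, 101):  # synthetic latencies: 1 ms .. 100 ms
    window.add(ms)
p50, p95 = window.percentile(50), window.percentile(95)
```

In a real service you would likely time each inference call with `time.perf_counter()` and feed the elapsed milliseconds into `add`. Tail percentiles such as p95 tend to surface dropped-frame risk earlier than an average does.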

What you'll build
  • An optimized engine (TensorRT or OpenVINO) for your target hardware with verified parity to the .pt model.

  • A counting pipeline with line crossing or zone occupancy on top of tracking.

  • A multi-stream service that processes 4+ camera feeds concurrently.

  • A monitoring dashboard wired to drift / latency / detection volume metrics.
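The counting pipeline above can be sketched without any model at all, assuming a tracker already supplies a stable ID and a centroid for each object in each frame. The names `side_of_line` and `LineCounter` are hypothetical. This sketch treats the counting line as infinite rather than as a bounded segment, and it is not the course's or the library's implementation.

```python
from dataclasses import dataclass, field


def side_of_line(point, a, b):
    """Cross-product sign: negative on one side of line a->b, positive on the other."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


@dataclass
class LineCounter:
    a: tuple  # first point defining the counting line
    b: tuple  # second point defining the counting line
    last_side: dict = field(default_factory=dict)  # track_id -> last nonzero side
    forward: int = 0   # crossings from the negative side to the positive side
    backward: int = 0  # crossings from the positive side to the negative side

    def update(self, track_id, centroid):
        """Feed one (track ID, centroid) observation; count a crossing on a side flip."""
        side = side_of_line(centroid, self.a, self.b)
        prev = self.last_side.get(track_id)
        if side != 0:
            self.last_side[track_id] = side
        if prev is None or side == 0:
            return  # first sighting, or exactly on the line: nothing to count yet
        if prev < 0 < side:
            self.forward += 1
        elif prev > 0 > side:
            self.backward += 1


# Synthetic tracks: ID 1 moves down across y=5, ID 2 moves up across it.
counter = LineCounter(a=(0, 5), b=(10, 5))
frames = [
    [(1, (5, 2)), (2, (3, 8))],
    [(1, (5, 4)), (2, (3, 6))],
    [(1, (5, 7)), (2, (3, 3))],
]
for detections in frames:
    for track_id, centroid in detections:
        counter.update(track_id, centroid)
```

Keying the state on track ID is what prevents double counting: an object that lingers near the line is counted only when its side actually flips. Zone occupancy follows the same pattern with a point-in-polygon test in place of the side check.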

Prerequisites
  • A trained model (or a pretrained Ultralytics YOLO checkpoint to follow along).

  • Comfort with Python, the command line, and basic networking (ports, RTSP).

  • Recommended: complete the "Train your first YOLO model" course first.

Course content

4 modules · 9 lessons