Foundation pathway · Beginner · ~3 hours · 10 lessons · Final exam · Certificate

From the first image to the first model

Computer Vision Foundations

Build the vocabulary, intuition, and first hands-on reflexes for modern computer vision — tasks, datasets, metrics, and what they mean for a real product.

By Ultralytics Academy

[Image: Ultralytics YOLO object detection examples on a city street]

What you'll learn
Frame a computer vision problem, choose the right task and labels, evaluate a model honestly, and know what "good" looks like before you train.
  • Match a product need to the right vision task: classification, detection, segmentation, pose estimation, or oriented bounding boxes (OBB).

  • Decide between bounding boxes and segmentation masks based on shape, cost, and downstream use.

  • Plan datasets and annotation guidelines that match deployment conditions.

  • Read precision, recall, and mean average precision (mAP) alongside concrete examples to find the next improvement.

  • Recognize the difference between training, validation, and test data — and why people get this wrong.
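
As a preview of the metrics item above, here is a minimal sketch of how precision and recall fall out of raw detection counts. The function name and the counts are hypothetical, chosen only for illustration; mAP is not computed here, since it also depends on confidence ranking and IoU thresholds.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts. Returns 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 80 correct detections, 20 false alarms, 40 missed objects.
p, r = precision_recall(tp=80, fp=20, fn=40)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.80 recall=0.67
```

In this made-up case the model is right about most of what it reports but misses a third of the objects, which suggests looking at the missed examples before anything else.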

What you'll build
  • A one-page task spec for a vision project: what the model predicts, what it ignores, what shape the output takes.

  • A dataset plan that lists splits, edge cases, label rules, and review criteria.

  • A small evaluation worksheet that converts metrics into a decision: ship, iterate, or rethink.
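
The evaluation worksheet can be pictured as a small decision rule. The sketch below is an assumption about what such a rule might look like, not the course's own worksheet: the function name, the target values, and the margin are all hypothetical and would be set per product.

```python
def decide(precision: float, recall: float,
           target_precision: float = 0.90,   # hypothetical product target
           target_recall: float = 0.85,      # hypothetical product target
           margin: float = 0.10) -> str:     # hypothetical "close enough" band
    """Map two metrics to one of: 'ship', 'iterate', 'rethink'."""
    # Both targets met: the model is good enough to release.
    if precision >= target_precision and recall >= target_recall:
        return "ship"
    # Within the margin of both targets: more data or tuning may close the gap.
    if (precision >= target_precision - margin
            and recall >= target_recall - margin):
        return "iterate"
    # Far from target: revisit the task framing, labels, or data.
    return "rethink"


print(decide(0.93, 0.88))  # ship
print(decide(0.85, 0.80))  # iterate
print(decide(0.50, 0.40))  # rethink
```

Writing the targets down before training is the point of the exercise: the thresholds come from the product need, not from whatever the first model happens to score.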

Prerequisites
  • Basic Python familiarity is helpful but not required — this course is concept-first.

  • Comfort installing software locally if you want to follow the optional ultralytics install in lesson 10.

Course content

4 modules · 10 lessons