Choose an Ultralytics YOLO Model Size
Nano, small, medium, large, x-large — what each one costs and where each one wins.
Ultralytics YOLO26 ships in five sizes — "n" (nano), "s", "m", "l", "x". All support detect, segment, classify, pose, and OBB tasks, and YOLO26 is the recommended default for new projects (see how it stacks up against YOLO11). The right size depends on your inference latency budget, your hardware, and how hard your task actually is. Pick the smallest that meets your accuracy bar and you'll thank yourself in production.
Pick a YOLO26 size for your project based on inference speed, target hardware, and task difficulty.
- **n** — fastest, smallest; mobile / edge.
- **s** — best general default for laptops and modest GPUs.
- **m** — server-side default when accuracy matters.
- **l** / **x** — when every last mAP point counts and you have the GPU.
The lineup

| Size | Params | mAP@0.5:0.95 (COCO) | Latency on RTX 4090 | Latency on Apple M2 |
|---|---|---|---|---|
| yolo26n | ~2.7M | ~38 | ~1 ms | ~10 ms |
| yolo26s | ~9.5M | ~46 | ~2 ms | ~20 ms |
| yolo26m | ~21M | ~51 | ~4 ms | ~40 ms |
| yolo26l | ~25M | ~53 | ~5 ms | ~55 ms |
| yolo26x | ~57M | ~55 | ~10 ms | ~120 ms |
Numbers are approximate and shift between releases — run `yolo benchmark` on your own hardware. The ratios are the takeaway: each step up is roughly 2× the parameters and FLOPs, modest accuracy gains, and proportionally slower inference. The full performance metrics guide walks through what each column actually measures.
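The step ratios are easy to verify from the table itself. A minimal sketch using the approximate figures above (illustrative, not measurements):

```python
# Approximate figures from the lineup table:
# size -> (params in millions, COCO mAP, RTX 4090 latency in ms).
sizes = {
    "n": (2.7, 38, 1),
    "s": (9.5, 46, 2),
    "m": (21, 51, 4),
    "l": (25, 53, 5),
    "x": (57, 55, 10),
}

order = ["n", "s", "m", "l", "x"]
for prev, cur in zip(order, order[1:]):
    p0, m0, t0 = sizes[prev]
    p1, m1, t1 = sizes[cur]
    print(f"{prev}->{cur}: {p1 / p0:.1f}x params, "
          f"+{m1 - m0} mAP, {t1 / t0:.1f}x latency")
```

Running this shows the pattern the text describes: each step multiplies parameters and latency while the mAP gain shrinks toward the top of the range.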
The cost curve
The accuracy / size curve is strictly diminishing returns. Going from n to s might gain you 8 mAP points; going from l to x might gain you 2. If your problem is well-matched, you usually saturate around m.
```
mAP
 ▲
55 │                          ▲ x
53 │                    ▲ l   ╲ ← diminishing returns
51 │             ▲ m
   │
46 │       ▲ s
   │
38 │  ▲ n
   └────────────────────────────────▶ parameters
```
Pick by deployment, not by leaderboard
The model you pick is the one you can afford to run at your target rate. A 55-mAP model that runs at 50 ms on your CPU won't ship if your target is 30 fps; a 38-mAP model that runs at 5 ms will. Pick from the deployment side first.
| Target | Sensible default |
|---|---|
| Mobile or edge AI / browser via WASM | n |
| Desktop CPU, ~10–15 fps | n or s |
| Single GPU server, 30 fps | s or m |
| Multi-GPU server, batch jobs | m or l |
| Offline / overnight reprocessing | x if accuracy moves the needle |
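Deployment-first selection is just a latency-budget filter: the target frame rate sets a per-frame budget, and you take the largest size that fits. A sketch using the approximate Apple M2 latencies from the lineup table (substitute your own `yolo benchmark` numbers):

```python
# Pick the largest model whose measured latency fits the per-frame budget.
def pick_size(target_fps: float, latency_ms: dict) -> str:
    budget_ms = 1000.0 / target_fps
    fitting = [size for size, ms in latency_ms.items() if ms <= budget_ms]
    # dict preserves insertion order (smallest to largest),
    # so the last fitting entry is the largest that meets the budget
    return fitting[-1] if fitting else None

# Approximate Apple M2 latencies from the table (ms) -- illustrative only.
m2 = {"n": 10, "s": 20, "m": 40, "l": 55, "x": 120}
print(pick_size(30, m2))  # 30 fps -> ~33 ms budget -> "s"
print(pick_size(10, m2))  # 10 fps -> 100 ms budget -> "l"
```

Run it with your measured latencies and your real target fps; the answer is often smaller than the leaderboard would suggest.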
Don't train on x and then count on model pruning or quantization to squeeze it down to n-class latency for deployment. Train on the size you'll deploy: augmentation and learning-rate schedules are tuned per size, and cross-size transfer is rarely worth the complication.
When small models surprise you
For narrow tasks — single class, well-controlled lighting, similar viewpoint — n and s often match or beat m on your specific data. The leaderboard numbers are on COCO, which has 80 diverse classes; your problem is usually much easier.
The right test: train both n and s on your dataset for one round and compare mAP. If n is within a point or two and runs 2× faster, ship n.
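That heuristic is small enough to write down. A sketch, where the 2-point and 2× thresholds are this article's rule of thumb rather than anything canonical:

```python
# Ship the smaller model if it is within `map_tolerance` mAP points
# and at least `speed_ratio` times faster -- the article's rule of thumb.
def ship_smaller(map_small, map_big, fps_small, fps_big,
                 map_tolerance=2.0, speed_ratio=2.0):
    close_enough = (map_big - map_small) <= map_tolerance
    much_faster = fps_small >= speed_ratio * fps_big
    return close_enough and much_faster

# Hypothetical single-class results: n trails s by 1.5 mAP
# but runs 2.3x faster, so the rule says ship n.
print(ship_smaller(map_small=62.0, map_big=63.5, fps_small=95, fps_big=41))
```

Tune the thresholds to your own tolerance for accuracy loss; the point is to make the trade-off an explicit, repeatable decision rather than a gut call.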
Run a quick benchmark
Benchmark mode measures real fps and mAP for each export format and image size combination on your machine — and even tells you how latency vs throughput tuning would help on Intel CPUs:
```shell
yolo benchmark model=yolo26n.pt imgsz=640
yolo benchmark model=yolo26s.pt imgsz=640
yolo benchmark model=yolo26m.pt imgsz=640
```
That tells you actual fps on actual hardware. Don't rely on someone else's numbers.
Run `yolo benchmark` on yolo26n and yolo26s on your hardware. Note the fps difference. Decide which one fits your latency budget.

- You've benchmarked at least two model sizes on your hardware.
- You can name a target inference latency for your project.
- You've picked an initial model size and noted the next size up as a backup if accuracy falls short.
Time to leave pretrained models behind. Next: prepare your own dataset in the format Ultralytics YOLO expects.