Model Integrations

SAHI provides a unified API across many object detection frameworks. Load your model once with AutoDetectionModel.from_pretrained(), then use it with any SAHI function: sliced prediction, batch inference, the CLI, and more.
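For example, the same loaded model can be reused across SAHI entry points. A minimal sketch, assuming get_prediction (plain full-image inference) is available from sahi.predict alongside get_sliced_prediction:

```python
from sahi import AutoDetectionModel
from sahi.predict import get_prediction, get_sliced_prediction

# Load once ...
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo26n.pt",
    confidence_threshold=0.25,
)

# ... then reuse with any SAHI function.
full_result = get_prediction("image.jpg", detection_model)
sliced_result = get_sliced_prediction(
    "image.jpg", detection_model, slice_height=512, slice_width=512
)
```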

Ultralytics (YOLO)

Supports YOLOv8, YOLO11, YOLO26, and all Ultralytics model variants including segmentation and oriented bounding box models.

pip install ultralytics

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo26n.pt",
    confidence_threshold=0.25,
    device="cuda:0",  # or "cpu"
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
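To build intuition for the slicing parameters, here is a small pure-Python sketch (not SAHI's internal implementation) of how a slice size and an overlap ratio translate into window start positions along one axis:

```python
def slice_starts(dim, slice_size, overlap_ratio):
    """Start offsets of slice windows along one axis.

    A sketch of the idea: windows advance by slice_size * (1 - overlap_ratio),
    and a final window is aligned to the image edge so nothing is missed.
    """
    step = int(slice_size * (1 - overlap_ratio))
    starts = []
    pos = 0
    while pos + slice_size < dim:
        starts.append(pos)
        pos += step
    starts.append(max(dim - slice_size, 0))
    return starts

# A 1024x1024 image sliced into 512x512 windows with 20% overlap:
rows = slice_starts(1024, 512, 0.2)
cols = slice_starts(1024, 512, 0.2)
print(rows)                    # -> [0, 409, 512]
print(len(rows) * len(cols))   # -> 9 slices in total
```

Larger overlap ratios produce more (and more redundant) slices; SAHI merges the per-slice detections back into full-image coordinates for you.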

Ultralytics models also support native GPU batch inference for faster processing of multiple slices:

result = get_sliced_prediction(
    "large_image.jpg",
    detection_model,
    slice_height=640,
    slice_width=640,
    batch_size=8,  # process 8 slices at once
)


YOLOE

YOLOE models support both prompt-free and open-vocabulary (text-prompted) detection.

pip install ultralytics

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yoloe",
    model_path="yoloe-v8l-seg.pt",
    confidence_threshold=0.25,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


YOLO-World (Zero-Shot)

Open-vocabulary detection: detect objects from free-text descriptions without retraining.

pip install ultralytics

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolo-world",
    model_path="yolov8s-worldv2.pt",
    confidence_threshold=0.1,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)
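The text vocabulary itself is configured on the underlying Ultralytics model. A hedged sketch, assuming Ultralytics' YOLOWorld.set_classes() and the model= argument (the pre-loaded-model pattern from the "Using a Pre-loaded Model" section) both apply here:

```python
from ultralytics import YOLOWorld
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Define the open vocabulary with free-text class prompts.
yolo_world = YOLOWorld("yolov8s-worldv2.pt")
yolo_world.set_classes(["helmet", "safety vest"])

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolo-world",
    model=yolo_world,  # pass the configured instance instead of a path
    confidence_threshold=0.1,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg", detection_model, slice_height=512, slice_width=512
)
```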

YOLOv5

Classic YOLOv5 models via the yolov5 pip package.

pip install yolov5

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov5",
    model_path="yolov5s.pt",
    confidence_threshold=0.25,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


HuggingFace Transformers

Use any object detection model from the HuggingFace Hub (DETR, Deformable DETR, DETA, etc.).

pip install transformers timm

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="huggingface",
    model_path="facebook/detr-resnet-50",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


RT-DETR

Real-Time Detection Transformer for high-accuracy real-time detection.

pip install transformers timm

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="rtdetr",
    model_path="PekingU/rtdetr_r50vd",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


TorchVision

Use built-in TorchVision detection models (Faster R-CNN, RetinaNet, FCOS, SSD, etc.).

pip install torch torchvision

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="torchvision",
    model_path="fasterrcnn_resnet50_fpn",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


MMDetection

Supports the full MMDetection model zoo (300+ models).

pip install mmdet mmcv mmengine

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="mmdet",
    model_path="path/to/checkpoint.pth",
    config_path="path/to/config.py",
    confidence_threshold=0.25,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


Detectron2

Use Facebook's Detectron2 models for detection and instance segmentation.

pip install "git+https://github.com/facebookresearch/detectron2.git"

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="detectron2",
    model_path="path/to/model_final.pth",
    config_path="path/to/config.yaml",
    confidence_threshold=0.25,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


Roboflow (RF-DETR)

Use Roboflow's RF-DETR models for detection and segmentation.

pip install rfdetr

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

detection_model = AutoDetectionModel.from_pretrained(
    model_type="roboflow",
    model_path="rfdetr-base",
    confidence_threshold=0.3,
    device="cuda:0",
)

result = get_sliced_prediction(
    "image.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
)


Common Parameters

All models accept these parameters in AutoDetectionModel.from_pretrained():

| Parameter | Type | Description |
| --- | --- | --- |
| model_type | str | Framework name (see the sections above) |
| model_path | str | Path to a weights file, or a model name |
| config_path | str | Config file path (MMDetection, Detectron2 only) |
| confidence_threshold | float | Minimum score to keep a detection (default: 0.25) |
| device | str | "cpu", "cuda:0", "mps", etc. |
| category_mapping | dict | Map category IDs to names, e.g. {0: "car", 1: "person"} |
| category_remapping | dict | Remap category names after inference |
| image_size | int | Override the model's input resolution |
| load_at_init | bool | Load weights immediately (default: True) |
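Putting several of the optional parameters together (values are illustrative; the assumption here is that load_at_init=False defers weight loading until load_model() is called explicitly):

```python
from sahi import AutoDetectionModel

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolo26n.pt",
    confidence_threshold=0.25,
    device="cpu",
    category_mapping={0: "car", 1: "person"},  # human-readable labels
    image_size=640,        # resize inputs to 640 before inference
    load_at_init=False,    # construct now, load weights later
)
detection_model.load_model()  # explicit load when load_at_init=False
```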

Using a Pre-loaded Model

If you already have a model instance, pass it directly instead of a path:

from sahi import AutoDetectionModel
from ultralytics import YOLO

yolo_model = YOLO("yolo26n.pt")
# ... customize the model ...

detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model=yolo_model,
    confidence_threshold=0.25,
    device="cuda:0",
)

Next Steps