Postprocessing Backends¶
SAHI's postprocessing (NMS, NMM) can run on three interchangeable backends. The right backend depends on your hardware and installed packages.
Backend overview¶
| Backend | Best for | Extra dependency |
|---|---|---|
| numpy | CPU-only environments, small/medium prediction counts | None (always available) |
| numba | CPU with large prediction counts; ~1 s JIT warmup on first call, then fast | pip install numba |
| torchvision | CUDA GPU available; fastest for large batches | pip install torch torchvision + CUDA |
Auto-detection (default)¶
By default SAHI automatically picks the best available backend at runtime:
- torchvision — if torchvision is installed and a CUDA GPU is present.
- numba — if the numba package is installed.
- numpy — always available as the final fallback.
from sahi.postprocess.backends import get_postprocess_backend
# Check which backend was resolved (triggers auto-detection)
print(get_postprocess_backend()) # "auto" until first postprocessing call
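The priority order above can be sketched as a small pure-Python function (an illustration of the selection rule only, not SAHI's actual implementation):

```python
def pick_backend(has_torchvision: bool, has_cuda: bool, has_numba: bool) -> str:
    """Illustrative sketch of the auto-detection priority (not SAHI's real code)."""
    if has_torchvision and has_cuda:
        return "torchvision"  # GPU-accelerated NMS
    if has_numba:
        return "numba"  # JIT-compiled loops
    return "numpy"  # always-available fallback


print(pick_backend(has_torchvision=True, has_cuda=True, has_numba=True))    # torchvision
print(pick_backend(has_torchvision=True, has_cuda=False, has_numba=True))   # numba
print(pick_backend(has_torchvision=False, has_cuda=False, has_numba=False)) # numpy
```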
Forcing a specific backend¶
Use set_postprocess_backend before running inference to pin a backend:
from sahi.postprocess.backends import set_postprocess_backend
# Force pure-numpy (no extra deps, works everywhere)
set_postprocess_backend("numpy")
# Force numba JIT (install with: pip install numba)
set_postprocess_backend("numba")
# Force torchvision GPU (install with: pip install torch torchvision)
set_postprocess_backend("torchvision")
# Restore auto-detection
set_postprocess_backend("auto")
This call affects all subsequent NMS/NMM operations in the current process,
including those triggered internally by get_sliced_prediction.
Example: pinning the backend for a full inference run¶
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction
from sahi.postprocess.backends import set_postprocess_backend
# Use GPU-accelerated postprocessing when running on a CUDA machine
set_postprocess_backend("torchvision")
detection_model = AutoDetectionModel.from_pretrained(
model_type="ultralytics",
model_path="yolo11n.pt",
confidence_threshold=0.25,
device="cuda:0",
)
result = get_sliced_prediction(
"image.jpg",
detection_model,
slice_height=512,
slice_width=512,
overlap_height_ratio=0.2,
overlap_width_ratio=0.2,
)
Using postprocessing functions directly¶
All three backends share the same array convention: an (N, 6) numpy array with
columns [x1, y1, x2, y2, score, category_id].
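To make the convention concrete, here is how the overlap metric behind these functions can be computed for two rows of such an array (a plain-numpy illustration, independent of SAHI's internal backends):

```python
import numpy as np

def box_iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU between two [x1, y1, x2, y2, score, category_id] rows."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

a = np.array([100, 100, 200, 200, 0.95, 0], dtype=float)
b = np.array([105, 105, 205, 205, 0.80, 0], dtype=float)
print(round(box_iou(a, b), 3))  # 0.822 — well above a 0.5 threshold
```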
NMS (suppression)¶
import numpy as np
from sahi.postprocess.combine import nms, batched_nms
predictions = np.array([
[100, 100, 200, 200, 0.95, 0],
[105, 105, 205, 205, 0.80, 0],
[300, 300, 400, 400, 0.90, 1],
])
# Global NMS — all categories compete together
keep = nms(predictions, match_metric="IOU", match_threshold=0.5)
print(predictions[keep])
# Per-category NMS — class 0 and class 1 are treated independently
keep = batched_nms(predictions, match_metric="IOU", match_threshold=0.5)
print(predictions[keep])
NMM (merging)¶
Instead of discarding overlapping boxes, NMM merges them:
from sahi.postprocess.combine import greedy_nmm, nmm, batched_greedy_nmm
# Greedy NMM: each kept box merges only its direct neighbours (fast, tight boxes)
keep_to_merge = greedy_nmm(predictions, match_metric="IOU", match_threshold=0.5)
# {kept_index: [merged_index, ...], ...}
# Full NMM: transitive merging (A merges B, B merges C → A gets all three)
keep_to_merge = nmm(predictions, match_metric="IOU", match_threshold=0.5)
# Per-category greedy NMM
keep_to_merge = batched_greedy_nmm(predictions, match_threshold=0.5)
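The returned dict only tells you which indices belong together; combining the rows is up to the caller (SAHI's postprocess classes below do this for you, handling boxes, masks, and scores). A minimal merge rule, using the coordinate union and the maximum score as one plausible choice, might look like:

```python
import numpy as np

def apply_merge(preds: np.ndarray, keep_to_merge: dict[int, list[int]]) -> np.ndarray:
    """Collapse each kept index and its merged partners into one row."""
    rows = []
    for kept, merged in keep_to_merge.items():
        group = preds[[kept, *merged]]
        rows.append([
            group[:, 0].min(), group[:, 1].min(),  # union: smallest top-left
            group[:, 2].max(), group[:, 3].max(),  # union: largest bottom-right
            group[:, 4].max(),                     # keep the best score
            preds[kept, 5],                        # category of the kept box
        ])
    return np.array(rows)

preds = np.array([
    [100, 100, 200, 200, 0.95, 0],
    [105, 105, 205, 205, 0.80, 0],
    [300, 300, 400, 400, 0.90, 1],
])
merged = apply_merge(preds, {0: [1], 2: []})
print(merged[0])  # box 0 grows to [100, 100, 205, 205] with score 0.95
```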
IoS metric¶
Both NMS and NMM support match_metric="IOS" (Intersection over Smaller area),
which is useful when one box is much smaller than another:
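For example, when a small box sits entirely inside a large one, IoU can stay below a typical threshold while IoS is exactly 1.0 (plain-numpy illustration of the two metrics):

```python
import numpy as np

def overlap(a, b, metric="IOU"):
    """Intersection over union, or over the smaller box's area (IoS)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    if metric == "IOS":
        return inter / min(area_a, area_b)
    return inter / (area_a + area_b - inter)

big = np.array([0, 0, 100, 100], dtype=float)
small = np.array([10, 10, 40, 40], dtype=float)  # fully inside `big`
print(overlap(big, small, "IOU"))  # 0.09: below a 0.5 threshold, so nothing is suppressed
print(overlap(big, small, "IOS"))  # 1.0: IoS catches the nested box
```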
Postprocess classes¶
High-level classes integrate with SAHI's ObjectPrediction lists and are used
by get_sliced_prediction via the postprocess_type argument:
from sahi.postprocess.combine import NMSPostprocess, NMMPostprocess, GreedyNMMPostprocess
# NMS — keep the best box, discard the rest
postprocessor = NMSPostprocess(
match_threshold=0.5,
match_metric="IOU",
class_agnostic=True, # False → per-category
)
filtered = postprocessor(object_prediction_list)
# Greedy NMM — merge overlapping boxes (fast)
postprocessor = GreedyNMMPostprocess(match_threshold=0.5)
merged = postprocessor(object_prediction_list)
# Full NMM — transitive merging
postprocessor = NMMPostprocess(match_threshold=0.5)
merged = postprocessor(object_prediction_list)
Passing class_agnostic=False makes each postprocessor run independently per
category, so a "car" prediction will never suppress a "person" prediction.
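The per-category behaviour can be sketched as grouping indices by category_id before applying the suppression rule within each group. This toy version keeps only the best box per category, which is deliberately simpler than real NMS, just to show the grouping (SAHI's batched_* functions do the real per-category dispatch internally):

```python
import numpy as np

def per_category_keep(preds: np.ndarray) -> list[int]:
    """Run a trivial 'keep the best box' rule independently per category."""
    keep = []
    for cat in np.unique(preds[:, 5]):
        idx = np.flatnonzero(preds[:, 5] == cat)
        best = idx[np.argmax(preds[idx, 4])]  # within-category winner
        keep.append(int(best))
    return sorted(keep)

# Two heavily overlapping boxes with different categories: both survive,
# because with class_agnostic=False boxes only compete within their own class.
preds = np.array([
    [100, 100, 200, 200, 0.95, 0],  # category 0
    [102, 102, 202, 202, 0.90, 1],  # category 1, almost the same box
])
print(per_category_keep(preds))  # [0, 1]
```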
API reference¶
sahi.postprocess.backends¶
Postprocessing backend selection and auto-detection.
Usage
from sahi.postprocess.backends import set_postprocess_backend, get_postprocess_backend
set_postprocess_backend("numba")  # force numba
set_postprocess_backend("auto")   # auto-detect best available
Functions¶
get_postprocess_backend()¶
Return the currently configured backend name; this may still be "auto" until the first postprocessing call triggers resolution.
resolve_backend()¶
Resolve "auto" to a concrete backend, caching the result.
When the backend is set to "auto", detection follows this priority:
- torchvision -- selected if both torchvision and a CUDA GPU are available (GPU-accelerated NMS).
- numba -- selected if the numba package is installed (JIT-compiled loops, faster than pure numpy for large prediction counts).
- numpy -- always available as the fallback (pure numpy, no extra dependencies).
If the backend was explicitly set via set_postprocess_backend, that
value is returned directly without auto-detection.
Returns:
| Type | Description |
|---|---|
| str | One of "numpy", "numba", or "torchvision". |
Source code in sahi/postprocess/backends.py
set_postprocess_backend(name)¶
Set the postprocessing backend.
Call once at startup before running any inference. This function is not thread-safe.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | One of "auto", "numpy", "numba", "torchvision". | required |
Source code in sahi/postprocess/backends.py
sahi.postprocess.combine¶
Postprocessing strategies for combining predictions from sliced inference.
Classes¶
GreedyNMMPostprocess¶
Bases: NMMPostprocess
Postprocessor using Greedy Non-Maximum Merging (NMM).
Similar to NMM but uses a greedy strategy: each kept prediction only merges boxes that directly overlap with it (no transitive merging). This is faster than full NMM and produces tighter merged boxes.
Source code in sahi/postprocess/combine.py
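The difference from transitive merging is easiest to see with a chain of boxes where A overlaps B and B overlaps C, but A and C are disjoint. The following is a simplified sketch with a hypothetical low threshold, not SAHI's internal code:

```python
import numpy as np

def iou(a, b):
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_groups(preds, t):
    """Greedy NMM: each kept box absorbs only its direct neighbours."""
    order = list(np.argsort(-preds[:, 4]))
    groups = {}
    while order:
        kept = int(order.pop(0))
        merged = [int(i) for i in order if iou(preds[kept], preds[i]) > t]
        groups[kept] = merged
        order = [i for i in order if int(i) not in merged]
    return groups

def transitive_groups(preds, t):
    """Full NMM: connected components of the overlap graph, so A-B and B-C pull in C."""
    n, seen, groups = len(preds), set(), {}
    for k in np.argsort(-preds[:, 4]):
        if int(k) in seen:
            continue
        stack, component = [int(k)], []
        seen.add(int(k))
        while stack:
            cur = stack.pop()
            component.append(cur)
            for j in range(n):
                if j not in seen and iou(preds[cur], preds[j]) > t:
                    seen.add(j)
                    stack.append(j)
        groups[int(k)] = [c for c in component if c != int(k)]
    return groups

# A (score 0.9) overlaps B, B overlaps C, but A and C never touch
chain = np.array([
    [0,   0, 100, 100, 0.9, 0],  # A
    [80,  0, 180, 100, 0.8, 0],  # B
    [160, 0, 260, 100, 0.7, 0],  # C
], dtype=float)
print(greedy_groups(chain, t=0.1))      # {0: [1], 2: []}
print(transitive_groups(chain, t=0.1))  # {0: [1, 2]}
```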
LSNMSPostprocess¶
Bases: PostprocessPredictions
Postprocessor using Locality-Sensitive NMS from the lsnms package.
Uses a spatial index for fast neighbor lookup, making it efficient for
large numbers of predictions. Only supports IoU metric (not IoS).
Requires the lsnms package (pip install "lsnms>0.3.1").
Note
This postprocessor is experimental and not recommended for production use.
Source code in sahi/postprocess/combine.py
Functions¶
__call__(object_predictions)¶
Apply Locality-Sensitive NMS to suppress overlapping predictions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_predictions | list[ObjectPrediction] | List of ObjectPrediction instances to suppress. | required |
Returns:
| Type | Description |
|---|---|
| list[ObjectPrediction] | List of suppressed ObjectPrediction instances. |
Raises:
| Type | Description |
|---|---|
| ModuleNotFoundError | If the lsnms package is not installed. |
| NotImplementedError | If match_metric is not "IOU". |
Source code in sahi/postprocess/combine.py
NMMPostprocess¶
Bases: PostprocessPredictions
Postprocessor using Non-Maximum Merging (NMM) with transitive merging.
Instead of discarding overlapping detections, merges their bounding boxes, masks, and scores. Uses non-greedy transitive merging: if A overlaps B and B overlaps C, all three are merged even if A does not directly overlap C.
Source code in sahi/postprocess/combine.py
Functions¶
__call__(object_predictions)¶
Apply Non-Maximum Merging to merge overlapping predictions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_predictions | list[ObjectPrediction] | List of ObjectPrediction instances to merge. | required |
Returns:
| Type | Description |
|---|---|
| list[ObjectPrediction] | List of merged ObjectPrediction instances. |
Source code in sahi/postprocess/combine.py
NMSPostprocess¶
Bases: PostprocessPredictions
Postprocessor using Non-Maximum Suppression (NMS).
Keeps the highest-scored prediction among overlapping boxes and discards the rest. Does not merge bounding boxes or masks.
Source code in sahi/postprocess/combine.py
Functions¶
__call__(object_predictions)¶
Apply Non-Maximum Suppression to suppress overlapping predictions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_predictions | list[ObjectPrediction] | List of ObjectPrediction instances to suppress. | required |
Returns:
| Type | Description |
|---|---|
| list[ObjectPrediction] | List of suppressed ObjectPrediction instances. |
Source code in sahi/postprocess/combine.py
PostprocessPredictions¶
Bases: ABC
Abstract base class for postprocessing object prediction lists.
Subclasses implement a specific strategy (NMS, NMM, greedy NMM, etc.) to reduce overlapping detections produced by sliced inference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| match_threshold | float | Minimum overlap value (IoU or IoS) to consider two predictions as matching. | 0.5 |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| class_agnostic | bool | If True, apply postprocessing across all categories. If False, apply per category independently. | True |
Source code in sahi/postprocess/combine.py
Functions¶
__call__(predictions) abstractmethod¶
Apply postprocessing to the list of predictions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | list[ObjectPrediction] | List of ObjectPrediction instances to postprocess. | required |
Returns:
| Type | Description |
|---|---|
| list[ObjectPrediction] | List of postprocessed ObjectPrediction instances. |
Source code in sahi/postprocess/combine.py
__init__(match_threshold=0.5, match_metric='IOU', class_agnostic=True)¶
Initialize the postprocessor with configuration parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| match_threshold | float | Minimum overlap value (IoU or IoS) to consider two predictions as matching. | 0.5 |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| class_agnostic | bool | If True, apply postprocessing across all categories. If False, apply per category independently. | True |
Source code in sahi/postprocess/combine.py
Functions¶
batched_greedy_nmm(predictions, match_metric='IOU', match_threshold=0.5)¶
Apply greedy non-maximum merging independently per category.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to merge a lower-scored box. | 0.5 |
Returns:
| Type | Description |
|---|---|
| dict[int, list[int]] | Dict mapping each kept index to a list of indices merged into it. |
Source code in sahi/postprocess/combine.py
batched_nmm(predictions, match_metric='IOU', match_threshold=0.5)¶
Apply non-maximum merging (non-greedy, transitive) independently per category.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to merge a lower-scored box. | 0.5 |
|
Returns:
| Type | Description |
|---|---|
| dict[int, list[int]] | Dict mapping each kept index to a list of indices merged into it. |
Source code in sahi/postprocess/combine.py
batched_nms(predictions, match_metric='IOU', match_threshold=0.5)¶
Apply non-maximum suppression independently per category.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to suppress a lower-scored box. | 0.5 |
|
Returns:
| Type | Description |
|---|---|
| list[int] | List of indices of the kept predictions, sorted by score descending. |
Source code in sahi/postprocess/combine.py
greedy_nmm(predictions, match_metric='IOU', match_threshold=0.5)¶
Greedy non-maximum merging for axis-aligned bounding boxes.
Instead of discarding overlapping boxes, merges them into the highest-scored box. Dispatches to the resolved backend.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to merge a lower-scored box. | 0.5 |
|
Returns:
| Type | Description |
|---|---|
| dict[int, list[int]] | Dict mapping each kept index to a list of indices merged into it. |
Source code in sahi/postprocess/combine.py
nmm(predictions, match_metric='IOU', match_threshold=0.5)¶
Non-maximum merging (non-greedy, transitive) for axis-aligned bounding boxes.
Unlike greedy NMM, this variant allows transitive merging: if box A merges with B and B merges with C, all three are merged together. Dispatches to the resolved backend.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to merge a lower-scored box. | 0.5 |
|
Returns:
| Type | Description |
|---|---|
| dict[int, list[int]] | Dict mapping each kept index to a list of indices merged into it. |
Source code in sahi/postprocess/combine.py
nms(predictions, match_metric='IOU', match_threshold=0.5)¶
Non-maximum suppression for axis-aligned bounding boxes.
Dispatches to the resolved backend (numpy, numba, or torchvision).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| predictions | ndarray | Array of shape (N, 6) with columns [x1, y1, x2, y2, score, category_id]. | required |
| match_metric | str | Overlap metric, "IOU" or "IOS". | 'IOU' |
| match_threshold | float | Minimum overlap to suppress a lower-scored box. | 0.5 |
|
Returns:
| Type | Description |
|---|---|
| list[int] | List of indices of the kept predictions, sorted by score descending. |