detection_dataset
orchard.data_handler.detection_dataset
PyTorch Dataset for Object Detection.
Wraps image arrays and bounding-box annotations into the format expected
by torchvision detection models: (image_tensor, target_dict) where
target_dict contains boxes and labels tensors.
DetectionDataset(images, annotations, *, transform=None)
Bases: Dataset[tuple[Tensor, dict[str, Tensor]]]
PyTorch Dataset for detection tasks with bounding-box annotations.
Each sample returns an (image, target) pair where target is a
dict with boxes (N, 4) in [x1, y1, x2, y2] format and
labels (N,) as int64 class indices.
Initialize from pre-loaded arrays and annotation list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `images` | `NDArray[Any]` | Image array | required |
| `annotations` | `list[dict[str, NDArray[Any]]]` | Per-image annotation dicts with … | required |
| `transform` | `Compose \| None` | Torchvision transform pipeline for images. | `None` |
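As a rough illustration of the constructor inputs, the sketch below builds a small image array and a matching annotation list with NumPy. The key names `boxes` and `labels` mirror the keys of the returned target dict and are an assumption here, as is the (N, H, W, C) image layout; `check_annotation` is a hypothetical helper, not part of `orchard`.

```python
import numpy as np


def check_annotation(ann):
    """Hypothetical sanity check for one annotation dict (not part of orchard)."""
    boxes, labels = ann["boxes"], ann["labels"]
    # Boxes are (N, 4) in [x1, y1, x2, y2]; labels are (N,) class indices.
    assert boxes.ndim == 2 and boxes.shape[1] == 4
    assert labels.shape == (boxes.shape[0],)
    # Every box needs x2 >= x1 and y2 >= y1.
    assert np.all(boxes[:, 2] >= boxes[:, 0])
    assert np.all(boxes[:, 3] >= boxes[:, 1])
    return True


# Two 64x64 RGB images; the (N, H, W, C) layout is an assumption.
images = np.zeros((2, 64, 64, 3), dtype=np.uint8)

# One dict per image; the number of boxes may differ between images.
annotations = [
    {
        "boxes": np.array([[4, 8, 20, 30], [10, 10, 50, 60]], dtype=np.float32),
        "labels": np.array([1, 3], dtype=np.int64),
    },
    {
        "boxes": np.array([[0, 0, 16, 16]], dtype=np.float32),
        "labels": np.array([2], dtype=np.int64),
    },
]

all_valid = all(check_annotation(ann) for ann in annotations)

# With orchard installed, these arrays would be passed as:
# dataset = DetectionDataset(images, annotations)
```

Running a check like this before constructing the dataset can catch shape or dtype mismatches early, before they surface inside a training loop.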
Source code in orchard/data_handler/detection_dataset.py
from_arrays(images, annotations, *, transform=None, max_samples=None, seed=DEFAULT_SEED)
classmethod
Build a DetectionDataset with optional subsampling.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `images` | `NDArray[Any]` | Image array | required |
| `annotations` | `list[dict[str, NDArray[Any]]]` | Per-image annotation dicts. | required |
| `transform` | `Compose \| None` | Transform pipeline. | `None` |
| `max_samples` | `int \| None` | Limit number of samples. | `None` |
| `seed` | `int` | Random seed for deterministic subsampling. | `DEFAULT_SEED` |
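How `max_samples` and `seed` might interact can be sketched with NumPy. The sampling strategy actually used by `from_arrays` is not documented on this page, so the version below is only one plausible approach (seeded sampling without replacement). `subsample` is a hypothetical helper, and the default seed of 42 is a placeholder, since the value of `DEFAULT_SEED` is not given here.

```python
import numpy as np


def subsample(images, annotations, max_samples=None, seed=42):
    """Hypothetical seeded subsampling; not orchard's actual implementation."""
    n = len(images)
    if max_samples is None or max_samples >= n:
        return images, annotations
    rng = np.random.default_rng(seed)
    # Draw distinct indices, then sort them to keep the original sample order.
    idx = np.sort(rng.choice(n, size=max_samples, replace=False))
    # Index images and annotations with the same indices so they stay aligned.
    return images[idx], [annotations[i] for i in idx]


# Ten dummy "images" whose single pixel value equals their index.
images = np.arange(10).reshape(10, 1, 1, 1)
annotations = [{"labels": np.array([i], dtype=np.int64)} for i in range(10)]

first_imgs, first_anns = subsample(images, annotations, max_samples=4, seed=0)
second_imgs, second_anns = subsample(images, annotations, max_samples=4, seed=0)
```

The same seed yields the same subset on every call, which is the property the `seed` parameter is described as providing.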
from_npz(image_path, annotation_path, split='train', *, transform=None, max_samples=None, seed=DEFAULT_SEED)
classmethod
Load a detection dataset from NPZ (images) and NPZ (annotations).
The image NPZ has key {split}_images. The annotation NPZ has
keys {split}_boxes (list of (N_i, 4) arrays) and
{split}_labels (list of (N_i,) arrays), stored as object arrays.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `image_path` | `Path` | Path to images NPZ. | required |
| `annotation_path` | `Path` | Path to annotations NPZ. | required |
| `split` | `str` | Dataset split (…) | `'train'` |
| `transform` | `Compose \| None` | Transform pipeline. | `None` |
| `max_samples` | `int \| None` | Limit number of samples. | `None` |
| `seed` | `int` | Random seed. | `DEFAULT_SEED` |
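The key layout described for the two NPZ files can be reproduced with plain NumPy. The sketch below writes a tiny `train` split to a temporary directory and reads the annotations back. The file names and the `to_object_array` helper are hypothetical; only the `{split}_images`, `{split}_boxes` and `{split}_labels` key pattern comes from the description above.

```python
import tempfile
from pathlib import Path

import numpy as np


def to_object_array(arrays):
    """Pack variable-length arrays into a 1-D object array (hypothetical helper)."""
    out = np.empty(len(arrays), dtype=object)
    for i, arr in enumerate(arrays):
        out[i] = arr
    return out


split = "train"
images = np.zeros((2, 32, 32, 3), dtype=np.uint8)
boxes = [
    np.array([[1, 2, 10, 12]], dtype=np.float32),
    np.array([[0, 0, 5, 5], [3, 4, 20, 25]], dtype=np.float32),
]
labels = [np.array([0], dtype=np.int64), np.array([2, 1], dtype=np.int64)]

tmp_dir = Path(tempfile.mkdtemp())
image_path = tmp_dir / "images.npz"            # hypothetical file name
annotation_path = tmp_dir / "annotations.npz"  # hypothetical file name

np.savez(image_path, **{f"{split}_images": images})
np.savez(
    annotation_path,
    **{
        f"{split}_boxes": to_object_array(boxes),
        f"{split}_labels": to_object_array(labels),
    },
)

# Object arrays are pickled inside the NPZ, so reading them needs allow_pickle=True.
with np.load(annotation_path, allow_pickle=True) as data:
    loaded_boxes = list(data[f"{split}_boxes"])
    loaded_labels = list(data[f"{split}_labels"])

# With orchard installed, the files would then be loaded as:
# dataset = DetectionDataset.from_npz(image_path, annotation_path, split="train")
```

Filling a pre-allocated object array element by element avoids NumPy stacking equally shaped per-image arrays into a single multi-dimensional array.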
__len__()
__getitem__(idx)
Retrieve an image and its bounding-box annotations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `idx` | `int` | Sample index. | required |
Returns:
| Type | Description |
|---|---|
| `tuple[Tensor, dict[str, Tensor]]` | Tuple of (image_tensor, target_dict) where target_dict has boxes and labels tensors. |
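Because each sample carries a different number of boxes, PyTorch's default batch collation cannot stack the targets into a single tensor. A common pattern with torchvision detection models is a collate function that keeps images and targets as tuples. The sketch below is a generic version of that pattern, not something taken from `orchard`; `detection_collate` is a hypothetical name, and the samples are plain-Python stand-ins so the snippet runs without PyTorch.

```python
def detection_collate(batch):
    """Turn [(image, target), ...] into ((image, ...), (target, ...))."""
    return tuple(zip(*batch))


# Stand-in samples; real ones would be (Tensor, dict[str, Tensor]) pairs.
batch = [
    ("img0", {"boxes": [[0, 0, 5, 5]], "labels": [1]}),
    ("img1", {"boxes": [[2, 2, 8, 9], [1, 1, 4, 4]], "labels": [3, 2]}),
]
images, targets = detection_collate(batch)

# With PyTorch installed, the function would be passed to the loader:
# loader = DataLoader(dataset, batch_size=2, collate_fn=detection_collate)
```

Each batch then arrives as a tuple of images and a tuple of target dicts, which can be converted to lists before being passed to a detection model.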