base

orchard.core.metadata.base

Dataset Metadata Base Definitions.

Defines the dataset metadata schema as a Pydantic model, which provides immutability, type safety, and integration with the global configuration engine.

DatasetMetadata

Bases: BaseModel

Immutable metadata container for a dataset entry.

Holds identity, source, image properties, and normalization constants for both classification and detection datasets. Detection datasets additionally specify an annotation_path for bounding-box labels.
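As a rough illustration of the container described above, the following sketch declares a frozen Pydantic model with the documented field names. It assumes Pydantic v2 (`ConfigDict(frozen=True)`); the actual configuration, defaults, and validators used by `orchard.core.metadata.base` are not shown on this page, so the `None` default for `annotation_path` and all sample values are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class DatasetMetadata(BaseModel):
    """Illustrative stand-in that mirrors the documented attribute names."""

    # Assumption: immutability comes from Pydantic v2's frozen model config.
    model_config = ConfigDict(frozen=True)

    name: str
    display_name: str
    md5_checksum: str
    url: str
    path: Path
    classes: list[str]
    in_channels: int
    native_resolution: Optional[int]  # equivalent to `int | None`
    mean: tuple[float, ...]
    std: tuple[float, ...]
    # Assumption: defaults to None so classification entries can omit it.
    annotation_path: Optional[Path] = None


# Hypothetical classification entry; every value here is made up.
meta = DatasetMetadata(
    name="toyset",
    display_name="Toy Set",
    md5_checksum="0" * 32,
    url="https://example.com/toyset.npz",
    path=Path("data/toyset.npz"),
    classes=["cat", "dog"],
    in_channels=3,
    native_resolution=28,
    mean=(0.5, 0.5, 0.5),
    std=(0.25, 0.25, 0.25),
)

# Assigning to a field of a frozen model is rejected by Pydantic.
try:
    meta.name = "renamed"
    mutated = True
except ValidationError:
    mutated = False
```

A detection entry would pass `annotation_path` in the same way, pointing at the annotation `.npz` file.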

Attributes:

name (str): Short identifier (e.g., 'pathmnist', 'galaxy10').

display_name (str): Human-readable name for reporting.

md5_checksum (str): MD5 hash for download integrity verification.

url (str): Source URL for dataset download.

path (Path): Local path to the .npz archive.

classes (list[str]): Class labels in index order.

in_channels (int): Number of image channels (1=grayscale, 3=RGB).

native_resolution (int | None): Native pixel resolution (e.g., 28, 224).

mean (tuple[float, ...]): Channel-wise normalization mean.

std (tuple[float, ...]): Channel-wise normalization standard deviation.

annotation_path (Path | None): Local path to annotation .npz (detection only).

normalization_info (property): Formatted mean/std for reporting.

resolution_str (property): Formatted resolution string (e.g., '28x28', '224x224').

num_classes (property): Total number of target classes.
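The three derived properties could be computed along the following lines. This is a minimal sketch, not the library's implementation: it uses a frozen standard-library dataclass named `MetadataSketch` (a hypothetical stand-in for the Pydantic model, chosen only to keep the example dependency-free), and both the exact `normalization_info` format and the `"N/A"` fallback for a missing resolution are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataSketch:
    """Hypothetical stand-in holding only the fields the properties need."""

    classes: list[str]
    native_resolution: int | None
    mean: tuple[float, ...]
    std: tuple[float, ...]

    @property
    def num_classes(self) -> int:
        # One target class per label in `classes`.
        return len(self.classes)

    @property
    def resolution_str(self) -> str:
        # Assumption: images are square, so the side length is repeated.
        if self.native_resolution is None:
            return "N/A"  # assumed fallback, not documented in the source
        return f"{self.native_resolution}x{self.native_resolution}"

    @property
    def normalization_info(self) -> str:
        # Assumed format; the real reporting string may differ.
        return f"mean={self.mean}, std={self.std}"


# Hypothetical grayscale dataset with two classes.
sketch = MetadataSketch(
    classes=["benign", "malignant"],
    native_resolution=28,
    mean=(0.5,),
    std=(0.25,),
)
summary = (sketch.num_classes, sketch.resolution_str, sketch.normalization_info)
```

The same `@property` definitions would carry over unchanged to a Pydantic `BaseModel`, since properties are plain Python descriptors and are not treated as model fields.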