base

orchard.core.metadata.base

Dataset Metadata Base Definitions.

Defines the dataset metadata schema using Pydantic, providing immutability, type safety, and integration with the global configuration engine.

DatasetMetadata

Bases: BaseModel

Immutable metadata container for a dataset entry.

Groups dataset-specific constants and keeps them frozen throughout pipeline execution. Serves as the static definition that feeds into the dynamic DatasetConfig.

Attributes:

Name               Type                Description
name               str                 Short identifier (e.g., 'pathmnist', 'galaxy10').
display_name       str                 Human-readable name for reporting.
md5_checksum       str                 MD5 hash for download integrity verification.
url                str                 Source URL for dataset download.
path               Path                Local path to the .npz archive.
classes            list[str]           Class labels in index order.
in_channels        int                 Number of image channels (1=grayscale, 3=RGB).
native_resolution  int | None          Native pixel resolution (e.g., 28, 224).
mean               tuple[float, ...]   Channel-wise normalization mean.
std                tuple[float, ...]   Channel-wise normalization standard deviation.
is_anatomical      bool                Whether images have fixed anatomical orientation.
is_texture_based   bool                Whether classification relies on texture patterns.
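The attribute list above can be illustrated with a small stand-in. This is a minimal sketch, not the library's code: it uses a frozen stdlib dataclass instead of the Pydantic BaseModel so that it runs without third-party packages, and the class name DatasetMetadataSketch, the helper try_rename, and every field value are hypothetical placeholders.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class DatasetMetadataSketch:
    """Stdlib stand-in mirroring the documented fields (not the real Pydantic model)."""

    name: str
    display_name: str
    md5_checksum: str
    url: str
    path: Path
    classes: list[str]
    in_channels: int
    native_resolution: Optional[int]  # documented as int | None
    mean: tuple[float, ...]
    std: tuple[float, ...]
    is_anatomical: bool
    is_texture_based: bool


# Every value below is a placeholder for illustration, not a real registry entry.
example = DatasetMetadataSketch(
    name="toyset",
    display_name="Toy Dataset",
    md5_checksum="0" * 32,
    url="https://example.com/toyset.npz",
    path=Path("data/toyset.npz"),
    classes=["class_a", "class_b", "class_c"],
    in_channels=3,
    native_resolution=28,
    mean=(0.5, 0.5, 0.5),
    std=(0.25, 0.25, 0.25),
    is_anatomical=False,
    is_texture_based=True,
)


def try_rename(meta: DatasetMetadataSketch, new_name: str) -> bool:
    """Return True if the assignment succeeded, False if the instance is frozen."""
    try:
        meta.name = new_name  # frozen dataclasses reject attribute assignment
    except FrozenInstanceError:
        return False
    return True
```

In this sketch, try_rename(example, "other") returns False and the instance is left unchanged, which mirrors the "frozen throughout pipeline execution" guarantee described above. The real model may also validate field types and values on construction, which a plain dataclass does not.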

normalization_info property

Formatted mean/std for reporting.

resolution_str property

Formatted resolution string (e.g., '28x28', '224x224').

num_classes property

Total number of target classes.
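One way the three properties could be derived from the stored fields is sketched below. The source documents only what each property returns, so several details here are assumptions: the class name MetadataPropertiesSketch, the square "NxN" formatting, the "unknown" fallback when native_resolution is None, and the exact layout of the normalization string. A frozen stdlib dataclass again stands in for the Pydantic model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataPropertiesSketch:
    """Hypothetical stand-in carrying only the fields the properties need."""

    classes: list[str]
    native_resolution: Optional[int]
    mean: tuple[float, ...]
    std: tuple[float, ...]

    @property
    def num_classes(self) -> int:
        # Total number of target classes, taken from the label list.
        return len(self.classes)

    @property
    def resolution_str(self) -> str:
        # Assumption: images are square; the "unknown" fallback is hypothetical.
        if self.native_resolution is None:
            return "unknown"
        return f"{self.native_resolution}x{self.native_resolution}"

    @property
    def normalization_info(self) -> str:
        # Assumption: the real report format is not documented.
        return f"mean={self.mean}, std={self.std}"


meta = MetadataPropertiesSketch(
    classes=["class_a", "class_b"],
    native_resolution=28,
    mean=(0.5,),
    std=(0.25,),
)
print(meta.num_classes, meta.resolution_str, meta.normalization_info)
```

Computing these values as read-only properties rather than storing them keeps them consistent with the underlying fields, which fits an immutable container.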