metadata
orchard.core.metadata
Dataset Metadata Package.
This package centralizes the specifications for all supported datasets. It serves as the single source of truth for the Orchard, ensuring that data dimensions, labels, and normalization constants are consistent across the entire pipeline.
DatasetMetadata
Bases: BaseModel
Immutable metadata container for a dataset entry.
Groups dataset-specific constants and keeps them frozen throughout pipeline execution. It serves as the static definition that feeds into the dynamic DatasetConfig.
Attributes:
| Name | Type | Description |
|---|---|---|
| `name` | `str` | Short identifier (e.g., …). |
| `display_name` | `str` | Human-readable name for reporting. |
| `md5_checksum` | `str` | MD5 hash for download integrity verification. |
| `url` | `str` | Source URL for dataset download. |
| `path` | `Path` | Local path to the … |
| `classes` | `list[str]` | Class labels in index order. |
| `in_channels` | `int` | Number of image channels (1=grayscale, 3=RGB). |
| `native_resolution` | `int \| None` | Native pixel resolution (e.g., 28, 224). |
| `mean` | `tuple[float, ...]` | Channel-wise normalization mean. |
| `std` | `tuple[float, ...]` | Channel-wise normalization standard deviation. |
| `is_anatomical` | `bool` | Whether images have fixed anatomical orientation. |
| `is_texture_based` | `bool` | Whether classification relies on texture patterns. |
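To make the "immutable container" idea concrete, here is a minimal sketch that mirrors the attribute list above. It swaps in a frozen standard-library dataclass for the Pydantic BaseModel so that it runs without extra dependencies; `MetadataSketch` and every field value are hypothetical and are not taken from Orchard.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class MetadataSketch:
    """Hypothetical stand-in for DatasetMetadata (not the Orchard class)."""

    name: str
    display_name: str
    md5_checksum: str
    url: str
    path: Path
    classes: list[str]
    in_channels: int
    native_resolution: int | None
    mean: tuple[float, ...]
    std: tuple[float, ...]
    is_anatomical: bool
    is_texture_based: bool


# All values below are placeholders invented for illustration.
entry = MetadataSketch(
    name="toy",
    display_name="Toy Dataset",
    md5_checksum="0" * 32,
    url="https://example.invalid/toy.npz",
    path=Path("data") / "toy.npz",
    classes=["negative", "positive"],
    in_channels=1,
    native_resolution=28,
    mean=(0.5,),
    std=(0.5,),
    is_anatomical=True,
    is_texture_based=False,
)

# Assigning to a field of a frozen instance raises instead of mutating it.
try:
    entry.in_channels = 3
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Note that freezing only blocks attribute reassignment. A mutable field such as `classes` can still be changed in place, which is one plausible reason the registry hands out deep copies.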
DatasetRegistryWrapper
Bases: BaseModel
Pydantic wrapper for multi-domain dataset registries.
Merges domain-specific registries (medical, space) based on the selected resolution and provides validated, deep-copied access to dataset metadata entries.
Attributes:
| Name | Type | Description |
|---|---|---|
| `resolution` | `int` | Target dataset resolution (28, 32, 64, 128, or 224). |
| `registry` | `dict[str, DatasetMetadata]` | Deep-copied metadata registry for the selected resolution. |
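The merge described above might look roughly like the following sketch. Only the list of supported resolutions comes from this page. The per-domain dictionaries, the `build_registry` helper, the plain-dict entries, and the choice to raise `ValueError` on an unsupported resolution are all assumptions made for illustration.

```python
from __future__ import annotations

import copy

SUPPORTED_RESOLUTIONS = (28, 32, 64, 128, 224)

# Hypothetical per-domain registries: resolution -> dataset name -> entry.
MEDICAL_SKETCH = {
    28: {"toy_medical": {"in_channels": 1, "classes": ["a", "b"]}},
    224: {"toy_medical": {"in_channels": 1, "classes": ["a", "b"]}},
}
SPACE_SKETCH = {
    224: {"toy_space": {"in_channels": 3, "classes": ["x", "y", "z"]}},
}


def build_registry(resolution: int) -> dict[str, dict]:
    """Merge every domain's entries for one resolution into a private copy."""
    if resolution not in SUPPORTED_RESOLUTIONS:
        # Assumed behaviour: reject resolutions outside the supported set.
        raise ValueError(f"Unsupported resolution: {resolution}")
    merged: dict[str, dict] = {}
    for domain in (MEDICAL_SKETCH, SPACE_SKETCH):
        merged.update(domain.get(resolution, {}))
    # Deep copy so callers cannot alter the module-level definitions.
    return copy.deepcopy(merged)


registry_28 = build_registry(28)
registry_224 = build_registry(224)
```

In this sketch a domain without entries at the requested resolution simply contributes nothing, so `registry_28` holds only the medical entry while `registry_224` holds both.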
get_dataset(name)
Retrieves the DatasetMetadata entry registered under the given name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Dataset identifier | required |
Returns:
| Type | Description |
|---|---|
| `DatasetMetadata` | Deep copy of DatasetMetadata |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If dataset not found in registry |
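The lookup contract documented above (a deep copy on success, `KeyError` on a miss) can be illustrated with a small stand-in. `RegistrySketch` and its plain-dict entries are hypothetical. The real wrapper is a Pydantic model holding DatasetMetadata objects, and its error message may differ.

```python
from __future__ import annotations

import copy


class RegistrySketch:
    """Hypothetical stand-in for DatasetRegistryWrapper lookups."""

    def __init__(self, registry: dict[str, dict]) -> None:
        # Keep a private copy so later changes by the caller do not leak in.
        self._registry = copy.deepcopy(registry)

    def get_dataset(self, name: str) -> dict:
        if name not in self._registry:
            raise KeyError(f"Dataset '{name}' not found in registry")
        # Return a fresh deep copy on every call.
        return copy.deepcopy(self._registry[name])


wrapper = RegistrySketch({"toy": {"in_channels": 1, "classes": ["a", "b"]}})

first = wrapper.get_dataset("toy")
first["classes"].append("c")  # mutates only the caller's copy

second = wrapper.get_dataset("toy")  # unaffected by the mutation above

try:
    wrapper.get_dataset("missing")
except KeyError:
    missing_raises = True
else:
    missing_raises = False
```

Returning a copy on each call costs a little extra allocation, but it means no caller can corrupt the shared definitions that the rest of the pipeline relies on.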