
manifest

orchard.core.config.manifest

Vision Pipeline Configuration Manifest.

Defines the hierarchical Config schema that aggregates specialized sub-configs (Hardware, Dataset, Architecture, Training, Evaluation, Augmentation, Optuna, Export) into a single immutable manifest.

Layout:

  • Config: main Pydantic model, ordered as:
      • Fields & model validator
      • Properties (run_slug, num_workers)
      • Serialization (dump_portable, dump_serialized)
      • from_recipe: primary factory (orchard CLI)
  • _CrossDomainValidator: cross-domain validation logic (AMP vs Device, LR bounds, Mixup scheduling, resolution/model pairing)
  • _deep_set: dot-notation dict helper for CLI overrides

Config

Bases: BaseModel

Main experiment manifest aggregating specialized sub-configurations.

Serves as the Single Source of Truth (SSOT) for all experiment parameters. Validates cross-domain logic (AMP/device compatibility, resolution/model pairing) and provides factory methods for YAML and CLI instantiation.

Attributes:

Name          Type                   Description
task_type     TaskType               ML task type (currently "classification")
hardware      HardwareConfig         Device selection, threading, reproducibility settings
telemetry     TelemetryConfig        Logging, paths, experiment naming
training      TrainingConfig         Optimizer, scheduler, epochs, regularization
augmentation  AugmentationConfig     Data augmentation and TTA parameters
dataset       DatasetConfig          Dataset selection, resolution, normalization
evaluation    EvaluationConfig       Metrics, visualization, reporting settings
architecture  ArchitectureConfig     Architecture selection, pretrained weights
optuna        OptunaConfig | None    Hyperparameter optimization configuration (optional)
export        ExportConfig | None    Model export configuration for ONNX (optional)
tracking      TrackingConfig | None  Experiment tracking configuration for MLflow (optional)

Example

>>> from pathlib import Path
>>> from orchard.core import Config
>>> cfg = Config.from_recipe(Path("recipes/config_mini_cnn.yaml"))
>>> cfg.architecture.name
'mini_cnn'

run_slug property

Generate unique experiment folder identifier.

Combines dataset name and model name for human-readable run identification in output directories. Slashes in architecture names (e.g. timm/convnext_base) are replaced with underscores to keep paths flat.

Returns:

Type  Description
str   String in format '{dataset_name}_{model_name}'.
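
The property's source is not shown on this page, so the following is only a minimal sketch of the documented behaviour. The helper name `make_run_slug` and the dataset name `bloodmnist` are illustrative assumptions, and the real property may normalize names further.

```python
def make_run_slug(dataset_name: str, model_name: str) -> str:
    """Hypothetical helper mirroring the documented '{dataset_name}_{model_name}' format."""
    # Replace slashes so names like "timm/convnext_base" do not create nested folders.
    flat_model = model_name.replace("/", "_")
    return f"{dataset_name}_{flat_model}"


# "bloodmnist" is an assumed dataset name used only for illustration.
print(make_run_slug("bloodmnist", "timm/convnext_base"))  # bloodmnist_timm_convnext_base
```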

num_workers property

Get effective DataLoader workers from hardware policy.

Delegates to hardware config which respects reproducibility constraints (returns 0 if reproducible mode enabled).

Returns:

Type  Description
int   Number of DataLoader worker processes.
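
The delegation described above might look roughly like the sketch below. `HardwarePolicySketch` and its field names are hypothetical stand-ins for HardwareConfig, whose actual fields are not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwarePolicySketch:
    """Hypothetical stand-in for HardwareConfig; field names are assumptions."""

    reproducible: bool = False
    requested_workers: int = 4

    @property
    def effective_num_workers(self) -> int:
        # Reproducible mode forces in-process loading (0 workers), as documented.
        if self.reproducible:
            return 0
        # Otherwise the requested worker count is passed through unchanged.
        return self.requested_workers


print(HardwarePolicySketch(reproducible=True).effective_num_workers)   # 0
print(HardwarePolicySketch(reproducible=False).effective_num_workers)  # 4
```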

validate_logic()

Cross-domain validation enforcing consistency across sub-configs.

Invokes _CrossDomainValidator to check:

  • Model/resolution compatibility (ResNet-18 → 28x28)
  • Training epochs bounds (mixup_epochs ≤ epochs)
  • Hardware/feature alignment (AMP requires GPU)
  • Pretrained/channel consistency (pretrained → RGB)
  • Optimizer bounds (min_lr < learning_rate)

Returns:

Type      Description
'Config'  Validated Config instance with auto-corrections applied

Raises:

Type                Description
OrchardConfigError  On irrecoverable validation failures

Source code in orchard/core/config/manifest.py
@model_validator(mode="after")
def validate_logic(self) -> "Config":
    """
    Cross-domain validation enforcing consistency across sub-configs.

    Invokes _CrossDomainValidator to check:

    - Model/resolution compatibility (ResNet-18 → 28x28)
    - Training epochs bounds (mixup_epochs ≤ epochs)
    - Hardware/feature alignment (AMP requires GPU)
    - Pretrained/channel consistency (pretrained → RGB)
    - Optimizer bounds (min_lr < learning_rate)

    Returns:
        Validated Config instance with auto-corrections applied

    Raises:
        OrchardConfigError: On irrecoverable validation failures
    """
    return _CrossDomainValidator.validate(self)
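
The body of _CrossDomainValidator is not reproduced on this page. As a rough illustration of the pattern (raise on contradictions, auto-correct what can be fixed), here is a stdlib-only sketch. All names are hypothetical, and treating AMP on CPU as an auto-correction rather than an error is an assumption.

```python
from dataclasses import dataclass, replace


class ConfigErrorSketch(ValueError):
    """Hypothetical stand-in for OrchardConfigError."""


@dataclass(frozen=True)
class TrainingSketch:
    """Hypothetical subset of training settings used by the checks below."""

    epochs: int
    mixup_epochs: int
    learning_rate: float
    min_lr: float
    use_amp: bool


def validate_cross_domain(training: TrainingSketch, device: str) -> TrainingSketch:
    # Irrecoverable: mixup cannot run for longer than training itself.
    if training.mixup_epochs > training.epochs:
        raise ConfigErrorSketch("mixup_epochs must not exceed epochs")
    # Irrecoverable: the scheduler floor must sit below the starting rate.
    if training.min_lr >= training.learning_rate:
        raise ConfigErrorSketch("min_lr must be lower than learning_rate")
    # Assumed auto-correction: AMP is switched off when no GPU is selected.
    if training.use_amp and device == "cpu":
        return replace(training, use_amp=False)
    return training


fixed = validate_cross_domain(TrainingSketch(10, 5, 1e-3, 1e-6, True), "cpu")
print(fixed.use_amp)  # False
```

Returning a corrected copy instead of mutating in place keeps the sketch compatible with frozen (immutable) models.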

dump_portable()

Serialize config with environment-agnostic paths.

Converts absolute filesystem paths to project-relative paths (e.g., '/home/user/project/dataset' -> './dataset') to prevent host-specific path leakage in exported configurations.

Returns:

Type            Description
dict[str, Any]  Dictionary with all paths converted to portable relative strings.

Source code in orchard/core/config/manifest.py
def dump_portable(self) -> dict[str, Any]:
    """
    Serialize config with environment-agnostic paths.

    Converts absolute filesystem paths to project-relative paths
    (e.g., '/home/user/project/dataset' -> './dataset') to prevent
    host-specific path leakage in exported configurations.

    Returns:
        Dictionary with all paths converted to portable relative strings.
    """
    full_data = self.model_dump()
    full_data["hardware"] = self.hardware.model_dump()
    full_data["telemetry"] = self.telemetry.to_portable_dict()

    # Sanitize dataset root path
    # default {} never reached (model_dump always has "dataset"), equivalent mutant
    dataset_section = full_data.get("dataset", {})  # pragma: no mutate
    data_root = dataset_section.get("data_root")

    if data_root:
        dr_path = Path(data_root)
        if dr_path.is_relative_to(PROJECT_ROOT):
            relative_dr = dr_path.relative_to(PROJECT_ROOT)
            full_data["dataset"]["data_root"] = f"./{relative_dr}"

    return full_data
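
The path-sanitizing step can be tried in isolation. The sketch below assumes a hypothetical helper `to_portable` and uses `PurePosixPath` so the result does not depend on the host OS; the real method works on `Path` and the module-level `PROJECT_ROOT`.

```python
from pathlib import PurePosixPath


def to_portable(path_str: str, project_root: str) -> str:
    """Return './<relative>' for paths under project_root, else the path unchanged."""
    path = PurePosixPath(path_str)
    root = PurePosixPath(project_root)
    # PurePath.is_relative_to is available from Python 3.9.
    if path.is_relative_to(root):
        return f"./{path.relative_to(root)}"
    # Paths outside the project are left as they are, matching the listing above.
    return path_str


print(to_portable("/home/user/project/dataset", "/home/user/project"))  # ./dataset
print(to_portable("/mnt/shared/dataset", "/home/user/project"))         # /mnt/shared/dataset
```

Note that, as in the listing, a data root outside the project root is not rewritten, so an absolute path can still appear in the exported dict.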

dump_serialized()

Convert config to JSON-compatible dict for YAML serialization.

Uses Pydantic's json mode to ensure all values are serializable (Path objects become strings, enums become values, etc.).

Returns:

Type            Description
dict[str, Any]  Dictionary with all values JSON-serializable for YAML export.

Source code in orchard/core/config/manifest.py
def dump_serialized(self) -> dict[str, Any]:
    """
    Convert config to JSON-compatible dict for YAML serialization.

    Uses Pydantic's json mode to ensure all values are serializable
    (Path objects become strings, enums become values, etc.).

    Returns:
        Dictionary with all values JSON-serializable for YAML export.
    """
    # Pydantic mode= is case-insensitive, so "json"/"JSON" are equivalent mutants
    return self.model_dump(mode="json")  # pragma: no mutate
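
To make the effect of JSON-mode dumping concrete without depending on Pydantic, here is a stdlib-only analogue. It is not Pydantic's implementation; `to_json_compatible` and `TaskKind` are hypothetical names, and only the conversions mentioned above (paths, enums) plus tuples are covered.

```python
import json
from enum import Enum
from pathlib import PurePath, PurePosixPath
from typing import Any


class TaskKind(Enum):
    """Hypothetical enum used only for illustration."""

    CLASSIFICATION = "classification"


def to_json_compatible(value: Any) -> Any:
    """Recursively convert enums, paths, and tuples into JSON-friendly values."""
    if isinstance(value, Enum):
        return value.value
    if isinstance(value, PurePath):
        return str(value)
    if isinstance(value, dict):
        return {str(k): to_json_compatible(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_compatible(v) for v in value]
    return value


raw = {
    "task_type": TaskKind.CLASSIFICATION,
    "data_root": PurePosixPath("data/raw"),
    "sizes": (28, 224),
}
clean = to_json_compatible(raw)
print(json.dumps(clean))
# {"task_type": "classification", "data_root": "data/raw", "sizes": [28, 224]}
```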

from_recipe(recipe_path, overrides=None) classmethod

Factory from YAML recipe with optional dot-notation overrides.

Loads the recipe, applies overrides to the raw dict before Pydantic instantiation, resolves dataset metadata, and returns a validated Config. This is the preferred entry point for the orchard CLI.

Parameters:

Name         Type                   Description                                                              Default
recipe_path  Path                   Path to YAML recipe file                                                 required
overrides    dict[str, Any] | None  Flat dict of dot-notation keys to values (e.g. {"training.epochs": 20})  None

Returns:

Type      Description
'Config'  Validated Config instance

Raises:

Type                Description
FileNotFoundError   If recipe_path does not exist
OrchardConfigError  If the recipe is missing dataset.name
OrchardConfigError  If the dataset is not found in the registry at the requested resolution

Example

>>> cfg = Config.from_recipe(Path("recipes/config_mini_cnn.yaml"))
>>> cfg = Config.from_recipe(
...     Path("recipes/config_mini_cnn.yaml"),
...     overrides={"training.epochs": 20, "training.seed": 123},
... )

Source code in orchard/core/config/manifest.py
@classmethod
def from_recipe(
    cls,
    recipe_path: Path,
    overrides: dict[str, Any] | None = None,
) -> "Config":
    """
    Factory from YAML recipe with optional dot-notation overrides.

    Loads the recipe, applies overrides to the raw dict *before*
    Pydantic instantiation, resolves dataset metadata, and returns
    a validated Config. This is the preferred entry point for the
    ``orchard`` CLI.

    Args:
        recipe_path: Path to YAML recipe file
        overrides: Flat dict of dot-notation keys to values
                   (e.g. ``{"training.epochs": 20}``)

    Returns:
        Validated Config instance

    Raises:
        FileNotFoundError: If recipe_path does not exist
        OrchardConfigError: If recipe is missing ``dataset.name`` or the
            dataset is not found in the registry at the requested resolution

    Example:
        >>> cfg = Config.from_recipe(Path("recipes/config_mini_cnn.yaml"))
        >>> cfg = Config.from_recipe(
        ...     Path("recipes/config_mini_cnn.yaml"),
        ...     overrides={"training.epochs": 20, "training.seed": 123},
        ... )
    """
    raw_data = load_config_from_yaml(recipe_path)

    if overrides:
        for dotted_key, value in overrides.items():
            _deep_set(raw_data, dotted_key, value)

    dataset_section = raw_data.get("dataset", {})
    ds_name = dataset_section.get("name")
    if not ds_name:
        raise OrchardConfigError(f"Recipe '{recipe_path}' must specify 'dataset.name'")

    resolution = dataset_section.get("resolution", 28)
    wrapper = DatasetRegistryWrapper(resolution=resolution)

    if ds_name not in wrapper.registry:
        available = list(wrapper.registry.keys())
        raise OrchardConfigError(
            f"Dataset '{ds_name}' not found at resolution {resolution}. "
            f"Available at {resolution}px: {available}"
        )

    metadata = wrapper.get_dataset(ds_name)
    raw_data.setdefault("dataset", {})["metadata"] = metadata

    if overrides and raw_data.get("optuna") is not None:
        optuna_section = raw_data["optuna"]
        preset = optuna_section.get("search_space_preset", "full")
        _warn_optuna_override_conflicts(overrides, preset)

    return cls(**raw_data)
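
The override loop in from_recipe relies on _deep_set, whose source is not shown on this page. A minimal sketch of such a dot-notation setter follows. The name `deep_set_sketch` is hypothetical, and both creating missing sections and replacing non-dict intermediates are assumptions about the behaviour.

```python
from typing import Any


def deep_set_sketch(data: dict[str, Any], dotted_key: str, value: Any) -> None:
    """Set data['a']['b'] = value for dotted_key 'a.b', creating missing dicts."""
    *parents, leaf = dotted_key.split(".")
    node = data
    for part in parents:
        child = node.get(part)
        # Assumption: a missing or non-dict intermediate is replaced by an empty dict.
        if not isinstance(child, dict):
            child = {}
            node[part] = child
        node = child
    node[leaf] = value


raw = {"training": {"epochs": 10}}
for key, val in {"training.epochs": 20, "training.seed": 123}.items():
    deep_set_sketch(raw, key, val)
print(raw)  # {'training': {'epochs': 20, 'seed': 123}}
```

Because the overrides are applied to the raw dict before the model is built, overridden values pass through the same field and cross-domain validation as values written in the recipe.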