run_paths

orchard.core.paths.run_paths

Dynamic Run Orchestration and Experiment Directory Management.

Provides the RunPaths class implementing an 'Atomic Run Isolation' strategy for ML experiment artifact management. Automates creation of immutable, hashed directory structures ensuring hyperparameters, model weights, and logs are uniquely identified and protected from accidental overwrites.

The hashing strategy combines date, dataset/model slugs, and a blake2b hash of training configuration plus timestamp to guarantee unique run directories without collision fallbacks.

Example

>>> from orchard.core.paths import RunPaths
>>> paths = RunPaths.create(
...     dataset_slug="organcmnist",
...     architecture_name="EfficientNet-B0",
...     training_cfg={"batch_size": 32, "lr": 0.001}
... )
>>> paths.root
PosixPath('outputs/20260208_organcmnist_efficientnetb0_a3f7c2')
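
The ID scheme described above (date, slugs, and a blake2b hash of the configuration plus a timestamp) could look roughly like the sketch below. The function `sketch_run_id`, the JSON serialization, and the 3-byte digest are assumptions made for illustration; the page does not show the actual ID generation code.

```python
import hashlib
import json
from datetime import datetime
from typing import Any


def sketch_run_id(
    dataset_slug: str,
    architecture_slug: str,
    training_cfg: dict[str, Any],
    now: datetime,
) -> str:
    """Hypothetical stand-in for the run ID generation described above."""
    # Serialize with sorted keys so dict ordering does not change the hash.
    payload = json.dumps(training_cfg, sort_keys=True, default=str)
    # Mix in the timestamp so identical configs still produce distinct IDs.
    payload += now.isoformat()
    # digest_size=3 gives 6 hex characters, the length of 'a3f7c2' in the
    # example; the real digest length is an assumption.
    digest = hashlib.blake2b(payload.encode("utf-8"), digest_size=3).hexdigest()
    return f"{now:%Y%m%d}_{dataset_slug}_{architecture_slug}_{digest}"


run_id = sketch_run_id(
    "organcmnist",
    "efficientnetb0",
    {"batch_size": 32, "lr": 0.001},
    datetime(2026, 2, 8, 12, 0, 0),
)
print(run_id)
```

Passing the timestamp in as an argument keeps the sketch reproducible; a real implementation would presumably read the clock itself.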

RunPaths

Bases: BaseModel

Immutable container for experiment-specific directory paths.

Implements atomic run isolation using a hashing strategy that combines DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH to create unique directory structures. Because the hash also covers the run timestamp, repeated runs with identical settings still get distinct directories. The Pydantic frozen model ensures paths cannot be modified after creation.
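
The effect of a frozen container can be illustrated with a standard-library frozen dataclass. This is an analogy only: `FrozenPathsSketch` is a hypothetical class, and the real RunPaths is a Pydantic model whose error type on assignment differs.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class FrozenPathsSketch:
    """Stdlib analogue of an immutable path container (not the real RunPaths)."""

    run_id: str
    root: Path


run_name = "20260208_organcmnist_efficientnetb0_a3f7c2"
sketch = FrozenPathsSketch(run_id=run_name, root=Path("outputs") / run_name)

try:
    # Attempt to redirect the run to another directory after creation.
    sketch.root = Path("elsewhere")
    mutated = True
except FrozenInstanceError:
    # Frozen instances reject attribute assignment.
    mutated = False

print(mutated)
```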

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| run_id | str | Unique identifier in format YYYYMMDD_dataset_model_hash. |
| dataset_slug | str | Normalized lowercase dataset name. |
| architecture_slug | str | Sanitized alphanumeric architecture identifier. |
| root | Path | Base directory for all run artifacts. |
| figures | Path | Directory for plots, confusion matrices, ROC curves. |
| checkpoints | Path | Directory for saved checkpoints (.pth files). |
| reports | Path | Directory for config mirrors, CSV/XLSX summaries. |
| logs | Path | Directory for training logs and session output. |
| database | Path | Directory for SQLite optimization studies. |
| exports | Path | Directory for production exports (ONNX). |

Example

Directory structure created:

outputs/20260208_organcmnist_efficientnetb0_a3f7c2/
├── figures/
├── checkpoints/
├── reports/
├── logs/
├── database/
└── exports/
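
A minimal sketch of creating this layout with pathlib follows. The helper `make_run_dirs` and the use of `exist_ok=True` are assumptions; the page does not show how the directories are actually set up.

```python
import tempfile
from pathlib import Path

# Subdirectory names taken from the tree above.
SUBDIRS = ("figures", "checkpoints", "reports", "logs", "database", "exports")


def make_run_dirs(root: Path) -> list[Path]:
    """Hypothetical helper: create every run subdirectory under root."""
    created = []
    for name in SUBDIRS:
        sub = root / name
        # parents=True also creates the run root (and base dir) if missing.
        sub.mkdir(parents=True, exist_ok=True)
        created.append(sub)
    return created


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "20260208_organcmnist_efficientnetb0_a3f7c2"
    created = make_run_dirs(root)
    # Collect the directory names while the temporary tree still exists.
    found = sorted(p.name for p in root.iterdir() if p.is_dir())

print(found)
```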

best_model_path property

Path for the best-performing model checkpoint.

Returns:

| Type | Description |
| --- | --- |
| Path | Path in format: checkpoints/best_{architecture_slug}.pth |

final_report_path property

Path for the comprehensive experiment summary report.

Returns:

| Type | Description |
| --- | --- |
| Path | Path to reports/training_summary.xlsx |
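
Both properties are plain path compositions over the run's subdirectories. The free functions below are hypothetical stand-ins that mirror the documented formats; they are not the class's own code.

```python
from pathlib import Path


def best_model_path(checkpoints: Path, architecture_slug: str) -> Path:
    """Mirror of the documented format checkpoints/best_{architecture_slug}.pth."""
    return checkpoints / f"best_{architecture_slug}.pth"


def final_report_path(reports: Path) -> Path:
    """Mirror of the documented location reports/training_summary.xlsx."""
    return reports / "training_summary.xlsx"


root = Path("outputs") / "20260208_organcmnist_efficientnetb0_a3f7c2"
best = best_model_path(root / "checkpoints", "efficientnetb0")
report = final_report_path(root / "reports")

print(best.as_posix())
print(report.as_posix())
```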

create(dataset_slug, architecture_name, training_cfg, base_dir=None) classmethod

Factory method to create and initialize a unique run environment.

Creates a new RunPaths instance with a deterministic unique ID based on dataset, model, and training configuration. Physically creates all subdirectories on the filesystem.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataset_slug | str | Dataset identifier (e.g., 'organcmnist'). Normalized to lowercase. | required |
| architecture_name | str | Human-readable model name (e.g., 'EfficientNet-B0'). Lowercased, with non-alphanumeric characters stripped. | required |
| training_cfg | dict[str, Any] | Dictionary of hyperparameters used for hash generation. Supports nested dicts, but only primitive values (int, float, str, bool) and lists contribute to the hash. | required |
| base_dir | Path \| None | Custom base directory for outputs. Defaults to OUTPUTS_ROOT (typically './outputs'). | None |

Returns:

| Type | Description |
| --- | --- |
| 'RunPaths' | Fully initialized RunPaths instance with all directories created. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If dataset_slug or architecture_name is not a string. |

Example

>>> paths = RunPaths.create(
...     dataset_slug="OrganCMNIST",
...     architecture_name="EfficientNet-B0",
...     training_cfg={"batch_size": 32, "lr": 0.001}
... )
>>> paths.dataset_slug
'organcmnist'
>>> paths.architecture_slug
'efficientnetb0'

Source code in orchard/core/paths/run_paths.py
@classmethod
def create(
    cls,
    dataset_slug: str,
    architecture_name: str,
    training_cfg: dict[str, Any],
    base_dir: Path | None = None,
) -> "RunPaths":
    """
    Factory method to create and initialize a unique run environment.

    Creates a new RunPaths instance with a deterministic unique ID based
    on dataset, model, and training configuration. Physically creates all
    subdirectories on the filesystem.

    Args:
        dataset_slug: Dataset identifier (e.g., 'organcmnist'). Will be
            normalized to lowercase.
        architecture_name: Human-readable model name (e.g., 'EfficientNet-B0').
            Lowercased, with non-alphanumeric characters stripped.
        training_cfg: Dictionary of hyperparameters used for hash generation.
            Supports nested dicts, but only primitive values (int, float,
            str, bool) and lists contribute to the hash.
        base_dir: Custom base directory for outputs. Defaults to OUTPUTS_ROOT
            (typically './outputs').

    Returns:
        Fully initialized RunPaths instance with all directories created.

    Raises:
        ValueError: If dataset_slug or architecture_name is not a string.

    Example:
        >>> paths = RunPaths.create(
        ...     dataset_slug="OrganCMNIST",
        ...     architecture_name="EfficientNet-B0",
        ...     training_cfg={"batch_size": 32, "lr": 0.001}
        ... )
        >>> paths.dataset_slug
        'organcmnist'
        >>> paths.architecture_slug
        'efficientnetb0'
    """
    if not isinstance(dataset_slug, str):
        raise ValueError(f"Expected string for dataset_slug but got {type(dataset_slug)}")
    ds_slug = dataset_slug.lower()

    if not isinstance(architecture_name, str):
        raise ValueError(
            f"Expected string for architecture_name but got {type(architecture_name)}"
        )
    a_slug = re.sub(r"[^a-zA-Z0-9]", "", architecture_name.lower())

    # Determine the unique run ID
    run_id = cls._generate_unique_id(ds_slug, a_slug, training_cfg)

    base = Path(base_dir or OUTPUTS_ROOT)
    root_path = base / run_id

    # No collision fallback needed: run_timestamp guarantees uniqueness

    instance = cls(
        run_id=run_id,
        dataset_slug=ds_slug,
        architecture_slug=a_slug,
        root=root_path,
        figures=root_path / "figures",
        checkpoints=root_path / "checkpoints",
        reports=root_path / "reports",
        logs=root_path / "logs",
        database=root_path / "database",
        exports=root_path / "exports",
    )

    instance._setup_run_directories()
    return instance
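
The slug normalization in the listing above can be tried in isolation. The regular expression is the one from the source; the wrapper function `normalize_architecture` and the sample names are illustrative assumptions.

```python
import re


def normalize_architecture(name: str) -> str:
    """Lowercase, then drop every character outside ASCII letters and digits."""
    return re.sub(r"[^a-zA-Z0-9]", "", name.lower())


examples = ["EfficientNet-B0", "ViT-B/16", "ResNet_50 (v2)"]
slugs = [normalize_architecture(name) for name in examples]
print(slugs)

# Names that differ only in punctuation collapse to the same slug.
same_slug = normalize_architecture("ResNet-50") == normalize_architecture("ResNet50")
print(same_slug)
```

Because punctuation is discarded, two differently written architecture names can share a slug; in that case the hash suffix of the run ID is what keeps the directories apart.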

get_fig_path(filename)

Generate path for a visualization artifact.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filename | str | Name of the figure file (e.g., 'confusion_matrix.png'). | required |

Returns:

| Type | Description |
| --- | --- |
| Path | Path to the file within the figures directory (relative or absolute, matching the run root). |

Source code in orchard/core/paths/run_paths.py
def get_fig_path(self, filename: str) -> Path:
    """
    Generate path for a visualization artifact.

    Args:
        filename: Name of the figure file (e.g., 'confusion_matrix.png').

    Returns:
        Path to the file within the figures directory (relative or
        absolute, matching the run root).
    """
    return self.figures / filename
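
Since the method only joins filename onto the figures directory, the result is relative whenever the run root is relative, as in the 'outputs/...' example earlier on this page. The sketch below reproduces the join with plain pathlib objects (no RunPaths instance is involved) and shows one way to obtain an absolute path when a caller needs it.

```python
from pathlib import Path

# A relative figures directory, shaped like the example run root on this page.
figures = Path("outputs") / "20260208_organcmnist_efficientnetb0_a3f7c2" / "figures"

# Same join as in get_fig_path.
fig_path = figures / "confusion_matrix.png"
is_abs_before = fig_path.is_absolute()

# resolve() anchors the path at the current working directory.
absolute = fig_path.resolve()
is_abs_after = absolute.is_absolute()

print(is_abs_before, is_abs_after)
```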

get_config_path()

Get path for the archived run configuration.

Returns:

| Type | Description |
| --- | --- |
| Path | Path to reports/config.yaml |

Source code in orchard/core/paths/run_paths.py
def get_config_path(self) -> Path:
    """
    Get path for the archived run configuration.

    Returns:
        Path to reports/config.yaml
    """
    return self.reports / "config.yaml"

get_db_path()

Get path for Optuna SQLite study database.

The database directory is created during RunPaths initialization, ensuring the parent directory exists before Optuna writes to it.

Returns:

| Type | Description |
| --- | --- |
| Path | Path to database/study.db |

Source code in orchard/core/paths/run_paths.py
def get_db_path(self) -> Path:
    """
    Get path for Optuna SQLite study database.

    The database directory is created during RunPaths initialization,
    ensuring the parent directory exists before Optuna writes to it.

    Returns:
        Path to database/study.db
    """
    return self.database / "study.db"
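
The note about the parent directory matters because SQLite does not create missing directories. The sketch below shows this with the standard-library sqlite3 module in a temporary directory; the `trials` table is a placeholder, and the final line builds a SQLAlchemy-style SQLite URL of the kind Optuna accepts as a storage string (how the project actually passes the path to Optuna is not shown on this page).

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    database_dir = Path(tmp) / "run" / "database"
    db_path = database_dir / "study.db"

    # Without the parent directory, SQLite cannot open the file.
    try:
        sqlite3.connect(db_path).close()
        opened_without_dir = True
    except sqlite3.OperationalError:
        opened_without_dir = False

    # Once the directory exists, the database file is created on first use.
    database_dir.mkdir(parents=True)
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS trials (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()
    db_exists = db_path.exists()

    # SQLAlchemy-style URL for a SQLite file.
    storage_url = f"sqlite:///{db_path.as_posix()}"

print(opened_without_dir, db_exists)
```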

__repr__()

Return string representation with run_id and root path.

Source code in orchard/core/paths/run_paths.py
def __repr__(self) -> str:
    """
    Return string representation with run_id and root path.
    """
    return f"RunPaths(run_id='{self.run_id}', root={self.root})"