paths

orchard.core.paths

Filesystem Authority and Path Orchestration Package.

Centralizes all path-related logic for Orchard ML using a three-layer approach:

  1. Constants Layer (constants module):
     - SUPPORTED_RESOLUTIONS, METRIC_*, LOGGER_NAME: Pure project constants
     - LogStyle: Unified logging style constants (symbols, separators, ANSI)

  2. Root Discovery Layer (root module):
     - PROJECT_ROOT: Dynamically resolved project root
     - DATASET_DIR, OUTPUTS_ROOT: Global directory constants

  3. Dynamic Layer (RunPaths class):
     - Experiment-specific directory management
     - Atomic run isolation via deterministic hashing
     - Automatic subdirectory creation (figures, models, reports, logs, etc.)

Example

>>> from orchard.core.paths import PROJECT_ROOT, RunPaths, LogStyle
>>> PROJECT_ROOT
PosixPath('/home/user/orchard-ml')

LogStyle

Unified logging style constants for consistent visual hierarchy.

Provides separators, symbols, indentation, and ANSI color codes used by all logging modules. Placed here (in paths.constants) rather than in logger.styles so that low-level packages (environment, config) can reference the constants without triggering circular imports.

RunPaths

Bases: BaseModel

Immutable container for experiment-specific directory paths.

Implements atomic run isolation using a deterministic hashing strategy that combines DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH to create unique, collision-free directory structures. The Pydantic frozen model ensures paths cannot be modified after creation.
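
As a rough illustration of what "frozen" means in practice, the sketch below uses a standard-library frozen dataclass as a stand-in for the Pydantic model. That substitution is an assumption made only to keep the example dependency-free; RunPaths itself derives from BaseModel. The class name FrozenPaths and its reduced field set are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class FrozenPaths:
    """Hypothetical stand-in for RunPaths; not the real class."""

    run_id: str
    root: Path
    figures: Path


root = Path("outputs") / "20260208_organcmnist_efficientnetb0_a3f7c2"
paths = FrozenPaths(run_id=root.name, root=root, figures=root / "figures")

try:
    paths.root = Path("elsewhere")  # reassignment after creation is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The point of the pattern is that every consumer of a run sees the same directory layout for the lifetime of the object.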

Attributes:

run_id (str): Unique identifier in the format YYYYMMDD_dataset_model_hash.
dataset_slug (str): Normalized lowercase dataset name.
architecture_slug (str): Sanitized alphanumeric architecture identifier.
root (Path): Base directory for all run artifacts.
figures (Path): Directory for plots, confusion matrices, ROC curves.
checkpoints (Path): Directory for saved checkpoints (.pth files).
reports (Path): Directory for config mirrors, CSV/XLSX summaries.
logs (Path): Directory for training logs and session output.
database (Path): Directory for SQLite optimization studies.
exports (Path): Directory for production exports (ONNX).

Example

Directory structure created:

outputs/20260208_organcmnist_efficientnetb0_a3f7c2/
├── figures/
├── checkpoints/
├── reports/
├── logs/
├── database/
└── exports/
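
The run ID in the tree above follows the DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH pattern. The private helper that builds it is not shown on this page, so the following is only a minimal sketch of one way such an ID could be derived. The function names make_run_id and config_hash, the use of SHA-256 over canonical JSON, and the six-character truncation are all assumptions for illustration.

```python
import hashlib
import json
from datetime import date
from typing import Any


def config_hash(cfg: dict[str, Any], length: int = 6) -> str:
    """Hash a config dict deterministically (hypothetical helper)."""
    # sort_keys makes the serialization independent of key insertion order;
    # default=str is a crude fallback for values JSON cannot encode.
    canonical = json.dumps(cfg, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def make_run_id(ds_slug: str, arch_slug: str, cfg: dict[str, Any], day: date) -> str:
    """Compose DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH (assumed layout)."""
    return f"{day:%Y%m%d}_{ds_slug}_{arch_slug}_{config_hash(cfg)}"


day = date(2026, 2, 8)
run_id = make_run_id("organcmnist", "efficientnetb0", {"batch_size": 32, "lr": 0.001}, day)
# Same values in a different key order yield the same ID.
same = make_run_id("organcmnist", "efficientnetb0", {"lr": 0.001, "batch_size": 32}, day)
# Changing any hyperparameter changes the suffix.
other = make_run_id("organcmnist", "efficientnetb0", {"batch_size": 64, "lr": 0.001}, day)
```

In this sketch, identical inputs on the same day map to the same directory name, which is what makes the scheme deterministic. A short hex suffix makes accidental clashes between different configurations unlikely rather than impossible.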

best_model_path property

Path for the best-performing model checkpoint.

Returns:

Type Description
Path

Path in format: checkpoints/best_{architecture_slug}.pth

final_report_path property

Path for the comprehensive experiment summary report.

Returns:

Type Description
Path

Path to reports/training_summary.xlsx

create(dataset_slug, architecture_name, training_cfg, base_dir=None) classmethod

Factory method to create and initialize a unique run environment.

Creates a new RunPaths instance with a deterministic unique ID based on dataset, model, and training configuration. Physically creates all subdirectories on the filesystem.

Parameters:

dataset_slug (str, required): Dataset identifier (e.g., 'organcmnist'). Normalized to lowercase.
architecture_name (str, required): Human-readable model name (e.g., 'EfficientNet-B0'). Lowercased, with all non-alphanumeric characters stripped.
training_cfg (dict[str, Any], required): Dictionary of hyperparameters used for hash generation. Supports nested dicts, but only simple values (int, float, str, bool, list) contribute to the hash.
base_dir (Path | None, default None): Custom base directory for outputs. Defaults to OUTPUTS_ROOT (typically './outputs').

Returns:

Type Description
RunPaths

Fully initialized RunPaths instance with all directories created.

Raises:

Type Description
ValueError

If dataset_slug or architecture_name is not a string.

Example

>>> paths = RunPaths.create(
...     dataset_slug="OrganCMNIST",
...     architecture_name="EfficientNet-B0",
...     training_cfg={"batch_size": 32, "lr": 0.001}
... )
>>> paths.dataset_slug
'organcmnist'
>>> paths.architecture_slug
'efficientnetb0'

Source code in orchard/core/paths/run_paths.py
@classmethod
def create(
    cls,
    dataset_slug: str,
    architecture_name: str,
    training_cfg: dict[str, Any],
    base_dir: Path | None = None,
) -> "RunPaths":
    """
    Factory method to create and initialize a unique run environment.

    Creates a new RunPaths instance with a deterministic unique ID based
    on dataset, model, and training configuration. Physically creates all
    subdirectories on the filesystem.

    Args:
        dataset_slug: Dataset identifier (e.g., 'organcmnist'). Will be
            normalized to lowercase.
        architecture_name: Human-readable model name (e.g., 'EfficientNet-B0').
            Lowercased, with all non-alphanumeric characters stripped.
        training_cfg: Dictionary of hyperparameters used for hash generation.
            Supports nested dicts, but only simple values (int, float,
            str, bool, list) contribute to the hash.
        base_dir: Custom base directory for outputs. Defaults to OUTPUTS_ROOT
            (typically './outputs').

    Returns:
        Fully initialized RunPaths instance with all directories created.

    Raises:
        ValueError: If dataset_slug or architecture_name is not a string.

    Example:
        >>> paths = RunPaths.create(
        ...     dataset_slug="OrganCMNIST",
        ...     architecture_name="EfficientNet-B0",
        ...     training_cfg={"batch_size": 32, "lr": 0.001}
        ... )
        >>> paths.dataset_slug
        'organcmnist'
        >>> paths.architecture_slug
        'efficientnetb0'
    """
    if not isinstance(dataset_slug, str):
        raise ValueError(f"Expected string for dataset_slug but got {type(dataset_slug)}")
    ds_slug = dataset_slug.lower()

    if not isinstance(architecture_name, str):
        raise ValueError(
            f"Expected string for architecture_name but got {type(architecture_name)}"
        )
    a_slug = re.sub(r"[^a-zA-Z0-9]", "", architecture_name.lower())

    # Determine the unique run ID
    run_id = cls._generate_unique_id(ds_slug, a_slug, training_cfg)

    base = Path(base_dir or OUTPUTS_ROOT)
    root_path = base / run_id

    # No collision fallback needed: run_timestamp guarantees uniqueness

    instance = cls(
        run_id=run_id,
        dataset_slug=ds_slug,
        architecture_slug=a_slug,
        root=root_path,
        figures=root_path / "figures",
        checkpoints=root_path / "checkpoints",
        reports=root_path / "reports",
        logs=root_path / "logs",
        database=root_path / "database",
        exports=root_path / "exports",
    )

    instance._setup_run_directories()
    return instance
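
The slug normalization in the listing above can be tried in isolation. The snippet below reuses the same regular expression and type check; the wrapper name architecture_slug and the sample model names are hypothetical and exist only for this illustration.

```python
import re


def architecture_slug(name: str) -> str:
    """Lowercase the name and drop every non-alphanumeric character."""
    if not isinstance(name, str):
        raise ValueError(f"Expected string for architecture_name but got {type(name)}")
    return re.sub(r"[^a-zA-Z0-9]", "", name.lower())


# Hyphens, spaces, underscores, and slashes are all removed.
slugs = {name: architecture_slug(name) for name in ("EfficientNet-B0", "ResNet 18", "ViT_B/16")}

try:
    architecture_slug(18)  # non-string input is rejected up front
    rejected = False
except ValueError:
    rejected = True
```

Note that, per the listing, dataset_slug is only lowercased, so any spaces or punctuation in a dataset name would carry over into the run ID.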

get_fig_path(filename)

Generate path for a visualization artifact.

Parameters:

filename (str, required): Name of the figure file (e.g., 'confusion_matrix.png').

Returns:

Type Description
Path

Path to the file inside the figures directory (absolute whenever the run root is absolute).

Source code in orchard/core/paths/run_paths.py
def get_fig_path(self, filename: str) -> Path:
    """
    Generate path for a visualization artifact.

    Args:
        filename: Name of the figure file (e.g., 'confusion_matrix.png').

    Returns:
        Path to the file inside the figures directory (absolute whenever
        the run root is absolute).
    """
    return self.figures / filename
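
Because the method is a plain pathlib join, the result inherits the properties of the figures directory, and standard pathlib join rules apply to the filename. The sketch below shows those rules using PurePosixPath so that it behaves the same on every platform; the directory name is made up for the example.

```python
from pathlib import PurePosixPath

# Hypothetical figures directory built from a relative base_dir.
figures = PurePosixPath("outputs/run/figures")

relative = figures / "confusion_matrix.png"
# An absolute right-hand operand discards the left-hand side entirely.
overridden = figures / "/tmp/confusion_matrix.png"
```

Callers that need a guaranteed absolute path may want to pass an absolute base_dir, and should pass bare file names rather than absolute paths as filename.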

get_config_path()

Get path for the archived run configuration.

Returns:

Type Description
Path

Path to reports/config.yaml

Source code in orchard/core/paths/run_paths.py
def get_config_path(self) -> Path:
    """
    Get path for the archived run configuration.

    Returns:
        Path to reports/config.yaml
    """
    return self.reports / "config.yaml"

get_db_path()

Get path for Optuna SQLite study database.

The database directory is created during RunPaths initialization, ensuring the parent directory exists before Optuna writes to it.

Returns:

Type Description
Path

Path to database/study.db

Source code in orchard/core/paths/run_paths.py
def get_db_path(self) -> Path:
    """
    Get path for Optuna SQLite study database.

    The database directory is created during RunPaths initialization,
    ensuring the parent directory exists before Optuna writes to it.

    Returns:
        Path to database/study.db
    """
    return self.database / "study.db"
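
The reason the parent directory matters is that SQLite creates the database file on demand but does not create missing intermediate directories. The sketch below demonstrates this with the standard-library sqlite3 module in a temporary directory; Optuna itself is not used here, and the table name is a placeholder.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "database" / "study.db"

    try:
        sqlite3.connect(str(db_path)).close()
        failed = False
    except sqlite3.OperationalError:
        failed = True  # parent directory "database" does not exist yet

    # Creating the parent first, as RunPaths does at initialization, fixes it.
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    conn.execute("CREATE TABLE placeholder (x INTEGER)")
    conn.commit()
    conn.close()
    created = db_path.exists()
```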

__repr__()

Return string representation with run_id and root path.

Source code in orchard/core/paths/run_paths.py
def __repr__(self) -> str:
    """
    Return string representation with run_id and root path.
    """
    return f"RunPaths(run_id='{self.run_id}', root={self.root})"

get_project_root()

Dynamically locate the project root by searching for anchor files.

Traverses upward from the current file's directory until it finds a marker (.git or pyproject.toml). Supports Docker environments via the IN_DOCKER environment variable override.

Returns:

Type Description
Path

Resolved absolute Path to the project root directory.

Note:

- IN_DOCKER=1 or IN_DOCKER=TRUE (compared case-insensitively) returns /app
- Falls back to a fixed ancestor, three levels above this file's directory, if no markers are found

Source code in orchard/core/paths/root.py
def get_project_root() -> Path:
    """
    Dynamically locate the project root by searching for anchor files.

    Traverses upward from current file's directory until finding a marker
    file (.git or pyproject.toml). Supports Docker environments via
    IN_DOCKER environment variable override.

    Returns:
        Resolved absolute Path to the project root directory.

    Note:

        - IN_DOCKER=1 or IN_DOCKER=TRUE returns /app
        - Falls back to fixed parent traversal if no markers found
    """
    # Environment override for Docker setups
    if str(os.getenv("IN_DOCKER")).upper() in ("1", "TRUE"):
        return Path("/app").resolve()

    # Start from the directory of this file
    current_path = Path(__file__).resolve().parent

    # Look for markers that define the project root
    # Note: .git is most reliable; README.md alone can exist in subdirectories
    root_markers = {".git", "pyproject.toml"}

    for parent in [current_path] + list(current_path.parents):
        if any((parent / marker).exists() for marker in root_markers):
            return parent

    # Fallback if no markers are found
    try:
        if len(current_path.parents) >= 3:
            return current_path.parents[2]
    except IndexError:  # pragma: no cover
        pass

    # Final fallback
    return current_path.parent.parent  # pragma: no cover
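
The marker search in the listing above can be exercised against a throwaway directory tree. The sketch below factors the loop into a function that takes the starting directory as a parameter; the name find_root and the returned None on failure are assumptions for illustration, since the real function always starts from its own file and has further fallbacks.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_root(start: Path, markers: Iterable[str] = (".git", "pyproject.toml")) -> Optional[Path]:
    """Return the nearest ancestor of start (inclusive) that holds a marker."""
    names = tuple(markers)
    for candidate in (start, *start.parents):
        if any((candidate / name).exists() for name in names):
            return candidate
    return None  # the real function falls back to a fixed ancestor instead


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "project"
    nested = project / "orchard" / "core" / "paths"
    nested.mkdir(parents=True)
    (project / "pyproject.toml").touch()

    found = find_root(nested)
```

Because the search stops at the first match while walking upward, a marker in a nested subproject would take precedence over one in an outer repository.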

setup_static_directories()

Ensure core project directories exist at startup.

Creates DATASET_DIR and OUTPUTS_ROOT if they do not exist, preventing runtime errors during data fetching or artifact creation. Uses mkdir(parents=True, exist_ok=True) for idempotent operation.

Source code in orchard/core/paths/root.py
def setup_static_directories() -> None:
    """
    Ensure core project directories exist at startup.

    Creates DATASET_DIR and OUTPUTS_ROOT if they do not exist, preventing
    runtime errors during data fetching or artifact creation. Uses
    mkdir(parents=True, exist_ok=True) for idempotent operation.
    """
    for directory in STATIC_DIRS:
        directory.mkdir(parents=True, exist_ok=True)
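
To see why mkdir(parents=True, exist_ok=True) makes the function safe to call repeatedly, the sketch below runs the same loop twice against temporary stand-ins for DATASET_DIR and OUTPUTS_ROOT. The directory names are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for DATASET_DIR and OUTPUTS_ROOT.
    static_dirs = [Path(tmp) / "dataset", Path(tmp) / "nested" / "outputs"]

    for _ in range(2):  # the second pass is a no-op, not an error
        for directory in static_dirs:
            directory.mkdir(parents=True, exist_ok=True)

    all_exist = all(d.is_dir() for d in static_dirs)

    # Without exist_ok=True, repeating the call raises.
    try:
        static_dirs[0].mkdir()
        raised = False
    except FileExistsError:
        raised = True
```

One caveat from pathlib itself: exist_ok=True only tolerates an existing directory, so the call still raises FileExistsError if a regular file already occupies the path.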