hardware_config

orchard.core.config.hardware_config

Hardware Manifest.

Declarative schema for hardware abstraction and execution policy. Resolves compute device, enforces determinism constraints, and derives hardware-dependent execution parameters.

Single Source of Truth (SSOT) for:

* Device selection (CPU/CUDA/MPS) with automatic resolution
* Reproducibility and deterministic execution
* DataLoader parallelism constraints
* Process-level synchronization (cross-platform lock files)

HardwareConfig

Bases: BaseModel

Hardware abstraction and execution policy configuration.

Manages device selection, reproducibility, process synchronization, and DataLoader parallelism for training execution.

Attributes:

Name Type Description
device DeviceType

Compute device ('auto', 'cpu', 'cuda', 'mps'). 'auto' resolves to the best available accelerator.

project_name ProjectSlug

Project identifier for lock file naming.

allow_process_kill bool

Allow terminating duplicate/zombie processes.

reproducible bool

Enable strict deterministic mode (disables workers, enables deterministic algorithms).

deterministic_warn_only bool

Use warn-only mode for deterministic algorithms. Requires reproducible=True.
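The dependency between the last two attributes can be illustrated with a small standalone sketch. It uses a plain dataclass as a hypothetical stand-in (`DeterminismFlags` is not part of the library), and it assumes the constraint is enforced by raising `ValueError`; how the real class enforces it is not shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeterminismFlags:
    """Hypothetical stand-in for the two determinism-related fields."""

    reproducible: bool = False
    deterministic_warn_only: bool = False

    def __post_init__(self) -> None:
        # Warn-only mode is a refinement of deterministic mode, so it is
        # rejected here when deterministic mode itself is switched off.
        # (Raising ValueError is an assumption made for this sketch.)
        if self.deterministic_warn_only and not self.reproducible:
            raise ValueError(
                "deterministic_warn_only=True requires reproducible=True"
            )
```

Constructing `DeterminismFlags(reproducible=True, deterministic_warn_only=True)` succeeds in this sketch, while `DeterminismFlags(deterministic_warn_only=True)` is rejected.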

lock_file_path property

Cross-platform lock file for preventing concurrent experiments.

Returns:

Type Description
Path

Path in the system temp directory, derived from the project name.
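A minimal sketch of how such a path might be derived with the standard library. The free function and the `.lock` suffix are assumptions for illustration; the actual file naming scheme is not shown on this page.

```python
import tempfile
from pathlib import Path


def lock_file_path(project_name: str) -> Path:
    """Return a per-project lock file path in the system temp directory."""
    # tempfile.gettempdir() resolves the platform's temp directory
    # (for example /tmp on Linux or %TEMP% on Windows), which is what
    # makes the path cross-platform.
    # The ".lock" suffix is hypothetical.
    return Path(tempfile.gettempdir()) / f"{project_name}.lock"
```

Because the path depends only on the project name, two processes configured with the same project name resolve to the same file, which is what allows it to act as a guard against concurrent runs.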

supports_amp property

Check if device supports Automatic Mixed Precision.

Returns:

Type Description
bool

True if device is CUDA or MPS (GPU accelerators), False for CPU.
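The rule described above amounts to a membership check on the resolved device string. The sketch below is a standalone illustration, not the library's code; the function name mirrors the property for readability only.

```python
def supports_amp(device: str) -> bool:
    """Return True for GPU accelerators, False for CPU."""
    # Only the resolved accelerator devices count as AMP-capable here.
    return device in {"cuda", "mps"}
```

A training loop could use this flag to decide whether to enter a mixed-precision context at all, rather than hard-coding the device check at each call site.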

effective_num_workers property

Get optimal DataLoader workers respecting reproducibility constraints.

Returns:

Type Description
int

0 in reproducible mode (avoids non-determinism from multiprocessing), otherwise the system-detected optimal worker count.
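The selection logic can be sketched as follows. The "system-detected optimal worker count" is not specified on this page, so the heuristic used here (leave one core free, cap at `cap`) and the `cap` parameter itself are assumptions.

```python
import os


def effective_num_workers(reproducible: bool, cap: int = 8) -> int:
    """Pick a DataLoader worker count that respects reproducibility."""
    if reproducible:
        # Worker processes are disabled entirely in reproducible mode.
        return 0
    # Hypothetical heuristic: leave one core for the main process and
    # never exceed the cap. os.cpu_count() may return None.
    cpus = os.cpu_count() or 1
    return max(0, min(cap, cpus - 1))
```

The returned value would typically be passed as the `num_workers` argument of a PyTorch `DataLoader`.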

use_deterministic_algorithms property

Check if PyTorch should enforce deterministic operations.

Returns:

Type Description
bool

True if reproducible mode is enabled, False otherwise.
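One way the two determinism flags could be translated into arguments for `torch.use_deterministic_algorithms(mode, *, warn_only=False)` is sketched below. The helper `deterministic_kwargs` is hypothetical, and the block avoids importing PyTorch so that it runs anywhere; how the orchestrator actually applies the flags is not shown on this page.

```python
def deterministic_kwargs(reproducible: bool, warn_only: bool) -> dict:
    """Build keyword arguments for torch.use_deterministic_algorithms."""
    return {
        "mode": reproducible,
        # warn_only is only meaningful when deterministic mode is on.
        "warn_only": reproducible and warn_only,
    }


kwargs = deterministic_kwargs(reproducible=True, warn_only=True)
# With PyTorch installed, the flags would be applied as:
#   torch.use_deterministic_algorithms(**kwargs)
```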

resolve_device(v) classmethod

Validates and resolves device to available hardware.

Auto-selects the best device if 'auto'. If a requested accelerator ('cuda' or 'mps') is unavailable, emits a UserWarning and falls back to CPU instead of raising.

Parameters:

Name Type Description Default
v DeviceType

Requested device type

required

Returns:

Type Description
DeviceType

Resolved device string

Source code in orchard/core/config/hardware_config.py
@field_validator("device")
@classmethod
def resolve_device(cls, v: DeviceType) -> DeviceType:
    """
    Validates and resolves device to available hardware.

    Auto-selects best device if 'auto', falls back to CPU if
    requested accelerator unavailable.

    Args:
        v: Requested device type

    Returns:
        Resolved device string
    """
    if v == "auto":
        return cast(DeviceType, detect_best_device())

    requested = v.lower()

    # Warn-and-fallback, not raise: configs may be built on a CPU-only
    # machine (CI, laptops) to validate recipes or dispatch to remote
    # GPU nodes.  The orchestrator raises OrchardDeviceError if the
    # resolved device disappears at actual training time.
    if requested == "cuda" and not torch.cuda.is_available():
        import warnings

        warnings.warn(
            "CUDA was explicitly requested but is not available. Falling back to CPU.",
            UserWarning,
            stacklevel=2,
        )
        return "cpu"
    if requested == "mps" and not torch.backends.mps.is_available():
        import warnings

        warnings.warn(
            "MPS was explicitly requested but is not available. Falling back to CPU.",
            UserWarning,
            stacklevel=2,
        )
        return "cpu"

    return cast(DeviceType, requested)
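The warn-and-fallback behavior can be exercised without PyTorch by injecting the availability checks. The sketch below is a simplified standalone version, not the library's validator: it omits the 'auto' branch, and the `available` mapping is a hypothetical substitute for `torch.cuda.is_available` and `torch.backends.mps.is_available`.

```python
import warnings
from typing import Callable, Mapping


def resolve_device(
    requested: str, available: Mapping[str, Callable[[], bool]]
) -> str:
    """Return the requested device, or 'cpu' with a warning if unavailable."""
    device = requested.lower()
    check = available.get(device)
    # Devices without a registered check (such as 'cpu') pass through.
    if check is not None and not check():
        warnings.warn(
            f"{device.upper()} was explicitly requested but is not "
            "available. Falling back to CPU.",
            UserWarning,
            stacklevel=2,
        )
        return "cpu"
    return device


# Simulate a CPU-only machine such as a CI runner.
cpu_only = {"cuda": lambda: False, "mps": lambda: False}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = resolve_device("cuda", cpu_only)
```

Here `resolved` ends up as `"cpu"` and `caught` holds a single `UserWarning`, so a config written for a GPU node still validates on a machine without one.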