environment
orchard.core.environment
Environment & Infrastructure Abstraction Layer.
This package centralizes hardware acceleration discovery, system-level optimizations, and reproducibility protocols. It provides a unified interface to ensure consistent execution across Local, HPC, and Docker environments.
DuplicateProcessCleaner(script_name=None)

Scans and optionally terminates duplicate instances of the current script.

Attributes:

| Name | Type | Description |
|---|---|---|
| `script_path` | `str` | Absolute path of the script to match against running processes. |
| `current_pid` | `int` | PID of the current process. |

Source code in orchard/core/environment/guards.py
detect_duplicates()

Detects other Python processes running the same script.

Returns:

| Type | Description |
|---|---|
| `list[Process]` | List of psutil.Process instances representing duplicates. |

Source code in orchard/core/environment/guards.py
terminate_duplicates(logger=None)

Terminates detected duplicate processes.

In distributed mode (torchrun / DDP), termination is skipped entirely because sibling rank processes are intentional, not duplicates.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `logger` | `Logger \| None` | Logger for reporting terminated PIDs. | `None` |

Returns:

| Type | Description |
|---|---|
| `int` | Number of terminated duplicate processes (0 in distributed mode). |

Source code in orchard/core/environment/guards.py
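The matching logic can be sketched without touching live processes by representing each candidate as a (pid, cmdline) pair, mirroring what psutil.Process.cmdline() returns. The function name `find_duplicates` and the pair representation are illustrative, not the actual implementation in guards.py:

```python
import os

def find_duplicates(processes, script_path, current_pid):
    """Return PIDs of other processes running the same script.

    `processes`: iterable of (pid, cmdline) pairs, where cmdline is a
    list of argv strings (the shape psutil.Process.cmdline() yields).
    """
    duplicates = []
    for pid, cmdline in processes:
        if pid == current_pid:
            continue  # never flag the current process itself
        # A duplicate invokes the same absolute script path as an argument.
        if any(os.path.abspath(arg) == script_path for arg in cmdline[1:]):
            duplicates.append(pid)
    return duplicates
```

A torchrun launch would pass this same check for every rank, which is exactly why `terminate_duplicates` skips termination in distributed mode.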
TimeTracker()

Default implementation of TimeTrackerProtocol.

Tracks elapsed time between start() and stop() calls, providing both raw seconds and formatted output.

Source code in orchard/core/environment/timing.py
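A minimal sketch of such a tracker, assuming a monotonic clock and an H:MM:SS formatted output; the method name `formatted()` and the exact format are assumptions, not the documented API:

```python
import time

class TimeTracker:
    """Sketch of a start()/stop() elapsed-time tracker."""

    def __init__(self):
        self._start = None
        self.elapsed = 0.0

    def start(self):
        # perf_counter is monotonic, so wall-clock adjustments can't skew it.
        self._start = time.perf_counter()

    def stop(self):
        self.elapsed = time.perf_counter() - self._start
        return self.elapsed

    def formatted(self):
        # Render whole seconds as H:MM:SS for log output.
        m, s = divmod(int(self.elapsed), 60)
        h, m = divmod(m, 60)
        return f"{h}:{m:02d}:{s:02d}"
```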
TimeTrackerProtocol
get_local_rank()
Return the node-local rank of the current process.
Reads from the LOCAL_RANK environment variable. Used primarily
for per-rank GPU assignment (torch.device(f"cuda:{local_rank}")).
Returns 0 in single-process mode.
Source code in orchard/core/environment/distributed.py
get_rank()
Return the global rank of the current process.
Reads from the RANK environment variable set by torchrun or
torch.distributed.launch. Returns 0 when running outside a
distributed context (single-process default).
Source code in orchard/core/environment/distributed.py
get_world_size()
Return the total number of distributed processes.
Reads from the WORLD_SIZE environment variable. Returns 1
when running outside a distributed context.
Source code in orchard/core/environment/distributed.py
is_distributed()
Detect whether the current process was launched in a distributed context.
Returns True when either RANK or LOCAL_RANK is present in
the environment, indicating a torchrun or equivalent launcher.
Source code in orchard/core/environment/distributed.py
is_main_process()
Check whether the current process is the main (rank 0) process.
Always returns True in single-process mode. In distributed mode,
only the process with RANK=0 returns True.
Source code in orchard/core/environment/distributed.py
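Since all five helpers above reduce to reading the environment variables torchrun exports, they can be sketched together; this mirrors the documented behavior but is not the literal contents of distributed.py:

```python
import os

def get_rank():
    # torchrun exports RANK; default to 0 outside a distributed launch.
    return int(os.environ.get("RANK", 0))

def get_local_rank():
    # LOCAL_RANK drives per-rank GPU assignment, e.g. cuda:{local_rank}.
    return int(os.environ.get("LOCAL_RANK", 0))

def get_world_size():
    return int(os.environ.get("WORLD_SIZE", 1))

def is_distributed():
    # Either variable indicates a torchrun-style launcher.
    return "RANK" in os.environ or "LOCAL_RANK" in os.environ

def is_main_process():
    return get_rank() == 0
```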
ensure_single_instance(lock_file, logger)

Implements a cooperative advisory lock to guarantee singleton execution.

Leverages Unix 'flock' to create an exclusive lock on a sentinel file. If the lock cannot be acquired immediately, it indicates another instance is active, and the process will abort to prevent filesystem or GPU race conditions.

In distributed mode (torchrun / DDP), only the main process (rank 0) acquires the lock. Non-main ranks skip locking entirely to avoid deadlocking against the rank-0 held lock.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `lock_file` | `Path` | Filesystem path where the lock sentinel will reside. | required |
| `logger` | `Logger` | Active logger for reporting acquisition status. | required |

Raises:

| Type | Description |
|---|---|
| `SystemExit` | If an existing lock is detected on the system. |

Source code in orchard/core/environment/guards.py
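The flock pattern can be sketched as follows; this is a simplified version without the logger and rank-0 check, and the exit message is illustrative:

```python
import fcntl
import os
import sys
from pathlib import Path

def ensure_single_instance(lock_file: Path):
    """Acquire a non-blocking exclusive flock on a sentinel file.

    Returns the open file object; the caller must keep it referenced,
    because closing it releases the advisory lock.
    """
    fd = open(lock_file, "w")
    try:
        # LOCK_NB makes the attempt fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fd.close()
        sys.exit("Another instance holds the lock; aborting.")
    fd.write(str(os.getpid()))  # record the owner PID for debugging
    fd.flush()
    return fd
```

Because flock is advisory, every cooperating process must go through this helper; the lock does not stop an uncooperative process from touching the same files.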
release_single_instance(lock_file)

Safely releases the system lock and unlinks the sentinel file.

Guarantees that the file descriptor is closed and the lock is returned to the OS. Designed to be called during normal shutdown or within exception handling blocks.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `lock_file` | `Path` | Filesystem path to the sentinel file to be removed. | required |

Source code in orchard/core/environment/guards.py
apply_cpu_threads(num_workers)

Sets optimal compute threads to avoid resource contention.

Synchronizes PyTorch, OMP, and MKL thread counts.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_workers` | `int` | Active DataLoader workers. | required |

Returns:

| Type | Description |
|---|---|
| `int` | Number of threads assigned to compute operations. |

Source code in orchard/core/environment/hardware.py
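A plausible sketch of the budgeting policy: reserve one core per DataLoader worker and give the rest to the intra-op thread pools. The function name `compute_thread_budget` and the exact split are assumptions; hardware.py may use a different heuristic:

```python
import os

def compute_thread_budget(num_workers: int) -> int:
    """Split logical cores between DataLoader workers and compute threads.

    Each worker subprocess is assumed to occupy one core; the remainder
    goes to the compute thread pools, with a floor of one thread.
    """
    total = os.cpu_count() or 1
    return max(1, total - num_workers)

# The real helper would then propagate the budget to all three pools:
#   torch.set_num_threads(budget)
#   os.environ["OMP_NUM_THREADS"] = str(budget)
#   os.environ["MKL_NUM_THREADS"] = str(budget)
```

Keeping the three counts synchronized matters because PyTorch, OpenMP, and MKL each spawn their own pools; letting all of them default to the full core count oversubscribes the CPU once workers are added.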
configure_system_libraries()
Configures libraries for headless environments and reduces logging noise.
- Sets Matplotlib to 'Agg' backend on Linux/Docker (no GUI)
- Configures font embedding for PDF/PS exports
- Suppresses verbose Matplotlib warnings
Source code in orchard/core/environment/hardware.py
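The three bullets above can be sketched as one function; the `fonttype = 42` rcParams are the standard way to embed TrueType fonts in PDF/PS exports, but the exact settings and warning filter used by hardware.py are assumptions:

```python
import os
import sys
import warnings

def configure_system_libraries():
    """Headless-safe plotting defaults (sketch; Matplotlib optional)."""
    if sys.platform.startswith("linux") and not os.environ.get("DISPLAY"):
        # Select the non-interactive Agg backend before pyplot is imported.
        os.environ.setdefault("MPLBACKEND", "Agg")
    try:
        import matplotlib
        matplotlib.rcParams["pdf.fonttype"] = 42  # embed TrueType fonts
        matplotlib.rcParams["ps.fonttype"] = 42
    except ImportError:
        pass  # plotting is optional in this sketch
    # Silence verbose warnings emitted from matplotlib modules.
    warnings.filterwarnings("ignore", module="matplotlib")
```

Setting MPLBACKEND via the environment (rather than `matplotlib.use`) is deliberate: it takes effect even if pyplot was already imported elsewhere before this helper runs in a fresh process.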
detect_best_device()

Detects the most performant accelerator (CUDA > MPS > CPU).

Returns:

| Type | Description |
|---|---|
| `str` | Device string: 'cuda', 'mps', or 'cpu'. |

Source code in orchard/core/environment/hardware.py
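The CUDA > MPS > CPU priority chain amounts to a short cascade of availability checks; a minimal sketch, with the torch import guarded so the fallback path still works without PyTorch installed:

```python
def detect_best_device() -> str:
    """Prefer CUDA, then Apple MPS, then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no accelerator framework available
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps exists only on builds with MPS support.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```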
get_accelerator_name()
Returns accelerator model name (CUDA GPU or Apple Silicon) or empty string.
Source code in orchard/core/environment/hardware.py
get_num_workers()

Determines optimal DataLoader workers with RAM stability cap.

Returns:

| Type | Description |
|---|---|
| `int` | Recommended number of subprocesses (2-8 range). |

Source code in orchard/core/environment/hardware.py
get_vram_info(device_idx=0)

Retrieves VRAM availability for a CUDA device.

Note: MPS (Apple Silicon) does not expose VRAM info via PyTorch — torch.mps.mem_get_info() does not exist. Returns 'N/A' for non-CUDA devices until Apple provides a public API.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device_idx` | `int` | GPU index to query. | `0` |

Returns:

| Type | Description |
|---|---|
| `str` | Formatted string 'X.XX GB / Y.YY GB' or status message. |

Source code in orchard/core/environment/hardware.py
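On CUDA the numbers come from `torch.cuda.mem_get_info()`, which returns (free, total) in bytes; a sketch of the formatting, with the 'N/A' status messages as illustrative placeholders:

```python
def get_vram_info(device_idx: int = 0) -> str:
    """Report 'free GB / total GB' for a CUDA device, else a status string."""
    try:
        import torch
    except ImportError:
        return "N/A (torch not installed)"
    if not torch.cuda.is_available():
        # MPS has no public equivalent of cuda.mem_get_info().
        return "N/A"
    free, total = torch.cuda.mem_get_info(device_idx)
    gib = 1024 ** 3  # bytes per GiB
    return f"{free / gib:.2f} GB / {total / gib:.2f} GB"
```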
has_mps_backend()
to_device_obj(device_str, local_rank=0)

Converts device string to PyTorch device object.

In distributed multi-GPU setups, uses local_rank to select the correct GPU and calls torch.cuda.set_device() for CUDA affinity.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `device_str` | `str` | 'cuda', 'cpu', or 'auto' (auto-selects best available). | required |
| `local_rank` | `int` | Node-local process rank for GPU assignment. Used to select `cuda:{local_rank}` in multi-GPU setups. | `0` |

Returns:

| Type | Description |
|---|---|
| `device` | torch.device object. |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If CUDA requested but unavailable, or invalid device string. |

Source code in orchard/core/environment/hardware.py
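The real helper returns a `torch.device` and calls `torch.cuda.set_device()`; this sketch isolates just the string-resolution core so it runs without PyTorch. The function name `resolve_device_string` and its `cuda_available` parameter are illustrative:

```python
def resolve_device_string(device_str, cuda_available, local_rank=0):
    """Resolve 'auto'/'cuda'/'cpu' to a final device string like 'cuda:1'.

    Mirrors the documented behavior: 'auto' picks the best available,
    an explicit 'cuda' request fails loudly when CUDA is absent, and
    anything unrecognized is rejected.
    """
    if device_str == "auto":
        device_str = "cuda" if cuda_available else "cpu"
    if device_str == "cuda":
        if not cuda_available:
            raise ValueError("CUDA requested but not available")
        return f"cuda:{local_rank}"  # per-rank GPU assignment
    if device_str == "cpu":
        return "cpu"
    raise ValueError(f"unsupported device string: {device_str!r}")
```

Failing loudly on an explicit 'cuda' request (rather than silently falling back to CPU) is the safer design: a training run that quietly drops to CPU can waste days before anyone notices.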
determine_tta_mode(use_tta, device_type, tta_mode='full')

Reports the active TTA ensemble policy.

The ensemble complexity is driven by the tta_mode config field, not by hardware. This guarantees identical predictions on CPU, CUDA and MPS for the same config, preserving cross-platform determinism.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `use_tta` | `bool` | Whether Test-Time Augmentation is enabled. | required |
| `device_type` | `str` | The type of active device ('cpu', 'cuda', 'mps'). | required |
| `tta_mode` | `str` | Config-driven ensemble complexity ('full' or 'light'). | `'full'` |

Returns:

| Type | Description |
|---|---|
| `str` | Descriptive string of the TTA operation mode. |

Source code in orchard/core/environment/policy.py
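A sketch of the policy shape: the device only appears in the report, never in the decision. The exact return strings below are invented for illustration; policy.py's actual wording is not shown here:

```python
def determine_tta_mode(use_tta, device_type, tta_mode="full"):
    """Report the active TTA policy as a descriptive string."""
    if not use_tta:
        return "disabled"
    # Complexity follows the config, not the hardware, so CPU, CUDA,
    # and MPS produce identical predictions for the same recipe.
    return f"{tta_mode} ensemble on {device_type}"
```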
set_seed(seed, strict=False, warn_only=False)

Seed all PRNGs and optionally enforce deterministic algorithms.

Seeds Python's random, NumPy, and PyTorch (CPU + CUDA + MPS). In strict mode, additionally forces deterministic kernels at the cost of reduced performance.

Note: PYTHONHASHSEED is set here for completeness, but CPython reads it only at interpreter startup — the runtime assignment has no effect on the running process. The project Dockerfile handles this correctly (ENV PYTHONHASHSEED=0). For bare-metal runs, prefix the command: PYTHONHASHSEED=42 orchard run <recipe>. Full bit-exact determinism additionally requires strict=True and num_workers=0 (both enforced automatically in Docker via DOCKER_REPRODUCIBILITY_MODE).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `seed` | `int` | The seed value to set across all PRNGs. | required |
| `strict` | `bool` | If True, enforces deterministic algorithms (5-30% perf penalty). | `False` |
| `warn_only` | `bool` | If True (and strict=True), uses warn-only mode for deterministic algorithm enforcement. | `False` |

Source code in orchard/core/environment/reproducibility.py
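The documented behavior can be sketched as follows, with the NumPy and torch imports guarded so the sketch stands alone; `torch.manual_seed` seeds the CPU, all CUDA devices, and MPS in current PyTorch:

```python
import os
import random

def set_seed(seed: int, strict: bool = False, warn_only: bool = False):
    """Seed all PRNGs; optionally force deterministic torch kernels."""
    # Effective only at interpreter startup; kept for completeness,
    # as the docstring above explains.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)  # seeds CPU, CUDA, and MPS RNGs
        if strict:
            # Deterministic kernels trade 5-30% speed for bit-exact repeats;
            # warn_only downgrades unsupported-op errors to warnings.
            torch.use_deterministic_algorithms(True, warn_only=warn_only)
    except ImportError:
        pass
```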
worker_init_fn(worker_id)

Initialize PRNGs for a DataLoader worker subprocess.

Each worker receives a unique but deterministic sub-seed derived from the parent seed, ensuring augmentation diversity while maintaining reproducibility across runs.

Called automatically by DataLoader when num_workers > 0. In strict reproducibility mode, num_workers is forced to 0 by HardwareConfig, so this function is never invoked.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `worker_id` | `int` | Subprocess ID provided by DataLoader (0-based). | required |
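The sub-seed derivation can be sketched as base seed plus worker ID. The `base_seed` parameter is an assumption for a self-contained example; the real hook would recover the parent seed from the worker's torch RNG state (e.g. `torch.initial_seed()`) rather than take it as an argument:

```python
import random

def worker_init_fn(worker_id: int, base_seed: int = 42):
    """Give each DataLoader worker a unique, deterministic sub-seed."""
    worker_seed = base_seed + worker_id  # distinct per worker, stable per run
    random.seed(worker_seed)
    try:
        import numpy as np
        np.random.seed(worker_seed % 2**32)  # numpy seeds must fit 32 bits
    except ImportError:
        pass
```

Distinct sub-seeds keep augmentations diverse across workers, while deriving them deterministically from one parent seed keeps the whole epoch reproducible.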