core
orchard.core
Core Utilities Package.
This package exposes the essential components for configuration, logging, system management, project constants, and the dynamic dataset registry. It also includes the RootOrchestrator to manage experiment lifecycle initialization.
InfraManagerProtocol
Bases: Protocol
Protocol defining infrastructure management interface.
Enables dependency injection and mocking in tests while ensuring consistent lifecycle management across implementations.
prepare_environment(cfg, logger)
Prepare execution environment before experiment run.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cfg | 'HardwareAwareConfig' | Configuration with hardware manifest access. | required |
| logger | Logger | Logger instance for status reporting. | required |
Source code in orchard/core/config/infrastructure_config.py
release_resources(cfg, logger)
Release resources allocated during environment preparation.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cfg | 'HardwareAwareConfig' | Configuration used during resource allocation. | required |
| logger | Logger | Logger instance for status reporting. | required |
Source code in orchard/core/config/infrastructure_config.py
TimeTracker()
Default implementation of TimeTrackerProtocol.
Tracks elapsed time between start() and stop() calls, providing both raw seconds and formatted output.
Source code in orchard/core/environment/timing.py
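The start()/stop() contract above can be sketched with a small stand-alone class. This is an illustrative stand-in using time.perf_counter; the formatted() helper and its output format are assumptions, not the shipped implementation:

```python
import time


class TimeTrackerSketch:
    """Minimal elapsed-time tracker: start() / stop(), raw seconds plus a formatted string."""

    def __init__(self) -> None:
        self._start: float | None = None
        self.elapsed: float = 0.0

    def start(self) -> None:
        # Monotonic clock: immune to system clock adjustments mid-run.
        self._start = time.perf_counter()

    def stop(self) -> float:
        if self._start is None:
            raise RuntimeError("stop() called before start()")
        self.elapsed = time.perf_counter() - self._start
        return self.elapsed

    def formatted(self) -> str:
        # e.g. 3725.0 seconds -> "1h 02m 05s" (format is an assumption)
        h, rem = divmod(int(self.elapsed), 3600)
        m, s = divmod(rem, 60)
        return f"{h}h {m:02d}m {s:02d}s"
```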
TimeTrackerProtocol
Logger(name=LOGGER_NAME, log_dir=None, log_to_file=True, level=logging.INFO, max_bytes=5 * 1024 * 1024, backup_count=5)
Manages centralized logging configuration with singleton-like behavior.
Provides a unified logging interface for the entire framework with support for dynamic reconfiguration. Initially bootstraps with console-only output, then transitions to dual console+file logging when experiment directories become available.
The logger implements pseudo-singleton semantics via class-level tracking (_configured_names) to prevent duplicate handler registration while allowing intentional reconfiguration when log directories are provided.
Lifecycle
- Bootstrap Phase: Console-only logging (no log_dir specified)
- Orchestration Phase: RootOrchestrator calls setup() with log_dir
- Reconfiguration: Existing handlers removed, file handler added
Class Attributes: _configured_names (dict[str, bool]): Tracks which logger names have been configured
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Logger identifier (typically LOGGER_NAME constant) |
| log_dir | Path \| None | Directory for log file storage |
| log_to_file | bool | Enable file logging (requires log_dir) |
| level | int | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| max_bytes | int | Maximum log file size before rotation (default: 5MB) |
| backup_count | int | Number of rotated log files to retain (default: 5) |
| _log | Logger | Underlying Python logger instance |
Example

Bootstrap phase (console-only):

    >>> logger = Logger().get_logger()
    >>> logger.info("Framework initializing...")

Orchestration phase (add file logging):

    >>> logger = Logger.setup(
    ...     name=LOGGER_NAME,
    ...     log_dir=Path("./outputs/run_123/logs"),
    ...     level="INFO"
    ... )
    >>> logger.info("Logging to file now")
Notes:
- Reconfiguration is idempotent: calling setup() multiple times is safe
- All handlers are properly closed before reconfiguration
- Log files use UTC timestamps for consistency across time zones
- RotatingFileHandler prevents disk space exhaustion
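The class-level _configured_names guard described above can be sketched with stdlib logging. SafeLogger, its handler choices, and the reconfigure() helper are illustrative names, not the actual Logger API:

```python
import logging


class SafeLogger:
    """Sketch of the pseudo-singleton pattern: class-level tracking prevents
    duplicate handler registration while still allowing reconfiguration."""

    _configured_names: dict[str, bool] = {}

    def __init__(self, name: str = "OrchardML", level: int = logging.INFO) -> None:
        self._log = logging.getLogger(name)
        self._log.setLevel(level)
        if not SafeLogger._configured_names.get(name):
            # First construction for this name: install a console handler once.
            self._log.addHandler(logging.StreamHandler())
            SafeLogger._configured_names[name] = True

    @classmethod
    def reconfigure(cls, name: str, handler: logging.Handler) -> logging.Logger:
        log = logging.getLogger(name)
        # Close and remove existing handlers before installing the new one,
        # mirroring the "all handlers are properly closed" note above.
        for h in list(log.handlers):
            log.removeHandler(h)
            h.close()
        log.addHandler(handler)
        cls._configured_names[name] = True
        return log

    def get_logger(self) -> logging.Logger:
        return self._log
```

Constructing the wrapper twice for the same name yields the same underlying logger with a single handler, which is the duplicate-registration guarantee the docstring describes.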
Initializes the Logger with specified configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Logger identifier (default: LOGGER_NAME constant) | LOGGER_NAME |
| log_dir | Path \| None | Directory for log file storage (None = console-only) | None |
| log_to_file | bool | Enable file logging if log_dir provided (default: True) | True |
| level | int | Logging level as integer constant (default: logging.INFO) | INFO |
| max_bytes | int | Maximum log file size before rotation in bytes (default: 5MB) | 5 * 1024 * 1024 |
| backup_count | int | Number of rotated backup files to retain (default: 5) | 5 |
Source code in orchard/core/logger/logger.py
get_logger()
Returns the configured logging.Logger instance.

Returns:

| Type | Description |
|---|---|
| Logger | The underlying Python logging.Logger instance with configured handlers |
setup(name, log_dir=None, level='INFO', **kwargs)
classmethod
Main entry point for configuring the logger, called by RootOrchestrator.
Bridges semantic LogLevel strings (INFO, DEBUG, WARNING) to Python logging constants. Provides convenient string-based level specification while internally using numeric logging constants.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Logger identifier (typically LOGGER_NAME constant) | required |
| log_dir | Path \| None | Directory for log file storage (None = console-only mode) | None |
| level | str | Logging level as string (DEBUG, INFO, WARNING, ERROR, CRITICAL) | 'INFO' |
| **kwargs | Any | Additional arguments passed to Logger constructor | {} |

Returns:

| Type | Description |
|---|---|
| Logger | Configured logging.Logger instance ready for use |
Environment Variables
DEBUG: If set to "1", overrides level to DEBUG regardless of level parameter
Example
    >>> logger = Logger.setup(
    ...     name="OrchardML",
    ...     log_dir=Path("./outputs/run_123/logs"),
    ...     level="INFO"
    ... )
    >>> logger.info("Training started")
Source code in orchard/core/logger/logger.py
LogStyle
Unified logging style constants for consistent visual hierarchy.

Provides separators, symbols, indentation, and ANSI color codes used by all logging modules. Placed here (in paths.constants) rather than in logger.styles so that low-level packages (environment, config) can reference the constants without triggering circular imports.
Reporter
Bases: BaseModel
Centralized logging and reporting utility for experiment lifecycle events.
Transforms complex configuration states and hardware objects into human-readable logs. Called by Orchestrator during initialization.
log_phase_header(log, title, style=None)
staticmethod
Log a centered phase header with separator lines.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| log | Logger | Logger instance to write to. | required |
| title | str | Header text (will be uppercased and centered). | required |
| style | str \| None | Separator string (defaults to …). | None |
Source code in orchard/core/logger/env_reporter.py
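A minimal sketch of what a centered phase header amounts to; the width and separator character here are assumptions (the real defaults live in LogStyle):

```python
def phase_header_lines(title: str, width: int = 60, sep: str = "=") -> list[str]:
    """Illustrative version of log_phase_header: uppercase the title,
    center it, and wrap it between separator lines."""
    bar = sep * width
    return [bar, title.upper().center(width), bar]
```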
log_initial_status(logger_instance, cfg, paths, device, applied_threads, num_workers)
Logs verified baseline environment configuration upon initialization.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| logger_instance | Logger | Active experiment logger | required |
| cfg | 'Config' | Validated global configuration manifest | required |
| paths | 'RunPaths' | Dynamic path orchestrator for current session | required |
| device | 'torch.device' | Resolved PyTorch compute device | required |
| applied_threads | int | Number of intra-op threads assigned | required |
| num_workers | int | Number of DataLoader workers | required |
Source code in orchard/core/logger/env_reporter.py
DatasetMetadata
Bases: BaseModel
Immutable metadata container for a dataset entry.
Ensures dataset-specific constants are grouped and frozen throughout pipeline execution. Serves as static definition feeding into dynamic DatasetConfig.
Attributes:

| Name | Type | Description |
|---|---|---|
| name | str | Short identifier (e.g., …). |
| display_name | str | Human-readable name for reporting. |
| md5_checksum | str | MD5 hash for download integrity verification. |
| url | str | Source URL for dataset download. |
| path | Path | Local path to the … |
| classes | list[str] | Class labels in index order. |
| in_channels | int | Number of image channels (1=grayscale, 3=RGB). |
| native_resolution | int \| None | Native pixel resolution (e.g., 28, 224). |
| mean | tuple[float, ...] | Channel-wise normalization mean. |
| std | tuple[float, ...] | Channel-wise normalization standard deviation. |
| is_anatomical | bool | Whether images have fixed anatomical orientation. |
| is_texture_based | bool | Whether classification relies on texture patterns. |
DatasetRegistryWrapper
Bases: BaseModel
Pydantic wrapper for multi-domain dataset registries.
Merges domain-specific registries (medical, space) based on the selected resolution and provides validated, deep-copied access to dataset metadata entries.
Attributes:

| Name | Type | Description |
|---|---|---|
| resolution | int | Target dataset resolution (28, 32, 64, 128, or 224). |
| registry | dict[str, DatasetMetadata] | Deep-copied metadata registry for the selected resolution. |
get_dataset(name)
Retrieves specific DatasetMetadata by name.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Dataset identifier | required |

Returns:

| Type | Description |
|---|---|
| DatasetMetadata | Deep copy of DatasetMetadata |

Raises:

| Type | Description |
|---|---|
| KeyError | If dataset not found in registry |
Source code in orchard/core/metadata/wrapper.py
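The validated, deep-copied lookup can be sketched as follows. Meta and RegistrySketch are illustrative stand-ins for DatasetMetadata and DatasetRegistryWrapper, not the Pydantic models themselves:

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Meta:
    """Stand-in for a DatasetMetadata entry."""
    name: str
    classes: list[str] = field(default_factory=list)


class RegistrySketch:
    """Sketch of get_dataset: validated lookup that returns a deep copy
    so callers cannot mutate the shared registry."""

    def __init__(self, registry: dict[str, Meta]) -> None:
        self._registry = registry

    def get_dataset(self, name: str) -> Meta:
        if name not in self._registry:
            raise KeyError(f"Dataset '{name}' not found in registry")
        # Deep copy: downstream mutation never leaks back into the registry.
        return deepcopy(self._registry[name])
```

The deep copy is the key design choice: it preserves the "immutable metadata" guarantee even when a caller mutates the returned entry.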
RootOrchestrator(cfg, infra_manager=None, reporter=None, time_tracker=None, audit_saver=None, log_initializer=None, seed_setter=None, thread_applier=None, system_configurator=None, static_dir_setup=None, device_resolver=None, rank=None, local_rank=None)
Central coordinator for ML experiment lifecycle management.
Orchestrates the complete initialization sequence from configuration validation through resource provisioning to execution readiness. Implements a 7-phase initialization protocol (phases 1-6 eager, phase 7 deferred) with dependency injection for maximum testability.
The orchestrator follows the Single Responsibility Principle by delegating specialized tasks to injected dependencies while maintaining overall coordination. Uses the Context Manager pattern to guarantee resource cleanup even during failures.
Initialization Phases:
- Determinism: Global RNG seeding (Python, NumPy, PyTorch)
- Runtime Configuration: CPU thread affinity, system libraries
- Filesystem Provisioning: Dynamic workspace creation via RunPaths
- Logging Initialization: File-based persistent logging setup
- Config Persistence: YAML manifest export for auditability
- Infrastructure Guarding: OS-level resource locks (prevents race conditions)
- Environment Reporting: Comprehensive telemetry logging
Dependency Injection:
All external dependencies are injectable with sensible defaults:
- infra_manager: OS resource management (locks, cleanup)
- reporter: Environment telemetry engine
- log_initializer: Logging setup strategy
- seed_setter: RNG seeding function
- thread_applier: CPU thread configuration
- system_configurator: System library setup (matplotlib, etc)
- static_dir_setup: Static directory creation
- audit_saver: Config YAML + requirements snapshot persistence
- device_resolver: Hardware device detection
Attributes:

| Name | Type | Description |
|---|---|---|
| cfg | Config | Validated global configuration (Single Source of Truth) |
| rank | int | Global rank of this process (0 in single-process mode) |
| local_rank | int | Node-local rank for GPU assignment (0 in single-process mode) |
| is_main_process | bool | True for rank 0, False for non-main ranks |
| infra | InfraManagerProtocol | Infrastructure resource manager |
| reporter | ReporterProtocol | Environment telemetry engine |
| time_tracker | TimeTrackerProtocol | Pipeline duration tracker |
| paths | RunPaths \| None | Session-specific directory structure (None on non-main ranks) |
| run_logger | Logger \| None | Active logger instance (None on non-main ranks) |
| repro_mode | bool | Strict determinism flag |
| warn_only_mode | bool | Warn-only mode for strict determinism |
| num_workers | int | DataLoader worker processes |
Example
    >>> cfg = Config.from_recipe(Path("recipes/config_mini_cnn.yaml"))
    >>> with RootOrchestrator(cfg) as orch:
    ...     device = orch.get_device()
    ...     logger = orch.run_logger
    ...     paths = orch.paths
    ...     # Execute training pipeline with guaranteed cleanup
Notes:
- Thread-safe: Single-instance locking via InfrastructureManager
- Idempotent: initialize_core_services() is safe to call multiple times (subsequent calls return cached RunPaths without re-executing phases)
- Auditable: All configuration saved to YAML in workspace
- Deterministic: Reproducible experiments via strict seeding
Initializes orchestrator with dependency injection.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cfg | 'Config' | Validated global configuration (SSOT) | required |
| infra_manager | InfraManagerProtocol \| None | Infrastructure management handler (default: InfrastructureManager()) | None |
| reporter | ReporterProtocol \| None | Environment reporting engine (default: Reporter()) | None |
| time_tracker | TimeTrackerProtocol \| None | Pipeline duration tracker (default: TimeTracker()) | None |
| audit_saver | AuditSaverProtocol \| None | Run-manifest persistence — config YAML + dependency snapshot (default: AuditSaver()) | None |
| log_initializer | Callable[..., Any] \| None | Logging setup function (default: Logger.setup) | None |
| seed_setter | Callable[..., Any] \| None | RNG seeding function (default: set_seed) | None |
| thread_applier | Callable[..., Any] \| None | CPU thread configuration (default: apply_cpu_threads) | None |
| system_configurator | Callable[..., Any] \| None | System library setup (default: configure_system_libraries) | None |
| static_dir_setup | Callable[..., Any] \| None | Static directory creation (default: setup_static_directories) | None |
| device_resolver | Callable[..., Any] \| None | Device resolution (default: to_device_obj) | None |
| rank | int \| None | Global rank of this process (default: auto-detected from RANK env var). Rank 0 executes all phases; rank N skips filesystem, logging, config persistence, infrastructure locking, and reporting. | None |
| local_rank | int \| None | Node-local rank for GPU assignment (default: auto-detected from LOCAL_RANK env var). Used by device_resolver to select the correct GPU in multi-GPU distributed setups. | None |
Source code in orchard/core/orchestrator.py
__enter__()
Context Manager entry — triggers the initialization sequence.
Starts the pipeline timer and delegates to initialize_core_services() for phases 1-6 (seeding, runtime config, filesystem, logging, config persistence, infrastructure locking, and device resolution). Phase 7 (environment reporting) is deferred to log_environment_report().
If any phase raises (including KeyboardInterrupt / SystemExit), cleanup() is called before re-raising to ensure partial resources (locks, file handles) are released even on failure.
Returns:

| Type | Description |
|---|---|
| 'RootOrchestrator' | Fully initialized RootOrchestrator ready for pipeline execution. |

Raises:

| Type | Description |
|---|---|
| BaseException | Re-raises any initialization error after cleanup. |
Source code in orchard/core/orchestrator.py
__exit__(exc_type, exc_val, exc_tb)
Context Manager exit — stops timer and guarantees resource teardown.

Invoked automatically when leaving the with block, whether the pipeline completed normally or raised an exception. Stops the timer, then delegates to cleanup() for infrastructure lock release and logging handler closure.

Error reporting is intentionally left to the caller (CLI layer), which has the user-facing context to log appropriate messages.

Returns False so that any exception propagates to the caller unchanged.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| exc_type | type[BaseException] \| None | Exception class if the block raised, else None. | required |
| exc_val | BaseException \| None | Exception instance if the block raised, else None. | required |
| exc_tb | TracebackType \| None | Traceback object if the block raised, else None. | required |

Returns:

| Type | Description |
|---|---|
| Literal[False] | Always False — exceptions are never suppressed. |
Source code in orchard/core/orchestrator.py
initialize_core_services()
Executes linear sequence of environment initialization phases.

Synchronizes global state through phases 1-6, progressing from deterministic seeding to device resolution. Phase 7 (environment reporting) is deferred to log_environment_report().

In distributed mode (torchrun / DDP), only the main process (rank 0) executes phases 3-6 (filesystem, logging, config persistence, infra locking). All ranks execute phases 1-2 (seeding, threads) for identical RNG state and thread affinity, plus device resolution for DDP readiness (each rank binds to cuda:{local_rank}).

Idempotent: guarded by _initialized flag. If already initialized, returns existing RunPaths without re-executing any phase. This prevents orphaned directories (Phase 3 creates unique paths per call) and resource leaks (Phase 6 acquires filesystem locks).
Returns:

| Type | Description |
|---|---|
| RunPaths \| None | Provisioned directory structure for rank 0, None for non-main ranks. |

Raises:

| Type | Description |
|---|---|
| RuntimeError | If called after cleanup (single-use guard). |
| OrchardDeviceError | If device resolution fails at runtime. |

Source code in orchard/core/orchestrator.py
log_environment_report()
Emit the environment initialization report (phase 7).
Designed to be called explicitly by the CLI app after external services (e.g. MLflow tracker) have been started, so that all enter/exit log messages appear in the correct chronological order.
Source code in orchard/core/orchestrator.py
cleanup()
Releases system resources and removes execution lock file.
Guarantees clean state for subsequent runs by unlinking InfrastructureManager guards and closing logging handlers. Non-main ranks skip resource release (they never acquired locks or opened file-based log handlers).
Source code in orchard/core/orchestrator.py
get_device()
Resolves and caches optimal computation device (CUDA/CPU/MPS).

Returns:

| Type | Description |
|---|---|
| device | PyTorch device object for model execution |
Source code in orchard/core/orchestrator.py
RunPaths
Bases: BaseModel
Immutable container for experiment-specific directory paths.
Implements atomic run isolation using a deterministic hashing strategy that combines DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH to create unique, collision-free directory structures. The Pydantic frozen model ensures paths cannot be modified after creation.
Attributes:

| Name | Type | Description |
|---|---|---|
| run_id | str | Unique identifier in format YYYYMMDD_dataset_model_hash. |
| dataset_slug | str | Normalized lowercase dataset name. |
| architecture_slug | str | Sanitized alphanumeric architecture identifier. |
| root | Path | Base directory for all run artifacts. |
| figures | Path | Directory for plots, confusion matrices, ROC curves. |
| checkpoints | Path | Directory for saved checkpoints (.pth files). |
| reports | Path | Directory for config mirrors, CSV/XLSX summaries. |
| logs | Path | Directory for training logs and session output. |
| database | Path | Directory for SQLite optimization studies. |
| exports | Path | Directory for production exports (ONNX). |
Example

Directory structure created:

    outputs/20260208_organcmnist_efficientnetb0_a3f7c2/
    ├── figures/
    ├── checkpoints/
    ├── reports/
    ├── logs/
    ├── database/
    └── exports/
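The DATE + DATASET_SLUG + MODEL_SLUG + CONFIG_HASH strategy can be sketched as below; the hash algorithm, hash length, and slug rules here are assumptions rather than the exact run_paths.py implementation:

```python
import hashlib
import json
import re
from datetime import datetime, timezone


def make_run_id(dataset: str, architecture: str, training_cfg: dict) -> str:
    """Sketch of the deterministic run-id scheme: YYYYMMDD_dataset_model_hash."""
    dataset_slug = dataset.lower()
    # Strip special characters from the architecture name (e.g. 'EfficientNet-B0').
    arch_slug = re.sub(r"[^a-z0-9]", "", architecture.lower())
    # Stable serialization: identical configs always hash identically,
    # regardless of dict key order.
    cfg_hash = hashlib.sha1(
        json.dumps(training_cfg, sort_keys=True, default=str).encode()
    ).hexdigest()[:6]
    date = datetime.now(timezone.utc).strftime("%Y%m%d")
    return f"{date}_{dataset_slug}_{arch_slug}_{cfg_hash}"
```

Because the config hash is derived from a key-sorted serialization, re-running the same recipe maps to the same directory name, which is what makes the run isolation collision-free yet reproducible.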
best_model_path
property
Path for the best-performing model checkpoint.

Returns:

| Type | Description |
|---|---|
| Path | Path in format: checkpoints/best_{architecture_slug}.pth |
final_report_path
property
Path for the comprehensive experiment summary report.

Returns:

| Type | Description |
|---|---|
| Path | Path to reports/training_summary.xlsx |
create(dataset_slug, architecture_name, training_cfg, base_dir=None)
classmethod
Factory method to create and initialize a unique run environment.
Creates a new RunPaths instance with a deterministic unique ID based on dataset, model, and training configuration. Physically creates all subdirectories on the filesystem.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dataset_slug | str | Dataset identifier (e.g., 'organcmnist'). Will be normalized to lowercase. | required |
| architecture_name | str | Human-readable model name (e.g., 'EfficientNet-B0'). Special characters are stripped, converted to lowercase. | required |
| training_cfg | dict[str, Any] | Dictionary of hyperparameters used for hash generation. Supports nested dicts, but only hashable primitives (int, float, str, bool, list) contribute to the hash. | required |
| base_dir | Path \| None | Custom base directory for outputs. Defaults to OUTPUTS_ROOT (typically './outputs'). | None |

Returns:

| Type | Description |
|---|---|
| 'RunPaths' | Fully initialized RunPaths instance with all directories created. |

Raises:

| Type | Description |
|---|---|
| ValueError | If dataset_slug or architecture_name is not a string. |
Example
    >>> paths = RunPaths.create(
    ...     dataset_slug="OrganCMNIST",
    ...     architecture_name="EfficientNet-B0",
    ...     training_cfg={"batch_size": 32, "lr": 0.001}
    ... )
    >>> paths.dataset_slug
    'organcmnist'
    >>> paths.architecture_slug
    'efficientnetb0'
Source code in orchard/core/paths/run_paths.py
get_fig_path(filename)
Generate path for a visualization artifact.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| filename | str | Name of the figure file (e.g., 'confusion_matrix.png'). | required |

Returns:

| Type | Description |
|---|---|
| Path | Absolute path within the figures directory. |
Source code in orchard/core/paths/run_paths.py
get_config_path()
Get path for the archived run configuration.

Returns:

| Type | Description |
|---|---|
| Path | Path to reports/config.yaml |
get_db_path()
Get path for Optuna SQLite study database.

The database directory is created during RunPaths initialization, ensuring the parent directory exists before Optuna writes to it.

Returns:

| Type | Description |
|---|---|
| Path | Path to database/study.db |
Source code in orchard/core/paths/run_paths.py
apply_cpu_threads(num_workers)
Sets optimal compute threads to avoid resource contention.

Synchronizes PyTorch, OMP, and MKL thread counts.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| num_workers | int | Active DataLoader workers | required |

Returns:

| Type | Description |
|---|---|
| int | Number of threads assigned to compute operations |
Source code in orchard/core/environment/hardware.py
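One plausible way to partition cores between loader workers and compute threads; the exact heuristic in hardware.py may differ, so treat this as a sketch of the idea, not the shipped logic:

```python
import os


def apply_cpu_threads_sketch(num_workers: int) -> int:
    """Illustrative core partitioning: reserve one core per DataLoader
    worker, give the remainder to compute, never drop below one thread."""
    total = os.cpu_count() or 1
    compute = max(1, total - num_workers)
    # Synchronize the thread-count knobs so OMP/MKL agree with the budget
    # (the real code would also call torch.set_num_threads(compute)).
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(compute)
    return compute
```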
configure_system_libraries()
Configures libraries for headless environments and reduces logging noise.
- Sets Matplotlib to 'Agg' backend on Linux/Docker (no GUI)
- Configures font embedding for PDF/PS exports
- Suppresses verbose Matplotlib warnings
Source code in orchard/core/environment/hardware.py
detect_best_device()
Detects the most performant accelerator (CUDA > MPS > CPU).

Returns:

| Type | Description |
|---|---|
| str | Device string: 'cuda', 'mps', or 'cpu' |
Source code in orchard/core/environment/hardware.py
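The CUDA > MPS > CPU priority reduces to a straightforward cascade. Availability flags are passed in here for testability, whereas the real function queries torch directly:

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Priority cascade from detect_best_device: CUDA > MPS > CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"
```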
determine_tta_mode(use_tta, device_type, tta_mode='full')
Reports the active TTA ensemble policy.

The ensemble complexity is driven by the tta_mode config field, not by hardware. This guarantees identical predictions on CPU, CUDA and MPS for the same config, preserving cross-platform determinism.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| use_tta | bool | Whether Test-Time Augmentation is enabled. | required |
| device_type | str | The type of active device ('cpu', 'cuda', 'mps'). | required |
| tta_mode | str | Config-driven ensemble complexity ('full' or 'light'). | 'full' |

Returns:

| Type | Description |
|---|---|
| str | Descriptive string of the TTA operation mode. |
Source code in orchard/core/environment/policy.py
ensure_single_instance(lock_file, logger)
Implements a cooperative advisory lock to guarantee singleton execution.

Leverages Unix 'flock' to create an exclusive lock on a sentinel file. If the lock cannot be acquired immediately, it indicates another instance is active, and the process will abort to prevent filesystem or GPU race conditions.

In distributed mode (torchrun / DDP), only the main process (rank 0) acquires the lock. Non-main ranks skip locking entirely to avoid deadlocking against the rank-0 held lock.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lock_file | Path | Filesystem path where the lock sentinel will reside. | required |
| logger | Logger | Active logger for reporting acquisition status. | required |

Raises:

| Type | Description |
|---|---|
| SystemExit | If an existing lock is detected on the system. |
Source code in orchard/core/environment/guards.py
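The advisory-lock pattern can be sketched with fcntl.flock (Unix-only); the error message and return convention here are illustrative, not the project's exact behavior:

```python
import fcntl
from pathlib import Path


def acquire_single_instance_lock(lock_file: Path):
    """Sketch of the cooperative advisory lock: a non-blocking exclusive
    flock on a sentinel file fails immediately if another holder exists."""
    fd = open(lock_file, "w")
    try:
        # LOCK_NB: fail fast instead of blocking behind another instance.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fd.close()
        raise SystemExit("Another instance is already running")
    # Keep the handle open for the lifetime of the process; closing it
    # (or process exit) releases the lock back to the OS.
    return fd
```

Because flock is advisory and tied to the open file description, a crashed process releases the lock automatically on exit, which is what makes this safer than a bare "does the file exist" check.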
get_accelerator_name()
Returns accelerator model name (CUDA GPU or Apple Silicon) or empty string.
Source code in orchard/core/environment/hardware.py
get_num_workers()
Determines optimal DataLoader workers with RAM stability cap.

Returns:

| Type | Description |
|---|---|
| int | Recommended number of subprocesses (2-8 range) |
Source code in orchard/core/environment/hardware.py
has_mps_backend()
release_single_instance(lock_file)
Safely releases the system lock and unlinks the sentinel file.

Guarantees that the file descriptor is closed and the lock is returned to the OS. Designed to be called during normal shutdown or within exception handling blocks.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| lock_file | Path | Filesystem path to the sentinel file to be removed. | required |
Source code in orchard/core/environment/guards.py
set_seed(seed, strict=False, warn_only=False)
Seed all PRNGs and optionally enforce deterministic algorithms.

Seeds Python's random, NumPy, and PyTorch (CPU + CUDA + MPS). In strict mode, additionally forces deterministic kernels at the cost of reduced performance.

Note:

PYTHONHASHSEED is set here for completeness, but CPython reads it only at interpreter startup — the runtime assignment has no effect on the running process. The project Dockerfile handles this correctly (ENV PYTHONHASHSEED=0). For bare-metal runs, prefix the command: PYTHONHASHSEED=42 orchard run <recipe>. Full bit-exact determinism additionally requires strict=True and num_workers=0 (both enforced automatically in Docker via DOCKER_REPRODUCIBILITY_MODE).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| seed | int | The seed value to set across all PRNGs. | required |
| strict | bool | If True, enforces deterministic algorithms (5-30% perf penalty). | False |
| warn_only | bool | If True (and strict=True), uses warn-only mode for … | False |
Source code in orchard/core/environment/reproducibility.py
to_device_obj(device_str, local_rank=0)
Converts device string to PyTorch device object.

In distributed multi-GPU setups, uses local_rank to select the correct GPU and calls torch.cuda.set_device() for CUDA affinity.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| device_str | str | 'cuda', 'cpu', or 'auto' (auto-selects best available) | required |
| local_rank | int | Node-local process rank for GPU assignment (default 0). Used to select cuda:{local_rank}. | 0 |

Returns:

| Type | Description |
|---|---|
| device | torch.device object |

Raises:

| Type | Description |
|---|---|
| ValueError | If CUDA requested but unavailable, or invalid device string |
Source code in orchard/core/environment/hardware.py
worker_init_fn(worker_id)
Initialize PRNGs for a DataLoader worker subprocess.

Each worker receives a unique but deterministic sub-seed derived from the parent seed, ensuring augmentation diversity while maintaining reproducibility across runs.

Called automatically by DataLoader when num_workers > 0. In strict reproducibility mode, num_workers is forced to 0 by HardwareConfig, so this function is never invoked.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| worker_id | int | Subprocess ID provided by DataLoader (0-based). | required |
Source code in orchard/core/environment/reproducibility.py
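The sub-seed derivation can be sketched as follows; the real function also reseeds NumPy and PyTorch, and the exact derivation formula here is an assumption:

```python
import random


def worker_init_sketch(worker_id: int, base_seed: int = 42) -> int:
    """Illustrative per-worker seeding: unique per worker (augmentation
    diversity), deterministic across runs (reproducibility)."""
    # Keep the derived seed inside the 32-bit range most RNGs expect.
    worker_seed = (base_seed + worker_id) % 2**32
    random.seed(worker_seed)
    return worker_seed
```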
load_config_from_yaml(yaml_path)
Loads a raw configuration dictionary from a YAML file.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| yaml_path | Path | Path to the source YAML file. | required |

Returns:

| Type | Description |
|---|---|
| dict[str, Any] | The loaded configuration manifest. |

Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the specified path does not exist. |
Source code in orchard/core/io/serialization.py
load_model_weights(model, path, device)
Restores model state from a checkpoint using secure weight-only loading.

Loads PyTorch state_dict from disk with security hardening (weights_only=True) to prevent arbitrary code execution. Automatically maps tensors to target device.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model instance to populate with loaded weights | required |
| path | Path | Filesystem path to the checkpoint file (.pth) | required |
| device | device | Target device for mapping the loaded tensors | required |

Raises:

| Type | Description |
|---|---|
| OrchardExportError | If the checkpoint file does not exist at path |
Example
    >>> model = get_model(device, dataset_cfg=cfg.dataset, arch_cfg=cfg.architecture)
    >>> checkpoint_path = Path("outputs/run_123/checkpoints/best_model.pth")
    >>> load_model_weights(model, checkpoint_path, device)
Source code in orchard/core/io/checkpoints.py
md5_checksum(path, chunk_size=_MD5_CHUNK_SIZE)
Calculates the MD5 checksum of a file using buffered reading.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to the file to verify. | required |
| chunk_size | int | Read buffer size in bytes. | _MD5_CHUNK_SIZE |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | The calculated hexadecimal MD5 hash. |
Source code in orchard/core/io/data_io.py
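A buffered-read MD5 in the spirit of the function above; the chunk-size value here is an assumed placeholder, not the project's _MD5_CHUNK_SIZE constant:

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed buffer size; the real constant lives in data_io.py


def md5_checksum_sketch(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Buffered MD5: constant memory use even for multi-GB dataset archives."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter(callable, sentinel) yields chunks until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```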
save_config_as_yaml(data, yaml_path)
Serializes and persists configuration data to a YAML file.

This function coordinates the extraction of data from potentially complex objects (supporting Pydantic models, custom portable manifests, or raw dicts), applies recursive sanitization, and performs an atomic write to disk.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | The configuration object to save. Supports objects with 'dump_portable()' or 'model_dump()' methods, or standard dictionaries. | required |
| yaml_path | Path | The destination filesystem path. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Path | Path | The confirmed path where the YAML was successfully written. |

Raises:

| Type | Description |
|---|---|
| ValueError | If the data structure cannot be serialized. |
| OSError | If a filesystem-level error occurs (permissions, disk full). |
Source code in orchard/core/io/serialization.py
validate_npz_keys(data)
Validates that the loaded NPZ dataset contains all required dataset keys.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | NpzFile | The loaded NPZ file object. | required |

Raises:

| Type | Description |
|---|---|
| OrchardDatasetError | If any required key (images/labels) is missing. |
Source code in orchard/core/io/data_io.py
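The fail-fast key check can be sketched against any mapping. DatasetError and the REQUIRED_KEYS tuple below are illustrative stand-ins for OrchardDatasetError and the project's actual key list:

```python
class DatasetError(Exception):
    """Stand-in for OrchardDatasetError."""


REQUIRED_KEYS = ("images", "labels")  # illustrative; the real key names may differ


def validate_npz_keys_sketch(data) -> None:
    """Fail fast with a descriptive error instead of a bare KeyError deep in
    the pipeline. Works with NpzFile or any mapping exposing the archive keys."""
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise DatasetError(f"NPZ archive missing required keys: {missing}")
```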
log_optimization_header(cfg, logger_instance=None)
Log Optuna optimization configuration details.

Logs search-specific parameters only (dataset/model already shown in environment).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cfg | 'Config' | Configuration with optuna settings | required |
| logger_instance | Logger \| None | Logger instance to use (defaults to module logger) | None |
Source code in orchard/core/logger/progress.py
log_optimization_summary(study, cfg, device, paths, logger_instance=None)
Log optimization study completion summary.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| study | 'optuna.Study' | Completed Optuna study | required |
| cfg | 'Config' | Configuration object | required |
| device | 'torch.device' | PyTorch device used | required |
| paths | 'RunPaths' | Run paths for artifacts | required |
| logger_instance | Logger \| None | Logger instance to use (defaults to module logger) | None |
Source code in orchard/core/logger/progress.py
log_pipeline_summary(test_acc, macro_f1, best_model_path, run_dir, duration, test_auc=None, onnx_path=None, logger_instance=None)
Log final pipeline completion summary.

Called at the end of the pipeline after all phases complete. Consolidates key metrics and artifact locations.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| test_acc | float | Final test accuracy | required |
| macro_f1 | float | Final macro F1 score | required |
| best_model_path | Path | Path to best model checkpoint | required |
| run_dir | Path | Root directory for this run | required |
| duration | str | Human-readable duration string | required |
| test_auc | float \| None | Final test AUC (if available) | None |
| onnx_path | Path \| None | Path to ONNX export (if performed) | None |
| logger_instance | Logger \| None | Logger instance to use (defaults to module logger) | None |
Source code in orchard/core/logger/progress.py
log_trial_start(trial_number, params, logger_instance=None)
Log trial start with formatted parameters (grouped by category).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| trial_number | int | Trial index | required |
| params | dict[str, Any] | Sampled hyperparameters | required |
| logger_instance | Logger \| None | Logger instance to use (defaults to module logger) | None |
Source code in orchard/core/logger/progress.py
get_project_root()
Dynamically locate the project root by searching for anchor files.

Traverses upward from current file's directory until finding a marker file (.git or pyproject.toml). Supports Docker environments via IN_DOCKER environment variable override.

Returns:

| Type | Description |
|---|---|
| Path | Resolved absolute Path to the project root directory. |
Note:
- IN_DOCKER=1 or IN_DOCKER=TRUE returns /app
- Falls back to fixed parent traversal if no markers found
Source code in orchard/core/paths/root.py
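The upward marker search can be sketched as a walk over Path.parents; the IN_DOCKER override and the fixed-depth fallback described above are omitted here for brevity:

```python
from pathlib import Path


def find_project_root(
    start: Path, markers: tuple[str, ...] = (".git", "pyproject.toml")
) -> Path:
    """Walk upward from `start` until a directory containing a marker file
    is found; fall back to the starting directory if none exists."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if any((candidate / m).exists() for m in markers):
            return candidate
    return current  # no marker found anywhere up the tree
```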
setup_static_directories()
Ensure core project directories exist at startup.
Creates DATASET_DIR and OUTPUTS_ROOT if they do not exist, preventing runtime errors during data fetching or artifact creation. Uses mkdir(parents=True, exist_ok=True) for idempotent operation.