reproducibility
orchard.core.environment.reproducibility
Reproducibility Environment.
Ensures deterministic behavior across Python, NumPy, and PyTorch by centralizing RNG seeding, DataLoader worker initialization, and strict algorithmic determinism enforcement.
Three reproducibility levels are supported:
- Standard (strict=False): Seeds all PRNGs and disables cuDNN auto-tuner. Sufficient for most experiments: results are reproducible across runs on the same hardware, but non-deterministic kernels (e.g. atomicAdd in cuBLAS) may cause minor floating-point variations.
- Strict (strict=True): Enables torch.use_deterministic_algorithms(True) on all backends (CUDA, MPS, CPU) and configures CUBLAS_WORKSPACE_CONFIG when CUDA is available. Forces num_workers=0 via HardwareConfig to eliminate multiprocessing non-determinism. Incurs a 5-30% performance penalty on GPU workloads.
- Strict warn-only (strict=True, warn_only=True): Same as strict, but non-deterministic operations emit warnings instead of raising errors. Useful for discovering which operations lack deterministic kernels without crashing the experiment.
Strict mode is controlled by HardwareConfig.use_deterministic_algorithms,
resolved from the recipe YAML or direct Config construction.
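The three levels can be read as a small decision table over the strict and warn_only flags. The sketch below is illustrative only and is not the module's implementation: ReproPlan, plan_for, and the cuda_available parameter are hypothetical names, and the ":4096:8" workspace value is an assumption (it is one of the values listed in the PyTorch documentation; this page does not say which value the module uses).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReproPlan:
    """Hypothetical summary of what one reproducibility level switches on."""

    cudnn_benchmark: bool                   # cuDNN auto-tuner on/off
    deterministic_algorithms: bool          # torch.use_deterministic_algorithms(True)
    warn_only: bool                         # warn instead of raising
    cublas_workspace_config: Optional[str]  # env var value, or None if left unset
    force_single_process_loading: bool      # num_workers forced to 0


def plan_for(strict: bool = False, warn_only: bool = False,
             cuda_available: bool = False) -> ReproPlan:
    """Map the (strict, warn_only) flags to the settings described above."""
    return ReproPlan(
        cudnn_benchmark=False,  # the auto-tuner is disabled at every level
        deterministic_algorithms=strict,
        warn_only=strict and warn_only,  # warn_only has no effect without strict
        # Assumed value; only set when strict mode runs on a CUDA machine.
        cublas_workspace_config=":4096:8" if (strict and cuda_available) else None,
        force_single_process_loading=strict,
    )


for label, flags in [("standard", {}),
                     ("strict", {"strict": True}),
                     ("strict warn-only", {"strict": True, "warn_only": True})]:
    print(label, plan_for(cuda_available=True, **flags))
```

Keeping the mapping in one pure function makes it easy to check that warn_only alone never enables strict behavior.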
set_seed(seed, strict=False, warn_only=False)
Seed all PRNGs and optionally enforce deterministic algorithms.
Seeds Python's random, NumPy, and PyTorch (CPU + CUDA + MPS).
In strict mode, additionally forces deterministic kernels at the
cost of reduced performance.
Note
PYTHONHASHSEED is set here for completeness, but CPython reads it
only at interpreter startup — the runtime assignment has no effect on
the running process. The project Dockerfile handles this correctly
(ENV PYTHONHASHSEED=0). For bare-metal runs, prefix the command:
PYTHONHASHSEED=42 orchard run <recipe>. Full bit-exact determinism
additionally requires strict=True and num_workers=0 (both
enforced automatically in Docker via DOCKER_REPRODUCIBILITY_MODE).
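The startup-only behavior of PYTHONHASHSEED can be checked with the standard library alone. This is a minimal sketch, not part of the project: hash_in_child is a hypothetical helper, and the string "orchard" is an arbitrary test value.

```python
import os
import subprocess
import sys


def hash_in_child(text: str, hash_seed: str) -> int:
    """Return hash(text) from a fresh interpreter started with PYTHONHASHSEED set."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# Same seed at startup: the string hash is stable across interpreter launches.
first = hash_in_child("orchard", "42")
second = hash_in_child("orchard", "42")
print(first == second)

# Assigning the variable inside a running process does not change its hashing.
before = hash("orchard")
os.environ["PYTHONHASHSEED"] = "42"
print(hash("orchard") == before)
```

Both comparisons print True: the child interpreters agree because the variable was present when they started, while the in-process assignment leaves the current hash function untouched.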
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seed | int | The seed value to set across all PRNGs. | required |
| strict | bool | If True, enforces deterministic algorithms (5-30% perf penalty). | False |
| warn_only | bool | If True (and strict=True), uses warn-only mode for | False |
Source code in orchard/core/environment/reproducibility.py
worker_init_fn(worker_id)
Initialize PRNGs for a DataLoader worker subprocess.
Each worker receives a unique but deterministic sub-seed derived from the parent seed, ensuring augmentation diversity while maintaining reproducibility across runs.
Called automatically by DataLoader when num_workers > 0.
In strict reproducibility mode, num_workers is forced to 0 by
HardwareConfig, so this function is never invoked.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| worker_id | int | Subprocess ID provided by DataLoader (0-based). | required |
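The idea of a unique but deterministic sub-seed per worker can be sketched without PyTorch. The derivation below (parent seed plus worker index, wrapped to 32 bits) is an assumption made for illustration; this page does not state the formula the module uses. BASE_SEED, derive_worker_seed, sketch_worker_init_fn, and first_draws are hypothetical names, and only Python's random is seeded here, whereas the real function also covers NumPy and PyTorch.

```python
import random

BASE_SEED = 42  # hypothetical parent seed


def derive_worker_seed(base_seed: int, worker_id: int) -> int:
    """Assumed derivation: offset the parent seed by the worker index, within 32 bits."""
    return (base_seed + worker_id) % 2**32


def sketch_worker_init_fn(worker_id: int) -> None:
    """Seed this worker's Python PRNG from its deterministic sub-seed."""
    random.seed(derive_worker_seed(BASE_SEED, worker_id))


def first_draws(worker_id: int, n: int = 3) -> list:
    """Re-initialize the worker and return its first n random numbers."""
    sketch_worker_init_fn(worker_id)
    return [random.random() for _ in range(n)]


# Different workers get different streams (augmentation diversity) ...
print(first_draws(0) != first_draws(1))
# ... while the same worker reproduces its stream on every run.
print(first_draws(0) == first_draws(0))
```

Both lines print True. With a real DataLoader, a function of this shape would be passed as the worker_init_fn argument and would only run when num_workers > 0.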