io
orchard.core.io
Input/Output & Persistence Utilities.
This module manages the pipeline's interaction with the filesystem, handling configuration serialization (YAML), model checkpoint restoration, and dataset integrity verification via MD5 checksums and schema validation.
AuditSaver
Default AuditSaverProtocol implementation.
Delegates to the module-level save_config_as_yaml, dump_requirements, and dump_git_info functions, so no logic is duplicated.
save_config(data, yaml_path)
Persist configuration to a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | Configuration object to serialize. | required |
| yaml_path | Path | Destination filesystem path. | required |
Returns:
| Type | Description |
|---|---|
| Path | Confirmed path where the YAML was written. |
Source code in orchard/core/io/serialization.py
dump_requirements(output_path)
Freeze installed packages for reproducibility.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Path | Filesystem path for the requirements snapshot. | required |
dump_git_info(output_path)
Persist git commit hash and working tree status.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Path | Filesystem path for the git info snapshot. | required |
AuditSaverProtocol
Bases: Protocol
Protocol for run-manifest persistence (config YAML + dependency snapshot).
Enables dependency injection of auditability operations in
RootOrchestrator, keeping the constructor signature lean while
allowing full mocking in tests.
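The dependency-injection idea can be illustrated with a small sketch. Everything here is hypothetical except the three method names and their parameters, which are taken from this page: `AuditSaverLike` mirrors the documented protocol, `RecordingSaver` is an illustrative test double, and `write_run_manifest` stands in for whatever RootOrchestrator actually does. The `None` return types of the two `dump_*` methods and the file names are assumptions.

```python
from pathlib import Path
from typing import Any, Protocol


class AuditSaverLike(Protocol):
    """Hypothetical mirror of the three methods documented for AuditSaverProtocol."""

    def save_config(self, data: Any, yaml_path: Path) -> Path: ...
    def dump_requirements(self, output_path: Path) -> None: ...
    def dump_git_info(self, output_path: Path) -> None: ...


class RecordingSaver:
    """Test double: records each call instead of touching the filesystem."""

    def __init__(self) -> None:
        self.calls = []  # list of (method name, path) tuples

    def save_config(self, data: Any, yaml_path: Path) -> Path:
        self.calls.append(("save_config", yaml_path))
        return yaml_path

    def dump_requirements(self, output_path: Path) -> None:
        self.calls.append(("dump_requirements", output_path))

    def dump_git_info(self, output_path: Path) -> None:
        self.calls.append(("dump_git_info", output_path))


def write_run_manifest(saver: AuditSaverLike, config: Any, run_dir: Path) -> Path:
    """Hypothetical orchestrator step that depends only on the protocol."""
    saver.dump_requirements(run_dir / "requirements.txt")  # assumed file name
    saver.dump_git_info(run_dir / "git_info.txt")  # assumed file name
    return saver.save_config(config, run_dir / "config.yaml")
```

Because the caller only relies on the structural protocol, a test can pass `RecordingSaver()` and assert on `saver.calls` without writing any files.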
save_config(data, yaml_path)
Persist configuration to a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | Configuration object to serialize. | required |
| yaml_path | Path | Destination filesystem path. | required |
Returns:
| Type | Description |
|---|---|
| Path | Confirmed path where the YAML was written. |
Source code in orchard/core/io/serialization.py
dump_requirements(output_path)
Freeze installed packages for reproducibility.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Path | Filesystem path for the requirements snapshot. | required |
dump_git_info(output_path)
Persist git commit hash and working tree status for auditability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Path | Filesystem path for the git info snapshot. | required |
load_model_weights(model, path, device)
Restores model state from a checkpoint using secure weights-only loading.
Loads a PyTorch state_dict from disk with weights_only=True, which restricts unpickling to tensors and other plain data types and so prevents arbitrary code execution from a malicious checkpoint. Tensors are mapped to the target device as they are loaded.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Module | The model instance to populate with loaded weights | required |
| path | Path | Filesystem path to the checkpoint file (.pth) | required |
| device | device | Target device for mapping the loaded tensors | required |
Raises:
| Type | Description |
|---|---|
| OrchardExportError | If the checkpoint file does not exist at path |
Example
    from pathlib import Path

    from orchard.core.io import load_model_weights

    model = get_model(device, dataset_cfg=cfg.dataset, arch_cfg=cfg.architecture)
    checkpoint_path = Path("outputs/run_123/checkpoints/best_model.pth")
    load_model_weights(model, checkpoint_path, device)
Source code in orchard/core/io/checkpoints.py
md5_checksum(path, chunk_size=_MD5_CHUNK_SIZE)
Calculates the MD5 checksum of a file using buffered reading.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to the file to verify. | required |
| chunk_size | int | Read buffer size in bytes. | _MD5_CHUNK_SIZE |
Returns:
| Name | Type | Description |
|---|---|---|
| str | str | The calculated hexadecimal MD5 hash. |
Source code in orchard/core/io/data_io.py
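Buffered hashing of this kind can be sketched with the standard library alone. This is not the module's source: the function name `md5_checksum_sketch` and the `CHUNK_SIZE` value are assumptions, since the real `_MD5_CHUNK_SIZE` is not shown on this page.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 8192  # assumed value; the real _MD5_CHUNK_SIZE is not documented here


def md5_checksum_sketch(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a file in fixed-size chunks so it never has to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use constant regardless of file size, which matters for large dataset archives.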
validate_npz_keys(data)
Validates that the loaded NPZ dataset contains all required dataset keys.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | NpzFile | The loaded NPZ file object. | required |
Raises:
| Type | Description |
|---|---|
| OrchardDatasetError | If any required key (images/labels) is missing. |
Source code in orchard/core/io/data_io.py
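A key check of this shape might look like the following sketch. The key names in `REQUIRED_KEYS` are guesses based on the "images/labels" wording above, `DatasetKeyError` is a hypothetical stand-in for OrchardDatasetError, and the function takes a plain iterable of key names so it runs without NumPy.

```python
from typing import Iterable, Sequence

REQUIRED_KEYS = ("images", "labels")  # assumed names; the real key list is not shown here


class DatasetKeyError(ValueError):
    """Hypothetical stand-in for OrchardDatasetError."""


def validate_keys_sketch(
    available: Iterable[str], required: Sequence[str] = REQUIRED_KEYS
) -> None:
    """Raise if any required key is absent from the available key names."""
    present = set(available)
    missing = [key for key in required if key not in present]
    if missing:
        raise DatasetKeyError(f"NPZ archive is missing required keys: {missing}")
```

With a real archive opened through `numpy.load`, the key names are available as `data.files`, so the call would be `validate_keys_sketch(data.files)`.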
dump_requirements(output_path)
Freeze installed packages to a requirements file for reproducibility.
Invokes pip freeze --local to capture the exact dependency versions
of the current environment. The output is prefixed with a Python version
header for auditability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| output_path | Path | Filesystem path where the requirements file is written. | required |
Source code in orchard/core/io/serialization.py
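The behaviour described above can be approximated as follows. This is a sketch under assumptions, not the module's code: the header format, the function names, and the optional `frozen` parameter (added so the formatting can be exercised without invoking pip) are all illustrative.

```python
import platform
import subprocess
import sys
from pathlib import Path
from typing import Optional


def freeze_local() -> str:
    """Run `pip freeze --local` with the current interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze", "--local"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def dump_requirements_sketch(output_path: Path, frozen: Optional[str] = None) -> None:
    """Write a Python version header followed by the frozen package list."""
    if frozen is None:
        frozen = freeze_local()
    header = f"# Python {platform.python_version()}\n"  # assumed header format
    output_path.write_text(header + frozen, encoding="utf-8")
```

Calling pip through `sys.executable -m pip` ties the snapshot to the interpreter that is actually running the pipeline rather than to whichever `pip` happens to be first on `PATH`.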
load_config_from_yaml(yaml_path)
Loads a raw configuration dictionary from a YAML file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| yaml_path | Path | Path to the source YAML file. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | The loaded configuration manifest. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the specified path does not exist. |
Source code in orchard/core/io/serialization.py
save_config_as_yaml(data, yaml_path)
Serializes and persists configuration data to a YAML file.
This function coordinates the extraction of data from potentially complex objects (supporting Pydantic models, custom portable manifests, or raw dicts), applies recursive sanitization, and performs an atomic write to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | The configuration object to save. Supports objects with 'dump_portable()' or 'model_dump()' methods, or standard dictionaries. | required |
| yaml_path | Path | The destination filesystem path. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| Path | Path | The confirmed path where the YAML was successfully written. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the data structure cannot be serialized. |
| OSError | If a filesystem-level error occurs (permissions, disk full). |
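The three stages named in the description (extraction, recursive sanitization, atomic write) can be sketched with the standard library. Several things here are assumptions rather than the module's behaviour: the precedence of `dump_portable()` over `model_dump()`, the specific sanitization rules, and all helper names. To keep the sketch dependency-free, `json.dumps` stands in for the YAML dumper.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def extract_payload(data: Any) -> Any:
    """Prefer dump_portable(), then model_dump(), else use the object as-is (assumed order)."""
    if hasattr(data, "dump_portable"):
        return data.dump_portable()
    if hasattr(data, "model_dump"):
        return data.model_dump()
    return data


def sanitize(value: Any) -> Any:
    """Recursively convert containers and Path objects into plain serializable types."""
    if isinstance(value, dict):
        return {str(key): sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    if isinstance(value, Path):
        return str(value)
    return value


def atomic_write_text(text: str, destination: Path) -> Path:
    """Write to a temp file in the target directory, then rename it over the target."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=destination.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_name, destination)  # readers never see a half-written file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return destination


def save_config_sketch(data: Any, path: Path) -> Path:
    """Extract, sanitize, and atomically persist; JSON stands in for YAML here."""
    payload = sanitize(extract_payload(data))
    return atomic_write_text(json.dumps(payload, indent=2), path)
```

Creating the temporary file in the destination directory keeps the final `os.replace` on a single filesystem, which is what lets the rename swap the file in without exposing a partial write.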