guards
orchard.core.environment.guards
Process & Resource Guarding Utilities.
Provides low-level OS abstractions to manage Python script execution in multi-user or shared environments. It ensures system stability and safe resource usage by offering:
- Exclusive filesystem locking (flock) to prevent concurrent runs and protect against disk or GPU/MPS conflicts.
- Duplicate process detection and optional termination to free resources and avoid interference.
These utilities ensure each run is isolated, reproducible, and safe even on clusters or shared systems.
DuplicateProcessCleaner(script_name=None)
Scans and optionally terminates duplicate instances of the current script.
Attributes:
| Name | Type | Description |
|---|---|---|
| script_path | str | Absolute path of the script to match against running processes. |
| current_pid | int | PID of the current process. |
Source code in orchard/core/environment/guards.py
detect_duplicates()
Detects other Python processes running the same script.
Returns:
| Type | Description |
|---|---|
| list[Process] | List of psutil.Process instances representing duplicates. |
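The matching step can be sketched as a pure filter over process records. This is only an illustration under stated assumptions: `find_duplicates` is a hypothetical helper, it takes plain dicts shaped like the `.info` attribute that `psutil.process_iter(["pid", "name", "cmdline"])` exposes, and it returns PIDs rather than `psutil.Process` objects. The actual matching rules in `guards.py` may differ.

```python
import os


def find_duplicates(processes, script_path, current_pid):
    """Return PIDs of other Python processes whose command line names script_path.

    Hypothetical helper: `processes` is an iterable of dicts with "pid",
    "name", and "cmdline" keys.
    """
    target = os.path.abspath(script_path)
    duplicates = []
    for info in processes:
        if info["pid"] == current_pid:
            continue  # never report the current process
        if "python" not in (info.get("name") or "").lower():
            continue  # only Python interpreters are candidates
        cmdline = info.get("cmdline") or []  # can be None or empty if access is denied
        # Note: relative arguments are resolved against *our* working
        # directory here, which is a simplification.
        if any(os.path.abspath(arg) == target for arg in cmdline):
            duplicates.append(info["pid"])
    return duplicates
```

Excluding `current_pid` first matters: without it, the running script would always report itself as a duplicate.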
terminate_duplicates(logger=None)
Terminates detected duplicate processes.
In distributed mode (torchrun / DDP), termination is skipped entirely because sibling rank processes are intentional, not duplicates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| logger | Logger \| None | Logger for reporting terminated PIDs. | None |
Returns:
| Type | Description |
|---|---|
| int | Number of terminated duplicate processes (0 in distributed mode). |
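The distributed-mode guard described above might look roughly like the following. It assumes distributed mode is detected through the `RANK` or `LOCAL_RANK` environment variables that torchrun sets for each worker; how `guards.py` actually detects it is not shown in this page. `is_distributed` and `terminate_unless_distributed` are hypothetical names, and the process objects only need a `pid` attribute and a `terminate()` method, as `psutil.Process` provides.

```python
import os


def is_distributed(env=None):
    """Assumption: torchrun-style launchers export RANK / LOCAL_RANK."""
    env = os.environ if env is None else env
    return "RANK" in env or "LOCAL_RANK" in env


def terminate_unless_distributed(duplicates, env=None, logger=None):
    """Terminate each process in `duplicates`; return how many were terminated."""
    if is_distributed(env):
        return 0  # sibling ranks are intentional, so nothing is touched
    terminated = 0
    for proc in duplicates:
        try:
            proc.terminate()
        except Exception:
            # Broad on purpose in this sketch: the process may already be
            # gone or belong to another user.
            continue
        terminated += 1
        if logger is not None:
            logger.info("Terminated duplicate process %s", proc.pid)
    return terminated
```

Checking the environment before touching any process keeps the distributed case a strict no-op, which matches the documented return value of 0.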
ensure_single_instance(lock_file, logger)
Implements a cooperative advisory lock to guarantee singleton execution.
Leverages Unix 'flock' to create an exclusive lock on a sentinel file. If the lock cannot be acquired immediately, it indicates another instance is active, and the process will abort to prevent filesystem or GPU race conditions.
In distributed mode (torchrun / DDP), only the main process (rank 0) acquires the lock. Non-main ranks skip locking entirely to avoid deadlocking against the lock held by rank 0.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lock_file | Path | Filesystem path where the lock sentinel will reside. | required |
| logger | Logger | Active logger for reporting acquisition status. | required |
Raises:
| Type | Description |
|---|---|
| SystemExit | If an existing lock is detected on the system. |
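A minimal sketch of the non-blocking advisory lock described above, using the standard-library `fcntl.flock` (Unix only). `try_acquire_lock` and `acquire_or_exit` are hypothetical names, and passing `rank` explicitly is an assumption made for illustration; the real function takes `lock_file` and `logger` and determines the rank itself.

```python
import errno
import fcntl
import os
import sys
from pathlib import Path


def try_acquire_lock(lock_file):
    """Return an open fd holding an exclusive flock, or None if already locked."""
    lock_file = Path(lock_file)
    lock_file.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(lock_file, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EAGAIN, errno.EACCES):
            return None  # another open file description holds the lock
        raise
    return fd


def acquire_or_exit(lock_file, rank=0):
    """Abort with SystemExit when the lock is taken; non-main ranks skip locking."""
    if rank != 0:
        return None
    fd = try_acquire_lock(lock_file)
    if fd is None:
        sys.exit(f"Another instance holds {lock_file}; aborting.")
    return fd
```

The returned descriptor has to stay open for the lifetime of the run, because closing it releases the lock.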
release_single_instance(lock_file)
Safely releases the system lock and unlinks the sentinel file.
Guarantees that the file descriptor is closed and the lock is returned to the OS. Designed to be called during normal shutdown or within exception handling blocks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lock_file | Path | Filesystem path to the sentinel file to be removed. | required |
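The release path could be sketched as follows. `release_lock` is a hypothetical helper that receives the file descriptor explicitly; the documented function takes only `lock_file`, so the real module presumably keeps the descriptor elsewhere. The nested `finally` blocks make sure the descriptor is closed and the sentinel removed even if an earlier step fails.

```python
import fcntl
import os
from pathlib import Path


def release_lock(fd, lock_file):
    """Unlock and close `fd` (if any), then remove the sentinel file."""
    try:
        if fd is not None:
            try:
                fcntl.flock(fd, fcntl.LOCK_UN)
            finally:
                os.close(fd)  # closing also drops the lock if LOCK_UN failed
    finally:
        # missing_ok (Python 3.8+) makes repeated calls harmless.
        Path(lock_file).unlink(missing_ok=True)
```

A typical call site would wrap the main work in `try` / `finally` and call the release function in the `finally` branch, so the lock is returned on both normal shutdown and errors.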