synthetic
orchard.data_handler.diagnostic.synthetic
Synthetic Data Handler for Testing.
This module provides tiny synthetic NPZ datasets for unit tests without requiring any external downloads or network access. It generates random image data and labels that match the expected NPZ format specifications.
Note
mutmut's trampoline resolves default parameters in the wrapper before
dispatching to the mutant function, making default-value mutations
unkillable. Body mutations on internal pixel/label generation are also
unobservable because tests verify the returned DatasetData metadata,
not raw byte content. Lines are marked pragma: no mutate accordingly.
create_synthetic_dataset(num_classes=8, samples=100, resolution=28, channels=3, name='syntheticmnist')
Create a synthetic NPZ-compatible dataset for testing.
This function generates random image data and labels, saves them to a temporary .npz file, and returns a DatasetData object that can be used with the existing data pipeline.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_classes | int | Number of target categories (default: 8) | 8 |
| samples | int | Number of training samples (default: 100) | 100 |
| resolution | int | Image resolution (HxW) (default: 28) | 28 |
| channels | int | Number of color channels (default: 3 for RGB) | 3 |
| name | str | Dataset name for identification (default: "syntheticmnist") | 'syntheticmnist' |
Returns:
| Name | Type | Description |
|---|---|---|
| DatasetData | DatasetData | A data object compatible with the existing pipeline |
Example

    >>> data = create_synthetic_dataset(num_classes=8, samples=100)
    >>> train_loader, val_loader, test_loader = get_dataloaders(
    ...     data, cfg.dataset, cfg.training, cfg.augmentation, cfg.num_workers
    ... )
Source code in orchard/data_handler/diagnostic/synthetic.py
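As a rough illustration of the generate-and-save step described above, the following sketch writes random images and labels to a temporary `.npz` file with NumPy. The helper name `write_synthetic_npz`, the `seed` parameter, the array key names (`train_images`, `train_labels`), the dtypes, and the HxWxC layout are all assumptions made for this example; the real module may differ, and it additionally wraps the result in a `DatasetData` object.

```python
import os
import tempfile

import numpy as np


def write_synthetic_npz(num_classes=8, samples=100, resolution=28, channels=3, seed=0):
    """Write random images and labels to a temporary .npz file; return its path."""
    rng = np.random.default_rng(seed)
    # Random uint8 pixels in (N, H, W, C) layout (assumed layout).
    images = rng.integers(
        0, 256, size=(samples, resolution, resolution, channels), dtype=np.uint8
    )
    # Random class indices in [0, num_classes), stored as a column vector.
    labels = rng.integers(0, num_classes, size=(samples, 1), dtype=np.int64)

    fd, path = tempfile.mkstemp(suffix=".npz")
    os.close(fd)
    # Key names are assumed here, not taken from the module's source.
    np.savez(path, train_images=images, train_labels=labels)
    return path


path = write_synthetic_npz(samples=10)
with np.load(path) as archive:
    image_shape = archive["train_images"].shape
    label_shape = archive["train_labels"].shape
    label_max = int(archive["train_labels"].max())
os.remove(path)  # clean up the temporary file

print(image_shape)  # (10, 28, 28, 3)
print(label_shape)  # (10, 1)
```

Seeding the generator keeps the synthetic data reproducible across test runs, which is a design choice of this sketch rather than a documented behavior of `create_synthetic_dataset`.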
create_synthetic_grayscale_dataset(num_classes=8, samples=100, resolution=28)
Create a synthetic grayscale NPZ dataset for testing.
Convenience function for creating single-channel (grayscale) synthetic data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| num_classes | int | Number of target categories (default: 8) | 8 |
| samples | int | Number of training samples (default: 100) | 100 |
| resolution | int | Image resolution (HxW) (default: 28) | 28 |
Returns:
| Name | Type | Description |
|---|---|---|
| DatasetData | DatasetData | A grayscale data object compatible with the pipeline |
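One plausible shape for such a convenience function is a thin wrapper that forwards to the general function with `channels=1`. The sketch below uses hypothetical stand-ins (`FakeDatasetData`, `create_dataset`, `create_grayscale_dataset`) rather than the module's real classes, and whether the real function delegates this way, or uses a different dataset name, is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeDatasetData:
    """Hypothetical stand-in for DatasetData, holding only metadata."""
    name: str
    num_classes: int
    image_shape: tuple


def create_dataset(num_classes=8, samples=100, resolution=28, channels=3,
                   name="syntheticmnist"):
    # Stand-in for the general function: records the shape it would generate.
    return FakeDatasetData(name, num_classes,
                           (samples, resolution, resolution, channels))


def create_grayscale_dataset(num_classes=8, samples=100, resolution=28):
    # Assumed behavior: same as the general function, with a single channel.
    return create_dataset(num_classes=num_classes, samples=samples,
                          resolution=resolution, channels=1)


gray = create_grayscale_dataset(samples=4)
print(gray.image_shape)  # (4, 28, 28, 1)
```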