cifar_converter

orchard.data_handler.fetchers.cifar_converter

CIFAR-10/100 Dataset Converter.

Downloads CIFAR datasets via torchvision and converts them to NPZ format compatible with the Orchard ML pipeline. Creates stratified train/val/test splits from the original train/test partition.
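The page does not show how the splits are built, so the following is only a minimal sketch of one plausible approach: carve a class-balanced validation set out of the original train partition and leave the test partition untouched. The helper name `stratified_split`, the 20% validation fraction, and the synthetic arrays are assumptions made for illustration, not Orchard's actual implementation.

```python
import numpy as np


def stratified_split(labels, val_fraction, seed=0):
    """Return (train_idx, val_idx) that preserve per-class proportions.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    train_parts, val_parts = [], []
    for cls in np.unique(labels):
        # Indices of every sample belonging to this class, in random order.
        cls_idx = np.flatnonzero(labels == cls)
        rng.shuffle(cls_idx)
        n_val = int(round(len(cls_idx) * val_fraction))
        val_parts.append(cls_idx[:n_val])
        train_parts.append(cls_idx[n_val:])
    train_idx = np.sort(np.concatenate(train_parts))
    val_idx = np.sort(np.concatenate(val_parts))
    return train_idx, val_idx


# Synthetic stand-in for a CIFAR-like train partition: 10 classes, 10 images each.
labels = np.repeat(np.arange(10), 10)
images = np.zeros((100, 32, 32, 3), dtype=np.uint8)

train_idx, val_idx = stratified_split(labels, val_fraction=0.2)
train_images, val_images = images[train_idx], images[val_idx]
```

Splitting per class rather than over the whole index range keeps every class represented in the validation set at the same rate, which matters more for CIFAR-100, where each class has far fewer samples.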

ensure_cifar_npz(metadata)

Ensures a CIFAR dataset is downloaded and converted to NPZ format.

Supports both CIFAR-10 and CIFAR-100 via metadata.name routing.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| metadata | DatasetMetadata | DatasetMetadata with name ('cifar10' or 'cifar100') and path | required |

Returns:

| Type | Description |
| --- | --- |
| Path | Path to validated NPZ file |
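Because the function returns early when the NPZ file already exists, repeated calls should be cheap. The sketch below imitates that check-then-build behaviour without importing Orchard or downloading anything. `FakeMetadata`, `fake_download_and_convert`, `ensure_npz`, and the `images`/`labels` array keys are all hypothetical stand-ins; the real `DatasetMetadata` and NPZ layout may differ.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path

import numpy as np


@dataclass
class FakeMetadata:
    """Hypothetical stand-in for DatasetMetadata (only the fields used here)."""

    name: str
    display_name: str
    path: Path


calls = []  # records every time the expensive conversion step runs


def fake_download_and_convert(metadata):
    """Stand-in converter: writes small synthetic arrays instead of downloading."""
    calls.append(metadata.name)
    metadata.path.parent.mkdir(parents=True, exist_ok=True)
    np.savez_compressed(
        metadata.path,
        images=np.zeros((4, 32, 32, 3), dtype=np.uint8),
        labels=np.array([0, 1, 2, 3]),
    )
    return metadata.path


def ensure_npz(metadata):
    """Same check-then-build shape as ensure_cifar_npz."""
    if metadata.path.exists():
        return metadata.path
    return fake_download_and_convert(metadata)


tmp_dir = Path(tempfile.mkdtemp())
meta = FakeMetadata("cifar10", "CIFAR-10", tmp_dir / "cifar10.npz")

first = ensure_npz(meta)   # file missing: conversion runs
second = ensure_npz(meta)  # file present: returned as-is
```

After both calls, `calls` holds a single entry, which shows that the second call skipped the conversion step.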

Source code in orchard/data_handler/fetchers/cifar_converter.py
def ensure_cifar_npz(metadata: DatasetMetadata) -> Path:
    """
    Ensures a CIFAR dataset is downloaded and converted to NPZ format.

    Supports both CIFAR-10 and CIFAR-100 via metadata.name routing.

    Args:
        metadata: DatasetMetadata with name ('cifar10' or 'cifar100') and path

    Returns:
        Path to validated NPZ file
    """
    target_npz = metadata.path

    if target_npz.exists():
        logger.debug(
            "%s%s %-18s: %s found at %s",
            LogStyle.INDENT,
            LogStyle.ARROW,
            "Dataset",
            metadata.display_name,
            target_npz.name,
        )
        return target_npz

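    # Deferred import: torchvision is only loaded when the NPZ file is missing.
    from torchvision.datasets import CIFAR10, CIFAR100

    # Any name other than "cifar100" falls back to CIFAR-10.
    if metadata.name == "cifar100":
        cifar_cls = CIFAR100
    else:
        cifar_cls = CIFAR10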

    return _download_and_convert(metadata, cifar_cls)
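In the source above, the `else` branch sends every name other than `"cifar100"` to `CIFAR10`, so a typo such as `"cifar-10"` would silently select CIFAR-10. If stricter routing is wanted, one option is an explicit lookup table that rejects unknown names. This is a sketch of an alternative, not the module's behaviour; `resolve_cifar_class` and `_CIFAR_CLASSES` are hypothetical names, and the two classes are placeholders so the example runs without torchvision.

```python
# Placeholders so the sketch runs without torchvision installed; in the real
# module these would be torchvision.datasets.CIFAR10 and CIFAR100.
class CIFAR10:
    pass


class CIFAR100:
    pass


# Hypothetical lookup table from metadata.name to dataset class.
_CIFAR_CLASSES = {"cifar10": CIFAR10, "cifar100": CIFAR100}


def resolve_cifar_class(name):
    """Return the dataset class for `name`, or raise on an unknown name."""
    try:
        return _CIFAR_CLASSES[name]
    except KeyError:
        supported = ", ".join(sorted(_CIFAR_CLASSES))
        raise ValueError(
            f"Unsupported CIFAR dataset {name!r}; expected one of: {supported}"
        ) from None
```

The trade-off is that callers relying on the current fallback would start seeing a `ValueError`, so this only fits if `metadata.name` is guaranteed to be one of the two documented values.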