orchard.tasks

Task Strategy Packages.

Each sub-package exports its adapter classes. Registration in the core task registry is handled by the top-level orchard package (its __init__), which is the natural junction point between core and tasks.

ClassificationCriterionAdapter

Builds classification loss functions (CrossEntropy / Focal).

get_criterion(training, class_weights=None)

Delegate to the existing criterion factory.

Parameters:

training (TrainingConfig, required)
    Training sub-config with criterion parameters.
class_weights (Tensor | None, default: None)
    Optional per-class weights for imbalanced datasets.

Returns:

Module
    Loss module (CrossEntropyLoss or FocalLoss).

Source code in orchard/tasks/classification/criterion_adapter.py
def get_criterion(
    self,
    training: TrainingConfig,
    class_weights: torch.Tensor | None = None,
) -> nn.Module:
    """
    Delegate to the existing criterion factory.

    Args:
        training: Training sub-config with criterion parameters.
        class_weights: Optional per-class weights for imbalanced datasets.

    Returns:
        Loss module (CrossEntropyLoss or FocalLoss).
    """
    return get_criterion(training, class_weights=class_weights)
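The class_weights argument is typically derived from label frequencies before being handed to the adapter. A minimal sketch of inverse-frequency weighting, using plain Python as a stand-in for the tensor math (the helper name and formula are illustrative, not part of orchard):

```python
from collections import Counter


def inverse_frequency_weights(labels: list[int]) -> dict[int, float]:
    """Weight each class by n / (k * count), so rarer classes weigh more.

    Hypothetical helper: the real adapter receives a precomputed
    torch.Tensor, not a dict.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in sorted(counts.items())}
```

For example, with labels [0, 0, 0, 1] the minority class 1 receives three times the weight of class 0, which counteracts the imbalance when fed into a weighted CrossEntropyLoss.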

ClassificationEvalPipelineAdapter

Orchestrates classification inference, visualization, and reporting.

run_evaluation(model, test_loader, train_losses, val_metrics_history, class_names, paths, training, dataset, augmentation, evaluation, arch_name, aug_info='N/A', tracker=None)

Delegate to the existing final evaluation pipeline.

Parameters:

model (Module, required)
    Trained model (already on target device).
test_loader (DataLoader[Any], required)
    DataLoader for test set.
train_losses (list[float], required)
    Training loss history per epoch.
val_metrics_history (list[Mapping[str, float]], required)
    Validation metrics history per epoch.
class_names (list[str], required)
    List of class label strings.
paths (RunPaths, required)
    RunPaths for artifact output.
training (TrainingConfig, required)
    Training sub-config.
dataset (DatasetConfig, required)
    Dataset sub-config.
augmentation (AugmentationConfig, required)
    Augmentation sub-config.
evaluation (EvaluationConfig, required)
    Evaluation sub-config.
arch_name (str, required)
    Architecture identifier.
aug_info (str, default: 'N/A')
    Augmentation description string.
tracker (TrackerProtocol | None, default: None)
    Optional experiment tracker for final metrics.

Returns:

Mapping[str, float]
    Mapping of metric names to float values.

Source code in orchard/tasks/classification/evaluation_adapter.py
def run_evaluation(
    self,
    model: nn.Module,
    test_loader: DataLoader[Any],
    train_losses: list[float],
    val_metrics_history: list[Mapping[str, float]],
    class_names: list[str],
    paths: RunPaths,
    training: TrainingConfig,
    dataset: DatasetConfig,
    augmentation: AugmentationConfig,
    evaluation: EvaluationConfig,
    arch_name: str,
    aug_info: str = "N/A",  # pragma: no mutate
    tracker: TrackerProtocol | None = None,
) -> Mapping[str, float]:
    """
    Delegate to the existing final evaluation pipeline.

    Args:
        model: Trained model (already on target device).
        test_loader: DataLoader for test set.
        train_losses: Training loss history per epoch.
        val_metrics_history: Validation metrics history per epoch.
        class_names: List of class label strings.
        paths: RunPaths for artifact output.
        training: Training sub-config.
        dataset: Dataset sub-config.
        augmentation: Augmentation sub-config.
        evaluation: Evaluation sub-config.
        arch_name: Architecture identifier.
        aug_info: Augmentation description string.
        tracker: Optional experiment tracker for final metrics.

    Returns:
        Mapping of metric names to float values.
    """
    macro_f1, test_acc, test_auc = run_final_evaluation(
        model=model,
        test_loader=test_loader,
        train_losses=train_losses,
        val_metrics_history=val_metrics_history,
        class_names=class_names,
        paths=paths,
        training=training,
        dataset=dataset,
        augmentation=augmentation,
        evaluation=evaluation,
        arch_name=arch_name,
        aug_info=aug_info,
        tracker=tracker,
    )
    return MappingProxyType(
        {
            METRIC_F1: macro_f1,
            METRIC_ACCURACY: test_acc,
            METRIC_AUC: test_auc,
        }
    )

ClassificationMetricsAdapter

Computes per-epoch classification metrics (loss, accuracy, AUC, F1).

compute_validation_metrics(model, val_loader, criterion, device)

Delegate to the existing validation engine.

Parameters:

model (Module, required)
    Neural network model to evaluate.
val_loader (DataLoader[Any], required)
    Validation data provider.
criterion (Module, required)
    Loss function.
device (device, required)
    Hardware target.

Returns:

Mapping[str, float]
    Immutable mapping with keys: loss, accuracy, auc, f1.

Source code in orchard/tasks/classification/metrics_adapter.py
def compute_validation_metrics(
    self,
    model: nn.Module,
    val_loader: DataLoader[Any],
    criterion: nn.Module,
    device: torch.device,
) -> Mapping[str, float]:
    """
    Delegate to the existing validation engine.

    Args:
        model: Neural network model to evaluate.
        val_loader: Validation data provider.
        criterion: Loss function.
        device: Hardware target.

    Returns:
        Immutable mapping with keys: loss, accuracy, auc, f1.
    """
    return validate_epoch(model, val_loader, criterion, device)

ClassificationTrainingStepAdapter

Computes classification training loss with optional MixUp blending.

compute_training_loss(model, inputs, targets, criterion, mixup_fn=None, device=None)

Execute classification forward pass and compute loss.

When mixup_fn is provided, inputs and targets are blended before the forward pass and the loss is computed as a convex combination of the two target sets.

Parameters:

model (Module, required)
    Neural network producing logits.
inputs (Any, required)
    Batch of input tensors.
targets (Any, required)
    Batch of target tensors.
criterion (Module, required)
    Loss function (e.g. CrossEntropyLoss).
mixup_fn (Callable[..., Any] | None, default: None)
    Optional MixUp augmentation callable.
device (device | None, default: None)
    Target device for tensor placement.

Returns:

Tensor
    Scalar loss tensor for backward pass.

Source code in orchard/tasks/classification/training_step_adapter.py
def compute_training_loss(
    self,
    model: nn.Module,
    inputs: Any,
    targets: Any,
    criterion: nn.Module,
    mixup_fn: Callable[..., Any] | None = None,
    device: torch.device | None = None,
) -> torch.Tensor:
    """
    Execute classification forward pass and compute loss.

    When ``mixup_fn`` is provided, inputs and targets are blended
    before the forward pass and the loss is computed as a convex
    combination of the two target sets.

    Args:
        model: Neural network producing logits.
        inputs: Batch of input tensors.
        targets: Batch of target tensors.
        criterion: Loss function (e.g. CrossEntropyLoss).
        mixup_fn: Optional MixUp augmentation callable.
        device: Target device for tensor placement.

    Returns:
        Scalar loss tensor for backward pass.
    """
    if device is not None:
        inputs = inputs.to(device)
        targets = targets.to(device)
    if mixup_fn is not None:
        inputs, y_a, y_b, lam = mixup_fn(inputs, targets)
        outputs = model(inputs)
        loss: torch.Tensor = lam * criterion(outputs, y_a) + (1 - lam) * criterion(outputs, y_b)
        return loss
    outputs = model(inputs)
    result: torch.Tensor = criterion(outputs, targets)
    return result
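The MixUp branch above reduces to a convex combination of two per-target losses. A pure-Python sketch of that arithmetic, with a toy absolute-error criterion standing in for CrossEntropyLoss (names and values illustrative only):

```python
def mixup_loss(criterion, outputs, y_a, y_b, lam: float) -> float:
    """Convex combination of the losses against both blended target sets,
    mirroring lam * criterion(out, y_a) + (1 - lam) * criterion(out, y_b)."""
    return lam * criterion(outputs, y_a) + (1 - lam) * criterion(outputs, y_b)


# Toy stand-in for a loss function: absolute error on scalars.
def abs_loss(out: float, tgt: float) -> float:
    return abs(out - tgt)
```

With outputs 2.0, targets 1.0 and 4.0, and lam 0.75, the blended loss is 0.75 * 1 + 0.25 * 2 = 1.25; as lam approaches 1 the loss degenerates to the plain single-target case, which is exactly the non-MixUp branch of the adapter.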

DetectionCriterionAdapter

Returns a no-op sentinel criterion for detection tasks.

get_criterion(training, class_weights=None)

Return a sentinel criterion.

Detection models compute their own losses internally (classification loss, box regression loss, objectness, RPN box reg). The returned module raises RuntimeError if its forward() is ever called, making misuse immediately visible.

Parameters:

training (TrainingConfig, required)
    Training sub-config (ignored for detection).
class_weights (Tensor | None, default: None)
    Per-class weights (ignored for detection).

Returns:

Module
    Sentinel nn.Module that raises on forward.

Source code in orchard/tasks/detection/criterion_adapter.py
def get_criterion(
    self,
    training: TrainingConfig,  # noqa: ARG002
    class_weights: torch.Tensor | None = None,  # noqa: ARG002
) -> nn.Module:
    """
    Return a sentinel criterion.

    Detection models compute their own losses internally (classification
    loss, box regression loss, objectness, RPN box reg). The returned
    module raises ``RuntimeError`` if its ``forward()`` is ever called,
    making misuse immediately visible.

    Args:
        training: Training sub-config (ignored for detection).
        class_weights: Per-class weights (ignored for detection).

    Returns:
        Sentinel ``nn.Module`` that raises on forward.
    """
    return _DetectionNoOpCriterion()
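The sentinel pattern can be sketched without torch. An assumed shape for _DetectionNoOpCriterion (the real class subclasses nn.Module and raises from forward(); this plain-Python version raises from __call__):

```python
class NoOpCriterion:
    """Raises on any invocation, making accidental use loud and immediate.

    Sketch only: the real _DetectionNoOpCriterion is an nn.Module.
    """

    def __call__(self, *args: object, **kwargs: object) -> None:
        raise RuntimeError(
            "Detection models compute their losses internally; "
            "this criterion must never be called."
        )
```

Returning a loud sentinel rather than None keeps the shared training loop's "criterion is always a module" invariant intact while still failing fast if a classification code path leaks into detection.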

DetectionEvalPipelineAdapter

Orchestrates detection inference, mAP computation, and reporting.

run_evaluation(model, test_loader, train_losses, val_metrics_history, class_names, paths, training, dataset, augmentation, evaluation, arch_name, aug_info='N/A', tracker=None)

Run detection evaluation pipeline.

Computes mAP metrics on the test set, plots training loss curves, and optionally logs metrics to the experiment tracker.

Parameters:

model (Module, required)
    Trained detection model (already on target device).
test_loader (DataLoader[Any], required)
    DataLoader for test set.
train_losses (list[float], required)
    Training loss history per epoch.
val_metrics_history (list[Mapping[str, float]], required)
    Validation metrics history per epoch.
class_names (list[str], required)
    List of class label strings.
paths (RunPaths, required)
    RunPaths for artifact output.
training (TrainingConfig, required)
    Training sub-config.
dataset (DatasetConfig, required)
    Dataset sub-config.
augmentation (AugmentationConfig, required)
    Augmentation sub-config.
evaluation (EvaluationConfig, required)
    Evaluation sub-config.
arch_name (str, required)
    Architecture identifier.
aug_info (str, default: 'N/A')
    Augmentation description string.
tracker (TrackerProtocol | None, default: None)
    Optional experiment tracker for final metrics.

Returns:

Mapping[str, float]
    Mapping of detection metric names to float values.

Source code in orchard/tasks/detection/evaluation_adapter.py
def run_evaluation(
    self,
    model: nn.Module,
    test_loader: DataLoader[Any],
    train_losses: list[float],
    val_metrics_history: list[Mapping[str, float]],
    class_names: list[str],
    paths: RunPaths,
    training: TrainingConfig,
    dataset: DatasetConfig,
    augmentation: AugmentationConfig,  # noqa: ARG002
    evaluation: EvaluationConfig,
    arch_name: str,
    aug_info: str = "N/A",  # noqa: ARG002
    tracker: TrackerProtocol | None = None,
) -> Mapping[str, float]:
    """
    Run detection evaluation pipeline.

    Computes mAP metrics on the test set, plots training loss curves,
    and optionally logs metrics to the experiment tracker.

    Args:
        model: Trained detection model (already on target device).
        test_loader: DataLoader for test set.
        train_losses: Training loss history per epoch.
        val_metrics_history: Validation metrics history per epoch.
        class_names: List of class label strings.
        paths: RunPaths for artifact output.
        training: Training sub-config.
        dataset: Dataset sub-config.
        augmentation: Augmentation sub-config.
        evaluation: Evaluation sub-config.
        arch_name: Architecture identifier.
        aug_info: Augmentation description string.
        tracker: Optional experiment tracker for final metrics.

    Returns:
        Mapping of detection metric names to float values.
    """
    device = next(model.parameters()).device

    # Inference + mAP computation
    model.eval()
    metric = MeanAveragePrecision(iou_type="bbox")

    with torch.no_grad():
        for images, targets in test_loader:
            images_on_device = [img.to(device) for img in images]
            predictions = model(images_on_device)
            metric.update(
                [to_cpu(p) for p in predictions],
                [to_cpu(t) for t in targets],
            )

    result = metric.compute()
    test_metrics = {
        METRIC_MAP: float(result["map"]),
        METRIC_MAP_50: float(result["map_50"]),
        METRIC_MAP_75: float(result["map_75"]),
    }

    # Log results
    logger.info(
        "%s%s %-18s: mAP=%.4f  mAP@50=%.4f  mAP@75=%.4f",
        LogStyle.INDENT,
        LogStyle.ARROW,
        "Test Metrics",
        test_metrics[METRIC_MAP],
        test_metrics[METRIC_MAP_50],
        test_metrics[METRIC_MAP_75],
    )

    # Bbox visualization grid
    if evaluation.save_predictions_grid:
        show_detections(
            model=model,
            loader=test_loader,
            device=device,
            classes=class_names,
            save_path=paths.figures / f"detection_samples_{arch_name}_{dataset.resolution}.png",
            ctx=PlotContext(
                arch_name=arch_name,
                resolution=dataset.resolution,
                fig_dpi=evaluation.fig_dpi,
                plot_style=evaluation.plot_style,
                cmap_confusion=evaluation.cmap_confusion,
                grid_cols=evaluation.grid_cols,
                n_samples=evaluation.n_samples,
                fig_size_predictions=evaluation.fig_size_predictions,
                mean=dataset.mean,
                std=dataset.std,
            ),
        )

    # Training curves — plot mAP instead of loss (METRIC_LOSS is a 0.0 sentinel)
    val_map = [m.get(METRIC_MAP, 0.0) for m in val_metrics_history]
    ctx = PlotContext(
        arch_name=arch_name,
        resolution=dataset.resolution,
        fig_dpi=evaluation.fig_dpi,
        plot_style=evaluation.plot_style,
        cmap_confusion=evaluation.cmap_confusion,
        grid_cols=evaluation.grid_cols,
        n_samples=evaluation.n_samples,
        fig_size_predictions=evaluation.fig_size_predictions,
    )
    plot_training_curves(
        train_losses=train_losses,
        val_metric_values=val_map,
        out_path=paths.figures / "training_curves.png",
        ctx=ctx,
        val_label="Validation mAP",
    )

    # Structured report (Excel/CSV/JSON) — args tested in test_reporting.py
    report = create_structured_report(
        val_metrics=val_metrics_history,
        test_metrics=test_metrics,
        train_losses=train_losses,
        best_path=paths.best_model_path,
        log_path=paths.logs / "session.log",
        arch_name=arch_name,
        dataset=dataset,
        training=training,
        task_type="detection",
    )
    report.save(paths.final_report_path, fmt=evaluation.report_format)

    # Tracker logging
    if tracker is not None:
        full_metrics = {METRIC_LOSS: 0.0, **test_metrics}  # sentinel for tracker schema
        tracker.log_test_metrics(full_metrics)

    return MappingProxyType(test_metrics)

DetectionMetricsAdapter

Computes mAP validation metrics for object detection.

compute_validation_metrics(model, val_loader, criterion, device)

Run detection inference and compute mAP metrics.

Iterates the validation loader, collects predictions and targets, then computes mean Average Precision at multiple IoU thresholds.

Detection models do not produce a single validation loss in eval mode, so "loss" is returned as 0.0.

Parameters:

model (Module, required)
    Detection model to evaluate.
val_loader (DataLoader[Any], required)
    Validation data provider.
criterion (Module, required)
    Ignored (detection models compute losses internally).
device (device, required)
    Hardware target for inference.

Returns:

Mapping[str, float]
    Immutable mapping with keys: loss, map, map_50, map_75.

Source code in orchard/tasks/detection/metrics_adapter.py
def compute_validation_metrics(
    self,
    model: nn.Module,
    val_loader: DataLoader[Any],
    criterion: nn.Module,  # noqa: ARG002
    device: torch.device,
) -> Mapping[str, float]:
    """
    Run detection inference and compute mAP metrics.

    Iterates the validation loader, collects predictions and targets,
    then computes mean Average Precision at multiple IoU thresholds.

    Detection models do not produce a single validation loss in eval
    mode, so ``"loss"`` is returned as ``0.0``.

    Args:
        model: Detection model to evaluate.
        val_loader: Validation data provider.
        criterion: Ignored (detection models compute losses internally).
        device: Hardware target for inference.

    Returns:
        Immutable mapping with keys: ``loss``, ``map``, ``map_50``, ``map_75``.
    """
    model.eval()
    metric = MeanAveragePrecision(iou_type="bbox")

    with torch.no_grad():
        for images, targets in val_loader:
            images = [img.to(device) for img in images]
            predictions = model(images)
            metric.update(
                [to_cpu(p) for p in predictions],
                [to_cpu(t) for t in targets],
            )

    result = metric.compute()

    return MappingProxyType(
        {
            METRIC_LOSS: 0.0,  # sentinel — detection models don't expose validation loss
            METRIC_MAP: float(result["map"]),
            METRIC_MAP_50: float(result["map_50"]),
            METRIC_MAP_75: float(result["map_75"]),
        }
    )
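Both metrics adapters return a MappingProxyType so callers cannot mutate a shared metrics dict after the fact. A small illustration of that read-only behavior (the metric values here are made up):

```python
from types import MappingProxyType

# Illustrative values only; real ones come from metric.compute().
metrics = MappingProxyType(
    {"loss": 0.0, "map": 0.41, "map_50": 0.63, "map_75": 0.45}
)


def try_mutate(m) -> bool:
    """Return True if the mapping rejects item assignment."""
    try:
        m["map"] = 1.0  # mappingproxy does not support item assignment
        return False
    except TypeError:
        return True
```

Reads work exactly as on a dict (metrics["map_50"], iteration, len), but try_mutate(metrics) returns True: any write raises TypeError, so downstream consumers can treat the per-epoch metrics as a value, not shared state.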

DetectionTrainingStepAdapter

Computes detection training loss by summing model-internal losses.

compute_training_loss(model, inputs, targets, criterion, mixup_fn=None, device=None)

Execute detection forward pass and compute total loss.

Moves images and target dicts to device, calls the model in training mode (which returns a loss dict), and sums all loss components into a single scalar for backpropagation.

Parameters:

model (Module, required)
    Detection model (e.g. Faster R-CNN) in training mode.
inputs (Any, required)
    List of image tensors, one per image in the batch.
targets (Any, required)
    List of target dicts, each with boxes and labels.
criterion (Module, required)
    Ignored (detection models compute losses internally).
mixup_fn (Callable[..., Any] | None, default: None)
    Ignored (MixUp is not applicable to detection).
device (device | None, default: None)
    Target device for tensor placement.

Returns:

Tensor
    Scalar loss tensor (sum of all loss components).

Source code in orchard/tasks/detection/training_step_adapter.py
def compute_training_loss(
    self,
    model: nn.Module,
    inputs: Any,
    targets: Any,
    criterion: nn.Module,  # noqa: ARG002
    mixup_fn: Callable[..., Any] | None = None,  # noqa: ARG002
    device: torch.device | None = None,
) -> torch.Tensor:
    """
    Execute detection forward pass and compute total loss.

    Moves images and target dicts to device, calls the model in
    training mode (which returns a loss dict), and sums all loss
    components into a single scalar for backpropagation.

    Args:
        model: Detection model (e.g. Faster R-CNN) in training mode.
        inputs: List of image tensors, one per image in the batch.
        targets: List of target dicts, each with ``boxes`` and ``labels``.
        criterion: Ignored (detection models compute losses internally).
        mixup_fn: Ignored (MixUp is not applicable to detection).
        device: Target device for tensor placement.

    Returns:
        Scalar loss tensor (sum of all loss components).
    """
    if device is not None:
        images = [img.to(device) for img in inputs]
        targets_on_device: list[dict[str, Any]] = [
            {k: v.to(device) for k, v in t.items()} for t in targets
        ]
    else:
        images = list(inputs)
        targets_on_device = list(targets)

    loss_dict = model(images, targets_on_device)
    total_loss = torch.stack(list(loss_dict.values())).sum()
    return total_loss
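The last two lines above simply collapse the model's loss dict into one scalar. The same arithmetic with plain floats standing in for torch.stack(...).sum(), using torchvision-style component names as an illustration (keys and values are assumed, not taken from orchard):

```python
def sum_detection_losses(loss_dict: dict[str, float]) -> float:
    """Collapse per-component detection losses into a single scalar
    (float stand-in for torch.stack(list(d.values())).sum())."""
    return sum(loss_dict.values())


# Illustrative components: classification, box regression, objectness,
# and RPN box regression, as listed in the criterion adapter's docstring.
example = {
    "loss_classifier": 0.8,
    "loss_box_reg": 0.5,
    "loss_objectness": 0.1,
    "loss_rpn_box_reg": 0.05,
}
```

Here sum_detection_losses(example) comes to roughly 1.45, and backpropagating through that single scalar propagates gradients through every component at once, which is why no external criterion is needed.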