orchard.evaluation.reporting

Reporting & Experiment Summarization Module.

This module orchestrates the generation of human-readable artifacts following the completion of a training pipeline. It leverages Pydantic for strict validation of experiment results and transforms raw metrics into structured, professionally formatted Excel summaries.

TrainingReport

Bases: BaseModel

Validated data container for summarizing a complete training experiment.

This model serves as a schema for the final experiment metadata. It stores hardware, hyperparameter, and performance state to ensure full reproducibility and traceability of the training pipeline.

Attributes:

    timestamp (str): ISO formatted execution time.
    architecture (str): Identifier of the architecture used.
    dataset (str): Name of the dataset.
    best_val_accuracy (float): Peak accuracy achieved on the validation set.
    best_val_auc (float): Peak ROC-AUC achieved on the validation set.
    best_val_f1 (float): Peak macro-averaged F1 on the validation set.
    test_accuracy (float): Final accuracy on the unseen test set.
    test_auc (float): Final ROC-AUC on the unseen test set.
    test_macro_f1 (float): Macro-averaged F1 score (key for imbalanced data).
    is_texture_based (bool): Whether texture-preserving logic was applied.
    is_anatomical (bool): Whether anatomical orientation constraints were enforced.
    use_tta (bool): Whether Test-Time Augmentation was active.
    epochs_trained (int): Total number of optimization cycles completed.
    learning_rate (float): Initial learning rate used by the optimizer.
    batch_size (int): Samples processed per iteration.
    seed (int): Global RNG seed for experiment replication.
    augmentations (str): Descriptive string of the transformation pipeline.
    normalization (str): Mean/std statistics applied to the input tensors.
    model_path (str): Absolute path to the best saved checkpoint.
    log_path (str): Absolute path to the session execution log.
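The attributes above map one-to-one onto model fields. As a rough illustration of how such a report is populated and flattened, here is a minimal stand-in using a plain dataclass (the real TrainingReport is a Pydantic BaseModel; the field subset and values here are hypothetical):

```python
from dataclasses import asdict, dataclass


@dataclass
class ReportSketch:
    """Hypothetical subset of TrainingReport's fields, as a plain dataclass."""

    timestamp: str
    architecture: str
    test_accuracy: float
    seed: int


# Populate the sketch with illustrative values and flatten it to a dict,
# analogous to Pydantic's model_dump().
report = ReportSketch(
    timestamp="2024-01-01T12:00:00",
    architecture="resnet_18",
    test_accuracy=0.93,
    seed=42,
)
fields = asdict(report)
```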

to_vertical_df()

Converts the Pydantic model into a vertical pandas DataFrame.

Returns:

    pd.DataFrame: A two-column DataFrame (Parameter, Value) for Excel export.

Source code in orchard/evaluation/reporting.py
def to_vertical_df(self) -> pd.DataFrame:
    """
    Converts the Pydantic model into a vertical pandas DataFrame.

    Returns:
        pd.DataFrame: A two-column DataFrame (Parameter, Value) for Excel export.
    """
    data = self.model_dump()
    return pd.DataFrame(list(data.items()), columns=["Parameter", "Value"])
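The two-column layout can be reproduced with pandas alone; the field names and values below are illustrative, not taken from a real run:

```python
import pandas as pd

# A flat mapping of report fields (the shape model_dump() would produce) ...
data = {"architecture": "resnet_18", "best_val_accuracy": 0.94, "seed": 42}

# ... becomes a vertical two-column frame: one row per parameter.
df = pd.DataFrame(list(data.items()), columns=["Parameter", "Value"])
```

Each dict entry turns into one row, so the frame grows vertically with the number of fields rather than horizontally.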

save(path, fmt='xlsx')

Saves the report to disk in the requested format.

Supported formats
  • xlsx: Professional Excel with conditional formatting.
  • csv: Flat CSV (two columns: Parameter, Value).
  • json: Pretty-printed JSON array.

Parameters:

    path (Path, required): Base file path (suffix is replaced to match fmt).
    fmt (str, default 'xlsx'): Output format, one of "xlsx", "csv", "json".
Source code in orchard/evaluation/reporting.py
def save(
    self, path: Path, fmt: str = "xlsx"  # pragma: no mutate  # .lower() normalizes any case
) -> None:
    """
    Saves the report to disk in the requested format.

    Supported formats:
        - ``xlsx``: Professional Excel with conditional formatting.
        - ``csv``: Flat CSV (two columns: Parameter, Value).
        - ``json``: Pretty-printed JSON array.

    Args:
        path: Base file path (suffix is replaced to match *fmt*).
        fmt: Output format — one of ``"xlsx"``, ``"csv"``, ``"json"``.
    """
    fmt = fmt.lower()
    path.parent.mkdir(parents=True, exist_ok=True)

    try:
        df = self.to_vertical_df()
        if fmt == "csv":
            path = path.with_suffix(".csv")
            df.to_csv(path, index=False)  # pragma: no mutate  # None ≡ False in pandas
        elif fmt == "json":
            path = path.with_suffix(".json")
            df.to_json(path, orient="records", indent=2)
        else:
            path = path.with_suffix(".xlsx")
            with pd.ExcelWriter(  # pragma: no mutate
                path,
                engine="xlsxwriter",  # pragma: no mutate
                engine_kwargs={"options": {"nan_inf_to_errors": True}},  # pragma: no mutate
            ) as writer:
                # fmt: off
                df.to_excel(writer, sheet_name="Detailed Report", index=False)  # pragma: no mutate
                # fmt: on
                self._apply_excel_formatting(writer, df)

        logger.info(
            "%s%s %-18s: %s", LogStyle.INDENT, LogStyle.ARROW, "Summary Report", path.name
        )
    except Exception as e:  # xlsxwriter raises non-standard exceptions
        logger.error("Failed to generate report: %s", e)
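The dispatch above always rewrites the file suffix to match fmt (case-insensitively) and falls back to xlsx for unrecognized values. That normalization can be sketched in isolation; resolve_output_path is a hypothetical helper, not part of the module:

```python
from pathlib import Path


def resolve_output_path(path: Path, fmt: str = "xlsx") -> Path:
    """Lowercase fmt and replace the suffix, mirroring save()'s behaviour."""
    suffix = {"csv": ".csv", "json": ".json"}.get(fmt.lower(), ".xlsx")
    return path.with_suffix(suffix)
```

Note the fallback: any unknown fmt silently resolves to an .xlsx path rather than raising.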

create_structured_report(val_metrics, test_metrics, macro_f1, train_losses, best_path, log_path, arch_name, dataset, training, aug_info=None)

Constructs a TrainingReport object from final metrics and configuration.

This factory function aggregates disparate pipeline results into a single validated container, resolving paths and extracting augmentation summaries.

Parameters:

    val_metrics (Sequence[Mapping[str, float]], required): History of per-epoch validation metric dicts.
    test_metrics (dict[str, float], required): Final test-set metric dict (accuracy, auc, etc.).
    macro_f1 (float, required): Final macro F1 score on the test set.
    train_losses (Sequence[float], required): History of per-epoch training losses.
    best_path (Path, required): Path to the saved model weights.
    log_path (Path, required): Path to the run log file.
    arch_name (str, required): Architecture identifier (e.g. "resnet_18").
    dataset (DatasetConfig, required): Dataset sub-config with metadata, name, and normalization info.
    training (TrainingConfig, required): Training sub-config with hyperparameters and flags.
    aug_info (str | None, default None): Pre-formatted augmentation string.

Returns:

    TrainingReport: A validated Pydantic model ready for export.

Source code in orchard/evaluation/reporting.py
def create_structured_report(
    val_metrics: Sequence[Mapping[str, float]],
    test_metrics: dict[str, float],
    macro_f1: float,
    train_losses: Sequence[float],
    best_path: Path,
    log_path: Path,
    arch_name: str,
    dataset: DatasetConfig,
    training: TrainingConfig,
    aug_info: str | None = None,
) -> TrainingReport:
    """
    Constructs a TrainingReport object using final metrics and configuration.

    This factory method aggregates disparate pipeline results into a single
    validated container, resolving paths and extracting augmentation summaries.

    Args:
        val_metrics: History of per-epoch validation metric dicts.
        test_metrics: Final test-set metric dict (accuracy, auc, etc.).
        macro_f1: Final Macro F1 score on test set.
        train_losses: History of per-epoch training losses.
        best_path: Path to the saved model weights.
        log_path: Path to the run log file.
        arch_name: Architecture identifier (e.g. ``"resnet_18"``).
        dataset: Dataset sub-config with metadata, name, and normalization info.
        training: Training sub-config with hyperparameters and flags.
        aug_info: Pre-formatted augmentation string.

    Returns:
        TrainingReport: A validated Pydantic model ready for export.
    """
    # Augmentation info is expected from caller; fallback to "N/A"
    aug_info = aug_info or "N/A"

    def _safe_max(key: str) -> float:
        """Return the best non-NaN value for *key* across validation epochs."""
        values = [m[key] for m in val_metrics if not math.isnan(m[key])]
        return max(values, default=0.0)

    best_val_acc = _safe_max(METRIC_ACCURACY)
    best_val_auc = _safe_max(METRIC_AUC)
    best_val_f1 = _safe_max(METRIC_F1)

    return TrainingReport(
        architecture=arch_name,
        dataset=dataset.dataset_name,
        best_val_accuracy=best_val_acc,
        best_val_auc=best_val_auc,
        best_val_f1=best_val_f1,
        test_accuracy=test_metrics[METRIC_ACCURACY],
        test_auc=test_metrics[METRIC_AUC],
        test_macro_f1=macro_f1,
        is_texture_based=dataset.metadata.is_texture_based,
        is_anatomical=dataset.metadata.is_anatomical,
        use_tta=training.use_tta,
        epochs_trained=len(train_losses),
        learning_rate=training.learning_rate,
        batch_size=training.batch_size,
        augmentations=aug_info,
        normalization=dataset.metadata.normalization_info,
        model_path=str(best_path.resolve()),
        log_path=str(log_path.resolve()),
        seed=training.seed,
    )
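The _safe_max helper guards the best-metric search against NaN epochs (e.g. an undefined AUC on a degenerate validation batch) and returns 0.0 when no valid value exists. Its behaviour can be reproduced standalone:

```python
import math


def safe_max(values: list[float]) -> float:
    """Best non-NaN value in the history, or 0.0 when nothing valid was recorded."""
    finite = [v for v in values if not math.isnan(v)]
    return max(finite, default=0.0)
```

Using `max(..., default=0.0)` avoids the ValueError a bare `max()` would raise on an empty sequence, so an all-NaN or empty metric history degrades to 0.0 instead of crashing report generation.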