reporting

orchard.evaluation.reporting

Reporting & Experiment Summarization Module.

This module generates human-readable artifacts once a training pipeline has finished. It uses Pydantic to validate experiment results and exports them as a formatted Excel summary, a flat CSV file, or a JSON file.

TrainingReport

Bases: BaseModel

Validated data container for summarizing a complete training experiment.

This model serves as the schema for the final experiment metadata. It stores dataset, hyperparameter, and performance information, together with the checkpoint and log paths, to support reproducibility and traceability of the training pipeline.

Attributes:

timestamp (str): ISO-formatted execution time.
architecture (str): Identifier of the architecture used.
dataset (str): Name of the dataset.
best_val_metrics (dict[str, float]): Peak value of each validation metric across epochs.
test_metrics (dict[str, float]): Final metric values on the unseen test set.
is_texture_based (bool): Whether texture-preserving logic was applied.
is_anatomical (bool): Whether anatomical orientation constraints were enforced.
use_tta (bool): Whether Test-Time Augmentation was active.
epochs_trained (int): Number of training epochs completed.
learning_rate (float): Initial learning rate used by the optimizer.
batch_size (int): Samples processed per iteration.
seed (int): Global RNG seed for experiment replication.
augmentations (str): Descriptive string of the transformation pipeline.
normalization (str): Mean/Std statistics applied to the input tensors.
model_path (str): Absolute path to the best saved checkpoint.
log_path (str): Absolute path to the session execution log.

For non-classification tasks, create_structured_report sets is_texture_based, is_anatomical, use_tta, and augmentations to None, and to_vertical_df omits None values from the exported rows.

to_vertical_df()

Converts the Pydantic model into a vertical pandas DataFrame.

Metric dict fields (best_val_metrics, test_metrics) are flattened into prefixed rows (e.g. best_val_accuracy, test_f1). Fields whose value is None are omitted.

Returns:

pd.DataFrame: A two-column DataFrame (Parameter, Value) used for export.

Source code in orchard/evaluation/reporting.py
def to_vertical_df(self) -> pd.DataFrame:
    """
    Converts the Pydantic model into a vertical pandas DataFrame.

    Metric dict fields (``best_val_metrics``, ``test_metrics``) are flattened
    into prefixed rows (e.g. ``best_val_accuracy``, ``test_f1``).

    Returns:
        pd.DataFrame: A two-column DataFrame (Parameter, Value) for Excel export.
    """
    rows: list[tuple[str, object]] = []
    for key, value in self.model_dump().items():
        if value is None:
            continue
        if isinstance(value, dict):
            prefix = key.removesuffix("_metrics")
            for sub_key, sub_value in value.items():
                rows.append((f"{prefix}_{sub_key}", sub_value))
        else:
            rows.append((key, value))
    return pd.DataFrame(rows, columns=["Parameter", "Value"])
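
The flattening rule can be tried without the Pydantic model. The following is a minimal sketch, not the library's code path: `flatten_report` is a hypothetical helper that applies the same loop to a plain dict (standing in for `model_dump()` output) and returns the rows that would be handed to `pd.DataFrame`. The sample values are made up for illustration.

```python
def flatten_report(dumped: dict[str, object]) -> list[tuple[str, object]]:
    """Flatten a model_dump()-style dict into (Parameter, Value) rows."""
    rows: list[tuple[str, object]] = []
    for key, value in dumped.items():
        if value is None:
            continue  # None fields are left out of the report
        if isinstance(value, dict):
            prefix = key.removesuffix("_metrics")  # "test_metrics" -> "test"
            rows.extend((f"{prefix}_{name}", metric) for name, metric in value.items())
        else:
            rows.append((key, value))
    return rows


# Illustrative input only; these values are not real results.
sample = {
    "architecture": "resnet_18",
    "best_val_metrics": {"accuracy": 0.91},
    "test_metrics": {"accuracy": 0.89, "f1": 0.88},
    "use_tta": None,
}
rows = flatten_report(sample)
for parameter, value in rows:
    print(f"{parameter}: {value}")
```

Note that `str.removesuffix` requires Python 3.9 or later, and that the `use_tta` entry produces no row because its value is None.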

save(path, fmt='xlsx')

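Saves the report to disk in the requested format. Parent directories are created as needed. Failures while building or writing the report are logged with logger.error rather than raised.

Supported formats
  • xlsx: Formatted Excel workbook (sheet "Detailed Report") with conditional formatting.
  • csv: Flat CSV (two columns: Parameter, Value).
  • json: Pretty-printed JSON array of Parameter/Value records.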

Parameters:

path (Path): Base file path; its suffix is replaced to match fmt. Required.
fmt (str): Output format, one of "xlsx", "csv", "json". The value is case-insensitive, and any other value falls back to "xlsx". Default: 'xlsx'.
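
The way the output path follows from `path` and `fmt` can be sketched in isolation. `resolve_report_path` below is a hypothetical helper, not part of the module; it only mirrors the suffix selection in `save()` (lower-casing `fmt`, then falling back to `.xlsx` for anything other than `csv` or `json`) and does not write any file.

```python
from pathlib import Path

# Hypothetical lookup table mirroring the suffix selection in save().
_SUFFIXES = {"csv": ".csv", "json": ".json"}


def resolve_report_path(path: Path, fmt: str = "xlsx") -> Path:
    """Return the path save() would write to for the given format."""
    # Unknown formats fall through to ".xlsx", as in the source's else branch.
    suffix = _SUFFIXES.get(fmt.lower(), ".xlsx")
    return path.with_suffix(suffix)


csv_path = resolve_report_path(Path("runs/exp1/summary"), "CSV")
json_path = resolve_report_path(Path("runs/exp1/summary.tmp"), "json")
fallback_path = resolve_report_path(Path("runs/exp1/summary"), "parquet")
print(csv_path, json_path, fallback_path)
```

Because `Path.with_suffix` replaces only the last suffix, a base path that already carries an extension (such as `summary.tmp`) ends up with the new extension in its place.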
Source code in orchard/evaluation/reporting.py
def save(self, path: Path, fmt: str = "xlsx") -> None:
    """
    Saves the report to disk in the requested format.

    Supported formats:
        - ``xlsx``: Professional Excel with conditional formatting.
        - ``csv``: Flat CSV (two columns: Parameter, Value).
        - ``json``: Pretty-printed JSON array.

    Args:
        path: Base file path (suffix is replaced to match *fmt*).
        fmt: Output format — one of ``"xlsx"``, ``"csv"``, ``"json"``.
    """
    fmt = fmt.lower()
    path.parent.mkdir(parents=True, exist_ok=True)

    try:
        df = self.to_vertical_df()
        if fmt == "csv":
            path = path.with_suffix(".csv")
            df.to_csv(path, index=False)
        elif fmt == "json":
            path = path.with_suffix(".json")
            df.to_json(path, orient="records", indent=2)
        else:
            path = path.with_suffix(".xlsx")
            with pd.ExcelWriter(
                path,
                engine="xlsxwriter",
                engine_kwargs={"options": {"nan_inf_to_errors": True}},
            ) as writer:  # pragma: no mutate block
                df.to_excel(writer, sheet_name="Detailed Report", index=False)
                self._apply_excel_formatting(writer, df)

        logger.info(
            "%s%s %-18s: %s", LogStyle.INDENT, LogStyle.ARROW, "Summary Report", path.name
        )
    except Exception as e:  # xlsxwriter raises non-standard exceptions
        logger.error("Failed to generate report: %s", e)
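
For orientation, the JSON output is an array of records with `Parameter` and `Value` keys, since the DataFrame is written with `orient="records"`. The sketch below approximates that shape with the standard `json` module instead of pandas; the exact whitespace and number formatting of the real file may differ, and the rows are illustrative only.

```python
import json

# Illustrative rows only; these values are not real results.
rows = [("architecture", "resnet_18"), ("test_accuracy", 0.89), ("use_tta", True)]

# One object per row, keyed by the two DataFrame column names.
records = [{"Parameter": parameter, "Value": value} for parameter, value in rows]
payload = json.dumps(records, indent=2)
print(payload)
```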

create_structured_report(val_metrics, test_metrics, train_losses, best_path, log_path, arch_name, dataset, training, aug_info=None, task_type='classification')

Constructs a TrainingReport object using final metrics and configuration.

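This factory function aggregates disparate pipeline results into a single validated container. It computes the best validation value for each metric, resolves the checkpoint and log paths to absolute form, and records the supplied augmentation summary.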

Parameters:

val_metrics (Sequence[Mapping[str, float]]): History of per-epoch validation metric dicts. Required.
test_metrics (Mapping[str, float]): Final test-set metric mapping (all task metrics included). Required.
train_losses (Sequence[float]): History of per-epoch training losses; its length is recorded as epochs_trained. Required.
best_path (Path): Path to the saved model weights. Required.
log_path (Path): Path to the run log file. Required.
arch_name (str): Architecture identifier (e.g. "resnet_18"). Required.
dataset (DatasetConfig): Dataset sub-config with metadata, name, and normalization info. Required.
training (TrainingConfig): Training sub-config with hyperparameters and flags. Required.
aug_info (str | None): Pre-formatted augmentation string; "N/A" is recorded for a classification task when it is None or empty. Default: None.
task_type (str): Task type. The classification-only fields (is_texture_based, is_anatomical, use_tta, augmentations) are populated only when this is "classification"; for other task types the "loss" entry is also dropped from best_val_metrics. Default: 'classification'.

Returns:

TrainingReport: A validated Pydantic model ready for export.

Source code in orchard/evaluation/reporting.py
def create_structured_report(
    val_metrics: Sequence[Mapping[str, float]],
    test_metrics: Mapping[str, float],
    train_losses: Sequence[float],
    best_path: Path,
    log_path: Path,
    arch_name: str,
    dataset: DatasetConfig,
    training: TrainingConfig,
    aug_info: str | None = None,
    task_type: str = "classification",  # pragma: no mutate
) -> TrainingReport:
    """
    Constructs a TrainingReport object using final metrics and configuration.

    This factory method aggregates disparate pipeline results into a single
    validated container, resolving paths and extracting augmentation summaries.

    Args:
        val_metrics: History of per-epoch validation metric dicts.
        test_metrics: Final test-set metric mapping (all task metrics included).
        train_losses: History of per-epoch training losses.
        best_path: Path to the saved model weights.
        log_path: Path to the run log file.
        arch_name: Architecture identifier (e.g. ``"resnet_18"``).
        dataset: Dataset sub-config with metadata, name, and normalization info.
        training: Training sub-config with hyperparameters and flags.
        aug_info: Pre-formatted augmentation string.
        task_type: Task type for conditional field inclusion.

    Returns:
        TrainingReport: A validated Pydantic model ready for export.
    """

    def _safe_max(key: str) -> float:
        """Return the best non-NaN value for *key* across validation epochs."""
        values = [m[key] for m in val_metrics if not math.isnan(m[key])]
        return max(values, default=0.0)

    # Build best-val metrics generically from all keys present in history
    all_keys: set[str] = set()
    for m in val_metrics:
        all_keys.update(m.keys())
    best_val_metrics = {key: _safe_max(key) for key in sorted(all_keys)}

    # Classification-only fields (None → excluded from report)
    is_classification = task_type == "classification"

    # Detection: drop the METRIC_LOSS sentinel (always 0.0, not meaningful)
    if not is_classification:
        best_val_metrics.pop("loss", None)  # pragma: no mutate

    return TrainingReport(
        architecture=arch_name,
        dataset=dataset.dataset_name,
        best_val_metrics=best_val_metrics,
        test_metrics=dict(test_metrics),
        is_texture_based=dataset.metadata.is_texture_based if is_classification else None,
        is_anatomical=dataset.metadata.is_anatomical if is_classification else None,
        use_tta=training.use_tta if is_classification else None,
        epochs_trained=len(train_losses),
        learning_rate=training.learning_rate,
        batch_size=training.batch_size,
        augmentations=(aug_info or "N/A") if is_classification else None,
        normalization=dataset.metadata.normalization_info,
        model_path=str(best_path.resolve()),
        log_path=str(log_path.resolve()),
        seed=training.seed,
    )
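
The best-validation aggregation can be illustrated on its own. This is a minimal sketch under stated assumptions: `best_val_metrics` here is a hypothetical standalone function (in the source the logic lives inside `create_structured_report`), and the metric history is invented for the example.

```python
import math
from collections.abc import Mapping, Sequence


def best_val_metrics(history: Sequence[Mapping[str, float]]) -> dict[str, float]:
    """Return the maximum non-NaN value of each metric across epochs."""
    keys = sorted({key for epoch in history for key in epoch})
    best: dict[str, float] = {}
    for key in keys:
        # Like the source, this indexes epoch[key] directly, so it assumes
        # every epoch reports every metric.
        valid = [epoch[key] for epoch in history if not math.isnan(epoch[key])]
        best[key] = max(valid, default=0.0)  # 0.0 when every value is NaN
    return best


# Illustrative history only; not real results.
history = [
    {"accuracy": 0.80, "auc": float("nan")},
    {"accuracy": 0.86, "auc": float("nan")},
    {"accuracy": 0.84, "auc": float("nan")},
]
best = best_val_metrics(history)
print(best)
```

Because the aggregation is a plain maximum, a metric for which lower is better (such as a loss) is reported at its largest value, not its best one.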