Hyperparameter Optimization Guide

Quick Start

# Install Optuna (if not already present)
pip install optuna plotly timm  # timm required for ViT support

# Run optimization with presets (28×28 MedMNIST)
orchard run recipes/optuna_resnet_18.yaml          # 50 trials, ~15 min GPU, ~2.5h CPU
orchard run recipes/optuna_mini_cnn.yaml           # 50 trials, ~1-2 min GPU, ~5 min CPU

# 32×32 resolution (CIFAR-10/100)
orchard run recipes/optuna_cifar100_mini_cnn.yaml  # 50 trials, ~1-2h GPU
orchard run recipes/optuna_cifar100_resnet_18.yaml # 50 trials, ~3-4h GPU

# 128×128 resolution - timm model search
orchard run recipes/optuna_128.yaml                # 20 trials, ~2-4h GPU

# 224×224 resolution (includes weight variant search for ViT)
orchard run recipes/optuna_efficientnet_b0.yaml    # 20 trials, ~1.5-5h GPU
orchard run recipes/optuna_vit_tiny.yaml           # 20 trials, ~3-5h GPU

# Custom search via --set overrides
orchard run recipes/optuna_resnet_18.yaml \
    --set dataset.name=pathmnist \
    --set optuna.n_trials=20 \
    --set training.epochs=10 \
    --set optuna.search_space_preset=quick

# Resume interrupted study
orchard run recipes/optuna_vit_tiny.yaml \
    --set optuna.load_if_exists=true

Search Space Coverage

Select a preset via search_space_preset:

Preset           Parameters            Use case
full (default)   16 parameters         Comprehensive search
quick            7 parameters          Rapid exploration
architectures    Full + model search   Best model-hyperparameter combo

Full Space parameters:
- Optimization: optimizer_type, learning_rate, weight_decay, momentum, min_lr
- Loss: criterion_type, focal_gamma, label_smoothing
- Regularization: mixup_alpha, dropout
- Scheduling: scheduler_type, scheduler_patience
- Augmentation: rotation_angle, jitter_val, min_scale
- Batch size: resolution-aware categorical choices
  - ≤64×64: batch_size_low_res — [16, 32, 48, 64]
  - 128×128: batch_size_low_res — [16, 32, 48, 64]
  - 224×224: batch_size_high_res — [8, 12, 16] (OOM-safe for 8GB VRAM)
- Architecture (requires enable_model_search: true):
  - ≤64×64: [resnet_18, mini_cnn]
  - 128×128: [timm/efficientnet_lite0, timm/convnextv2_nano, timm/mobilenetv3_large_100]
  - 224×224: [resnet_18, efficientnet_b0, convnext_tiny, vit_tiny]
- Weight variants (ViT only, 224×224):
  - vit_tiny_patch16_224.augreg_in21k_ft_in1k
  - vit_tiny_patch16_224.augreg_in21k
  - Default variant
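The resolution-aware batch-size choices above can be sketched as a small lookup. This is a hypothetical helper for illustration (batch_size_choices is not part of orchard); the parameter names and candidate lists come from the table above:

```python
def batch_size_choices(resolution: int) -> tuple[str, list[int]]:
    """Return (parameter name, candidate values) for an input resolution.

    Mirrors the Full Space table: resolutions up to 128x128 share one
    categorical; 224x224 uses a smaller set that is OOM-safe on 8GB VRAM.
    """
    if resolution <= 128:
        return "batch_size_low_res", [16, 32, 48, 64]
    if resolution == 224:
        return "batch_size_high_res", [8, 12, 16]
    raise ValueError(f"unsupported resolution: {resolution}")
```

For example, batch_size_choices(224) yields the OOM-safe high-resolution set, while 28×28 MedMNIST and 32×32 CIFAR runs fall into the low-resolution branch.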

Quick Space: learning_rate, weight_decay, momentum, min_lr, batch_size, dropout

Model Search

Set enable_model_search: true to let Optuna automatically explore all registered architectures for the target resolution alongside the other hyperparameters:

optuna:
  n_trials: 20
  enable_model_search: true   # Explore architectures automatically

When enabled, the optimizer treats the model architecture as an additional categorical hyperparameter, selecting from all models compatible with the configured resolution. This is the recommended approach for finding the best architecture–hyperparameter combination without manual experimentation.
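Conceptually, model search adds one more categorical suggestion to the objective. A minimal sketch with a stand-in trial object (StubTrial and sample_model are illustrative; the candidate lists mirror Search Space Coverage, but the real orchard objective differs):

```python
import random

# Architecture candidates per resolution, as listed in Search Space Coverage.
ARCHS = {
    64: ["resnet_18", "mini_cnn"],
    128: ["timm/efficientnet_lite0", "timm/convnextv2_nano",
          "timm/mobilenetv3_large_100"],
    224: ["resnet_18", "efficientnet_b0", "convnext_tiny", "vit_tiny"],
}

class StubTrial:
    """Stand-in exposing the one Optuna-style call used below."""
    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)
        self.params: dict[str, str] = {}

    def suggest_categorical(self, name, choices):
        value = self.rng.choice(choices)
        self.params[name] = value
        return value

def sample_model(trial, resolution: int) -> str:
    # With enable_model_search, the architecture is just another
    # categorical hyperparameter drawn alongside the rest of the space.
    return trial.suggest_categorical("model", ARCHS[resolution])
```

In a real study the sampler (e.g. TPE) then learns which architecture tends to pair well with which hyperparameter values, rather than picking uniformly as this stub does.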

Optimization Workflow

# Phase 1: Comprehensive search (configurable trials, early stopping enabled)
orchard run recipes/optuna_efficientnet_b0.yaml

# Phase 2: Review results
firefox outputs/*/figures/param_importances.html
firefox outputs/*/figures/optimization_history.html

# Phase 3: Train with best config (60 epochs, full evaluation)
orchard run outputs/*/reports/best_config.yaml

Artifacts Generated

See the Artifact Reference Guide for complete documentation of all generated files.

Customization

Search Space Overrides (YAML-based)

Override any parameter from the search space directly in your recipe YAML, without code changes:

optuna:
  search_space_preset: full
  search_space_overrides:
    learning_rate:
      low: 1.0e-04           # Narrower range for stable convergence
      high: 5.0e-03
      log: true
    dropout:
      low: 0.15
      high: 0.4
    batch_size_low_res:       # Categorical: provide a list
      - 32
      - 48
      - 64

Float parameters require low and high; add log: true (default false) to sample on a log scale. Categorical parameters are plain lists.
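One plausible way such override entries map onto suggestion calls, sketched with a stand-in trial (StubTrial and apply_override are illustrative assumptions; orchard's actual dispatch may differ):

```python
import math
import random

class StubTrial:
    """Stand-in exposing the two suggest methods the overrides need."""
    def __init__(self, seed: int = 0):
        self.rng = random.Random(seed)

    def suggest_float(self, name, low, high, log=False):
        if log:
            # Sample uniformly in log space, then map back.
            return math.exp(self.rng.uniform(math.log(low), math.log(high)))
        return self.rng.uniform(low, high)

    def suggest_categorical(self, name, choices):
        return self.rng.choice(choices)

def apply_override(trial, name, spec):
    """Dispatch one search_space_overrides entry to a suggest call.

    A plain list means categorical; a mapping with low/high means a
    float range, optionally log-scaled via spec["log"].
    """
    if isinstance(spec, list):
        return trial.suggest_categorical(name, spec)
    return trial.suggest_float(name, spec["low"], spec["high"],
                               log=spec.get("log", False))
```

With the recipe above, apply_override(trial, "learning_rate", {"low": 1e-4, "high": 5e-3, "log": True}) samples log-uniformly in [1e-4, 5e-3], and apply_override(trial, "batch_size_low_res", [32, 48, 64]) picks one of the three listed sizes.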

All parameters listed in Search Space Coverage can be overridden.