Configuration File
Users interact with the toolbox through a configuration file that lists the tunable system parameters. The file is written in YAML, a simple and concise language that maps easily onto native data structures. Because it is easy to read, it is accessible to developers and non-developers alike, and it makes changes between experiments easier to track over time.
1. Defining the Task and Path
name: 'AIDE'
# Addressed task, choices: Classification, OutlierDetection, ImpactAssessment
task: ...
# Train from scratch (True) or use a previously saved model and skip the training phase (False)
from_scratch: ...
# Path to the best model, required if from_scratch: False
best_run_path: ''
# Directory to save model outputs and results
save_path: "experiments/"
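As a minimal sketch of how these top-level keys could be checked once the file has been parsed into a Python dictionary, the function below encodes the two constraints stated in the comments: the allowed task names, and the need for best_run_path when from_scratch is False. The helper name check_task_config and the example values are hypothetical and are not part of the toolbox.

```python
# Hypothetical helper, not part of the toolbox: checks the top-level keys
# of an already-parsed configuration dictionary.
ALLOWED_TASKS = ("Classification", "OutlierDetection", "ImpactAssessment")


def check_task_config(config: dict) -> list:
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    task = config.get("task")
    if task not in ALLOWED_TASKS:
        problems.append(f"task must be one of {ALLOWED_TASKS}, got {task!r}")
    # A saved model is only needed when the training phase is skipped.
    if config.get("from_scratch") is False and not config.get("best_run_path"):
        problems.append("best_run_path is required when from_scratch is False")
    return problems


# Example values are illustrative only.
example = {
    "name": "AIDE",
    "task": "Classification",
    "from_scratch": True,
    "best_run_path": "",
    "save_path": "experiments/",
}
problems = check_task_config(example)
```

Returning a list rather than raising on the first problem lets all issues in a file be reported at once.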
2. Defining the Dataset
This section points to the Dataset class and can be customized by adding more variables. See Section Database for more details on how to create your own Dataset class.
# Database and DataLoader definition
data:
  name: ... # Dataset class name
  data_dim: ... # Data dimension
  input_size: ... # Number of features
  features: # Name of the features of the database
  features_selected: ... # Features selected from the whole set of features
  num_classes: ... # Number of categories in the database (e.g., drought, non-drought)
  lon_slice_test: ... # If 2D visualization is enabled, min/max longitude coordinates (test)
  lat_slice_test: ... # If 2D visualization is enabled, min/max latitude coordinates (test)
3. Defining the Model
To specify the architecture to train, use the parameter type. This can be a user-defined model or a model available in the toolbox (see Section Available models for more details).
# Architecture definition
arch:
  # Select a user-defined model (true/false)
  user_defined: ...
  # Type of architecture to be used (e.g., 'UNET')
  type: ...
  # Parameters to configure the architecture
  params:
    param_1: ...
  # Model input dimension (1: 1D, 2: 2D)
  input_model_dim: ...
  # Model output dimension (1: 1D, 2: 2D)
  output_model_dim: ...
4. Defining the Training
This part of the configuration file allows for specifying the parameters of:
loss function: Can be either custom or taken from a Python package. To use a loss from a Python package, set user_defined: False and specify the loss name and its package (e.g., type: 'sigmoid_focal_loss' and package: 'torchvision.ops'). For custom losses, see Section Custom Loss in Advanced features.
optimizer: Defines the parameters used to initialize the optimizer. type can be any optimizer available in torch.optim.
trainer: Defines the parameters used to initialize the PyTorch Lightning trainer.
dataloader: Defines the number of workers for the PyTorch Lightning dataloader.
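The loss type/package pair and the optimizer type are both names that must be resolved to Python objects at run time. One common way to do this is sketched below with importlib. It illustrates the general technique and is not the toolbox's actual code, and the helper name resolve_by_name is hypothetical. A standard-library function stands in for the loss so that the example runs without PyTorch installed.

```python
import importlib


def resolve_by_name(package: str, name: str):
    """Import `package` and return the attribute called `name` from it."""
    module = importlib.import_module(package)
    try:
        return getattr(module, name)
    except AttributeError as err:
        raise ValueError(f"{name!r} not found in package {package!r}") from err


# Stand-in for a loss entry such as
#   type: 'sigmoid_focal_loss', package: 'torchvision.ops'
# The standard library is used here so the sketch has no extra dependencies.
loss_cfg = {"user_defined": False, "type": "fsum", "package": "math"}
loss_fn = resolve_by_name(loss_cfg["package"], loss_cfg["type"])
total = loss_fn([0.5, 0.25, 0.25])
```

With PyTorch installed, the same helper called with package 'torch.optim' and an optimizer name such as 'Adam' would return the corresponding optimizer class.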
# Definition of the training stage
implementation:
  # Loss function
  loss:
    user_defined: ... # Select user-defined loss (true/false)
    type: ... # Python class name
    package: ... # Python package, none for user defined
    activation:
      type: ... # Activation before computing the loss function
    masked: ... # Use masks to compute loss
    # Parameters for the loss function
    params:
      reduction: 'none'
      param_1: ...
  # Definition of the optimizer
  optimizer:
    type: ... # Optimizer type
    lr: ... # Learning rate
    weight_decay: ... # Weight decay
    gclip_value: ... # Gradient clipping value
  # Definition of PyTorch Lightning trainer
  trainer:
    accelerator: ... # Choices: gpu/cpu
    devices: ... # Devices to use
    epochs: ... # Number of epochs
    batch_size: ... # Batch size
    monitor: # Metric to be monitored during training
      split: ... # Choices: train/val/test
      metric: ... # Either loss or a metric's name to monitor for early stopping and checkpoints
    monitor_mode: ... # Monitor mode (increase or decrease monitored metric value)
    early_stop: ... # Number of steps to perform early stopping
  # Definition of PyTorch data loader
  data_loader:
    num_workers: ... # Number of CPUs to read the data in parallel
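To illustrate how monitor, monitor_mode and early_stop could interact, the following sketch assumes that early_stop acts as a patience value (the number of consecutive checks without improvement that are tolerated) and that monitor_mode takes the values 'min' or 'max'. Both are assumptions, so the accepted values should be checked against the toolbox. The function name stopping_epoch is hypothetical.

```python
def stopping_epoch(values, mode="min", patience=3):
    """Return the index of the epoch at which training would stop,
    or None if the patience is never exhausted.

    values   : monitored metric per epoch (e.g. validation loss)
    mode     : 'min' if lower is better, 'max' if higher is better
    patience : consecutive epochs without improvement that are tolerated
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    best = None
    epochs_without_improvement = 0
    for epoch, value in enumerate(values):
        improved = (
            best is None
            or (mode == "min" and value < best)
            or (mode == "max" and value > best)
        )
        if improved:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Illustrative validation loss that stops improving after epoch 2.
val_loss = [0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.6]
stop_at = stopping_epoch(val_loss, mode="min", patience=3)
```

In this example the best value is reached at epoch 2, and the three following epochs bring no improvement, so the sketch stops at epoch 5.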
5. Defining the Evaluation
The toolbox provides several modules for evaluation at inference: metrics, visualization, characterization and XAI. The metrics module is always run, while the others can be activated or deactivated. For more details on the capabilities of each module, please refer to Section Evaluations.
# Types of chosen evaluations, choices: Visualization, Characterization, XAI
evaluation:
  metrics:
    Metric_1: {param_1: ...} # Metric for evaluation, from torchmetrics. Metric_1 has to be the name of the metric as in torchmetrics docs
  visualization:
    activate: ... # Choices: True/False
    params:
      param_1: ...
  characterization:
    activate: ... # Choices: True/False
    params:
      param_1: ...
  xai:
    activate: ... # Choices: True/False
    params:
      param_1: ...
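The activation logic described in this section can be sketched as follows for a configuration that has already been parsed into a dictionary: metrics are always selected, and each optional module is selected only when its activate flag is True. The function name modules_to_run is hypothetical and the values are illustrative; this is not the toolbox's own dispatch code.

```python
OPTIONAL_MODULES = ("visualization", "characterization", "xai")


def modules_to_run(evaluation_cfg: dict) -> list:
    """List the evaluation modules that would run for a parsed config."""
    selected = ["metrics"]  # metrics are always computed
    for name in OPTIONAL_MODULES:
        section = evaluation_cfg.get(name) or {}
        if section.get("activate") is True:
            selected.append(name)
    return selected


# Illustrative values only; 'Metric_1' and 'param_1' are the placeholders
# used in the template above.
evaluation_cfg = {
    "metrics": {"Metric_1": {"param_1": None}},
    "visualization": {"activate": True, "params": {}},
    "characterization": {"activate": False, "params": {}},
    "xai": {"activate": False, "params": {}},
}
selected = modules_to_run(evaluation_cfg)
```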