minerva.pipelines.experiment
============================

.. py:module:: minerva.pipelines.experiment


Classes
-------

.. autoapisummary::

   minerva.pipelines.experiment.Experiment
   minerva.pipelines.experiment.ModelConfig
   minerva.pipelines.experiment.ModelInformation
   minerva.pipelines.experiment.ModelInstantiator


Functions
---------

.. autoapisummary::

   minerva.pipelines.experiment.get_trainer
   minerva.pipelines.experiment.load_predictions
   minerva.pipelines.experiment.load_results
   minerva.pipelines.experiment.perform_evaluation
   minerva.pipelines.experiment.perform_predict
   minerva.pipelines.experiment.perform_train
   minerva.pipelines.experiment.save_predictions
   minerva.pipelines.experiment.save_results


Module Contents
---------------

.. py:class:: Experiment(experiment_name, model_config, data_module, pretrained_backbone_ckpt_path = None, root_log_dir = './logs', execution_id = 0, checkpoint_metrics = None, max_epochs = 100, accelerator = 'gpu', devices = 1, strategy = 'auto', num_nodes = 1, limit_train_batches = None, limit_val_batches = None, limit_test_batches = None, limit_predict_batches = None, evaluation_metrics = None, per_sample_evaluation_metrics = None, seed = None, progress_bar_refresh_rate = 1, profiler = None, save_predictions = True, save_results = True, add_last_checkpoint = True)

   Bases: :py:obj:`minerva.pipelines.base.Pipeline`

   Pipelines provide a versatile API for automating tasks efficiently. They are runnable objects that keep track of their parameters, results, and status, allowing the reproducibility and traceability of the experiments.

   This is the base class for all pipelines. It provides the basic structure for running a pipeline and saving the results and status of the runs. Users should inherit from this class and implement the `_run` method.

   Pipelines are clonal objects, meaning that they can be cloned to create new pipelines with the same configuration. Cloned pipelines do receive a new pipeline_id and run_count.
   Pipelines expose their public API through properties (which are read-only) and through the `run` method. Users should not access or modify the internal attributes directly. The `run` method may set attributes (exposed as properties) that can be accessed during or after the run. The `run` method may return a result, which can be cached and accessed through the `result` property (if `cache_result` is set to True).

   An experiment is a pipeline that contains all the parameters needed to train and evaluate a model, as well as to manage the logging, checkpointing, prediction, and results processes in a coherent way.

   Parameters
   ----------
   experiment_name : str
       The name of the experiment. This name will be used to create a directory for the experiment in the log directory.
   model_config : ModelConfig
       The model configuration. This object contains the model instantiator and the model information.
   data_module : MinervaDataModule
       The data module. This object contains the training, validation, and test datasets, as well as the data loaders. For now, datasets must return a 2-element tuple (input, label) for each sample.
   pretrained_backbone_ckpt_path : Optional[PathLike], optional
       The path to the pretrained backbone checkpoint. This is used to finetune the model. If None, the model will be trained from scratch. This parameter controls the lazy instantiation of the model: the `create_model_and_load_backbone` method of the model instantiator is called if `pretrained_backbone_ckpt_path` is not None, and the `create_model_randomly_initialized` method is called if it is None. By default None
   root_log_dir : PathLike, optional
       Root directory for logging and checkpoints. This directory will be used to create a subdirectory for the experiment. By default ./logs
   execution_id : Union[str, int], optional
       The execution ID for the experiment. This ID will be used to create a subdirectory for the experiment in the log directory. This is useful when running the experiment multiple times with the same parameters.
       By default 0
   checkpoint_metrics : Optional[List[Dict[str, str]]], optional
       The checkpoint metrics. This is a list of dictionaries, one per checkpoint to keep. Each dictionary must contain the keys "monitor", "mode", and "filename". The "monitor" key is the name of the metric to monitor, the "mode" key is the mode of the metric ("min" or "max"), and the "filename" key is the name of the checkpoint file. The value of "monitor" can be None if the checkpoint is the last one. By default None
   max_epochs : int, optional
       Number of epochs to train the model. This parameter is passed to the `get_trainer` function. By default 100.
   accelerator : str, optional
       The accelerator to use for training. This parameter is passed to the `get_trainer` function. Possible values are "cpu", "gpu", "tpu", "ipu", "hpu", "mps", "auto". If "auto" is selected, the accelerator will be automatically selected based on the available hardware. By default "gpu"
   devices : Optional[Union[int, list[int], str]], optional
       Number of accelerators to use for training. This parameter is passed to the `get_trainer` function. By default 1.
   strategy : str, optional
       Strategy to use for distributed training. This parameter is passed to the `get_trainer` function. By default "auto".
   num_nodes : int, optional
       Number of nodes to use for distributed training. This parameter is passed to the `get_trainer` function. By default 1.
   limit_train_batches : Optional[Union[int, float]], optional
       Limit the number of training batches to use. This parameter is passed to the `get_trainer` function. By default None. If None, all batches will be used. If an integer is provided, it will be the absolute number of batches. If a float is provided, it will be the fraction of the total number of batches. For example, 0.1 means 10% of the training batches will be used.
   limit_val_batches : Optional[Union[int, float]], optional
       Limit the number of validation batches to use.
       This parameter is passed to the `get_trainer` function. By default None. If None, all batches will be used. If an integer is provided, it will be the absolute number of batches. If a float is provided, it will be the fraction of the total number of batches. For example, 0.1 means 10% of the validation batches will be used.
   limit_test_batches : Optional[Union[int, float]], optional
       Limit the number of test batches to use. This parameter is passed to the `get_trainer` function. By default None. If None, all batches will be used. If an integer is provided, it will be the absolute number of batches. If a float is provided, it will be the fraction of the total number of batches. For example, 0.1 means 10% of the test batches will be used.
   limit_predict_batches : Optional[Union[int, float]], optional
       Limit the number of prediction batches to use. This parameter is passed to the `get_trainer` function. By default None. If None, all batches will be used. If an integer is provided, it will be the absolute number of batches. If a float is provided, it will be the fraction of the total number of batches. For example, 0.1 means 10% of the prediction batches will be used.
   evaluation_metrics : Optional[Dict[str, torchmetrics.Metric]], optional
       A dictionary of evaluation metrics to use for the predictions. The keys are the names of the metrics and the values are the `torchmetrics.Metric` objects. These metrics are calculated using all the predictions. By default None.
   per_sample_evaluation_metrics : Optional[Dict[str, torchmetrics.Metric]], optional
       A dictionary of evaluation metrics to use for the predictions. The keys are the names of the metrics and the values are the `torchmetrics.Metric` objects. These metrics are calculated using each prediction separately, that is, applied per sample. By default None.
   seed : Optional[int], optional
       The seed to use for the experiment, by default None
   progress_bar_refresh_rate : int, optional
       The refresh rate of the progress bar (in batches). If 0, the progress bar is disabled. If 1, the progress bar is updated every batch. By default 1
   profiler : Optional[str], optional
       A profiler to use for the experiment. This parameter is passed to the `get_trainer` function. By default None.
   save_predictions : bool, optional
       If True, the predictions will be saved to the log directory. By default True
   save_results : bool, optional
       If True, the results will be saved to the log directory. By default True
   add_last_checkpoint : bool, optional
       If True, the last checkpoint will be added to the list of checkpoint metrics. By default True.

   Raises
   ------
   ValueError
       If the checkpoint metrics are not valid or do not contain the required keys.

   Notes
   -----
   - This class assumes that the `MinervaDataModule` class returns an (input, label) tuple for each sample in the dataset. The input is the data and the label is the ground truth/target.


   .. py:attribute:: NUM_DEBUG_BATCHES
      :value: 10


   .. py:attribute:: NUM_DEBUG_EPOCHS
      :value: 3


   .. py:method:: __str__()


   .. py:method:: __typing_string(value)
      :staticmethod:


   .. py:attribute:: _checkpoint_dir


   .. py:method:: _evaluate_model(ckpts_to_evaluate = None, print_summary = True, debug = False)


   .. py:attribute:: _predictions_dir


   .. py:method:: _print_evaluation_summary(trainer_params, debug = False, ckpt_path = None, predictions_path = None, results_path = None)


   .. py:method:: _print_train_summary(model, trainer_params, debug = False, resume_from_ckpt = None)


   .. py:attribute:: _results_dir


   .. py:method:: _run(task, debug = False, resume_from_ckpt = None, print_summary = True, ckpts_to_evaluate = None)

      Default pipeline method to be implemented in derived classes. This implements the pipeline logic.

      Returns
      -------
      Any
          The result of the pipeline run.


   .. py:method:: _train_model(resume_from_ckpt = None, debug = False, print_summary = True)


   .. py:method:: _trainer_parameters(enable_logging = True, debug = False)

      Return the parameters for the trainer based on the current debug and logging settings.

      Parameters
      ----------
      enable_logging : bool, optional
          If True, logging will be enabled, by default True
      debug : bool, optional
          If True, the model will be trained with a few batches and for a few epochs only, and logging will always be disabled. By default False

      Returns
      -------
      Dict[str, Any]
          All the parameters for the `get_trainer` function.


   .. py:attribute:: _training_metrics_path


   .. py:attribute:: accelerator
      :value: 'gpu'


   .. py:attribute:: checkpoint_metrics
      :value: []


   .. py:property:: checkpoint_paths
      :type: Dict[str, pathlib.Path]

      Returns a dictionary of checkpoint paths for the experiment. The keys are the checkpoint names, and the values are the corresponding paths to the checkpoints.

      Returns
      -------
      Dict[str, Path]
          A dictionary mapping checkpoint names to their respective paths.


   .. py:method:: cleanup()

      Clean up the experiment by removing the log directory.


   .. py:attribute:: data_module


   .. py:attribute:: devices
      :value: 1


   .. py:attribute:: evaluation_metrics


   .. py:attribute:: execution_id
      :value: ''


   .. py:attribute:: experiment_name


   .. py:attribute:: limit_predict_batches
      :value: None


   .. py:attribute:: limit_test_batches
      :value: None


   .. py:attribute:: limit_train_batches
      :value: None


   .. py:attribute:: limit_val_batches
      :value: None


   .. py:method:: load_predictions_of_ckpt(name)

      Load predictions from a file.

      Parameters
      ----------
      name : str
          The name of the prediction file (without extension).

      Returns
      -------
      np.ndarray
          The loaded predictions as a numpy array.


   .. py:method:: load_results_of_ckpt(name)

      Load results from a file.

      Parameters
      ----------
      name : str
          The name of the result file (without extension).

      Returns
      -------
      pd.DataFrame
          The loaded results as a pandas DataFrame.


   .. py:attribute:: max_epochs
      :value: 100


   .. py:attribute:: model_config


   .. py:attribute:: num_nodes
      :value: 1


   .. py:attribute:: per_sample_evaluation_metrics


   .. py:property:: prediction_paths
      :type: Dict[str, pathlib.Path]

      Returns a dictionary of prediction paths for the experiment. The keys are the prediction names, and the values are the corresponding paths to the predictions.

      Returns
      -------
      Dict[str, Path]
          A dictionary mapping prediction names to their respective paths.


   .. py:attribute:: pretrained_backbone_ckpt_path
      :value: None


   .. py:attribute:: profiler
      :value: None


   .. py:attribute:: progress_bar_refresh_rate
      :value: 1


   .. py:property:: results_paths
      :type: Dict[str, pathlib.Path]

      Returns a dictionary of results paths for the experiment. The keys are the result names, and the values are the corresponding paths to the results.

      Returns
      -------
      Dict[str, Path]
          A dictionary mapping result names to their respective paths.


   .. py:attribute:: root_log_dir


   .. py:attribute:: save_predictions
      :value: True


   .. py:attribute:: save_results
      :value: True


   .. py:attribute:: seed
      :value: None


   .. py:property:: status
      :type: Dict[str, Any]


   .. py:attribute:: strategy
      :value: 'auto'


   .. py:property:: training_metrics
      :type: Optional[pandas.DataFrame]

      Returns the training metrics as a pandas DataFrame. If the metrics file does not exist, returns None.

      Returns
      -------
      Optional[pd.DataFrame]
          A DataFrame containing the training metrics.


   .. py:property:: training_metrics_path
      :type: Optional[pathlib.Path]

      The path to the training metrics file.

      Returns
      -------
      Optional[Path]
          The path to the metrics file if it exists, otherwise None.


.. py:class:: ModelConfig(instantiator, information)

   Encapsulates the full configuration of a model for use in a training or inference pipeline.
   A `ModelConfig` brings together two key components:

   - `ModelInstantiator`: Responsible for creating the model in different modes (lazily instantiated):

     - From scratch (randomly initialized)
     - Finetuning (load pretrained backbone, new head)
     - From checkpoint (fully restored model)

   - `ModelInformation`: Contains descriptive metadata about the model such as input/output shapes, number of classes, backbone used, and task type.

   This class serves as the primary interface for managing and accessing model configuration throughout the lifecycle of training, evaluation, or deployment.

   Initialize a model configuration.

   Parameters
   ----------
   instantiator : ModelInstantiator
       An instance responsible for constructing the model in various training modes (random init, load backbone, load full checkpoint). This enables lazy instantiation depending on the training phase.
   information : ModelInformation
       Metadata describing the model's architecture and behavior. Includes input/output shapes, task type, number of classes, and other relevant info useful for logging, validation, and downstream processing.


   .. py:method:: __str__()


   .. py:attribute:: information


   .. py:attribute:: instantiator


.. py:class:: ModelInformation

   Container for metadata related to a machine learning model configuration.

   This class stores essential information about a model's identity, architecture, data shapes, and output behavior. Such metadata is useful for tasks such as logging, reproducibility, automated evaluation, or dynamic behavior in pipelines.

   Attributes
   ----------
   name : str
       A unique identifier for the model configuration. Commonly used for logging, saving checkpoints, or experiment tracking.
   backbone_name : Optional[str], optional
       The name of the backbone architecture used in the model (e.g., "resnet50", "vit-base"). Useful for identifying model variants or tracking architectural differences.
   task_type : Optional[str], optional
       The task the model is designed for (e.g., "classification", "segmentation", "detection"). Enables downstream logic to adapt based on the task type.
   input_shape : Optional[Tuple[int, ...]], optional
       Expected shape of input tensors (excluding batch size), typically in the format (C, H, W) for image data. For example: (3, 224, 224) for an RGB image of size 224x224.
   output_shape : Optional[Tuple[int, ...]], optional
       Expected shape of model outputs (excluding batch size). Examples include:

       - (6, 224, 224) for semantic segmentation logits with 6 classes
       - (224, 224) for semantic segmentation predictions (argmax indices)
       - (6,) for classification logits (6 classes)
       - (1,) for classification predictions as class indices
   num_classes : Optional[int], optional
       Total number of classes the model is predicting. Primarily relevant for classification or segmentation tasks.
   return_logits : Optional[bool], optional
       If True, the model returns raw logits. If False, it returns post-processed class predictions (e.g., argmax indices or probabilities).


   .. py:attribute:: backbone_name
      :type: Optional[str]
      :value: None


   .. py:attribute:: input_shape
      :type: Optional[Tuple[int, Ellipsis]]
      :value: None


   .. py:attribute:: name
      :type: str


   .. py:attribute:: num_classes
      :type: Optional[int]
      :value: None


   .. py:attribute:: output_shape
      :type: Optional[Tuple[int, Ellipsis]]
      :value: None


   .. py:attribute:: return_logits
      :type: Optional[bool]
      :value: None


   .. py:attribute:: task_type
      :type: Optional[str]
      :value: None


.. py:class:: ModelInstantiator

   Bases: :py:obj:`abc.ABC`

   Abstract base class for lazy instantiation of PyTorch Lightning models.

   This interface defines a standardized way to construct models in three common training scenarios:

   1. Training from scratch: the entire model (backbone + head) is randomly initialized.
   2. Finetuning: a pretrained backbone is loaded from a checkpoint, while the head is randomly initialized.
   3. Inference/Evaluation: the full model is restored from a previously saved checkpoint. Usually, this checkpoint is generated using one of the two scenarios above.

   This abstraction allows for flexible and decoupled model construction across various stages of the machine learning lifecycle. Thus, it is expected that the model's architecture follows the same pattern as the one below::

       +-------------------------------+
       |    Model (LightningModule)    |
       |                               |
       |     +-----------------+       |
       |     |    Backbone     |       |  --> Feature extractor
       |     +-----------------+       |
       |              |                |
       |              v                |
       |         +----------+          |
       |         |   Head   |          |  --> Task-specific layers
       |         +----------+          |
       +-------------------------------+

   Definitions
   -----------
   - Backbone: Core feature extractor (e.g., ResNet, Transformer encoder).
   - Head: Task-specific layers (e.g., classification head, regression head).

   Implementations of this class should handle the appropriate model loading logic for each use case described above.


   .. py:method:: create_model_and_load_backbone(backbone_checkpoint_path)
      :abstractmethod:

      Create a model for finetuning with a pretrained backbone and a new head (randomly initialized).

      This method should load the backbone weights from the specified checkpoint and attach a freshly initialized head for the downstream task. The user must handle the logic to load the backbone weights into the model's state dict.

      Parameters
      ----------
      backbone_checkpoint_path : PathLike
          Path to the checkpoint containing pretrained backbone weights. The checkpoint must be compatible with the model architecture.

      Returns
      -------
      L.LightningModule
          The model ready for finetuning (pretrained backbone, new head).


   .. py:method:: create_model_randomly_initialized()
      :abstractmethod:

      Create a model with both backbone and head randomly initialized.

      Typically used when training a model from scratch.

      Returns
      -------
      L.LightningModule
          A Lightning model fully initialized with random weights, ready for training.


   .. py:method:: load_model_from_checkpoint(checkpoint_path)
      :abstractmethod:

      Load the full model (backbone and head) from a saved checkpoint.

      Typically used for resuming training, evaluation, or inference when the model must be restored in its entirety. In practice, the checkpoint should come from a model created with `create_model_and_load_backbone` or `create_model_randomly_initialized`. The checkpoint must be compatible with the model architecture.

      Parameters
      ----------
      checkpoint_path : PathLike
          Path to the checkpoint file containing the full model state.

      Returns
      -------
      L.LightningModule
          A Lightning model fully restored from checkpoint, ready for evaluation or inference.


.. py:function:: get_trainer(log_dir, max_epochs = 100, limit_train_batches = None, limit_val_batches = None, limit_test_batches = None, limit_predict_batches = None, accelerator = 'auto', strategy = 'auto', devices = 'auto', num_nodes = 1, progress_bar_refresh_rate = 1, enable_logging = True, checkpoint_metrics = None, precision = '32-true', accumulate_grad_batches = 1, deterministic = False, benchmark = True, profiler = None, overfit_batches = 0.0, sync_batchnorm = False)

   Creates and configures a PyTorch Lightning Trainer instance.

   This function encapsulates all necessary options for flexible training, evaluation, or inference, including logging, checkpointing, device setup, precision, and more.

   Parameters
   ----------
   log_dir : Path
       Directory path where logs and checkpoints will be saved.
   max_epochs : int, default=100
       Maximum number of epochs for training.
   limit_train_batches : int or float, optional
       Limit on the number of training batches per epoch. Can be an integer (absolute number) or a float (fraction of total batches).
   limit_val_batches : int or float, optional
       Limit on the number of validation batches per epoch.
   limit_test_batches : int or float, optional
       Limit on the number of test batches per epoch.
   limit_predict_batches : int or float, optional
       Limit on the number of prediction batches.
   accelerator : str, default="auto"
       Hardware accelerator to use (e.g., "gpu", "cpu", "tpu", "auto").
   strategy : str, default="auto"
       Distributed training strategy (e.g., "ddp", "deepspeed", etc.).
   devices : int, list of int, or str, optional, default="auto"
       Devices to use for training (e.g., 1, [0,1], "auto").
   num_nodes : int, default=1
       Number of nodes to use for distributed training.
   progress_bar_refresh_rate : int, default=1
       Frequency (in steps) at which the progress bar is updated. Set to 0 to disable.
   enable_logging : bool, default=True
       Whether to enable CSV logging.
   checkpoint_metrics : list of dict, optional
       List of dictionaries containing checkpoint configurations. Each dictionary should specify "monitor", "mode", and "filename".
   precision : str, default="32-true"
       Numerical precision to use during training (e.g., 32-true, 16-mixed).
   accumulate_grad_batches : int, default=1
       Number of batches for which gradients should be accumulated before performing an optimizer step.
   deterministic : bool, default=False
       If True, sets deterministic behavior for reproducibility.
   benchmark : bool, default=True
       Enables the cudnn.benchmark flag for optimized performance on fixed input sizes.
   profiler : str, optional
       Enables performance profiling (e.g., "simple", "advanced").
   overfit_batches : int or float, default=0.0
       Uses a fraction or number of batches for both training and validation to quickly debug overfitting behavior.
   sync_batchnorm : bool, default=False
       Synchronizes batch norm layers across devices during distributed training.

   Returns
   -------
   L.Trainer
       A configured PyTorch Lightning Trainer instance.


.. py:function:: load_predictions(path)

   Load a prediction from a given path.

   Parameters
   ----------
   path : PathLike
       The path to the prediction file.

   Returns
   -------
   np.ndarray
       The loaded prediction data.


.. py:function:: load_results(path)

   Load results from a given path.

   Parameters
   ----------
   path : PathLike
       The path to the results file.
   Returns
   -------
   pd.DataFrame
       The loaded results data.


.. py:function:: perform_evaluation(evaluation_metrics, data_module, predictions, argmax_axis = None, per_sample = False, batch_size = 1, device = 'cpu')

   Evaluates predictions using provided evaluation metrics and a data module.

   This function compares predicted values against ground truth labels from a prediction dataset. It supports both aggregate evaluation over the entire dataset and per-sample evaluation. Metrics should be compatible with `torchmetrics`.

   Parameters
   ----------
   evaluation_metrics : dict of str to torchmetrics.Metric
       A dictionary mapping metric names to `torchmetrics.Metric` instances.
   data_module : MinervaDataModule
       A data module that contains the `predict_dataset` used for evaluation.
   predictions : np.ndarray
       An array of predictions generated by the model.
   argmax_axis : int, optional
       If provided, applies `torch.argmax` along this axis to the predictions before metric evaluation.
   per_sample : bool, default=False
       If True, computes metrics individually for each sample. Otherwise, evaluates metrics over the entire dataset in batches.
   batch_size : int, default=1
       Batch size used for evaluation when `per_sample` is False.
   device : str, default="cpu"
       The device (e.g., "cpu", "cuda") on which metric computations will run.

   Returns
   -------
   pd.DataFrame
       A DataFrame containing computed metric values. If `per_sample` is True, each row corresponds to one sample. Otherwise, a single-row summary is returned.


.. py:function:: perform_predict(data_module, model, trainer, squeeze = False)

   Perform predictions using the provided data module and trainer.

   Parameters
   ----------
   data_module : MinervaDataModule
       The data module containing the dataset for predictions.
   model : L.LightningModule
       The model to be used for predictions.
   trainer : L.Trainer
       The trainer instance to use for predictions.
   squeeze : bool, optional
       If True, squeeze the predictions to remove single-dimensional entries from the shape of the predictions (except for the first dimension). By default False

   Returns
   -------
   np.ndarray
       The predictions as a numpy array.


.. py:function:: perform_train(data_module, model, trainer, resume_from_ckpt = None)

   Train the model using the provided data module and trainer.

   Parameters
   ----------
   data_module : MinervaDataModule
       The data module containing the training and validation datasets.
   model : L.LightningModule
       The model to be trained.
   trainer : L.Trainer
       The trainer instance to use for training.
   resume_from_ckpt : Optional[PathLike], optional
       A path to a checkpoint from which to resume training. If None, training starts from scratch. By default None

   Returns
   -------
   L.LightningModule
       The trained model.


.. py:function:: save_predictions(predictions, path)

   Save predictions to a given path.

   Parameters
   ----------
   predictions : Union[np.ndarray, torch.Tensor]
       The prediction data to save.
   path : PathLike
       The path where the predictions will be saved.


.. py:function:: save_results(results, path, index = False)

   Save results to a given path.

   Parameters
   ----------
   results : pd.DataFrame
       The results data to save.
   path : PathLike
       The path where the results will be saved.
   index : bool, optional
       Whether to save the index of the DataFrame, by default False
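

The `checkpoint_metrics` argument of `Experiment` and `get_trainer` is described above as a list of dictionaries with the keys "monitor", "mode", and "filename", and `Experiment` is documented to raise `ValueError` when an entry is invalid. The following is a minimal sketch of what such a list could look like and how it might be checked before being passed in. The helper `validate_checkpoint_metrics`, the metric name "val_loss", and the filenames are assumptions made for illustration; they are not part of minerva, and the library's own validation may differ.

```python
from typing import Dict, List, Optional

# Keys that the documentation above says every entry must contain.
REQUIRED_KEYS = ("monitor", "mode", "filename")
# Modes named in the documentation above.
VALID_MODES = ("min", "max")


def validate_checkpoint_metrics(
    checkpoint_metrics: Optional[List[Dict[str, Optional[str]]]],
) -> List[Dict[str, Optional[str]]]:
    """Hypothetical helper (not a minerva function): check checkpoint entries."""
    validated = []
    # Treat None like an empty list, mirroring the documented default of None.
    for entry in checkpoint_metrics or []:
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"Checkpoint entry {entry!r} is missing keys: {missing}")
        if entry["mode"] not in VALID_MODES:
            raise ValueError(f"Invalid mode {entry['mode']!r}; expected 'min' or 'max'")
        validated.append(dict(entry))  # copy so the caller's dict is not shared
    return validated


# Example configuration; "val_loss" is an assumed metric name.
checkpoint_metrics = [
    {"monitor": "val_loss", "mode": "min", "filename": "best"},
    # "monitor" may be None for the last checkpoint, per the parameter docs.
    {"monitor": None, "mode": "min", "filename": "last"},
]
validated = validate_checkpoint_metrics(checkpoint_metrics)
```

Checking the list up front like this surfaces a missing key before any trainer is constructed, which is usually cheaper than discovering it after a long run has started.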