minerva.models.nets.image.deeplabv3
Classes
DeepLabV3 | A DeepLabV3 with a ResNet50 backbone.
DeepLabV3Backbone | A ResNet50 backbone for DeepLabV3.
DeepLabV3PredictionHead | The prediction head for DeepLabV3.
DeepLabV3RegressionHead | Regression head for DeepLabV3 (continuous per-pixel/voxel prediction).
Module Contents
- class minerva.models.nets.image.deeplabv3.DeepLabV3(backbone=None, pred_head=None, loss_fn=None, learning_rate=0.001, num_classes=6, pretrained=False, weights_path=None, train_metrics=None, val_metrics=None, test_metrics=None, optimizer=torch.optim.Adam, optimizer_kwargs=None, lr_scheduler=None, lr_scheduler_kwargs=None, output_shape=None, freeze_backbone=False, interpolate_mode='bilinear', flatten=False, loss_squeeze=True, loss_long=True)
Bases: minerva.models.nets.base.SimpleSupervisedModel
A DeepLabV3 with a ResNet50 backbone.
References
Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam. “Rethinking Atrous Convolution for Semantic Image Segmentation”, 2017.
Initializes a DeepLabV3 model.
Parameters
- backbone: Optional[nn.Module]
The backbone network. Defaults to None, which will use a ResNet50 backbone.
- pred_head: Optional[nn.Module]
The prediction head network. Defaults to None, which will use a DeepLabV3PredictionHead with specified number of classes.
- loss_fn: Optional[nn.Module]
The loss function. Defaults to None, which will use a CrossEntropyLoss.
- learning_rate: float
The learning rate for the optimizer. Defaults to 0.001.
- num_classes: int
The number of classes for prediction. Defaults to 6.
- pretrained: bool
Whether to use pretrained weights. Defaults to False.
- weights_path: Optional[str]
Path to local pretrained weights file. If provided with pretrained=True, loads weights from this path instead of downloading. Defaults to None.
- train_metrics: Optional[Dict[str, Metric]]
The metrics to be computed during training. Defaults to None.
- val_metrics: Optional[Dict[str, Metric]]
The metrics to be computed during validation. Defaults to None.
- test_metrics: Optional[Dict[str, Metric]]
The metrics to be computed during testing. Defaults to None.
- optimizer: type
Optimizer class to be instantiated. By default, it is set to torch.optim.Adam. Should be a subclass of torch.optim.Optimizer (e.g., torch.optim.SGD).
- optimizer_kwargs: Optional[Dict[str, Any]]
Additional keyword arguments passed to the optimizer constructor. Defaults to None.
- lr_scheduler: Optional[type]
Learning rate scheduler class to be instantiated. Defaults to None, which means no scheduler is used. Should be a subclass of torch.optim.lr_scheduler.LRScheduler (e.g., torch.optim.lr_scheduler.StepLR).
- lr_scheduler_kwargs: Optional[Dict[str, Any]]
Additional keyword arguments passed to the scheduler constructor. Defaults to None.
- output_shape: Optional[Tuple[int, …]]
The output shape of the model. Defaults to None, in which case the output shape is the same as the input shape. Set this for models whose output shape must differ from the input shape.
- freeze_backbone: bool
Whether to freeze the backbone weights during training. Defaults to False.
- interpolate_mode: Optional[str]
The interpolation mode to use when upscaling the output to the desired output shape. Defaults to “bilinear”. Other options include “nearest”, “bicubic”, etc. See PyTorch documentation for torch.nn.functional.interpolate for all options. Use None to disable upscaling.
- flatten: bool
Whether to flatten the output of the backbone before passing it to the prediction head. Defaults to False. Set to True for classification tasks where the prediction head is a fully connected layer.
- loss_squeeze: bool
Whether to squeeze the target tensor in the loss function. Defaults to True. This is useful for segmentation tasks where the target tensor has a singleton channel dimension (e.g., shape (B, 1, H, W)) and the loss function expects shape (B, H, W).
- loss_long: bool
Whether to convert the target tensor to long type in the loss function. Defaults to True. This is useful when the loss function expects integer class indices as targets (e.g., CrossEntropyLoss with class-index targets requires dtype long).
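The optimizer and lr_scheduler arguments take classes rather than instances, with their keyword arguments supplied separately. The following is a minimal sketch of how such a class-plus-kwargs pair could be turned into objects using plain PyTorch. The wiring shown here is an assumption for illustration, not Minerva's confirmed implementation, and the nn.Linear model is a hypothetical stand-in for the real network.

```python
import torch
from torch import nn

# Hypothetical stand-in for the network; any nn.Module with parameters works.
model = nn.Linear(4, 2)

# Values mirroring the constructor arguments documented above.
learning_rate = 0.001
optimizer_cls = torch.optim.SGD
optimizer_kwargs = {"momentum": 0.9}
lr_scheduler_cls = torch.optim.lr_scheduler.StepLR
lr_scheduler_kwargs = {"step_size": 10, "gamma": 0.5}

# Assumed wiring: instantiate the optimizer class with the learning rate
# plus any extra keyword arguments, then wrap it in the scheduler class.
optimizer = optimizer_cls(
    model.parameters(), lr=learning_rate, **(optimizer_kwargs or {})
)
scheduler = None
if lr_scheduler_cls is not None:
    scheduler = lr_scheduler_cls(optimizer, **(lr_scheduler_kwargs or {}))
```

Passing classes instead of instances lets the model create the optimizer only once its parameters exist, which is why the keyword arguments travel separately.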
- _loss_func(y_hat, y)
Computes the loss between predictions and ground truth.
Parameters
- y_hat: Tensor
Predicted tensor of shape (batch_size, num_classes, height, width)
- y: Tensor
Ground truth tensor of shape (batch_size, 1, height, width)
- Parameters:
y_hat (torch.Tensor)
y (torch.Tensor)
- Return type:
torch.Tensor
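To make the shapes above concrete, here is a hedged sketch of the target preprocessing that loss_squeeze and loss_long describe, written in plain PyTorch. It assumes the singleton channel is removed with squeeze(1) and that the loss is CrossEntropyLoss (the documented default); the actual method body in Minerva may differ.

```python
import torch
from torch import nn

batch_size, num_classes, height, width = 2, 6, 8, 8

# Predictions: one score per class and pixel.
y_hat = torch.randn(batch_size, num_classes, height, width)
# Ground truth with a singleton channel dimension, stored as float
# to show why the dtype conversion matters.
y = torch.randint(0, num_classes, (batch_size, 1, height, width)).float()

loss_fn = nn.CrossEntropyLoss()

# loss_squeeze=True (assumed behavior): (B, 1, H, W) -> (B, H, W)
target = y.squeeze(1)
# loss_long=True (assumed behavior): class indices must be int64
target = target.long()

loss = loss_fn(y_hat, target)
```

Without the squeeze and the cast, CrossEntropyLoss would not interpret a float tensor of shape (B, 1, H, W) as per-pixel class indices for these predictions.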
- forward(x)
Performs the forward pass of the DeepLabV3 model.
Parameters
- x: Tensor
Input tensor of shape (batch_size, channels, height, width)
Returns
- Tensor
Output tensor of shape (batch_size, num_classes, height, width)
- Parameters:
x (torch.Tensor)
- Return type:
torch.Tensor
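The upscaling step controlled by output_shape and interpolate_mode can be illustrated with torch.nn.functional.interpolate. This is a sketch under two assumptions: that the head produces logits at a reduced spatial resolution, and that output_shape holds only the spatial size (height, width). Neither is confirmed by the reference above.

```python
import torch
import torch.nn.functional as F

# Hypothetical low-resolution logits from a prediction head:
# (batch_size, num_classes, height, width)
logits = torch.randn(2, 6, 16, 16)

output_shape = (64, 64)        # assumed to be the spatial size only
interpolate_mode = "bilinear"  # the documented default

# Resize the class scores to the requested spatial size.
# align_corners is only valid for modes such as "bilinear" and "bicubic";
# it must be omitted for "nearest".
upscaled = F.interpolate(
    logits, size=output_shape, mode=interpolate_mode, align_corners=False
)
```

The batch and class dimensions are left untouched; only the last two dimensions change.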
- interpolate_mode = 'bilinear'
- loss_long = True
- output_shape = None
- squeeze_loss = True
- Parameters:
backbone (Optional[torch.nn.Module])
pred_head (Optional[torch.nn.Module])
loss_fn (Optional[torch.nn.Module])
learning_rate (float)
num_classes (int)
pretrained (bool)
weights_path (Optional[str])
train_metrics (Optional[Dict[str, torchmetrics.Metric]])
val_metrics (Optional[Dict[str, torchmetrics.Metric]])
test_metrics (Optional[Dict[str, torchmetrics.Metric]])
optimizer (type)
optimizer_kwargs (Optional[Dict[str, Any]])
lr_scheduler (Optional[type])
lr_scheduler_kwargs (Optional[Dict[str, Any]])
output_shape (Optional[Tuple[int, Ellipsis]])
freeze_backbone (bool)
interpolate_mode (Optional[str])
flatten (bool)
loss_squeeze (bool)
loss_long (bool)
- class minerva.models.nets.image.deeplabv3.DeepLabV3Backbone(num_classes=6, pretrained=False, weights_path=None)
Bases: torch.nn.Module
A ResNet50 backbone for DeepLabV3.
Initializes the DeepLabV3 backbone model.
Parameters
- num_classes: int
The number of classes for classification. Default is 6.
- pretrained: bool
Whether to use pretrained weights. If True and weights_path is None, will attempt to download ImageNet pretrained weights. Default is False.
- weights_path: Optional[str]
Path to local pretrained weights file. If provided with pretrained=True, loads weights from this path instead of downloading. Default is None.
- RN50model
- Parameters:
num_classes (int)
pretrained (bool)
weights_path (Optional[str])
- class minerva.models.nets.image.deeplabv3.DeepLabV3PredictionHead(in_channels=2048, num_classes=6, atrous_rates=(12, 24, 36))
Bases: torch.nn.Sequential
The prediction head for DeepLabV3.
Initializes the DeepLabV3 prediction head.
Parameters
- in_channels: int
Number of input channels. Defaults to 2048.
- num_classes: int
Number of output classes. Defaults to 6.
- atrous_rates: Sequence[int]
A sequence of atrous rates for the ASPP module. Defaults to (12, 24, 36).
- Parameters:
in_channels (int)
num_classes (int)
atrous_rates (Sequence[int])
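The effect of the atrous_rates values can be checked with simple arithmetic. The sketch below applies the output-size formula from the PyTorch Conv2d documentation, assuming each ASPP branch uses a 3x3 kernel with padding equal to its dilation rate (as torchvision's ASPP does). Whether this head uses exactly that layout is an assumption, and feature_size is a hypothetical value.

```python
def conv_output_size(size, kernel_size, dilation, padding, stride=1):
    """Output length along one spatial axis of a convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


feature_size = 32            # hypothetical backbone feature-map height/width
atrous_rates = (12, 24, 36)  # the documented default rates
kernel_size = 3              # assumed 3x3 kernels in each ASPP branch

# With padding == dilation, every branch preserves the spatial size,
# so the branch outputs can be concatenated along the channel axis.
sizes = [
    conv_output_size(feature_size, kernel_size, rate, padding=rate)
    for rate in atrous_rates
]

# Span of input pixels covered by one dilated kernel along an axis.
extents = [rate * (kernel_size - 1) + 1 for rate in atrous_rates]

print(sizes)    # [32, 32, 32]
print(extents)  # [25, 49, 73]
```

Larger rates widen the receptive field without adding parameters, which is the reason ASPP runs several rates in parallel.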
- class minerva.models.nets.image.deeplabv3.DeepLabV3RegressionHead(in_channels=2048, out_channels=1, atrous_rates=(12, 24, 36))
Bases: torch.nn.Sequential
Regression head for DeepLabV3 (continuous per-pixel/voxel prediction).
Parameters
- in_channels: int
Number of input channels from the backbone (typically 2048 for ResNet50). Defaults to 2048.
- out_channels: int
Number of output channels (1 for a single regression target). Defaults to 1.
- atrous_rates: Sequence[int]
Atrous (dilation) rates for the ASPP module. Defaults to (12, 24, 36).
- Parameters:
in_channels (int)
out_channels (int)
atrous_rates (Sequence[int])