minerva.models.nets.setr
Classes
- SETR_PUP: Initializes the SetR model.
- _SETRMLAHead: Multi-level feature aggregation head of SETR.
- _SETRUPHead: Naive upsampling head and progressive upsampling head of SETR.
- _SetR_PUP: Initializes the SETR PUP model.
Module Contents
- class minerva.models.nets.setr.SETR_PUP(image_size=512, patch_size=16, num_layers=24, num_heads=16, hidden_dim=1024, mlp_dim=4096, encoder_dropout=0.1, num_classes=1000, norm_layer=None, decoder_channels=256, num_convs=4, up_scale=2, kernel_size=3, align_corners=False, decoder_dropout=0.1, conv_norm=None, conv_act=None, interpolate_mode='bilinear', loss_fn=None, train_metrics=None, val_metrics=None, test_metrics=None, aux_output=True, aux_output_layers=[9, 14, 19], aux_weights=[0.3, 0.3, 0.3])
Bases:
lightning.LightningModule
Initializes the SETR PUP model.
Parameters
- image_size : int or tuple[int, int]
The input image size. Defaults to 512.
- patch_size : int
The size of each patch. Defaults to 16.
- num_layers : int
The number of layers in the transformer encoder. Defaults to 24.
- num_heads : int
The number of attention heads in the transformer encoder. Defaults to 16.
- hidden_dim : int
The hidden dimension of the transformer encoder. Defaults to 1024.
- mlp_dim : int
The dimension of the MLP layers in the transformer encoder. Defaults to 4096.
- encoder_dropout : float
The dropout rate for the transformer encoder. Defaults to 0.1.
- num_classes : int
The number of output classes. Defaults to 1000.
- norm_layer : nn.Module, optional
The normalization layer to be used in the decoder. Defaults to None.
- decoder_channels : int
The number of channels in the decoder. Defaults to 256.
- num_convs : int
The number of convolutional layers in the decoder. Defaults to 4.
- up_scale : int
The scale factor for upsampling in the decoder. Defaults to 2.
- kernel_size : int
The kernel size for convolutional layers in the decoder. Defaults to 3.
- align_corners : bool
Whether to align corners during interpolation in the decoder. Defaults to False.
- decoder_dropout : float
The dropout rate for the decoder. Defaults to 0.1.
- conv_norm : nn.Module, optional
The normalization layer to be used in the convolutional layers of the decoder. Defaults to None.
- conv_act : nn.Module, optional
The activation function to be used in the convolutional layers of the decoder. Defaults to None.
- interpolate_mode : str
The interpolation mode for upsampling in the decoder. Defaults to "bilinear".
- loss_fn : nn.Module, optional
The loss function to be used during training. Defaults to None.
- train_metrics : Dict[str, Metric], optional
The metrics to be used for training evaluation. Defaults to None.
- val_metrics : Dict[str, Metric], optional
The metrics to be used for validation evaluation. Defaults to None.
- test_metrics : Dict[str, Metric], optional
The metrics to be used for testing evaluation. Defaults to None.
- aux_output : bool
Whether to include auxiliary output heads in the model. Defaults to True.
- aux_output_layers : list[int] | None
The indices of the layers to output auxiliary predictions. Defaults to [9, 14, 19].
- aux_weights : list[float]
The weights for the auxiliary predictions. Defaults to [0.3, 0.3, 0.3].
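Example (illustrative)
The default image_size and patch_size imply a specific token grid. The sketch here works through that arithmetic without calling minerva. It assumes the encoder splits the image into non-overlapping square patches, as a standard Vision Transformer does, and that image_size must be divisible by patch_size. The helper `patch_grid` is hypothetical and not part of the library.

```python
from typing import Tuple, Union


def patch_grid(
    image_size: Union[int, Tuple[int, int]], patch_size: int
) -> Tuple[int, int]:
    """Return (rows, cols) of the patch grid for a non-overlapping split.

    Hypothetical helper for illustration only; not part of minerva.
    """
    # An int image_size is treated as a square image (assumption).
    if isinstance(image_size, int):
        image_size = (image_size, image_size)
    height, width = image_size
    if height % patch_size or width % patch_size:
        raise ValueError("image_size must be divisible by patch_size")
    return height // patch_size, width // patch_size


# Documented defaults: image_size=512, patch_size=16.
rows, cols = patch_grid(512, 16)
num_tokens = rows * cols
print(rows, cols, num_tokens)  # 32 32 1024
```

Under these assumptions the encoder processes 1024 tokens of width hidden_dim, and the decoder has to recover a 16x spatial reduction.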
- _compute_metrics(y_hat, y, step_name)
- Parameters:
y_hat (torch.Tensor)
y (torch.Tensor)
step_name (str)
- _loss_func(y_hat, y)
Calculate the loss between the model output and the target.
Parameters
- y_hat : torch.Tensor
The output data from the forward pass.
- y : torch.Tensor
The target data/labels.
Returns
- torch.Tensor
The loss value.
- Parameters:
y_hat (torch.Tensor | Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor])
y (torch.Tensor)
- Return type:
torch.Tensor
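Example (illustrative)
The type annotation allows y_hat to be a tuple of four tensors, which suggests one main prediction plus three auxiliary ones. How the library combines them is not documented on this page. One plausible reading, given that aux_weights holds one weight per auxiliary prediction, is a weighted sum added to the main loss. The sketch here shows that scheme with plain floats standing in for per-head loss tensors; `combine_losses` is a hypothetical name, not a minerva function.

```python
from typing import Sequence


def combine_losses(
    main_loss: float, aux_losses: Sequence[float], aux_weights: Sequence[float]
) -> float:
    """Add weighted auxiliary losses to the main loss.

    Assumed scheme for illustration; the actual _loss_func may differ.
    """
    if len(aux_losses) != len(aux_weights):
        raise ValueError("need exactly one weight per auxiliary loss")
    weighted_aux = sum(w * loss for w, loss in zip(aux_weights, aux_losses))
    return main_loss + weighted_aux


# Documented default weights; the loss values are made up for the example.
total = combine_losses(1.0, [0.5, 0.5, 0.5], [0.3, 0.3, 0.3])
print(round(total, 2))  # 1.45
```

With real tensors the same expression works unchanged, since scalar loss tensors support multiplication by floats and addition.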
- _single_step(batch, batch_idx, step_name)
Perform a single step of the training/validation loop.
Parameters
- batch : torch.Tensor
The input data.
- batch_idx : int
The index of the batch.
- step_name : str
The name of the step, either "train" or "val".
Returns
- torch.Tensor
The loss value.
- Parameters:
batch (torch.Tensor)
batch_idx (int)
step_name (str)
- configure_optimizers()
- forward(x)
- Parameters:
x (torch.Tensor)
- Return type:
torch.Tensor
- predict_step(batch, batch_idx, dataloader_idx=None)
- Parameters:
batch (torch.Tensor)
batch_idx (int)
dataloader_idx (int | None)
- test_step(batch, batch_idx)
- Parameters:
batch (torch.Tensor)
batch_idx (int)
- training_step(batch, batch_idx)
- Parameters:
batch (torch.Tensor)
batch_idx (int)
- validation_step(batch, batch_idx)
- Parameters:
batch (torch.Tensor)
batch_idx (int)
- Parameters:
image_size (int | tuple[int, int])
patch_size (int)
num_layers (int)
num_heads (int)
hidden_dim (int)
mlp_dim (int)
encoder_dropout (float)
num_classes (int)
norm_layer (Optional[torch.nn.Module])
decoder_channels (int)
num_convs (int)
up_scale (int)
kernel_size (int)
align_corners (bool)
decoder_dropout (float)
conv_norm (Optional[torch.nn.Module])
conv_act (Optional[torch.nn.Module])
interpolate_mode (str)
loss_fn (Optional[torch.nn.Module])
train_metrics (Optional[Dict[str, torchmetrics.Metric]])
val_metrics (Optional[Dict[str, torchmetrics.Metric]])
test_metrics (Optional[Dict[str, torchmetrics.Metric]])
aux_output (bool)
aux_output_layers (list[int] | None)
aux_weights (list[float])
- class minerva.models.nets.setr._SETRMLAHead(channels, conv_norm, conv_act, in_channels, out_channels, num_classes, mla_channels=128, up_scale=4, kernel_size=3, align_corners=True, dropout=0.1, threshold=None)
Bases:
torch.nn.Module
Multi-level feature aggregation (MLA) head of SETR.
- Parameters:
channels (int)
conv_norm (Optional[torch.nn.Module])
conv_act (Optional[torch.nn.Module])
in_channels (list[int])
out_channels (int)
num_classes (int)
mla_channels (int)
up_scale (int)
kernel_size (int)
align_corners (bool)
dropout (float)
threshold (Optional[float])
- forward(x)
- class minerva.models.nets.setr._SETRUPHead(channels, in_channels, num_classes, norm_layer, conv_norm, conv_act, num_convs, up_scale, kernel_size, align_corners, dropout, interpolate_mode)
Bases:
torch.nn.Module
Naive upsampling head and progressive upsampling (PUP) head of SETR.
Initializes the SETR upsampling head.
Parameters
- channels : int
Number of output channels.
- in_channels : int
Number of input channels.
- num_classes : int
Number of output classes.
- norm_layer : nn.Module
Normalization layer.
- conv_norm : nn.Module
Convolutional normalization layer.
- conv_act : nn.Module
Convolutional activation layer.
- num_convs : int
Number of convolutional layers.
- up_scale : int
Upsampling scale factor.
- kernel_size : int
Kernel size for convolutional layers.
- align_corners : bool
Whether to align corners during upsampling.
- dropout : float
Dropout rate.
- interpolate_mode : str
Interpolation mode for upsampling.
Raises
- AssertionError
If kernel_size is not 1 or 3.
- forward(x)
- Parameters:
channels (int)
in_channels (int)
num_classes (int)
norm_layer (torch.nn.Module)
conv_norm (torch.nn.Module)
conv_act (torch.nn.Module)
num_convs (int)
up_scale (int)
kernel_size (int)
align_corners (bool)
dropout (float)
interpolate_mode (str)
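Example (illustrative)
The interaction between num_convs and up_scale is easiest to see numerically. The sketch here assumes that each of the num_convs stages is followed by one upsampling step with factor up_scale, which is how progressive upsampling is usually described for SETR; this page does not confirm the exact layer order. It also assumes "same" padding of kernel_size // 2, which is consistent with kernel_size being restricted to 1 or 3. Both helpers are hypothetical and do not call minerva.

```python
from typing import List


def pup_sizes(grid_size: int, num_convs: int, up_scale: int) -> List[int]:
    """Spatial size after each assumed conv + upsample stage."""
    sizes = [grid_size]
    for _ in range(num_convs):
        sizes.append(sizes[-1] * up_scale)
    return sizes


def same_padding(kernel_size: int) -> int:
    """Padding that preserves spatial size for an odd kernel (assumption)."""
    # Mirrors the documented restriction: kernel_size must be 1 or 3.
    assert kernel_size in (1, 3), "kernel_size must be 1 or 3"
    return kernel_size // 2


# SETR_PUP defaults on this page: a 512 / 16 = 32 token grid,
# num_convs=4, up_scale=2, kernel_size=3.
print(pup_sizes(32, 4, 2))  # [32, 64, 128, 256, 512]
print(same_padding(3))      # 1
```

Under these assumptions four stages of 2x upsampling give a total factor of 16, which exactly undoes a patch_size of 16 and returns the 512-pixel input resolution.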
- class minerva.models.nets.setr._SetR_PUP(image_size, patch_size, num_layers, num_heads, hidden_dim, mlp_dim, num_convs, num_classes, decoder_channels, up_scale, encoder_dropout, kernel_size, decoder_dropout, norm_layer, interpolate_mode, conv_norm, conv_act, align_corners, aux_output=False, aux_output_layers=None)
Bases:
torch.nn.Module
Initializes the SETR PUP model.
Parameters
- image_size : int or tuple[int, int]
The size of the input image.
- patch_size : int
The size of each patch in the input image.
- num_layers : int
The number of layers in the transformer encoder.
- num_heads : int
The number of attention heads in the transformer encoder.
- hidden_dim : int
The hidden dimension of the transformer encoder.
- mlp_dim : int
The dimension of the feed-forward network in the transformer encoder.
- num_convs : int
The number of convolutional layers in the decoder.
- num_classes : int
The number of output classes.
- decoder_channels : int
The number of channels in the decoder.
- up_scale : int
The scale factor for upsampling in the decoder.
- encoder_dropout : float
The dropout rate for the transformer encoder.
- kernel_size : int
The kernel size for the convolutional layers in the decoder.
- decoder_dropout : float
The dropout rate for the decoder.
- norm_layer : nn.Module
The normalization layer to be used.
- interpolate_mode : str
The mode for interpolation during upsampling.
- conv_norm : nn.Module
The normalization layer to be used in the decoder convolutional layers.
- conv_act : nn.Module
The activation function to be used in the decoder convolutional layers.
- align_corners : bool
Whether to align corners during upsampling.
- forward(x)
- Parameters:
x (torch.Tensor)
- Parameters:
image_size (int | tuple[int, int])
patch_size (int)
num_layers (int)
num_heads (int)
hidden_dim (int)
mlp_dim (int)
num_convs (int)
num_classes (int)
decoder_channels (int)
up_scale (int)
encoder_dropout (float)
kernel_size (int)
decoder_dropout (float)
norm_layer (torch.nn.Module)
interpolate_mode (str)
conv_norm (torch.nn.Module)
conv_act (torch.nn.Module)
align_corners (bool)
aux_output (bool)
aux_output_layers (list[int] | None)
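Example (illustrative)
The aux_output and aux_output_layers arguments imply that intermediate encoder outputs are tapped for the auxiliary heads. The sketch here shows one way such tapping can work, using plain callables in place of transformer layers. It assumes zero-based layer indices and that each tapped value is the output of the indexed layer; neither detail is confirmed by this page, and `run_with_taps` is a hypothetical name.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_with_taps(
    x: int,
    layers: Sequence[Callable[[int], int]],
    aux_output_layers: Optional[Sequence[int]] = None,
) -> Tuple[int, List[int]]:
    """Apply layers in order and collect outputs at the requested indices."""
    wanted = set(aux_output_layers or [])
    if any(i < 0 or i >= len(layers) for i in wanted):
        raise ValueError("aux_output_layers must index existing layers")
    taps = []
    for index, layer in enumerate(layers):
        x = layer(x)
        if index in wanted:  # zero-based index (assumption)
            taps.append(x)
    return x, taps


# 24 toy "layers" that each add 1, standing in for num_layers=24.
layers = [lambda v: v + 1 for _ in range(24)]
final, taps = run_with_taps(0, layers, [9, 14, 19])
print(final, taps)  # 24 [10, 15, 20]
```

With aux_output_layers set to None, the sketch returns an empty list of taps, matching a model built without auxiliary heads.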