dasf.ml.dl.lightning_fit

Classes

LazyDatasetComputer

This class encapsulates a map-style dataset that returns a Dask or GPU array.

LightningTrainer

Initialize the LightningTrainer class.

Module Contents

class dasf.ml.dl.lightning_fit.LazyDatasetComputer(dataset, unsqueeze_dim=None)[source]

This class encapsulates a map-style dataset that returns a Dask or GPU array. The __getitem__ method will compute the dask array before returning it. Thus, we can wrap this class into a DataLoader to make it compatible with PyTorch Lightning training loop.

Maps a dataset to a LazyDatasetComputer object.

Parameters

dataset : Any

A DASF map-style dataset. Its __getitem__ method should return either a tuple or a single object, as a CPU/GPU or Dask array.

unsqueeze_dim : int, optional

The dimension to be unsqueezed in the output, by default None.
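For reference, unsqueezing inserts a new axis of length 1 at the given position, analogous to NumPy's expand_dims or torch.unsqueeze. A sketch of the expected effect (not DASF's internal code):

```python
import numpy as np

# A 2-D sample as it might come out of a map-style dataset.
sample = np.zeros((64, 64))

# unsqueeze_dim=0 would prepend a length-1 (channel-like) axis.
unsqueezed = np.expand_dims(sample, axis=0)
print(unsqueezed.shape)  # (1, 64, 64)
```

This is commonly needed when a model expects an explicit channel axis that the raw samples lack.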

dataset
unsqueeze_dim
__len__()[source]
__getitem__(index)[source]

Compute the dask array and return it.

Parameters

index : int

The index of the item to be returned.

Returns

np.ndarray or tuple of np.ndarray

The computed item, with any Dask array materialized as a NumPy array.

Parameters:

index (int)

Return type:

Union[numpy.ndarray, Tuple[numpy.ndarray]]

Parameters:
  • dataset (Any)

  • unsqueeze_dim (int)
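The compute-on-access pattern can be sketched without Dask or PyTorch: a wrapper whose __getitem__ materializes lazy items before a DataLoader would consume them. This is a simplified stand-in; `FakeLazyArray` and `LazyComputer` are hypothetical illustrations, not part of DASF:

```python
class FakeLazyArray:
    """Stands in for a Dask array: holds a thunk until compute() is called."""
    def __init__(self, thunk):
        self._thunk = thunk

    def compute(self):
        return self._thunk()


class LazyComputer:
    """Simplified analogue of LazyDatasetComputer: computes items on access."""
    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        item = self.dataset[index]
        # Materialize lazy values; eager values pass through unchanged.
        if hasattr(item, "compute"):
            return item.compute()
        return item


lazy_data = [FakeLazyArray(lambda i=i: i * 10) for i in range(3)]
computer = LazyComputer(lazy_data)
print([computer[i] for i in range(3)])  # [0, 10, 20]
```

Because the wrapper exposes __len__ and __getitem__, it satisfies the map-style dataset protocol and could be handed directly to a torch DataLoader.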

class dasf.ml.dl.lightning_fit.LightningTrainer(model, use_gpu=False, batch_size=1, max_epochs=1, limit_train_batches=None, limit_val_batches=None, devices='auto', num_nodes=1, shuffle=True, strategy='ddp', unsqueeze_dim=None)[source]

Initialize the LightningTrainer class.

Parameters

model : LightningModule

The LightningModule instance representing the model to be trained.

use_gpu : bool, optional

Flag indicating whether to use GPU for training, by default False.

batch_size : int, optional

The batch size for training, by default 1.

max_epochs : int, optional

The maximum number of epochs for training, by default 1.

limit_train_batches : int, optional

The number of batches to consider for training, by default None.

limit_val_batches : int, optional

The number of batches to consider for validation, by default None.

devices : int or str, optional

The number of devices to use for training, by default “auto”.

num_nodes : int, optional

The number of nodes to use for distributed training, by default 1.

shuffle : bool, optional

Flag indicating whether to shuffle the data during training, by default True.

strategy : str, optional

The strategy to use for distributed training, by default “ddp”.

unsqueeze_dim : int, optional

The dimension to unsqueeze the input data, by default None.

model
accelerator
batch_size
max_epochs
limit_train_batches
limit_val_batches
devices
num_nodes
shuffle
strategy
unsqueeze_dim
fit(train_data, val_data=None)[source]

Perform the training of the model using PyTorch Lightning.

Parameters

train_data : Any

A DASF map-style dataset containing the training data.

val_data : Any, optional

A DASF map-style dataset containing the validation data, by default None.

Parameters:
  • train_data (Any)

  • val_data (Any)

_fit(train_data, val_data=None)[source]
_lazy_fit_cpu(train_data, val_data=None)[source]
_lazy_fit_gpu(train_data, val_data=None)[source]
_fit_cpu(train_data, val_data=None)[source]
_fit_gpu(train_data, val_data=None)[source]
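The four private variants above suggest a two-way dispatch: eager vs. lazy (Dask-backed) data, and CPU vs. GPU. A minimal stand-in of that selection logic; the returned method names mirror the signatures above, but the dispatch function itself is a hypothetical sketch, not DASF's actual code:

```python
def select_fit_path(use_gpu: bool, is_lazy: bool) -> str:
    """Pick a fit variant by device and data laziness (illustrative only)."""
    if is_lazy:
        return "_lazy_fit_gpu" if use_gpu else "_lazy_fit_cpu"
    return "_fit_gpu" if use_gpu else "_fit_cpu"


print(select_fit_path(use_gpu=True, is_lazy=True))    # _lazy_fit_gpu
print(select_fit_path(use_gpu=False, is_lazy=False))  # _fit_cpu
```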
Parameters:
  • use_gpu (bool)

  • batch_size (int)

  • max_epochs (int)

  • limit_train_batches (int)

  • limit_val_batches (int)

  • devices (int)

  • num_nodes (int)

  • shuffle (bool)

  • strategy (str)

  • unsqueeze_dim (int)