dasf.ml.dl.lightning_fit

Classes

LazyDatasetComputer

This class encapsulates a map-style dataset that returns a Dask or GPU array.

LightningTrainer

Initialize the LightningTrainer class.

Module Contents

class dasf.ml.dl.lightning_fit.LazyDatasetComputer(dataset, unsqueeze_dim=None)

This class encapsulates a map-style dataset that returns a Dask or GPU array. The __getitem__ method computes the Dask array before returning it, so instances of this class can be wrapped in a DataLoader to make them compatible with the PyTorch Lightning training loop (see the usage sketch after this entry).

Maps a dataset to a LazyDatasetComputer object.

Parameters

dataset : Any

A DASF map-style dataset. Its __getitem__ method should return either a single object or a tuple of objects, in CPU, GPU, or Dask array format.

unsqueeze_dim : int, optional

The dimension to be unsqueezed in the output, by default None.

__len__()
__getitem__(index)

Compute the dask array and return it.

Parameters

index : int

The index of the dataset to be returned.

Returns

np.ndarray or tuple of np.ndarray

The computed array, or a tuple of computed arrays, at the requested index.

Parameters:

index (int)

Return type:

Union[numpy.ndarray, Tuple[numpy.ndarray]]

Parameters:
  • dataset (Any)

  • unsqueeze_dim (int)
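
A minimal usage sketch, assuming a toy map-style dataset backed by a Dask array. The DaskMapDataset class below is hypothetical; only LazyDatasetComputer and its documented parameters come from this module.

    import dask.array as da
    from torch.utils.data import DataLoader

    from dasf.ml.dl.lightning_fit import LazyDatasetComputer


    class DaskMapDataset:
        # Hypothetical map-style dataset backed by a Dask array.
        def __init__(self, data):
            self.data = data

        def __len__(self):
            return self.data.shape[0]

        def __getitem__(self, index):
            # Returns a lazy Dask slice; LazyDatasetComputer computes it on access.
            return self.data[index]


    data = da.random.random((100, 16), chunks=(10, 16))
    dataset = LazyDatasetComputer(DaskMapDataset(data))

    sample = dataset[0]          # computed np.ndarray of shape (16,)
    loader = DataLoader(dataset, batch_size=4)
    batch = next(iter(loader))   # default collate stacks the numpy samples

Because __getitem__ materializes each sample, the DataLoader only ever sees concrete arrays, which is what the Lightning training loop expects.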

class dasf.ml.dl.lightning_fit.LightningTrainer(model, use_gpu=False, batch_size=1, max_epochs=1, limit_train_batches=None, limit_val_batches=None, devices='auto', num_nodes=1, shuffle=True, strategy='ddp', unsqueeze_dim=None)

Initialize the LightningTrainer class.

Parameters

model : LightningModule

The LightningModule instance representing the model to be trained.

use_gpu : bool, optional

Flag indicating whether to use GPU for training, by default False.

batch_size : int, optional

The batch size for training, by default 1.

max_epochs : int, optional

The maximum number of epochs for training, by default 1.

limit_train_batches : int, optional

The number of batches to consider for training, by default None.

limit_val_batches : int, optional

The number of batches to consider for validation, by default None.

devices : int, optional

The number of devices to use for training, by default “auto”.

num_nodes : int, optional

The number of nodes to use for distributed training, by default 1.

shuffle : bool, optional

Flag indicating whether to shuffle the data during training, by default True.

strategy : str, optional

The strategy to use for distributed training, by default “ddp”.

unsqueeze_dim : int, optional

The dimension to unsqueeze the input data, by default None.
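
A minimal construction sketch. SimpleNet below is a hypothetical LightningModule used only for illustration; the keyword arguments are the parameters documented above.

    import torch
    from torch import nn
    import lightning as L

    from dasf.ml.dl.lightning_fit import LightningTrainer


    class SimpleNet(L.LightningModule):
        # Hypothetical model used only to illustrate the constructor arguments.
        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(16, 1)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return nn.functional.mse_loss(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)


    trainer = LightningTrainer(
        SimpleNet(),
        use_gpu=False,       # CPU training
        batch_size=8,
        max_epochs=3,
        devices="auto",
        num_nodes=1,
        shuffle=True,
    )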

fit(train_data, val_data=None)

Perform the training of the model using PyTorch Lightning.

Parameters

train_data : Any

A DASF map-style dataset containing the training data.

val_data : Any, optional

A DASF map-style dataset containing the validation data.

Parameters:
  • train_data (Any)

  • val_data (Any)
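
Continuing the construction sketch above: trainer is the LightningTrainer built in the previous example, and train_ds / val_ds are hypothetical DASF map-style datasets (for instance, the Dask-backed dataset shown for LazyDatasetComputer).

    # Fit with both training and validation data.
    trainer.fit(train_ds, val_data=val_ds)

    # Validation data is optional.
    trainer.fit(train_ds)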

_fit(train_data, val_data=None)
_lazy_fit_cpu(train_data, val_data=None)
_lazy_fit_gpu(train_data, val_data=None)
_fit_cpu(train_data, val_data=None)
_fit_gpu(train_data, val_data=None)
Parameters:
  • model (lightning.LightningModule)

  • use_gpu (bool)

  • batch_size (int)

  • max_epochs (int)

  • limit_train_batches (int)

  • limit_val_batches (int)

  • devices (int)

  • num_nodes (int)

  • shuffle (bool)

  • strategy (str)

  • unsqueeze_dim (int)