minerva.data.datasets

Submodules

Classes

SimpleDataset

Loads data from multiple readers and applies the specified transforms.

SupervisedReconstructionDataset

A simple dataset class for supervised reconstruction tasks.

Package Contents

class minerva.data.datasets.SimpleDataset(readers, transforms=None, return_single=False)

Bases: torch.utils.data.Dataset

Dataset is responsible for loading data from multiple readers and applying the specified transforms. It is a generic implementation that can be used to create different kinds of datasets, from supervised to unsupervised ones.

This class implements the common pipeline for reading and transforming data. The __getitem__ pipeline is as follows:

For each reader R and transform list T:
  1. Read the data from the reader R at the index idx.

  2. Apply the transforms T to the data.

  3. Append the transformed data to the list of data.

Return the tuple of transformed data.
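The steps above can be sketched in plain Python. This is a minimal illustration rather than Minerva's actual implementation: `get_item` is a hypothetical helper, and plain lists stand in for readers, on the assumption that a reader only needs to be indexable by `idx`.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

Transform = Optional[Callable[[Any], Any]]


def get_item(
    readers: Sequence[Sequence[Any]],
    transforms: Sequence[Transform],
    idx: int,
) -> Tuple[Any, ...]:
    """Hypothetical helper mirroring the documented __getitem__ steps."""
    data = []
    for reader, transform in zip(readers, transforms):
        sample = reader[idx]              # 1. read from reader R at index idx
        if transform is not None:
            sample = transform(sample)    # 2. apply the transform T
        data.append(sample)               # 3. collect the transformed sample
    return tuple(data)


# Stand-in readers: lists play the role of image and label readers.
images = [[1, 2], [3, 4]]
labels = ["cat", "dog"]

sample = get_item(
    readers=[images, labels],
    transforms=[lambda x: [v * 10 for v in x], None],
    idx=1,
)
print(sample)  # ([30, 40], 'dog')
```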

Load data from multiple sources and apply specified transforms.

Parameters

readers: Union[_Reader, List[_Reader]]

The list of readers to load data from. It can be a single reader or a list of readers.

transforms: Optional[Union[_Transform, List[_Transform]]], optional

The transforms to apply to each sample. This can be:

  • None, in which case no transform is applied.

  • A single transform, in which case the same transform is applied to data from all readers.

  • A list of transforms, in which case each transform is applied to the corresponding reader. That is, the first transform is applied to the first reader, the second transform is applied to the second reader, and so on.
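One way to picture the three accepted forms is a helper that expands the argument to one entry per reader. The sketch below is an assumption about how this could be done, not the library's code: `normalize_transforms` is a hypothetical name, and raising `ValueError` on a length mismatch is an illustrative choice that the documentation does not specify.

```python
from typing import Any, Callable, List, Optional, Sequence, Union

Transform = Callable[[Any], Any]


def normalize_transforms(
    transforms: Union[None, Transform, Sequence[Optional[Transform]]],
    num_readers: int,
) -> List[Optional[Transform]]:
    """Hypothetical helper: expand `transforms` to one entry per reader."""
    if transforms is None:
        # No transform for any reader.
        return [None] * num_readers
    if callable(transforms):
        # A single transform is shared by every reader.
        return [transforms] * num_readers
    transforms = list(transforms)
    if len(transforms) != num_readers:
        # Assumed behaviour: a mismatched list is rejected.
        raise ValueError("expected one transform per reader")
    return transforms


def double(x):
    return x * 2


no_transforms = normalize_transforms(None, 2)          # [None, None]
shared = normalize_transforms(double, 2)               # [double, double]
per_reader = normalize_transforms([double, None], 2)   # [double, None]
```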

return_single: bool, optional

If True, the __getitem__ method will return a single sample when a single reader is used. This is useful for unsupervised datasets, where we usually have a single reader. If False, the __getitem__ method will return a tuple of samples, where each sample comes from a different reader at the same index. This is useful for supervised datasets, where the data from different readers are related and should be returned together. The default is False.
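The effect of `return_single` on the return shape can be illustrated with a small stand-in. `select_output` is a hypothetical helper. The text above does not say what happens when `return_single=True` is combined with several readers, so this sketch assumes the result is unwrapped only when exactly one sample is present.

```python
from typing import Any, Sequence, Tuple, Union


def select_output(
    samples: Sequence[Any], return_single: bool
) -> Union[Any, Tuple[Any, ...]]:
    """Hypothetical helper: choose the return shape of __getitem__."""
    if return_single and len(samples) == 1:
        # Assumption: unwrap only when there is exactly one sample.
        return samples[0]
    return tuple(samples)


unsupervised = select_output(["image"], return_single=True)
supervised = select_output(["image", "label"], return_single=False)
wrapped = select_output(["image"], return_single=False)

print(unsupervised)  # image
print(supervised)    # ('image', 'label')
print(wrapped)       # ('image',)
```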

Examples

1. Supervised Dataset:

```python
from minerva.data.readers import ImageReader, LabelReader
from minerva.transforms import ImageTransform, LabelTransform
from minerva.data.datasets import SimpleDataset

# Create the readers
image_reader = ImageReader("path/to/images")
label_reader = LabelReader("path/to/labels")

# Create the transforms
image_transform = ImageTransform()
label_transform = None  # No transform for the labels

# Create the dataset
dataset = SimpleDataset(
    readers=[image_reader, label_reader],
    transforms=[image_transform, label_transform],
)

dataset[0]  # Returns (image, label)
```

2. Unsupervised Dataset:

```python
from minerva.data.readers import ImageReader
from minerva.transforms import ImageTransform
from minerva.data.datasets import SimpleDataset

# Create the reader
image_reader = ImageReader("path/to/images")

# Create the transform
image_transform = ImageTransform()

# Create the dataset
dataset = SimpleDataset(
    readers=[image_reader],
    transforms=image_transform,
    return_single=True,
)

dataset[0]  # Returns image
```

__getitem__(idx)

Load data from multiple sources and apply specified transforms.

Parameters

idx: int

The index of the sample to load.

Returns

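Union[Any, Tuple[Any, ...]]

A tuple of transformed data, one element per reader, or a single sample when return_single is True and a single reader is used.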

Parameters:

idx (int)

Return type:

Union[Any, Tuple[Any, ...]]

__len__()

The length of the dataset is the length of the first reader.

Returns

int

The number of samples in the dataset.

Return type:

int
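Because only the first reader determines the length, `__len__` itself does not reveal readers of unequal size. The sketch below shows the documented rule next to one way a caller might verify alignment beforehand. Both `dataset_length` and `readers_aligned` are hypothetical helpers, and the alignment check is an assumption, not something the class is documented to perform.

```python
from typing import Any, Sequence


def dataset_length(readers: Sequence[Sequence[Any]]) -> int:
    """Documented rule: the dataset is as long as its first reader."""
    return len(readers[0])


def readers_aligned(readers: Sequence[Sequence[Any]]) -> bool:
    """Hypothetical pre-flight check: do all readers have the same length?"""
    return len({len(reader) for reader in readers}) == 1


# Stand-in readers: plain lists of unequal length.
images = ["img0", "img1", "img2"]
labels = ["lab0", "lab1"]  # one label is missing

print(dataset_length([images, labels]))   # 3
print(readers_aligned([images, labels]))  # False
```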

class minerva.data.datasets.SupervisedReconstructionDataset(readers, transforms=None)

Bases: minerva.data.datasets.base.SimpleDataset

A simple dataset class for supervised reconstruction tasks.

In summary, each element of the dataset is a pair of data, where the first element is the input data and the second element is the target data. Usually, both input and target data have the same shape.

This dataset is useful for supervised tasks such as image reconstruction, semantic segmentation, and object detection, where the input data is the original data and the target is a mask or a segmentation map.

Examples

  1. Semantic Segmentation Dataset:

    ```python
    from minerva.data.readers import ImageReader
    from minerva.transforms import ImageTransform
    from minerva.data.datasets import SupervisedReconstructionDataset

    # Create the readers
    image_reader = ImageReader("path/to/images")
    mask_reader = ImageReader("path/to/masks")

    # Create the transforms
    image_transform = ImageTransform()

    # Create the dataset
    dataset = SupervisedReconstructionDataset(
        readers=[image_reader, mask_reader],
        transforms=image_transform,
    )

    # Load the first sample
    dataset[0]  # Returns a tuple: (image, mask)
    ```


Parameters

readers: List[_Reader]

List of data readers. It must contain exactly 2 readers. The first reader for the input data and the second reader for the target data.

transforms: _Transform | None

Optional data transformation pipeline.

Raises

AssertionError: If the number of readers is not exactly 2.
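The reader-count requirement can be pictured as a plain `assert` on the length of the reader list. `check_readers` and its message are hypothetical, chosen only to illustrate the documented `AssertionError`. Assertions are skipped when Python runs with the `-O` flag, so a check written this way is not enforced in optimized mode.

```python
from typing import Any, Sequence


def check_readers(readers: Sequence[Any]) -> None:
    """Hypothetical check mirroring the documented AssertionError."""
    assert len(readers) == 2, (
        f"expected exactly 2 readers (input, target), got {len(readers)}"
    )


check_readers(["input_reader", "target_reader"])  # passes silently

message = ""
try:
    check_readers(["input_reader"])
except AssertionError as error:
    message = str(error)

print(message)  # expected exactly 2 readers (input, target), got 1
```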

__getitem__(index)

Load data from sources and apply specified transforms. The same transform is applied to both input and target data.

Parameters

index: int

The index of the sample to load.

Returns

Tuple[np.ndarray, np.ndarray]

A tuple containing two numpy arrays representing the data.

Parameters:

index (int)

Return type:

Tuple[numpy.ndarray, numpy.ndarray]
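Applying one transform to both elements of the pair can be sketched as follows. `get_pair` and `flip` are hypothetical names, and nested lists stand in for the numpy arrays a real reader would return, so the example runs without extra dependencies.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


def get_pair(
    readers: Sequence[Sequence[Any]],
    transform: Optional[Callable[[Any], Any]],
    index: int,
) -> Tuple[Any, Any]:
    """Hypothetical helper: read (input, target) and transform both alike."""
    input_reader, target_reader = readers
    input_data = input_reader[index]
    target_data = target_reader[index]
    if transform is not None:
        # The same transform is applied to the input and to the target.
        input_data = transform(input_data)
        target_data = transform(target_data)
    return input_data, target_data


def flip(row):
    """Reverse a 1-D row, standing in for a horizontal image flip."""
    return row[::-1]


# Stand-in readers: each row plays the role of one image or one mask.
images = [[1, 2, 3], [4, 5, 6]]
masks = [[0, 0, 1], [1, 0, 0]]

image, mask = get_pair([images, masks], flip, index=0)
print(image, mask)  # [3, 2, 1] [1, 0, 0]
```

With a deterministic transform such as `flip`, the input and the mask stay spatially aligned. A randomized transform would need to draw the same parameters for both calls to preserve that alignment; how Minerva handles this is not covered here.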
