minerva.data.datasets.har_rodrigues_24

Classes

HARDatasetCPC

An abstract class representing a Dataset.

Functions

norm_shape(shape)

Normalize numpy array shapes so they're always expressed as a tuple, even for one-dimensional shapes.

opp_sliding_window(data_x, data_y, ws, ss)

sliding_window(a, ws[, ss, flatten])

Return a sliding window over the array a in any number of dimensions.

Module Contents

class minerva.data.datasets.har_rodrigues_24.HARDatasetCPC(data_path, input_size, window, overlap, phase='train', use_train_as_val=False, use_val_with_train=True, columns=None, label='standard activity code', transpose_data=True)

Bases: torch.utils.data.Dataset

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should override __getitem__(), which fetches the data sample for a given key. Subclasses may also override __len__(), which many Sampler implementations and the default options of DataLoader expect to return the size of the dataset. Subclasses may also implement __getitems__() to speed up the loading of batched samples. This method accepts a list of sample indices for a batch and returns a list of samples.

Note

DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
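The map-style contract described above can be illustrated with a toy dataset. This is a minimal sketch, not part of minerva: the class name `SquaresDataset` is hypothetical, and the fallback to `object` exists only so the snippet runs without PyTorch installed.

```python
# Fall back to a plain base class when PyTorch is unavailable (illustration only).
try:
    from torch.utils.data import Dataset
except ImportError:
    Dataset = object


class SquaresDataset(Dataset):
    """Hypothetical map-style dataset: key i maps to the sample (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Size of the dataset, as expected by samplers and DataLoader.
        return self.n

    def __getitem__(self, index):
        # Fetch a single sample for an integral key.
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index, index * index

    def __getitems__(self, indices):
        # Optional batched access: list of indices in, list of samples out.
        return [self[i] for i in indices]


ds = SquaresDataset(5)
batch = ds.__getitems__([0, 2])
```

Because the keys are integral, the default index sampler built by DataLoader would work with such a dataset without a custom sampler.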

Initializes the dataset by loading it from CSV files, segmenting the data into windows, and preparing it for training or evaluation.

Parameters

data_path : Union[PathLike, List[PathLike]]

The path to the directory containing the dataset files. If a list of paths is provided, the datasets will be concatenated, in the order provided, into a single dataset.

input_size : int

The expected size of input features.

window : int

The size of the sliding window used to segment the data.

overlap : int

The overlap between consecutive windows.

phase : str

The phase of the dataset (‘train’, ‘val’, or ‘test’).

use_train_as_val : bool

Whether to use the training set as the validation set.

use_val_with_train : bool

Whether to use the validation set together with the training set.

columns : Optional[List[str]]

The columns to be used as input features. If None, the default columns [‘accel-x’, ‘accel-y’, ‘accel-z’, ‘gyro-x’, ‘gyro-y’, ‘gyro-z’] will be used.

label : Optional[str]

The column to be used as the label. If None, no labels will be used. If ‘return_index_as_label’, the index of the data will be used as the label.

transpose_data : bool

If True, each sample is returned as an array of shape (C, T); otherwise, it is returned as an array of shape (T, C).
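How window, overlap, and transpose_data interact can be sketched with plain NumPy. This is an illustration under stated assumptions, not the class's actual code: the helper `segment` is hypothetical, overlap is assumed to count the samples shared by consecutive windows (so the step is `window - overlap`), and C and T are assumed to denote channels and time steps.

```python
import numpy as np


def segment(signal, window, overlap):
    """Cut a (T, C) array into windows of `window` rows.

    Assumption: consecutive windows share `overlap` rows, so the step
    between window starts is `window - overlap`.
    """
    step = window - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than window")
    starts = range(0, signal.shape[0] - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])


# Toy recording: 10 time steps, 6 channels (e.g. 3 accelerometer + 3 gyroscope axes).
signal = np.arange(60, dtype=np.float32).reshape(10, 6)

windows = segment(signal, window=4, overlap=2)  # one (T, C) window per row

sample_tc = windows[0]   # layout assumed for transpose_data=False: (T, C)
sample_ct = sample_tc.T  # layout assumed for transpose_data=True: (C, T)
```

The (C, T) layout matches what one-dimensional convolutional encoders in PyTorch typically expect, which may be why it is the default here.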

__getitem__(index)
__len__()
columns
data_raw
input_size
label = 'standard activity code'
load_dataset()

Loads the dataset from CSV files, concatenates them into numpy arrays, and converts them to the appropriate data types.

Returns

dict

A dictionary containing ‘data’ and ‘labels’ for ‘train’, ‘val’, and ‘test’ phases, where ‘data’ is a numpy array of concatenated data and ‘labels’ is a numpy array of concatenated labels.
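One plausible layout for the returned dictionary is sketched below with synthetic values. The nesting (phase first, then 'data' and 'labels'), the dtypes, and the helper `fake_split` are assumptions made for illustration; the description above does not pin them down.

```python
import numpy as np

rng = np.random.default_rng(0)


def fake_split(n_rows, n_channels=6):
    """Hypothetical stand-in for one phase: row-aligned data and labels."""
    return {
        "data": rng.standard_normal((n_rows, n_channels)).astype(np.float32),
        "labels": rng.integers(0, 5, size=n_rows).astype(np.uint8),
    }


# Assumed nesting: phase -> {'data', 'labels'}.
datasets = {
    "train": fake_split(100),
    "val": fake_split(20),
    "test": fake_split(30),
}

# Each phase keeps one label per data row.
row_counts = {phase: split["data"].shape[0] for phase, split in datasets.items()}
```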

paths
transpose_data = True
use_train_as_val = False
use_val_with_train = True
Parameters:
  • data_path (Union[minerva.utils.typing.PathLike, List[minerva.utils.typing.PathLike]])

  • input_size (int)

  • window (int)

  • overlap (int)

  • phase (str)

  • use_train_as_val (bool)

  • use_val_with_train (bool)

  • columns (Optional[List[str]])

  • label (Optional[str])

  • transpose_data (bool)

minerva.data.datasets.har_rodrigues_24.norm_shape(shape)

Normalize numpy array shapes so they’re always expressed as a tuple, even for one-dimensional shapes.

Parameters

shape : int, tuple, or numpy.ndarray

The shape to be normalized.

Returns

Tuple[int, …]

The normalized shape.
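The behavior described above can be approximated as follows. This is a sketch written from the description alone, so the name `norm_shape_sketch` and the error message are assumptions, and the module's own implementation may differ.

```python
import numpy as np


def norm_shape_sketch(shape):
    """Return `shape` as a tuple of ints, wrapping a bare integer in a 1-tuple."""
    if isinstance(shape, (int, np.integer)):
        return (int(shape),)
    try:
        # Tuples, lists, and 1-D numpy arrays are all iterable.
        return tuple(int(s) for s in shape)
    except TypeError:
        raise TypeError("shape must be an int or an iterable of ints") from None


one_dim = norm_shape_sketch(5)
two_dim = norm_shape_sketch(np.array([4, 2]))
```

Normalizing up front lets downstream code treat one-dimensional and multi-dimensional window sizes uniformly.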

minerva.data.datasets.har_rodrigues_24.opp_sliding_window(data_x, data_y, ws, ss)
minerva.data.datasets.har_rodrigues_24.sliding_window(a, ws, ss=None, flatten=True)

Return a sliding window over the array a in any number of dimensions.

Parameters

a : numpy.ndarray

An n-dimensional numpy array.

ws : int or tuple

The size of each dimension of the window: an int if a is 1D, or a tuple if a is 2D or greater.

ss : int or tuple, optional

The amount to slide the window in each dimension: an int if a is 1D, or a tuple if a is 2D or greater. If not specified, it defaults to ws.

flatten : bool

If True, all slices are flattened; otherwise, there is an extra dimension for each dimension of the input.

Returns

numpy.ndarray

An array containing each n-dimensional window from a.
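The roles of ws and ss can be illustrated with NumPy's built-in `sliding_window_view` (available in NumPy 1.20 and later). This sketch deliberately uses the NumPy helper rather than this module's `sliding_window`, so the exact output shapes of the module function, in particular the effect of flatten, may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 1D case: window size 4, step 2.
a = np.arange(10)
ws, ss = 4, 2
windows_1d = sliding_window_view(a, ws)[::ss]

# 2D case: each window spans 3 rows and all 4 columns, stepping 3 rows at a time.
b = np.arange(24).reshape(6, 4)
windows_2d = sliding_window_view(b, (3, 4))[::3, ::1].reshape(-1, 3, 4)
```

Here `windows_1d` holds the windows starting at indices 0, 2, 4, and 6, and `windows_2d` holds two non-overlapping 3-row blocks of `b`.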