minerva.data.readers.csv_reader
Classes
CSVReader: Base class for readers. Readers define an ordered collection of data and provide methods to access it.
Module Contents
- class minerva.data.readers.csv_reader.CSVReader(path, columns_to_select, cast_to=None, data_shape=None)
Bases: minerva.data.readers.tabular_reader.TabularReader
Base class for readers. Readers define an ordered collection of data and provide methods to access it. This class primarily handles:
- Definition of data structure and storage.
- Reading data from the source.
Access is handled by the __getitem__ and __len__ methods, which subclasses should implement. A reader usually returns a single item at a time, such as a single image or a single label.
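The access protocol described above can be illustrated with a minimal sketch. The class name ListReader and its in-memory storage are hypothetical and are not part of minerva; the sketch only shows the shape of a reader that implements __getitem__ and __len__.

```python
from typing import Any, Sequence


class ListReader:
    """Hypothetical reader over an in-memory sequence (not a minerva class)."""

    def __init__(self, items: Sequence[Any]) -> None:
        # Definition of the data structure and storage.
        self._items = list(items)

    def __getitem__(self, index: int) -> Any:
        # Return a single item at a time.
        return self._items[index]

    def __len__(self) -> int:
        # Number of items in the ordered collection.
        return len(self._items)


reader = ListReader(["a.png", "b.png", "c.png"])
first_item = reader[0]
n_items = len(reader)
```

Because the reader supports both indexing and len(), it can be consumed by any code that expects a sequence-like object.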
Reader that selects columns from a DataFrame and returns them as a NumPy array. The DataFrame is indexed by row number, and each row is treated as one sample. The __getitem__ method therefore returns the selected columns of the row at the given index as a NumPy array.
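The row-as-sample behavior can be sketched with pandas and NumPy alone. This is an illustration, not the minerva implementation: it assumes the CSV is loaded with pandas.read_csv, and the helper get_row and the column names are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# In-memory CSV so the sketch is self-contained (a real reader would use a file path).
csv_text = "x,y,label\n1.0,3.0,0\n2.0,4.0,1\n"
df = pd.read_csv(io.StringIO(csv_text))


def get_row(frame: pd.DataFrame, columns: list, index: int) -> np.ndarray:
    """Hypothetical helper: selected columns of the row at `index` as a NumPy array."""
    return frame[columns].iloc[index].to_numpy()


# Each row is one sample; index 1 refers to the second data row.
sample = get_row(df, ["x", "y"], 1)
n_samples = len(df)
```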
Parameters
- df (pd.DataFrame)
The DataFrame to select the columns from. The DataFrame should have the columns that are specified in the columns_to_select parameter.
- columns_to_select (Union[str, list[str]])
A string or a list of strings used to select the columns from the DataFrame. The string can be a regular expression pattern or a column name. The columns that match the pattern will be selected.
- cast_to (str, optional)
Cast the selected columns to the specified data type. If None, the data type of the columns will not be changed. (default is None)
- data_shape (tuple[int, ...], optional)
The shape of the data to be returned. If None, the data will be returned as a 1D array. If provided, the data will be reshaped to the specified shape. (default is None)
- Parameters:
path (str)
columns_to_select (Union[str, list[str]])
cast_to (str)
data_shape (tuple[int, ...])
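How columns_to_select, cast_to, and data_shape might combine is sketched below. This is an assumption-laden illustration rather than the library's code: the helpers select_columns and read_sample are hypothetical, and matching each pattern against the full column name with re.fullmatch is an assumed semantics that the documentation above does not specify.

```python
import re
from typing import Optional, Union

import numpy as np
import pandas as pd


def select_columns(frame: pd.DataFrame, patterns: Union[str, list]) -> list:
    """Hypothetical helper: column names matching any pattern, in pattern order."""
    if isinstance(patterns, str):
        patterns = [patterns]
    selected = []
    for pattern in patterns:
        for name in frame.columns:
            # Assumption: a pattern must match the whole column name.
            if re.fullmatch(pattern, name) and name not in selected:
                selected.append(name)
    return selected


def read_sample(
    frame: pd.DataFrame,
    index: int,
    patterns: Union[str, list],
    cast_to: Optional[str] = None,
    data_shape: Optional[tuple] = None,
) -> np.ndarray:
    """Hypothetical helper: one row, optionally cast and reshaped."""
    columns = select_columns(frame, patterns)
    values = frame[columns].iloc[index].to_numpy()
    if cast_to is not None:
        values = values.astype(cast_to)      # change the dtype only when requested
    if data_shape is not None:
        values = values.reshape(data_shape)  # otherwise the result stays 1D
    return values


df = pd.DataFrame(
    {
        "accel_0": [1, 5],
        "accel_1": [2, 6],
        "gyro_0": [3, 7],
        "gyro_1": [4, 8],
        "label": [0, 1],
    }
)

# Two regular expressions select four columns; the row is cast and reshaped to 2x2.
sample = read_sample(df, 0, ["accel_.*", "gyro_.*"], cast_to="float32", data_shape=(2, 2))
```

A plain column name such as "label" also works as a pattern here, since a literal name matches itself.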