minerva.data.readers.tabular_reader
Classes
| TabularReader | Base class for readers. Readers define an ordered collection of data and provide methods to access it. |
Module Contents
- class minerva.data.readers.tabular_reader.TabularReader(df, columns_to_select, cast_to=None, data_shape=None)
- Bases: minerva.data.readers.reader._Reader
- Base class for readers. Readers define an ordered collection of data and provide methods to access it. This class primarily handles:
- Definition of data structure and storage.
- Reading data from the source.
- Access is handled by the __getitem__ and __len__ methods, which should be implemented by a subclass. Readers usually return a single item at a time, such as a single image or a single label.
- Reader to select columns from a DataFrame and return them as a NumPy array. The DataFrame is indexed by row number, and each row is treated as one sample. Thus, the __getitem__ method returns the selected columns of the row at the specified index as a NumPy array.
- Parameters
- df : pd.DataFrame
- The DataFrame to select the columns from. The DataFrame should have the columns that are specified in the columns_to_select parameter. 
- columns_to_select : Union[str, list[str]]
- A string or a list of strings used to select columns from the DataFrame. The string can be a column name or a regular expression pattern; the columns that match the pattern are selected. If columns_to_select is a list, the result is always a NumPy array with the columns in the same order as the list. If columns_to_select is a string, the result is a NumPy array when more than one column is selected; otherwise it is a single value (not a NumPy array).
- cast_to : str, optional
- Cast the selected columns to the specified data type. If None, the data type of the columns will not be changed. (default is None) 
- data_shape : tuple[int, …], optional
- The shape of the data to be returned. If None, the data will be returned as a 1D array. If provided, the data will be reshaped to the specified shape. (default is None) 
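The column selection described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the helper name `resolve_columns` is hypothetical, and the use of `re.fullmatch` (the whole column name must match) is an assumption, since the documentation does not state which matching rule is applied.

```python
import re


def resolve_columns(available, columns_to_select):
    """Hypothetical helper: expand names/patterns into concrete column names."""
    # A bare string is handled as a one-element list of patterns.
    if isinstance(columns_to_select, str):
        patterns = [columns_to_select]
    else:
        patterns = list(columns_to_select)

    resolved = []
    for pattern in patterns:
        # Assumption: the whole column name must match the pattern.
        resolved.extend(name for name in available if re.fullmatch(pattern, name))
    return resolved


columns = ["accel-x", "accel-y", "gyro-x", "gyro-y", "label"]

# The result follows the order of the list passed in, not the DataFrame order.
print(resolve_columns(columns, ["gyro-.*", "accel-x"]))  # ['gyro-x', 'gyro-y', 'accel-x']

# A plain column name is a pattern that matches only itself.
print(resolve_columns(columns, "label"))  # ['label']
```

Column names that contain regular expression metacharacters (for example `.` or `+`) may need escaping with `re.escape` if they are meant literally.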
 - __getitem__(index)
- Return the columns of the DataFrame at the specified row index as a NumPy array. The columns are selected based on self.columns_to_select.
- Parameters
- index : int
- The row index to select the columns from the DataFrame. 
 - Returns
- np.ndarray
- The selected columns from the row as a NumPy array. 
 - Parameters:
- index (int) 
- Return type:
- numpy.ndarray 
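As a rough sketch of the documented row access, the function below takes one row by position, keeps the chosen columns, and then applies the optional cast and reshape. The name `row_to_array` and the sample column names are hypothetical, and the order of the cast and reshape steps is an assumption; the code only mirrors the behaviour described here and is not taken from the library.

```python
import numpy as np
import pandas as pd


def row_to_array(df, index, columns, cast_to=None, data_shape=None):
    """Hypothetical stand-in: turn one DataFrame row into a NumPy array."""
    # Select the row by position, then keep the requested columns in order.
    values = df.iloc[index][columns].to_numpy()
    if cast_to is not None:
        values = values.astype(cast_to)      # e.g. "float32"
    if data_shape is not None:
        values = values.reshape(data_shape)  # e.g. (2, 2)
    return values


df = pd.DataFrame(
    {"accel-x": [1, 2], "accel-y": [3, 4], "gyro-x": [5, 6], "gyro-y": [7, 8]}
)
selected = ["accel-x", "accel-y", "gyro-x", "gyro-y"]

flat = row_to_array(df, 0, selected)
grid = row_to_array(df, 1, selected, cast_to="float32", data_shape=(2, 2))

print(flat.shape)              # (4,)
print(grid.shape, grid.dtype)  # (2, 2) float32
```

The product of the entries in `data_shape` has to equal the number of selected columns, otherwise NumPy's `reshape` raises a `ValueError`.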
 
 - __len__()
- Return the number of samples in the DataFrame. The number of samples is equal to the number of rows in the DataFrame.
- Returns
- int
- The number of samples in the DataFrame. 
 - Return type:
- int 
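Because a reader exposes both __len__ and __getitem__, it can be traversed by index like any sequence. The class `RowReader` below is a hypothetical, simplified stand-in (explicit column list, no casting or reshaping) used only to show that pattern; it is not the TabularReader implementation.

```python
import numpy as np
import pandas as pd


class RowReader:
    """Hypothetical minimal reader: one DataFrame row per sample."""

    def __init__(self, df, columns):
        self.df = df
        self.columns = columns

    def __len__(self):
        # One sample per DataFrame row.
        return len(self.df)

    def __getitem__(self, index):
        # Row selected by position, returned as a 1D NumPy array.
        return self.df.iloc[index][self.columns].to_numpy()


df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
reader = RowReader(df, ["a", "b"])

# Visit every sample by index and stack them into one 2D array.
samples = [reader[i] for i in range(len(reader))]
stacked = np.stack(samples)

print(len(reader))    # 3
print(stacked.shape)  # (3, 2)
```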
 
 - __str__()
- Return a string representation of the TabularReader object.
- Returns
- str
- A string representation of the TabularReader object. 
 - Return type:
- str 
 
 - cast_to = None
 - columns_to_select
 - data_shape = None
 - df
 - return_single = False
 - Parameters:
- df (pandas.DataFrame) 
- columns_to_select (Union[str, List[str]]) 
- cast_to (Optional[str]) 
- data_shape (Optional[Tuple[int, Ellipsis]])