minerva.data.readers.tabular_reader
===================================

.. py:module:: minerva.data.readers.tabular_reader


Classes
-------

.. autoapisummary::

   minerva.data.readers.tabular_reader.TabularReader


Module Contents
---------------

.. py:class:: TabularReader(df, columns_to_select, cast_to = None, data_shape = None)

   Bases: :py:obj:`minerva.data.readers.reader._Reader`


   Base class for readers. Readers define an ordered collection of data and
   provide methods to access it. This class primarily handles:

   1. Definition of data structure and storage.
   2. Reading data from the source.

   Access is handled by the ``__getitem__`` and ``__len__`` methods, which
   should be implemented by a subclass. Readers usually return a single item
   at a time, such as a single image or a single label.

   Reader to select columns from a DataFrame and return them as a NumPy
   array. The DataFrame is indexed by row number, and each row is treated as
   one sample. Thus, the ``__getitem__`` method returns the selected columns
   of the row at the specified index as a NumPy array.

   Parameters
   ----------
   df : pd.DataFrame
       The DataFrame to select the columns from. The DataFrame should
       contain the columns specified in the ``columns_to_select``
       parameter.
   columns_to_select : Union[str, list[str]]
       A string or a list of strings used to select the columns from the
       DataFrame. Each string can be a regular expression pattern or a
       column name. The columns that match the pattern will be selected.
   cast_to : str, optional
       Cast the selected columns to the specified data type. If None, the
       data type of the columns will not be changed. (default is None)
   data_shape : tuple[int, ...], optional
       The shape of the data to be returned. If None, the data will be
       returned as a 1D array. If provided, the data will be reshaped to
       the specified shape. (default is None)


   .. py:method:: __getitem__(index)

      Return the columns of the DataFrame at the specified row index as a
      NumPy array. The columns are selected based on
      ``self.columns_to_select``.

      Parameters
      ----------
      index : int
          The row index to select the columns from the DataFrame.

      Returns
      -------
      np.ndarray
          The selected columns from the row as a NumPy array.



   .. py:method:: __len__()

      Return the number of samples in the DataFrame. The number of samples
      is equal to the number of rows in the DataFrame.

      Returns
      -------
      int
          The number of samples in the DataFrame.



   .. py:attribute:: cast_to
      :value: None



   .. py:attribute:: columns_to_select


   .. py:attribute:: data_shape
      :value: None



   .. py:attribute:: df
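

The behaviour described above (pattern-based column selection, optional
casting, optional reshaping) can be illustrated with a small stand-in class.
This is a minimal sketch, not the Minerva implementation:
``SketchTabularReader``, the column names, and the sample values are all
hypothetical. In particular, it assumes that each pattern is applied with
``re.match`` (anchored at the start of the column name) and that columns are
returned in pattern order; the real reader may match differently.

```python
import re

import numpy as np
import pandas as pd


class SketchTabularReader:
    """Hypothetical stand-in that mirrors the documented behaviour."""

    def __init__(self, df, columns_to_select, cast_to=None, data_shape=None):
        self.df = df
        # Accept a single string or a list of strings, as documented.
        if isinstance(columns_to_select, str):
            columns_to_select = [columns_to_select]
        self.columns_to_select = columns_to_select
        self.cast_to = cast_to
        self.data_shape = data_shape

    def _matching_columns(self):
        # Assumption: every entry is a regex matched from the start of the
        # column name; a plain column name is simply a regex matching itself.
        selected = []
        for pattern in self.columns_to_select:
            for col in self.df.columns:
                if re.match(pattern, col) and col not in selected:
                    selected.append(col)
        return selected

    def __getitem__(self, index):
        # One row is one sample: take the row, keep only the matched columns.
        row = self.df.iloc[index][self._matching_columns()]
        data = row.to_numpy()
        if self.cast_to is not None:
            data = data.astype(self.cast_to)
        if self.data_shape is not None:
            data = data.reshape(self.data_shape)
        return data

    def __len__(self):
        # The number of samples equals the number of rows.
        return len(self.df)


# Hypothetical data: two samples, four feature columns and one label column.
df = pd.DataFrame(
    {
        "accel-x-0": [0.1, 0.5],
        "accel-x-1": [0.2, 0.6],
        "accel-y-0": [0.3, 0.7],
        "accel-y-1": [0.4, 0.8],
        "label": [0, 1],
    }
)

reader = SketchTabularReader(
    df,
    columns_to_select=["accel-x-.*", "accel-y-.*"],
    cast_to="float32",
    data_shape=(2, 2),
)

sample = reader[0]  # 2x2 float32 array built from the four accel columns
```

With ``data_shape`` left as None, the same call would return the four values
as a 1D array. Passing a plain column name such as ``"label"`` selects only
that column, so each sample becomes a one-element array.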