dasf.feature_extraction
=======================

.. py:module:: dasf.feature_extraction


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/dasf/feature_extraction/histogram/index
   /autoapi/dasf/feature_extraction/transforms/index


Classes
-------

.. autoapisummary::

   dasf.feature_extraction.Histogram
   dasf.feature_extraction.ConcatenateToArray
   dasf.feature_extraction.GetSubDataframe
   dasf.feature_extraction.SampleDataframe


Package Contents
----------------

.. py:class:: Histogram(bins = None, range = None, normed = False, weights=None, density=None, *args, **kwargs)

   Bases: :py:obj:`dasf.transforms.base.TargeteredTransform`, :py:obj:`dasf.transforms.base.Transform`

   Operator to extract the histogram of a dataset.

   Parameters
   ----------
   bins : Optional[int]
       Number of bins (the default is None).
   range : tuple
       2-element tuple with the lower and upper range of the bins. If not
       provided, range is simply (X.min(), X.max()) (the default is None).
   normed : bool
       Whether the histogram must be normalized (the default is False).
   weights : array_like
       An array of weights, of the same shape as X. Each value in X only
       contributes its associated weight towards the bin count (the default
       is None).
   density : bool
       If False, the result will contain the number of samples in each bin.
       If True, the result is the value of the probability density function
       at the bin, normalized such that the integral over the range is 1
       (the default is None).

   Attributes
   ----------
   bins
   range
   normed
   weights
   density

   Generic constructor of the class Histogram.


   .. py:attribute:: _bins


   .. py:attribute:: _range


   .. py:attribute:: _normed


   .. py:attribute:: _weights


   .. py:attribute:: _density


   .. py:method:: _lazy_transform_generic(X)

      Compute the histogram of a dataset using Dask.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


   .. py:method:: _transform_generic(X, xp)

      Compute the histogram of a dataset using local libraries.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


   .. py:method:: _lazy_transform_cpu(X)

      Compute the histogram of a dataset using Dask with CPUs only.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


   .. py:method:: _lazy_transform_gpu(X, **kwargs)

      Compute the histogram of a dataset using Dask with GPUs only.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


   .. py:method:: _transform_cpu(X, **kwargs)

      Compute the histogram of a dataset using CPU only.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


   .. py:method:: _transform_gpu(X, **kwargs)

      Compute the histogram of a dataset using GPU only.

      Parameters
      ----------
      X : array_like
          Input data. The histogram is computed over the flattened array.

      Returns
      -------
      hist : array
          The values of the histogram. See `density` and `weights` for a
          description of the possible semantics.
          If `weights` are given, ``hist.dtype`` will be taken from `weights`.
      bin_edges : array of dtype float
          Return the bin edges ``(length(hist)+1)``.


.. py:class:: ConcatenateToArray(flatten = False)

   Bases: :py:obj:`dasf.transforms.base.Transform`

   Concatenate data from different Arrays into a single array.

   Parameters
   ----------
   flatten : bool
       Whether the arrays must be flattened prior to concatenation. If
       `False`, the arrays must share the shape of their last dimensions in
       order to be concatenated (the default is False).


   .. py:attribute:: flatten


   .. py:method:: __transform_generic(xp, **kwargs)


   .. py:method:: _transform_cpu(**kwargs)

      Respective immediate transform mocked function for local CPU(s).


   .. py:method:: _transform_gpu(**kwargs)

      Respective immediate transform mocked function for local GPU(s).


.. py:class:: GetSubDataframe(percent)

   Get the first x% of the samples from the dataset.

   Parameters
   ----------
   percent : float
       Percentage of the samples to get from the dataframe.


   .. py:attribute:: __percent


   .. py:method:: transform(X)


.. py:class:: SampleDataframe(percent)

   Bases: :py:obj:`dasf.transforms.base.Transform`

   Return a subset with random samples of the original dataset.

   Parameters
   ----------
   percent : float
       Percentage of the samples to get from the dataset.


   .. py:attribute:: __percent


   .. py:method:: transform(X)

      Returns a subset with random samples from the dataset `X`.

      Parameters
      ----------
      X : Any
          The dataset.

      Returns
      -------
      Any
          The sampled subset.
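
The ``bins``, ``range``, ``weights`` and ``density`` parameters documented for ``Histogram`` read like those of ``numpy.histogram``. The sketch below calls ``numpy.histogram`` directly rather than ``Histogram`` itself, on the assumption (not confirmed by this page) that the two share the same semantics. The array ``X`` and its values are made up for illustration.

```python
import numpy as np

# Hypothetical input; the histogram is computed over the flattened array.
X = np.array([[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]])

# With range=None the bins span (X.min(), X.max()), here (0.0, 4.0).
hist, bin_edges = np.histogram(X, bins=4)
print(hist)       # number of samples in each bin
print(bin_edges)  # one more edge than there are bins

# density=True rescales the counts so the histogram integrates to 1
# over the range.
dens, _ = np.histogram(X, bins=4, density=True)
area = float(np.sum(dens * np.diff(bin_edges)))

# weights must have the same shape as X; each sample adds its weight
# to its bin instead of adding 1, and hist takes the dtype of weights.
w = np.full(X.shape, 0.5)
weighted, _ = np.histogram(X, bins=4, weights=w)
```

Here ``hist`` has four entries and ``bin_edges`` has five, which matches the ``length(hist)+1`` rule stated for ``bin_edges``.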