dasf.feature_extraction

Submodules

Classes

Histogram

Operator to extract the histogram of a dataset.

ConcatenateToArray

Concatenate data from different Arrays into a single array.

GetSubDataframe

Get the first x% of the samples from the dataset.

SampleDataframe

Return a subset with random samples of the original dataset.

Package Contents

class dasf.feature_extraction.Histogram(bins=None, range=None, normed=False, weights=None, density=None, *args, **kwargs)[source]

Bases: dasf.transforms.base.TargeteredTransform, dasf.transforms.base.Transform

Operator to extract the histogram of a dataset.

Parameters

bins : Optional[int]

Number of bins (the default is None).

range : tuple

2-element tuple with the lower and upper range of the bins. If not provided, range is simply (X.min(), X.max()) (the default is None).

normed : bool

Whether the histogram must be normalized (the default is False).

weights : array_like

An array of weights, of the same shape as X. Each value in X only contributes its associated weight towards the bin count (the default is None).

density : bool

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1 (the default is None).
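The parameter descriptions above follow the familiar NumPy histogram semantics. As a minimal sketch, assuming the operator forwards these options to a NumPy-style `histogram` function (this page does not show the internals), the plain NumPy equivalents behave as follows. The sample array `X` and the weight values are made up for illustration.

```python
import numpy as np

# Illustrative 2-D input; the histogram is computed over the flattened values.
X = np.array([[0.0, 1.0, 1.5], [2.0, 2.5, 4.0]])

# Plain counts: 4 equal-width bins spanning the explicit range (0, 4).
hist, bin_edges = np.histogram(X, bins=4, range=(0.0, 4.0))

# density=True rescales the counts so that the integral over the range is 1.
dens, _ = np.histogram(X, bins=4, range=(0.0, 4.0), density=True)

# weights has the same shape as X; each sample adds its weight to its bin.
weights = np.full(X.shape, 0.5)
weighted, _ = np.histogram(X, bins=4, range=(0.0, 4.0), weights=weights)

print(hist, bin_edges)
print(dens)
print(weighted)
```

The `normed` option is left out of the sketch because recent NumPy releases no longer accept it; `density` is the supported way to normalize there.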

Attributes

bins, range, normed, weights, density

Generic constructor of the class Histogram.

_bins
_range
_normed
_weights
_density
_lazy_transform_generic(X)[source]

Compute the histogram of a dataset using Dask.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).

_transform_generic(X, xp)[source]

Compute the histogram of a dataset using local libraries.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).
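The `xp` argument suggests the common pattern of passing the array module itself, so that one function body can serve both CPU and GPU back ends. The following is only a sketch of that pattern, not the library's implementation: `histogram_generic` and `value_range` are hypothetical names, and NumPy stands in for `xp`. CuPy exposes a `histogram` function with a compatible signature, so it could be passed instead on a machine with a GPU.

```python
import numpy as np


def histogram_generic(X, xp, bins=10, value_range=None):
    """Compute a histogram with whichever array module `xp` is passed in.

    Hypothetical helper: `xp` is expected to expose a NumPy-compatible
    `histogram` function (for example `numpy` itself, or `cupy` on a GPU).
    """
    return xp.histogram(X, bins=bins, range=value_range)


# Ten evenly spaced samples, two per bin when using 5 bins over (0, 10).
X = np.arange(10, dtype=float)
hist, bin_edges = histogram_generic(X, np, bins=5, value_range=(0.0, 10.0))

print(hist)
print(bin_edges)
```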

_lazy_transform_cpu(X)[source]

Compute the histogram of a dataset using Dask with CPUs only.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).

_lazy_transform_gpu(X, **kwargs)[source]

Compute the histogram of a dataset using Dask with GPUs only.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).

_transform_cpu(X, **kwargs)[source]

Compute the histogram of a dataset using CPU only.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).

_transform_gpu(X, **kwargs)[source]

Compute the histogram of a dataset using GPU only.

Parameters

X : array_like

Input data. The histogram is computed over the flattened array.

Returns

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype will be taken from weights.

bin_edges : array of dtype float

Return the bin edges (length(hist)+1).

Parameters:
  • bins (int)

  • range (tuple)

  • normed (bool)

class dasf.feature_extraction.ConcatenateToArray(flatten=False)[source]

Bases: dasf.transforms.base.Transform

Concatenate data from different Arrays into a single array.

Parameters

flatten : bool

Whether the arrays must be flattened prior to concatenation. If False, the arrays must share the shape of their last dimensions in order to be concatenated (the default is False).

flatten
__transform_generic(xp, **kwargs)
_transform_cpu(**kwargs)[source]

Respective immediate transform mocked function for local CPU(s).

_transform_gpu(**kwargs)[source]

Respective immediate transform mocked function for local GPU(s).

Parameters:

flatten (bool)
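To make the effect of `flatten` concrete, here is a rough NumPy sketch of one plausible reading of the description. It is an assumption, not the library's code: `concatenate_arrays` is a hypothetical helper, and the real operator may arrange the result differently (for instance by stacking along a new axis).

```python
import numpy as np


def concatenate_arrays(arrays, flatten=False):
    """Hypothetical helper illustrating the two modes described above."""
    if flatten:
        # Each array is flattened first, so any mix of shapes is accepted.
        return np.concatenate([np.ravel(a) for a in arrays])
    # Without flattening, the trailing dimensions must match.
    return np.concatenate(arrays, axis=0)


a = np.zeros((2, 3))
b = np.ones((4, 3))

flat = concatenate_arrays([a, b], flatten=True)  # 1-D result
joined = concatenate_arrays([a, b])              # joined along the first axis

print(flat.shape, joined.shape)
```

In this sketch, arrays whose trailing dimensions differ can only be combined with `flatten=True`; the non-flattening call would raise a `ValueError` from NumPy.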

class dasf.feature_extraction.GetSubDataframe(percent)[source]

Get the first x% of the samples from the dataset.

Parameters

percent : float

Percentage of the samples to get from the dataframe.

__percent
transform(X)[source]
Parameters:

percent (float)
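Taking the first x% of the samples can be pictured with plain pandas. This is a sketch under stated assumptions: `first_percent` is a hypothetical helper, and `percent` is assumed to be on a 0 to 100 scale, which this page does not confirm.

```python
import pandas as pd


def first_percent(df, percent):
    """Hypothetical helper: keep the leading `percent` (0-100) of the rows."""
    n_rows = int(len(df) * percent / 100.0)
    return df.iloc[:n_rows]


df = pd.DataFrame({"a": range(10), "b": range(10, 20)})
subset = first_percent(df, 30)  # keeps the first 3 of 10 rows

print(subset)
```

Because the rows are taken in order, the result is deterministic, in contrast to random sampling.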

class dasf.feature_extraction.SampleDataframe(percent)[source]

Bases: dasf.transforms.base.Transform

Return a subset with random samples of the original dataset.

Parameters

percent : float

Percentage of the samples to get from the dataset.

__percent
transform(X)[source]

Returns a subset with random samples from the dataset X.

Parameters

X : Any

The dataset.

Returns

Any

The sampled subset.

Parameters:

percent (float)
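Random sampling of a percentage of the rows can be sketched with pandas `DataFrame.sample`. As before, this is an illustration rather than the library's implementation: `sample_percent` and its `seed` argument are hypothetical, and `percent` is assumed to be on a 0 to 100 scale.

```python
import pandas as pd


def sample_percent(df, percent, seed=None):
    """Hypothetical helper: draw `percent` (0-100) of the rows at random."""
    n_rows = int(len(df) * percent / 100.0)
    # Sampling is without replacement, so no row appears twice.
    return df.sample(n=n_rows, random_state=seed)


df = pd.DataFrame({"a": range(10)})
subset = sample_percent(df, 50, seed=0)  # 5 of 10 rows, in random order

print(subset)
```

Passing a fixed `seed` makes the draw reproducible, which is useful when comparing runs.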