dasf.ml.cluster.som
Kohonen’s Self-Organizing Map (SOM) algorithm module.
Classes
SOM | Initializes a Self-Organizing Map.
Module Contents
- class dasf.ml.cluster.som.SOM(x, y, input_len, num_epochs=100, sigma=0, sigmaN=1, learning_rate=0.5, learning_rateN=0.01, decay_function='exponential', neighborhood_function='gaussian', std_coeff=0.5, topology='rectangular', activation_distance='euclidean', random_seed=None, n_parallel=0, compact_support=False, **kwargs)
Bases: dasf.ml.cluster.classifier.ClusterClassifier
Initializes a Self-Organizing Map.
A rule of thumb for sizing the grid in a dimensionality reduction task is that it should contain about 5*sqrt(N) neurons, where N is the number of samples in the dataset to analyze.
For example, if your dataset has 150 samples, 5*sqrt(150) ≈ 61.24, so an 8-by-8 map (64 neurons) should perform well.
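The rule of thumb above can be sketched in a few lines. This is only an illustration: the helper name `suggested_grid` is hypothetical and not part of dasf, and choosing a square grid by rounding up is an assumption, not something the library prescribes.

```python
import math


def suggested_grid(n_samples: int) -> tuple:
    """Return a square (x, y) grid with roughly 5*sqrt(n_samples) neurons.

    Hypothetical helper for illustration; not part of dasf.
    """
    target = 5 * math.sqrt(n_samples)    # rule-of-thumb neuron count
    side = math.ceil(math.sqrt(target))  # smallest square side that covers it
    return side, side


print(suggested_grid(150))  # (8, 8)
```

The returned pair could then be passed as the `x` and `y` arguments of the constructor. A non-square grid with a similar neuron count is equally valid under this heuristic.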
Parameters
- x : int
x dimension of the SOM.
- y : int
y dimension of the SOM.
- input_len : int
Number of elements in each input vector.
- sigma : float, default=min(x,y)/2
Spread of the neighborhood function; it should be appropriate for the dimensions of the map.
- sigmaN : float, default=1
Spread of the neighborhood function at the last iteration.
- learning_rate : float, default=0.5
Initial learning rate.
- learning_rateN : float, default=0.01
Final learning rate.
- decay_function : string, default='exponential'
Function that reduces learning_rate and sigma at each iteration. Possible values: 'exponential', 'linear', 'asymptotic'
- neighborhood_function : string, default='gaussian'
Function that weights the neighborhood of a position in the map. Possible values: 'gaussian', 'mexican_hat', 'bubble', 'triangle'
- topology : string, default='rectangular'
Topology of the map. Possible values: 'rectangular', 'hexagonal'
- activation_distance : string, default='euclidean'
Distance used to activate the map. Possible values: 'euclidean', 'cosine', 'manhattan'
- random_seed : int, default=None
Random seed to use.
- n_parallel : uint, default=#max_CUDA_threads or 500*#CPUcores
Number of samples to be processed at a time. Too low a value may drastically reduce performance through under-utilization; too high a value increases memory usage without any significant performance benefit.
- xp : numpy or cupy, default=cupy if it can be imported, else numpy
Use numpy (CPU) or cupy (GPU) for computations.
- std_coeff : float, default=0.5
Used to calculate the Gaussian exponent denominator: d = 2*std_coeff**2*sigma**2
- compact_support : bool, default=False
Cut the neighborhood function to 0 beyond the neighborhood radius sigma.
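To make the roles of std_coeff and compact_support concrete, here is a minimal NumPy sketch of a Gaussian neighborhood using the denominator given above. The function name `gaussian_neighborhood`, the exact exponent form exp(-dist**2 / d), and the treatment of the boundary at exactly sigma are assumptions for illustration; the library's internal implementation may differ.

```python
import numpy as np


def gaussian_neighborhood(grid_dist, sigma, std_coeff=0.5, compact_support=False):
    """Weight of each neuron given its grid distance to the winning neuron.

    Hypothetical helper for illustration; not part of dasf.
    """
    grid_dist = np.asarray(grid_dist, dtype=float)
    d = 2 * std_coeff**2 * sigma**2      # exponent denominator from the parameter docs
    weights = np.exp(-grid_dist**2 / d)  # assumed Gaussian form
    if compact_support:
        # Assumed cut-off rule: zero strictly beyond the radius sigma.
        weights = np.where(grid_dist > sigma, 0.0, weights)
    return weights


dist = np.array([0.0, 1.0, 2.0, 3.0])
full = gaussian_neighborhood(dist, sigma=2.0)
cut = gaussian_neighborhood(dist, sigma=2.0, compact_support=True)
print(full)
print(cut)
```

With sigma=2 and std_coeff=0.5 the denominator is d = 2, so a smaller std_coeff narrows the bell without changing the cut-off radius used by compact_support.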
Examples
>>> from dasf.ml.cluster import SOM
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> som = SOM(x=3, y=2, input_len=2,
...           num_epochs=100).fit(X)
>>> som
SOM(x=3, y=2, input_len=2, num_epochs=100)
Constructor of the class SOM.
- x
- y
- input_len
- num_epochs
- sigma
- sigmaN
- learning_rate
- learning_rateN
- decay_function
- neighborhood_function
- std_coeff
- topology
- activation_distance
- random_seed
- n_parallel
- compact_support
- __som_cpu
- __som_mcpu
- _lazy_fit_cpu(X, y=None, sample_weight=None)
Fit SOM method using Dask with CPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _lazy_fit_gpu(X, y=None, sample_weight=None)
Fit SOM method using Dask with GPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _fit_cpu(X, y=None, sample_weight=None)
Fit SOM method using CPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _fit_gpu(X, y=None, sample_weight=None)
Fit SOM method using GPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _lazy_fit_predict_cpu(X, y=None, sample_weight=None)
Fit SOM and select the winner neurons for the input using Dask with CPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- y : {array-like, sparse matrix} of shape (n_samples,)
This is just a placeholder to keep compatibility with other fit_predict methods. SOM does not use labels to verify the input.
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit_predict methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _lazy_fit_predict_gpu(X, y=None, sample_weight=None)
Fit SOM and select the winner neurons for the input using Dask with GPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- y : {array-like, sparse matrix} of shape (n_samples,)
This is just a placeholder to keep compatibility with other fit_predict methods. SOM does not use labels to verify the input.
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit_predict methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _fit_predict_cpu(X, y=None, sample_weight=None)
Fit SOM and select the winner neurons for the input using CPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- y : {array-like, sparse matrix} of shape (n_samples,)
This is just a placeholder to keep compatibility with other fit_predict methods. SOM does not use labels to verify the input.
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit_predict methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _fit_predict_gpu(X, y=None, sample_weight=None)
Fit SOM and select the winner neurons for the input using GPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- y : {array-like, sparse matrix} of shape (n_samples,)
This is just a placeholder to keep compatibility with other fit_predict methods. SOM does not use labels to verify the input.
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other fit_predict methods; it is not used by SOM.
Returns
- self : object
Returns a fitted instance of self.
- _lazy_predict_cpu(X, sample_weight=None)
Predict labels for the input with a fitted SOM using Dask with CPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other predict methods; it is not used by SOM.
Returns
- labels : ndarray of shape (n_samples,)
Cluster labels. Noisy samples are given the label -1.
- _lazy_predict_gpu(X, sample_weight=None)
Predict labels for the input with a fitted SOM using Dask with GPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other predict methods; it is not used by SOM.
Returns
- labels : ndarray of shape (n_samples,)
Cluster labels. Noisy samples are given the label -1.
- _predict_cpu(X, sample_weight=None)
Predict labels for the input with a fitted SOM using CPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other predict methods; it is not used by SOM.
Returns
- labels : ndarray of shape (n_samples,)
Cluster labels. Noisy samples are given the label -1.
- _predict_gpu(X, sample_weight=None)
Predict labels for the input with a fitted SOM using GPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
- sample_weight : array-like of shape (n_samples,), default=None
This is just a placeholder to keep compatibility with other predict methods; it is not used by SOM.
Returns
- labels : ndarray of shape (n_samples,)
Cluster labels. Noisy samples are given the label -1.
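As a rough illustration of what selecting the winner neuron for each sample involves, the sketch below finds the closest neuron under the Euclidean activation distance using plain NumPy. The helper `winner_neurons`, the weight array layout (x, y, input_len), and the use of a flattened neuron index as the label are all assumptions; how dasf actually encodes the returned labels is not stated here.

```python
import numpy as np


def winner_neurons(X, weights):
    """Return the flat index of the closest neuron for each sample.

    Hypothetical helper for illustration; not part of dasf.
    """
    # Flatten the map: (x, y, input_len) -> (x*y, input_len)
    flat = weights.reshape(-1, weights.shape[-1])
    # Pairwise Euclidean distances, shape (n_samples, x*y)
    dists = np.linalg.norm(X[:, None, :] - flat[None, :, :], axis=2)
    return dists.argmin(axis=1)


# Hypothetical 2x2 map with 2-dimensional weight vectors
weights = np.array([[[0.0, 0.0], [1.0, 1.0]],
                    [[4.0, 6.0], [3.0, 5.0]]])
X = np.array([[1.0, 0.9], [4.0, 7.0]])
labels = winner_neurons(X, weights)
print(labels)  # [1 2]
```

Broadcasting builds the full distance matrix at once, which is simple but memory-hungry for large inputs; processing samples in batches is the usual remedy.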
- _lazy_quantization_error_cpu(X)
Returns the quantization error, computed as the average distance between each input sample and its best matching unit, using Dask with CPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Returns
- error : float
The quantization error of the trained SOM.
- _lazy_quantization_error_gpu(X)
Returns the quantization error, computed as the average distance between each input sample and its best matching unit, using Dask with GPUs only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Returns
- error : float
The quantization error of the trained SOM.
- _quantization_error_cpu(X)
Returns the quantization error, computed as the average distance between each input sample and its best matching unit, using CPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Returns
- error : float
The quantization error of the trained SOM.
- _quantization_error_gpu(X)
Returns the quantization error, computed as the average distance between each input sample and its best matching unit, using GPU only.
Parameters
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Returns
- error : float
The quantization error of the trained SOM.
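The quantization error described above (the average distance between each sample and its best matching unit) can be reproduced by hand for a tiny map. The sketch below assumes Euclidean distance and a weight array of shape (x, y, input_len); the function `quantization_error` here is a standalone illustration, not the library's implementation.

```python
import numpy as np


def quantization_error(X, weights):
    """Mean Euclidean distance from each sample to its best matching unit.

    Standalone illustration; not the dasf implementation.
    """
    flat = weights.reshape(-1, weights.shape[-1])  # (x*y, input_len)
    dists = np.linalg.norm(X[:, None, :] - flat[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())         # average of per-sample minima


# Hypothetical 1x2 map with 2-dimensional weight vectors
weights = np.array([[[0.0, 0.0], [3.0, 4.0]]])
X = np.array([[0.0, 1.0], [3.0, 1.0]])
error = quantization_error(X, weights)
print(error)  # 2.0
```

Here the first sample lies at distance 1 from its closest neuron and the second at distance 3, so the average is 2.0. A lower value indicates that the map's weight vectors sit closer to the data.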