minerva.analysis.clustering_analysis

Classes

ClusteringAnalysis

Perform a clustering analysis on the embeddings generated by some model.

Module Contents

class minerva.analysis.clustering_analysis.ClusteringAnalysis(data_split='predict')[source]

Bases: minerva.analysis.model_analysis._ModelAnalysis

Perform a clustering analysis on the embeddings generated by some model, using the Silhouette score and the Davies-Bouldin score as implemented in sklearn. The results are returned in a dictionary.
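As a rough illustration of the two metrics named above, the sketch below calls sklearn's silhouette_score and davies_bouldin_score directly on synthetic data and collects the results in a dictionary. The helper name clustering_scores, the dictionary keys, and the use of known labels are assumptions made for this example; the class may obtain labels and name its keys differently.

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score, silhouette_score


def clustering_scores(embeddings, labels):
    """Return both clustering metrics for a set of labeled embeddings."""
    return {
        # Key names are illustrative, not necessarily those used by minerva.
        "silhouette_score": float(silhouette_score(embeddings, labels)),
        "davies_bouldin_score": float(davies_bouldin_score(embeddings, labels)),
    }


# Synthetic stand-in for model embeddings: two well-separated 2-D blobs.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(50, 2)),
])
labels = np.array([0] * 50 + [1] * 50)

scores = clustering_scores(embeddings, labels)
print(scores)
```

For tight, well-separated clusters like these, the Silhouette score should be close to 1 and the Davies-Bouldin score close to 0.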

Initialize the analysis with the specified data split.

Parameters

data_split : str, optional

The data split to analyze, by default “predict”. This specifies which part of the dataset is used and can be one of: “train”, “validation”, “test”, or “predict”.
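One plausible way to turn a split name into a dataloader is to map it to the corresponding LightningDataModule hook (train_dataloader, val_dataloader, test_dataloader, predict_dataloader). The sketch below shows that idea with a stand-in data module so it runs without Lightning installed; SPLIT_TO_HOOK, resolve_dataloader, and ToyDataModule are hypothetical names, and this is not necessarily how the class resolves the split internally.

```python
# Hypothetical mapping from a split name to the LightningDataModule hook
# that serves the same purpose.
SPLIT_TO_HOOK = {
    "train": "train_dataloader",
    "validation": "val_dataloader",
    "test": "test_dataloader",
    "predict": "predict_dataloader",
}


def resolve_dataloader(data_module, data_split="predict"):
    """Return the dataloader for `data_split`, rejecting unknown names."""
    if data_split not in SPLIT_TO_HOOK:
        valid = ", ".join(sorted(SPLIT_TO_HOOK))
        raise ValueError(
            f"Unknown data_split {data_split!r}; expected one of: {valid}"
        )
    return getattr(data_module, SPLIT_TO_HOOK[data_split])()


class ToyDataModule:
    """Stand-in exposing the same hook names as a LightningDataModule."""

    def train_dataloader(self):
        return ["train batch"]

    def val_dataloader(self):
        return ["validation batch"]

    def test_dataloader(self):
        return ["test batch"]

    def predict_dataloader(self):
        return ["predict batch"]


print(resolve_dataloader(ToyDataModule(), "validation"))
```

Validating the name up front gives a clear error for a typo such as "val" instead of a later attribute error.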

compute(model, data)[source]

Compute the clustering analysis metrics.

Parameters

model : L.LightningModule

The trained model from which to extract embeddings.

data : L.LightningDataModule

The data module containing the dataset to analyze.

Returns

dict

A dictionary containing the Silhouette score and Davies-Bouldin score.
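For reading the returned values, the standard definitions of the two scores are summarized below. The symbols are introduced here for illustration only: a(i) is the mean distance from sample i to the other samples in its cluster, b(i) is the mean distance from sample i to the samples of the nearest other cluster, \sigma_i is the mean distance of the samples in cluster i to its centroid c_i, N is the number of samples, and k is the number of clusters.

```latex
% Silhouette score: mean of the per-sample coefficients s(i)
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}},
\qquad
S = \frac{1}{N} \sum_{i=1}^{N} s(i),
\qquad
-1 \le S \le 1

% Davies-Bouldin score: mean worst-case similarity between clusters
DB = \frac{1}{k} \sum_{i=1}^{k} \max_{j \ne i}
     \frac{\sigma_i + \sigma_j}{d(c_i, c_j)},
\qquad
DB \ge 0
```

A higher Silhouette score and a lower Davies-Bouldin score both indicate more compact, better-separated clusters.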

Parameters:
  • model (lightning.LightningModule)
  • data (lightning.LightningDataModule)

data_split = 'predict'
Parameters:
  • data_split (str)