package sklearn

type tag = [ `SpectralCoclustering ]
type t = [ `BaseEstimator | `BaseSpectral | `BiclusterMixin | `Object | `SpectralCoclustering ] Obj.t
val of_pyobject : Py.Object.t -> t
val to_pyobject : [> tag ] Obj.t -> Py.Object.t
val as_estimator : t -> [ `BaseEstimator ] Obj.t
val as_spectral : t -> [ `BaseSpectral ] Obj.t
val as_bicluster : t -> [ `BiclusterMixin ] Obj.t
val create : ?n_clusters:int -> ?svd_method:[ `Randomized | `Arpack ] -> ?n_svd_vecs:int -> ?mini_batch:bool -> ?init: [ `Arr of [> `ArrayLike ] Np.Obj.t | `Random | `T_k_means_ of Py.Object.t ] -> ?n_init:int -> ?n_jobs:int -> ?random_state:int -> unit -> t

Spectral Co-Clustering algorithm (Dhillon, 2001).

Clusters rows and columns of an array `X` to solve the relaxed normalized cut of the bipartite graph created from `X` as follows: the edge between row vertex `i` and column vertex `j` has weight `X[i, j]`.
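To make the graph construction concrete, the following is a minimal sketch in plain OCaml (standard library only). It does not use the bindings, and `bipartite_edges` is a hypothetical helper name. It assumes a dense matrix and treats a zero entry as "no edge".

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* An edge joins row vertex [i] to column vertex [j] with weight x.(i).(j).
   Assumption: a zero entry means there is no edge. *)
let bipartite_edges (x : float array array) : (int * int * float) list =
  let edges = ref [] in
  Array.iteri
    (fun i row ->
      Array.iteri
        (fun j w -> if w <> 0.0 then edges := (i, j, w) :: !edges)
        row)
    x;
  List.rev !edges

let () =
  let x = [| [| 1.; 1. |]; [| 2.; 1. |]; [| 1.; 0. |] |] in
  List.iter
    (fun (i, j, w) -> Printf.printf "row %d -- col %d (weight %g)\n" i j w)
    (bipartite_edges x)
```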

The resulting bicluster structure is block-diagonal, since each row and each column belongs to exactly one bicluster.
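The block-diagonal claim can be checked by hand: if rows and columns are reordered so that equal labels are adjacent, the nonzero entries gather into diagonal blocks. The sketch below is plain OCaml with hypothetical helper names (`order_by_label`, `reorder`), and the labels are made up for illustration rather than computed by the estimator.

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Indices 0..n-1 sorted by label; the stable sort keeps the original
   order inside each cluster. *)
let order_by_label (labels : int array) : int list =
  List.init (Array.length labels) (fun k -> k)
  |> List.stable_sort (fun a b -> compare labels.(a) labels.(b))

(* Reorder rows and columns of [x] so that equal labels are adjacent. *)
let reorder x row_labels column_labels =
  let rows = order_by_label row_labels in
  let cols = order_by_label column_labels in
  List.map (fun r -> List.map (fun c -> x.(r).(c)) cols) rows

let () =
  let x =
    [| [| 0; 5; 0; 4 |]; [| 3; 0; 2; 0 |]; [| 0; 6; 0; 7 |]; [| 1; 0; 8; 0 |] |]
  in
  (* Hypothetical labels: one bicluster per row and per column. *)
  let blocks = reorder x [| 0; 1; 0; 1 |] [| 1; 0; 1; 0 |] in
  List.iter
    (fun row -> print_endline (String.concat " " (List.map string_of_int row)))
    blocks
```

With these labels the reordered matrix has one 2x2 block in the top-left corner and one in the bottom-right corner, with zeros elsewhere.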

Supports sparse matrices, as long as they are nonnegative.

Read more in the :ref:`User Guide <spectral_coclustering>`.

Parameters
----------
n_clusters : int, default=3
    The number of biclusters to find.

svd_method : {'randomized', 'arpack'}, default='randomized'
    Selects the algorithm for finding singular vectors. If 'randomized', use :func:`sklearn.utils.extmath.randomized_svd`, which may be faster for large matrices. If 'arpack', use :func:`scipy.sparse.linalg.svds`, which is more accurate, but possibly slower in some cases.

n_svd_vecs : int, default=None
    Number of vectors to use in calculating the SVD. Corresponds to `ncv` when `svd_method='arpack'` and `n_oversamples` when `svd_method='randomized'`.

mini_batch : bool, default=False
    Whether to use mini-batch k-means, which is faster but may give different results.

init : {'k-means++', 'random'} or ndarray of shape (n_clusters, n_features), default='k-means++'
    Method for initialization of the k-means algorithm.

n_init : int, default=10
    Number of random initializations that are tried with the k-means algorithm.

    If mini-batch k-means is used, the best initialization is chosen and the algorithm runs once. Otherwise, the algorithm is run for each initialization and the best solution is chosen.

n_jobs : int, default=None
    The number of jobs to use for the computation. This works by breaking down the pairwise matrix into n_jobs even slices and computing them in parallel.

    ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context. ``-1`` means using all processors. See :term:`Glossary <n_jobs>` for more details.

random_state : int, RandomState instance, default=None
    Used for randomizing the singular value decomposition and the k-means initialization. Use an int to make the randomness deterministic. See :term:`Glossary <random_state>`.
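The correspondence between `n_svd_vecs` and the keyword it stands for in each SVD backend can be written as a small lookup. This is a sketch in plain OCaml based only on the parameter description above; the type `svd_method` and the function `svd_keyword` are hypothetical names and are not part of the bindings.

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

type svd_method = Randomized | Arpack

(* Hypothetical helper: which backend keyword [n_svd_vecs] corresponds to.
   [None] means the value was not given, so the backend default applies. *)
let svd_keyword (meth : svd_method) (n_svd_vecs : int option)
    : (string * int) option =
  match meth, n_svd_vecs with
  | _, None -> None
  | Randomized, Some n -> Some ("n_oversamples", n)
  | Arpack, Some n -> Some ("ncv", n)

let () =
  match svd_keyword Arpack (Some 20) with
  | Some (name, n) -> Printf.printf "%s=%d\n" name n
  | None -> print_endline "backend default"
```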

Attributes
----------
rows_ : array-like of shape (n_row_clusters, n_rows)
    Results of the clustering. `rows_[i, r]` is True if cluster `i` contains row `r`. Available only after calling ``fit``.

columns_ : array-like of shape (n_column_clusters, n_columns)
    Results of the clustering, like `rows_`.

row_labels_ : array-like of shape (n_rows,)
    The bicluster label of each row.

column_labels_ : array-like of shape (n_cols,)
    The bicluster label of each column.
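Since each row carries exactly one bicluster label, the boolean matrix `rows_` and the vector `row_labels_` describe the same partition. The sketch below, in plain OCaml with a hypothetical helper `indicator_of_labels`, derives one representation from the other; it illustrates the relationship and is not how the bindings expose these attributes.

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Entry (i, r) of the result is true exactly when labels.(r) = i. *)
let indicator_of_labels ~(n_clusters : int) (labels : int array)
    : bool array array =
  Array.init n_clusters (fun i -> Array.map (fun label -> label = i) labels)

let () =
  (* Example labels for six rows and two biclusters. *)
  let rows = indicator_of_labels ~n_clusters:2 [| 0; 1; 1; 0; 0; 0 |] in
  Array.iteri
    (fun i flags ->
      let cells =
        Array.to_list (Array.map (fun b -> if b then "T" else "F") flags)
      in
      Printf.printf "cluster %d: %s\n" i (String.concat " " cells))
    rows
```

The same function applied to column labels gives the counterpart of `columns_`.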

Examples
--------
>>> from sklearn.cluster import SpectralCoclustering
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])
>>> clustering = SpectralCoclustering(n_clusters=2, random_state=0).fit(X)
>>> clustering.row_labels_ #doctest: +SKIP
array([0, 1, 1, 0, 0, 0], dtype=int32)
>>> clustering.column_labels_ #doctest: +SKIP
array([0, 0], dtype=int32)
>>> clustering
SpectralCoclustering(n_clusters=2, random_state=0)

References
----------
* Dhillon, Inderjit S, 2001. `Co-clustering documents and words using bipartite spectral graph partitioning <http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.3011>`__.

val fit : ?y:Py.Object.t -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Creates a biclustering for X.

Parameters
----------
X : array-like, shape (n_samples, n_features)

y : Ignored

val get_indices : i:int -> [> tag ] Obj.t -> Py.Object.t * Py.Object.t

Row and column indices of the i'th bicluster.

Only works if ``rows_`` and ``columns_`` attributes exist.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
row_ind : np.array, dtype=np.intp
    Indices of rows in the dataset that belong to the bicluster.
col_ind : np.array, dtype=np.intp
    Indices of columns in the dataset that belong to the bicluster.
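What `get_indices` returns can be pictured as the positions of the true entries in row `i` of the two indicator matrices. The following plain OCaml sketch mirrors that description on ordinary arrays; `true_positions` and `indices_of_bicluster` are hypothetical names, and the indicator values are example data rather than output of the bindings.

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Positions k at which flags.(k) is true, in increasing order. *)
let true_positions (flags : bool array) : int list =
  let acc = ref [] in
  Array.iteri (fun k flag -> if flag then acc := k :: !acc) flags;
  List.rev !acc

(* Row and column indices of bicluster [i], given indicator matrices
   laid out like rows_ and columns_. *)
let indices_of_bicluster ~rows ~columns i =
  (true_positions rows.(i), true_positions columns.(i))

let () =
  let rows =
    [| [| true; false; false; true; true; true |];
       [| false; true; true; false; false; false |] |]
  in
  let columns = [| [| true; true |]; [| false; false |] |] in
  let row_ind, col_ind = indices_of_bicluster ~rows ~columns 0 in
  let show l = String.concat ", " (List.map string_of_int l) in
  Printf.printf "rows: [%s]  columns: [%s]\n" (show row_ind) (show col_ind)
```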

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters
----------
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------
params : mapping of string to any
    Parameter names mapped to their values.

val get_shape : i:int -> [> tag ] Obj.t -> int * int

Shape of the i'th bicluster.

Parameters
----------
i : int
    The index of the cluster.

Returns
-------
shape : (int, int)
    Number of rows and columns (resp.) in the bicluster.
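The shape of a bicluster is simply the number of rows and the number of columns assigned to it. As a minimal sketch in plain OCaml (hypothetical helpers `count_true` and `shape_of_bicluster`, example indicator data), this amounts to counting true entries:

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Number of true entries in a boolean array. *)
let count_true (flags : bool array) : int =
  Array.fold_left (fun n flag -> if flag then n + 1 else n) 0 flags

(* (number of rows, number of columns) of bicluster [i], given indicator
   matrices laid out like rows_ and columns_. *)
let shape_of_bicluster ~rows ~columns i =
  (count_true rows.(i), count_true columns.(i))

let () =
  let rows =
    [| [| true; false; false; true; true; true |];
       [| false; true; true; false; false; false |] |]
  in
  let columns = [| [| true; true |]; [| false; false |] |] in
  let n_rows, n_cols = shape_of_bicluster ~rows ~columns 0 in
  Printf.printf "bicluster 0 has %d rows and %d columns\n" n_rows n_cols
```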

val get_submatrix : i:int -> data:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Return the submatrix corresponding to bicluster `i`.

Parameters
----------
i : int
    The index of the cluster.
data : array
    The data.

Returns
-------
submatrix : array
    The submatrix corresponding to bicluster i.

Notes
-----
Works with sparse matrices. Only works if ``rows_`` and ``columns_`` attributes exist.
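Extracting a submatrix means keeping the rows flagged for bicluster `i` and, within them, the flagged columns. The sketch below shows this selection on a dense matrix in plain OCaml; `keep` and `submatrix` are hypothetical names, the indicator values are example data, and sparse input is not covered.

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Elements of [items] whose flag is true. Assumes both arrays have the
   same length. *)
let keep (flags : bool array) (items : 'a array) : 'a list =
  let acc = ref [] in
  Array.iteri (fun k item -> if flags.(k) then acc := item :: !acc) items;
  List.rev !acc

(* Rows of [data] in bicluster [i], restricted to its columns. *)
let submatrix ~rows ~columns i data =
  keep rows.(i) data |> List.map (fun row -> keep columns.(i) row)

let () =
  let data =
    [| [| 1; 1 |]; [| 2; 1 |]; [| 1; 0 |]; [| 4; 7 |]; [| 3; 5 |]; [| 3; 6 |] |]
  in
  let rows =
    [| [| true; false; false; true; true; true |];
       [| false; true; true; false; false; false |] |]
  in
  let columns = [| [| true; true |]; [| false; false |] |] in
  List.iter
    (fun row -> print_endline (String.concat " " (List.map string_of_int row)))
    (submatrix ~rows ~columns 0 data)
```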

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.

Parameters
----------
**params : dict
    Estimator parameters.

Returns
-------
self : object
    Estimator instance.
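The ``<component>__<parameter>`` convention splits a key at its first double underscore. A minimal sketch of that split in plain OCaml follows; `split_param` is a hypothetical helper, and the component name `kmeans` is invented for the example (this estimator is not documented here as having nested components).

```ocaml
(* Illustrative sketch, not part of the sklearn bindings. *)

(* Position of the first "__" in [key], if any. *)
let find_double_underscore (key : string) : int option =
  let n = String.length key in
  let rec go k =
    if k + 1 >= n then None
    else if key.[k] = '_' && key.[k + 1] = '_' then Some k
    else go (k + 1)
  in
  go 0

(* ("component", Some "parameter") for a nested key,
   (key, None) for a plain one. *)
let split_param (key : string) : string * string option =
  match find_double_underscore key with
  | None -> (key, None)
  | Some k ->
      let rest = String.sub key (k + 2) (String.length key - k - 2) in
      (String.sub key 0 k, Some rest)

let () =
  List.iter
    (fun key ->
      match split_param key with
      | name, None -> Printf.printf "%s: own parameter\n" name
      | name, Some sub -> Printf.printf "%s: parameter %s of component\n" name sub)
    [ "n_clusters"; "kmeans__n_init" ]
```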

val rows_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute rows_: get value or raise Not_found if None.

val rows_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute rows_: get value as an option.

val columns_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute columns_: get value or raise Not_found if None.

val columns_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute columns_: get value as an option.

val row_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute row_labels_: get value or raise Not_found if None.

val row_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute row_labels_: get value as an option.

val column_labels_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute column_labels_: get value or raise Not_found if None.

val column_labels_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute column_labels_: get value as an option.

val to_string : t -> string

Return a human-readable representation of the object as a string.

val show : t -> string

Return a human-readable representation of the object as a string.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.
