package sklearn

type tag = [
  | `RFECV
]
type t = [ `BaseEstimator | `MetaEstimatorMixin | `Object | `RFECV | `SelectorMixin | `TransformerMixin ] Obj.t
val of_pyobject : Py.Object.t -> t
val to_pyobject : [> tag ] Obj.t -> Py.Object.t
val as_transformer : t -> [ `TransformerMixin ] Obj.t
val as_selector : t -> [ `SelectorMixin ] Obj.t
val as_estimator : t -> [ `BaseEstimator ] Obj.t
val as_meta_estimator : t -> [ `MetaEstimatorMixin ] Obj.t
val create : ?step:[ `F of float | `I of int ] -> ?min_features_to_select:int -> ?cv: [ `Arr of [> `ArrayLike ] Np.Obj.t | `BaseCrossValidator of [> `BaseCrossValidator ] Np.Obj.t | `I of int ] -> ?scoring: [ `Callable of Py.Object.t | `Score of [ `Explained_variance | `R2 | `Max_error | `Neg_median_absolute_error | `Neg_mean_absolute_error | `Neg_mean_squared_error | `Neg_mean_squared_log_error | `Neg_root_mean_squared_error | `Neg_mean_poisson_deviance | `Neg_mean_gamma_deviance | `Accuracy | `Roc_auc | `Roc_auc_ovr | `Roc_auc_ovo | `Roc_auc_ovr_weighted | `Roc_auc_ovo_weighted | `Balanced_accuracy | `Average_precision | `Neg_log_loss | `Neg_brier_score | `Adjusted_rand_score | `Homogeneity_score | `Completeness_score | `V_measure_score | `Mutual_info_score | `Adjusted_mutual_info_score | `Normalized_mutual_info_score | `Fowlkes_mallows_score | `Precision | `Precision_macro | `Precision_micro | `Precision_samples | `Precision_weighted | `Recall | `Recall_macro | `Recall_micro | `Recall_samples | `Recall_weighted | `F1 | `F1_macro | `F1_micro | `F1_samples | `F1_weighted | `Jaccard | `Jaccard_macro | `Jaccard_micro | `Jaccard_samples | `Jaccard_weighted ] ] -> ?verbose:int -> ?n_jobs:int -> estimator:[> `BaseEstimator ] Np.Obj.t -> unit -> t

Feature ranking with recursive feature elimination and cross-validated selection of the best number of features.

See glossary entry for :term:`cross-validation estimator`.

Read more in the :ref:`User Guide <rfe>`.

Parameters
----------
estimator : object
    A supervised learning estimator with a ``fit`` method that provides information about feature importance either through a ``coef_`` attribute or through a ``feature_importances_`` attribute.

step : int or float, optional (default=1)
    If greater than or equal to 1, then ``step`` corresponds to the (integer) number of features to remove at each iteration. If within (0.0, 1.0), then ``step`` corresponds to the fraction of features (rounded down to a whole number) to remove at each iteration. Note that the last iteration may remove fewer than ``step`` features in order to reach ``min_features_to_select``.

min_features_to_select : int, (default=1)
    The minimum number of features to be selected. This number of features will always be scored, even if the difference between the original feature count and ``min_features_to_select`` isn't divisible by ``step``.

cv : int, cross-validation generator or an iterable, optional
    Determines the cross-validation splitting strategy. Possible inputs for cv are:

  • None, to use the default 5-fold cross-validation,
  • integer, to specify the number of folds.
  • :term:`CV splitter`,
  • An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is either binary or multiclass, :class:`sklearn.model_selection.StratifiedKFold` is used. In all other cases, :class:`sklearn.model_selection.KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various cross-validation strategies that can be used here.

.. versionchanged:: 0.22
    ``cv`` default value of None changed from 3-fold to 5-fold.

scoring : string, callable or None, optional, (default=None)
    A string (see model evaluation documentation) or a scorer callable object / function with signature ``scorer(estimator, X, y)``.

verbose : int, (default=0)
    Controls verbosity of output.

n_jobs : int or None, optional (default=None)
    Number of cores to run in parallel while fitting across folds. ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context. ``-1`` means using all processors. See :term:`Glossary <n_jobs>` for more details.
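How ``step`` and ``min_features_to_select`` interact can be sketched in plain Python. This is a minimal illustration of the rule described above, not scikit-learn's implementation: the function names ``resolve_step`` and ``elimination_schedule`` are hypothetical, and the sketch assumes that a fractional ``step`` always removes at least one feature per iteration.

```python
def resolve_step(step, n_features):
    """Turn ``step`` into a number of features to remove per iteration."""
    if step <= 0:
        raise ValueError("step must be positive")
    if 0.0 < step < 1.0:
        # Fraction of the original feature count, rounded down.
        # Assumption: at least one feature is removed per iteration.
        return max(1, int(step * n_features))
    return int(step)


def elimination_schedule(n_features, step=1, min_features_to_select=1):
    """List the feature counts visited, from all features down to the minimum."""
    n_remove = resolve_step(step, n_features)
    counts = [n_features]
    remaining = n_features
    while remaining > min_features_to_select:
        # The last iteration may remove fewer than ``n_remove`` features.
        remaining = max(min_features_to_select, remaining - n_remove)
        counts.append(remaining)
    return counts


print(elimination_schedule(10, step=4))
print(elimination_schedule(10, step=0.25, min_features_to_select=3))
```

With 10 features and ``step=4``, the final iteration removes a single feature so that the schedule ends exactly at ``min_features_to_select``.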

Attributes
----------
n_features_ : int
    The number of selected features with cross-validation.

support_ : array of shape [n_features]
    The mask of selected features.

ranking_ : array of shape [n_features]
    The feature ranking, such that ``ranking_[i]`` corresponds to the ranking position of the i-th feature. Selected (i.e., estimated best) features are assigned rank 1.

grid_scores_ : array of shape [n_subsets_of_features]
    The cross-validation scores such that ``grid_scores_[i]`` corresponds to the CV score of the i-th subset of features.

estimator_ : object
    The external estimator fit on the reduced dataset.
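The relationship between ``ranking_``, ``support_`` and ``n_features_`` can be illustrated without fitting anything. The sketch below is plain Python using the ``ranking_`` values from this docstring's Friedman #1 example; the variable names are illustrative only.

```python
# Ranking values copied from the docstring example (10 features).
ranking = [1, 1, 1, 1, 1, 6, 4, 3, 2, 5]

# Selected features are exactly those assigned rank 1.
support = [rank == 1 for rank in ranking]
n_selected = sum(support)

# Feature indices ordered from best rank to worst (ties keep original order).
order = sorted(range(len(ranking)), key=lambda i: ranking[i])

print(support)
print(n_selected)
print(order)
```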

Notes
-----
The size of ``grid_scores_`` is equal to ``ceil((n_features - min_features_to_select) / step) + 1``, where step is the number of features removed at each iteration.
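As a worked instance of this formula, the following sketch evaluates it for a few configurations. The helper name ``n_grid_scores`` is hypothetical, and ``step`` is assumed to be the integer number of features removed per iteration.

```python
import math


def n_grid_scores(n_features, min_features_to_select=1, step=1):
    """Number of feature subsets scored, per the formula in the Notes."""
    return math.ceil((n_features - min_features_to_select) / step) + 1


# 10 features, one removed at a time, down to a single feature.
print(n_grid_scores(10))
# 10 features, four removed at a time.
print(n_grid_scores(10, step=4))
# 25 features, three removed at a time, keeping at least 5.
print(n_grid_scores(25, min_features_to_select=5, step=3))
```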

Allows NaN/Inf in the input if the underlying estimator does as well.

Examples
--------
The following example shows how to retrieve the 5 informative features in the Friedman #1 dataset, which are not known a priori.

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFECV
>>> from sklearn.svm import SVR
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel='linear')
>>> selector = RFECV(estimator, step=1, cv=5)
>>> selector = selector.fit(X, y)
>>> selector.support_
array([ True,  True,  True,  True,  True, False, False, False, False, False])
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

See also
--------
RFE : Recursive feature elimination

References
----------

.. [1] Guyon, I., Weston, J., Barnhill, S., & Vapnik, V., 'Gene selection for cancer classification using support vector machines', Mach. Learn., 46(1-3), 389--422, 2002.

val decision_function : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Compute the decision function of ``X``.

Parameters
----------
X : array-like or sparse matrix of shape (n_samples, n_features)
    The input samples. Internally, it will be converted to ``dtype=np.float32`` and if a sparse matrix is provided to a sparse ``csr_matrix``.

Returns
-------
score : array, shape = [n_samples, n_classes] or [n_samples]
    The decision function of the input samples. The order of the classes corresponds to that in the attribute :term:`classes_`. Regression and binary classification produce an array of shape [n_samples].

val fit : ?groups:[> `ArrayLike ] Np.Obj.t -> x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> t

Fit the RFE model and automatically tune the number of selected features.

Parameters
----------
X : array-like or sparse matrix of shape (n_samples, n_features)
    Training vector, where `n_samples` is the number of samples and `n_features` is the total number of features.

y : array-like of shape (n_samples,)
    Target values (integers for classification, real numbers for regression).

groups : array-like of shape (n_samples,) or None
    Group labels for the samples used while splitting the dataset into train/test set. Only used in conjunction with a 'Group' :term:`cv` instance (e.g., :class:`~sklearn.model_selection.GroupKFold`).

val fit_transform : ?y:[> `ArrayLike ] Np.Obj.t -> ?fit_params:(string * Py.Object.t) list -> x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
----------
X : numpy array of shape [n_samples, n_features]
    Training set.

y : numpy array of shape [n_samples]
    Target values.

**fit_params : dict
    Additional fit parameters.

Returns
-------
X_new : numpy array of shape [n_samples, n_features_new]
    Transformed array.

val get_params : ?deep:bool -> [> tag ] Obj.t -> Dict.t

Get parameters for this estimator.

Parameters
----------
deep : bool, default=True
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
-------
params : mapping of string to any
    Parameter names mapped to their values.

val get_support : ?indices:bool -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Get a mask, or integer index, of the features selected.

Parameters
----------
indices : boolean (default False)
    If True, the return value will be an array of integers, rather than a boolean mask.

Returns
-------
support : array
    An index that selects the retained features from a feature vector. If `indices` is False, this is a boolean array of shape [# input features], in which an element is True iff its corresponding feature is selected for retention. If `indices` is True, this is an integer array of shape [# output features] whose values are indices into the input feature vector.
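The two return forms can be compared with a small stand-in written in plain Python. The function below is a hypothetical sketch that mirrors the documented behaviour on a list-based mask; it is not the library's implementation, and the feature names are made up for illustration.

```python
def get_support(mask, indices=False):
    """Return the boolean mask, or the integer positions of its True entries."""
    if indices:
        return [i for i, keep in enumerate(mask) if keep]
    return list(mask)


# Hypothetical mask over five input features.
mask = [True, False, True, True, False]
feature_names = ["age", "height", "weight", "income", "zip"]

selected_idx = get_support(mask, indices=True)
selected_names = [feature_names[i] for i in selected_idx]

print(get_support(mask))
print(selected_idx)
print(selected_names)
```

The boolean form has one entry per input feature, while the integer form has one entry per retained feature, which makes it convenient for looking up column names.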

val inverse_transform : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Reverse the transformation operation.

Parameters
----------
X : array of shape [n_samples, n_selected_features]
    The input samples.

Returns
-------
X_r : array of shape [n_samples, n_original_features]
    `X` with columns of zeros inserted where features would have been removed by :meth:`transform`.
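The effect of reversing the selection can be sketched on nested lists. The helpers ``select_columns`` and ``restore_columns`` are hypothetical names for this illustration; they imitate the documented behaviour of ``transform`` and ``inverse_transform`` for a given boolean mask and are not the library's code.

```python
def select_columns(rows, mask):
    """Keep only the columns whose mask entry is True."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def restore_columns(reduced_rows, mask):
    """Re-insert a zero column wherever the mask entry is False."""
    restored = []
    for row in reduced_rows:
        values = iter(row)
        restored.append([next(values) if keep else 0.0 for keep in mask])
    return restored


mask = [True, False, True]
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

X_reduced = select_columns(X, mask)
X_restored = restore_columns(X_reduced, mask)

print(X_reduced)
print(X_restored)
```

The round trip is lossy: the removed column comes back as zeros, not as its original values.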

val predict : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Reduce X to the selected features and then predict using the underlying estimator.

Parameters
----------
X : array of shape [n_samples, n_features]
    The input samples.

Returns
-------
y : array of shape [n_samples]
    The predicted target values.

val predict_log_proba : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict class log-probabilities for X.

Parameters
----------
X : array of shape [n_samples, n_features]
    The input samples.

Returns
-------
p : array of shape (n_samples, n_classes)
    The class log-probabilities of the input samples. The order of the classes corresponds to that in the attribute :term:`classes_`.

val predict_proba : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Predict class probabilities for X.

Parameters
----------
X : array-like or sparse matrix of shape (n_samples, n_features)
    The input samples. Internally, it will be converted to ``dtype=np.float32`` and if a sparse matrix is provided to a sparse ``csr_matrix``.

Returns
-------
p : array of shape (n_samples, n_classes)
    The class probabilities of the input samples. The order of the classes corresponds to that in the attribute :term:`classes_`.

val score : x:[> `ArrayLike ] Np.Obj.t -> y:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> Py.Object.t

Reduce X to the selected features and then return the score of the underlying estimator.

Parameters
----------
X : array of shape [n_samples, n_features]
    The input samples.

y : array of shape [n_samples]
    The target values.

val set_params : ?params:(string * Py.Object.t) list -> [> tag ] Obj.t -> t

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form ``<component>__<parameter>`` so that it's possible to update each component of a nested object.

Parameters
----------
**params : dict
    Estimator parameters.

Returns
-------
self : object
    Estimator instance.
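The ``<component>__<parameter>`` naming convention can be illustrated by splitting keys on the first double underscore. This is a sketch of the convention only, not the library's code: ``split_nested_params`` is a hypothetical helper, and the keys ``estimator__C`` and ``estimator__kernel`` assume the wrapped estimator exposes ``C`` and ``kernel`` parameters, as the ``SVR`` in the docstring example does.

```python
def split_nested_params(params):
    """Separate top-level parameters from ``component__parameter`` ones."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"step": 2, "estimator__C": 10.0, "estimator__kernel": "linear"}
)

print(own)
print(nested)
```

Here ``step`` would apply to the selector itself, while the two ``estimator__`` entries would be forwarded to the wrapped estimator.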

val transform : x:[> `ArrayLike ] Np.Obj.t -> [> tag ] Obj.t -> [> `ArrayLike ] Np.Obj.t

Reduce X to the selected features.

Parameters
----------
X : array of shape [n_samples, n_features]
    The input samples.

Returns
-------
X_r : array of shape [n_samples, n_selected_features]
    The input samples with only the selected features.

val n_features_ : t -> int

Attribute n_features_: get value or raise Not_found if None.

val n_features_opt : t -> int option

Attribute n_features_: get value as an option.

val support_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute support_: get value or raise Not_found if None.

val support_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute support_: get value as an option.

val ranking_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute ranking_: get value or raise Not_found if None.

val ranking_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute ranking_: get value as an option.

val grid_scores_ : t -> [> `ArrayLike ] Np.Obj.t

Attribute grid_scores_: get value or raise Not_found if None.

val grid_scores_opt : t -> [> `ArrayLike ] Np.Obj.t option

Attribute grid_scores_: get value as an option.

val estimator_ : t -> Py.Object.t

Attribute estimator_: get value or raise Not_found if None.

val estimator_opt : t -> Py.Object.t option

Attribute estimator_: get value as an option.

val to_string : t -> string

Return a human-readable string representation of the object.

val show : t -> string

Return a human-readable string representation of the object.

val pp : Format.formatter -> t -> unit

Pretty-print the object to a formatter.
