DDGClassifier

class skfda.ml.classification.DDGClassifier(*, multivariate_classifier=None, depth_method=None)

Generalized depth-versus-depth (DD) classifier for functional data.

This classifier builds an interface around the DDGTransformer. The transformer takes a list of k depths and performs the following map [1]:

\[\begin{split}\mathcal{X} &\rightarrow \mathbb{R}^G \\ x &\rightarrow \textbf{d} = (D_1^1(x), D_1^2(x),...,D_g^k(x))\end{split}\]

Here $D_i^j(x)$ is the depth of the point $x$ with respect to the data in the $i$-th group, computed with the $j$-th depth of the provided list, so with $g$ groups and $k$ depths the vector $\textbf{d}$ has $G = g \cdot k$ components. Note that $\mathcal{X}$ is possibly multivariate, that is, $\mathcal{X} = \mathcal{X}_1 \times ... \times \mathcal{X}_p$.

In the G-dimensional space, the classification is performed using a multivariate classifier.
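The map above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used by scikit-fda: the helper names `integrated_depth` and `ddg_map`, the toy curves, and the ordering of the output (groups in the outer loop, depths in the inner loop, following the formula) are all hypothetical. The toy depth averages the Fraiman-Muniz pointwise depth $1 - |1/2 - F_n(x(t))|$ over a common grid.

```python
from typing import Callable, Dict, List, Sequence

Curve = Sequence[float]  # a curve sampled on a grid shared by all curves
Depth = Callable[[Curve, Sequence[Curve]], float]


def integrated_depth(x: Curve, reference: Sequence[Curve]) -> float:
    """Toy integrated depth of ``x`` with respect to ``reference``.

    At each grid point the univariate depth is 1 - |1/2 - F_n(x(t))|,
    where F_n is the empirical CDF of the reference curves at that point.
    The pointwise depths are then averaged over the grid.
    """
    n = len(reference)
    total = 0.0
    for t, value in enumerate(x):
        ecdf = sum(curve[t] <= value for curve in reference) / n
        total += 1.0 - abs(0.5 - ecdf)
    return total / len(x)


def ddg_map(
    x: Curve,
    groups: Dict[int, Sequence[Curve]],
    depths: Sequence[Depth],
) -> List[float]:
    """Return (D_1^1(x), D_1^2(x), ..., D_g^k(x)) as a flat list."""
    return [
        depth(x, groups[label])
        for label in sorted(groups)  # i-th group
        for depth in depths          # j-th depth
    ]


# Two hypothetical groups of three constant curves each.
groups = {
    0: [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]],
    1: [[10.0, 10.0, 10.0], [11.0, 11.0, 11.0], [12.0, 12.0, 12.0]],
}

# One depth (k = 1) and two groups (g = 2) give a vector of length 2.
d = ddg_map([1.0, 1.0, 1.0], groups, [integrated_depth])
```

The curve lies in the middle of group 0 and far below group 1, so its first component is larger than its second. A multivariate classifier would then be trained on such vectors instead of on the curves themselves.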

Parameters:

depth_method (Depth[Input] | Sequence[Tuple[str, Depth[Input]]] | None) – The depth class or sequence of depths to use when calculating the depth of a test sample in a class. See the documentation of the depths module for a list of available depths. By default it is ModifiedBandDepth.
multivariate_classifier (ClassifierMixin[NDArrayFloat, NDArrayInt] | None) – The multivariate classifier to use in the DDG-plot.

Examples

Firstly, we will import and split the Berkeley Growth Study dataset

>>> from skfda.datasets import fetch_growth
>>> from sklearn.model_selection import train_test_split

>>> dataset = fetch_growth()
>>> fd = dataset['data']
>>> y = dataset['target']
>>> X_train, X_test, y_train, y_test = train_test_split(
...     fd, y, test_size=0.25, stratify=y, random_state=0)

>>> from skfda.exploratory.depth import (
...     ModifiedBandDepth,
...     IntegratedDepth,
... )
>>> from sklearn.neighbors import KNeighborsClassifier

We will fit a DDG-classifier using KNN

>>> from skfda.ml.classification import DDGClassifier
>>> clf = DDGClassifier(
...     depth_method=ModifiedBandDepth(),
...     multivariate_classifier=KNeighborsClassifier(),
... )
>>> clf.fit(X_train, y_train)
DDGClassifier(...)

We can predict the class of new samples

>>> clf.predict(X_test)
array([ 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
        1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1])

Finally, we calculate the mean accuracy for the test data

>>> clf.score(X_test, y_test)
0.875

It is also possible to use several depth functions to increase the number of features available to the classifier

>>> clf = DDGClassifier(
...     depth_method=[
...         ("mbd", ModifiedBandDepth()),
...         ("id", IntegratedDepth()),
...     ],
...     multivariate_classifier=KNeighborsClassifier(),
... )
>>> clf.fit(X_train, y_train)
DDGClassifier(...)
>>> clf.predict(X_test)
array([ 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1,
        1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1])
>>> clf.score(X_test, y_test)
0.875

See also

DDClassifier
MaximumDepthClassifier
_ddg_transformer

References

Methods

`fit`(X, y)	Fit the model using X as training data and y as target values.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`predict`(X)	Predict the class labels for the provided data.
`score`(X, y[, sample_weight])	Return accuracy on provided data and labels.
`set_params`(**params)	Set the parameters of this estimator.
`set_score_request`(*[, sample_weight])	Configure whether metadata should be requested to be passed to the `score` method.

fit(X, y)

Fit the model using X as training data and y as target values.

Parameters:

X (Input) – FDataGrid with the training data.
y (Target) – Target values of shape (n_samples,).

Returns:

self

Return type:

DDGClassifier[Input, Target]

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

predict(X)

Predict the class labels for the provided data.

Parameters:

X (Input) – FDataGrid with the test samples.

Returns:

Array of shape (n_samples,) with the predicted class label for each data sample.

Return type:

Target

score(X, y, sample_weight=None)

Return accuracy on provided data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – Mean accuracy of self.predict(X) w.r.t. y.

Return type:

float
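As a rough illustration of what the returned value means, the sketch below computes a (optionally weighted) mean accuracy by hand. The function name `mean_accuracy` and the toy labels are hypothetical and are not taken from the Berkeley Growth example; the sketch assumes the weighted score is the weighted fraction of correctly predicted samples.

```python
from typing import Optional, Sequence


def mean_accuracy(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    sample_weight: Optional[Sequence[float]] = None,
) -> float:
    """Weighted fraction of samples whose predicted label is correct."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if sample_weight is None:
        # Without weights every sample counts equally.
        sample_weight = [1.0] * len(y_true)
    hits = sum(
        w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p
    )
    return hits / sum(sample_weight)


# Toy labels: three of the four predictions are correct.
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]

unweighted = mean_accuracy(y_true, y_pred)
# Doubling the weight of the misclassified sample lowers the score.
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 1.0])
```

With a fitted classifier, `y_pred` would play the role of `self.predict(X)`.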

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
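The `<component>__<parameter>` naming convention can be illustrated with a small stand-alone sketch. The helper `split_params` is hypothetical and only mimics how such names separate into a component and a sub-parameter; it is not the library's implementation, and the depth value is a placeholder string.

```python
from typing import Any, Dict, Tuple


def split_params(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Separate top-level parameters from nested ``a__b`` parameters."""
    own: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for name, value in params.items():
        # Split only at the first double underscore, so deeper
        # nesting (a__b__c) stays attached to the sub-parameter name.
        component, delimiter, sub_name = name.partition("__")
        if delimiter:
            nested.setdefault(component, {})[sub_name] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_params({
    "multivariate_classifier__n_neighbors": 3,
    "depth_method": "placeholder depth object",
})
```

Assuming the multivariate classifier is a KNeighborsClassifier, as in the examples above, the equivalent call on a real estimator would be `clf.set_params(multivariate_classifier__n_neighbors=3)`.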

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
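The four request values listed above can be modelled with a toy function. This is a simplified sketch of the described behaviour, not scikit-learn's routing code: the name `metadata_for_score`, the use of `ValueError`, and the alias `score_weights` are assumptions made for illustration, and the UNCHANGED default is not modelled.

```python
from typing import Any, Dict, Union

Request = Union[bool, None, str]


def metadata_for_score(
    request: Request,
    provided: Dict[str, Any],
) -> Dict[str, Any]:
    """Return the keyword arguments that would reach ``score``."""
    if request is True:
        # Requested: forwarded when provided, ignored otherwise.
        if "sample_weight" in provided:
            return {"sample_weight": provided["sample_weight"]}
        return {}
    if request is False:
        # Not requested: never forwarded.
        return {}
    if request is None:
        # Not requested, and providing it is treated as an error.
        if "sample_weight" in provided:
            raise ValueError("sample_weight was provided but not requested")
        return {}
    # A string is an alias: the caller supplies the metadata under that
    # name and it is forwarded to ``score`` as ``sample_weight``.
    if request in provided:
        return {"sample_weight": provided[request]}
    return {}


weights = [1.0, 2.0]
forwarded_true = metadata_for_score(True, {"sample_weight": weights})
forwarded_false = metadata_for_score(False, {"sample_weight": weights})
forwarded_alias = metadata_for_score(
    "score_weights", {"score_weights": weights}
)
```

In this model only the `True` and alias cases pass the weights on, and the `None` case rejects them.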

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

Examples using `skfda.ml.classification.DDGClassifier`

Classification methods

Depth based classification

Scikit-fda and scikit-learn
