.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_depth_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_depth_classification.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_depth_classification.py:


Depth-based classification
==========================

This example shows how to use the depth-based classification methods on
the Berkeley Growth Study data. It illustrates the differences and
similarities between
:class:`~skfda.ml.classification.MaximumDepthClassifier`,
:class:`~skfda.ml.classification.DDClassifier`,
and :class:`~skfda.ml.classification.DDGClassifier`.

.. GENERATED FROM PYTHON SOURCE LINES 12-35

.. code-block:: Python


    # Author: Pedro Martín Rodríguez-Ponga Eyriès
    # License: MIT

    # sphinx_gallery_thumbnail_number = 5

    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib.colors import ListedColormap
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier

    from skfda import datasets
    from skfda.exploratory.depth import ModifiedBandDepth
    from skfda.exploratory.visualization import DDPlot
    from skfda.ml.classification import (
        DDClassifier,
        DDGClassifier,
        MaximumDepthClassifier,
    )
    from skfda.preprocessing.feature_construction import PerClassTransformer


.. GENERATED FROM PYTHON SOURCE LINES 36-41

The Berkeley Growth Study data contains the heights of 39 boys and 54 girls,
measured from age 1 to 18, together with the ages at which the measurements
were taken. Males are assigned the numeric value 0 and females the value 1.
In our comparison of the different methods, we will try to predict the sex of
a person from their growth curve.

.. GENERATED FROM PYTHON SOURCE LINES 41-46

.. code-block:: Python

    X, y = datasets.fetch_growth(return_X_y=True, as_frame=True)
    X = X.iloc[:, 0].values
    categories = y.values.categories
    y = y.values.codes


.. GENERATED FROM PYTHON SOURCE LINES 47-50

As is usual in machine learning, we split the dataset into a training set and
a test set. The graph below shows the training set. These growth curves will
be used to fit the models, so the predictions are learned from the data.

.. GENERATED FROM PYTHON SOURCE LINES 50-61

.. code-block:: Python

    X_train, X_test, y_train, y_test = train_test_split(
        X,
        y,
        test_size=0.5,
        stratify=y,
        random_state=0,
    )

    # Plot samples grouped by sex
    X_train.plot(group=y_train, group_names=categories).show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_001.png
   :alt: Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 62-64

Below are the growth curves of the individuals that we would like to
classify. Some of them are male and some are female.

.. GENERATED FROM PYTHON SOURCE LINES 64-66

.. code-block:: Python

    X_test.plot().show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_002.png
   :alt: Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 67-79

As stated above, we compare three different methods:
:class:`~skfda.ml.classification.MaximumDepthClassifier`,
:class:`~skfda.ml.classification.DDClassifier`, and
:class:`~skfda.ml.classification.DDGClassifier`. They all rely on a depth
function, which in our example is always
:class:`~skfda.exploratory.depth.ModifiedBandDepth` for consistency. With
this depth we can create a :class:`~skfda.exploratory.visualization.DDPlot`.

In a :class:`~skfda.exploratory.visualization.DDPlot`, a growth curve is
mapped to :math:`[0,1]\times[0,1]` where the first coordinate corresponds
to the depth in the class of all boys and the second to that of all girls.
Note that the dots will be blue if the true sex is female and red otherwise.
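
The mapping described above can be sketched with plain NumPy. The following
is a naive illustration on made-up toy curves, not the implementation used by
scikit-fda: the helper ``modified_band_depth`` and all the data are
hypothetical, only bands delimited by two curves are considered, and the
curves are assumed to be sampled on a common, evenly spaced grid.

.. code-block:: Python

    import itertools

    import numpy as np


    def modified_band_depth(curve, sample):
        """Naive modified band depth of ``curve`` with respect to ``sample``."""
        proportions = []
        for i, j in itertools.combinations(range(len(sample)), 2):
            lower = np.minimum(sample[i], sample[j])
            upper = np.maximum(sample[i], sample[j])
            # Fraction of grid points at which the curve lies inside the band
            inside = (lower <= curve) & (curve <= upper)
            proportions.append(inside.mean())
        # Average over all bands, so the result always lies in [0, 1]
        return float(np.mean(proportions))


    # Two toy "classes" of three constant curves each (made-up values)
    boys = np.array([
        [0.0, 0.0, 0.0, 0.0],
        [1.0, 1.0, 1.0, 1.0],
        [2.0, 2.0, 2.0, 2.0],
    ])
    girls = boys + 3.0
    new_curve = np.array([0.5, 0.5, 1.5, 1.5])

    # DD-plot coordinates: (depth in the boy class, depth in the girl class)
    dd_point = (
        modified_band_depth(new_curve, boys),
        modified_band_depth(new_curve, girls),
    )
    print(dd_point)

In this toy case the new curve lies inside the boys' bands about two thirds
of the time and never inside the girls' bands, so its point falls close to
the horizontal axis of the plot.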

.. GENERATED FROM PYTHON SOURCE LINES 81-87

Below we can see how a :class:`~skfda.exploratory.visualization.DDPlot` is
used to classify with
:class:`~skfda.ml.classification.MaximumDepthClassifier`. In this case it is
quite straightforward: a person is assigned to the class in which their
curve is deepest. This means that a point above the diagonal is classified
as a girl, and otherwise as a boy.
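
As a minimal sketch of this rule, assume we already have the DD-plot
coordinates of a few curves. The values below are made up, and the way ties
are resolved here (in favor of the first class) is a property of
``np.argmax``, not necessarily of the library.

.. code-block:: Python

    import numpy as np

    # Hypothetical DD-plot coordinates, one row per curve:
    # column 0 is the boy-class depth, column 1 is the girl-class depth
    dd_points = np.array([
        [0.45, 0.20],  # below the diagonal
        [0.10, 0.38],  # above the diagonal
        [0.30, 0.31],  # slightly above the diagonal
    ])

    # Pick the class (0 = boy, 1 = girl) in which each curve is deepest
    predicted = np.argmax(dd_points, axis=1)
    print(predicted)  # [0 1 1]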

.. GENERATED FROM PYTHON SOURCE LINES 87-109

.. code-block:: Python

    clf = MaximumDepthClassifier(depth_method=ModifiedBandDepth())
    clf.fit(X_train, y_train)
    print(clf.predict(X_test))
    print('The score is {0:2.2%}'.format(clf.score(X_test, y_test)))

    fig, ax = plt.subplots()

    cmap_bold = ListedColormap(['#FF0000', '#0000FF'])

    index = y_train.astype(bool)
    DDPlot(
        fdata=X_test,
        dist1=X_train[np.invert(index)],
        dist2=X_train[index],
        depth_method=ModifiedBandDepth(),
        axes=ax,
        c=y_test,
        cmap_bold=cmap_bold,
        x_label="Boy class depth",
        y_label="Girl class depth",
    ).plot().show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_003.png
   :alt: Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_003.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 0 0 0 0 1 1 1 1 0 1 0 0 0 1 0 1 0 1 1 1 0 1 1 0 0 0 1 1 1 1 1 0 1 1 0
     0 1 1 1 0 0 1 1 1 1]
    The score is 89.36%


.. GENERATED FROM PYTHON SOURCE LINES 110-113

The score is computed by comparing the predictions with the true, known sex
of each individual in the test set. The same will be done for the rest of
the classifiers.
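
Assuming the score follows the usual scikit-learn convention for classifiers
(mean accuracy), it can be reproduced by hand. The helper ``accuracy`` and
the toy labels below are hypothetical and only illustrate the arithmetic.

.. code-block:: Python

    import numpy as np


    def accuracy(y_true, y_pred):
        """Fraction of predictions that match the true labels."""
        return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


    # Made-up labels: four of the five predictions are correct
    toy_true = [0, 1, 1, 0, 1]
    toy_pred = [0, 1, 0, 0, 1]
    print('The score is {0:2.2%}'.format(accuracy(toy_true, toy_pred)))
    # The score is 80.00%

With the 47 test curves printed above, a score of 89.36% corresponds to 42
correct predictions, since 42 / 47 is approximately 0.8936.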

.. GENERATED FROM PYTHON SOURCE LINES 115-119

Next we use :class:`~skfda.ml.classification.DDClassifier` with polynomials
of degrees one, two, and three. Here, if a point in the
:class:`~skfda.exploratory.visualization.DDPlot` is above the polynomial,
the classifier will predict that it is a girl, and otherwise a boy.

.. GENERATED FROM PYTHON SOURCE LINES 119-124

.. code-block:: Python

    clf1 = DDClassifier(degree=1, depth_method=ModifiedBandDepth())
    clf1.fit(X_train, y_train)
    print(clf1.predict(X_test))
    print('The score is {0:2.2%}'.format(clf1.score(X_test, y_test)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 1 0 1 1 1 1 1 1 1 1 1 0 0 1 0 1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 0
     0 1 1 1 0 0 1 1 1 1]
    The score is 80.85%


.. GENERATED FROM PYTHON SOURCE LINES 125-130

.. code-block:: Python

    clf2 = DDClassifier(degree=2, depth_method=ModifiedBandDepth())
    clf2.fit(X_train, y_train)
    print(clf2.predict(X_test))
    print('The score is {0:2.2%}'.format(clf2.score(X_test, y_test)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1
     1 1 1 1 0 1 1 1 1 1]
    The score is 68.09%


.. GENERATED FROM PYTHON SOURCE LINES 131-136

.. code-block:: Python

    clf3 = DDClassifier(degree=3, depth_method=ModifiedBandDepth())
    clf3.fit(X_train, y_train)
    print(clf3.predict(X_test))
    print('The score is {0:2.2%}'.format(clf3.score(X_test, y_test)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 1 0 0 1 1 1 1 1 0 1 0 0 0 1 0 1 0 0 1 1 0 1 0 0 0 0 0 1 1 0 1 1 1 1 0
     0 1 1 1 0 0 1 1 0 1]
    The score is 89.36%


.. GENERATED FROM PYTHON SOURCE LINES 137-183

.. code-block:: Python

    fig, ax = plt.subplots()


    def _plot_boundaries(axis):
        # Draw the decision boundary of each classifier on the DD-plot
        margin = 0.025
        ts = np.linspace(- margin, 1 + margin, 100)
        pol1 = axis.plot(
            ts,
            np.polyval(clf1.poly_, ts),
            'c',
            label="P1",
        )[0]
        pol2 = axis.plot(
            ts,
            np.polyval(clf2.poly_, ts),
            'm',
            label="P2",
        )[0]
        pol3 = axis.plot(
            ts,
            np.polyval(clf3.poly_, ts),
            'g',
            label="P3",
        )[0]
        # The maximum depth boundary is the diagonal of the unit square
        max_depth = axis.plot(
            [0, 1],
            [0, 1],
            color="gray",
            label="MaxDepth",
        )[0]

        axis.legend([pol1, pol2, pol3, max_depth], ['P1', 'P2', 'P3', 'MaxDepth'])


    _plot_boundaries(ax)

    DDPlot(
        fdata=X_test,
        dist1=X_train[np.invert(index)],
        dist2=X_train[index],
        depth_method=ModifiedBandDepth(),
        axes=ax,
        c=y_test,
        cmap_bold=cmap_bold,
        x_label="Boy class depth",
        y_label="Girl class depth",
    ).plot().show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_004.png
   :alt: Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_004.png
   :class: sphx-glr-single-img
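
The rule encoded by each polynomial boundary can be sketched as follows. The
coefficients and depth values are made up for illustration; they are given
highest degree first, which is the order that ``np.polyval`` expects and the
way ``poly_`` is evaluated in the plotting code above.

.. code-block:: Python

    import numpy as np

    # Hypothetical boundary: -0.5 * x**2 + 1.2 * x
    coefficients = np.array([-0.5, 1.2, 0.0])

    # Hypothetical DD-plot coordinates of three test curves
    boy_depth = np.array([0.40, 0.10, 0.30])
    girl_depth = np.array([0.20, 0.35, 0.31])

    # Height of the boundary at each boy-class depth
    boundary = np.polyval(coefficients, boy_depth)

    # Points strictly above the boundary are labeled 1 (girl), the rest 0 (boy)
    predicted = (girl_depth > boundary).astype(int)
    print(predicted)  # [0 1 0]

The third point lies above the diagonal but below this polynomial, so in this
toy setting it would be labeled a girl by the maximum depth rule and a boy by
the polynomial rule.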


.. GENERATED FROM PYTHON SOURCE LINES 184-186

Next, we use :class:`~skfda.ml.classification.DDGClassifier` together with
:class:`~sklearn.neighbors.KNeighborsClassifier`.

.. GENERATED FROM PYTHON SOURCE LINES 186-195

.. code-block:: Python

    clf = DDGClassifier(
        depth_method=ModifiedBandDepth(),
        multivariate_classifier=KNeighborsClassifier(n_neighbors=5),
    )
    clf.fit(X_train, y_train)
    print(clf.predict(X_test))
    print('The score is {0:2.2%}'.format(clf.score(X_test, y_test)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [1 0 1 1 0 1 1 1 1 1 0 1 0 0 0 1 0 1 0 1 1 1 0 1 1 0 1 0 1 1 1 1 1 0 1 1 0
     0 1 1 1 1 0 1 1 1 1]
    The score is 85.11%
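
Conceptually, this kind of classifier works in two steps: each curve is
first mapped to its depth coordinates, and then an ordinary multivariate
classifier is applied to those coordinates. The second step can be sketched
with a hand-written majority vote among nearest neighbors. The helper
``knn_predict`` and the depth values are hypothetical, and this is not how
scikit-learn implements
:class:`~sklearn.neighbors.KNeighborsClassifier`.

.. code-block:: Python

    import numpy as np


    def knn_predict(train_points, train_labels, query_points, n_neighbors=3):
        """Majority vote among the nearest training points (labels 0 or 1)."""
        # Pairwise Euclidean distances, shape (n_queries, n_train)
        differences = query_points[:, None, :] - train_points[None, :, :]
        distances = np.linalg.norm(differences, axis=2)
        # Indices of the closest training points for each query
        nearest = np.argsort(distances, axis=1)[:, :n_neighbors]
        votes = train_labels[nearest]
        # With an odd number of neighbors there are no ties
        return (votes.mean(axis=1) > 0.5).astype(int)


    # Made-up depth coordinates (boy-class depth, girl-class depth)
    train_points = np.array([
        [0.40, 0.10], [0.45, 0.15], [0.35, 0.12],  # boys
        [0.10, 0.40], [0.12, 0.45], [0.15, 0.38],  # girls
    ])
    train_labels = np.array([0, 0, 0, 1, 1, 1])
    query_points = np.array([[0.42, 0.11], [0.11, 0.41]])

    knn_labels = knn_predict(train_points, train_labels, query_points)
    print(knn_labels)  # [0 1]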


.. GENERATED FROM PYTHON SOURCE LINES 196-211

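In the following graph, besides the test points, the remaining elements are
the decision boundaries. ``NearestClass`` is shown as the border between the
shaded background regions, and the others are drawn as lines: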

+--------------+--------------------------------------+
| Boundary     | Classifier                           |
+==============+======================================+
| MaxDepth     | MaximumDepthClassifier               |
+--------------+--------------------------------------+
| P1           | DDClassifier with degree 1           |
+--------------+--------------------------------------+
| P2           | DDClassifier with degree 2           |
+--------------+--------------------------------------+
| P3           | DDClassifier with degree 3           |
+--------------+--------------------------------------+
| NearestClass | DDGClassifier with nearest neighbors |
+--------------+--------------------------------------+

.. GENERATED FROM PYTHON SOURCE LINES 211-253

.. code-block:: Python


    pct = PerClassTransformer(ModifiedBandDepth(), array_output=True)
    X_train_trans = pct.fit_transform(X_train, y_train)
    X_train_trans = X_train_trans.reshape(len(categories), X_train.shape[0]).T

    clf = KNeighborsClassifier(n_neighbors=5)
    clf.fit(X_train_trans, y_train)

    h = 0.01  # step size in the mesh

    # Create color maps
    cmap_light = ListedColormap(['#FFAAAA', '#AAAAFF'])

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    x_min, x_max = X_train_trans[:, 0].min() - 1, X_train_trans[:, 0].max() + 1
    y_min, y_max = X_train_trans[:, 1].min() - 1, X_train_trans[:, 1].max() + 1
    xx, yy = np.meshgrid(
        np.arange(x_min, x_max, h),
        np.arange(y_min, y_max, h),
    )
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)

    fig, ax = plt.subplots()
    ax.pcolormesh(xx, yy, Z, cmap=cmap_light, shading='auto')

    _plot_boundaries(ax)
    DDPlot(
        fdata=X_test,
        dist1=X_train[np.invert(index)],
        dist2=X_train[index],
        depth_method=ModifiedBandDepth(),
        axes=ax,
        c=y_test,
        cmap_bold=cmap_bold,
        x_label="Boy class depth",
        y_label="Girl class depth",
    ).plot().show()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_005.png
   :alt: Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_005.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 254-261

In the above graph, we can see the classifiers obtained from the training
set. The dots all belong to the test set and are drawn in the color of their
true class; for example, a blue dot means that the true sex is female. One
can see that none of the fitted classifiers is perfect.

Next, we will use :class:`~skfda.ml.classification.DDGClassifier` together
with a neural network: :class:`~sklearn.neural_network.MLPClassifier`.

.. GENERATED FROM PYTHON SOURCE LINES 261-274

.. code-block:: Python

    clf = DDGClassifier(
        depth_method=ModifiedBandDepth(),
        multivariate_classifier=MLPClassifier(
            solver='lbfgs',
            alpha=1e-5,
            hidden_layer_sizes=(6, 2),
            random_state=1,
        ),
    )
    clf.fit(X_train, y_train)
    print(clf.predict(X_test))
    print('The score is {0:2.2%}'.format(clf.score(X_test, y_test)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 0 0 0 0 1 1 1 1 0 1 0 0 0 1 0 1 0 1 1 1 0 1 1 0 0 0 1 1 1 1 1 0 1 1 0
     0 1 1 1 0 0 1 1 1 1]
    The score is 89.36%


.. GENERATED FROM PYTHON SOURCE LINES 275-322

.. code-block:: Python

    clf1 = KNeighborsClassifier(n_neighbors=5)
    clf2 = MLPClassifier(
        solver='lbfgs',
        alpha=1e-5,
        hidden_layer_sizes=(6, 2),
        random_state=1,
    )
    clf1.fit(X_train_trans, y_train)
    clf2.fit(X_train_trans, y_train)

    Z1 = clf1.predict(np.c_[xx.ravel(), yy.ravel()])
    Z2 = clf2.predict(np.c_[xx.ravel(), yy.ravel()])

    Z1 = Z1.reshape(xx.shape)
    Z2 = Z2.reshape(xx.shape)

    fig, axs = plt.subplots(1, 2, sharex=True, sharey=True)

    axs[0].pcolormesh(xx, yy, Z1, cmap=cmap_light, shading='auto')
    axs[1].pcolormesh(xx, yy, Z2, cmap=cmap_light, shading='auto')

    DDPlot(
        fdata=X_test,
        dist1=X_train[np.invert(index)],
        dist2=X_train[index],
        depth_method=ModifiedBandDepth(),
        axes=axs[0],
        c=y_test,
        cmap_bold=cmap_bold,
        x_label="Boy class depth",
        y_label="Girl class depth",
    ).plot().show()
    DDPlot(
        fdata=X_test,
        dist1=X_train[np.invert(index)],
        dist2=X_train[index],
        depth_method=ModifiedBandDepth(),
        axes=axs[1],
        c=y_test,
        cmap_bold=cmap_bold,
        x_label="Boy class depth",
        y_label="Girl class depth",
    ).plot().show()

    for axis in axs:
        axis.label_outer()


.. image-sg:: /auto_examples/images/sphx_glr_plot_depth_classification_006.png
   :alt: Berkeley Growth Study, Berkeley Growth Study
   :srcset: /auto_examples/images/sphx_glr_plot_depth_classification_006.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 323-328

We can now compare the behavior of two classifiers based on
:class:`~skfda.ml.classification.DDGClassifier`. The one on the left uses
nearest neighbors and the one on the right uses a neural network.
Interestingly, the decision boundary of the neural network almost coincides
with that of :class:`~skfda.ml.classification.MaximumDepthClassifier`.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 22.593 seconds)


.. _sphx_glr_download_auto_examples_plot_depth_classification.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/GAA-UAM/scikit-fda/develop?filepath=examples/plot_depth_classification.py
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_depth_classification.ipynb <plot_depth_classification.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_depth_classification.py <plot_depth_classification.py>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_