.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_radius_neighbors_classification.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_radius_neighbors_classification.py:

Radius neighbors classification
===============================

Shows the usage of the radius nearest neighbors classifier.

.. GENERATED FROM PYTHON SOURCE LINES 7-21

.. code-block:: Python

    # Author: Pablo Marcos Manchón
    # License: MIT

    # sphinx_gallery_thumbnail_number = 2

    import numpy as np
    from sklearn.model_selection import train_test_split

    import skfda
    from skfda.misc.metrics import PairwiseMetric, linf_distance
    from skfda.ml.classification import RadiusNeighborsClassifier

.. GENERATED FROM PYTHON SOURCE LINES 22-31

In this example we show how to use the functional version of the radius
nearest neighbors classifier. It is a variation of the K-nearest neighbors
classifier in which the vote is taken among all the neighbors within a given
radius, instead of among the k nearest neighbors.

First, we construct a toy dataset to show the basic usage of the API. We
create two classes of sinusoidal samples with different phases.

Make toy dataset

.. GENERATED FROM PYTHON SOURCE LINES 32-49

.. code-block:: Python

    fd1 = skfda.datasets.make_sinusoidal_process(
        error_std=0,
        phase_std=0.35,
        random_state=0,
    )
    fd2 = skfda.datasets.make_sinusoidal_process(
        phase_mean=1.9,
        error_std=0,
        random_state=1,
    )

    # Join both classes into one FDataGrid and build the label vector
    X = fd1.concatenate(fd2)
    y = np.array(15 * [0] + 15 * [1])

    # Plot toy dataset
    X.plot(group=y, group_colors=['C0', 'C1'])

.. image-sg:: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_001.png
   :alt: plot radius neighbors classification
   :srcset: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_001.png
   :class: sphx-glr-single-img

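
The difference between the two neighbor rules mentioned in the introduction
can be sketched with plain NumPy. This is only an illustration on hypothetical
distance values, not the scikit-fda implementation: the k-nearest rule always
keeps a fixed number of neighbors, while the radius rule keeps every sample
whose distance does not exceed the radius, however many that is.

.. code-block:: Python

    import numpy as np

    # Hypothetical distances from one query sample to six training samples.
    distances = np.array([0.10, 0.25, 0.28, 0.40, 0.75, 0.90])

    # k-nearest rule: keep the k closest samples, however far away they are.
    k = 4
    knn_idx = np.argsort(distances)[:k]

    # Radius rule: keep every sample within the radius, however many there are.
    radius = 0.3
    radius_idx = np.flatnonzero(distances <= radius)

    print(knn_idx)     # [0 1 2 3]
    print(radius_idx)  # [0 1 2]

With these values the fourth neighbor lies outside the radius, so it takes
part in the k-nearest vote but not in the radius vote.
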
.. GENERATED FROM PYTHON SOURCE LINES 50-53

As in the K-nearest neighbors example, we split the dataset into two
partitions, for training and test, using the sklearn function
:func:`~sklearn.model_selection.train_test_split`.

.. GENERATED FROM PYTHON SOURCE LINES 54-66

.. code-block:: Python

    # Split into training and test partitions, keeping the class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X,
        y,
        test_size=0.25,
        shuffle=True,
        stratify=y,
        random_state=0,
    )

.. GENERATED FROM PYTHON SOURCE LINES 67-75

The label assigned to a test sample is the majority class of its neighbors,
which in this case are all the training samples in the ball centered at that
sample.

If we use the :math:`\mathbb{L}^\infty` metric, we can visualize a ball as a
band of fixed radius around a function. The following figure shows the ball
centered at the first sample of the test partition.

.. GENERATED FROM PYTHON SOURCE LINES 76-95

.. code-block:: Python

    radius = 0.3
    sample = X_test[0]  # Center of the ball

    fig = X_train.plot(group=y_train, group_colors=['C0', 'C1'])

    # Plot ball
    sample.plot(fig=fig, color='red', linewidth=3)
    lower = sample - radius
    upper = sample + radius
    fig.axes[0].fill_between(
        sample.grid_points[0],
        lower.data_matrix.flatten(),
        upper.data_matrix[0].flatten(),
        alpha=0.25,
        color='C1',
    )

.. image-sg:: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_002.png
   :alt: plot radius neighbors classification
   :srcset: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 96-98

In this case, all the neighbors in the ball belong to the first class, so
this will be the predicted class.

.. GENERATED FROM PYTHON SOURCE LINES 99-117

.. code-block:: Python

    # Build the pairwise L_inf metric
    l_inf = PairwiseMetric(linf_distance)
    distances = l_inf(sample, X_train)[0]  # L_inf distances to 'sample'

    # Plot samples in the ball
    fig = X_train[distances <= radius].plot(color='C0')
    sample.plot(fig=fig, color='red', linewidth=3)
    fig.axes[0].fill_between(
        sample.grid_points[0],
        lower.data_matrix.flatten(),
        upper.data_matrix[0].flatten(),
        alpha=0.25,
        color='C1',
    )

.. image-sg:: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_003.png
   :alt: plot radius neighbors classification
   :srcset: /auto_examples/images/sphx_glr_plot_radius_neighbors_classification_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 118-128

We will fit the classifier
:class:`~skfda.ml.classification.RadiusNeighborsClassifier`, which has an API
similar to that of the sklearn estimator
:class:`~sklearn.neighbors.RadiusNeighborsClassifier`, but accepts
:class:`~skfda.representation.grid.FDataGrid` objects instead of arrays of
multivariate data.

The vote of the neighbors can be weighted using the parameter ``weights``.
In this case we weight each vote inversely proportionally to the distance.

.. GENERATED FROM PYTHON SOURCE LINES 129-134

.. code-block:: Python

    radius_nn = RadiusNeighborsClassifier(radius=radius, weights='distance')
    radius_nn.fit(X_train, y_train)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    RadiusNeighborsClassifier(radius=0.3, weights='distance')



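
To illustrate what weighting the vote by inverse distance means, here is a
minimal NumPy sketch. It is not the scikit-fda or sklearn implementation: the
helper ``weighted_radius_vote`` and the distance values are hypothetical, and
the sketch assumes strictly positive distances so that ``1 / distance`` is
well defined.

.. code-block:: Python

    import numpy as np


    def weighted_radius_vote(distances, labels, radius):
        """Return the label with the largest inverse-distance weight.

        Hypothetical helper for illustration only; it assumes that all
        distances are strictly positive.
        """
        distances = np.asarray(distances, dtype=float)
        labels = np.asarray(labels)

        # Only the samples inside the ball take part in the vote.
        inside = distances <= radius
        if not inside.any():
            raise ValueError("No neighbors found within the given radius.")

        # Closer neighbors receive larger weights.
        weights = 1.0 / distances[inside]
        voters = labels[inside]

        # Add up the weights of each class and pick the heaviest one.
        classes = np.unique(voters)
        totals = np.array([weights[voters == c].sum() for c in classes])
        return classes[np.argmax(totals)]


    # Two neighbors of class 0 and one much closer neighbor of class 1.
    label = weighted_radius_vote(
        distances=[0.05, 0.2, 0.25, 0.4],
        labels=[1, 0, 0, 0],
        radius=0.3,
    )
    print(label)  # 1

An unweighted majority vote over the same three neighbors would choose class
0, while the single very close neighbor of class 1 wins the weighted vote.
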
.. GENERATED FROM PYTHON SOURCE LINES 135-137

We can predict labels for the test partition with
:meth:`~skfda.ml.classification.RadiusNeighborsClassifier.predict`.

.. GENERATED FROM PYTHON SOURCE LINES 138-142

.. code-block:: Python

    pred = radius_nn.predict(X_test)
    print(pred)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 0 1 1 1 1 0]

.. GENERATED FROM PYTHON SOURCE LINES 143-144

In this case we get 100% accuracy, although this is only a toy dataset.

.. GENERATED FROM PYTHON SOURCE LINES 145-149

.. code-block:: Python

    test_score = radius_nn.score(X_test, y_test)
    print(test_score)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    1.0

.. GENERATED FROM PYTHON SOURCE LINES 150-152

If the radius is too small, some samples may have no neighbors. The
classifier raises an exception in that case.

.. GENERATED FROM PYTHON SOURCE LINES 153-162

.. code-block:: Python

    radius_nn.set_params(radius=0.5)  # Radius 0.5 in the L2 distance
    radius_nn.fit(X_train, y_train)

    try:
        radius_nn.predict(X_test)
    except ValueError as e:
        print(e)

.. GENERATED FROM PYTHON SOURCE LINES 163-164

A label for these outlier samples can be provided to avoid this problem.

.. GENERATED FROM PYTHON SOURCE LINES 165-172

.. code-block:: Python

    radius_nn.set_params(outlier_label=2)
    radius_nn.fit(X_train, y_train)
    pred = radius_nn.predict(X_test)

    print(pred)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 0 0 1 1 1 1 0]

.. GENERATED FROM PYTHON SOURCE LINES 173-175

This classifier can be used with multivariate functional data, such as
surfaces or curves in :math:`\mathbb{R}^N`, if the metric supports it too.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.464 seconds)

.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery