KernelSmoother

class skfda.preprocessing.smoothing.KernelSmoother(kernel_estimator=None, *, weights=None, output_points=None, metric=LpDistance(p=2, vector_norm=None))

Kernel smoothing method.

This class performs kernel smoothing of functional data.

Let \(t = (t_1, t_2, ..., t_n)\) be the discretisation points and \(X\) the vector of observations at those points. Then the smoothed values \(\hat{X}\) at the points \(t' = (t_1', t_2', ..., t_m')\) are obtained as

\[\hat{X} = \hat{H} X\]

where \(\hat{H}\) is a matrix described in HatMatrix.
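The product \(\hat{H} X\) can be sketched in plain NumPy for one-dimensional grid points. The helper name `nadaraya_watson_hat_matrix` is hypothetical and is not part of the library. The sketch assumes a Gaussian kernel with row normalisation; under that assumption it agrees, to the displayed precision, with the values shown in the Examples below.

```python
import numpy as np


def nadaraya_watson_hat_matrix(input_points, output_points=None, bandwidth=1.0):
    """Illustrative Nadaraya-Watson hat matrix (assumes a Gaussian kernel)."""
    t = np.asarray(input_points, dtype=float)
    # If no output points are given, smooth at the input points themselves.
    t_out = t if output_points is None else np.asarray(output_points, dtype=float)
    # Pairwise distances |t'_i - t_j| scaled by the bandwidth, shape (m, n).
    u = np.abs(t_out[:, None] - t[None, :]) / bandwidth
    # Gaussian kernel; its normalising constant cancels in the ratio below.
    k = np.exp(-0.5 * u**2)
    # Normalise each row so that the weights for one output point sum to 1.
    return k / k.sum(axis=1, keepdims=True)


grid_points = [1, 2, 4, 5, 7]
X = np.array([1, 2, 3, 4, 5], dtype=float)

H = nadaraya_watson_hat_matrix(grid_points, bandwidth=3.5)
X_hat = H @ X  # smoothed values at the input points

# A rectangular (m x n) hat matrix for a different set of output points.
H_out = nadaraya_watson_hat_matrix(
    grid_points, output_points=[1, 2, 3, 4, 5, 6, 7], bandwidth=2
)

print(H.round(3))
print(X_hat.round(2))
print(H_out.shape)
```

Each row of the matrix holds the weights that one output point assigns to the observations, so a larger bandwidth spreads the weights more evenly and smooths more strongly.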

Examples

>>> from skfda import FDataGrid
>>> from skfda.misc.hat_matrix import NadarayaWatsonHatMatrix
>>> fd = FDataGrid(
...     grid_points=[1, 2, 4, 5, 7],
...     data_matrix=[[1, 2, 3, 4, 5]],
... )
>>> kernel_estimator = NadarayaWatsonHatMatrix(bandwidth=3.5)
>>> smoother = KernelSmoother(kernel_estimator=kernel_estimator)
>>> fd_smoothed = smoother.fit_transform(fd)
>>> fd_smoothed.data_matrix.round(2)
array([[[ 2.42],
        [ 2.61],
        [ 3.03],
        [ 3.24],
        [ 3.65]]])
>>> smoother.hat_matrix().round(3)
array([[ 0.294, 0.282, 0.204, 0.153, 0.068],
       [ 0.249, 0.259, 0.22 , 0.179, 0.093],
       [ 0.165, 0.202, 0.238, 0.229, 0.165],
       [ 0.129, 0.172, 0.239, 0.249, 0.211],
       [ 0.073, 0.115, 0.221, 0.271, 0.319]])
>>> kernel_estimator = NadarayaWatsonHatMatrix(bandwidth=2)
>>> smoother = KernelSmoother(kernel_estimator=kernel_estimator)
>>> fd_smoothed = smoother.fit_transform(fd)
>>> fd_smoothed.data_matrix.round(2)
array([[[ 1.84],
        [ 2.18],
        [ 3.09],
        [ 3.55],
        [ 4.28]]])
>>> smoother.hat_matrix().round(3)
array([[ 0.425, 0.375, 0.138, 0.058, 0.005],
       [ 0.309, 0.35 , 0.212, 0.114, 0.015],
       [ 0.103, 0.193, 0.319, 0.281, 0.103],
       [ 0.046, 0.11 , 0.299, 0.339, 0.206],
       [ 0.006, 0.022, 0.163, 0.305, 0.503]])

The output points can be changed:

>>> kernel_estimator = NadarayaWatsonHatMatrix(bandwidth=2)
>>> smoother = KernelSmoother(
...     kernel_estimator=kernel_estimator,
...     output_points=[1, 2, 3, 4, 5, 6, 7],
... )
>>> fd_smoothed = smoother.fit_transform(fd)
>>> fd_smoothed.data_matrix.round(2)
array([[[ 1.84],
        [ 2.18],
        [ 2.61],
        [ 3.09],
        [ 3.55],
        [ 3.95],
        [ 4.28]]])
>>> smoother.hat_matrix().round(3)
array([[ 0.425,  0.375,  0.138,  0.058,  0.005],
       [ 0.309,  0.35 ,  0.212,  0.114,  0.015],
       [ 0.195,  0.283,  0.283,  0.195,  0.043],
       [ 0.103,  0.193,  0.319,  0.281,  0.103],
       [ 0.046,  0.11 ,  0.299,  0.339,  0.206],
       [ 0.017,  0.053,  0.238,  0.346,  0.346],
       [ 0.006,  0.022,  0.163,  0.305,  0.503]])

Parameters:

kernel_estimator (HatMatrix | None) – Method used to calculate the hat matrix (default = NadarayaWatsonHatMatrix)
weights (ndarray[tuple[Any, ...], dtype[floating[Any]]] | None) – Weight coefficients for each point.
output_points (ArrayLike | Sequence[ArrayLike] | None) – The output points. If omitted, the input points are used.
metric (Metric[ndarray[tuple[Any, ...], dtype[floating[Any]]]])

So far, only nonparametric methods are implemented, because the smoother relies only on a discrete representation of functional data.

Methods

`fit`(X[, y])	Compute the hat matrix for the desired output points.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`hat_matrix`([input_points, output_points])
`score`(X, y)	Return the generalized cross validation (GCV) score.
`set_output`(*[, transform])	Set output container.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X[, y])	Multiply the hat matrix with the function values to smooth them.

fit(X, y=None)

Compute the hat matrix for the desired output points.

Parameters:

X (FDataGrid) – The data whose points are used to compute the matrix.
y (object) – Ignored.

Returns:

self

Return type:

_LinearSmoother

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.

Returns:

X_new – Transformed array.

Return type:

ndarray of shape (n_samples, n_features_new)

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for details on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

hat_matrix(input_points=None, output_points=None)

Parameters:

input_points (ArrayLike | Sequence[ArrayLike] | None)
output_points (ArrayLike | Sequence[ArrayLike] | None)

Return type:

ndarray[tuple[Any, ...], dtype[floating[Any]]]

score(X, y)

Return the generalized cross validation (GCV) score.

Parameters:

X (FDataGrid) – The data to smooth.
y (FDataGrid) – The target data. Typically the same as X.

Returns:

Generalized cross validation score.

Return type:

float
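For intuition, a textbook form of GCV for a linear smoother with a square hat matrix \(\hat{H}\) is the mean squared residual multiplied by the penalty \((1 - \mathrm{tr}(\hat{H})/n)^{-2}\). The NumPy sketch below illustrates that formula only. The function name `gcv_score` is hypothetical, and the sign, averaging and penalisation conventions of the value returned by `score` are not stated here and may differ.

```python
import numpy as np


def gcv_score(hat_matrix, values):
    """Textbook GCV for a linear smoother (illustrative, not the library's code)."""
    H = np.asarray(hat_matrix, dtype=float)
    x = np.asarray(values, dtype=float)
    n = x.shape[-1]
    # Fitted values H @ x; written as x @ H.T so rows of samples also work.
    residuals = x - x @ H.T
    mean_squared_residual = np.mean(residuals**2)
    # The penalty grows as tr(H) approaches n, i.e. as the smoother
    # gets closer to interpolating the data.
    penalty = (1.0 - np.trace(H) / n) ** -2
    return penalty * mean_squared_residual


# Hypothetical smoother that replaces every value by the overall mean.
H = np.full((5, 5), 1 / 5)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
result = gcv_score(H, x)
print(result)  # approximately 3.125
```

Here the mean squared residual is 2 and the trace of the matrix is 1, so the penalty is \(1 / 0.8^2\) and the result is about 3.125.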

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged

Added in version 1.4: "polars" option was added.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.