local_averages

skfda.preprocessing.feature_construction.local_averages(data, *, domains)

Calculate the local averages of given data in the desired domains.

It takes functional data and performs the following map:

\[\begin{split}f_1(X) = \frac{1}{|T_1|} \int_{T_1} X(t) dt,\dots, \\ f_p(X) = \frac{1}{|T_p|} \int_{T_p} X(t) dt\end{split}\]

where \(T_1, \dots, T_p\) are subregions of the original domain.
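
To make the map concrete, here is a minimal NumPy sketch (not the library's implementation) that approximates each local average with the trapezoidal rule. The helper name `local_average` and the test function X(t) = 2t are illustrative assumptions, and the interval endpoints are assumed to lie on the sampling grid.

```python
import numpy as np


def local_average(t, x, interval):
    """Approximate (1 / |T|) * integral of x over T = [a, b]."""
    a, b = interval
    # Keep only the grid points inside the interval
    # (assumes a and b are themselves grid points).
    mask = (t >= a) & (t <= b)
    t_sub, x_sub = t[mask], x[mask]
    # Trapezoidal rule: sum of mean heights times widths.
    integral = np.sum((x_sub[1:] + x_sub[:-1]) / 2 * np.diff(t_sub))
    return integral / (b - a)


# Hypothetical sample: X(t) = 2t observed on [0, 4] with step 0.5.
t = np.linspace(0.0, 4.0, 9)
x = 2 * t

# One feature per subregion T_1 = [0, 2] and T_2 = [2, 4].
averages = [local_average(t, x, T) for T in [(0, 2), (2, 4)]]
```

For X(t) = 2t the exact averages over [0, 2] and [2, 4] are 2 and 6, and the trapezoidal rule reproduces them because the integrand is linear.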

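Parameters:

data – Functional data whose local averages are calculated.

domains – Subregions \(T_1, \dots, T_p\) over which to average, given either as a list of intervals, such as [(1, 3), (3, 10)], or as an integer number of equal-length intervals into which the domain is split.
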
Returns:

ndarray of shape (n_samples, n_domains, codomain_dimension) with the transformed data.

Return type:

ndarray[Any, dtype[float64]]

Examples

We import the Berkeley Growth Study dataset. We use only the first 3 samples to keep the example simple.

>>> from skfda.datasets import fetch_growth
>>> dataset = fetch_growth(return_X_y=True)[0]
>>> X = dataset[:3]

We can choose the intervals used for the local averages. For example, here we could average over different stages of the child's development: from 1 to 3 years, from 3 to 10, and from 10 to 18:

>>> import numpy as np
>>> from skfda.preprocessing.feature_construction import local_averages
>>> averages = local_averages(
...     X,
...     domains=[(1, 3), (3, 10), (10, 18)],
... )
>>> np.round(averages)
array([[[  91.],
        [ 127.],
        [ 179.]],
       [[  88.],
        [ 121.],
        [ 159.]],
       [[  86.],
        [ 115.],
        [ 156.]]])

Alternatively, we can specify only how many intervals to use. For example, we can split the domain into 2 intervals of equal length:

>>> np.round(local_averages(X, domains=2))
array([[[ 117.],
        [ 177.]],
       [[ 112.],
        [ 158.]],
       [[ 107.],
        [ 155.]]])
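
As a rough illustration of what splitting a domain into equal-length intervals means, the sketch below builds the intervals explicitly. The helper `equal_intervals` is hypothetical, the domain range of 1 to 18 years is assumed from the ages used in the example above, and whether the library constructs its intervals in exactly this way is also an assumption.

```python
import numpy as np


def equal_intervals(start, stop, n_intervals):
    """Split [start, stop] into n_intervals intervals of equal length."""
    # n_intervals + 1 equally spaced edges delimit n_intervals intervals.
    edges = np.linspace(start, stop, n_intervals + 1)
    return [(float(a), float(b)) for a, b in zip(edges[:-1], edges[1:])]


# Assumed domain of the growth curves: ages 1 to 18.
intervals = equal_intervals(1.0, 18.0, 2)
```

Under these assumptions, `intervals` could be passed as the `domains` argument in place of the integer 2.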