occupation_measure

skfda.preprocessing.feature_construction.occupation_measure(data, intervals, *, n_points=None)

Calculate the occupation measure of functional data evaluated on a grid.

It performs the following map:

\(f_1(X) = |\{t: X(t)\in T_1\}|,\dots,f_p(X) = |\{t: X(t)\in T_p\}|\)

where \(\{T_1,\dots,T_p\}\) are disjoint intervals in \(\mathbb{R}\) and \(|\cdot|\) stands for the Lebesgue measure.

The calculations are based on evaluating the functions at a grid of points. For an FDataGrid, the original grid is used unless n_points is specified. For an FDataBasis, n_points is mandatory. If the result is not accurate enough, try increasing the number of grid points.
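The grid-based idea can be illustrated with a minimal, standard-library sketch. This is not the library's implementation: the helper name `approx_occupation_measure`, its signature, and the choice of equally spaced cell midpoints are assumptions made only for illustration. It approximates each measure by counting the grid cells whose midpoint value falls inside the interval and multiplying by the cell width.

```python
import math
from typing import Callable, List, Sequence, Tuple


def approx_occupation_measure(
    func: Callable[[float], float],
    domain: Tuple[float, float],
    intervals: Sequence[Tuple[float, float]],
    n_points: int = 501,
) -> List[float]:
    """Hypothetical helper: approximate |{t : func(t) in T}| for each T."""
    start, stop = domain
    # Split the domain into n_points cells of equal width.
    step = (stop - start) / n_points
    # Evaluate the function once at the midpoint of every cell.
    midpoints = [start + (k + 0.5) * step for k in range(n_points)]
    values = [func(t) for t in midpoints]
    # Each cell whose value lies in [low, high] contributes its width.
    return [
        step * sum(1 for v in values if low <= v <= high)
        for low, high in intervals
    ]


intervals = [(0.0, 1.0), (2.0, 3.0)]
# A coarse grid gives a rough estimate; a finer grid refines it.
coarse = approx_occupation_measure(math.sin, (0.0, 10.0), intervals, n_points=50)
fine = approx_occupation_measure(math.sin, (0.0, 10.0), intervals, n_points=5000)
```

In this sketch the error of each estimate is bounded by a few cell widths, which is why a finer grid improves the result.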

Parameters:
  • data (FData) – Functional data where we want to calculate the occupation measure.

  • intervals (Sequence[Tuple[float, float]]) – Sequence of (lower, upper) tuples defining the intervals to consider. If given as an ndarray, its shape should be (n_intervals, 2).

  • n_points (int | None) – Number of points at which to evaluate the functions in the domain. By default, the grid points of the FDataGrid are used. For an FDataBasis this value must be specified.

Returns:

ndarray of shape (n_samples, n_intervals) with the transformed data.

Return type:

ndarray[Any, dtype[float64]]

Examples

We will create the FDataGrid that we will use to extract the occupation measure.

>>> from skfda.representation import FDataGrid
>>> import numpy as np
>>> t = np.linspace(0, 10, 100)
>>> fd_grid = FDataGrid(
...     data_matrix=[
...         t,
...         2 * t,
...         np.sin(t),
...     ],
...     grid_points=t,
... )

Finally, we call the occupation measure function with the intervals we want to consider, in our case (0.0, 1.0) and (2.0, 3.0). We also specify the number of points at which the functions are evaluated by interpolation; here we use 501 points.

>>> from skfda.preprocessing.feature_construction import (
...     occupation_measure,
... )
>>> np.around(
...     occupation_measure(
...         fd_grid,
...         [(0.0, 1.0), (2.0, 3.0)],
...         n_points=501,
...     ),
...     decimals=2,
... )
array([[ 0.98,  1.  ],
       [ 0.5 ,  0.52],
       [ 6.28,  0.  ]])
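As a sanity check, the exact Lebesgue measures for the three sample functions on the domain \([0, 10]\) can be worked out by hand. The labels \(X_1, X_2, X_3\) are introduced here only for this derivation.

```latex
\begin{aligned}
X_1(t) = t: \quad & |\{t : t \in [0,1]\}| = 1, \quad |\{t : t \in [2,3]\}| = 1 \\
X_2(t) = 2t: \quad & |[0, 0.5]| = 0.5, \quad |[1, 1.5]| = 0.5 \\
X_3(t) = \sin t: \quad & |[0,\pi] \cup [2\pi, 3\pi]| = 2\pi \approx 6.28, \quad |\emptyset| = 0
\end{aligned}
```

The computed values agree with these up to small deviations (0.98 instead of 1, 0.52 instead of 0.5), which are presumably due to the finite evaluation grid.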