v_sample_stat

skfda.inference.anova.v_sample_stat(fd, weights, p=2)

Compute sample statistic.

Calculates a statistic that measures the variability between groups of samples in a skfda.representation.FData object.

The statistic, defined below, is computed over all pairs of samples in a skfda.representation.FData object using a given set of weights.

Let \(\{f_i\}_{i=1}^k\) be a set of samples in an FData object. Let \(\{w_i\}_{i=1}^k\) be a set of weights, where \(w_i\) is the weight associated with the sample \(f_i\) for \(i=1,\dots,k\). The statistic is defined as:

\[V_n = \sum_{i<j}^kw_i\|f_i-f_j\|^2\]

This statistic is defined in Cuevas [1].
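
To make the double sum concrete, the following is a minimal NumPy sketch of the formula above, assuming each sample is given as an array of values on a common grid t. The helpers lp_norm_sketch and v_stat_sketch are hypothetical names introduced here for illustration and are not part of skfda. The norm is approximated with the trapezoidal rule, which is an assumption of this sketch; the library may integrate differently, so results can differ slightly from those of v_sample_stat.

```python
import numpy as np


def lp_norm_sketch(values, t, p=2):
    """Hypothetical helper: approximate the L^p norm of a function sampled on t."""
    values = np.abs(np.asarray(values, dtype=float))
    if np.isinf(p):
        # L-infinity norm: largest absolute value on the grid.
        return float(values.max())
    powered = values ** p
    # Trapezoidal rule (an assumption of this sketch, not necessarily skfda's rule).
    integral = np.sum((powered[1:] + powered[:-1]) * np.diff(t)) / 2
    return float(integral ** (1 / p))


def v_stat_sketch(samples, weights, t, p=2):
    """Hypothetical helper: sum over i < j of w_i * ||f_i - f_j||_p ** 2."""
    samples = np.asarray(samples, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if len(weights) != len(samples):
        # Design choice of this sketch: require one weight per sample.
        raise ValueError("Expected one weight per sample.")
    k = len(samples)
    total = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            # Only the weight of the lower-indexed sample enters each term.
            total += weights[i] * lp_norm_sketch(samples[i] - samples[j], t, p) ** 2
    return total


# Constant functions 0, 1 and 3 on [0, 1], so the squared L2 distances are 1, 9 and 4.
t = np.linspace(0, 1, 50)
samples = [np.zeros_like(t), np.ones_like(t), np.full_like(t, 3.0)]
value = v_stat_sketch(samples, [10, 20, 30], t)
print(round(value, 6))  # 180.0, i.e. 10 * 1 + 10 * 9 + 20 * 4
```

Because the sum runs over \(i<j\) and each term is weighted by \(w_i\), the weight of the last sample (30 in this sketch) never enters the result.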

Parameters:
  • fd (FData) – Object containing all the samples for which we want to calculate the statistic.

  • weights (ArrayLike) – Weights related to each sample. Each weight is expected to appear in the same position as its corresponding sample in the FData object.

  • p (int) – Order of the Lp norm. Must be greater than or equal to 1. If p='inf' or p=np.inf, the L-infinity metric is used. Defaults to 2.

Returns:

The value of the statistic.

Raises:

ValueError

Return type:

float

Examples

>>> from skfda.inference.anova import v_sample_stat
>>> from skfda.representation.grid import FDataGrid
>>> import numpy as np

We create several trajectories to pass to the statistic, together with a set of weights.

>>> t = np.linspace(0, 1, 50)
>>> x1 = t * (1 - t) ** 5
>>> x2 = t ** 2 * (1 - t) ** 4
>>> x3 = t ** 3 * (1 - t) ** 3
>>> fd = FDataGrid([x1, x2, x3], grid_points=t)
>>> weights = [10, 20, 30]

Finally, the value of the statistic is calculated:

>>> v_sample_stat(fd, weights)
0.0164...

References