FDataBasis

class skfda.representation.basis.FDataBasis(basis, coefficients, *, dataset_name=None, argument_names=None, coordinate_names=None, sample_names=None, extrapolation=None)

Basis representation of functional data.

Class representation for functional data in the form of a set of basis functions multiplied by a set of coefficients.

\[f(x) = \sum_{k=1}^{K} c_k \phi_k(x)\]

where \(K\) is the number of basis functions, \(c = (c_1, c_2, ..., c_K)\) is the vector of coefficients and \(\phi = (\phi_1, \phi_2, ..., \phi_K)\) is the basis function system.

Attributes:
  • basis – Basis function system.

  • coefficients – List or matrix of coefficients. Its length or number of columns must equal the number of basis functions in the basis. If a matrix, each row contains the coefficients that, multiplied by the basis functions, produce one functional datum.

  • domain_range – Two-dimensional matrix where each row contains the bounds of the interval in which the functional data is considered to exist for each one of the axes.

  • dataset_name – Name of the dataset.

  • argument_names – Tuple containing the names of the different arguments.

  • coordinate_names – Tuple containing the names of the different coordinate functions.

  • extrapolation – Defines the default type of extrapolation. By default None, which does not apply any type of extrapolation. See Extrapolation for detailed information about the types of extrapolation.

Parameters:
  • basis (Basis) –

  • coefficients (ArrayLike) –

  • dataset_name (str | None) –

  • argument_names (Optional[LabelTupleLike]) –

  • coordinate_names (Optional[LabelTupleLike]) –

  • sample_names (Optional[LabelTupleLike]) –

  • extrapolation (Optional[ExtrapolationLike]) –

Examples

>>> from skfda.representation.basis import FDataBasis, MonomialBasis
>>>
>>> basis = MonomialBasis(n_basis=4)
>>> coefficients = [1, 1, 3, .5]
>>> FDataBasis(basis, coefficients)
FDataBasis(
    basis=MonomialBasis(domain_range=((0.0, 1.0),), n_basis=4),
    coefficients=[[ 1.   1.   3.   0.5]],
    ...)
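
The expansion in the example above can be checked by hand. The following is a minimal sketch in plain Python, not the library's implementation. It assumes that the monomial basis functions are \(\phi_k(x) = x^{k-1}\), and the helper name `evaluate_expansion` is hypothetical.

```python
def evaluate_expansion(coefficients, x):
    """Evaluate f(x) = sum_k c_k * phi_k(x), assuming phi_k(x) = x**(k - 1)."""
    return sum(c * x**power for power, c in enumerate(coefficients))


# Coefficients from the example above: f(x) = 1 + x + 3x^2 + 0.5x^3.
coefficients = [1, 1, 3, 0.5]

value_at_zero = evaluate_expansion(coefficients, 0.0)  # only c_1 contributes
value_at_one = evaluate_expansion(coefficients, 1.0)   # sum of all coefficients
```

At x = 1 every monomial equals 1, so the value is simply the sum of the coefficients.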

Methods

argmax([skipna])

Return the index of maximum value.

argmin([skipna])

Return the index of minimum value.

argsort(*[, ascending, kind, na_position])

Return the indices that would sort this array.

astype(dtype[, copy])

Cast to a new dtype.

compose(fd, *[, eval_points])

Composition of functions.

concatenate(*others[, as_coordinates])

Join samples from a similar FDataBasis object.

copy(*[, deep, basis, coefficients, ...])

Copy the FDataBasis.

cov()

Compute the covariance of the functional data object.

delete(loc)

derivative(*[, order])

Differentiate a FData object.

dropna()

Return ExtensionArray without NA values.

duplicated([keep])

Return boolean ndarray denoting duplicate values.

equals(other)

Equality of FDataBasis.

evaluate(eval_points, *[, derivative, ...])

Evaluate the object at a list of values or a grid.

factorize([use_na_sentinel])

Encode the extension array as an enumerated type.

fillna([value, method, limit, copy])

Fill NA/NaN values using the specified method.

from_data(data_matrix, *, basis[, ...])

Transform raw data to a smooth functional form.

insert(loc, item)

Insert an item at the given position.

integrate(*[, domain])

Integration of the FData object.

interpolate(*, method, axis, index, limit, ...)

See DataFrame.interpolate.__doc__.

isin(values)

Pointwise comparison for set containment in the given values.

isna()

Return a 1-D array indicating if each value is missing.

map(mapper[, na_action])

Map values using an input mapping or function.

mean(*[, axis, dtype, out, keepdims, skipna])

Compute the mean of all the samples.

plot(*args, **kwargs)

Plot the functional data object.

ravel([order])

Return a flattened view on this array.

repeat(repeats[, axis])

Repeat elements of an ExtensionArray.

searchsorted(value[, side, sorter])

Find indices where elements should be inserted to maintain order.

shift(shifts, *[, restrict_domain, ...])

Perform a shift of the curves.

sum(*[, axis, out, keepdims, skipna, min_count])

Compute the sum of all the samples in a FDataBasis object.

take(indices[, allow_fill, fill_value, axis])

Take elements from an array.

to_basis([basis, eval_points])

Return the basis representation of the object.

to_grid([grid_points, sample_points])

Return the discrete representation of the object.

to_numpy([dtype, copy, na_value])

Convert to a NumPy ndarray.

tolist()

Return a list of the values.

transpose(*axes)

Return a transposed view on this array.

unique()

Compute the ExtensionArray of unique values.

var([eval_points, correction])

Compute the variance of the functional data object.

view([dtype])

Return a view on the array.

argmax(skipna=True)

Return the index of maximum value.

In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.

Parameters:

skipna (bool, default True) –

Return type:

int

See also

ExtensionArray.argmin

Return the index of the minimum value.

Examples

>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3

argmin(skipna=True)

Return the index of minimum value.

In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.

Parameters:

skipna (bool, default True) –

Return type:

int

See also

ExtensionArray.argmax

Return the index of the maximum value.

Examples

>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1

argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)

Return the indices that would sort this array.

Parameters:
  • ascending (bool, default True) – Whether the indices should result in an ascending or descending sort.

  • kind ({'quicksort', 'mergesort', 'heapsort', 'stable'}, optional) – Sorting algorithm.

  • na_position ({'first', 'last'}, default 'last') – If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.

  • *args – Passed through to numpy.argsort().

  • **kwargs – Passed through to numpy.argsort().

Returns:

Array of indices that sort self. If NaN values are contained, NaN values are placed at the end.

Return type:

np.ndarray[np.intp]

See also

numpy.argsort

Sorting implementation used internally.

Examples

>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])

astype(dtype, copy=True)

Cast to a new dtype.

Return type:

Any

compose(fd, *, eval_points=None, **kwargs)

Composition of functions.

Performs the composition of functions. The basis is discretized to compute the composition.

Parameters:
  • fd (FData) – FData object to compose with. It should have the same number of samples and an image dimension equal to 1.

  • eval_points (ndarray[Any, dtype[float64]] | None) – Points to perform the evaluation.

  • kwargs (Any) – Named arguments to be passed to from_data().

Returns:

Function resulting from the composition.

Return type:

FData
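
The discretize-then-refit idea described for compose() can be sketched with NumPy alone. This is an illustration under assumptions (a monomial basis on [0, 1], a hand-picked grid, and a least squares projection through `numpy.linalg.lstsq`), not the code that compose() actually runs.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Outer function f(x) = 1 + x and inner function g(t) = t^2,
# both stored as monomial coefficients (lowest degree first).
f_coefficients = np.array([1.0, 1.0])
g_coefficients = np.array([0.0, 0.0, 1.0])

# Step 1: discretize by evaluating f(g(t)) on a grid.
grid = np.linspace(0, 1, 21)
composed_values = P.polyval(P.polyval(grid, g_coefficients), f_coefficients)

# Step 2: project the sampled values back onto a monomial basis
# with three functions (columns 1, t, t^2).
design_matrix = P.polyvander(grid, 2)
composed_coefficients, *_ = np.linalg.lstsq(
    design_matrix, composed_values, rcond=None,
)
```

Here f(g(t)) = 1 + t^2 lies in the span of the target basis, so the fitted coefficients are close to (1, 0, 1). In general the composition is only approximated.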

concatenate(*others, as_coordinates=False)

Join samples from a similar FDataBasis object.

Joins samples from another FDataBasis object if they have the same basis.

Parameters:
  • others (T) – Objects to be concatenated.

  • as_coordinates (bool) – If False, concatenates the objects as new samples; otherwise, concatenates the other functions as new components of the image. Defaults to False.

  • self (T) –

Returns:

FDataBasis object with the samples from the original objects.

Return type:

FDataBasis

Todo

For the moment, only unidimensional objects are supported in basis representation.

copy(*, deep=False, basis=None, coefficients=None, dataset_name=None, argument_names=None, coordinate_names=None, sample_names=None, extrapolation=None)

Copy the FDataBasis.

Parameters:
  • self (T) –

  • deep (bool) –

  • basis (Optional[Basis]) –

  • coefficients (Optional[NDArrayFloat]) –

  • dataset_name (Optional[str]) –

  • argument_names (Optional[LabelTupleLike]) –

  • coordinate_names (Optional[LabelTupleLike]) –

  • sample_names (Optional[LabelTupleLike]) –

  • extrapolation (Optional[ExtrapolationLike]) –

Return type:

T

cov(s_points: ndarray[Any, dtype[float64]], t_points: ndarray[Any, dtype[float64]], /, correction: int = 0) → ndarray[Any, dtype[float64]]
cov(correction: int = 0) → Callable[[ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]]], ndarray[Any, dtype[float64]]]

Compute the covariance of the functional data object.

Calculates the sample covariance function of the data, using the divisor N - correction. This is expected to be defined only for univariate functions. The covariance is a function defined over the basis consisting of the tensor product of the original basis with itself. The resulting covariance function is then represented as a callable object.

If s_points or t_points are not provided, this method returns a callable object representing the covariance function. If s_points and t_points are provided, this method returns the evaluation of the covariance function at the grid formed by the Cartesian product of the points in s_points and t_points.

Parameters:
  • s_points – Points where the covariance function is evaluated.

  • t_points – Points where the covariance function is evaluated.

  • correction – Degrees of freedom adjustment. The divisor used in the calculation is N - correction, where N represents the number of elements. Default: 0.

Returns:

Covariance function.
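
The tensor product structure can be made concrete with a small NumPy sketch. It assumes a monomial basis and writes the covariance as cov(s, t) = φ(s)' Σ φ(t), where Σ is the covariance matrix of the coefficients. The names used here are illustrative and this is not the library's implementation.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Three samples in a monomial basis with three functions (rows are samples).
coefficients = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [2.0, 2.0, 0.0],
])
correction = 0

# Covariance matrix of the coefficients, with divisor N - correction.
n_samples = coefficients.shape[0]
centered = coefficients - coefficients.mean(axis=0)
coefficient_cov = centered.T @ centered / (n_samples - correction)


def covariance(s_points, t_points):
    """Evaluate phi(s)' Sigma phi(t) on the grid s_points x t_points."""
    phi_s = P.polyvander(np.asarray(s_points), 2)
    phi_t = P.polyvander(np.asarray(t_points), 2)
    return phi_s @ coefficient_cov @ phi_t.T


s_points = np.array([0.0, 0.5, 1.0])
t_points = np.array([0.25, 0.75])
cov_values = covariance(s_points, t_points)

# Cross-check against the pointwise covariance of the evaluated curves.
curves_s = P.polyvander(s_points, 2) @ coefficients.T  # (len(s), n_samples)
curves_t = P.polyvander(t_points, 2) @ coefficients.T  # (len(t), n_samples)
direct = (
    (curves_s - curves_s.mean(axis=1, keepdims=True))
    @ (curves_t - curves_t.mean(axis=1, keepdims=True)).T
    / (n_samples - correction)
)
```

The result has one row per point in `s_points` and one column per point in `t_points`, matching the grid described above.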

delete(loc)

Parameters:

loc (PositionalIndexer) –

Return type:

Self

derivative(*, order=1)

Differentiate a FData object.

Parameters:
  • order (int) – Order of the derivative. Defaults to one.

  • self (T) –

Returns:

Functional object containing the derivative.

Return type:

T
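
For intuition, differentiating a monomial expansion only requires rescaling and shifting its coefficients. The sketch below uses `numpy.polynomial.polynomial.polyder` on plain coefficient lists. It assumes a monomial basis and says nothing about how the library represents the derivative for other bases.

```python
from numpy.polynomial import polynomial as P

# f(x) = 1 + x + 3x^2 + 0.5x^3, coefficients ordered from lowest degree.
coefficients = [1, 1, 3, 0.5]

first_derivative = P.polyder(coefficients, 1)   # 1 + 6x + 1.5x^2
second_derivative = P.polyder(coefficients, 2)  # 6 + 3x
```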

dropna()

Return ExtensionArray without NA values.

Examples

>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64

Return type:

Self

duplicated(keep='first')

Return boolean ndarray denoting duplicate values.

Parameters:

keep ({'first', 'last', False}, default 'first') –

  • first : Mark duplicates as True except for the first occurrence.

  • last : Mark duplicates as True except for the last occurrence.

  • False : Mark all duplicates as True.

Return type:

ndarray[bool]

Examples

>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False,  True, False, False,  True])

equals(other)

Equality of FDataBasis.

Parameters:

other (object) –

Return type:

bool

evaluate(eval_points, *, derivative=0, extrapolation=None, grid=False, aligned=True)

Evaluate the object at a list of values or a grid.

Deprecated since version 0.8: Use normal calling notation instead.

Parameters:
  • eval_points (ArrayLike | Sequence[ArrayLike] | Iterable[ArrayLike | Sequence[ArrayLike]]) – List of points where the functions are evaluated. If grid is True, a list of axes, one per domain dimension, must be passed instead. If aligned is False, then a list of lists (of points or axes, as explained) must be passed, with one list per sample.

  • derivative (int) – Deprecated. Order of the derivative to evaluate.

  • extrapolation (Evaluator | typing_extensions.Literal[bounds, exception, nan, none, periodic, zeros] | None) – Controls the extrapolation mode for elements outside the domain range. By default, the mode defined when the object was instantiated is used.

  • grid (bool) – Whether to evaluate the results on a grid spanned by the input arrays, or at points specified by the input arrays. If True, eval_points should be a list of size dim_domain with the corresponding times for each axis. The returned matrix has shape n_samples x len(t1) x len(t2) x … x len(t_dim_domain) x dim_codomain. If the domain dimension is 1, the parameter has no effect. Defaults to False.

  • aligned (bool) – Whether the input points are the same for each sample, or an array of points per sample is passed.

Returns:

Matrix whose rows are the values of each function at the values specified in eval_points.

Return type:

ndarray[Any, dtype[float64]]
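
The difference between shared and per-sample evaluation points can be sketched with NumPy. This illustration assumes a monomial basis, evaluates plain coefficient rows with `polyval`, and omits the trailing codomain axis. It is not a call into the library.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Two samples in a monomial basis: 1 + x + x^2 and 1 + x^2.
coefficients = np.array([[1.0, 1.0, 1.0], [1.0, 0.0, 1.0]])

# aligned=True: every sample is evaluated at the same points.
shared_points = np.array([0.0, 1.0, 2.0])
aligned_values = np.stack(
    [P.polyval(shared_points, c) for c in coefficients],
)  # shape (n_samples, n_points)

# aligned=False: each sample has its own list of points.
points_per_sample = np.array([[0.0, 1.0], [2.0, 3.0]])
unaligned_values = np.stack(
    [P.polyval(points, c) for points, c in zip(points_per_sample, coefficients)],
)
```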

factorize(use_na_sentinel=True)

Encode the extension array as an enumerated type.

Parameters:

use_na_sentinel (bool, default True) –

If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.

New in version 1.5.0.

Returns:

  • codes (ndarray) – An integer NumPy array that’s an indexer into the original ExtensionArray.

  • uniques (ExtensionArray) – An ExtensionArray containing the unique values of self.

    Note

    uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.

Return type:

tuple[ndarray, ExtensionArray]

See also

factorize

Top-level factorize method that dispatches here.

Notes

pandas.factorize() offers a sort keyword as well.

Examples

>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                       "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')

fillna(value=None, method=None, limit=None, copy=True)

Fill NA/NaN values using the specified method.

Parameters:
  • value (scalar, array-like) – If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like “value” can be given. It’s expected that the array-like have the same length as ‘self’.

  • method ({'backfill', 'bfill', 'pad', 'ffill', None}, default None) –

    Method to use for filling holes in reindexed Series:

    • pad / ffill: propagate last valid observation forward to next valid.

    • backfill / bfill: use NEXT valid observation to fill gap.

    Deprecated since version 2.1.0.

  • limit (int, default None) –

    If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.

    Deprecated since version 2.1.0.

  • copy (bool, default True) – Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.

Returns:

With NA/NaN filled.

Return type:

ExtensionArray

Examples

>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64

classmethod from_data(data_matrix, *, basis, grid_points=None, sample_points=None, method='cholesky')

Transform raw data to a smooth functional form.

Takes functional data in discrete form and approximates it by the closest function that can be generated by the basis. This function does not attempt to smooth the original data. If smoothing is desired, it is better to use BasisSmoother.

The fit is made so as to minimize the sum of squared errors [RS05-5-2-5]:

\[SSE(c) = (y - \Phi c)' (y - \Phi c)\]

where \(y\) is the vector or matrix of observations, \(\Phi\) the matrix whose columns are the basis functions evaluated at the sampling points and \(c\) the coefficient vector or matrix to be estimated.

Differentiating this expression with respect to \(c\) and setting the derivative to zero gives the closed form of the estimated coefficient matrix:

\[\hat{c} = \left( \Phi' \Phi \right)^{-1} \Phi' y\]

This matrix equation is solved using the Cholesky method for least squares problems. If this method raises a rounding error warning, you may want to use the QR factorisation, which is more numerically stable despite being more expensive to compute. [RS05-5-2-7]
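
The two solution routes can be compared in a short NumPy sketch. It uses the same observations as the example for this method and assumes an orthonormal Fourier system on [0, 1] (constant, √2·sin, √2·cos) for the design matrix. The code illustrates the normal equations and the QR alternative and is not the library's internal solver.

```python
import numpy as np

# Observations: five samples of sin(2*pi*t) + cos(2*pi*t) + 2 on [0, 1].
t = np.linspace(0, 1, 5)
y = np.sin(2 * np.pi * t) + np.cos(2 * np.pi * t) + 2

# Design matrix Phi: one column per basis function, one row per sampling point.
# Assumption: an orthonormal Fourier system on [0, 1].
phi = np.column_stack([
    np.ones_like(t),
    np.sqrt(2) * np.sin(2 * np.pi * t),
    np.sqrt(2) * np.cos(2 * np.pi * t),
])

# Cholesky route: solve the normal equations (Phi' Phi) c = Phi' y.
lower = np.linalg.cholesky(phi.T @ phi)
c_cholesky = np.linalg.solve(lower.T, np.linalg.solve(lower, phi.T @ y))

# QR route: Phi = Q R, then solve R c = Q' y without forming Phi' Phi.
q, r = np.linalg.qr(phi)
c_qr = np.linalg.solve(r, q.T @ y)
```

Both routes give approximately (2, 0.71, 0.71) here. The QR route avoids forming Φ'Φ, whose condition number is the square of that of Φ, which is why it tends to be more stable.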

Parameters:
  • data_matrix (Union[NDArrayFloat, NDArrayInt]) – List or matrix containing the observations. If a matrix, each row represents a single functional datum and the columns represent the different observations.

  • grid_points (Optional[GridPointsLike]) – Values of the domain where the previous data were taken.

  • basis (Basis) – Basis used.

  • method (str) – Algorithm used for calculating the coefficients using the least squares method. The values admitted are ‘cholesky’ and ‘qr’ for Cholesky and QR factorisation methods respectively.

  • sample_points (Optional[GridPointsLike]) –

    Old name for grid_points. New code should use grid_points instead.

    Deprecated since version 0.5.

Returns:

Representation of the data in functional form as a product of coefficients and basis functions.

Return type:

FDataBasis

Examples

>>> import numpy as np
>>> t = np.linspace(0, 1, 5)
>>> x = np.sin(2 * np.pi * t) + np.cos(2 * np.pi * t) + 2
>>> x
array([ 3.,  3.,  1.,  1.,  3.])
>>> from skfda.representation.basis import FDataBasis, FourierBasis
>>> basis = FourierBasis((0, 1), n_basis=3)
>>> fd = FDataBasis.from_data(x, grid_points=t, basis=basis)
>>> fd.coefficients.round(2)
array([[ 2.  , 0.71, 0.71]])

References

[RS05-5-2-5]

Ramsay, J., Silverman, B. W. (2005). How spline smooths are computed. In Functional Data Analysis (pp. 86-87). Springer.

[RS05-5-2-7]

Ramsay, J., Silverman, B. W. (2005). Spline smoothing as an augmented least squares problem. In Functional Data Analysis (pp. 86-87). Springer.

insert(loc, item)

Insert an item at the given position.

Parameters:
  • loc (int) –

  • item (scalar-like) –

Return type:

same type as self

Notes

This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.

The default implementation relies on _from_sequence to raise on invalid items.

Examples

>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64

integrate(*, domain=None)

Integration of the FData object.

The integration is performed over the whole domain. Thus, for a function of several variables this will be a multiple integral.

For a vector valued function the vector of integrals will be returned.

Parameters:
  • domain (Tuple[Tuple[float, float], ...] | None) – Domain range over which to integrate. Defaults to None, in which case the integration is performed over the whole domain.

  • self (T) –

Returns:

NumPy array of size (n_samples, dim_codomain) with the integrated data.

Return type:

ndarray[Any, dtype[float64]]

Examples

We first create the data basis.

>>> from skfda.representation.basis import FDataBasis
>>> from skfda.representation.basis import MonomialBasis
>>> basis = MonomialBasis(n_basis=4)
>>> coefficients = [1, 1, 3, .5]
>>> fdata = FDataBasis(basis, coefficients)

Then we can integrate on the whole domain.

>>> fdata.integrate()
array([[ 2.625]])

Or we can do it on a given domain.

>>> fdata.integrate(domain=((0.5, 1),))
array([[ 1.8671875]])
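
The two values in this example can be reproduced by integrating each monomial term in closed form. This is a plain Python sketch that assumes the monomial basis 1, x, x^2, x^3. The helper `integrate_monomials` is hypothetical and is not part of the library.

```python
def integrate_monomials(coefficients, lower, upper):
    """Integrate sum_k c_k * x**k over [lower, upper] term by term."""
    return sum(
        c * (upper ** (k + 1) - lower ** (k + 1)) / (k + 1)
        for k, c in enumerate(coefficients)
    )


coefficients = [1, 1, 3, 0.5]

whole_domain = integrate_monomials(coefficients, 0.0, 1.0)  # matches 2.625
half_domain = integrate_monomials(coefficients, 0.5, 1.0)   # matches 1.8671875
```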

interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)

See DataFrame.interpolate.__doc__.

Examples

>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64

Parameters:
  • method (InterpolateOptions) –

  • axis (int) –

  • index (Index) –

  • copy (bool) –

Return type:

Self

isin(values)

Pointwise comparison for set containment in the given values.

Roughly equivalent to np.array([x in values for x in self])

Parameters:

values (np.ndarray or ExtensionArray) –

Return type:

np.ndarray[bool]

Examples

>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean

isna()

Return a 1-D array indicating if each value is missing.

Returns:

Positions of NA.

Return type:

na_values (np.ndarray)

map(mapper, na_action=None)

Map values using an input mapping or function.

Parameters:
  • mapper (function, dict, or Series) – Mapping correspondence.

  • na_action ({None, 'ignore'}, default None) – If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.

Returns:

The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.

Return type:

Union[ndarray, Index, ExtensionArray]

mean(*, axis=None, dtype=None, out=None, keepdims=False, skipna=False)

Compute the mean of all the samples.

Parameters:
  • axis (int | None) – Used for compatibility with numpy. Must be None or 0.

  • dtype (None) – Used for compatibility with numpy. Must be None.

  • out (None) – Used for compatibility with numpy. Must be None.

  • keepdims (bool) – Used for compatibility with numpy. Must be False.

  • skipna (bool) – Whether the NaNs are ignored or not.

  • self (T) –

Returns:

A FData object with just one sample representing the mean of all the samples in the original object.

Return type:

T
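
Because the basis expansion is linear in the coefficients, the mean function is the expansion of the mean coefficients. The NumPy sketch below checks this for a monomial basis. It illustrates the underlying identity and is not a description of the library's code path.

```python
import numpy as np

coefficients = np.array([[0.5, 1.0, 2.0, 0.5], [1.5, 1.0, 4.0, 0.5]])

# By linearity, the mean function has the mean coefficients.
mean_coefficients = coefficients.mean(axis=0, keepdims=True)

# Cross-check at a few points using the monomial basis 1, x, x^2, x^3.
points = np.array([0.0, 0.5, 1.0])
design = np.vander(points, 4, increasing=True)
mean_of_values = (design @ coefficients.T).mean(axis=1)
values_of_mean = design @ mean_coefficients[0]
```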

plot(*args, **kwargs)

Plot the functional data object.

Parameters:
  • args (Any) – Positional arguments to be passed to the class GraphPlot.

  • kwargs (Any) – Keyword arguments to be passed to the class GraphPlot.

Returns:

Figure object in which the graphs are plotted.

Return type:

Figure

ravel(order='C')

Return a flattened view on this array.

Parameters:

order ({None, 'C', 'F', 'A', 'K'}, default 'C') –

Return type:

ExtensionArray

Notes

  • Because ExtensionArrays are 1D-only, this is a no-op.

  • The “order” argument is ignored; it exists only for compatibility with NumPy.

Examples

>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64

repeat(repeats, axis=None)

Repeat elements of an ExtensionArray.

Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.

Parameters:
  • repeats (int or array of ints) – The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.

  • axis (None) – Must be None. Has no effect but is accepted for compatibility with numpy.

Returns:

Newly created ExtensionArray with repeated elements.

Return type:

ExtensionArray

See also

Series.repeat

Equivalent function for Series.

Index.repeat

Equivalent function for Index.

numpy.repeat

Similar method for numpy.ndarray.

ExtensionArray.take

Take arbitrary positions.

Examples

>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']

searchsorted(value, side='left', sorter=None)

Find indices where elements should be inserted to maintain order.

Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.

Assuming that self is sorted:

side     returned index i satisfies
left     self[i-1] < value <= self[i]
right    self[i-1] <= value < self[i]

Parameters:
  • value (array-like, list or scalar) – Value(s) to insert into self.

  • side ({'left', 'right'}, optional) – If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).

  • sorter (1-D array-like, optional) – Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.

Returns:

If value is array-like, array of insertion points. If value is scalar, a single integer.

Return type:

array of ints or int

See also

numpy.searchsorted

Similar method from NumPy.

Examples

>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
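
The left and right conditions listed above match the behaviour of the standard library's `bisect` module, which makes the difference easy to try on a plain sorted list. This sketch uses `bisect` for illustration only and does not call searchsorted itself.

```python
from bisect import bisect_left, bisect_right

sorted_values = [1, 2, 3, 3, 5]

# side='left': insertion point before any entries equal to the value.
left_index = bisect_left(sorted_values, 3)

# side='right': insertion point after any entries equal to the value.
right_index = bisect_right(sorted_values, 3)

# A value absent from the list gets the same insertion point from both sides.
between_left = bisect_left(sorted_values, 4)
between_right = bisect_right(sorted_values, 4)
```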

shift(shifts, *, restrict_domain=False, extrapolation=None, grid_points=None)

Perform a shift of the curves.

The i-th shifted function \(y_i\) has the form

\[y_i(t) = x_i(t + \delta_i)\]

where \(x_i\) is the i-th original function and \(\delta_i\) is the shift performed for that function, which must be a vector in the domain space.

Note that a positive shift moves the graph of the function in the negative direction and vice versa.
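
The sign convention can be verified on a toy curve. The sketch below uses plain Python functions, with x(t) = t² chosen arbitrarily, and ignores domain bounds and extrapolation. It only illustrates the formula y(t) = x(t + δ).

```python
def original(t):
    """Original curve x(t) = t**2, with its minimum at t = 0."""
    return t ** 2


def shifted(t, delta):
    """Shifted curve y(t) = x(t + delta)."""
    return original(t + delta)


delta = 0.5
grid = [i / 10 for i in range(-20, 21)]  # -2.0, -1.9, ..., 2.0

# Location of the minimum before and after a positive shift.
minimum_before = min(grid, key=original)
minimum_after = min(grid, key=lambda t: shifted(t, delta))
```

With δ = 0.5 the minimum moves from t = 0 to t = -0.5, so the graph moves in the negative direction.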

Parameters:
  • shifts (Union[ArrayLike, float]) – List with the shifts corresponding for each sample or numeric with the shift to apply to all samples.

  • restrict_domain (bool) – If True, restricts the domain to avoid evaluating points outside the domain using extrapolation. By default, extrapolation is used.

  • extrapolation (Optional[ExtrapolationLike]) – Controls the extrapolation mode for elements outside the domain range. By default, the method defined in fd is used. See extrapolation for more information.

  • grid_points (Optional[GridPointsLike]) – Grid of points where the functions are evaluated to obtain the discrete representation of the object to operate on. If None, the current grid_points are used to unify the domain of the shifted data.

Returns:

Shifted functions.

Return type:

FDataGrid

sum(*, axis=None, out=None, keepdims=False, skipna=False, min_count=0)

Compute the sum of all the samples in a FDataBasis object.

Parameters:
  • axis (int | None) – Used for compatibility with numpy. Must be None or 0.

  • out (None) – Used for compatibility with numpy. Must be None.

  • keepdims (bool) – Used for compatibility with numpy. Must be False.

  • skipna (bool) – Whether the NaNs are ignored or not.

  • min_count (int) – Number of valid (non-NaN) data points required for a variable not to be NaN when skipna is True.

  • self (T) –

Returns:

A FDataBasis object with just one sample representing the sum of all the samples in the original FDataBasis object.

Return type:

T

Examples

>>> from skfda.representation.basis import (
...     FDataBasis,
...     MonomialBasis,
... )
>>> basis = MonomialBasis(n_basis=4)
>>> coefficients = [[0.5, 1, 2, .5], [1.5, 1, 4, .5]]
>>> FDataBasis(basis, coefficients).sum()
FDataBasis(
    basis=MonomialBasis(domain_range=((0.0, 1.0),), n_basis=4),
    coefficients=[[ 2.  2.  6.  1.]],
    ...)

take(indices, allow_fill=False, fill_value=None, axis=0)

Take elements from an array.

Parameters:
  • indices (int | Sequence[int] | ndarray[Any, dtype[int64]]) – Indices to be taken.

  • allow_fill (bool) –

    How to handle negative values in indices.

    • False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().

    • True: negative values in indices indicate missing values. These values are set to fill_value. Any other negative values raise a ValueError.

  • fill_value (T | None) – Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used. For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.

  • axis (int) – Parameter for compatibility with numpy. Must always be 0.

  • self (T) –

Returns:

FData

Raises:
  • IndexError – When the indices are out of bounds for the array.

  • ValueError – When indices contains negative values other than -1 and allow_fill is True.

Return type:

T

Notes

ExtensionArray.take is called by Series.__getitem__, .loc and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.

See also

numpy.take

pandas.api.extensions.take
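
The two ways of interpreting negative indices can be mimicked on a plain list. The helper `take_sketch` below is hypothetical and only mirrors the documented index handling. It is not the method's implementation.

```python
def take_sketch(values, indices, allow_fill=False, fill_value=None):
    """Mimic the documented index handling on a plain list (hypothetical helper)."""
    if not allow_fill:
        # Negative indices count from the right, as in numpy.take().
        return [values[i] for i in indices]

    if any(i < -1 for i in indices):
        raise ValueError("Only -1 may be used as a missing-value marker.")

    # -1 marks a missing entry that is replaced by fill_value.
    return [fill_value if i == -1 else values[i] for i in indices]


samples = ["a", "b", "c"]

from_the_right = take_sketch(samples, [0, -1])
with_fill = take_sketch(samples, [0, -1], allow_fill=True, fill_value="NA")
```

With allow_fill=False the index -1 selects the last element, while with allow_fill=True the same index produces the fill value.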

to_basis(basis=None, eval_points=None, **kwargs)

Return the basis representation of the object.

Parameters:
  • basis (Optional[Basis]) – Basis object in which the functional data are going to be represented.

  • eval_points (Optional[NDArrayFloat]) – Evaluation points used to discretize the function if the basis is going to be changed.

  • kwargs (Any) – Keyword arguments to be passed to FDataBasis.from_data().

Returns:

Basis representation of the functional data object.

Return type:

FDataBasis

to_grid(grid_points=None, *, sample_points=None)

Return the discrete representation of the object.

Parameters:
  • grid_points (array_like, optional) – Points per axis where the functions are evaluated. If none are passed, numpy.linspace is called with bounds equal to the ones defined in self.domain_range and a number of points equal to the maximum of 501 and 10 times the number of basis functions.

  • sample_points (Optional[GridPointsLike]) –

    Old name for grid_points. New code should use grid_points instead.

    Deprecated since version 0.5.

Returns:

Discrete representation of the functional data object.

Return type:

FDataGrid

Examples

>>> from skfda.representation.basis import(
...     FDataBasis,
...     MonomialBasis,
... )
>>> fd = FDataBasis(
...     coefficients=[[1, 1, 1], [1, 0, 1]],
...     basis=MonomialBasis(domain_range=(0,5), n_basis=3),
... )
>>> fd.to_grid([0, 1, 2])
FDataGrid(
    array([[[ 1.],
            [ 3.],
            [ 7.]],
           [[ 1.],
            [ 2.],
            [ 5.]]]),
    grid_points=(array([ 0., 1., 2.]),),
    domain_range=((0.0, 5.0),),
    ...)

to_numpy(dtype=None, copy=False, na_value=_NoDefault.no_default)

Convert to a NumPy ndarray.

This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.

Parameters:
  • dtype (str or numpy.dtype, optional) – The dtype to pass to numpy.asarray().

  • copy (bool, default False) – Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.

  • na_value (Any, optional) – The value to use for missing values. The default value depends on dtype and the type of the array.

Return type:

numpy.ndarray

tolist()

Return a list of the values.

These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period)

Return type:

list

Examples

>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]

transpose(*axes)

Return a transposed view on this array.

Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.

Return type:

ExtensionArray

Parameters:

axes (int) –

Examples

>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64

unique()

Compute the ExtensionArray of unique values.

Return type:

pandas.api.extensions.ExtensionArray

Examples

>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64

var(eval_points=None, correction=0)

Compute the variance of the functional data object.

A numerical approach is used. The object is transformed into its discrete representation, the variance is computed, and the result is then converted back to the basis representation.

Parameters:
  • eval_points (ndarray[Any, dtype[float64]] | None) – Set of points where the functions are evaluated to obtain the discrete representation of the object. If none are passed, numpy.linspace is called with bounds equal to the ones defined in self.domain_range and a number of points equal to the maximum of 501 and 10 times the number of basis functions.

  • correction (int) – Degrees of freedom adjustment. The divisor used in the calculation is N - correction, where N represents the number of elements. Default: 0.

  • self (T) –

Returns:

Variance of the original object.

Return type:

T
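
The three steps of the numerical approach (discretize, compute the pointwise variance, project back) can be sketched with NumPy. The example assumes a monomial basis on [0, 1] and a least squares projection through `numpy.linalg.lstsq`. It illustrates the idea rather than reproducing the library's code.

```python
import numpy as np

# Two samples in a monomial basis with three functions: 1 + x + x^2 and 1 + x^2.
coefficients = np.array([[1.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
correction = 0

# Step 1: discretize on a grid over the domain (here [0, 1]).
eval_points = np.linspace(0, 1, 501)
design = np.vander(eval_points, 3, increasing=True)
values = coefficients @ design.T  # shape (n_samples, n_points)

# Step 2: pointwise variance across samples, with divisor N - correction.
pointwise_variance = values.var(axis=0, ddof=correction)

# Step 3: project the variance curve back onto the basis by least squares.
variance_coefficients, *_ = np.linalg.lstsq(
    design, pointwise_variance, rcond=None,
)
```

In this case the two samples differ by x, so the pointwise variance is x²/4 and the fitted coefficients are close to (0, 0, 0.25). In general the variance curve does not lie in the span of the basis, so the last step is only an approximation.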

view(dtype=None)

Return a view on the array.

Parameters:

dtype (str, np.dtype, or ExtensionDtype, optional) – Default None.

Returns:

A view on the ExtensionArray’s data.

Return type:

ExtensionArray or np.ndarray

Examples

This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected on the underlying data:

>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64