PerClassTransformer
- class skfda.preprocessing.feature_construction.PerClassTransformer(transformer, *, array_output=False)
Per class feature transformer for functional data.
This class takes a transformer and performs the following map:
\[\begin{split}\mathcal{X} &\rightarrow \mathbb{R}^G \\ x &\rightarrow \textbf{t} = (T_1(x), T_2(x), \ldots, T_G(x))\end{split}\]
where \(T_i(x)\) is the transformation of \(x\) with respect to the data in the \(i\)-th group, and \(G\) is the number of groups.
Note that \(\mathcal{X}\) is possibly multivariate, that is, \(\mathcal{X} = \mathcal{X}_1 \times ... \times \mathcal{X}_p\).
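To make the map above concrete, here is a minimal pure-Python sketch of the per-class idea. It is not scikit-fda's implementation: the helper names `fit_per_class` and `transform_per_class` are hypothetical, and the choice of \(T_i\) (Euclidean distance from a sampled curve to the pointwise mean of class \(i\)) is an assumption made only for illustration.

```python
from statistics import fmean


def fit_per_class(X, y):
    """Group the training samples by label and store one mean curve per class."""
    classes = sorted(set(y))
    means = {}
    for label in classes:
        group = [x for x, target in zip(X, y) if target == label]
        # Pointwise mean of the sampled curves belonging to this class.
        means[label] = [fmean(values) for values in zip(*group)]
    return classes, means


def transform_per_class(X, classes, means):
    """Map each sample x to (T_1(x), ..., T_G(x)), one entry per class."""

    def distance(x, mean):
        # Illustrative T_i: Euclidean distance to the class mean (an assumption).
        return sum((a - b) ** 2 for a, b in zip(x, mean)) ** 0.5

    return [[distance(x, means[label]) for label in classes] for x in X]


# Toy data: four "curves" sampled at three points, split into two classes.
X = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0], [10.0, 10.0, 10.0], [12.0, 12.0, 12.0]]
y = [0, 0, 1, 1]

classes, means = fit_per_class(X, y)
features = transform_per_class(X, classes, means)
```

Each row of `features` has one entry per class, so a sample ends up described by how it relates to every group rather than to the pooled data.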
- Parameters:
transformer (TransformerMixin[Input, TransformerOutput, object]) – The transformer to apply to the given data. It should use target data while fitting. This is checked by looking at the ‘stateless’ and ‘requires_y’ tags.
array_output (bool) – Whether the transformed data should be returned as a NumPy array. Defaults to False.
Examples
Firstly, we will import the Berkeley Growth Study dataset:
>>> from skfda.datasets import fetch_growth
>>> X, y = fetch_growth(return_X_y=True, as_frame=True)
>>> X = X.iloc[:, 0].values
>>> y = y.values.codes

>>> from skfda.preprocessing.feature_construction import (
...     PerClassTransformer,
... )
Next we need to select a functional data transformer; here we use RecursiveMaximaHunting. We fit it to the data and transform the data:
>>> from skfda.preprocessing.dim_reduction.variable_selection import (
...     RecursiveMaximaHunting,
... )
>>> t1 = PerClassTransformer(
...     RecursiveMaximaHunting(),
...     array_output=True,
... )
>>> x_transformed1 = t1.fit_transform(X, y)
x_transformed1 will be a NumPy array with the transformed data. We will split the generated data and fit a KNN classifier.
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.neighbors import KNeighborsClassifier
>>> X_train1, X_test1, y_train1, y_test1 = train_test_split(
...     x_transformed1,
...     y,
...     test_size=0.25,
...     stratify=y,
...     random_state=0,
... )
>>> neigh1 = KNeighborsClassifier()
>>> neigh1 = neigh1.fit(X_train1, y_train1)
Finally we can predict and check the score:
>>> neigh1.predict(X_test1)
array([0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1], dtype=int8)

>>> round(neigh1.score(X_test1, y_test1), 3)
0.958
We can also use a transformer that returns an FData object. In this example we use Fisher-Rao elastic registration.
>>> from skfda.preprocessing.registration import (
...     FisherRaoElasticRegistration,
... )
>>> t2 = PerClassTransformer(
...     FisherRaoElasticRegistration(),
... )
>>> x_transformed2 = t2.fit_transform(X, y)
x_transformed2 will be a DataFrame with the transformed data. Each column of the frame contains an FDataGrid describing a transformed curve. We can now use it to fit a KNN classifier. Again, we split the data into train and test sets.
>>> X_train2, X_test2, y_train2, y_test2 = train_test_split(
...     x_transformed2.iloc[:, 0].values,
...     y,
...     test_size=0.25,
...     stratify=y,
...     random_state=0,
... )
This time we need a functional data classifier. We fit the classifier and predict.
>>> from skfda.ml.classification import KNeighborsClassifier
>>> neigh2 = KNeighborsClassifier()
>>> neigh2 = neigh2.fit(X_train2, y_train2)
>>> neigh2.predict(X_test2)
array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1], dtype=int8)

>>> round(neigh2.score(X_test2, y_test2), 3)
0.917
Methods
fit(X, y) – Fit the model on each class.
fit_transform(X, y) – Fits and transforms the provided data.
set_output(*[, transform]) – Set output container.
transform(X[, y]) – Transform the provided data using the already fitted transformer.
- fit(X, y)
Fit the model on each class.
It uses X as training data and y as target values.
- Parameters:
- Returns:
self
- Return type:
PerClassTransformer[Input, Output]
- fit_transform(X, y)
Fits and transforms the provided data.
It uses the transformer specified when initializing the class.
- set_output(*, transform=None)
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
transform ({"default", "pandas", "polars"}, default=None) – Configure output of transform and fit_transform.
"default": Default output format of a transformer
"pandas": DataFrame output
"polars": Polars output
None: Transform configuration is unchanged
New in version 1.4: the "polars" option was added.
- Returns:
self – Estimator instance.
- Return type:
estimator instance