pflm.fpca.FunctionalPCA

class FunctionalPCA(assume_measurement_error: bool = True, num_points_reg_grid: int = 51, mu_cov_params: FunctionalPCAMuCovParams | None = None, user_params: FunctionalPCAUserDefinedParams | None = None, verbose: bool = False)

Bases: BaseEstimator

Functional Principal Component Analysis (FPCA) for functional data.

Parameters:
assume_measurement_error : bool, default=True

Whether to estimate and account for measurement error.

num_points_reg_grid : int, default=51

Number of points in the internal regular grid used for regression/interpolation.

mu_cov_params : FunctionalPCAMuCovParams or None, default=None

Smoothing and bandwidth selection settings. If None, FunctionalPCAMuCovParams() is used.

user_params : FunctionalPCAUserDefinedParams or None, default=None

Optional user-specified mean, covariance, sigma2, or rho that override the corresponding parts of the pipeline. If None, FunctionalPCAUserDefinedParams() is used.

verbose : bool, default=False

If True, record timing diagnostics and expose them in elapsed_time_.

Attributes:
y_ : List[np.ndarray]

Functional observations (one vector per sample).

t_ : List[np.ndarray]

Time points corresponding to y_ (one vector per sample).

flatten_func_data_ : FlattenFunctionalData

Flattened and validated dataset with fields y, t, w, tid, unique_tid, inverse_tid_idx, sid, unique_sid, and sid_cnt.

raw_cov_ : np.ndarray of shape (M, 5)

Raw covariance entries; each row is (sid, t1, t2, w, cov).

smoothed_model_result_obs_ : SmoothedModelResult

Smoothed mean and covariance on the observation grid.

smoothed_model_result_reg_ : SmoothedModelResult

Smoothed mean and covariance on the regular grid.

fpca_model_params_ : FpcaModelParams

FPCA artifacts: eigendecomposition, selected phi and lambda, fitted covariances, rho, eigenvalue fit, and related quantities.

num_pcs_ : int

Number of selected principal components.

xi_ : np.ndarray of shape (n_samples, k)

Estimated principal component scores, where k is the number of components (num_pcs_).

xi_var_ : np.ndarray or List[np.ndarray]

Estimated per-sample score variances or covariance summaries (method-dependent).

fitted_y_mat_ : np.ndarray of shape (nt_obs, n_samples)

Fitted values on the observation grid (columns are samples).

fitted_y_ : List[np.ndarray]

Fitted values at the observed time points of each sample.

elapsed_time_ : Dict[str, float]

Timings (in seconds) per pipeline stage.

Examples

>>> import numpy as np
>>> from pflm.fpca import FunctionalDataGenerator, FunctionalPCA
>>> t = np.linspace(0.0, 10.0, 51)
>>> gen = FunctionalDataGenerator(
...     t, lambda x: np.sin(x) * 0.5, lambda x: 1.0 + 0.2 * np.cos(x),
... )
>>> y_list, t_list = gen.generate(n=50, seed=42)
>>> fpca = FunctionalPCA().fit(t_list, y_list)
>>> fpca.xi_.shape[0]
50
>>> fpca.num_pcs_
2
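The flatten_func_data_ attribute is described above as a long-format view of the ragged input, with fields such as sid, tid, unique_tid, and sid_cnt. The following is a minimal sketch of what such a flattening could look like. The helper name flatten_ragged and the exact meaning assigned to each field are assumptions made for illustration; they are not part of pflm, and the real FlattenFunctionalData layout may differ.

```python
from collections import Counter


def flatten_ragged(t_list, y_list):
    """Flatten per-sample (t, y) vectors into long-format columns.

    Hypothetical helper for illustration only; not part of pflm.
    """
    if len(t_list) != len(y_list):
        raise ValueError("t_list and y_list must contain the same number of samples")
    t_flat, y_flat, sid = [], [], []
    for i, (t_i, y_i) in enumerate(zip(t_list, y_list)):
        if len(t_i) != len(y_i):
            raise ValueError(f"sample {i}: t and y have different lengths")
        t_flat.extend(t_i)
        y_flat.extend(y_i)
        sid.extend([i] * len(t_i))  # sample id of every observation
    unique_tid = sorted(set(t_flat))  # pooled grid of distinct time points
    position = {t: k for k, t in enumerate(unique_tid)}
    tid = [position[t] for t in t_flat]  # index of each observation in unique_tid
    counts = Counter(sid)
    sid_cnt = [counts[i] for i in range(len(t_list))]  # observations per sample
    return {
        "y": y_flat, "t": t_flat, "sid": sid, "tid": tid,
        "unique_tid": unique_tid, "sid_cnt": sid_cnt,
    }


flat = flatten_ragged([[0.0, 0.5, 1.0], [0.5, 1.0]], [[1.0, 2.0, 3.0], [4.0, 5.0]])
print(flat["sid"])  # [0, 0, 0, 1, 1]
print(flat["tid"])  # [0, 1, 2, 1, 2]
```

Keeping an integer index into the pooled grid (tid) alongside the sample index (sid) makes it cheap to group observations either by time point or by subject.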
fit(t: list[ndarray | list[float]], y: list[ndarray | list[float]], w: list[ndarray | list[float]] | None = None, method_pcs: Literal['IN', 'CE'] = 'CE', method_select_num_pcs: int | Literal['FVE', 'AIC', 'BIC'] = 'FVE', method_rho: Literal['truncated', 'ridge', 'vanilla'] = 'vanilla', max_num_pcs: int | None = None, if_impute_scores: bool = True, if_shrinkage: bool = False, if_fit_eigen_values: bool = False, fve_threshold: float = 0.99, reg_grid: ndarray | list[float] = None) -> FunctionalPCA

Fit the FPCA model: mean, covariance, eigen-structure, and scores.

Parameters:
t : list of array-like

Time vectors, one per sample; each is 1D with shape (n_i,).

y : list of array-like

Observations, one vector per sample; each is 1D with shape (n_i,).

w : list of array-like, optional

Per-sample weights; each is either 1D with shape (n_i,) or a sample-level weight compatible with the flattening utility.

method_pcs : {"IN", "CE"}, default="CE"

Score computation method: numerical integration (IN) or conditional expectation (CE).

method_select_num_pcs : int or {"FVE", "AIC", "BIC"}, default="FVE"

Strategy for selecting the number of PCs, or a fixed integer count.

method_rho : {"truncated", "ridge", "vanilla"}, default="vanilla"

Rho strategy for CE (ignored for IN).

max_num_pcs : int, optional

Upper bound on the number of PCs; inferred if None.

if_impute_scores : bool, default=True

Whether to compute scores during fit.

if_shrinkage : bool, default=False

Whether to apply shrinkage to IN scores.

if_fit_eigen_values : bool, default=False

Whether to fit eigenvalues by projection regression.

fve_threshold : float, default=0.99

FVE threshold used when method_select_num_pcs="FVE".

reg_grid : array-like, optional

Regular grid for smoothing; if None, a uniform grid over the observed range is created.

Returns:
FunctionalPCA

The fitted estimator with smoothed artifacts and scores.

Notes

  • Requires at least two observations per sample to form covariance pairs.

  • Timing diagnostics are recorded in elapsed_time_ when verbose=True.
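The note about needing at least two observations per sample follows from how raw covariance pairs are formed. The sketch below builds rows in the (sid, t1, t2, w, cov) layout listed for raw_cov_. It assumes that each row is a product of mean-centered residuals at two distinct time points of the same sample, that diagonal pairs are skipped, and that every pair gets unit weight. The function name raw_covariance_rows and these choices are illustrative assumptions, not the library's documented behavior.

```python
def raw_covariance_rows(t_list, y_list, mean_fn):
    """Build (sid, t1, t2, w, cov) rows from within-sample residual products.

    Hypothetical helper for illustration only; not part of pflm.
    """
    rows = []
    for sid, (t_i, y_i) in enumerate(zip(t_list, y_list)):
        resid = [y - mean_fn(t) for t, y in zip(t_i, y_i)]
        for j in range(len(t_i)):
            for l in range(len(t_i)):
                if j == l:
                    # Assumption: skip the diagonal, where a residual product
                    # would also include the measurement-error variance.
                    continue
                rows.append((sid, t_i[j], t_i[l], 1.0, resid[j] * resid[l]))
    return rows


# Sample 0 has two observations; sample 1 has one and contributes no pairs.
rows = raw_covariance_rows(
    [[0.0, 1.0], [0.5]],
    [[2.0, 4.0], [3.0]],
    mean_fn=lambda t: 1.0,
)
for row in rows:
    print(row)
```

Under these assumptions a sample with a single observation yields no off-diagonal pair, so it carries no information about the covariance surface.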

fit_score(method_pcs: Literal['IN', 'CE'] = 'CE', method_select_num_pcs: int | Literal['FVE', 'AIC', 'BIC'] = 'FVE', method_rho: Literal['truncated', 'ridge', 'vanilla'] = 'vanilla', max_num_pcs: int = 20, if_impute_scores: bool = True, if_shrinkage: bool = False, if_fit_eigen_values: bool = False, fve_threshold: float = 0.99) -> tuple[ndarray, list[ndarray], ndarray, list[ndarray]]

Compute principal component scores given smoothed artifacts.

This method assumes that fit has already been called to produce the smoothed mean, covariance, and eigenstructure. It then selects the number of components and computes scores using the chosen method.

Parameters:
method_pcs : {"IN", "CE"}, default="CE"

Score computation method: numerical integration (IN) or conditional expectation (CE).

method_select_num_pcs : int or {"FVE", "AIC", "BIC"}, default="FVE"

Strategy for selecting the number of components, or a fixed integer count.

method_rho : {"truncated", "ridge", "vanilla"}, default="vanilla"

Rho strategy for CE (ignored for IN).

max_num_pcs : int, default=20

Upper bound on the number of PCs.

if_impute_scores : bool, default=True

Whether to compute scores.

if_shrinkage : bool, default=False

Whether to apply shrinkage to IN scores.

if_fit_eigen_values : bool, default=False

Whether to fit eigenvalues by projection regression.

fve_threshold : float, default=0.99

FVE threshold used when method_select_num_pcs="FVE".

Returns:
xi : np.ndarray of shape (n_samples, k)

Principal component scores.

xi_var : np.ndarray or List[np.ndarray]

Score variance summaries (method-dependent).

fitted_y_mat : np.ndarray of shape (nt_obs, n_samples)

Fitted values on the observation grid.

fitted_y : List[np.ndarray]

Fitted values at the observed time points of each sample.

Raises:
sklearn.exceptions.NotFittedError

If called before fit.

ValueError

If selection/scoring arguments are invalid.

Notes

  • CE scoring may estimate rho depending on method_rho.

  • When if_fit_eigen_values=True, eigenvalue_fit is stored in fpca_model_params_.
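To make the two steps of fit_score concrete, here is a small sketch of FVE-based selection followed by numerical-integration (IN) scoring. It assumes that FVE means the cumulative fraction of the total positive eigenvalue mass, and that IN scores are trapezoidal approximations of the integral of (y - mu) * phi_k over a dense common grid. The function names and the toy eigenfunctions are hypothetical, and the library's own implementation may differ in details such as weighting or shrinkage.

```python
import math


def select_num_pcs_fve(eigenvalues, fve_threshold=0.99):
    """Return the smallest k whose leading eigenvalues reach the FVE threshold.

    Assumes eigenvalues are sorted in decreasing order; non-positive
    values are dropped before the fractions are computed.
    """
    positive = [lam for lam in eigenvalues if lam > 0]
    if not positive:
        raise ValueError("need at least one positive eigenvalue")
    total = sum(positive)
    running = 0.0
    for k, lam in enumerate(positive, start=1):
        running += lam
        if running / total >= fve_threshold:
            return k
    return len(positive)


def in_scores(t, y, mu, phis):
    """Trapezoidal approximation of xi_k = integral of (y - mu) * phi_k dt."""
    resid = [yi - mi for yi, mi in zip(y, mu)]
    scores = []
    for phi in phis:
        g = [r * p for r, p in zip(resid, phi)]
        area = sum(
            0.5 * (g[j] + g[j + 1]) * (t[j + 1] - t[j]) for j in range(len(t) - 1)
        )
        scores.append(area)
    return scores


# Toy setup: two orthonormal eigenfunctions on [0, 1] and a dense grid.
t = [j / 100 for j in range(101)]
phi1 = [math.sqrt(2) * math.sin(math.pi * s) for s in t]
phi2 = [math.sqrt(2) * math.sin(2 * math.pi * s) for s in t]
mu = list(t)  # mean function mu(t) = t
y = [m + 2.0 * a - 0.5 * b for m, a, b in zip(mu, phi1, phi2)]

k = select_num_pcs_fve([4.0, 1.0, 0.04, 0.01], fve_threshold=0.99)
scores = in_scores(t, y, mu, [phi1, phi2][:k])
print(k)                              # 2
print([round(s, 4) for s in scores])  # approximately [2.0, -0.5]
```

Integration-based scores of this kind rely on densely observed curves; with only a few points per sample the trapezoidal sum becomes unreliable, which is the usual motivation for the CE alternative.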

fitted_values() -> tuple[ndarray, list[ndarray]]

Return fitted functional values on grids.

Returns:
fitted_y_mat : np.ndarray of shape (nt_obs, n_samples)

Fitted values on the observation grid.

fitted_y : List[np.ndarray]

Fitted values at the observed time points of each sample.

Raises:
sklearn.exceptions.NotFittedError

If called before fitting.
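Fitted values in FPCA are conventionally the truncated Karhunen-Loeve reconstruction, that is, the mean plus the score-weighted sum of eigenfunctions. The sketch below shows that formula on a common grid and stacks the curves so that columns are samples, matching the documented (nt_obs, n_samples) layout. The helper name reconstruct_curve is hypothetical, and the sketch assumes the library follows this standard formula.

```python
def reconstruct_curve(mu, phis, xi):
    """Evaluate mu(t_j) + sum_k xi[k] * phis[k][j] on a common grid."""
    if len(phis) != len(xi):
        raise ValueError("need exactly one score per eigenfunction")
    return [
        m + sum(x * phi[j] for x, phi in zip(xi, phis))
        for j, m in enumerate(mu)
    ]


mu = [1.0, 1.0, 1.0]
phis = [[1.0, 0.0, -1.0], [0.5, 1.0, 0.5]]  # two toy eigenfunctions on 3 points
xi_all = [[2.0, -1.0], [0.0, 1.0]]          # scores for two samples

curves = [reconstruct_curve(mu, phis, xi) for xi in xi_all]
# Transpose so rows are time points and columns are samples.
fitted_mat = [list(row) for row in zip(*curves)]
print(curves[0])   # [2.5, 0.0, -1.5]
print(fitted_mat)  # [[2.5, 1.5], [0.0, 2.0], [-1.5, 1.5]]
```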

predict(y: list[ndarray | list[float]], t: list[ndarray | list[float]], w: list[ndarray | list[float]] | None = None, num_pcs: int | None = None) -> tuple[ndarray, list[ndarray], ndarray, list[ndarray]]

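Predict scores and fitted curves for new observations. Note that the positional order here is (y, t), the reverse of fit, which takes (t, y); passing both as keyword arguments avoids swapping them by accident.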

Parameters:
y : list of array-like

New observations, one vector per sample; each is 1D with shape (n_i,).

t : list of array-like

Matching time vectors, one per sample; each is 1D with shape (n_i,).

w : list of array-like, optional

Optional weights compatible with the flattening utility.

num_pcs : int, optional

Number of PCs to use; if None, the number selected during fit is used.

Returns:
new_xi : np.ndarray of shape (n_samples, k)

Predicted FPCA scores.

new_xi_var : np.ndarray or List[np.ndarray]

Predicted score variance summaries (method-dependent).

new_fitted_y_mat : np.ndarray of shape (nt_obs, n_samples)

Predicted values on the observation grid.

new_fitted_y : List[np.ndarray]

Predicted values at the observed time points of each sample.

Notes

  • The method of scoring follows the fitted configuration (method_pcs).
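When the fitted configuration uses CE, scores for a new curve are commonly obtained from the standard conditional-expectation formula, E[xi | y] = Lambda Phi^T (Phi Lambda Phi^T + sigma2 I)^{-1} (y - mu). The sketch below works this out for a single component, where the matrix inverse has a closed form. It is written under that textbook assumption only: the function name ce_score_one_pc is hypothetical, and how pflm regularizes the error variance through method_rho is not covered here.

```python
def ce_score_one_pc(y, mu, phi, lam, sigma2):
    """Conditional-expectation score for one component.

    Closed form of lam * phi^T (lam * phi phi^T + sigma2 * I)^{-1} (y - mu),
    where y, mu, and phi are evaluated at the sample's observed time points.
    """
    proj = sum(p * (yi - mi) for p, yi, mi in zip(phi, y, mu))
    energy = sum(p * p for p in phi)
    denom = lam * energy + sigma2
    if denom <= 0:
        raise ValueError("lam * sum(phi^2) + sigma2 must be positive")
    return lam * proj / denom


# A sparse new curve observed at just two time points.
y_new = [2.0, 2.0]
mu_new = [0.0, 0.0]
phi_new = [1.0, 1.0]

shrunk = ce_score_one_pc(y_new, mu_new, phi_new, lam=1.0, sigma2=2.0)
nearly_ls = ce_score_one_pc(y_new, mu_new, phi_new, lam=1.0, sigma2=1e-9)
print(shrunk)               # 1.0
print(round(nearly_ls, 6))  # 2.0
```

With noticeable measurement error the score is shrunk toward zero, and as sigma2 goes to zero it approaches the least-squares projection. This is why CE remains usable for samples with very few observations.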

set_fit_request(*, fve_threshold: bool | None | str = '$UNCHANGED$', if_fit_eigen_values: bool | None | str = '$UNCHANGED$', if_impute_scores: bool | None | str = '$UNCHANGED$', if_shrinkage: bool | None | str = '$UNCHANGED$', max_num_pcs: bool | None | str = '$UNCHANGED$', method_pcs: bool | None | str = '$UNCHANGED$', method_rho: bool | None | str = '$UNCHANGED$', method_select_num_pcs: bool | None | str = '$UNCHANGED$', reg_grid: bool | None | str = '$UNCHANGED$', t: bool | None | str = '$UNCHANGED$', w: bool | None | str = '$UNCHANGED$') -> FunctionalPCA

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:
fve_threshold : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for fve_threshold parameter in fit.

if_fit_eigen_values : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for if_fit_eigen_values parameter in fit.

if_impute_scores : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for if_impute_scores parameter in fit.

if_shrinkage : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for if_shrinkage parameter in fit.

max_num_pcs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for max_num_pcs parameter in fit.

method_pcs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for method_pcs parameter in fit.

method_rho : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for method_rho parameter in fit.

method_select_num_pcs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for method_select_num_pcs parameter in fit.

reg_grid : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for reg_grid parameter in fit.

t : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for t parameter in fit.

w : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for w parameter in fit.

Returns:
self : object

The updated object.

set_predict_request(*, num_pcs: bool | None | str = '$UNCHANGED$', t: bool | None | str = '$UNCHANGED$', w: bool | None | str = '$UNCHANGED$') -> FunctionalPCA

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:
num_pcs : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for num_pcs parameter in predict.

t : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for t parameter in predict.

w : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for w parameter in predict.

Returns:
self : object

The updated object.