pflm.fpca.FunctionalPCA

class FunctionalPCA(assume_measurement_error: bool = True, num_points_reg_grid: int = 51, mu_cov_params: FunctionalPCAMuCovParams | None = None, user_params: FunctionalPCAUserDefinedParams | None = None, verbose: bool = False)

Bases: BaseEstimator

Functional Principal Component Analysis (FPCA) for functional data.

Parameters:

assume_measurement_error (bool, default=True): Whether to estimate and account for measurement error.
num_points_reg_grid (int, default=51): Number of points in the internal regular grid used for regression/interpolation.
mu_cov_params (FunctionalPCAMuCovParams, default=None): Smoothing and bandwidth selection settings. If None, FunctionalPCAMuCovParams() is used.
user_params (FunctionalPCAUserDefinedParams, default=None): Optional user-specified mean/covariance/sigma2/rho to override parts of the pipeline. If None, FunctionalPCAUserDefinedParams() is used.
verbose (bool, default=False): If True, record and expose timing diagnostics in elapsed_time_.

Attributes:

y_ (List[np.ndarray]): Functional observations (per-sample vectors).
t_ (List[np.ndarray]): Time grids corresponding to y_ (per-sample vectors).
flatten_func_data_ (FlattenFunctionalData): Flattened/validated dataset with fields: y, t, w, tid, unique_tid, inverse_tid_idx, sid, unique_sid, sid_cnt.
raw_cov_ (np.ndarray of shape (M, 5)): Raw covariance entries: (sid, t1, t2, w, cov).
smoothed_model_result_obs_ (SmoothedModelResult): Smoothed mean/covariance on the observation grid.
smoothed_model_result_reg_ (SmoothedModelResult): Smoothed mean/covariance on the regular grid.
fpca_model_params_ (FpcaModelParams): FPCA artifacts: eigen decomposition, selected phi/lambda, fitted covariances, rho, eigenvalue fit, etc.
num_pcs_ (int): Number of principal components.
xi_ (np.ndarray of shape (n_samples, k)): Estimated principal component scores.
xi_var_ (np.ndarray or List[np.ndarray]): Estimated per-sample score variances or covariance summaries (method-dependent).
fitted_y_mat_ (np.ndarray of shape (nt_obs, n_samples)): Fitted values on the observation grid (columns are samples).
fitted_y_ (List[np.ndarray]): Fitted values at observed time points per subject.
elapsed_time_ (Dict[str, float]): Timings (seconds) per pipeline stage.
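
To make the (sid, t1, t2, w, cov) layout of raw_cov_ concrete, here is a minimal NumPy sketch of how such rows could be assembled from per-sample data. It is an illustration only, not the library's code: the helper name raw_covariance, the mean_fn argument, the unit weights, and the choice to skip diagonal pairs are all assumptions.

```python
import numpy as np


def raw_covariance(t_list, y_list, mean_fn, include_diagonal=False):
    """Build rows (sid, t1, t2, w, cov) from per-sample observations.

    Hypothetical helper: cov is the product of mean-centered residuals
    at the two time points, and every pair gets weight 1.0 (assumption).
    """
    rows = []
    for sid, (t, y) in enumerate(zip(t_list, y_list)):
        t = np.asarray(t, dtype=float)
        resid = np.asarray(y, dtype=float) - mean_fn(t)
        for j in range(t.size):
            for k in range(t.size):
                if j == k and not include_diagonal:
                    continue  # diagonal terms carry measurement-error variance
                rows.append((sid, t[j], t[k], 1.0, resid[j] * resid[k]))
    return np.array(rows, dtype=float).reshape(-1, 5)


t_list = [np.array([0.0, 0.5, 1.0]), np.array([0.2, 0.8])]
y_list = [np.array([1.0, 2.0, 3.0]), np.array([0.5, 1.5])]
raw_cov = raw_covariance(t_list, y_list, mean_fn=lambda t: np.zeros_like(t))
print(raw_cov.shape)
```

A sample with a single observation contributes no off-diagonal rows in this sketch, which is why at least two observations per sample are needed to form covariance pairs.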

See also

FunctionalPCAMuCovParams
FunctionalPCAUserDefinedParams

Examples

>>> import numpy as np
>>> from pflm.fpca import FunctionalDataGenerator, FunctionalPCA
>>> t = np.linspace(0.0, 10.0, 51)
>>> gen = FunctionalDataGenerator(
...     t, lambda x: np.sin(x) * 0.5, lambda x: 1.0 + 0.2 * np.cos(x),
... )
>>> y_list, t_list = gen.generate(n=50, seed=42)
>>> fpca = FunctionalPCA().fit(t_list, y_list)
>>> fpca.xi_.shape[0]
50
>>> fpca.num_pcs_
2

fit(t: list[ndarray | list[float]], y: list[ndarray | list[float]], w: list[ndarray | list[float]] | None = None, method_pcs: Literal['IN', 'CE'] = 'CE', method_select_num_pcs: int | Literal['FVE', 'AIC', 'BIC'] = 'FVE', method_rho: Literal['truncated', 'ridge', 'vanilla'] = 'vanilla', max_num_pcs: int | None = None, if_impute_scores: bool = True, if_shrinkage: bool = False, if_fit_eigen_values: bool = False, fve_threshold: float = 0.99, reg_grid: ndarray | list[float] = None) → FunctionalPCA

Fit the FPCA model: mean, covariance, eigen-structure, and scores.

Parameters:

t (list of array-like): Time vectors per sample; each is 1D with shape (n_i,).
y (list of array-like): Observations per sample; each is 1D with shape (n_i,).
w (list of array-like, optional): Per-sample weights; each is 1D with shape (n_i,), or sample-level weights compatible with the flattening utility.
method_pcs ({"IN", "CE"}, default="CE"): Score computation method: numerical integration (IN) or conditional expectation (CE).
method_select_num_pcs (int or {"FVE", "AIC", "BIC"}, default="FVE"): Strategy for selecting the number of PCs, or a fixed integer.
method_rho ({"truncated", "ridge", "vanilla"}, default="vanilla"): Rho strategy for CE (ignored for IN).
max_num_pcs (int, optional): Upper bound on the number of PCs; inferred if None.
if_impute_scores (bool, default=True): Whether to compute scores during fit.
if_shrinkage (bool, default=False): Apply shrinkage for IN scores.
if_fit_eigen_values (bool, default=False): Fit eigenvalues by projection regression.
fve_threshold (float, default=0.99): FVE threshold used when method_select_num_pcs="FVE".
reg_grid (array-like, optional): Regular grid for smoothing; if None, a uniform grid over the observed time range is created.

Returns:

FunctionalPCA: The fitted estimator with smoothed artifacts and scores.

Notes

Requires at least two observations per sample to form covariance pairs.
Timing diagnostics are recorded in elapsed_time_ when verbose=True.
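
The "FVE" selection rule with fve_threshold can be illustrated with a short NumPy sketch: keep the smallest number of components whose cumulative fraction of variance explained reaches the threshold. This is a generic version of the rule, not the library's implementation; the function name select_num_pcs_fve and the decision to normalize by the sum of all positive eigenvalues are assumptions.

```python
import numpy as np


def select_num_pcs_fve(eigenvalues, fve_threshold=0.99, max_num_pcs=None):
    """Return (k, fve) where k is the smallest count reaching the threshold.

    Hypothetical helper: only positive eigenvalues are used, sorted in
    decreasing order, and FVE is the cumulative share of their sum.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    lam = np.sort(lam[lam > 0.0])[::-1]
    if lam.size == 0:
        raise ValueError("need at least one positive eigenvalue")
    fve = np.cumsum(lam) / lam.sum()
    # First index whose cumulative FVE is >= threshold, converted to a count.
    k = int(np.searchsorted(fve, fve_threshold)) + 1
    k = min(k, lam.size)
    if max_num_pcs is not None:
        k = min(k, max_num_pcs)
    return k, fve


k, fve = select_num_pcs_fve([4.0, 2.0, 1.0, 0.5, 0.5], fve_threshold=0.9)
print(k, fve)
```

In this sketch max_num_pcs only caps the result; it does not change how the FVE itself is normalized.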

fit_score(method_pcs: Literal['IN', 'CE'] = 'CE', method_select_num_pcs: int | Literal['FVE', 'AIC', 'BIC'] = 'FVE', method_rho: Literal['truncated', 'ridge', 'vanilla'] = 'vanilla', max_num_pcs: int = 20, if_impute_scores: bool = True, if_shrinkage: bool = False, if_fit_eigen_values: bool = False, fve_threshold: float = 0.99) → tuple[ndarray, list[ndarray], ndarray, list[ndarray]]

Compute principal component scores given smoothed artifacts.

This method assumes fit has been called to produce smoothed mean/covariance and eigen-structure. It then selects the number of components and computes scores using the chosen method.

Parameters:

method_pcs ({"IN", "CE"}, default="CE"): Score computation method: numerical integration (IN) or conditional expectation (CE).
method_select_num_pcs (int or {"FVE", "AIC", "BIC"}, default="FVE"): Strategy for selecting the number of components, or a fixed integer.
method_rho ({"truncated", "ridge", "vanilla"}, default="vanilla"): Rho strategy for CE (ignored for IN).
max_num_pcs (int, default=20): Upper bound on the number of PCs.
if_impute_scores (bool, default=True): Whether to compute scores.
if_shrinkage (bool, default=False): Apply shrinkage for IN scores.
if_fit_eigen_values (bool, default=False): Fit eigenvalues by projection regression.
fve_threshold (float, default=0.99): FVE threshold used when method_select_num_pcs="FVE".

Returns:

xi (np.ndarray of shape (n_samples, k)): Principal component scores.
xi_var (np.ndarray or List[np.ndarray]): Score variance summaries (method-dependent).
fitted_y_mat (np.ndarray of shape (nt_obs, n_samples)): Fitted values on the observation grid.
fitted_y (List[np.ndarray]): Fitted values at observed time points per subject.

Raises:

sklearn.exceptions.NotFittedError: If called before fit.
ValueError: If selection/scoring arguments are invalid.

Notes

CE scoring may estimate rho depending on method_rho.
When if_fit_eigen_values=True, eigenvalue_fit is stored in fpca_model_params_.
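
For intuition about the "IN" option, the score of component k is the integral of (y(t) - mu(t)) * phi_k(t) over the time domain, approximated numerically. The sketch below uses the trapezoidal rule on a dense grid. It is a generic illustration under stated assumptions (the helper name in_scores, a fully observed curve, no shrinkage) and does not reproduce the library's internals.

```python
import numpy as np


def in_scores(t, y, mu, phi):
    """Approximate xi_k = integral of (y - mu) * phi_k dt by the trapezoidal rule.

    Hypothetical helper. t, y, mu have shape (n_t,); phi has shape (n_t, k).
    """
    t = np.asarray(t, dtype=float)
    resid = np.asarray(y, dtype=float) - np.asarray(mu, dtype=float)
    integrand = resid[:, None] * np.asarray(phi, dtype=float)  # (n_t, k)
    dt = np.diff(t)[:, None]                                   # (n_t - 1, 1)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dt, axis=0)


# Two orthonormal eigenfunctions on [0, 1] and a curve with known scores.
t = np.linspace(0.0, 1.0, 201)
phi = np.column_stack([
    np.sqrt(2.0) * np.sin(np.pi * t),
    np.sqrt(2.0) * np.sin(2.0 * np.pi * t),
])
mu = np.zeros_like(t)
y = mu + 1.5 * phi[:, 0] - 0.5 * phi[:, 1]
xi = in_scores(t, y, mu, phi)
print(xi)
```

Integration of this kind relies on densely observed curves; with only a few points per sample the quadrature error grows, which is the usual motivation for the CE alternative.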

fitted_values() → tuple[ndarray, list[ndarray]]

Return fitted functional values on grids.

Returns:

fitted_y_mat (np.ndarray of shape (nt_obs, n_samples)): Fitted values on the observation grid.
fitted_y (List[np.ndarray]): Fitted values at actually observed time points per subject.

Raises:

sklearn.exceptions.NotFittedError: If called before fitting.
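
The two return values can be understood through the truncated expansion mu(t) + sum_k xi_k * phi_k(t). The NumPy sketch below builds a matrix with one column per sample on a common grid and then reads off one sample's values at its own time points. The helper name reconstruct, the toy basis, and the use of linear interpolation via np.interp are assumptions for illustration, not a description of how the library computes these arrays.

```python
import numpy as np


def reconstruct(mu, phi, xi):
    """Return an (n_t, n_samples) matrix of mu + phi @ xi.T (hypothetical helper)."""
    mu = np.asarray(mu, dtype=float)
    phi = np.asarray(phi, dtype=float)  # (n_t, k)
    xi = np.asarray(xi, dtype=float)    # (n_samples, k)
    return mu[:, None] + phi @ xi.T


t = np.linspace(0.0, 1.0, 5)
mu = t.copy()
# Toy basis for illustration only; it is not orthonormalized.
phi = np.column_stack([np.ones_like(t), t - 0.5])
xi = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]])

fitted_mat = reconstruct(mu, phi, xi)  # columns are samples
# Values for sample 1 at its own (assumed) observation times.
t_obs = np.array([0.1, 0.6])
fitted_obs = np.interp(t_obs, t, fitted_mat[:, 1])
print(fitted_mat.shape, fitted_obs)
```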

predict(y, t, w=None, num_pcs=None)

Predict scores and fitted curves for new observations.

Parameters:

y (list of array-like): New observations per sample; each is 1D with shape (n_i,).
t (list of array-like): Matching time vectors per sample; each is 1D with shape (n_i,).
w (list of array-like, optional): Optional weights compatible with the flattening utility.
num_pcs (int, optional): Number of PCs to use; if None, uses the number selected during fit.

Returns:

new_xi (np.ndarray of shape (n_samples, k)): Predicted FPCA scores.
new_xi_var (np.ndarray or List[np.ndarray]): Predicted score variance summaries (method-dependent).
new_fitted_y_mat (np.ndarray of shape (nt_obs, n_samples)): Predicted values on the observation grid.
new_fitted_y (List[np.ndarray]): Predicted values at actually observed time points per subject.

Notes

The method of scoring follows the fitted configuration (method_pcs).
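
When the fitted configuration uses CE, scoring a new curve amounts to a conditional expectation under a Gaussian working model: xi_hat = Lambda Phi_i^T Sigma_i^{-1} (y_i - mu_i), with Sigma_i = Phi_i Lambda Phi_i^T + sigma2 * I. The sketch below implements that textbook formula for a single curve. The helper name ce_scores and its arguments are assumptions, and any adjustment implied by method_rho is deliberately left out.

```python
import numpy as np


def ce_scores(y_i, mu_i, phi_i, lam, sigma2):
    """Conditional-expectation scores and their covariance for one curve.

    Hypothetical helper. phi_i has shape (n_i, k): eigenfunctions evaluated
    at this curve's time points. lam holds the k eigenvalues.
    """
    lam = np.asarray(lam, dtype=float)
    phi_i = np.asarray(phi_i, dtype=float)
    h = phi_i * lam                                      # Phi_i @ diag(lam)
    sigma_i = h @ phi_i.T + sigma2 * np.eye(phi_i.shape[0])
    resid = np.asarray(y_i, dtype=float) - np.asarray(mu_i, dtype=float)
    xi_hat = h.T @ np.linalg.solve(sigma_i, resid)
    # Conditional covariance: Lambda - Lambda Phi^T Sigma^{-1} Phi Lambda
    xi_cov = np.diag(lam) - h.T @ np.linalg.solve(sigma_i, h)
    return xi_hat, xi_cov


t_i = np.linspace(0.0, 1.0, 50)
phi_i = np.column_stack([
    np.sqrt(2.0) * np.sin(np.pi * t_i),
    np.sqrt(2.0) * np.sin(2.0 * np.pi * t_i),
])
mu_i = np.zeros_like(t_i)
y_i = mu_i + 1.5 * phi_i[:, 0] - 0.5 * phi_i[:, 1]  # noise-free toy curve
xi_hat, xi_cov = ce_scores(y_i, mu_i, phi_i, lam=[4.0, 1.0], sigma2=0.01)
print(xi_hat)
```

With many points and small sigma2 the estimate is close to the generating scores; with sparse or noisy data the same formula shrinks the scores toward zero.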

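set_fit_request(*, fve_threshold, if_fit_eigen_values, if_impute_scores, if_shrinkage, max_num_pcs, method_pcs, method_rho, method_select_num_pcs, reg_grid, t, w)

Configure whether metadata should be requested to be passed to the fit method.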

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

fve_threshold (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for fve_threshold parameter in fit.
if_fit_eigen_values (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for if_fit_eigen_values parameter in fit.
if_impute_scores (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for if_impute_scores parameter in fit.
if_shrinkage (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for if_shrinkage parameter in fit.
max_num_pcs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for max_num_pcs parameter in fit.
method_pcs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for method_pcs parameter in fit.
method_rho (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for method_rho parameter in fit.
method_select_num_pcs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for method_select_num_pcs parameter in fit.
reg_grid (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for reg_grid parameter in fit.
t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for t parameter in fit.
w (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for w parameter in fit.

Returns:

self (object): The updated object.

set_predict_request(*, num_pcs, t, w)

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

num_pcs (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for num_pcs parameter in predict.
t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for t parameter in predict.
w (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED): Metadata routing for w parameter in predict.

Returns:

self (object): The updated object.