
ePCA: Exponential family PCA
Description
**Faculty Candidate - Joint Search with MIT Institute for Data, Systems, and Society (IDSS)**
Many applications, such as photon-limited imaging, neuroscience, and genomics, involve large datasets with entries from exponential family distributions. It is of interest to estimate the covariance structure and principal components of the noiseless distribution. Principal Component Analysis (PCA), the standard method for this setting, can be inefficient for non-Gaussian noise. In this talk we present ePCA, a methodology for PCA on exponential family distributions. ePCA involves the eigendecomposition of a new covariance matrix estimator, constructed in a deterministic non-iterative way using moment calculations, shrinkage, and random matrix theory. We provide several theoretical justifications for our estimator, including the Marchenko-Pastur law in high dimensions. We illustrate ePCA by denoising single-molecule diffraction maps obtained using photon-limited X-ray free electron laser (XFEL) imaging. This is joint work with Lydia T. Liu and Amit Singer.
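As a rough illustration of the moment-calculation step mentioned above, the sketch below handles the Poisson case only. It relies on the law of total covariance: if the counts Y are Poisson with random intensity X, then Cov(Y) = Cov(X) + diag(E[X]), so subtracting the sample mean from the diagonal of the sample covariance gives a debiased estimate of Cov(X). The function name `poisson_debiased_covariance` and the simulated low-rank data are assumptions made for this example. The shrinkage and random-matrix-theory steps of ePCA are not shown, and this is not the authors' implementation.

```python
import numpy as np

def poisson_debiased_covariance(Y):
    """Hypothetical helper: debiased covariance for Poisson counts.

    Y has shape (n_samples, n_features). Because Cov(Y) = Cov(X) + diag(E[X])
    for Y | X ~ Poisson(X), the sample mean is subtracted from the diagonal.
    """
    mean = Y.mean(axis=0)
    S = np.cov(Y, rowvar=False)      # sample covariance of the noisy counts
    return S - np.diag(mean)         # remove the Poisson noise variance

# Simulated data (an assumption for illustration): low-rank intensities plus a baseline.
rng = np.random.default_rng(0)
n, p, r = 20000, 10, 2
U = rng.uniform(0.0, 1.0, size=(p, r))
Z = rng.uniform(0.0, 2.0, size=(n, r))
X = 5.0 + Z @ U.T                    # noiseless, strictly positive intensities
Y = rng.poisson(X)                   # observed photon-count-like data

S = np.cov(Y, rowvar=False)          # what standard PCA would decompose
S_d = poisson_debiased_covariance(Y)
true_cov = np.cov(X, rowvar=False)

# Eigendecomposition of the debiased estimator; eigh returns ascending eigenvalues.
evals, evecs = np.linalg.eigh(S_d)
top_components = evecs[:, ::-1][:, :r]

err_raw = np.linalg.norm(S - true_cov)
err_debiased = np.linalg.norm(S_d - true_cov)
print(f"raw error: {err_raw:.3f}, debiased error: {err_debiased:.3f}")
```

In this simulation the debiased matrix should sit much closer to the covariance of the noiseless intensities than the raw sample covariance does, which is the motivation for decomposing a corrected estimator rather than the sample covariance itself.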