Biostatistics Weekly Seminar


The Paradoxes, Perplexities, and Power of Factor Analysis

Tyler J. VanderWeele, PhD
John L. Loeb and Frances Lehman Loeb Professor of Epidemiology
Departments of Epidemiology and Biostatistics
Harvard T.H. Chan School of Public Health

Factor analysis is often employed to evaluate the extent to which a single factor suffices to explain the variation in the individual indicators, or alternatively to identify clusters of indicators that are strongly correlated with one another. However, the conclusions drawn from factor analysis often extend beyond what statistical analyses have in fact established. Often the resulting factors are each interpreted as corresponding to a structural univariate latent variable that is itself causally efficacious. I show that this assumption is in fact so strong that it has empirically testable implications, even though the supposed latent variable is unobserved; statistical tests are proposed that can often reject this underlying assumption. Factor analysis also suffers from the inability to distinguish between associations arising from causal versus conceptual relations, and if two supposed factors were to causally affect one another then, in many settings, over time, the process will converge to a factor model wherein only a single factor can be detected if one uses a single wave of data. Factor analysis further suffers from the problem that if different indicators are used to assess different portions of the distribution of an underlying univariate latent variable (as might arise from the use of negatively worded items in surveys), then factor analysis can suggest that two factors are present even though the data are in fact generated by only one. Examples of each of these phenomena are given from the psychology and biomedical literature concerning causal relations between depression and anxiety, differential associations with mortality of various indicators of life satisfaction, and supposedly different factors corresponding to optimism and pessimism.
Despite these severe limitations, factor analyses, perhaps paradoxically, can nevertheless often be very informative, but the phenomena above require an appropriate reinterpretation of factor analysis results as reflecting a combination of causal, conceptual, and distributional relations.


Virtual: Zoom Link to Follow
7 February 2024
1:30pm


Speaker Itinerary

Topic revision: r2 - 23 Jan 2024, CierraStreeter
 
