Machine Learning Approaches to Latent Variable Modeling

Project Description

Multi-item batteries are frequently used in social scientific surveys to examine latent traits and to scale individuals on a single construct. Individual latent variable scores based on observed item responses are used for psychopathological diagnoses as well as for the assessment of abilities and personality in occupational and educational settings. If these latent traits are to be meaningfully used in substantive analyses, one must assume measurement invariance. However, especially in the context of large-scale surveys, the measurement invariance assumption rarely holds because of the heterogeneous nature of the survey samples. Measurement non-invariance is also referred to as differential item functioning (DIF). Data-driven, algorithmic approaches make it possible to detect subgroups exhibiting DIF even when little theoretical guidance on the relevant subgroups is available.

We propose and compare model-based recursive partitioning (MOB) techniques for detecting DIF, with a focus on measurement models with multiple latent variables. Such models may be referred to as multidimensional graded response (MGR) models. Additionally, we propose a method we call latent variable forest (LV Forest) for estimating unbiased latent variable scores. LV Forest is useful for latent variable score estimation especially when the assumed latent variable model does not fit the data and/or includes parameter estimates that are unstable with respect to construct-irrelevant covariates. Furthermore, we propose a method to efficiently estimate parameter instability in MGR models, which drastically reduces the computation time of MOB for MGR models.
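To illustrate the core idea behind model-based recursive partitioning, the toy sketch below searches for the split of a partitioning covariate that most improves a node's fitted model. It is a simplified exhaustive-search variant with an ordinary least-squares model standing in for the MGR measurement model, and all names (`fit_node`, `mob_split`) are hypothetical; the actual project uses score-based parameter instability tests on graded response models, which are not reproduced here.

```python
import numpy as np

def fit_node(X, y):
    # Fit the node model by least squares; this stands in for fitting
    # the measurement model (e.g., an MGR model) within one tree node.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)  # parameters, sum of squared residuals

def mob_split(X, y, z, min_size=20):
    """Search candidate cut points of the partitioning covariate z and
    return the split that most reduces the node objective, i.e. the
    split across which the model parameters appear most unstable."""
    _, sse_parent = fit_node(X, y)
    best = None
    for cut in np.unique(z)[1:]:
        left = z < cut
        if left.sum() < min_size or (~left).sum() < min_size:
            continue  # enforce a minimum subgroup size
        _, sse_left = fit_node(X[left], y[left])
        _, sse_right = fit_node(X[~left], y[~left])
        gain = sse_parent - (sse_left + sse_right)
        if best is None or gain > best[1]:
            best = (float(cut), float(gain))
    return best  # (cut point, objective improvement), or None

# Simulated example: the slope changes at z = 0, i.e. the model
# parameters are non-invariant across the covariate z.
rng = np.random.default_rng(0)
n = 400
z = rng.uniform(-1, 1, n)
x = rng.normal(size=n)
slope = np.where(z < 0, 1.0, 3.0)
y = slope * x + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])

cut, gain = mob_split(X, y, z)
```

In the full MOB algorithm this split search is replaced by structural-change tests on the model's score contributions, the covariate with the most significant instability is selected, and the procedure recurses on the resulting subgroups until no significant instability remains.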

Contact Person

Prof. Dr. Christoph Kern

Publications

  • Classe, F. L., & Steyer, R. (2023). A probit multistate IRT model with latent item effect variables for graded responses. European Journal of Psychological Assessment. Advance online publication. https://doi.org/10.1027/1015-5759/a000751