Project Description
The recent development and widespread proliferation of large language models (LLMs), such as OpenAI’s GPT or Meta’s Llama, have spurred discussions about the extent to which these models can be used for research in the social and behavioral sciences, including for augmenting survey data collection and analysis. Research has started to examine to what extent LLM-generated “synthetic samples” could complement or replace traditional surveys, given that the models’ training data potentially reflect attitudes and behaviors prevalent in the population. However, several contextual factors related to the relationship between the respective target population and the LLM training data might limit such applications. In this project, we investigate the extent to which LLMs can estimate public opinion in countries with different digital, social, political, and linguistic settings. By examining LLM-based predictions of voting behavior in new contexts, our studies contribute to the growing body of research on the conditions under which LLMs can be leveraged for studying public opinion.
Current Application: Predicting the 2024 European Elections with GPT
Contact Person
Publications
- von der Heyde, L., Haensch, A., & Wenz, A. (2023). Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion. https://arxiv.org/abs/2407.08563
- von der Heyde, L., Haensch, A., & Wenz, A. (2023). Assessing Bias in LLM-Generated Synthetic Datasets: The Case of German Voter Behavior. https://doi.org/10.31235/osf.io/97r8s
- von der Heyde, L., Haensch, A., & Wenz, A. (2024). United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections. https://arxiv.org/abs/2409.09045