Project Description
The recent development and widespread proliferation of large language models (LLMs), such as OpenAI’s GPT or Meta’s Llama, have spurred discussion about the extent to which these models can be used for research in the social and behavioral sciences, including for augmenting survey data collection and analysis. Research has started to examine whether LLM-generated “synthetic samples” could complement or replace traditional surveys, given that their training data may reflect attitudes and behaviors prevalent in the population. However, several contextual factors concerning the relationship between the target population and the LLM training data might limit such applications. In this project, we investigate the extent to which LLMs can estimate public opinion in countries with different digital, social, political, and linguistic settings. By examining LLM-based predictions of voting behavior in new contexts, our studies contribute to the growing body of research on the conditions under which LLMs can be leveraged for studying public opinion.
Current Application: Predicting the 2024 European Elections with GPT
Contact Person
Publications
- von der Heyde, L., Haensch, A., & Wenz, A. (2024). United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections. https://arxiv.org/abs/2409.09045
- von der Heyde, L., Haensch, A., & Wenz, A. (2023). Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion. https://arxiv.org/abs/2407.08563
- von der Heyde, L., Haensch, A., & Wenz, A. (2023). Assessing Bias in LLM-Generated Synthetic Datasets: The Case of German Voter Behavior. https://doi.org/10.31235/osf.io/97r8s
- Ma, B., Wang, X., Hu, T., Haensch, A.-C., Hedderich, M. A., Plank, B., & Kreuter, F. (2024). The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 8783–8805, Miami, Florida, USA. Association for Computational Linguistics.
- Ma, B., Yoztyurk, B., Haensch, A.-C., Wang, X., Herklotz, M., Kreuter, F., Plank, B., & Assenmacher, M. (2024). Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study. Preprint.