Reddit Data as a New Tool and Source for Social Research

Project Description

The use of non-traditional data (i.e., data collected from non-probability sample surveys, passive data, or Big Data) to supplement or replace survey data is growing. However, these data are not without weaknesses; they suffer from their own sources of error, access challenges, and confidentiality concerns. This project uses survey data collected on Reddit and posts scraped from Reddit to answer three research questions:

1) Can social media data be used to accurately assess social attitudes?

2) What are the sources of error in social media data?

3) How much variability in the conclusions drawn from these data is introduced by the researcher’s choice of analytic methods?

In addition to addressing these research questions, the project describes the data and how to access them, so that future Reddit data users can refine their budgets, timelines, and expectations.


  • Achimescu, V. and Chachev, P. D. (2021). Raising the flag: Monitoring user-perceived disinformation on reddit. Information, 12, 4.
  • Amaya, A., Bach, R. L., Keusch, F. and Kreuter, F. (2019). New data sources in social science research: Things to know before working with Reddit data. Social Science Computer Review: SSCORE, 1-10.
  • Amaya, A., Bach, R. L., Kreuter, F. and Keusch, F. (2020). Measuring the strength of attitudes in social media data. In Big data meets survey science: A collection of innovative methods (pp. 163-192). Hoboken, NJ: John Wiley & Sons.