Theses
We are always looking for motivated students who are interested in writing about a topic connected to our current research projects!
To ensure effective supervision, please prepare a short exposé (2 pages) and submit it to the assigned supervisor. The exposé should outline your chosen thesis topic(s), your motivation, a tentative structure for your work, and your desired submission deadline. Describe your current familiarity with the topic and any preliminary progress you have made (e.g., the literature you have read so far). Please also include your current transcript of records.
TikTok has become a central platform for public discourse, shaping political campaigns, social movements, and cultural trends. In this thesis, you will analyze TikTok data to understand how content spreads, influences public opinion, or mobilizes communities. The thesis can focus on content analysis (e.g., text and video data), interaction patterns (e.g., likes, comments, shares), or sentiment analysis to understand audience engagement.
Please note that accessing TikTok data requires submitting a data access application, which should be filed one month before the planned start of your thesis.
If interested, please email anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
Financial regulators and central banks are increasingly integrating sustainability aspects into their operations. The Corporate Sustainability Reporting Directive (CSRD) will require approximately 50,000 European companies to publish sustainability reports in the future, creating a rich source of data for statistical analysis.
One particular challenge is that companies communicate their sustainability information through unstructured PDF reports that contain both numerical and textual data. To make this information amenable to quantitative research, GIST applies Natural Language Processing (NLP) and Large Language Models (LLMs) for data extraction.
Possible tasks include:
If you are interested, please contact malte.schierholz@stat.uni-muenchen.de.
Are you interested in how AI can enhance democratic processes? This thesis offers a unique opportunity to explore whether and how Large Language Models (LLMs) can make gathering public opinion more actionable for policymakers without misrepresenting subgroup views.
The project builds on publicly accessible data from the EU’s policy feedback platform to fine-tune LLMs on current issues, and quantitatively assess how well they represent public discourse overall and for population subgroups. You will also explore privacy considerations and the user experience of policymakers working with such a tool.
This project builds on the chair's expertise in trustworthy ML, LLM alignment and LLM-assisted surveys. We seek motivated students with an interest in public policy and qualitative data analysis. Prior experience working with LLMs through APIs is helpful, but can be developed during the project.
Interested? Then reach out with your CV/transcript to mail@marcusnovotny.com & christoph.kern@stat.uni-muenchen.de for a first meeting. Looking forward to hearing from you!
This thesis will explore the potential of Data Science for assessing the sustainability of land management and urban planning. The project deals with the implementation of regional or national guidelines for climate change adaptation and climate protection in urban planning. The central question is how Data Science can be used to evaluate political measures and administrative actions. The tasks are:
Dataset: The data comprise planning documents, which contain detailed information on environmental conditions, environmental risks, and necessary measures at the building level, as well as precise data on building type, height, and density, together with a dataset of regional plans that set out requirements for urban planning. Flood maps, climate risk maps, and similar sources, as well as court cases and lawsuits related to the plans, can also be accessed. Federal states and regions covered: NRW, Bavaria, and the Rhine-Main-Neckar region.
A specific research question can be developed in collaboration with the research team. The work requires an independent approach, an interest in interdisciplinary work, basic knowledge of sustainability and climate change topics, and good knowledge of German. A student assistant position can be offered in connection with the Master's thesis. If interested, please send an email with a CV and a short cover letter to felicitas.sommer@tum.de and bolei.ma@lmu.de.
Training machine learning (ML) models relies on annotated (also called labeled) training data. Large Language Models (LLMs) offer great potential for data annotation. However, human annotations are likely still needed for difficult or ambiguous instances. A reasonable collaboration setup between LLM and human annotators could assign the easier instances to the LLM and the more complicated ones to humans. This approach could reduce annotation costs and let the human annotators focus on the more ambiguous cases. Best practices for allocating annotation tasks between humans and LLMs, in particular for subjective tasks, are yet to be developed. In this thesis, you could develop and test algorithms for allocating tasks between the two types of annotators and study their impact on quality and cost. One indicator for routing an instance to the human (possibly expert) annotator could be, for example, the LLM's self-assessment of its certainty. If interested, please email your CV and a brief explanation of interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.
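To make the allocation idea concrete, the following is a minimal sketch of one possible routing rule, not a recommended or established method. It assumes that each LLM annotation comes with a self-reported confidence score in [0, 1]; the `Annotation` class, the 0.8 threshold, and the per-item costs are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    text: str
    llm_label: str
    llm_confidence: float  # hypothetical self-assessed certainty in [0, 1]


def route(annotations, threshold=0.8):
    """Keep confident LLM labels; send the remaining items to human annotators."""
    keep_llm, send_to_human = [], []
    for item in annotations:
        if item.llm_confidence >= threshold:
            keep_llm.append(item)
        else:
            send_to_human.append(item)
    return keep_llm, send_to_human


def total_cost(n_total, n_human, llm_cost=0.01, human_cost=0.50):
    """Cost under assumed per-item prices: the LLM labels every item,
    and humans additionally label the routed ones."""
    return n_total * llm_cost + n_human * human_cost


# Toy batch with made-up texts, labels, and confidence scores.
batch = [
    Annotation("Great game last night!", "not offensive", 0.95),
    Annotation("You people are all the same.", "offensive", 0.55),
    Annotation("That referee was a joke.", "not offensive", 0.80),
]

keep_llm, send_to_human = route(batch, threshold=0.8)
cost = total_cost(len(batch), len(send_to_human))
print(len(keep_llm), "kept from LLM;", len(send_to_human), "routed to humans")
```

Varying `threshold` trades annotation cost against the share of instances reviewed by humans; a thesis could compare such a simple rule with other routing indicators and measure the effect on label quality.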
The way in which an annotation task is structured affects the annotations that human annotators provide, a phenomenon called annotation sensitivity. For example, the order in which annotations are collected and the number of screens used can change whether tweets are annotated as containing hate speech or offensive language (https://dl.acm.org/doi/10.1007/978-3-031-21707-4_19; https://aclanthology.org/2024.uncertainlp-1.8/). With the growing use of LLMs as annotators, we wonder whether LLMs also show annotation sensitivity. Since LLMs are built on data produced by humans, the models might inherit similar biases. In this thesis, you could replicate findings from the above studies with LLM annotators. If interested, please email your CV and a brief explanation of interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.
We are seeking a motivated Master's student for a methodological thesis project that extends substantive model compatible fully conditional specification (SMC-FCS) multiple imputation (MI) in discrete-time survival analysis (DTSA) to accommodate time-varying variables. Our existing work has successfully extended SMC-FCS MI for time-invariant covariates; this project will tackle the additional complexities introduced by time-varying variables. The successful candidate will conduct comprehensive Monte Carlo simulations to evaluate the extended methodology and contribute to refining the practice of discrete-time survival analysis in the presence of missing data. If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
The exponential growth of scientific literature makes comprehensive literature reviews increasingly challenging for individual researchers, and slows the turnover of core ideas (Pan et al. 2018). We aim to address this issue by developing a transparent and reusable LLM pipeline to automatically summarize empirical evidence across fields. The envisioned system will parse full paper texts to systematically extract variables and causal links, and summarize them in a graphical user interface. The contribution lies in both an exemplary review of a field of your choice, and the software artefact for reuse.
This project offers an exciting Master's thesis opportunity on the science of science. A gold standard dataset of ~150 manually coded studies is available for training purposes, and supervisors have deep expertise in the topic. You will gain practical experience in both literature analysis and scientific software development. Ideal candidates have a background in statistics and/or computer science and are motivated to deepen their experience in developing effective and reproducible prompting strategies.
Interested? Then reach out with your CV/transcript to marcus.novotny@stat.uni-muenchen.de & christoph.kern@stat.uni-muenchen.de for a first meeting. Looking forward to hearing from you!
Public agencies are increasingly automating the allocation of scarce public resources by making use of risk prediction models. While a wide range of studies focuses on bias in the application of such models, the long-term fairness implications of algorithmically assisted decisions are not fully understood. Building on the emerging literature of dynamic fairness, this project aims at studying feedback loops and the long-term consequences of algorithmic decision-making in social contexts. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.
Research in algorithmic fairness and responsible AI has proposed numerous technical definitions of model fairness and associated metrics (e.g. Mitchell et al. 2022). These metrics typically encode different normative perspectives and are often in conflict with each other – i.e., the same model may not be able to comply with all metrics at the same time. These incompatibilities raise critical questions regarding which fairness concepts should be prioritized in a given application context.
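A small worked example may help illustrate such a conflict. The sketch below uses made-up labels for two hypothetical groups with different base rates and compares two commonly discussed criteria, demographic parity (equal selection rates) and equal opportunity (equal true positive rates). Both the data and the choice of these two metrics are assumptions for illustration only.

```python
def selection_rate(y_pred):
    """Share of instances that receive a positive prediction."""
    return sum(y_pred) / len(y_pred)


def true_positive_rate(y_true, y_pred):
    """Share of truly positive instances that are predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


# Made-up outcomes: group A has a base rate of 0.75, group B of 0.25.
y_true = {"A": [1, 1, 1, 0], "B": [1, 0, 0, 0]}

# A perfectly accurate classifier predicts the true label for everyone.
y_pred = {group: list(labels) for group, labels in y_true.items()}

# Demographic parity gap: difference in selection rates between groups.
parity_gap = abs(selection_rate(y_pred["A"]) - selection_rate(y_pred["B"]))

# Equal opportunity gap: difference in true positive rates between groups.
tpr_gap = abs(
    true_positive_rate(y_true["A"], y_pred["A"])
    - true_positive_rate(y_true["B"], y_pred["B"])
)

print("demographic parity gap:", parity_gap)
print("equal opportunity gap:", tpr_gap)
```

In this toy setting, the perfectly accurate classifier satisfies equal opportunity but not demographic parity, because the groups' base rates differ.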
While prior work, such as Makhlouf et al. (2022), provides guidelines in the form of “Fairness Trees” to help navigate the various proposed metrics, this project aims to use participatory approaches to understand how public stakeholders would choose between different fairness concepts and metrics in practice. We therefore seek students interested in exploring the connections between responsible AI, fairness perceptions, and participatory approaches. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.
This thesis focuses on creating and adapting teaching materials on data literacy and evidence-based decision-making specifically tailored for church leadership. Drawing from established resources such as the Data Literacy and Evidence Building book, the project aims to design practical, accessible materials that equip leaders in church contexts to make informed, data-driven decisions in increasingly complex environments. The project will involve reviewing existing materials and adapting them to the specific needs of church leadership under the guidance of Prof. Ongono and Prof. Wollbold from the Faculty of Catholic Theology and Dr. Haensch from the Statistics Institute. It will also involve creating synthetic data as well as a selection of simple R or Python notebooks for training, again under the guidance of Dr. Haensch. This work will contribute to strengthening leadership training by providing a foundation for data-informed strategies while respecting the values and unique challenges of church organizations.
If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
The Federal Statistical Office of Germany (Destatis) offers a wide range of topics for scientific theses at Bachelor's and Master's level. Please first check directly with Destatis whether the desired topic is still available. Afterwards, send us (soda@stat.uni-muenchen.de) a short exposé (1–2 pages) on the topic, your CV, and a transcript of records so that we can try to arrange university supervision.