Theses

Explore the topics listed below, or feel free to propose a thesis topic of your own!

To ensure effective supervision, please prepare a short exposé (2 pages) and submit it to the assigned supervisor. The exposé should outline your chosen thesis topic(s), your motivation, a tentative structure for your work, and your desired submission deadline. Provide details on your current familiarity with the topic at hand and any preliminary progress (i.e., what literature you have read so far) you have made. Please also include your current transcript of records.

Social Data Science

TikTok has become a central platform for public discourse, shaping political campaigns, social movements, and cultural trends. In this thesis, you will analyze TikTok data to understand how content spreads, influences public opinion, or mobilizes communities. The thesis can focus on content analysis (e.g., text and video data), interaction patterns (e.g., likes, comments, shares), or sentiment analysis to understand audience engagement.
Please note that accessing TikTok data requires submitting a data access application, which should be filed one month before the planned start of your thesis.
If interested, please email anna-carolina.haensch@stat.uni-muenchen.de.

Financial regulators and central banks are increasingly integrating sustainability aspects into their operations. The Corporate Sustainability Reporting Directive (CSRD) mandates that ~50000 European companies will have to publish sustainability reports in the future, a great source of data for statistical analysis.

One particular challenge is that companies communicate their sustainability information through unstructured PDF reports that contain both numerical and textual data. To make this information amenable to quantitative research, GIST applies Natural Language Processing (NLP) and Large Language Models (LLMs) for data extraction.

Possible tasks include:

you could implement additional features using Python in our data extraction pipeline and/or compare different methodologies.
you could review, replicate and extend existing literature that makes use of sustainability reports.

If you are interested, please contact malte.schierholz@stat.uni-muenchen.de.

This thesis will explore the potential of Data Science for assessing the sustainability of land management and urban planning. The project deals with the implementation of regional or national guidelines for climate change adaptation and climate protection in urban planning. The central question is how Data Science can be used to evaluate political measures and administrative actions. The tasks are:

Inventory and categorization of topics, principles, goals, and measures in administrative documents and laws
Structuring the information from the texts using NLP techniques, particularly Large Language Models
Development of a framework with subject matter experts, through which categories can be classified, evaluated, and compared
Automatic analysis and evaluation of documents
Development of a clear visualization of the data

Dataset: Planning documents, which contain detailed information on the environmental condition, environmental risks, and necessary measures at the building level, as well as precise data on building type, height, and building density. Dataset of regional plans that outline requirements for urban planning. Additionally, there is the possibility to access flood maps, climate risk maps, etc., as well as court cases and lawsuits related to the plans. Federal states: NRW, Bavaria, and the Rhine-Main-Neckar region.

A specific research question can be developed in collaboration with the research team. The work requires an independent approach, an interest in interdisciplinary work, basic knowledge of sustainability and climate change topics, and good knowledge of German. There is the opportunity to offer a student assistant position as part of the Master's thesis. If interested, please send an email with a CV and a short cover letter to felicitas.sommer@tum.de and bolei.ma@lmu.de.

Methodological Research

Summary: Large Language Models (LLMs) are increasingly used in various natural language processing (NLP) tasks for evaluation. One area of interest is their application in handling multiple-choice questions, particularly those used in surveys with Likert scales to measure preferences, opinions, or similarity judgments. While LLMs demonstrate strong reasoning abilities, it remains unclear how consistent they are when faced with different variations of multiple-choice scales.

For instance, the same conceptual question may be presented with different response scales when asking how similar two given terms are to each other:

Version 1: A: Extremely similar, B: Very similar, C: Moderately similar, D: Slightly similar, E: Not similar at all
Version 2: A: Totally similar, B: Rather similar, C: Slightly similar, D: Slightly different, E: Rather different, F: Totally different

Given such variations, do LLMs provide consistent responses? How does their performance vary across different scales and question formulations? How can we assess the performance with different scales? Understanding these aspects is crucial for validating the reliability of LLM-based decision-making.

You will investigate these questions under well-defined experimental conditions. The empirical study will include but are not limited to data collection, LLM evaluation and consistency analysis.

Qualifications & Skills Required:

Background in Natural Language Processing (NLP) or Machine Learning.
Experience with Python and deep learning frameworks (e.g., Hugging Face Transformers, PyTorch).
Ability to work with large-scale language models.

Reference:

Balepur, Nishant, Rachel Rudinger, and Jordan Lee Boyd-Graber (2025). Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above. https://arxiv.org/abs/2502.14127

Application Process: Interested candidates should submit a CV and a copy of their current transcripts of records to bolei.ma@lmu.de and c.haensch@lmu.de.

Training machine learning (ML) models relies on annotated (or also called labeled) training data. Large Language Models (LLMs) offer great potential for data annotation. However, human annotations are likely still needed for difficult or ambiguous annotations. A reasonable collaboration setup of LLM and human annotators could assign the easier instances to the LLM and the more complicated ones to humans. This approach could reduce annotation costs and let the human annotators focus on the more ambiguous cases. Best practices for allocating annotation tasks between humans and LLM, in particular for subjective tasks, are yet to be developed. In this thesis you could develop and test algorithms for allocating tasks between the two annotators and study their impact on quality and cost. Indicators to route an instance to the human (expert ?) annotator could for example be a self-assessment of the LLMs certainty. If interested, please email your CV and a brief explanation of interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.

The way in which an annotation task is structured affects the annotations that human annotators provide, a result called annotation sensitivity. For example, the order in which annotations are collected, and the number of screens, can change whether tweets are annotated as containing hate speech or offensive language (https://dl.acm.org/doi/10.1007/978-3-031-21707-4_19; https://aclanthology.org/2024.uncertainlp-1.8/). With the growing use of LLMs as annotators, we wonder whether LLMs also show annotation sensitivity. Since LLMs are built on data produced by humans, it might be that the models inherit similar biases. In this thesis, you could replicate findings from the above studies with LLM annotators. If interested, please email your CV and a brief explanation of interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.

We are seeking a motivated Master's student to embark on a methodological thesis project aimed at extending the scope of substantive-model compatible (SMC)-FCS multiple imputation (MI) techniques in discrete-time survival analysis (DTSA) to accommodate time-varying variables. Building on our existing work, which has successfully extended SMC-FCS MI for time-invariant covariates, this project will tackle the additional complexities introduced by time-varying variables. The successful candidate will conduct comprehensive Monte Carlo simulations to evaluate the extended methodology, and contribute to refining the practice of discrete-time survival analysis in the presence of missing data. If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.

The swift advancement of artificial intelligence (AI) has sparked both enthusiasm and concerns, with governments embracing the new technology for administrative efficiency, but also regulating it to protect the rights of citizens. To navigate this dynamic environment, policymakers require timely and dependable data on the attitudes of the population towards AI. Despite calls for research (e.g., Montag et al., 2024), no established short scale exists to measure these attitudes systematically within large survey panels such as the SOEP, which are an important foundation for evidence-based policy design.

Our research group is developing a concise and reliable scale to assess AI attitudes, examining variations across demographic groups in Germany and their impacts on technology acceptance or life outcomes such as career opportunities. Thesis students will collaborate with an experienced research team, gain practical skills in survey methodology and statistics, and contribute to an ongoing research project with a clear path toward academic publication.

We welcome applications from motivated sociology and statistics students with an interest in technology (policy) and public opinion research. A bachelor thesis would be more focussed on theoretical foundations, including a literature review and the design of a survey experiment with hypothesis building, while a master thesis would also have time to collect and analyze data from such an experiment, e.g., using factor and regression analyses.

Interested? Then reach out with your CV / transcript and to marcus.novotny@stat.uni-muenchen.de & christoph.kern@stat.uni-muenchen.de for a first meeting. We are looking forward to hearing from you!

Fairness

Automated Machine Learning and Hyperparameter optimization techniques can be used to tune fairness-aware machine learning models that trade off predictive accuracy and a fairness measure (for example, equality of opportunity). However, a recent study challenges the assumption that there is a fairness-accuracy tradeoff, and suggests that only a few noisy samples per dataset are responsible for the perceived unfairness. When learning to abstain from such noisy samples using a bagging-based classifier, the study claims that standard models already produce fair predictions. However, this comes with the cost of having several data points not classified, and ideally, one would like to minimize the number of data points the classifier abstains from. The goal of this thesis is to find out if we can:

reproduce the findings from the study on a large and cleaned set of datasets for fairness research, and
minimize the number of samples from which the classifier abstains via hyperparameter optimization (while maintaining good predictive performance), and
research if calibration of the classifier (on out-of-bag data) could be used to further improve performance.

Alternatively, one can try to reproduce findings from other studies, such as the ones from Perrone et al. or Cruz and Hardt in light of these findings. If you are interested, please contact christoph.kern@stat.uni-muenchen.de, matthias.feurer@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.

Public agencies are increasingly automating the allocation of scarce public resources by making use of risk prediction models. While a wide range of studies focuses on bias in the application of such models, the long-term fairness implications of algorithmically assisted decisions are not fully understood. Building on the emerging literature of dynamic fairness, this project aims at studying feedback loops and the long-term consequences of algorithmic decision-making in social contexts. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.

ML methods are increasingly used in combination with ideas from the causal inference literature to explore heterogeneous treatment effects. Such approaches are useful, for example, for personalizing treatments in medicine or for selecting optimal treatment regimes in the delivery of welfare state measures. While topics such as explainability and transparency have already been studied in the past (see, e.g. policy trees), the connection of the causal learning literature to the fairML literature is still weak. However, it is well known that there are many biases present in data used for developing personalized treatments in medicine or in access to welfare state measures. Therefore, we seek students interested in exploring the connection between causal learning and fairML. If you are interested, please contact christoph.kern@stat.uni-muenchen.de, r.bach@uni-mannheim.de, and cc anna-carolina.haensch@stat.uni-muenchen.de.

Research suggests that the successful integration of refugees and asylum seekers may depend on the location to which they are assigned, as certain locations may be better suited to certain refugee characteristics (Bansak et al. (2018)).

To this end, a number of algorithmic refugee location matching tools have been developed. These include tools such as GeoMatch and AnnieTM Moore. These tools use supervised machine learning to predict refugee integration outcomes in potential assignment locations. Further, they apply optimal matching approaches to strategically assign refugees to locations where the probability of a desired integration outcome is maximised.

Although the tools are designed to support rather than replace human decision-makers, they raise important questions about their reliability and fairness.

To this end, it is crucial to evaluate existing matching tools with a focus on fairness. For this purpose, extensive data from the IAB-BAMF-SOEP Survey on Refugees in the German Context (SOEP) will be used.

If you are interested, please contact clara.strasserceballos@stat.uni-muenchen.de and christoph.kern@stat.uni-muenchen.de

Statistical Education

This thesis focuses on creating and adapting teaching materials on data literacy and evidence-based decision-making specifically tailored for church leadership. Drawing from established resources such as the Data Literacy and Evidence Building book, the project aims to design practical, accessible materials that equip leaders in church contexts to make informed, data-driven decisions in increasingly complex environments. The project will involve reviewing existing materials, and adapting them to the specific needs of church leadership under the guidance of Prof. Ongono and Prof. Wollbold from the Faculty of Catholic Theology and the guidance of Dr. Haensch from the Statistics Institute. It will also involve creating synthetic data as well as a selection of simple R or Python notebooks for training under the guidance of Dr. Haensch from the Statistics Institute. This work will contribute to strengthening leadership training by providing a foundation for data-informed strategies while respecting the values and unique challenges of church organizations.
If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.

Mild cognitive impairment (MCI), affecting over 15% of adults aged 50 and above, often progresses to dementia, underscoring the importance of early detection. This thesis project focuses on developing innovative, machine learning-based diagnostics for MCI using non-invasive data collection methods. Unlike traditional approaches that rely on extensive neuropsychological testing unsuitable for widespread screening, this project proposes the use of machine learning algorithms to analyze computer use behaviors, particularly mouse movement data.
Participants in a large Internet panel, engaging with surveys on various digital devices, will have their mouse movements recorded. This data will serve as the foundation for developing algorithms capable of predicting levels of cognitive functioning and identifying early signs of MCI based on how participants interact with standardized tasks and questionnaires. This project presents a unique opportunity for students to contribute to critical advancements in medical diagnostics, offering a cost-efficient, automated, and unobtrusive method to potentially delay the onset of severe dementia symptoms. If interested, please contact felix.henninger@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.

Official Statistics

The Federal Statistical Office of Germany (Destatis) offers a wide range of topics for scientific theses on Bachelor and Master level. In many cases, the university supervision can also be arranged through us. If you are interested, feel free to contact Dr. Caro Haensch (c.haensch@lmu.de) for further information.

Contact Person

Dr. Anna-Carolina Haensch

Send an email

More