Theses

Explore the topics listed below, or feel free to propose a thesis topic of your own!

To ensure effective supervision, please prepare a short exposé (2 pages) and submit it to the assigned supervisor. The exposé should outline your chosen thesis topic(s), your motivation, a tentative structure for your work, and your desired submission deadline. Describe your current familiarity with the topic and any preliminary progress you have made (e.g., what literature you have read so far). Please also include your current transcript of records.

Social Data Science

Financial regulators and central banks are increasingly integrating sustainability aspects into their operations. The Corporate Sustainability Reporting Directive (CSRD) will require about 50,000 European companies to publish sustainability reports, a great source of data for statistical analysis.

One particular challenge is that companies communicate their sustainability information through unstructured PDF reports that contain both numerical and textual data. To make this information amenable to quantitative research, GIST applies Natural Language Processing (NLP) and Large Language Models (LLMs) for data extraction.

Possible tasks include:

You could implement additional features using Python in our data extraction pipeline and/or compare different methodologies.
You could review, replicate, and extend existing literature that makes use of sustainability reports.

If you are interested, please contact malte.schierholz@stat.uni-muenchen.de.

This thesis will explore the potential of Data Science for assessing the sustainability of land management and urban planning. The project deals with the implementation of regional or national guidelines for climate change adaptation and climate protection in urban planning. The central question is how Data Science can be used to evaluate political measures and administrative actions. The tasks are:

Inventory and categorization of topics, principles, goals, and measures in administrative documents and laws
Structuring the information from the texts using NLP techniques, particularly Large Language Models
Development of a framework with subject matter experts, through which categories can be classified, evaluated, and compared
Automatic analysis and evaluation of documents
Development of a clear visualization of the data

Dataset: Planning documents, which contain detailed information on the environmental condition, environmental risks, and necessary measures at the building level, as well as precise data on building type, height, and building density. Dataset of regional plans that outline requirements for urban planning. Additionally, there is the possibility to access flood maps, climate risk maps, etc., as well as court cases and lawsuits related to the plans. Federal states: NRW, Bavaria, and the Rhine-Main-Neckar region.

A specific research question can be developed in collaboration with the research team. The work requires an independent approach, an interest in interdisciplinary work, basic knowledge of sustainability and climate change topics, and good knowledge of German. There is the opportunity to offer a student assistant position as part of the Master's thesis. If interested, please send an email with a CV and a short cover letter to felicitas.sommer@tum.de and bolei.ma@lmu.de.

Methodological Research

The paper Valid Survey Simulations with Limited Human Data (Krsteski et al., 2025) provides a strong foundation for a Master's thesis that connects LLM-based survey simulation with the framework of Prediction-Powered Inference (PPI). Rather than treating synthetic responses as ground truth, PPI offers a principled statistical approach: LLM-generated survey data serves as the "prediction" component, while a small set of real human responses acts as the labeled sample used to debias and correct the estimates. This naturally reframes rectification not as an ad hoc postprocessing step, but as a statistically valid correction with formal guarantees on coverage and error rates.
The thesis would replicate the core experiments from Krsteski et al. and extend them with a large-scale practicability study that evaluates the real-world feasibility of PPI-based survey simulation. Concretely, this involves assessing sample efficiency (how much human data is actually needed for valid correction), computational cost across model sizes and survey contexts, and the robustness of PPI guarantees under realistic deployment conditions such as distributional shift or limited demographic diversity in the rectification sample. The goal is to move beyond methodological proof-of-concept toward a systematic, empirically grounded assessment of when and under what conditions PPI-corrected LLM simulations can realistically substitute for or augment traditional survey data collection.
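
To make the idea of rectification concrete, here is a minimal sketch of the basic prediction-powered estimator for a population mean: the average of the LLM predictions on the large unlabeled sample, plus the average error of the LLM on the small human-labeled sample. The function name, argument names, and toy numbers are illustrative assumptions; this is not the procedure or code used by Krsteski et al., and the interval relies on a normal approximation that assumes both samples are drawn independently from the same population.

```python
import math
from statistics import NormalDist, fmean, variance


def ppi_mean(llm_unlabeled, llm_labeled, human_labeled, alpha=0.05):
    """Prediction-powered estimate of a mean with a normal-approximation CI.

    llm_unlabeled: LLM-simulated answers for N respondents without human data
    llm_labeled:   LLM-simulated answers for the n respondents with human data
    human_labeled: the n real human answers (same order as llm_labeled)
    """
    if len(llm_labeled) != len(human_labeled):
        raise ValueError("labeled predictions and human answers must align")
    N, n = len(llm_unlabeled), len(human_labeled)

    # Rectifier: average error of the LLM on the human-labeled sample.
    residuals = [y - f for y, f in zip(human_labeled, llm_labeled)]
    estimate = fmean(llm_unlabeled) + fmean(residuals)

    # Both terms are means of independent samples, so their variances add.
    se = math.sqrt(variance(llm_unlabeled) / N + variance(residuals) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * se, estimate + z * se)


# Toy data (made up): answers on a 1-5 scale.
llm_unlabeled = [3.0, 4.0, 4.0, 5.0, 4.0, 3.0, 4.0, 5.0]
llm_labeled = [4.0, 5.0, 3.0, 4.0]
human_labeled = [3.0, 4.0, 3.0, 3.0]

estimate, (low, high) = ppi_mean(llm_unlabeled, llm_labeled, human_labeled)
print(f"estimate={estimate:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

In this toy example the LLM overstates the answers by 0.75 on average, so the rectified estimate is lower than the raw mean of the simulated answers. The width of the interval depends on how many human responses are available, which is the sample-efficiency question raised above.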

Please contact C.Haensch@lmu.de.

Social science researchers often need to combine data from different surveys or sources. These sources may measure similar things but use different categories or codes. To use them together, researchers must harmonise (or match up) the data after collecting it, creating ex-post harmonised datasets. For example, if two surveys use different codes for job types, researchers may need to change the codes to match each other, combine or split categories, and redistribute data between old and new categories. Significant time and effort is invested in creating ex-post harmonised datasets. However, there has not been a standard way to document their preparation, making it difficult for other researchers to understand and reuse prior work.

The Crossmaps framework [link] addresses this issue of transparency by representing the ex-post harmonisation process as a graph. The {xmap} R package [link] implements data structures and validation functions for the framework, providing built-in safeguards to avoid data leakage and graph-based methods for standardised documentation.
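
As a rough illustration of the graph idea, the sketch below stores a harmonisation as weighted edges from source categories to target categories and checks, before applying it, that the weights leaving each source category sum to one, so that no data is lost or duplicated. It is written in Python for illustration only; the function names, category codes, and weights are hypothetical, and this is not the API of the {xmap} package.

```python
from collections import defaultdict


def validate_crossmap(edges, tol=1e-9):
    """Check that the outgoing weights of every source category sum to 1."""
    totals = defaultdict(float)
    for source, _target, weight in edges:
        if weight < 0:
            raise ValueError(f"negative weight on edge from {source!r}")
        totals[source] += weight
    bad = {s: t for s, t in totals.items() if abs(t - 1.0) > tol}
    if bad:
        raise ValueError(f"weights do not sum to 1 for: {bad}")


def apply_crossmap(counts, edges):
    """Redistribute counts from source categories to target categories."""
    validate_crossmap(edges)
    sources = {source for source, _target, _weight in edges}
    missing = set(counts) - sources
    if missing:
        raise ValueError(f"no mapping for source categories: {sorted(missing)}")
    harmonised = defaultdict(float)
    for source, target, weight in edges:
        harmonised[target] += counts.get(source, 0) * weight
    return dict(harmonised)


# Hypothetical job codes: A1 and A2 are combined into X, A3 is split 25/75.
edges = [
    ("A1", "X", 1.0),
    ("A2", "X", 1.0),
    ("A3", "Y", 0.25),
    ("A3", "Z", 0.75),
]
counts = {"A1": 10, "A2": 20, "A3": 40}

print(apply_crossmap(counts, edges))
```

Because every source category passes on exactly its full weight, the total count is the same before and after harmonisation, and the edge list itself documents which categories were combined or split.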

This thesis offers opportunities to work on applied and theoretical extensions of the framework according to your interests. Students interested in statistical programming and data visualisation could work on package development and documentation, as well as functions for visualising crossmap graphs. Others interested in data preprocessing and multiverse analyses could examine the impact of alternative harmonisation strategies on downstream outcomes. Finally, more theoretically focused students may be interested in examining ex-post harmonisation through the lens of missing data imputation.

If interested, please email your CV/transcript and a short topic proposal based on your interests to cynthia.huang@lmu.de.

Current benchmarks for evaluating values, opinions and behaviors in LLMs are static, US-centric, and lack generalizability and representativeness. At the same time, social scientists have built high-quality data infrastructures to accurately measure attitudes and values across populations and subgroups. This project seeks to illustrate how social surveys can be used for LLM evaluations. Building on recent efforts such as folktables and folktexts, the goal is to build a data processing pipeline and interface that allows AI researchers to access data distributions from selected social surveys for model evaluation and alignment.

For this thesis project, we are seeking motivated students with strong programming skills and interest in model evaluations and social data science. If interested, please email your CV/transcript to christoph.kern@stat.uni-muenchen.de.

The paper Valid Survey Simulations with Limited Human Data: The Roles of Prompting, Fine-Tuning, and Rectification (Krsteski et al., 2025) provides a strong foundation for a Bachelor’s or Master’s thesis that both replicates and extends its core approach. The thesis could reproduce the main experiments comparing prompting, fine-tuning, and rectification for large language model–based survey simulations, while focusing on two extensions: (a) applying the framework to a non-US, non-English context such as German-language survey data (e.g., from ALLBUS or SOEP) to examine linguistic and cultural transferability, and (b) advancing the rectification method by introducing subgroup-specific or adaptively weighted correction terms (e.g., by gender, education, or income). This would allow the thesis to test the robustness of the original findings across different contexts and evaluate whether more granular rectification schemes can better mitigate systematic model bias in synthetic survey data.

If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.

Training machine learning (ML) models relies on annotated (also called labeled) training data. Large Language Models (LLMs) offer great potential for data annotation. However, human annotations are likely still needed for difficult or ambiguous instances. A reasonable collaboration setup of LLM and human annotators could assign the easier instances to the LLM and the more complicated ones to humans. This approach could reduce annotation costs and let the human annotators focus on the more ambiguous cases. Best practices for allocating annotation tasks between humans and LLMs, in particular for subjective tasks, are yet to be developed. In this thesis you could develop and test algorithms for allocating tasks between the two types of annotators and study their impact on quality and cost. One indicator for routing an instance to the human (or expert) annotator could, for example, be the LLM's self-assessed certainty. If interested, please email your CV and a brief explanation of your interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.
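
One simple baseline for such an allocation rule is a fixed threshold on the LLM's self-reported certainty. The sketch below is only an illustration under stated assumptions: the dictionary keys, the threshold of 0.8, and the per-item prices are made up, and a thesis would likely compare this rule against more refined allocation algorithms.

```python
def route_instances(instances, threshold=0.8):
    """Split instances into LLM-labeled and human-routed lists.

    Each instance is a dict with the hypothetical keys "id", "llm_label",
    and "llm_confidence" (a self-reported certainty between 0 and 1).
    """
    keep_llm, to_human = [], []
    for item in instances:
        if item["llm_confidence"] >= threshold:
            keep_llm.append(item)
        else:
            to_human.append(item)
    return keep_llm, to_human


def annotation_cost(n_llm, n_human, llm_cost=0.01, human_cost=0.50):
    """Total cost under assumed (made-up) per-item prices."""
    return n_llm * llm_cost + n_human * human_cost


# Toy batch (made up): three texts with LLM labels and certainties.
batch = [
    {"id": 1, "llm_label": "positive", "llm_confidence": 0.95},
    {"id": 2, "llm_label": "negative", "llm_confidence": 0.55},
    {"id": 3, "llm_label": "positive", "llm_confidence": 0.80},
]

keep_llm, to_human = route_instances(batch, threshold=0.8)
print("LLM keeps:", [item["id"] for item in keep_llm])
print("Sent to humans:", [item["id"] for item in to_human])
print("Cost:", annotation_cost(len(keep_llm), len(to_human)))
```

Varying the threshold traces out the trade-off between cost and the share of instances reviewed by humans. Whether self-reported certainty is a reliable signal for difficulty is itself an open question that the thesis could examine.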

We are seeking a motivated Master's student to embark on a methodological thesis project aimed at extending the scope of substantive-model compatible (SMC)-FCS multiple imputation (MI) techniques in discrete-time survival analysis (DTSA) to accommodate time-varying variables. Building on our existing work, which has successfully extended SMC-FCS MI for time-invariant covariates, this project will tackle the additional complexities introduced by time-varying variables. The successful candidate will conduct comprehensive Monte Carlo simulations to evaluate the extended methodology, and contribute to refining the practice of discrete-time survival analysis in the presence of missing data. If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.

Algorithmic Fairness

While state-of-the-art AI models are known to exhibit representational bias based on gender, race, etc., there is limited work on how specific regions are represented in data, especially regions in the Global South. Studies further highlight that fairness evaluations are constrained by the availability, selection and processing of datasets (Simson et al. 2024), which often fail to capture the diversity of protected attributes relevant across different regions. This limitation becomes more critical in the context of the Global South, where social identities are shaped by additional dimensions such as caste/tribal affiliations, educational status and linguistic diversity. As a result, many forms of discrimination remain invisible within current fairness evaluation pipelines for Global South regions.

This thesis will explore data-centric limitations in AI fairness with a focus on the Global South. Further directions include investigating representational issues of the AI fairness and ethics landscape itself and developing a taxonomy of AI biases/stereotypes towards the Global South, following the initial work of Nadeem et al. (2025).
If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.

Research in algorithmic fairness and responsible AI has proposed numerous technical definitions of model fairness and associated metrics (e.g. Mitchell et al. 2022). These metrics typically encode different normative perspectives and are often in conflict with each other – i.e., the same model may not be able to comply with all metrics at the same time. These incompatibilities raise critical questions regarding which fairness concepts should be prioritized in a given application context.
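
A small worked example can show how two metrics may disagree. The sketch below uses two common metrics chosen here purely for illustration, demographic parity (equal selection rates across groups) and equal opportunity (equal true positive rates across groups), on made-up data in which the two groups have different base rates. The function names and data are assumptions for this example.

```python
def selection_rate(preds):
    """Share of instances that receive a positive prediction."""
    return sum(preds) / len(preds)


def true_positive_rate(labels, preds):
    """Share of truly positive instances that are predicted positive."""
    hits = [p for y, p in zip(labels, preds) if y == 1]
    return sum(hits) / len(hits)


# Toy data (made up): a classifier that predicts every label correctly.
labels_a, preds_a = [1, 1, 1, 0], [1, 1, 1, 0]  # group A, base rate 0.75
labels_b, preds_b = [1, 0, 0, 0], [1, 0, 0, 0]  # group B, base rate 0.25

parity_gap = abs(selection_rate(preds_a) - selection_rate(preds_b))
opportunity_gap = abs(
    true_positive_rate(labels_a, preds_a) - true_positive_rate(labels_b, preds_b)
)

print("demographic parity gap:", parity_gap)
print("equal opportunity gap:", opportunity_gap)
```

Here the classifier satisfies equal opportunity exactly, since both groups have a true positive rate of 1, yet the selection rates differ by 0.5. Closing that gap would require misclassifying some instances, which is the kind of trade-off that makes the choice of metric a normative question.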

While prior work, such as Makhlouf et al. (2022), provides guidelines in the form of “Fairness Trees” to help navigate the various proposed metrics, this project aims to use participatory approaches to understand how public stakeholders would choose between different fairness concepts and metrics in practice. We therefore seek students interested in exploring the connections between responsible AI, fairness perceptions and participatory approaches. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.

Practitioners make a multitude of decisions when developing machine learning systems. Building on prior work, we source participatory input on a subset of decisions related to designing an ML system. In this Master's thesis you will work with a newly collected batch of participatory data to check for patterns in participants’ responses, how these relate to practical machine learning models, and in particular how free-text suggestions can be integrated into the design of an ML system and potentially converted into actual code.

If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc jan.simson@stat.uni-muenchen.de.

Statistical Education


Official Statistics

The Federal Statistical Office of Germany (Destatis) offers a wide range of topics for scientific theses at Bachelor's and Master's level. Please first check directly with Destatis whether the desired topic is still available. Afterwards, submit a short exposé (1–2 pages) on the topic, your CV, and a transcript of records to us (soda@stat.uni-muenchen.de) so that we can try to arrange university supervision.

Contact Person

Dr. Anna-Carolina Haensch
