Theses
We are always looking for motivated students who are interested in writing about a topic connected to our current research projects!
To ensure effective supervision, please prepare a short exposé (2 pages) and submit it to the assigned supervisor. The exposé should outline your chosen thesis topic(s), your motivation, a tentative structure for your work, and your desired submission deadline. Describe your current familiarity with the topic and any preliminary progress you have made (e.g., the literature you have read so far). Please also include your current transcript of records.
TikTok has become a central platform for public discourse, shaping political campaigns, social movements, and cultural trends. In this thesis, you will analyze TikTok data to understand how content spreads, influences public opinion, or mobilizes communities. The thesis can focus on content analysis (e.g., text and video data), interaction patterns (e.g., likes, comments, shares), or sentiment analysis to understand audience engagement.
Please note that accessing TikTok data requires submitting a data access application, which should be filed one month before the planned start of your thesis.
If interested, please email anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
Financial regulators and central banks are increasingly integrating sustainability aspects into their operations. The Corporate Sustainability Reporting Directive (CSRD) will require approximately 50,000 European companies to publish sustainability reports, a rich source of data for statistical analysis.
One particular challenge is that companies communicate their sustainability information through unstructured PDF reports that contain both numerical and textual data. To make this information amenable to quantitative research, GIST applies Natural Language Processing (NLP) and Large Language Models (LLMs) for data extraction.
Possible tasks include:
If you are interested, please contact malte.schierholz@stat.uni-muenchen.de.
This thesis will explore the potential of Data Science for assessing the sustainability of land management and urban planning. The project deals with the implementation of regional or national guidelines for climate change adaptation and climate protection in urban planning. The central question is how Data Science can be used to evaluate political measures and administrative actions. The tasks are:
Dataset: Planning documents, which contain detailed information on the environmental condition, environmental risks, and necessary measures at the building level, as well as precise data on building type, height, and building density. Dataset of regional plans that outline requirements for urban planning. Additionally, there is the possibility to access flood maps, climate risk maps, etc., as well as court cases and lawsuits related to the plans. Federal states: NRW, Bavaria, and the Rhine-Main-Neckar region.
A specific research question can be developed in collaboration with the research team. The work requires an independent approach, an interest in interdisciplinary work, basic knowledge of sustainability and climate change topics, and good knowledge of German. There is the opportunity to offer a student assistant position as part of the Master's thesis. If interested, please send an email with a CV and a short cover letter to felicitas.sommer@tum.de and bolei.ma@lmu.de.
Summary: Large Language Models (LLMs) are increasingly used in various natural language processing (NLP) tasks for evaluation. One area of interest is their application in handling multiple-choice questions, particularly those used in surveys with Likert scales to measure preferences, opinions, or similarity judgments. While LLMs demonstrate strong reasoning abilities, it remains unclear how consistent they are when faced with different variations of multiple-choice scales.
For instance, the same conceptual question may be presented with different response scales when asking how similar two given terms are to each other:
You will investigate these questions under well-defined experimental conditions. The empirical study will include, but is not limited to, data collection, LLM evaluation, and consistency analysis.
Qualifications & Skills Required:
Reference:
Application Process: Interested candidates should submit a CV and a copy of their current transcript of records to bolei.ma@lmu.de and c.haensch@lmu.de.
Training machine learning (ML) models relies on annotated (also called labeled) training data. Large Language Models (LLMs) offer great potential for data annotation. However, human annotations are likely still needed for difficult or ambiguous instances. A reasonable collaboration setup between LLM and human annotators could assign the easier instances to the LLM and the more complicated ones to humans. This approach could reduce annotation costs and let the human annotators focus on the more ambiguous cases. Best practices for allocating annotation tasks between humans and LLMs, in particular for subjective tasks, are yet to be developed. In this thesis, you could develop and test algorithms for allocating tasks between the two types of annotators and study their impact on quality and cost. One indicator for routing an instance to the human (possibly expert) annotator could, for example, be the LLM's self-assessed certainty. If interested, please email your CV and a brief explanation of your interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.
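As a rough illustration of the certainty-based routing idea described in this topic, the following minimal Python sketch sends instances below a confidence threshold to a human annotator and computes the resulting annotation cost. The `Instance` class, the threshold of 0.8, the confidence values, and the per-item costs are all hypothetical and chosen only for demonstration; developing and evaluating an actual allocation rule would be the subject of the thesis.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    text: str
    llm_label: str
    llm_confidence: float  # hypothetical self-assessed certainty in [0, 1]


def route(instances, threshold=0.8):
    """Keep the LLM label when confidence >= threshold, else route to a human."""
    kept, routed = [], []
    for inst in instances:
        if inst.llm_confidence >= threshold:
            kept.append(inst)
        else:
            routed.append(inst)
    return kept, routed


def annotation_cost(n_llm, n_human, llm_cost=0.01, human_cost=0.50):
    """Total cost under assumed (made-up) per-item prices."""
    return n_llm * llm_cost + n_human * human_cost


# Toy data: labels and confidences are invented for illustration.
instances = [
    Instance("great service", "positive", 0.95),
    Instance("well, that was something", "negative", 0.55),
    Instance("terrible", "negative", 0.90),
    Instance("not bad I guess", "positive", 0.60),
]

kept, routed = route(instances, threshold=0.8)
cost = annotation_cost(len(kept), len(routed))
print(f"LLM-labeled: {len(kept)}, human-labeled: {len(routed)}, cost: {cost:.2f}")
```

Varying `threshold` traces out the trade-off between cost and the share of instances reviewed by humans; whether self-reported LLM certainty is a reliable routing signal is itself an open question.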
The way in which an annotation task is structured affects the annotations that human annotators provide, a result called annotation sensitivity. For example, the order in which annotations are collected, and the number of screens, can change whether tweets are annotated as containing hate speech or offensive language (https://dl.acm.org/doi/10.1007/978-3-031-21707-4_19; https://aclanthology.org/2024.uncertainlp-1.8/). With the growing use of LLMs as annotators, we wonder whether LLMs also show annotation sensitivity. Since LLMs are built on data produced by humans, it might be that the models inherit similar biases. In this thesis, you could replicate findings from the above studies with LLM annotators. If interested, please email your CV and a brief explanation of interest in the topic to jacob.beck@stat.uni-muenchen.de and CC soda@stat.uni-muenchen.de.
We are seeking a motivated Master's student to embark on a methodological thesis project aimed at extending the scope of substantive-model compatible (SMC)-FCS multiple imputation (MI) techniques in discrete-time survival analysis (DTSA) to accommodate time-varying variables. Building on our existing work, which has successfully extended SMC-FCS MI for time-invariant covariates, this project will tackle the additional complexities introduced by time-varying variables. The successful candidate will conduct comprehensive Monte Carlo simulations to evaluate the extended methodology, and contribute to refining the practice of discrete-time survival analysis in the presence of missing data. If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
The swift advancement of artificial intelligence (AI) has sparked both enthusiasm and concerns, with governments embracing the new technology for administrative efficiency, but also regulating it to protect the rights of citizens. To navigate this dynamic environment, policymakers require timely and dependable data on the attitudes of the population towards AI. Despite calls for research (e.g., Montag et al., 2024), no established short scale exists to measure these attitudes systematically within large survey panels such as the SOEP, which are an important foundation for evidence-based policy design.
Our research group is developing a concise and reliable scale to assess AI attitudes, examining variations across demographic groups in Germany and their impacts on technology acceptance or life outcomes such as career opportunities. Thesis students will collaborate with an experienced research team, gain practical skills in survey methodology and statistics, and contribute to an ongoing research project with a clear path toward academic publication.
We welcome applications from motivated sociology and statistics students with an interest in technology (policy) and public opinion research. A bachelor thesis would be more focussed on theoretical foundations, including a literature review and the design of a survey experiment with hypothesis building, while a master thesis would also have time to collect and analyze data from such an experiment, e.g., using factor and regression analyses.
Interested? Then reach out with your CV and transcript to marcus.novotny@stat.uni-muenchen.de and christoph.kern@stat.uni-muenchen.de to arrange a first meeting. We are looking forward to hearing from you!
Public agencies are increasingly automating the allocation of scarce public resources by making use of risk prediction models. While a wide range of studies focuses on bias in the application of such models, the long-term fairness implications of algorithmically assisted decisions are not fully understood. Building on the emerging literature of dynamic fairness, this project aims at studying feedback loops and the long-term consequences of algorithmic decision-making in social contexts. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.
Research suggests that the successful integration of refugees and asylum seekers may depend on the location to which they are assigned, as certain locations may be better suited to certain refugee characteristics (Bansak et al., 2018).
To this end, a number of algorithmic refugee location matching tools have been developed, including GeoMatch and Annie™ Moore. These tools use supervised machine learning to predict refugee integration outcomes in potential assignment locations. Further, they apply optimal matching approaches to strategically assign refugees to locations where the probability of a desired integration outcome is maximised.
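The two-step logic described above (predict outcomes, then match) can be sketched in a few lines of Python. This is only a toy illustration, not the method of any particular tool: the matrix `predicted` contains made-up probabilities standing in for the output of a supervised model, each location is assumed to have exactly one slot, and the optimal assignment is found by brute force over all permutations.

```python
from itertools import permutations

# predicted[i][j]: hypothetical predicted probability that person i reaches
# the desired integration outcome at location j (values are invented).
predicted = [
    [0.6, 0.3, 0.2],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.6],
]


def best_assignment(prob):
    """Return the one-to-one assignment maximising the summed probabilities."""
    n = len(prob)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        # perm[i] is the location assigned to person i
        score = sum(prob[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


assignment, total = best_assignment(predicted)
print(assignment, round(total, 2))
```

Brute force is only feasible for very small examples; for realistic sizes, an assignment solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice. Note that person 2 has the highest predicted probability at location 0 but is not assigned there in the optimum, which hints at the fairness questions such tools raise.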
Although the tools are designed to support rather than replace human decision-makers, they raise important questions about their reliability and fairness.
To this end, it is crucial to evaluate existing matching tools with a focus on fairness. For this purpose, extensive data from the IAB-BAMF-SOEP Survey on Refugees in the German Context (SOEP) will be used.
If you are interested, please contact clara.strasserceballos@stat.uni-muenchen.de and christoph.kern@stat.uni-muenchen.de.
Research in algorithmic fairness and responsible AI has proposed numerous technical definitions of model fairness and associated metrics (e.g. Mitchell et al. 2022). These metrics typically encode different normative perspectives and are often in conflict with each other – i.e., the same model may not be able to comply with all metrics at the same time. These incompatibilities raise critical questions regarding which fairness concepts should be prioritized in a given application context.
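To make the incompatibility concrete, the following small Python sketch computes two common fairness measures, the demographic parity gap (difference in positive prediction rates) and the equal opportunity gap (difference in true positive rates), on invented toy data. The data, the group labels "A" and "B", and the helper function names are assumptions made purely for illustration.

```python
def rate(values):
    """Share of ones in a list of 0/1 values (0.0 for an empty list)."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive prediction rates between groups A and B."""
    a = [p for p, g in zip(y_pred, group) if g == "A"]
    b = [p for p, g in zip(y_pred, group) if g == "B"]
    return abs(rate(a) - rate(b))


def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true positive rates between groups A and B."""
    a = [p for t, p, g in zip(y_true, y_pred, group) if g == "A" and t == 1]
    b = [p for t, p, g in zip(y_true, y_pred, group) if g == "B" and t == 1]
    return abs(rate(a) - rate(b))


# Toy data: the groups have different base rates of the true outcome.
group = ["A"] * 4 + ["B"] * 4
y_true = [1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

dp_gap = demographic_parity_gap(y_pred, group)
eo_gap = equal_opportunity_gap(y_true, y_pred, group)
print(f"demographic parity gap: {dp_gap:.2f}, equal opportunity gap: {eo_gap:.2f}")
```

In this example both groups receive positive predictions at the same rate, so demographic parity holds, yet the true positive rate is 1.0 in group A and 2/3 in group B, so equal opportunity is violated. Because the base rates differ, closing one gap with these predictions would open the other.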
While prior work, such as Makhlouf et al. (2022), provides guidelines in the form of “Fairness Trees” to help navigate the various proposed metrics, this project aims to use participatory approaches to understand how public stakeholders would choose between different fairness concepts and metrics in practice. We therefore seek students interested in exploring the connections between responsible AI, fairness perceptions, and participatory approaches. If you are interested, please contact christoph.kern@stat.uni-muenchen.de and cc anna-carolina.haensch@stat.uni-muenchen.de.
This thesis focuses on creating and adapting teaching materials on data literacy and evidence-based decision-making specifically tailored for church leadership. Drawing from established resources such as the Data Literacy and Evidence Building book, the project aims to design practical, accessible materials that equip leaders in church contexts to make informed, data-driven decisions in increasingly complex environments. The project will involve reviewing existing materials and adapting them to the specific needs of church leadership under the guidance of Prof. Ongono and Prof. Wollbold from the Faculty of Catholic Theology and Dr. Haensch from the Statistics Institute. Under Dr. Haensch's guidance, it will also involve creating synthetic data and a selection of simple R or Python notebooks for training. This work will contribute to strengthening leadership training by providing a foundation for data-informed strategies while respecting the values and unique challenges of church organizations.
If you are interested, please contact anna-carolina.haensch@stat.uni-muenchen.de.
Due to Caro Haensch’s availability, supervision of this thesis topic will be possible again from April 2026 onwards. If you are interested, please get in touch in mid-April 2026.
The Federal Statistical Office of Germany (Destatis) offers a wide range of topics for scientific theses at Bachelor's and Master's level. In many cases, university supervision can also be arranged through us. If you are interested, feel free to contact our chair's administration (soda@stat.uni-muenchen.de) for further information.