LMU-NYU Workshop “Making Data Work: Tools for Better Statistical Practice”
10 Apr 2025
We invite you to join us at the LMU-NYU Workshop “Making Data Work: Tools for Better Statistical Practice” in Munich, Germany, on June 5-6, 2025. This interactive workshop brings together experts developing and evaluating tools that support learning and practice in statistics and data science.
Over two days, we will present tools, exchange feedback, and discuss how to evaluate and improve them.
The workshop is designed for statisticians, educators, and tool developers who want to refine their approaches and connect with others tackling similar challenges. Attendance is limited to ensure deep engagement and interaction. Please find a tentative agenda attached.
Jennifer Hill, George Perrett (New York University)
thinkCausal is a tool for thinking about and doing causal inference. It will help you answer causal questions while learning the basics of causal inference. With thinkCausal, you can visualize your data, choose a causal inference goal, and interpret your results. If you get stuck along the way, the built-in “just-in-time” learning approach will help you make decisions and learn about causal inference.
Niklas Ippisch (LMU Munich)
RAINER is an R package designed to help beginners tackle the steep learning curve of R and overcome its often unhelpful error messages. It provides real-time, context-specific feedback by analyzing the user’s environment to identify the issue, explain its cause, and suggest possible solutions. In addition to automatic error feedback, the package allows users to ask individual questions when their code does not produce the expected results. This feedback, displayed in the console, is generated using a GPT model, with prompts grounded in a prior evaluation and informed by theories of scientific feedback quality. RAINER is currently available on GitHub and is being evaluated in an introductory undergraduate course on statistical programming at LMU.
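As a rough illustration of the mechanism (a hypothetical sketch, not RAINER’s actual code), an error handler in base R could capture the error message and workspace context, assemble a prompt, and print the model’s answer to the console; ask_llm() here is a made-up stand-in for the GPT call:

    # Hypothetical sketch of context-aware error feedback (not RAINER's code).
    # ask_llm() is a placeholder for a call to a GPT model.
    ask_llm <- function(prompt) {
      # a real tool would send the prompt to an LLM API; here we just
      # echo it so the sketch runs without network access
      paste("LLM feedback for:", substr(prompt, 1, 60), "...")
    }

    options(error = function() {
      err  <- geterrmessage()             # the raw R error message
      objs <- ls(envir = globalenv())     # objects in the user's workspace
      prompt <- paste0(
        "A beginner's R code failed with: ", err,
        " Objects in their environment: ", paste(objs, collapse = ", "),
        ". Explain the likely cause and suggest a fix."
      )
      cat(ask_llm(prompt), "\n")          # feedback printed to the console
    })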
Philipp Csistian (OneTutor)
OneTutor is an AI-assisted tutoring system designed for university students and professors. It answers students’ questions based on lecture materials and empowers professors to create AI-assisted quizzes, keeping them as the human-in-the-loop to ensure the highest quality output. Developed at the Technical University of Munich (TUM), OneTutor has been in use there since November, supporting over 3,000 students who have asked more than 50,000 questions and submitted over 100,000 quizzes. The platform is now active at nine Bavarian universities as part of the AIffectiveness research project, which studies the impact of AI in education across various subjects and contextual factors.
Cynthia Huang (Monash University)
The R package ‘xmap’ implements a graph-based data structure for transforming data between related statistical classifications. The tool is intended to support more transparent and reproducible ex-post harmonisation of survey and other datasets. I’ve presented earlier versions of the tool for feedback but have not yet done any formal evaluation.
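The core idea can be sketched in a few lines of plain R (illustrative only; this is not the xmap API): a source-target-weight table encodes the crossmap, and applying it redistributes values from the old classification to the new one.

    # A crossmap: weights out of each source code sum to 1
    crossmap <- data.frame(
      from   = c("A", "B", "B"),
      to     = c("X", "X", "Y"),
      weight = c(1, 0.3, 0.7)
    )
    counts <- data.frame(from = c("A", "B"), value = c(100, 50))

    # Apply the mapping: split each source value by weight, sum by target
    merged <- merge(counts, crossmap, by = "from")
    merged$part <- merged$value * merged$weight
    aggregate(part ~ to, data = merged, FUN = sum)
    #>   to part
    #> 1  X  115
    #> 2  Y   35

Representing the mapping as explicit data makes the harmonisation step itself inspectable and reproducible, rather than buried in ad hoc recoding scripts.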
Daniel Krähmer (LMU Munich)
Empirical research is full of analytical decisions (e.g., which control variables to use, how to deal with outliers), many of which are neither fully justified nor comprehensively discussed in traditional research articles. As a result, readers are typically left with a fragmentary—and carefully curated—impression of the findings that a given dataset can reasonably support. Multiverse analysis, by contrast, systematically explores all sensible model specifications, contrasting the author’s preferred estimate with the range of all possible estimates. While this approach increases transparency, it complicates the presentation of results, as multiverse results operate on an unfamiliar scale and may include hundreds, thousands, or even millions of coefficients.
I will present a graphical method that aims to communicate multiverse results comprehensively, intuitively, and irrespective of scale. The visualization combines features of density plots, specification curves, and heatmaps, equipping applied researchers with a tool to efficiently summarize and interpret large multiverse analyses.
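As a toy illustration of what such an analysis produces (a hypothetical sketch, not the presented tool), two binary analytical choices already yield four specifications, each with its own focal coefficient; real multiverses multiply many such choices:

    # Toy multiverse: two analytical choices -> four specifications
    set.seed(1)
    d   <- data.frame(x = rnorm(200), z = rnorm(200))
    d$y <- 0.3 * d$x + rnorm(200)

    specs <- expand.grid(control = c(TRUE, FALSE),   # include covariate z?
                         trim    = c(TRUE, FALSE))   # drop |x| > 2 "outliers"?
    estimates <- apply(specs, 1, function(s) {
      dd <- if (s["trim"]) d[abs(d$x) < 2, ] else d
      f  <- if (s["control"]) y ~ x + z else y ~ x
      coef(lm(f, data = dd))["x"]                    # the focal coefficient
    })
    summary(estimates)   # the visualization must summarize this vector at scale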
Sherry Zhang (University of Texas)
I will present a work-in-progress tool we are developing to create a decision database that records choices made by analysts during data analysis. Using large language models, we extract decisions and their justifications from published literature and store them in a structured and searchable format. This opens up new possibilities for studying choices made in applied data analysis — both within specific domains and across fields that use similar statistical techniques. We’ve also developed methods to measure similarity between papers and visualize them as clusters, revealing common and unique choices made in applications.
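A hypothetical sketch of what one record and a paper-level comparison might look like (the field names and the similarity measure are illustrative, not the project’s actual design):

    # One extracted decision, stored in a structured, searchable form
    decision <- list(
      paper_id      = "doi:10.xxxx/example",
      step          = "outlier handling",
      choice        = "winsorize at the 1st/99th percentiles",
      justification = "robustness to heavy-tailed measurement error"
    )

    # Papers could then be compared, e.g., via Jaccard similarity of
    # their decision sets before clustering
    jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))
    jaccard(c("winsorize", "fixed effects"),
            c("winsorize", "random effects"))
    #> [1] 0.3333333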
Jan Kamlah (UB Mannheim)
The OCR Recommender Service is a web application designed to simplify your text recognition projects. Instead of overwhelming you with information, our platform asks targeted questions to grasp your project’s nuances, ensuring tailored guidance. From image enhancement to data extraction, we cover all facets of modern text recognition. Whether you are working with historical documents or handling sensitive business data, our platform suggests customized solutions for your unique project needs.
Jan Kamlah (UB Mannheim)
FAIRplexica is an open-source AI assistant for research data management (RDM). It is based on Perplexica and uses the metasearch engine SearXNG to retrieve relevant RDM resources and local LLMs (via Ollama) to answer user questions. API providers such as OpenAI, Groq, Anthropic, and Google, as well as custom API endpoints, are also supported.
Felix Schönbrodt (LMU Munich, OSC)
The Research Quality Evaluation (RESQUE) framework provides recommendations for a responsible research assessment that does not rely on flawed metrics such as the journal impact factor or the h-index.
In alignment with the principles of CoARA, this approach acknowledges diverse academic contributions, prioritizes the quality of work rather than its volume, and integrates qualitative peer assessment with the responsible use of quantitative indicators. All proposed indicators are open-source and reproducible.
If you would like to participate, please let us know (markus.herklotz@stat.uni-muenchen.de) by April 22, including a short description (4-5 sentences) of your tool and what you would like to present. Please also let us know whether you have already conducted any evaluation of your tool (e.g., user feedback, classroom use, pilot studies). We are interested in tools at all stages, from finished products to ideas still in early development for which you would like feedback.
If you are interested in attending but do not have access to funding, please let us know; we will do our best to identify additional resources to support your participation.
Looking forward to seeing you in Munich!
Best,
LMU Munich - Frauke Kreuter, Anna-Carolina Haensch, and Markus Herklotz
New York University - Jennifer Hill