Re-Evaluating the Machine Learning Pipeline to Improve Fairness and Reliability

Project Description

Fairness in machine learning remains a highly relevant issue, with unfair models regularly making headlines. In this project, we re-evaluate the complete machine learning pipeline, from the sourcing of data through the design of ML systems to their implementation, with an eye on algorithmic fairness and robustness. We develop new methodologies and collect data to better understand the reliability of findings in the field. We further critically examine the usage and composition of datasets, highlighting gaps and providing recommendations for more sustainable practices. Among other things, we focus on the influence of design decisions, highlighting potential issues of fairness hacking and introducing a new methodology to systematically study and address issues of reliability.

Contact Person

Jan Simson

Publications

  • Simson, J., Draxler, F., Mehr, S., and Kern, C. (2025). Less is More: Preventing Harmful Data Practices by using Participatory Input to Navigate the Machine Learning Multiverse. The ACM Conference on Human Factors in Computing Systems (CHI 2025). https://doi.org/10.1145/3706598.3713482
  • J. Simson, A. Fabris, C. Kern. 2024. Lazy Data Practices Harm Fairness Research. In The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24), June 03–06, 2024, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 18 pages. https://doi.org/10.1145/3630106.3658931
  • J. Simson, F. Pfisterer, C. Kern. 2024. One Model Many Scores: Using Multiverse Analysis to Prevent Fairness Hacking and Evaluate the Influence of Model Design Decisions. In The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24), June 03–06, 2024, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 16 pages. https://doi.org/10.1145/3630106.3658974
  • Simson, J., Fabris, A., and Kern, C. (2024). Unveiling the Blindspots: Examining Availability and Usage of Protected Attributes in Fairness Datasets. Proceedings of the 3rd European Workshop on Algorithmic Fairness (EWAF’24). https://ceur-ws.org/Vol-3908/