Master's Thesis
Automatic FAIRness Assessment of Research Software Using AI
Research Area
Intelligent Information Management
Advisers
Dr. Sheeba Samuel
Description
In modern scientific research, software plays a critical role in producing, analyzing, and reproducing results. To improve the transparency and reuse of scientific outputs, the FAIR Principles (Findable, Accessible, Interoperable, and Reusable) have been widely promoted within the open science community. Although these principles were originally designed for research data, they are increasingly applied to research software as well. Many research groups publish their code on platforms such as GitHub, which has become one of the largest hosting platforms for open-source and research-related software. Ensuring that such repositories follow FAIR practices can significantly enhance the reproducibility and long-term usability of scientific work.

Despite the growing emphasis on the FAIR principles, many research software repositories do not fully comply with them. For example, repositories may lack a clear license, structured metadata, persistent identifiers, or sufficient documentation. At present, checking whether a repository follows the FAIR principles is often done manually or against limited guidelines, which makes large-scale assessment difficult. Given the vast number of repositories hosted on GitHub, automated methods are needed to assess the degree to which research software adheres to the FAIR principles.

The objective of this thesis is to design and implement an AI-assisted approach for assessing the FAIRness of research software repositories hosted on GitHub. The work aims to define measurable indicators for the FAIR principles and to apply repository mining techniques and AI-based information extraction to analyze the relevant repository features. The thesis will develop a prototype tool or framework that automatically analyzes GitHub repositories and evaluates their compliance with the FAIR principles.
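To make the idea of measurable indicators more concrete, the following is a minimal sketch of a rule-based check over a repository's top-level file names and README text. It is an illustration only, not the method the thesis prescribes. The indicator names, the file patterns (`LICENSE`, `CITATION.cff`, `codemeta.json`), the DOI pattern, and the equal-weight score are all assumptions chosen for this example; the actual indicators and any AI-based extraction are to be defined in the thesis.

```python
import re
from typing import Dict, Iterable

# Assumption: a DOI in the README counts as evidence of a persistent identifier.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")

# Assumption: these file names signal structured, machine-readable metadata.
METADATA_FILES = {"citation.cff", "codemeta.json"}


def assess_indicators(file_names: Iterable[str], readme_text: str = "") -> Dict[str, bool]:
    """Evaluate a few hypothetical FAIR indicators for one repository.

    file_names: names of the files in the repository's root directory.
    readme_text: raw text of the README, if one exists.
    """
    names = {name.lower() for name in file_names}
    return {
        # Reusable: a license file is present.
        "has_license": any(n.startswith(("license", "copying")) for n in names),
        # Findable / Interoperable: structured metadata is present.
        "has_structured_metadata": bool(names & METADATA_FILES),
        # Findable: the README mentions a DOI.
        "has_persistent_identifier": bool(DOI_PATTERN.search(readme_text)),
        # Reusable: basic documentation is present.
        "has_documentation": any(n.startswith("readme") for n in names),
    }


def fairness_score(indicators: Dict[str, bool]) -> float:
    """Fraction of indicators that are satisfied (equal weights, an assumption)."""
    return sum(indicators.values()) / len(indicators)


if __name__ == "__main__":
    example_files = ["README.md", "LICENSE", "src"]
    example_readme = "Archived at https://doi.org/10.5281/zenodo.1234567"
    result = assess_indicators(example_files, example_readme)
    print(result, fairness_score(result))
```

In a full pipeline, the file list and README text would come from repository mining (for example via the GitHub API), and simple rules like these could be complemented by AI-based extraction for features that cannot be detected from file names alone, such as the quality of the documentation.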


