A multidisciplinary team led from the ADAPT Centre at Trinity College Dublin has published cutting-edge research on a new tool designed to link rare disease data with environmental datasets efficiently.

In a paper published recently in Nature Digital Medicine, the researchers present SERDIF (Semantic Environmental and Rare Disease data Integration Framework), an innovative framework that enables health data researchers to link environmental and health data sources efficiently through location and time information.

Data fragmentation

SERDIF addresses critical challenges in rare disease research, such as data fragmentation and variations in data formats, and advances the potential for improved patient outcomes. This is the first-ever detailed description of such a framework, marking a significant milestone in the field of medical research data management.

The research was led by Dr Albert Navarro-Gallinad, who served as the lead author and primary researcher on the study. He collaborated closely with principal investigators at Trinity College Dublin, Professor Declan O’Sullivan from the School of Computer Science and Statistics and Professor Mark Little from the School of Medicine.

The research team also included Dr Fabrizio Orlandi and Dr Jennifer Scott. International collaborators Dr Neil Basu and Dr Enock Havyarimana from the University of Glasgow contributed to the study.

Dr Navarro-Gallinad said: “Our goal was to create a user-friendly framework that adheres to the FAIR principles of Findability, Accessibility, Interoperability and Reusability.

"By making SERDIF open source, we enable researchers worldwide to efficiently link complex environmental and health data, fostering collaboration and accelerating discoveries. This openness is crucial for advancing our understanding of how environmental factors, which are amplified by climate change, impact rare diseases." 

Integrating diverse datasets

Rare diseases affect millions of people globally and research is often hindered by the scarcity of data and difficulties in accessing and integrating diverse datasets.

SERDIF leverages cutting-edge technologies from computer science, including Semantic Web technologies and Knowledge Graphs, to address pressing public health challenges.

By providing an open platform that offers a clear provenance record to accurately integrate diverse datasets, the framework promotes collaboration among researchers and clinicians in the rare disease community.

Prof O'Sullivan said: “Applying Knowledge Graph approaches in the area of environmental health presents a new frontier. Developing a usable framework to guide the implementation of these technologies for health data researchers reduces the technical complexities that are typically associated with data linkage tasks.” 

The implementation of the SERDIF framework signifies a major advancement in patient-centred healthcare and research. By efficiently linking data from various sources, researchers can gain a more comprehensive understanding of rare diseases, leading to faster diagnoses and the development of new therapies.

Prof Little added: “This collaboration between computer science and medicine, part of the HELICAL MSCA ITN, demonstrates how interdisciplinary efforts can lead to significant breakthroughs.

"The SERDIF framework not only ensures patient privacy but also fosters global collaboration, potentially leading to more effective treatments and a better understanding of these complex conditions influenced by environmental factors.

Wide application

"Although this was developed focusing on the rare disease ANCA vasculitis, we envisage its wide application in studying the totality of environmental exposures individuals encounter throughout their lives and how these exposures affect health, a field known as exposome research."

The SERDIF framework was evaluated with researchers studying climate-related health hazards affecting vasculitis disease activity across European countries. Usability metrics consistently improved, indicating SERDIF's effectiveness in linking complex environmental and health datasets.

The framework also showcased its versatility beyond rare diseases by enabling epidemiologists to study environmental factors in a pregnancy cohort in Lombardy, Italy.

The research was supported by funding from the MSCA-ITN programme for PhD training as part of the European Union’s Horizon 2020 research and innovation programme, and was finalised during Dr Albert Navarro-Gallinad’s time at the Human Technopole, which also provided technical support for hosting the user interface.