Semantic data integration for explainable artificial intelligence in personalized medicine

Patrícia Eugénio, Daniel Faria, Catia Pesquita

LASIGE, Faculdade de Ciências, Universidade de Lisboa

Introduction

In personalized medicine, multi-omics data hold potential that machine learning (ML) methods can exploit, including “black-box” models such as deep neural networks, to generate predictions and uncover knowledge about the domain relationships contained in the data.

However, the inability to explain the results of these models to clinical experts in a human-understandable way hinders their usability in the medical domain.




A possible solution is to add explainability to learning algorithms by providing a contextual semantic layer through ontologies and Knowledge Graphs (KGs).

This would bridge the gap between AI data and medical application by making medical “AI-empowered knowledge” accessible for clinicians and clinical researchers to understand and use.

Objectives

The goal is to populate a knowledge graph composed of multiple biomedical ontologies with instance data produced by different biomedical techniques in personalized oncology. This work proposes to develop:

A conceptual model for immunopeptidomic and transcriptomic data that not only supports semantic data integration but also bridges across the KATY ontologies;


A generalizable and reusable methodology for semantic annotation that can automatically integrate experimental datasets into the KG.
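As a rough illustration of what automatic semantic annotation can look like, the sketch below turns rows of a tabular dataset into RDF statements in N-Triples syntax. It is a minimal sketch under stated assumptions, not the methodology developed in this work: the namespace `http://example.org/kg/`, the class `Peptide`, the properties `hasSequence` and `sampleIdentifier`, the column names, and the example row are all hypothetical and are not taken from the KATY ontologies.

```python
import csv
import io

# Hypothetical namespace, class and property names, for illustration only;
# they are NOT taken from the KATY ontologies.
BASE = "http://example.org/kg/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping from dataset columns to ontology properties.
COLUMN_TO_PROPERTY = {
    "sequence": BASE + "hasSequence",
    "sample_id": BASE + "sampleIdentifier",
}


def annotate(csv_text, id_column="peptide_id", class_iri=BASE + "Peptide"):
    """Turn each CSV row into N-Triples lines describing one instance."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[id_column]}>"
        # Type the instance with the (hypothetical) ontology class.
        triples.append(f"{subject} <{RDF_TYPE}> <{class_iri}> .")
        for column, prop in COLUMN_TO_PROPERTY.items():
            # Escape backslashes and quotes so the literal stays valid.
            value = row[column].replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{prop}> "{value}" .')
    return triples


# Toy input row, made up for the example.
EXAMPLE = "peptide_id,sequence,sample_id\np1,SIINFEKL,s01\n"
TRIPLES = annotate(EXAMPLE)
for line in TRIPLES:
    print(line)
```

Keeping the column-to-property mapping in a separate table is one way to make such a procedure reusable: a new dataset would only need a new mapping, while the conversion code stays unchanged.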


Methodology

Preliminary Results




The preliminary results are snapshots of the constructed ontology and visual outputs of query results. They answer a fraction of the competency questions provided by immunopeptidomics domain experts and were obtained by querying the visual graph in GraphDB, a semantic graph database (RDF triplestore).
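To make the idea of answering a competency question by querying a graph concrete, the sketch below matches a single triple pattern over a handful of toy statements. It is a plain-Python stand-in that runs without a triplestore; it does not use GraphDB or SPARQL, and the identifiers, property names, and the competency question itself are hypothetical rather than taken from the domain experts' list.

```python
# Toy triples; identifiers and property names are hypothetical.
TRIPLES = {
    ("p1", "observedIn", "s01"),
    ("p2", "observedIn", "s02"),
    ("p1", "derivedFromGene", "g1"),
    ("p3", "observedIn", "s01"),
}


def match(triples, s=None, p=None, o=None):
    """Return the sorted triples matching a pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical competency question: which peptides were observed in sample s01?
# In SPARQL this would be roughly:
#   SELECT ?peptide WHERE { ?peptide :observedIn :s01 }
PEPTIDES_IN_S01 = [s for s, _, _ in match(TRIPLES, p="observedIn", o="s01")]
print(PEPTIDES_IN_S01)
```

A real query engine combines several such patterns and joins them on shared variables; the single-pattern case shown here is the basic building block.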

Funding

This work was supported by FCT through the LASIGE Research Unit (UIDB/00408/2020 and UIDP/00408/2020) and by the KATY project which has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 101017453.