Pattern-Guided Association Rule Mining for Complex Ontology Alignment

Beatriz Lima, Daniel Faria, Catia Pesquita

LASIGE, Faculdade de Ciências, Universidade de Lisboa

Conceptual differences between ontologies are often so profound that, to ensure interoperability, we need to establish complex correspondences.

We developed a targeted application of Association Rule Mining to find known complex alignment patterns.

Motivation

Ontology Alignment is an essential tool for interoperability and semantic data integration, but most work in the field is restricted to finding simple correspondences.

Existing lexical-based approaches are limited to finding complex mappings where there is a lexical similarity between the entities, which is often not the case in real-world ontologies.

Existing Association Rule Mining-based approaches follow a catch-all philosophy: they search exhaustively for frequent patterns and use predefined complex alignment patterns only to filter the results a posteriori.

We propose novel pattern-mining-based algorithms for targeted complex ontology alignment, in which patterns are used a priori to drive a targeted Association Rule Mining process, which we have found to be:

System

External ontology loading system facilities (AMLC)

The loading step retrieves the set of shared individuals between the two ontologies and organises the ontology information (types, relations and property values of each individual, ranges and domains of the properties and hierarchical relations between classes) in hash-tables.
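The loading step can be pictured with a minimal Python sketch. It is an illustration only, not the AMLC implementation: the function names, the dictionary layout and the toy entity names (`src:`, `tgt:`) are all assumptions, and only class membership is indexed here.

```python
from collections import defaultdict


def shared_individuals(source_types, target_types):
    """Return the individuals that are described in both ontologies."""
    return set(source_types) & set(target_types)


def index_by_class(types_by_individual, individuals):
    """Invert individual -> classes into class -> individuals.

    Only the given (shared) individuals are indexed, so every later
    lookup is a constant-time hash-table access.
    """
    index = defaultdict(set)
    for individual in individuals:
        for cls in types_by_individual.get(individual, ()):
            index[cls].add(individual)
    return dict(index)


# Hypothetical toy data: individual -> set of asserted classes.
src = {"p1": {"src:Paper"}, "p2": {"src:Paper"}, "a1": {"src:Author"}}
tgt = {"p1": {"tgt:Paper", "tgt:Accepted"}, "p2": {"tgt:Paper"}, "r9": {"tgt:Review"}}

shared = shared_individuals(src, tgt)
src_index = index_by_class(src, shared)
tgt_index = index_by_class(tgt, shared)
```

Relations, property values, domains, ranges and class hierarchies could be indexed in the same way, each in its own dictionary.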

Matching algorithms

There are individual matchers dedicated to each of the complex alignment patterns, which search the hash-table data structures containing the relevant data for the targeted alignment pattern.

The support (or frequency) of the source and target entities that participate in the pattern is stored, and a common Association Rule Mining matching algorithm is responsible for extracting association rules.
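As a rough illustration of how rules can be extracted from stored supports, the sketch below uses the standard definition confidence(A -> B) = support(A and B) / support(A), with support counted over shared individuals. The function names, thresholds and toy entities are assumptions made for this example, not the authors' code.

```python
def confidence(antecedent, consequent):
    """Standard rule confidence: |A and B| / |A| over sets of individuals."""
    if not antecedent:
        return 0.0
    return len(antecedent & consequent) / len(antecedent)


def extract_rules(source_index, target_index, min_support=1, min_confidence=0.8):
    """Return (source, target, confidence) rules that pass both thresholds.

    Both indexes map an entity or expression to the set of shared
    individuals that satisfy it (hypothetical layout).
    """
    rules = []
    for source, source_inds in source_index.items():
        for target, target_inds in target_index.items():
            joint_support = len(source_inds & target_inds)
            if joint_support < min_support:
                continue
            conf = confidence(source_inds, target_inds)
            if conf >= min_confidence:
                rules.append((source, target, conf))
    return rules


# Hypothetical toy data.
source_index = {"src:AcceptedPaper": {"p1", "p2"}}
target_index = {
    "tgt:Paper": {"p1", "p2", "p3"},
    "tgt:hasDecision some tgt:Acceptance": {"p1", "p2"},
    "tgt:Review": {"r1"},
}
rules = extract_rules(source_index, target_index)
```

In this toy example both surviving rules have confidence 1.0 in the source-to-target direction, but the reverse confidence for `tgt:Paper` is only 2/3, which is the kind of asymmetry that separates a subsumption from an equivalence.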

Refinement algorithms

These algorithms receive mappings generated by some of the pattern matching algorithms as input and refine those mappings, converting simple subsumption mappings into complex equivalence ones.

Filtering algorithms

Different filters select which of the candidate mappings to include in the final alignment, excluding redundant mappings and conflicting mappings with lower confidence.
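One possible shape for such a filter is sketched below. The poster does not define when two mappings conflict, so the `same_target` predicate (two different source entities claiming the same target) is purely an assumption, as are the `Mapping` type and the greedy strategy.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    source: str
    target: str
    confidence: float


def same_target(a, b):
    """Assumed conflict rule: different sources mapped to the same target."""
    return a.target == b.target and a.source != b.source


def filter_mappings(candidates, conflicts=same_target):
    """Drop redundant mappings, then resolve conflicts by confidence."""
    # Redundancy: keep one mapping per (source, target) pair, the best one.
    unique = {}
    for m in candidates:
        key = (m.source, m.target)
        if key not in unique or m.confidence > unique[key].confidence:
            unique[key] = m
    # Conflicts: visit mappings from highest to lowest confidence and keep
    # a mapping only if it does not conflict with one already kept.
    kept = []
    for m in sorted(unique.values(), key=lambda m: m.confidence, reverse=True):
        if not any(conflicts(m, k) for k in kept):
            kept.append(m)
    return kept


# Hypothetical candidates.
candidates = [
    Mapping("s:A", "t:X", 0.9),
    Mapping("s:A", "t:X", 0.9),  # redundant duplicate
    Mapping("s:B", "t:X", 0.6),  # conflicts with the first, lower confidence
    Mapping("s:A", "t:Y", 0.8),
]
final = filter_mappings(candidates)
```

Passing the conflict rule as a parameter keeps the selection logic independent of whichever definition of conflict a given pattern needs.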

An aggregator algorithm combines mappings for the same entity into a single mapping using logical operators, such as “AND” and “OR”.
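The aggregation step might look like the following sketch. The tuple layout, the string form of the combined expression and the choice of the minimum confidence for the combined mapping are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate(mappings, operator="OR"):
    """Combine all mappings of one source entity into a single mapping.

    `mappings` is an iterable of (source, target_expression, confidence).
    The combined confidence is the minimum of the parts (an assumption).
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    grouped = defaultdict(list)
    for source, target, conf in mappings:
        grouped[source].append((target, conf))
    combined = {}
    for source, items in grouped.items():
        targets = sorted({target for target, _ in items})
        if len(targets) == 1:
            expression = targets[0]
        else:
            expression = "(" + f" {operator} ".join(targets) + ")"
        combined[source] = (expression, min(conf for _, conf in items))
    return combined


# Hypothetical mappings.
result = aggregate([
    ("s:A", "t:X", 0.9),
    ("s:A", "t:Y", 0.8),
    ("s:B", "t:Z", 0.7),
])
```

Here `s:A` ends up with the single target expression `(t:X OR t:Y)`, while `s:B` keeps its only mapping unchanged.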

Evaluation

Data

For the evaluation of the proposed algorithms we chose the Populated Conference dataset, which is available in the OAEI 2020 Complex track. The dataset comprises five ontologies, of which we chose cmt and conference to align, given their richness in terms of complex patterns.

Manual scale

We manually classified the resulting mappings according to a rating scale consisting of the following five categories with associated scores.

Results

  • Our algorithms cover eight distinct complex patterns, of which seven were found in the cmt-conference dataset.


  • They were unable to find mappings for some of the patterns present in the reference. However, they found several mappings, with high weighted precision, for patterns not present in the reference.


  • These results show that the reference alignment does not cover all the nontrivial correspondences that are valid between these two ontologies, suggesting that complex alignment references may be incomplete.

Authors

Beatriz Lima

LASIGE, Faculdade de Ciências

Daniel Faria

LASIGE, Faculdade de Ciências

Catia Pesquita

LASIGE, Faculdade de Ciências

Funding

This work was supported by FCT through the LASIGE Research Unit (UIDB/00408/2020 and UIDP/00408/2020). It was also partially supported by the KATY project, which has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 101017453.