Dealing with Semantic Heterogeneity During Data Integration

Authors: Elisabeth Métais, Zoubida Kedad

Tags: 1999, conceptual modeling

Multi-source information systems, such as data warehouse systems, involve heterogeneous sources. In this paper, we deal with the semantic heterogeneity of the data instances. Problems may occur when comparing sources whenever different levels of denomination have been used for the same value, e.g. “vermilion” in one source and “red” in another. We propose to manage this semantic heterogeneity by using a linguistic dictionary. “Semantic operators” allow linguistic flexibility in queries, e.g. two tuples with the values “red” and “vermilion” could match in a semantic join on the “color” attribute. A particularity of our approach is that it delimits the scope of this flexibility by defining classes of equivalent values by means of “priority nodes”. These serve as parameters that let the user define the scope of the flexibility in a very natural manner, without specifying any distance.
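To make the idea of a semantic join concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes the linguistic dictionary can be reduced to simple child-to-parent (hypernym) links, and that a priority node groups every value beneath it into one equivalence class. The names `HYPERNYM`, `equivalence_class`, and `semantic_join`, as well as the sample data, are hypothetical.

```python
# Hypothetical hypernym links (child -> parent), standing in for the
# linguistic dictionary mentioned in the abstract.
HYPERNYM = {
    "vermilion": "red",
    "scarlet": "red",
    "navy": "blue",
    "red": "color",
    "blue": "color",
}


def equivalence_class(value, priority_nodes):
    """Walk up the hierarchy and return the first priority node reached.

    If no priority node lies on the path, the value forms its own class,
    which falls back to ordinary equality. This rule is an assumption.
    """
    node = value
    while node is not None:
        if node in priority_nodes:
            return node
        node = HYPERNYM.get(node)
    return value


def semantic_join(left, right, attribute, priority_nodes):
    """Pair tuples whose attribute values fall into the same class."""
    result = []
    for l_row in left:
        l_class = equivalence_class(l_row[attribute], priority_nodes)
        for r_row in right:
            if l_class == equivalence_class(r_row[attribute], priority_nodes):
                result.append((l_row, r_row))
    return result


# Made-up sample sources using different levels of denomination.
source_a = [{"item": "scarf", "color": "vermilion"},
            {"item": "hat", "color": "navy"}]
source_b = [{"stock": 12, "color": "red"},
            {"stock": 3, "color": "blue"}]

# Priority nodes "red" and "blue": vermilion matches red, navy matches blue.
matches = semantic_join(source_a, source_b, "color", {"red", "blue"})

# No priority nodes: the join degenerates to strict equality.
strict = semantic_join(source_a, source_b, "color", set())

# Priority node "color": every colour value is treated as equivalent.
broad = semantic_join(source_a, source_b, "color", {"color"})
```

In this sketch the user widens or narrows the flexibility only by choosing which nodes are priority nodes, which mirrors the abstract's point that no numeric distance has to be specified.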

Read the full paper here: https://link.springer.com/chapter/10.1007/3-540-47866-3_22