A Scalable Algorithm for One-to-One, Onto, and Partial Schema Matching with Uninterpreted Column Names and Column Values

0
254

Authors: Boris Rabinovich, Mark Last

Tags: 2014, conceptual modeling

In this paper, the authors propose a five-step approach to the problem of identifying semantic correspondences between attributes of two database schemas. It is one of the key challenges in many database applications such as data integration and data warehousing. The authors’ research is focused on uninterpreted schema matching, where the column names and column values are uninterpreted or unreliable. The approach implements Bayesian networks, Pearson’s correlation and mutual information to identify inter-attribute dependencies. Additionally, the authors propose an extension to their algorithm that allows the user to manually enter the known mappings to improve the automated matching results. The five-step approach also allows data privacy preservation. The authors’ evaluation experiments show that the proposed approach enhances the current set of schema matching techniques.

Read the full paper here: https://www.igi-global.com/journal/journal-database-management