Authors: Henning Köhler, Jeeva Ganesan, Pieta Brown, Sebastian Link
Tags: 2016, conceptual modeling
Probabilistic databases are well suited to modern applications that produce large volumes of uncertain data from a variety of sources. We propose an expressive class of probabilistic keys that lets users specify lower and upper bounds on the marginal probabilities with which keys should hold in a data set of acceptable quality. These bounds help organizations balance the consistency and completeness targets for their data quality. To this end, we establish algorithms for an agile schema- and data-driven acquisition of the right lower and upper bounds in a given application domain, and for reasoning about these keys. The efficiency of our acquisition framework is demonstrated theoretically and experimentally.
Read the full paper here: https://link-springer-com.proxy2.hec.ca/chapter/10.1007/978-3-319-46397-1_13
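
To make the idea of a key holding with a marginal probability inside an interval more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a probabilistic relation is given as an explicit list of possible worlds, each with a probability, and that the marginal probability of a key is the total probability of the worlds in which no two rows agree on all key attributes. The function names, the `rfid`/`zone` attributes, and the sample data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def satisfies_key(world, key):
    """Return True if no two rows of `world` agree on every attribute in `key`."""
    seen = set()
    for row in world:
        projection = tuple(row[attr] for attr in key)
        if projection in seen:
            return False
        seen.add(projection)
    return True

def marginal_probability(worlds, key):
    """Sum the probabilities of the possible worlds that satisfy `key`."""
    return sum((p for world, p in worlds if satisfies_key(world, key)), Fraction(0))

def holds_with_interval(worlds, key, lower, upper):
    """Check whether the marginal probability of `key` lies in [lower, upper]."""
    return lower <= marginal_probability(worlds, key) <= upper

# Hypothetical probabilistic relation: three possible worlds whose
# probabilities sum to 1 (an assumption of this sketch).
worlds = [
    ([{"rfid": "r1", "zone": "z1"}, {"rfid": "r2", "zone": "z1"}], Fraction(1, 2)),
    ([{"rfid": "r1", "zone": "z1"}, {"rfid": "r1", "zone": "z2"}], Fraction(3, 10)),
    ([{"rfid": "r1", "zone": "z1"}, {"rfid": "r2", "zone": "z2"}], Fraction(1, 5)),
]

p_rfid = marginal_probability(worlds, ("rfid",))
p_zone = marginal_probability(worlds, ("zone",))
rfid_ok = holds_with_interval(worlds, ("rfid",), Fraction(3, 5), Fraction(4, 5))
```

In this sample, the key on `rfid` is violated only in the second world, so its marginal probability is 7/10 and it holds with the interval [3/5, 4/5]. Enumerating possible worlds explicitly is only practical for small examples; it is used here for clarity, not efficiency.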