On Complex Value Relations in Hive

Authors: Laurent d’Orazio, Matthieu Pilven, Stefanie Scherzinger

Tags: 2019, conceptual modeling

In this paper, we raise the question of how data architects model their data for processing in Apache Hive. This well-known SQL-on-Hadoop engine supports complex value relations, where attribute types need not be atomic. In fact, this feature seems to be one of its prominent selling points, e.g., in Hive reference books. In an empirical study, we analyze Hive schemas in open source repositories. We examine to what extent practitioners make use of complex value relations and, accordingly, whether they write queries over complex types. Understanding which features are actively used will help make the right decisions in setting up benchmarks for SQL-on-Hadoop engines, as well as in choosing which query operators to optimize.
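
To make the notion of a complex value relation concrete, the following minimal sketch shows a Hive table declaration whose attributes use the non-atomic types ARRAY, MAP, and STRUCT, together with a simple way to count such declarations in a schema file. The table definition, the function name count_complex_types, and the regex-based detection are assumptions made for illustration only; they are not taken from the paper and do not reproduce the authors' analysis method.

```python
import re

# Hypothetical Hive DDL, invented for illustration (not from the paper's dataset).
# The last three attributes use complex (non-atomic) types.
EXAMPLE_DDL = """
CREATE TABLE employees (
  name STRING,
  salary FLOAT,
  subordinates ARRAY<STRING>,
  deductions MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, zip:INT>
)
"""

# Hive's complex type constructors are each followed by an opening angle bracket.
COMPLEX_TYPE = re.compile(r"\b(ARRAY|MAP|STRUCT|UNIONTYPE)\s*<", re.IGNORECASE)


def count_complex_types(ddl: str) -> dict:
    """Count occurrences of each complex type constructor in a DDL string.

    This is a purely textual heuristic: it does not parse HiveQL and would
    also match type names inside comments or string literals.
    """
    counts = {}
    for match in COMPLEX_TYPE.finditer(ddl):
        kind = match.group(1).upper()
        counts[kind] = counts.get(kind, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_complex_types(EXAMPLE_DDL))
```

A study over real repositories would need a proper HiveQL parser to handle comments, nested types, and dialect differences reliably; the regular expression here only conveys the idea.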

Read the full paper here: https://link.springer.com/chapter/10.1007/978-3-030-34146-6_13