Benchmarking Performance for Migrating a Relational Application to a Parallel Implementation

October 20, 2020

Authors: Karen C. Davis, Krishna Karthik Gadiraju, Paul G. Talaga

Tags: 2014, conceptual modeling

Many organizations rely on relational database platforms for OLAP-style querying (aggregation and filtering) in small- to medium-sized applications. We investigate the impact of scaling up the data sizes for such queries, to illustrate the performance an organization could expect if it migrated current applications to a big data environment. This paper benchmarks the performance of Hive [20], a parallel data warehouse platform that is part of the Hadoop software stack. We set up a 4-node Hadoop cluster using Hortonworks HDP 1.3.2 [10] and use the data generator provided by the TPC-DS benchmark [3] to generate data at different scales. We take a representative query from the TPC-DS query set and run its SQL and Hive Query Language (HiveQL) versions on a relational database installation (MySQL) and on the Hive cluster, respectively. We measure the speedup in query execution for every dataset size produced by the scale-up. Hive loads the large datasets faster than MySQL, while it is marginally slower than MySQL when loading the smaller datasets.
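To make the speedup measurement concrete, here is a minimal sketch of how such a comparison could be tabulated. It assumes speedup is defined as the MySQL time divided by the Hive time for the same dataset size; the abstract does not state the exact definition the authors use. The function names, the scale labels, and the timings are all hypothetical placeholders and are not results from the paper.

```python
from typing import Dict


def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Ratio of baseline time to parallel time.

    Assumed definition: values above 1 mean the parallel system
    (here, Hive) was faster than the baseline (here, MySQL).
    """
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / parallel_seconds


def speedup_by_scale(
    baseline: Dict[str, float], parallel: Dict[str, float]
) -> Dict[str, float]:
    """Compute the speedup for every dataset size measured on both systems."""
    return {
        scale: speedup(baseline[scale], parallel[scale])
        for scale in baseline
        if scale in parallel
    }


# Hypothetical placeholder timings in seconds, NOT taken from the paper.
mysql_seconds = {"small": 10.0, "large": 400.0}
hive_seconds = {"small": 20.0, "large": 100.0}

result = speedup_by_scale(mysql_seconds, hive_seconds)
for scale, ratio in result.items():
    print(f"{scale}: speedup = {ratio:.2f}")
```

With made-up numbers like these, a ratio below 1 on the small dataset and above 1 on the large one would mirror the pattern the abstract reports for loading, where Hive only pulls ahead as the data grows.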

Read the full paper here: https://link-springer-com.proxy2.hec.ca/chapter/10.1007/978-3-319-12256-4_6
