Unified Management of Multi-model Data
The variety of data is one of the most challenging issues for research and practice in data management. The...
SkipSJoin: A New Physical Design for Distributed Big Data Warehouses in Hadoop
Hadoop uses horizontal partitioning to improve the performance of a big data warehouse. A major challenge when horizontally partitioning...
Learning k-Occurrence Regular Expressions from Positive and Negative Samples
Deterministic regular expressions (DREs) are a core part of XML schema languages such as DTD/XSD and are used in...
Modeling Data Lakes with Data Vault: Practical Experiences, Assessment, and Lessons Learned
Data lakes have become popular to enable organization-wide analytics on heterogeneous data from multiple sources. Data lakes store data...
Requirements-Driven Visualizations for Big Data Analytics: A Model-Driven Approach
Choosing the right visualization techniques is critical in Big Data Analytics. However, decision makers are not experts on visualization...
Don’t Tune Twice: Reusing Tuning Setups for SQL-on-Hadoop Queries
SQL-on-Hadoop processing engines have become state-of-the-art in data lake analysis. However, the skills required to tune such systems are...
A Graph Model for Taxi Ride Sharing Supported by Graph Databases
The emergence of more complex, data-intensive applications motivates a high demand for effective data modeling for graph databases to...
Comprehensive Process Drift Detection with Visual Analytics
Recent research has introduced ideas from concept drift into process mining to enable the analysis of changes in business...
A Probabilistic Approach to Event-Case Correlation for Process Mining
Process mining aims to understand the actual behavior and performance of business processes from event logs recorded by IT...
Keyword Search Algorithm over Large RDF Datasets
Keyword search tools have been used to query RDF data. They can be labeled as schema-based when the RDF...