Conference papers

HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries

Abstract : Hybrid complex analytics workloads typically include (i) data management tasks (joins, filters, etc.), easily expressed using relational algebra (RA)-based languages, and (ii) complex analytics tasks (regressions, matrix decompositions, etc.), mostly expressed as linear algebra (LA) expressions. Such workloads are common in a number of areas, including scientific computing, web analytics, business recommendation, natural language processing, and speech recognition. Existing solutions for evaluating hybrid complex analytics queries, ranging from LA-oriented systems, to relational systems (extended to handle LA operations), to hybrid systems, fail to provide a unified optimization framework for such a hybrid setting. These systems either optimize data management and complex analytics tasks separately, or exploit RA properties only while leaving LA-specific optimization opportunities unexplored. Finally, they are not able to exploit precomputed (materialized) results to avoid recomputing all or part of a given mixed (LA and RA) computation. We describe HADAD, an extensible lightweight approach for optimizing hybrid complex analytics queries, based on a common abstraction that facilitates unified reasoning: a relational model endowed with integrity constraints, which can be used to express the properties of the two computation formalisms. Our approach enables full exploration of LA properties and rewrites, as well as semantic query optimization. Importantly, our approach does not require modifying the internals of the existing systems. Our experimental evaluation shows significant performance gains on diverse workloads, from LA-centered ones to hybrid ones.
Document type :
Conference papers

https://hal.inria.fr/hal-03347677
Contributor : Ioana Manolescu
Submitted on : Friday, September 17, 2021 - 1:57:31 PM
Last modification on : Thursday, September 30, 2021 - 3:34:16 AM

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03347677, version 1

Citation

Rana Alotaibi, Bogdan Cautis, Alin Deutsch, Ioana Manolescu. HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries. ACM SIGMOD 2021 - International Conference on Management of Data, Jun 2021, Xi'an / Online, China. ⟨hal-03347677⟩
