Approximate Query Processing: What is New and Where to Go?: A Survey on Approximate Query Processing

doi:10.1007/S41019-018-0074-4

Open Access · Journal Article


Kaiyu Li, +1 more

01 Dec 2018, Data Science and Engineering, Vol. 3, Iss. 4, pp. 379-397

TLDR

The survey helps practitioners understand existing AQP techniques and select appropriate methods for their applications, and it outlines research challenges and opportunities in AQP.

Abstract:

Online analytical processing (OLAP) is a core functionality in database systems. The performance of OLAP is crucial for making online decisions in many applications. However, it is costly to support OLAP on large datasets, especially big data, and methods that compute exact answers cannot meet the high-performance requirement. To alleviate this problem, approximate query processing (AQP) has been proposed, which aims to efficiently find an approximate answer that is as close as possible to the exact answer. Existing AQP techniques can be broadly divided into two categories. (1) Online aggregation: select samples online and use these samples to answer OLAP queries. (2) Offline synopses generation: generate synopses offline based on a-priori knowledge (e.g., data statistics or query workload) and use these synopses to answer OLAP queries. We discuss the research challenges in AQP and summarize existing techniques that address these challenges. In addition, we review how to use AQP to support other complex data types, e.g., spatial data and trajectory data, and other applications, e.g., data visualization and data cleaning. We also introduce existing AQP systems and summarize their advantages and limitations. Lastly, we outline research challenges and opportunities of AQP. We believe that the survey can help practitioners understand existing AQP techniques and select appropriate methods for their applications.
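To make the first category concrete, the following is a minimal sketch of online aggregation for an AVG query; it is an illustration under stated assumptions, not a method taken from the survey. The function name `online_avg`, its parameters, and the synthetic table are all hypothetical. The sketch assumes uniform random sampling and a normal-approximation (CLT) confidence interval, and it ignores the finite-population correction.

```python
import math
import random


def online_avg(values, batch_size=1000, z=1.96, target_half_width=0.5, seed=0):
    """Estimate AVG(values) from a progressively growing random sample.

    Yields (sample_size, estimate, half_width) after each batch and stops
    once the confidence half-width drops to the target or below.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)  # random scan order, i.e. sampling without replacement

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's algorithm)
    for idx in order:
        x = values[idx]
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n % batch_size == 0 or n == len(values):
            variance = m2 / (n - 1) if n > 1 else 0.0
            half_width = z * math.sqrt(variance / n)  # CLT-based interval
            yield n, mean, half_width
            if half_width <= target_half_width:
                return  # answer is accurate enough; stop scanning early


if __name__ == "__main__":
    data_rng = random.Random(42)
    table = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]
    for n, est, hw in online_avg(table):
        print(f"n={n:>6}  AVG ~ {est:.2f} +/- {hw:.2f}")
```

Each yielded tuple is a refined answer with an error bound, so a caller can stop as soon as the bound is acceptable instead of scanning the whole table. An offline-synopsis approach would instead precompute a sample or histogram once and answer later queries from that summary.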



