Journal • ISSN: 2364-1541

Data Science and Engineering

Springer Science+Business Media

About: Data Science and Engineering is an academic journal published by Springer Science+Business Media. It publishes mainly in the areas of computer science and graph (abstract data type). Its ISSN is 2364-1541, and it is open access. Over its lifetime, the journal has published 199 papers, which have received 2,978 citations.

Topics: Computer science, Graph (abstract data type), Deep learning, Data management, Graph database

Papers

Journal Article

Big Data Reduction Methods: A Survey

Muhammad Habib ur Rehman¹, Chee Sun Liew¹, Assad Abbas², Prem Prakash Jayaraman³, Teh Ying Wah¹, Samee U. Khan²

Information Technology University¹, North Dakota State University², Swinburne University of Technology³

10 Dec 2016-Data Science and Engineering

TL;DR: This article reviews methods used for big data reduction, including network theory, big data compression, dimension reduction, redundancy elimination, data mining, and machine learning methods.

Abstract: Research on big data analytics is entering a new phase called fast data, where multiple gigabytes of data arrive in big data systems every second. Modern big data systems collect inherently complex data streams due to the volume, velocity, value, variety, variability, and veracity of the acquired data, which gives rise to the 6Vs of big data. Reduced and relevant data streams are perceived to be more useful than raw, redundant, inconsistent, and noisy data. Another perspective on big data reduction is that big datasets with millions of variables cause the curse of dimensionality, which requires unbounded computational resources to uncover actionable knowledge patterns. This article presents a review of methods used for big data reduction. It also presents a detailed taxonomic discussion of big data reduction methods, including network theory, big data compression, dimension reduction, redundancy elimination, data mining, and machine learning methods. In addition, open research issues pertinent to big data reduction are highlighted.

138 citations

Journal Article

Medical Big Data: Neurological Diseases Diagnosis Through Medical Data Analysis

Siuly Siuly¹, Yanchun Zhang¹,²

Victoria University, Australia¹, Fudan University²

27 Jul 2016-Data Science and Engineering

TL;DR: The challenges of handling medical big data are explored, the concept of the computer-aided diagnosis (CAD) system and how it works is introduced, and a survey of CAD methods developed for neurological disease diagnosis is provided.

Abstract: Diagnosis of neurological diseases is a growing concern and one of the most difficult challenges for modern medicine. According to the World Health Organisation’s recent report, neurological disorders, ranging from epilepsy, Alzheimer’s disease and stroke to headache, affect up to one billion people worldwide. An estimated 6.8 million people die every year as a result of neurological disorders. Current diagnosis technologies (e.g. magnetic resonance imaging, electroencephalogram) produce huge quantities of data (in size and dimension) for detection, monitoring and treatment of neurological diseases. In general, analysis of these medical big data is performed manually by experts to identify and understand the abnormalities. It is a really difficult task for a person to accumulate, manage, analyse and assimilate such large volumes of data by visual inspection. As a result, experts have been demanding computerised diagnosis systems, called “computer-aided diagnosis (CAD)”, that can automatically detect neurological abnormalities using medical big data. Such a system improves consistency of diagnosis, increases the success of treatment, saves lives and reduces cost and time. Recently, some research has been performed on the development of CAD systems for the management of medical big data for diagnosis assessment. This paper explores the challenges of handling medical big data and introduces the concept of the CAD system and how it works. It also provides a survey of CAD methods developed in the area of neurological disease diagnosis. This study will help experts understand how the CAD system can assist them in this regard.

122 citations

Journal Article

Approximate Query Processing: What is New and Where to Go?: A Survey on Approximate Query Processing

Kaiyu Li¹, Guoliang Li¹

Tsinghua University¹

01 Dec 2018-Data Science and Engineering

TL;DR: The survey can help practitioners understand existing AQP techniques and select appropriate methods for their applications, and it outlines research challenges and opportunities for AQP.

Abstract: Online analytical processing (OLAP) is a core functionality in database systems. The performance of OLAP is crucial for making online decisions in many applications. However, it is rather costly to support OLAP on large datasets, especially big data, and methods that compute exact answers cannot meet the high-performance requirement. To alleviate this problem, approximate query processing (AQP) has been proposed, which aims to efficiently find an approximate answer that is as close as possible to the exact answer. Existing AQP techniques can be broadly divided into two categories. (1) Online aggregation: select samples online and use these samples to answer OLAP queries. (2) Offline synopses generation: generate synopses offline based on a-priori knowledge (e.g., data statistics or query workload) and use these synopses to answer OLAP queries. We discuss the research challenges in AQP and summarize existing techniques to address these challenges. In addition, we review how to use AQP to support other complex data types, e.g., spatial data and trajectory data, and support other applications, e.g., data visualization and data cleaning. We also introduce existing AQP systems and summarize their advantages and limitations. Lastly, we provide research challenges and opportunities of AQP. We believe that the survey can help practitioners understand existing AQP techniques and select appropriate methods for their applications.
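
As a rough illustration of the sampling idea behind online aggregation described in the abstract above, the following sketch estimates a SUM aggregate from a uniform random sample and attaches an approximate 95% error bound. It is not taken from the survey: the function name `approximate_sum`, the synthetic data, and the normal-approximation interval are assumptions made for this example only.

```python
import random
import statistics


def approximate_sum(values, sample_size, seed=0):
    """Estimate sum(values) from a uniform sample drawn without replacement.

    Returns (estimate, half_width), where half_width is a rough 95% bound
    based on the normal approximation (no finite-population correction).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    n = len(values)
    # Scale the sample mean up to the full population size.
    estimate = n * statistics.fmean(sample)
    # Standard error of the scaled mean.
    stderr = n * statistics.stdev(sample) / sample_size ** 0.5
    return estimate, 1.96 * stderr


# Hypothetical column of 100,000 numeric values.
values = [float(i % 100) for i in range(100_000)]
estimate, half_width = approximate_sum(values, sample_size=1_000)
exact = sum(values)
print(f"estimate={estimate:.0f} +/- {half_width:.0f}, exact={exact:.0f}")
```

Only 1% of the rows are read here; a larger sample narrows the error bound at the cost of more work, which is the trade-off AQP systems manage.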

99 citations

Journal Article

Big Data Privacy: Challenges to Privacy Principles and Models

Jordi Soria-Comas, Josep Domingo-Ferrer

01 Mar 2016-Data Science and Engineering

TL;DR: This paper evaluates how well the two main privacy models used in anonymization meet the requirements of big data, namely composability, low computational cost and linkability.

Abstract: This paper explores the challenges raised by big data in privacy-preserving data management. First, we examine the conflicts raised by big data with respect to preexisting concepts of private data management, such as consent, purpose limitation, transparency and individual rights of access, rectification and erasure. Anonymization appears as the best tool to mitigate such conflicts, and it is best implemented by adhering to a privacy model with precise privacy guarantees. For this reason, we evaluate how well the two main privacy models used in anonymization (k-anonymity and ε-differential privacy) meet the requirements of big data, namely composability, low computational cost and linkability.
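
For readers unfamiliar with the two privacy models named in the abstract above, the sketch below shows a minimal k-anonymity check and a Laplace-noised count, a standard way to obtain ε-differential privacy for counting queries. It is an illustration under stated assumptions, not the paper's method: the record fields, the helper names `is_k_anonymous` and `laplace_count`, and the choice of quasi-identifiers are all hypothetical.

```python
import random
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def laplace_count(true_count, epsilon, rng):
    """Return a count perturbed with Laplace noise of scale 1/epsilon.

    A counting query has sensitivity 1, so scale = 1/epsilon. The noise is
    drawn as the difference of two exponential variates, which follows a
    Laplace distribution centred at zero.
    """
    scale = 1.0 / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical, already generalized records.
records = [
    {"age_range": "30-39", "zip_prefix": "130", "diagnosis": "flu"},
    {"age_range": "30-39", "zip_prefix": "130", "diagnosis": "asthma"},
    {"age_range": "40-49", "zip_prefix": "148", "diagnosis": "flu"},
    {"age_range": "40-49", "zip_prefix": "148", "diagnosis": "diabetes"},
]
quasi = ["age_range", "zip_prefix"]

two_anonymous = is_k_anonymous(records, quasi, k=2)
three_anonymous = is_k_anonymous(records, quasi, k=3)
noisy = laplace_count(len(records), epsilon=1.0, rng=random.Random(0))
print(two_anonymous, three_anonymous, noisy)
```

The k-anonymity check is a property of a released table, while the Laplace mechanism perturbs each query answer; smaller values of ε add more noise and give stronger privacy.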

89 citations

Journal Article

A Survey of Traffic Prediction: from Spatio-Temporal Data to Intelligent Transportation

Haitao Yuan¹, Guoliang Li¹

Tsinghua University¹

01 Mar 2021-Data Science and Engineering

TL;DR: This paper provides a comprehensive survey on traffic prediction, spanning from the spatio-temporal data layer to the intelligent transportation application layer. It splits the research scope into four parts from bottom to top: spatio-temporal data, preprocessing, traffic prediction and traffic application.

Abstract: Intelligent transportation (e.g., intelligent traffic lights) makes our travel more convenient and efficient. With the development of mobile Internet and positioning technologies, it is feasible to collect spatio-temporal data and then leverage these data to achieve the goal of intelligent transportation, and here, traffic prediction plays an important role. In this paper, we provide a comprehensive survey on traffic prediction, spanning from the spatio-temporal data layer to the intelligent transportation application layer. At first, we split the whole research scope into four parts from bottom to top, where the four parts are, respectively, spatio-temporal data, preprocessing, traffic prediction and traffic application. Later, we review existing work on the four parts. First, we summarize traffic data into five types according to their differences in spatial and temporal dimensions. Second, we focus on four significant data preprocessing techniques: map-matching, data cleaning, data storage and data compression. Third, we focus on three kinds of traffic prediction problems (i.e., classification, generation and estimation/forecasting). In particular, we summarize the challenges and discuss how existing methods address these challenges. Fourth, we list five typical traffic applications. Lastly, we provide emerging research challenges and opportunities. We believe that the survey can help practitioners understand existing traffic prediction problems and methods, which can further encourage them to solve their intelligent transportation applications.

87 citations

Performance

Metrics

Papers: 200

Citations: 2,978

No. of papers from the Journal in previous years
Year	Papers
2023	15
2022	29
2021	28
2020	28
2019	25
2018	23