Gustavo E. A. P. A. Batista

Researcher at University of New South Wales

Publications -  142
Citations -  9328

Gustavo E. A. P. A. Batista is an academic researcher from University of New South Wales. The author has contributed to research in topics: Computer science & Dynamic time warping. The author has an h-index of 33, co-authored 129 publications receiving 7490 citations. Previous affiliations of Gustavo E. A. P. A. Batista include University of California, Riverside & University of São Paulo.

Papers
Journal Article

A study of the behavior of several methods for balancing machine learning training data

TL;DR: This work performs a broad experimental evaluation involving ten methods, three of them proposed by the authors, to deal with the class imbalance problem in thirteen UCI data sets, and shows that, in general, over-sampling methods provide more accurate results than under-sampling methods considering the area under the ROC curve (AUC).
Proceedings Article

Searching and mining trillions of time series subsequences under dynamic time warping

TL;DR: This work shows that by using a combination of four novel ideas the authors can search and mine truly massive time series for the first time, and shows that in large datasets they can exactly search under DTW much more quickly than the current state-of-the-art Euclidean distance search algorithms.
Journal Article

An analysis of four missing data treatment methods for supervised learning

TL;DR: This analysis indicates that missing data imputation based on the k-nearest neighbor algorithm can outperform the internal methods used by C4.5 and CN2 to treat missing data, and can also outperform mean or mode imputation, a method broadly used to treat missing values.
Book Chapter

Class Imbalances versus Class Overlapping: An Analysis of a Learning System Behavior

TL;DR: This work develops a systematic study aiming to question whether class imbalances are truly to blame for the loss of performance of learning systems or whether class imbalances are not a problem by themselves.
Proceedings Article

A Complexity-Invariant Distance Measure for Time Series.

TL;DR: This work introduces the first complexity-invariant distance measure for time series and shows that it generally produces significant improvements in classification accuracy. The improvement does not compromise efficiency, since the measure can be lower bounded and can use a modification of the triangle inequality.
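
As a rough illustration of the complexity-invariant distance summarized in the last entry, the sketch below assumes the commonly cited formulation: the Euclidean distance between two equal-length series, multiplied by the ratio of the larger to the smaller complexity estimate, where the complexity estimate is the square root of the summed squared differences between consecutive points. The function names and the handling of constant (zero-complexity) series are assumptions made for this example and do not come from the listing above.

```python
import math


def complexity_estimate(series):
    """Square root of the summed squared differences of consecutive points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def euclidean(q, c):
    """Plain Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))


def cid(q, c):
    """Euclidean distance scaled by a complexity correction factor (>= 1)."""
    distance = euclidean(q, c)
    if distance == 0.0:
        return 0.0  # identical series: nothing to correct
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if low == 0.0:
        # Assumption for this sketch: two constant series get factor 1,
        # a constant series against a non-constant one is infinitely far.
        return distance if high == 0.0 else math.inf
    return distance * (high / low)


if __name__ == "__main__":
    smooth = [0.0, 1.0, 0.0, 1.0]
    spiky = [0.0, 2.0, 0.0, 2.0]
    print(euclidean(smooth, spiky), cid(smooth, spiky))
```

Because the correction factor is never below 1, the corrected distance is never smaller than the Euclidean distance, and pairs of series with very different complexity are pushed further apart.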