Journal Article

Outlier Detection Algorithms in Data Mining Systems

Mikhail Petrovskiy
01 Jul 2003, Vol. 29, Iss. 4, pp. 228-237
TL;DR
A new outlier detection algorithm is suggested that is based on methods of fuzzy set theory and kernel functions and offers a number of advantages over existing methods.
Abstract
The paper discusses outlier detection algorithms used in data mining systems. The basic approaches currently used to solve this problem are considered, along with their advantages and disadvantages. A new outlier detection algorithm is suggested. It is based on methods of fuzzy set theory and the use of kernel functions, and it has a number of advantages over existing methods. The performance of the suggested algorithm is studied on an applied problem: anomaly detection in computer protection systems, the so-called intrusion detection systems.


Citations
Journal Article

Overview and Framework for Data and Information Quality Research

TL;DR: An overview of the evolution and current landscape of data and information quality research is presented, and a framework is introduced that characterizes the research along two dimensions: topics and methods.
Journal Article

A Survey of Outlier Detection Methods in Network Anomaly Identification

TL;DR: A comprehensive survey and comparison of well-known distance-based, density-based, and other outlier detection techniques is presented; definitions of outliers are provided, and their detection based on supervised and unsupervised learning is discussed in the context of network anomaly detection.
Journal Article

A comprehensive survey of numeric and symbolic outlier mining techniques

TL;DR: This survey discusses practical applications of outlier mining, provides a taxonomy for categorizing related mining techniques, and reviews these techniques comprehensively, with their advantages and disadvantages.
Patent

Multiple measurement mode in a physiological sensor

TL;DR: A physiological sensor estimates a true parameter value by providing a predicted parameter value; multiple measurements are taken to increase the accuracy of the predicted value, and the sensor can be reapplied between measurements to decrease the probability of an erroneous prediction caused by sensor misplacement.
References
Book

Data Mining: Concepts and Techniques

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Journal Article

LOF: identifying density-based local outliers

TL;DR: This paper contends that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier, called the local outlier factor (LOF), and gives a detailed formal analysis showing that LOF enjoys many desirable properties.
Journal Article

Efficient algorithms for mining outliers from large data sets

TL;DR: A novel formulation for distance-based outliers, based on the distance of a point from its kth nearest neighbor, is proposed; points are ranked by this distance, and the top n points in the ranking are declared to be outliers.
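
The kth-nearest-neighbor formulation summarized in the last reference above can be illustrated with a minimal Python sketch. It is a brute-force version written only to show the ranking idea, not the efficient algorithms that reference proposes; the function names `kth_nn_distance` and `top_n_outliers`, the Euclidean metric, and the sample points are assumptions made for illustration.

```python
import math


def kth_nn_distance(points, k):
    """For each point, return the Euclidean distance to its k-th nearest neighbor.

    Brute force: every pairwise distance is computed, so the cost is
    quadratic in the number of points (illustrative only).
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_n_outliers(points, k, n):
    """Return the indices of the n points with the largest k-th NN distance."""
    scores = kth_nn_distance(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical data: four clustered points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_n_outliers(points, k=2, n=1))  # prints [4]
```

Here the distant point at index 4 has by far the largest distance to its second-nearest neighbor, so it is ranked first. Both `k` and `n` must be chosen by the user, and `k` must be smaller than the number of points.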