Journal Article

Privacy preserving record linkage approaches

TLDR
This work proposes a methodology for preserving privacy in various record linkage approaches; implements, examines and compares four pairs of privacy-preserving record linkage methods and protocols; and presents a blocking scheme as an extension to the privacy-preserving record linkage methodology.
Abstract
Privacy-preserving record linkage is an important task, largely because of the sensitive nature of personal data. Its main goal is to match records across the data sets or databases of different organisations without revealing competitive or personal information to non-owners. Several methods and protocols have been proposed to accomplish this. In this work, we propose a methodology for preserving privacy in various record linkage approaches, and we implement, examine and compare four pairs of privacy-preserving record linkage methods and protocols. Two of these protocols use n-gram-based similarity comparison techniques, the third uses the well-known edit distance, and the fourth implements the Jaro-Winkler distance metric. All of the protocols are enhanced by private key cryptography and hash encoding. This paper also presents a blocking scheme as an extension to the privacy-preserving record linkage methodology. Our comparison is backed by an extensive experimental evaluation that demonstrates the performance achieved by each of the proposed protocols.
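To make the comparison techniques named in the abstract concrete, the following is a minimal Python sketch of a bigram (n-gram) Dice similarity, the edit distance, and a keyed-hash encoding of bigrams. It is an illustration under stated assumptions, not the protocols evaluated in the paper: the function names, the `_` padding character, the choice of HMAC-SHA256 as the keyed hash, and the idea of hashing each bigram individually are all assumptions made for this example.

```python
import hashlib
import hmac


def bigrams(s: str) -> set:
    """Return the set of character 2-grams of a padded, lower-cased string."""
    padded = f"_{s.lower()}_"  # "_" padding is an illustrative choice
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def dice(a: set, b: set) -> float:
    """Dice coefficient of two sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (cs != ct)))  # substitution
        prev = curr
    return prev[-1]


def encode_bigrams(s: str, key: bytes) -> set:
    """Replace each bigram by its HMAC-SHA256 digest under a shared secret key.

    Assumption for illustration only: both data owners hold `key`, and the
    party comparing the encoded sets does not.
    """
    return {hmac.new(key, g.encode("utf-8"), hashlib.sha256).hexdigest()
            for g in bigrams(s)}


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key
    print(dice(bigrams("smith"), bigrams("smyth")))
    print(dice(encode_bigrams("smith", key), encode_bigrams("smyth", key)))
    print(edit_distance("smith", "smyth"))
```

Because the keyed hash is deterministic, the Dice coefficient over the encoded sets equals the one over the plain bigram sets, so the comparison can be carried out without exchanging the original strings. Deterministic per-bigram hashing on its own can still leak frequency information, which is one reason real protocols add further protection.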


Citations
Book

Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection

TL;DR: Data matching (also known as record or data linkage, entity resolution, object identification, or field matching) is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database.
Journal Article

Privacy-preserving record linkage using Bloom filters

TL;DR: A new protocol for privacy-preserving record linkage with encrypted identifiers allowing for errors in identifiers has been developed, based on Bloom filters on q-grams of identifiers, which yields linkage results comparable to non-encrypted identifiers and superior to results from phonetic encodings.
Journal Article

A taxonomy of privacy-preserving record linkage techniques

TL;DR: This paper presents an overview of techniques that allow the linking of databases between organizations while at the same time preserving the privacy of these data, and presents a taxonomy of PPRL techniques to characterize these techniques along 15 dimensions.
Book Chapter

Privacy-Preserving Record Linkage for Big Data: Current Approaches and Research Challenges

TL;DR: PPRL for Big Data poses several major challenges, including scalability to multiple large databases, due to their massive volume and the flow of data within Big Data applications, and achieving high-quality linkage results in the presence of the variety and veracity of Big Data.
Book Chapter

A constraint satisfaction cryptanalysis of bloom filters in private record linkage

TL;DR: A novel attack, based on constraint satisfaction, is introduced to provide a rigorous analysis of BFE, together with guidelines on how to mitigate the risk posed by the attack; an empirical analysis with data derived from public voter records illustrates the feasibility of the attack.
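
Several of the citing works above concern Bloom filters built on the q-grams of identifiers. The idea can be sketched in Python as follows. This is a hedged illustration only: the filter length `m`, the number of hash functions `k`, the `_` padding, and the use of HMAC-SHA256 with an index prefix to derive the `k` hash functions are all assumptions made for this example, not parameters taken from any of the listed papers.

```python
import hashlib
import hmac


def qgrams(s: str, q: int = 2) -> set:
    """Return the set of character q-grams of a padded, lower-cased string."""
    padded = "_" * (q - 1) + s.lower() + "_" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(s: str, key: bytes, m: int = 1000, k: int = 20) -> set:
    """Map each q-gram to k positions in an m-bit filter.

    The filter is represented as the set of bit positions that are set to 1.
    """
    positions = set()
    for g in qgrams(s):
        for i in range(k):
            # Derive the i-th keyed hash by prefixing the hash index (assumption).
            digest = hmac.new(key, f"{i}:{g}".encode("utf-8"),
                              hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % m)
    return positions


def dice(a: set, b: set) -> float:
    """Dice coefficient on the sets of 1-bits of two filters."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key known only to the data owners
    smith = bloom_encode("smith", key)
    print(dice(smith, bloom_encode("smyth", key)))  # similar names
    print(dice(smith, bloom_encode("jones", key)))  # dissimilar names
```

Names that share many q-grams set many of the same bits, so the Dice coefficient of the filters approximates the q-gram similarity of the underlying strings while tolerating spelling errors. As the cryptanalysis entry above indicates, such encodings are open to attack, so the sketch should not be read as a secure design.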