This book provides modern technical answers to the legal requirements of pseudonymisation as recommended by privacy legislation. It covers topics such as modern regulatory frameworks for sharing and linking sensitive information, concepts and algorithms for privacy-preserving record linkage and their computational aspects, practical considerations such as dealing with dirty and missing data, as well as privacy, risk, and performance assessment measures. Existing techniques for privacy-preserving record linkage are evaluated empirically, and real-world application examples that scale to population sizes are described. The book also includes pointers to freely available software tools, benchmark data sets, and tools to generate synthetic data that can be used to test and evaluate linkage techniques.

This book consists of fourteen chapters grouped into four parts, and two appendices. The first part introduces the reader to the topic of linking sensitive data, the second part covers methods and techniques to link such data, the third part discusses aspects of practical importance, and the fourth part provides an outlook on future challenges and open research problems relevant to linking sensitive databases. The appendices describe, and provide pointers to, freely available open-source software systems that allow the linkage of sensitive data, and give further details about the evaluations presented. A companion Web site at https://dmm.anu.edu.au/lsdbook2020 provides additional material and the Python programs used in the book.

This book is mainly written for applied scientists, researchers, and advanced practitioners in governments, industry, and universities who are concerned with developing, implementing, and deploying systems and tools to share sensitive information in administrative, commercial, or medical databases.


The book describes how linkage methods work and how to evaluate their performance. It covers all the major concepts and methods and also discusses practical matters such as computational efficiency, which are critical if the methods are to be used in practice - and it does all this in a highly accessible way!
David J. Hand, Imperial College, London

Peter Christen is Professor at the Australian National University (ANU) Research School of Computer Science (RSCS). His research interests are in record linkage and data mining, with a focus on privacy and machine learning aspects of record linkage. He has published nearly 200 articles in these areas, including the monograph 'Data Matching' published by Springer in 2012.

Thilina Ranbaduge is a research fellow at the Australian National University (ANU) Research School of Computer Science (RSCS). His research interests are in privacy-preserving record linkage, multi-database linkage, and data mining. He has published more than 30 papers related to record linkage and privacy-preserving record linkage.

Rainer Schnell is Professor at the University of Duisburg-Essen and holds the Chair in Research Methodology in the Social Sciences. He was Director of the Centre of Comparative Social Surveys at City, University of London, from 2015 to 2017. He is a survey methodologist with a research focus on non-sampling errors, applied sampling, census operations, and privacy-preserving record linkage. He has published four books on research methodology and more than 60 papers on record linkage.

