Contributions to Statistical Theory of Data Privacy

Publisher

Université d'Ottawa | University of Ottawa

Creative Commons

Attribution-NonCommercial-ShareAlike 4.0 International

Abstract

This thesis explores key challenges and methodologies in the statistical theory of data privacy, focusing on disclosure risk assessment and synthetic data generation. The research reviews established privacy frameworks, such as k-anonymity, ℓ-diversity, t-closeness, and differential privacy, and highlights their practical limitations. To address these gaps, a new approach to Correct Attribution Probability (CAP) is proposed, utilizing equivalence classes to enhance applicability and interpretability. The thesis also provides a detailed analysis of synthetic data generation methods, assessing their utility and privacy implications, and thoroughly examines the Synthpop package. Several improvements to Synthpop are proposed, including better handling of data dependencies, the incorporation of privacy metrics like differential privacy, and more robust utility evaluation methods. These contributions aim to improve the balance between data privacy and utility.
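
As a rough illustration of the equivalence-class idea mentioned in the abstract, the sketch below groups records by their quasi-identifiers, reports the k-anonymity level as the smallest group size, and computes a simple within-dataset attribution probability for each record. This is not the CAP approach proposed in the thesis, whose definition is not given here. The function names, the toy records, and the choice of quasi-identifiers are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return dict(classes)


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest equivalence class."""
    classes = equivalence_classes(records, quasi_identifiers)
    return min(len(members) for members in classes.values())


def attribution_probabilities(records, quasi_identifiers, target):
    """For each record, the share of its equivalence class that holds
    the same target value (an illustrative, simplified CAP-style score)."""
    classes = equivalence_classes(records, quasi_identifiers)
    scores = []
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        members = classes[key]
        counts = Counter(m[target] for m in members)
        scores.append(counts[rec[target]] / len(members))
    return scores


# Hypothetical toy data: "age" and "zip" act as quasi-identifiers,
# "diagnosis" is the sensitive target attribute.
records = [
    {"age": "30-39", "zip": "K1A", "diagnosis": "flu"},
    {"age": "30-39", "zip": "K1A", "diagnosis": "flu"},
    {"age": "30-39", "zip": "K1A", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "K2B", "diagnosis": "flu"},
    {"age": "40-49", "zip": "K2B", "diagnosis": "flu"},
]

k = k_anonymity(records, ["age", "zip"])
scores = attribution_probabilities(records, ["age", "zip"], "diagnosis")
```

In this toy example the second class satisfies 2-anonymity, yet every record in it shares one diagnosis, so the sensitive value can be attributed with certainty. That is the kind of gap between k-anonymity and attribute disclosure that attribution-based measures are meant to expose.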

Keywords

Data privacy, Synthetic data generation