Wang, Peng

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2014-05-05

Authors/Contributors

Author: Wang, Peng

Abstract

Publishing data without revealing sensitive information about individuals is an important problem in computer science. In recent years, several methods have been widely used to protect people's privacy: generalization, bucketization, and randomization. In this thesis, we begin by defining several well-known privacy protection notions (k-anonymity, l-diversity, and t-closeness) and discussing their three major drawbacks: 1) a lack of flexibility in handling different types of variable sensitivity; 2) a large loss of information utility; and 3) vulnerability to auxiliary information. We then propose a new approach that generates buckets of multiple sizes to better protect individual privacy. This approach also achieves higher information utility without violating personal privacy. We design two pruning algorithms for two-sized bucketing: loss-based pruning and privacy-based pruning. Both make the two-sized bucketing algorithm run efficiently on real data. We also implement a recursive algorithm to test our multiple-sized bucketing approach. Finally, we demonstrate its effectiveness through empirical studies on real data.

Keywords

Identifier

etd8398

Copyright statement

Copyright is held by the author.

Permissions

The author granted permission for the file to be printed, but not for the text to be copied and pasted.

Scholarly level

Graduate student (Masters)

Supervisor or Senior Supervisor

Thesis advisor: Wang, Ke

Member of collection

Computing Science Theses

Download file	Size
etd8398_PWang.pdf	1.19 MB

Multiple-sized Bucketization For Privacy Protection
