Information and Estimation Theoretic Approaches to Data Privacy

dc.contributor.authorAsoodeh, Shahaben
dc.contributor.departmentMathematics and Statisticsen
dc.contributor.supervisorLinder, Tamásen
dc.contributor.supervisorAlajaji, Fadyen
dc.date.accessioned2017-05-26T20:19:29Z
dc.date.available2017-05-26T20:19:29Z
dc.degree.grantorQueen's University at Kingstonen
dc.description.abstractIn the 1960s, Warner [145] proposed a simple mechanism, now referred to as the randomized response model, as a remedy for what he termed “evasive answer bias” in survey sampling. The randomized response setting is as follows: $n$ people participate in a survey, and a statistician asks each individual a sensitive yes-no question and seeks to estimate the ratio of "yes" responses. For privacy purposes, each individual is given a biased coin that comes up heads with probability $a\in(0,\frac{1}{2})$ and flips it in private. If it comes up heads, they lie; if it comes up tails, they tell the truth. Warner derived an unbiased maximum likelihood estimator for the true ratio of "yes" responses based on the reported responses. Thus the parameter of interest is estimated accurately while preserving the privacy of each user and avoiding evasive answer bias. In this thesis, we generalize Warner's randomized response model in several directions: (i) we assume that the response of each individual consists of private and non-private data, and the goal is to generate a response which carries as much "information" about the non-private data as possible while limiting the "information leakage" about the private data, (ii) we propose mathematically well-founded metrics to quantify the tradeoff between how much the response leaks about the private data and how much information it conveys about the non-private data, (iii) we make no assumptions on the alphabets of the private and non-private data, and (iv) we design optimal response mechanisms which achieve the fundamental tradeoffs. Unlike the large body of recent research on privacy, which studied the problem of reducing disclosure risk, in this thesis we formulate and study the tradeoff between utility (e.g., statistical efficiency) and privacy (e.g., information leakage).
Our two-fold approach (information-theoretic and estimation-theoretic) and our results shed light on the fundamental limits of the utility-privacy tradeoff.en
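The randomized response setting described in the abstract can be illustrated with a short simulation. This is a minimal sketch and is not taken from the thesis: the function names, the sample size, the coin bias a = 0.3 and the true ratio 0.4 are all illustrative assumptions. If p is the true ratio of "yes" answers, the expected ratio of reported "yes" answers is a + p(1 - 2a), so inverting this relation gives an unbiased estimate of p.

```python
import random


def randomized_response(truth, a, rng):
    """Return the reported answer: flipped with probability a (heads), truthful otherwise."""
    lie = rng.random() < a  # heads with probability a
    return (not truth) if lie else truth


def estimate_yes_ratio(reports, a):
    """Estimate the true 'yes' ratio by inverting E[observed] = a + p * (1 - 2a)."""
    observed = sum(reports) / len(reports)
    return (observed - a) / (1 - 2 * a)


# Hypothetical survey: 10,000 participants, 40% of whom would truthfully answer "yes".
rng = random.Random(0)
a = 0.3
true_answers = [i < 4000 for i in range(10000)]
reports = [randomized_response(t, a, rng) for t in true_answers]
estimate = estimate_yes_ratio(reports, a)
print(f"estimated 'yes' ratio: {estimate:.3f}")
```

Because the estimate is a linear function of the observed ratio, it can fall slightly outside [0, 1] for small samples. The closer a is to 1/2, the stronger the privacy of each response and the larger the variance of the estimate, which is a simple instance of the utility-privacy tradeoff the thesis studies.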
dc.description.degreePhDen
dc.embargo.liftdate2017-11-07
dc.embargo.termsWe are writing a paper based on the thesis. Hence, we prefer to make the thesis publicly available after the paper is submitted to a journal (which is likely to be around July 1st).en
dc.identifier.urihttp://hdl.handle.net/1974/15872
dc.language.isoengen
dc.relation.ispartofseriesCanadian thesesen
dc.rightsAttribution-NonCommercial-NoDerivs 3.0 United Statesen
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subjectInformation theoryen
dc.subjectEstimation theoryen
dc.subjectData privacyen
dc.subjectPrivacy-preserving mechanism designen
dc.titleInformation and Estimation Theoretic Approaches to Data Privacyen
dc.typethesisen

Files

Original bundle

Name: Asoodeh_Shahab_201705_PhD.pdf
Size: 1.23 MB
Format: Adobe Portable Document Format
Description: Thesis document

License bundle

Name: license.txt
Size: 2.6 KB
Format: Item-specific license agreed upon to submission
Description: