What does ε-differential privacy mean?


Differential privacy (Dwork et al. 2006) is an emerging standard for database privacy. As such, differential privacy is concerned with databases describing individuals. Given a randomized algorithm for answering queries on a database, differential privacy requires that the probability of any fixed response changes by at most a factor of \(e^\epsilon\) upon the substitution of any single individual in the database. What this means can be a little mysterious. The goal of this post is to shed a little light on this mystery by giving a Bayesian interpretation of \(e^\epsilon\).

The idea behind differential privacy is that if the effect of making an arbitrary single substitution in the database is small enough, the query result cannot be used to infer much about any single individual, and therefore provides privacy. Differentially private processes rely on randomization, that is, making random choices, to achieve the desired insensitivity to arbitrary single-point changes. Furthermore, differential privacy is defined as a property of the process, independent of the input. Applying randomization generally degrades the utility of the query result, as it becomes “noisy.” The challenge of creating differentially private methods is to balance this requirement with providing the best query answer we can.

More formally, we declare two databases to be “neighbors” if they differ in at most one point (row). A randomized algorithm \(A\) is a procedure that is allowed to make random decisions, usually implemented by giving the algorithm access to a stream of random bits. Such a randomized algorithm is \(\epsilon\)-differentially private if for all pairs \((D, D')\) of neighboring databases and all subsets \(S\) of possible outputs of \(A\), we have that \[P(A(D) \in S) \leq \exp(\epsilon)\; P(A(D') \in S),\] where the probability is taken over the randomness in \(A\).
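
To make the definition concrete, here is a minimal sketch (in Python; the function names and toy database are my own choices, not part of the definition) of the classic Laplace mechanism from Dwork et al. (2006) applied to a counting query. A counting query changes by at most 1 when a single row is substituted, so adding Laplace noise with scale \(1/\epsilon\) makes the answer \(\epsilon\)-differentially private.

```python
import numpy as np

def private_count(db, predicate, epsilon, rng=None):
    """Laplace mechanism for a counting query.

    Substituting one row changes the true count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(predicate(row) for row in db)
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Toy database: how many individuals are older than 40?
D = [{"age": 35}, {"age": 52}, {"age": 61}]
print(private_count(D, lambda row: row["age"] > 40, epsilon=0.5))
```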

Now assume that Bob has a database and he runs an \(\epsilon\)-differentially private algorithm \(A\) to produce output \(x\). Alice receives \(x\). She has a suspicion that the data Bob has is \(D\), and wants to compute the probability \(P(D | x)\) of this being the case conditioned on the new information \(x\) she just received. Since Bob is not a big fan of security through obscurity, the algorithm \(A\) is no secret and Alice has access to it. This means that she can compute \(P(x | D) = P(A(D) = x)\).
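
Continuing the hypothetical Laplace counting example (an assumption on my part; the argument only requires that \(A\) be public), the likelihood \(P(x \mid D)\) is simply the Laplace density, centered at the count that \(D\) would produce, evaluated at the observed output \(x\). Strictly speaking this is a density rather than a probability, since the output is continuous, but the ratios that follow are unaffected.

```python
import numpy as np

def laplace_likelihood(x, true_count, epsilon):
    """P(A(D) = x) when A is the Laplace counting mechanism and D yields true_count.

    (A density, since the output is continuous.)
    """
    scale = 1.0 / epsilon
    return np.exp(-np.abs(x - true_count) / scale) / (2.0 * scale)

# Alice observed x = 2.3 and suspects a database whose true count is 2.
print(laplace_likelihood(2.3, true_count=2, epsilon=0.5))
```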

Bayes’ Theorem states that \[ P(D | x) = \frac{P(x | D)P(D)}{P(x)}. \] Here \(P(D)\) reflects Alice’s prior suspicion that \(D\) is the data Bob has, i.e., the probability that she assigns to \(D\) being Bob’s data before receiving \(x\). Bayes’ Theorem gives Alice a way to update this prior to a posterior probability \(P(D | x)\).

Alice knows that an alternative statement of Bayes’ Theorem, also called Bayes’ Rule, can be useful. For an alternative event \(D'\) we have that \[ \frac{P(D | x)}{P(D' | x)} = \frac{P(D)}{P(D')} \frac{P(x | D)}{P(x | D')} \] which we can write as \[ OR(D, D', x) = O(D,D') \; \Lambda(x, D, D') \] where \(OR(D, D', x) = \frac{P(D | x)}{P(D' | x)}\) is called the posterior odds ratio, \(O(D,D') = \frac{P(D)}{P(D')}\) is called the prior odds ratio, and \(\Lambda(x,D,D') = \frac{P(x | D)}{P(x | D')}\) is called the likelihood ratio.

What Bayes’ Rule states is that the posterior odds of two alternatives are the product of their prior odds and the likelihood ratio of the data under these alternatives. The likelihood ratio \(\Lambda\) can also be interpreted as the predictive power of \(D\): it is the degree to which hypothesis \(D\) surpasses hypothesis \(D'\) as a predictor of the data \(x\).
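
As a toy numeric illustration (numbers invented for the example): if Alice starts out agnostic, so \(O(D, D') = 1\), and the observed \(x\) is 1.2 times as likely under \(D\) as under \(D'\), then \[ OR(D, D', x) = 1 \times 1.2 = 1.2, \] i.e., Alice now favors \(D\) over \(D'\) by odds of 6 to 5.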

We can restate differential privacy as the following ratio bound for neighboring databases \(D\) and \(D'\) \[ \frac{P(A(D) \in S)}{P(A(D') \in S)} \leq \exp(\epsilon), \] which, taking \(S\) to be the observed output \(x\), we recognize as \[ \Lambda(x, D, D') \leq \exp(\epsilon). \] In other words, we can think of \(\epsilon\)-differential privacy as placing a bound on the predictive power in Bayes’ Rule, that is, on how much knowledge of \(x\) can update Alice’s prior odds ratio.
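
We can check this numerically in the hypothetical Laplace counting example from above: two neighboring databases have true counts differing by at most 1, and the ratio of the corresponding Laplace densities at any observed \(x\) never exceeds \(\exp(\epsilon)\), so Alice’s prior odds are multiplied by at most that factor.

```python
import numpy as np

def laplace_density(x, mu, scale):
    """Laplace density with location mu and the given scale."""
    return np.exp(-np.abs(x - mu) / scale) / (2.0 * scale)

epsilon = 0.5
scale = 1.0 / epsilon

# Neighboring databases differ in one row, so their true counts differ by at most 1.
xs = np.linspace(-10.0, 10.0, 1001)
ratios = laplace_density(xs, 2.0, scale) / laplace_density(xs, 3.0, scale)
print(ratios.max() <= np.exp(epsilon) + 1e-12)  # True: Lambda(x, D, D') <= exp(epsilon)
```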

References

Dwork, C., F. McSherry, K. Nissim, and A. Smith. 2006. “Calibrating Noise to Sensitivity in Private Data Analysis.” In 3rd IACR TCC, 486–503.