genewitch 2 days ago

AOL found out, and thus we all found out, that you can't anonymize certain things: web searches, in that case. I used to have bookmarked some literature from maybe ten years ago that said (proved with math?) that any moderate collection of data from or by individuals that fits certain criteria is de-anonymizable, if not by itself, then with minimal extra data. I want to say it covered the case where, for instance, instead of changing all occurrences of genewitch to user9843711, every instance of genewitch was replaced with a different, unique ID.
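A rough sketch of the two relabeling schemes described above, to make the difference concrete. The records and function names are made up for illustration; this is not how AOL or anyone else actually did it.

```python
import itertools

def consistent_pseudonyms(records):
    """Replace each author with one stable made-up ID (same author -> same ID)."""
    mapping = {}
    out = []
    for author, text in records:
        if author not in mapping:
            mapping[author] = f"user{len(mapping) + 1}"
        out.append((mapping[author], text))
    return out

def per_occurrence_ids(records):
    """Give every record a fresh ID, so no two records share one."""
    counter = itertools.count(1)
    return [(f"id{next(counter)}", text) for _author, text in records]

# Hypothetical search log: (author, query text).
records = [
    ("genewitch", "searched: plumbers near Springfield"),
    ("genewitch", "searched: genewitch forum posts"),
    ("someone_else", "searched: weather"),
]

linked = consistent_pseudonyms(records)    # first two rows share an ID
unlinked = per_occurrence_ids(records)     # every row gets its own ID
```

Either way the query text is left untouched, and that text is where the identifying detail lives. The second scheme only removes the explicit link between rows.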

I apologize for not having cites or a better memory at this time.

genewitch 1 day ago

> The root of this problem is the core problem with k-anonymity: there is no way to mathematically, unambiguously determine whether an attribute is an identifier, a quasi-identifier, or a non-identifying sensitive value. In fact, all values are potentially identifying, depending on their prevalence in the population and on auxiliary data that the attacker may have. Other privacy mechanisms such as differential privacy do not share this problem.
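To illustrate the quoted point, here is a minimal sketch of a k-anonymity check on a toy table. The rows, column names, and the `smallest_group` helper are all assumptions made up for the example. The k you get depends entirely on which columns you decide to treat as quasi-identifiers.

```python
from collections import Counter

def smallest_group(rows, quasi_identifiers):
    """Return k: the size of the smallest group of rows that share
    the same values on the chosen quasi-identifier columns."""
    keys = [tuple(row[col] for col in quasi_identifiers) for row in rows]
    return min(Counter(keys).values())

# Hypothetical, already-generalized table.
rows = [
    {"zip": "12345", "age": "30-39", "diagnosis": "flu"},
    {"zip": "12345", "age": "30-39", "diagnosis": "asthma"},
    {"zip": "12345", "age": "40-49", "diagnosis": "flu"},
    {"zip": "12345", "age": "40-49", "diagnosis": "rare condition"},
]

# Treating only zip and age as quasi-identifiers, the table looks 2-anonymous.
k_assumed = smallest_group(rows, ["zip", "age"])

# If the attacker also happens to know the "sensitive" column, k drops to 1.
k_with_aux = smallest_group(rows, ["zip", "age", "diagnosis"])
```

Nothing in the data says which of those two column lists is the right one; that choice is a guess about what the attacker already knows.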

See also: https://en.wikipedia.org/wiki/Differential_privacy which claims to solve this; that is, the wiki says that the only attacks are side-channel attacks, like errors in the algorithm or whatever.
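For a feel of what that looks like, here is a minimal sketch of the Laplace mechanism for a counting query, which is the textbook example of differential privacy. The function names and data are hypothetical, and this is a toy: it ignores the floating-point and implementation details that real libraries have to handle, which is exactly the kind of "error in the algorithm" mentioned above.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_count(values, predicate, epsilon, rng=None):
    """Answer 'how many values satisfy predicate?' with Laplace noise.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)

# Hypothetical data: ages of people in a dataset.
ages = [23, 35, 41, 29, 52, 38, 61, 33]
answer = noisy_count(ages, lambda a: a >= 40, epsilon=1.0)
```

Smaller `epsilon` means more noise and a stronger guarantee. The guarantee is about the released answer, not about any column being labeled an identifier, which is why the quasi-identifier problem does not come up.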

catlifeonmars 1 day ago

If you squint a little, this problem is closely related to oblivious transfer as well.