Data collection is ubiquitous. Data are useful for a variety of purposes, from supporting research to helping allocate political representation. It benefits society to enable data use for such purposes, but it’s also important to protect people’s privacy in the process. Organizations across industry and government are increasingly turning to differential privacy (DP), an approach to privacy-preserving data analysis that limits how much information about any individual can be learned from an analysis. Chances are DP has been used to provide a privacy guarantee for an analysis of your data: companies like Google, Apple, Meta, Microsoft, and Uber, as well as government agencies like the U.S. Census Bureau, have all used it in the past several years.
Not all differential privacy systems are created equal, though. The strength of the protection DP offers depends on a “privacy loss budget” parameter called epsilon, which measures how much information about individuals can “leak” from the use of their data. Epsilon can be set anywhere from zero to infinity, where smaller values correspond to stronger privacy protections; larger epsilons can leak far more information about individuals. For example, when epsilon is 0.1, an observer or attacker is at most about 1.1 times as likely to learn something about you as they would be if they had never seen your data. When epsilon is 10, that bound grows to roughly 22,000.

Despite epsilon’s importance as an indicator of privacy risk, it is seldom communicated to the people whose personal data are used by technology companies and other large organizations. This is in part because epsilon is difficult to reason about, even for experts: it is a unitless, contextless parameter, which makes it hard to map onto real-world outcomes, and it specifies probabilistic guarantees, so people must reason under uncertainty to fully grasp its implications. But declining to explain epsilon to people who are deciding whether to share their data under DP leaves them ill-informed about the protections being offered.
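Those 1.1× and 22,000× figures come straight from the e^ε factor in the standard DP definition: for any two datasets differing in one person’s record, the probability of any output can change by at most a multiplicative factor of e^ε. A minimal Python sketch (the function name here is ours, purely for illustration) reproduces the numbers:

```python
import math

def max_inference_gain(epsilon: float) -> float:
    """Worst-case multiplicative increase in an observer's odds of
    learning something about you from a DP release: e^epsilon."""
    return math.exp(epsilon)

for eps in (0.1, 1.0, 10.0):
    print(f"epsilon = {eps:>5}: odds grow by at most "
          f"{max_inference_gain(eps):,.1f}x")

# epsilon =   0.1: odds grow by at most 1.1x
# epsilon =   1.0: odds grow by at most 2.7x
# epsilon =  10.0: odds grow by at most 22,026.5x
```

Note that the growth is exponential in epsilon, which is why seemingly modest increases in the budget (0.1 to 10) translate into wildly different worst-case guarantees.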
You'll Probably Be Protected: Explaining Differential Privacy Guarantees
https://cdt.org/insights/youll-probably-be-protected-explaining-differential-privacy-guarantees/