Advances in computer science have enabled analysis of data in ways previously unthinkable. This has led to powerful new uses of data, providing us with countless benefits across virtually all aspects of our lives. For systems utilizing sensitive data, novel functionality has sometimes provided novel routes for exposure of the underlying data. This functionality may come with dangerous new assumptions or undermine old ones, allowing unexpected inferences. As a result, release of seemingly innocuous information may reveal sensitive data in surprising new ways. This exposure can be detrimental, but it can often enable desirable new capabilities as well. Regardless of whether these exposures can help or harm us, we benefit from a deeper understanding of when and how they can arise.
We explore this issue in the context of three systems. First, we examine the impact on election systems of recent advances in the ability to reidentify sheets of paper. These advances pose some threat to the secret ballot, but they enable new measures for verifying election integrity. Next, we develop techniques for using markings on Scantron-style bubble forms as a biometric and discuss that use. These forms are used in a variety of circumstances, and the potential implications range from enabling cheating detection on standardized tests to undermining anonymous surveys. Finally, we examine data leakage from collaborative filtering recommender systems, finding that recommendations can be inverted to infer individuals' underlying transactions. This demonstrates that even subjecting data to massive-scale aggregation and complex algorithms can be insufficient to protect sensitive details. By explicitly considering the ways in which novel functionality exposes sensitive data, we hope to reduce the risk to those whose intimate details may be encoded in this data while encouraging responsible uses of the data.