Project Abstract
Information flow analysis has largely ignored the setting where the analyst has neither control over nor a complete model of the analyzed system. We formalize such limited information flow analyses and study one instance of them: detecting the usage of data by websites. We prove that these problems are ones of causal inference. Leveraging this connection, we push beyond traditional information flow analysis to provide a systematic methodology based on experimental science and statistical analysis. Our methodology allows us to systematize prior works in the area, viewing them as instances of a general approach, and to develop a statistically rigorous tool, AdFisher, for detecting information usage.
AdFisher uses machine learning to automate the selection of a statistical test. We use it to find that Google's Ad Settings is opaque about some features of a user's profile, that it does provide some choice on ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse will change the ads shown but not the settings page. We also found that setting the gender to female results in getting fewer instances of an ad related to high-paying jobs than setting it to male.
For this last gender-related finding, we explored ways in which discrimination may arise in the targeting of job-related advertising, noting the potential for multiple parties to contribute to its occurrence. We then examined the statutes and case law interpreting the prohibition on advertisements that indicate a preference based on protected class, and considered its application to online advertising. We argued that online services can lose Communications Decency Act Section 230 immunity if they target ads toward or away from protected classes without explicit instructions from advertisers to do so.
Software
We make our tool, AdFisher, freely available on GitHub at https://github.com/tadatitam/info-flow-experiments.
The code used for running our experiments and the raw data from them are available below with each publication that details the results. Also see the tool's webpage to learn about the results we found with it.
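To give a flavor of the statistical analysis such experiments involve, the following is a minimal sketch of a permutation test comparing two groups of browser agents that differ in a single setting. It is not taken from AdFisher: the function name `permutation_test`, the difference-in-means test statistic, and the ad counts are all hypothetical and chosen only for illustration. The tool itself selects its test statistic automatically, which this sketch does not attempt.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the absolute difference in means.

    Hypothetical helper for illustration; not part of AdFisher.
    Returns an estimated p-value for the null hypothesis that the
    group labels have no effect on the measurements.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign measurements to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up data: number of times one particular ad was shown to each agent.
    ads_seen_setting_1 = [10, 12, 11, 13, 9, 12]
    ads_seen_setting_2 = [1, 0, 2, 1, 0, 1]
    p_value = permutation_test(ads_seen_setting_1, ads_seen_setting_2)
    print(f"estimated p-value: {p_value:.4f}")
```

A small p-value suggests that the setting which differed between the two groups affected how often the ad was shown. Because the agents are randomly assigned to groups, this kind of test needs no model of how the website works internally.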
Publications
Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination
Privacy Enhancing Technologies Symposium (PETS) 2015
See the website: here
Read the paper: official version, preprint
Tech report arXiv:1408.6491: version 1, version 2
Download the code and raw data: version 1, version 2
Read additional details here

A Methodology for Information Flow Experiments
The IEEE Computer Security Foundations Symposium (CSF) 2015
Read the paper: official version, preprint
Tech report arXiv:1405.2376
Read the TR here
Download the code and raw data here

Poster: Information Flow Experiments to study News Personalization
Poster at the IEEE Symposium on Security and Privacy, 2015
Read the paper: official abstract, preprint

Information Flow Investigations: Extended Abstract
Abstract for 5-Minute Talk at CSF 2013
Read the paper here