Information Flow Experiments

Project Abstract

Information flow analysis has largely ignored the setting where the analyst has neither control over nor a complete model of the analyzed system. We formalize such limited information flow analyses and study an instance of them: detecting the usage of data by websites. We prove that these are problems of causal inference. Leveraging this connection, we push beyond traditional information flow analysis to provide a systematic methodology based on experimental science and statistical analysis. Our methodology allows us to systematize prior works in the area, viewing them as instances of a general approach, and to develop a statistically rigorous tool, AdFisher, for detecting information usage.

AdFisher uses machine learning to automate the selection of a statistical test. We use it to find that Google's Ad Settings is opaque about some features of a user's profile, that it does provide some choice on ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse changes the ads shown but not the settings page. We also found that setting the gender to female results in fewer instances of an ad related to high-paying jobs than setting it to male.
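A comparison like the gender finding above, where one group of simulated users sees fewer instances of an ad than another, can be checked for statistical significance with a randomization-based test. The following is a minimal sketch of a one-sided permutation test on a difference in means. It is an illustration under stated assumptions, not AdFisher's code: the function name, the choice of test statistic, and the ad counts are all hypothetical, and the counts are made up for the example.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, num_permutations=10000, seed=0):
    """Estimate a one-sided p-value for mean(group_b) > mean(group_a).

    Under the null hypothesis the group labels are exchangeable, so we
    repeatedly shuffle the pooled measurements, split them into two groups
    of the original sizes, and count how often the shuffled difference in
    means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        shuffled_diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if shuffled_diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimate from being exactly zero.
    return (at_least_as_extreme + 1) / (num_permutations + 1)


# Hypothetical data: number of times each simulated browser saw a given ad.
# These numbers are invented for illustration and are not experimental results.
female_counts = [2, 0, 1, 3, 1, 0, 2, 1]
male_counts = [9, 12, 7, 10, 11, 8, 13, 9]

p_value = permutation_test(female_counts, male_counts)
same_group_p = permutation_test([1, 2, 3], [1, 2, 3])
```

A small `p_value` indicates that a difference this large would rarely arise from random assignment of labels alone. When both groups have the same measurements, as in `same_group_p`, the estimate stays large.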

For this last gender-related finding, we explored ways in which discrimination may arise in the targeting of job-related advertising, noting the potential for multiple parties to contribute to its occurrence. We then examined the statutes and case law interpreting the prohibition on advertisements that indicate a preference based on protected class, and considered its application to online advertising. We argued that online services can lose Communications Decency Act Section 230 immunity if they target ads toward or away from protected classes without explicit instructions from advertisers to do so.

Software

We make our tool, AdFisher, freely available on GitHub at https://github.com/tadatitam/info-flow-experiments.

The code used for running our experiments and the raw data from them are available below, listed with the publication that details each set of results. Also see the tool's webpage to learn about the results we found with it.

Publications

Amit Datta, Anupam Datta, Jael Makagon, Deirdre K. Mulligan, and Michael Carl Tschantz
Discrimination in Online Advertising: A Multidisciplinary Inquiry
Conference on Fairness, Accountability, and Transparency (FAT*), 2018
Read the paper: pdf
Read additional details: appendix

Amit Datta, Michael Carl Tschantz, and Anupam Datta
Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination
Privacy Enhancing Technologies Symposium (PETS) 2015
See the website: here
Read the paper: official version, preprint
Tech report arXiv:1408.6491: version 1, version 2
Download the code and raw data: version 1, version 2
Read additional details here

Michael Carl Tschantz, Amit Datta, Anupam Datta, and Jeannette M. Wing
A Methodology for Information Flow Experiments
The IEEE Computer Security Foundations Symposium (CSF) 2015
Read the paper: official version, preprint
Tech report arXiv:1405.2376
Read the TR here
Download the code and raw data here

Amit Datta, Anupam Datta, Suman Jana, and Michael Carl Tschantz
Poster: Information Flow Experiments to study News Personalization
Poster at the IEEE Symposium on Security and Privacy, 2015
Read the paper: official abstract, preprint

Michael Carl Tschantz, Anupam Datta, and Jeannette M. Wing
Information Flow Investigations
CMU Tech Report CMU-CS-13-118
Read the paper here

Michael Carl Tschantz, Anupam Datta, and Jeannette M. Wing
Information Flow Investigations: Extended Abstract
Abstract for 5-Minute Talk at CSF 2013
Read the paper here