Big Data’s ‘Correlation’ Tactics Defy Privacy Regulation’s ‘Causation’ Basis, Creating Ethical Concerns, Panelists Say
Big data could drastically alter genetic research and education, but big data tactics in those fields have also raised some of the biggest social and ethical concerns, said panelists Monday night at a White House big data review workshop. Any regulatory framework should establish principles that account for such concerns without creating sector-specific rules, said Alondra Nelson, a Columbia University sociology professor who studies biomedicine and health.
Privacy regulations should give consumers a degree of due process, said Microsoft Principal Researcher Kate Crawford, allowing them to review the data used in important decisions, such as those about healthcare. The government must educate consumers and parents on big data’s benefits if it’s to be used properly and effectively, said Steven Hodas, executive director of the New York City Education Department’s Innovate NYC program.
White House Chief Privacy Officer Nicole Wong kicked off Monday’s event, webcast from New York University. It was the second of three workshops the White House helped organize as part of its big data review (WID Jan 24 p8). The first workshop focused on the technology behind big data (WID March 4 p3), while Monday’s event dealt with social and ethical concerns. A final workshop -- April 1 at the University of California, Berkeley -- will tackle governance issues, said Wong. She said the review group -- which has met with privacy and civil liberties advocates, various federal agencies, academics, international regulators, and private sector leaders in healthcare and finance, among others -- will conclude its work in mid-April.
"The hard work is in the middle,” Wong said. The group must examine the cases where the desire for privacy protections abuts big data tactics that could advance or imperil “important values like fairness, or nondiscrimination, or human freedom, or health and safety, or economic prosperity,” Wong said.
One of those cases is genetic data, Columbia’s Nelson said. Genetic data is “multivocal”; it “contain[s] multitudes,” she said, referencing Walt Whitman’s poem “Song of Myself.” Genes hold numerous types of information simultaneously, she said, and can be used in genealogical research, medical examinations or forensic work.
Think of the genome of Henrietta Lacks, Nelson said, whose continuously reproducing cells have been central to medical research -- from the creation of the polio vaccine to cloning research -- for over 60 years. Scientists recently made Lacks’ fully sequenced genome publicly available online, believing nothing could be inferred about her relatives, Nelson said. Other researchers quickly dispelled that notion, using the genome to uncover a great deal of information, from physical appearance to the diseases family members were prone to develop. The episode revealed “a lack of concern about privacy and consent,” Nelson said, and “resulted in a dehumanization of data,” she said. “That’s what we should be concerned about.”
Using genomes to infer familial information is no longer rare, Nelson said. Genetic technology is used to identify criminal suspects, she said. Unidentified genetic material can be cross-referenced against DNA in police or medical databases and may match family members who have no criminal history. “This is more prevalent,” she said, since a June Supreme Court decision held that police may take DNA samples from those arrested for, but not convicted of, serious crimes (http://1.usa.gov/OtGw5v). “We need a flexible, analytical, ethical and regulatory approach to account for the inherent characteristics of DNA that make it informative simultaneously in these numerous contexts,” Nelson said. For instance, she said, the Genetic Information Nondiscrimination Act (GINA) prevents employers and insurers from discriminating on the basis of DNA testing. But that applies only to genetic medical testing, not to law enforcement activity that yields genetic material, Nelson said. “We need policies that take the social life of DNA into account.”
Big data techniques like this are “concerned with correlation,” but privacy regulations are “concerned with causation,” said Microsoft’s Crawford -- and that’s a problem. The Health Insurance Portability and Accountability Act (HIPAA) provides “some of the strongest privacy protections” for health data, Crawford said. But not for big health data: “Bringing HIPAA to big data is kind of like bringing a knife to a gunfight,” she said. A due process framework for big data would be a better approach, she said, giving consumers a right to know when big data is being used to make decisions about them. “That decision, if serious enough, should mean the individual should know that big data is being brought into this determination, and they should be able to challenge that determination and correct that data,” Crawford said. Health and employment determinations merit substantial due process, she said; advertising determinations, less so. Such an approach also avoids “sector-based” regulation, she said.
And it empowers consumers, said Innovate NYC’s Hodas; failing to do so has been a “crucial error” of many big data initiatives. Hodas pointed to inBloom, a nonprofit that aimed to create educational data dashboards that educators and parents could access, and that would let school districts control who had access to specific data. Funded by the Bill & Melinda Gates Foundation and Carnegie Corp., inBloom launched in February 2013 (http://bit.ly/1daTDE3) with the support of nine states, Hodas said. It faced immediate opposition, smeared as a “national database” controlled by “unaccountable” nonprofits, he said. All nine states have since backed out or have legislation pending that would bar participation, he said.
That inBloom could be “stigmatized” so easily shows the depth of public hesitation about “datafication,” Hodas said. People feel “human knowledge and understanding are being replaced by algorithm,” he said. But those same people -- students, educators and especially parents -- were never engaged at the outset, Hodas said. “The crucial error of inBloom’s evangelists was not at the outset creating software for parents,” he said. “That mistake is unfortunately symptomatic of much government and foundation policymaking. A simple failure of empathy and the basic principles of marketing.”