Scholar argues for unified rules of data governance to achieve fairness, efficiency, and stability.
The idea of “big data” generates both fear and hope. Apple’s Tim Cook has portrayed data collection as a hostile force, warning that personal information “is being weaponized against us with military efficiency.” In contrast, other commentators praise data availability as a revolutionary force for good across industries such as health care and finance.
In a new paper, William Magnuson, a law professor at Texas A&M University School of Law, explains these conflicting perspectives. Magnuson describes how both angst and excitement about data revolve around three common values: fairness, efficiency, and stability. The increasing magnitude and availability of data can either undermine or promote these values, depending on how those data are managed and used.
Magnuson also explains that existing data laws and regulations apply only within limited contexts, but data do not stay within those boundaries, and neither should data rules. Given the common problems and opportunities that data present, Magnuson argues for a set of common approaches to data governance: a unified set of principles to enshrine in laws and regulations.
One set of problems and opportunities revolves around fairness. Magnuson first introduces the view that the expanding scope of data usage creates harm. Decisions may incorporate data points that correlate with race, religion, or sex in a way that leads to discrimination. A bank could use factors that correlate with race—say, zip codes or first names—to decide on issuing mortgages, leading to race-based discrimination. Or sentencing algorithms may rely on data about prior offenses or personality disorders, which could reinforce discriminatory patterns of conviction or diagnosis.
Widespread data may also undermine fairness by eroding human dignity and privacy. Decisions that treat humans as statistics do not respect the dignity of individuals. And uses of personal data by companies without individual consent unfairly disregard privacy.
But Magnuson points out that this perspective on fairness is only one side of the story. Availability of personal data also allows people to share information to their benefit, such as by disclosing purchasing histories that build credit. This competing perspective presents data analysis as a protector of dignity and equality, providing the tools to shine a light on bias and compensate for human flaws.
A similar dichotomy exists around efficiency, Magnuson reasons. He describes how targeted ads based on consumer data profiles border on predatory, exploiting vulnerabilities to push products and services. Not only can companies leverage data to manipulate consumer behavior, but they can also use customer data to hinder competition, protecting market position and raising prices. Having more data could also lead to erroneous decisions because decision-makers may defer to algorithmic outputs when a more nuanced judgment is required, Magnuson explains.
On the other hand, data can lead to more efficient consumer decisions by empowering informed choices. Data can improve competition by giving small startups in areas such as fintech and health care more avenues to compete against incumbents. And data can sharpen decisions when analyzed for applications such as autonomous vehicles or vaccine development.
Finally, the increasing availability of data both stabilizes and destabilizes systems. Data availability expands opportunities for a more responsive government, enhancing stability. Destabilizing forces also can increase, however, when data are readily accessible. Governments that have access to troves of data about residents can retaliate for civil dissent. And algorithms that trade on high-frequency market information can trigger erratic market shifts.
Magnuson presents unified solutions to minimize these problems across fairness, efficiency, and stability while still capturing beneficial opportunities.
He argues that individuals should have “substantial control” over their data, including cell phone data, health data, and financial data. Magnuson points out that even though this rule of individual control seems self-evident, it remains contested. Individual control would include property rights, and people could decide for themselves whether to maintain privacy or sell their information, striking their own balance of values.
Although individual control would be foundational, Magnuson argues that governments should have an “eminent domain” power over data—the right to access and collect data—for legitimate goals related to society’s health or safety. This right of governments would have limits to ensure transparency to the public and reviewability by courts.
Finally, Magnuson advocates requiring companies that hold data to follow appropriate security procedures and to implement systems that protect data from breaches. He also favors expansive liability rules so that individuals harmed by hacks can pursue compensation, further deterring careless storage of information.
The current regulatory environment for data governance is complex, varying by industry and location. The Health Insurance Portability and Accountability Act regulates health data, the Gramm-Leach-Bliley Act regulates financial data, and the Federal Information Security Modernization Act regulates federal government data. At the same time, California has the California Consumer Privacy Act and Massachusetts has the Act Relative to Consumer Protection From Security Breaches. Abroad, Europe has the General Data Protection Regulation.
In contrast to this patchwork of regulations, Magnuson contends that relying on unified regulatory principles across contexts is the best way to handle widespread data diffusion. Common solutions should address the recurring problems of fairness, efficiency, and stability.