Navigating the Challenges of Publicizing Social Media Data

Scholars argue that data-sharing lessons from clinical trials can apply to social media.

Although social media companies seem to know a great deal about their platform users, policymakers have expressed alarm at how little they themselves know about how these companies work. Without increased transparency about these companies’ data practices, policymakers may find it difficult to verify social media companies’ user privacy or content moderation claims.

In a recent article, Christopher J. Morten of Columbia Law School, Gabriel Nicholas of the New York University School of Law, and Salomé Viljoen of Michigan Law School explain that giving independent researchers meaningful access to data can lead to a better understanding of the current social media ecosystem. Access to data can help researchers learn about social media companies and their practices, as well as how content, such as misinformation, is shared and proliferated, they claim.

Yet, as Morten, Nicholas, and Viljoen point out, granting access has not been easy. Social media companies control and manage the data they collect, and they protect their data as commercial secrets. In response to the Cambridge Analytica scandal, which compromised users’ privacy, some companies have played it safe by reducing access to their data.

Morten, Nicholas, and Viljoen argue that policymakers can strike a balance between increasing access to data and protecting commercial and privacy interests. They point to the example of pharmaceutical and medical companies sharing their clinical trial data.

Pharmaceutical and medical companies conduct clinical trials on human volunteers to research the safety and efficacy of their health products. Morten, Nicholas, and Viljoen claim that these companies, like social media companies, once raised privacy concerns and protected the data collected from these clinical trials as proprietary.

The authors explain that now, however, the Food and Drug Administration Amendments Act of 2007 (FDAAA) mandates that companies share certain clinical trial data on a free and accessible platform administered by the National Institutes of Health (NIH). In addition, the FDAAA requires the Food and Drug Administration (FDA) to publish on its website the clinical trial data it relied upon to approve a company’s product.

Outside of the FDAAA mandate, FDA and NIH fund data-sharing initiatives, such as the Yale Open Data Access Project (YODA). Through companies’ data contributions, the authors explain, these platforms can share even the most sensitive types of clinical trial data with qualified researchers who comply with the platform’s access and use agreements.

Morten, Nicholas, and Viljoen assert that this data-sharing regime works. They observe that approximately 75 to 80 percent of applicable clinical trial results are being reported. Moreover, this regime is sensitive to the type of data collected, allowing for tailored approaches to data that have particular levels of commercial or privacy risks.

The authors claim that the lessons from the clinical trial data-sharing regime can provide a pathway to regulating social media data sharing.

Looking at the FDAAA as an example, Morten, Nicholas, and Viljoen argue that policymakers should first enact legislation that mandates data sharing. Members of Congress have introduced bills that would mandate social media data sharing, such as the Platform Accountability and Transparency Act, which would require social media companies to share more data with qualified researchers. But such proposals, they note, remain unenacted and controversial.

Any data-sharing legislation, Morten, Nicholas, and Viljoen insist, must empower regulators—public and private—to ensure compliance with data-sharing mandates. Drawing examples from clinical trial data sharing, they offer three recommendations.

First, they highlight the importance of funding data-sharing platforms and initiatives, as seen in NIH’s funding of its own database and of other data-sharing projects, such as YODA. To that end, Morten, Nicholas, and Viljoen recommend ensuring that future regulators have access to reliable public funding.

Second, just as NIH curates its data-sharing platform, Morten, Nicholas, and Viljoen argue that legislation should legally empower regulators to control how required data is shared, accessed, and used. They further contend that even private regulators engaging in experimental data-sharing initiatives should have similar legal rights to manage data sharing responsibly.

Finally, Morten, Nicholas, and Viljoen suggest that legislators should give regulators meaningful enforcement capabilities, such as the power to name and shame or to impose fines. They argue that even minimal efforts to enforce the FDAAA have led to significant compliance with clinical trial data-sharing requirements.

In addition to mandating data sharing and enforcement mechanisms, Morten, Nicholas, and Viljoen recommend a system that tailors data-sharing mechanisms to the particular risks and characteristics of specific data types.

They note that under the FDAAA, only two types of data—summary data and metadata—must be shared. These types of data contain less sensitive information compared to individual patient-level data and can have commercial secrets redacted from them. Morten, Nicholas, and Viljoen also emphasize that individual patient-level data can still be shared through experimental data-sharing initiatives such as YODA.

Drawing on the regulatory regime for clinical trial data, Morten, Nicholas, and Viljoen propose distinguishing three tiers of social media data (individual data, summary data, and metadata) and treating each tier differently. They explain that tailoring a specific data-sharing approach to each of the three tiers would better address the different levels of commercial and privacy risks associated with each type. Without this three-tier distinction, debates on data sharing might continue to focus narrowly on the merits of sharing individual data.

Morten, Nicholas, and Viljoen hope that the lessons of clinical trial data sharing will help lawmakers navigate the challenges of drafting effective data-sharing legislation. But for now, the authors assert, social media remains in a “data secrecy dark age.”

The Federal Trade Commission (FTC) seems to agree on this point. In September, the FTC issued a report scrutinizing the data practices of major social media companies. It found that the companies’ current data practices pose privacy risks to social media users and non-users. The FTC recommended, among other things, that Congress pass legislation that would limit companies’ surveillance practices and grant consumers data rights.