Modern subway station escalators leading to platforms, symbolizing the structured pathways of access rights. In the context of online platforms, such rights enable research but impose narrow constraints, raising questions about academic freedom.

27 November 2024

Why access rights to platform data for researchers restrict, not promote, academic freedom

When it comes to regulating online platforms, the need for more in-depth research and reliable data is one of the few things everyone seems to agree on. Both national German platform laws and the European Digital Services Act (DSA) have therefore created research data access rights. However, they define narrowly what kind of research can be done with such data. Do these legal frameworks therefore impinge on academic freedom?

The starting point of the many parliamentary hearings of the CEOs of large technology platforms in recent years has been a suspicion that the companies running major platforms learn significantly more about the impact of their business on people from the data they collect than they let on. Occasionally, this has been underscored by the revelations of whistleblowers such as Frances Haugen. Mitigating the informational asymmetries that arise in this area has therefore become something of a rallying cry for internet activists, researchers and politicians. One legislative response to this is the creation of rights to data access for researchers: provisions that enable researchers to demand data access from providers, albeit under specific conditions (such as being publicly funded, publicising the results or protecting users’ privacy rights).

Platform data as a valuable resource for researchers

The data generated by internet users is seen as a promising source of knowledge across all disciplines that research human behaviour, or aim to draw conclusions from human behaviour. In particular, user-generated content such as posts, comments and search queries on large online platforms is considered an extremely attractive quantitative starting point because it can depict human behaviour directly (rather than through, say, a survey), on a large scale, in machine-readable format and also for under-recorded groups or phenomena. Consequently, there are a variety of attempts to utilise this data not just in the social sciences, but also in the context of epidemiological research or even for city planning.

Shortcomings of voluntary access provision by platforms

The potential these data have for increasing our knowledge about the world (and ourselves) is often opposed by the commercial interest of the providers in monetising, rather than sharing, their data as their most valuable asset. In addition, they are obliged by privacy laws such as the European GDPR not to share user data with third parties without consent. Despite these factors, some providers have voluntarily provided researchers with access in the past, or donated data to research archives. From the researchers’ point of view, the downside of these voluntary practices is that they are often very limited in scope and that they rely on the goodwill of the company. Sometimes, companies reserve a right to prevent publication of the results. All of this not only discourages research that runs counter to commercial interests. It also makes it hard to secure funding for research projects, which is often contingent on documenting reliable data access as early as the application stage.

Research data access as an annex to platform regulation?

Legislators in Germany and the EU have responded to the frequent critique of these circumstances by creating access rights. These access rights conceptualise researchers’ access to providers’ databases as a question of the effects of the platforms themselves. Somewhat in contrast to the German and European guarantees of academic freedom, they enable access only for specific research purposes.

This is done in different forms. The now-repealed provision of § 5a Netzwerkdurchsetzungsgesetz (NetzDG) provided an access right under the main condition that the phenomenon of communication on platforms is being researched (irrespective of the researchers’ discipline, approach or goal). The current provisions in Art. 40 para. 4 and para. 12 DSA run parallel to the systemic risks of very large online platforms. In our reading, this means that the research not only has to concern one of the public goods named in Art. 34 para. 1 DSA, such as public health, but also has to investigate the negative impact of a platform on that good (e.g. enabling access to investigate a platform’s impact on anorexia, but not for an epidemiological study that uses movement data to study the efficiency of a lockdown against an epidemic). The narrowest scope for access rights was defined by the also now-repealed § 19 para. 3 Urheberrechte-Diensteanbietergesetz (UrhDaG), a special copyright law. The provision did not expressly require specific research purposes, but it only enabled access to platforms’ data in order to monitor platforms’ compliance with their obligations under the UrhDaG.

Consequences for academic freedom

This epitomises the misconception of research data access rights as mere annexes to platform regulation laws: it presupposes that the legislators already know which knowledge they or society need, and gives researchers only a very limited role in filling supposed gaps in knowledge about platformed communication or its regulation, discounting serendipity as well as whole disciplines and research approaches.

According to a widely held view, shared by the German Federal Constitutional Court, academic freedom is based on the idea that science best serves the state and society when it is free from considerations of social utility and political expediency.

From the dictum of scientific autonomy, it is also regularly concluded that science policy must not completely exclude subjects or disciplines, because this would amount to an allocation of relevance to which neither the state nor the EU is entitled. However, this is precisely what happens in Art. 40 DSA. The access regime does not take into account the overwhelming majority of disciplines or specific research approaches that are interested in questions other than investigating the systemic risks of very large online platforms and very large online search engines. The provision privileges research only to the extent that it examines these services as suspected causes of social problems.

In other words, the research data access rights in question do not aim at scientific adequacy, but only at regulatory adequacy. Thematically tailored access requirements restrict research autonomy with regard to the choice of research topic, research question and research scope. Although other topics, questions or research projects with a digital focus may still be legally permissible, selective access to data will in fact make them less attractive and often unfeasible.

What effect will this have?

Two scenarios are conceivable, both problematic in terms of scientific freedom. In the first, the customisation of data access actually motivates researchers to investigate a regulatorily relevant question that they would not have addressed if they had broader data access. But it does not seem conducive to scientific excellence if researchers feel compelled to limit their own topics and questions in order to be able to work on a supposedly highly topical issue thanks to exclusive data access. Such a narrowing of topics stifles creative scientific processes. In the second, the narrow access to research data merely excludes researchers who are interested in other topics without motivating other researchers to devote themselves to relevant questions. In this case, the restriction is difficult to justify in terms of equality law, since the regulatory purpose does not profit from the restricted way in which access is being granted.

Where do we go from here?

Since regulatory considerations alone are unlikely to justify restricting access to research data, other legal interests must be invoked to justify the restrictions on research.

One possible justification would be the protection of the companies obliged to provide access. It is common practice, in terms of fundamental rights, to regard each individual disclosure of sensitive personal or corporate data as a separate interference requiring justification. Any access to data increases the number of people who could gain access to internal company information and misuse it, for example by passing it on to competitors. However, this argument is also quite weak, since research that is more restricted in its autonomy can justify less, not more, encroachment on corporate rights. Moreover, the balancing of interests in the law granting access to data must be abstracted from individual cases and cannot be based on presumed levels of legal activation.

As a result, the research data access provisions of digital law in their current shape do themselves a disservice. If they were more generous towards data-based research and the kinds of research they privilege, they would be more clearly justified and lose the taint of regulatory instrumentalisation.

Authors

Tobias Mast is head of a research program at the Leibniz Institute for Media Research | Hans Bredow Institute. Martin Fertmann is a Junior Researcher at the same institute. Both authors wrote this blog post as a contribution to the DSA Research Network, in the context of which they have recently published a research article on data access rights in the German journal Wissenschaftsrecht (WissR, open access), on which this post is based: Mast/Fertmann, Forschungsdatenzugang und Technologieregulierung, 57 WissR (2024), p. 101-128, DOI 10.1628/wissr-2024-0011.

This post represents the view of the authors and does not necessarily represent the view of the institute itself. For more information about the topics of these articles and associated research projects, please contact info@hiig.de.