24 May 2023| doi: 10.5281/zenodo.7957165

Teaching Norms to Large Language Models – The Next Frontier of Hybrid Governance 

Large Language Models (LLMs) have significantly advanced natural language processing capabilities, enabling them to generate human-like text. However, their growing presence raises concerns about potential societal risks and ethical considerations. To ensure responsible deployment of LLMs, it is crucial to teach them societal norms. This blog post explores the ways in which we can teach norms to LLMs and introduces the concept of hybrid governance, which emphasises the interdependencies of public and private norms. We will also delve into DeepMind Sparrow and its 23 rules for reinforced human feedback as an example to illustrate effective norm teaching methods.

What are LLMs?

LLMs, such as OpenAI’s GPT models, are cutting-edge artificial intelligence systems capable of generating coherent and contextually relevant text. While LLMs offer numerous benefits, including improved language translation, content generation, and customer service, they also pose societal risks (Weidinger et al., 2022). These risks include the propagation of misinformation, the amplification of biases, and potential misuse in areas such as deepfakes or malicious content generation. Risks can also arise from automated decision-making with real-world effects. Geoffrey Hinton, the so-called Godfather of AI, who recently left Google, reminded us that you need not be physically at the Capitol to instigate a riot there. AI systems could thus also manipulate people into doing harm.

Ways to Teach LLMs Norms

To address the risks associated with LLMs, teaching them societal norms becomes imperative, even though it is not the only way to address those risks. Several approaches can be employed to achieve this goal:

  • Explicit Instruction: LLMs can be explicitly taught predefined norms and ethical guidelines that guide their behaviour and content generation.
  • Reward Modelling: By employing reinforcement learning techniques, LLMs can be rewarded for producing outputs that align with desired norms, encouraging them to conform to societal expectations.
  • Human Feedback: Actively involving human reviewers who provide feedback and guidance during the training process helps LLMs learn from human expertise and perspectives. 
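To make the interplay of reward modelling and human feedback more tangible, the following is a minimal sketch, not a description of how any actual system is trained. It assumes that human reviewers have compared pairs of responses and marked the one they prefer, and it fits a simple pairwise (Bradley-Terry style) reward model to those judgements. The two hand-made features, the toy data in `PREFERENCES`, and the function names are hypothetical and chosen purely for illustration; real systems learn rewards from text with neural networks.

```python
import math

# Hypothetical toy data: each response is described by two hand-made features,
# (politeness, contains_threat). Each entry pairs the response a human reviewer
# preferred with the one the reviewer rejected.
PREFERENCES = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((1.0, 0.0), (1.0, 1.0)),
    ((0.0, 0.0), (0.0, 1.0)),
    ((1.0, 0.0), (0.0, 0.0)),
]


def reward(weights, features):
    """Linear reward: a weighted sum of the response features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(preferences, steps=500, lr=0.1):
    """Fit weights so that preferred responses score higher than rejected ones."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        for preferred, rejected in preferences:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Modelled probability that the reward agrees with the human choice.
            p_agree = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-likelihood of the human preference.
            for i in range(len(weights)):
                weights[i] += lr * (1.0 - p_agree) * (preferred[i] - rejected[i])
    return weights


weights = train_reward_model(PREFERENCES)
polite_score = reward(weights, (1.0, 0.0))
threat_score = reward(weights, (0.0, 1.0))
print("learned weights:", weights)
print("polite response scores higher:", polite_score > threat_score)
```

In a full training pipeline, a reward of this kind would then be used to steer the language model towards outputs that reviewers prefer. The sketch also shows where the normative choices enter: in the comparisons that reviewers are asked to make and in the guidelines they follow when making them.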

“Sparrow” is an example of an LLM chatbot. It was developed by Google DeepMind, a prominent AI research organisation belonging to Alphabet, and has – as far as we know – not been released to the public yet. Sparrow employs “reinforced human feedback”, more commonly known as reinforcement learning from human feedback (RLHF), to teach norms. This involves a continuous feedback loop between human reviewers and the chatbot during its training process. This iterative process allows Sparrow to learn and improve its responses over time, aligning them with human expectations and societal norms.

DeepMind has developed 23 rules for reinforced human feedback specifically for Sparrow (Glaese et al., 2022). These rules give human reviewers guidelines for evaluating the chatbot’s outputs and thereby shaping its behaviour. By incorporating ethical considerations into its training, Sparrow aims to exhibit responsible behaviour and avoid generating harmful or inappropriate content. The first rules are: 

Source: Amelia Glaese et al., Improving alignment of dialogue agents via targeted human judgements, DeepMind Working Paper, 2022-09-20

These rules all sound plausible; however, one might easily come up with an alternative list of equally plausible rules. Even though they all have links to ethical and legal norms, it is a private company that chooses which rules to pick. Given the societal relevance of such bots, once they are released to the world, we might ask whether that is ideal.

Concept of Hybrid Governance and the Interdependencies of Public and Private Norms

In this respect, the concept of hybrid governance, which we suggest as a new perspective on rule structures in the field of social media, can also be applied to LLMs. The normative development of communication rules on online platforms challenges traditional notions of rulemaking. The overlap, interdependence, and inseparability of private communication rules (e.g. community standards) and public communication rules on social media platforms should therefore be analysed through the lens of a new category: hybrid speech governance (Schulz, 2022). This perspective can help to find appropriate approaches to containing private power without simply transferring state-centric concepts unchanged to platform operators. This applies to questions concerning the rationale of communication rules, rule-of-law requirements, and fundamental rights obligations. Increasingly, state regulation tries to work indirectly by formulating conditions for private rulemaking. The EU’s Digital Services Act (DSA) adopts this perspective of hybrid speech governance in its Art. 14, which serves as a prominent example of an initial legislative answer to the questions raised. However, this is only the beginning of the story of hybrid governance. Academia, practice, and jurisprudence will have to flesh out the DSA’s approach to hybrid speech governance in detail. The discussion of this new perspective on governance needs to be broadened beyond the realm of platform regulation, in particular to LLMs.

Implications 

Teaching norms to LLMs represents a significant step towards responsible AI deployment. By employing reward modelling and reinforced human feedback, we can shape LLM behaviour and mitigate potential societal risks. We do not suggest that this approach is perfect or the only way to deal with those risks, but it might be one component of the solution. 

We propose that the norms we teach LLMs should be understood as a hybrid between private and public rulemaking. This means that we, as a civil society, as a scientific community, and also as legislators, need to consider which requirements we place on the private setting of these rules. It also means thinking about the processes by which these rules are created and enforced. This can mean structured rulemaking procedures with stakeholder engagement or even setting up new multi-stakeholder bodies to advise the companies developing LLMs. The fact that OpenAI has even called for a US agency to regulate LLMs, and that the EU will soon have an AI Act, should not create the illusion that the problem can be addressed by state regulation and corporate compliance alone. Tools such as risk assessments – under the DSA or the AI Act – can also act as transmission belts between state action and private regulation.

The concept of hybrid governance highlights the importance of considering both public and private norms in governing LLM behaviour. As we navigate this evolving landscape, a collaborative effort involving researchers, civil society actors, policymakers, and technology developers will be crucial to ensure that LLMs adhere to societal norms while supporting innovation and progress.

References

Glaese, A. et al. (2022). Improving alignment of dialogue agents via targeted human judgements. DeepMind Working Paper. https://doi.org/10.48550/arXiv.2209.14375.

Schulz, W. (2022). Changing the Normative Order of Social Media from Within: Supervisory Bodies. In Celeste, E., Heldt, A., and Keller, C. I. (Eds.), Constitutionalising Social Media (pp. 237-238). Bloomsbury Publishing. 

Weidinger, L. et al. (2022). Taxonomy of Risks Posed by Language Models. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22), Association for Computing Machinery, New York, NY, USA, pp. 214-229. https://doi.org/10.1145/3531146.3533088.

This post represents the view of the author and does not necessarily represent the view of the institute itself. For more information about the topics of these articles and associated research projects, please contact info@hiig.de.

Wolfgang Schulz, Prof. Dr.

Research Director

Christian Ollig

Associated Researcher: Leibniz Institute for Media Research │Hans-Bredow-Institute
