Public AI 2025

The Public AI white paper outlines a vision for AI systems that prioritize public interest, transparency, and accountability, addressing the current dominance of proprietary AI models by a few companies. It proposes a framework for developing public AI infrastructure through three pathways: compute, data, and models, while introducing a “gradient of publicness” to assess AI initiatives based on their openness and governance. The paper emphasizes the need for coordinated investments in open-source ecosystems, public compute infrastructure, and talent development to ensure equitable access to AI technologies.

Public AI – White Paper

Legal notice

Commissioned by
© Bertelsmann Stiftung, Gütersloh
May 2025

Publisher
Bertelsmann Stiftung
Carl-Bertelsmann-Straße 256
33311 Gütersloh
Phone +49 5241 81-0
www.bertelsmann-stiftung.de

Supported by
Open Future

Authors
Dr. Felix Sieker, Bertelsmann Stiftung
Dr. Alek Tarkowski, Open Future
Lea Gimpel, Digital Public Goods Alliance
Dr. Cailean Osborne, Oxford Internet Institute

Responsible
Dr. Felix Sieker, Bertelsmann Stiftung

Editing
Barbara Serfozo, Berlin

Infographics
Jakub Koźniewski

Layout and Typesetting
Nicole Meyerholz, Bielefeld

Rights
The text of this publication is licensed under the Creative Commons Attribution 4.0 International License. You can find the complete license text at: https://creativecommons.org/licenses/by/4.0/legalcode.en

The infographics are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. You can find the complete license text at: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode.en

The visualizations are not meant to be exhaustive. All logos are excluded, as they are protected by copyright, not covered by the above-mentioned CC license, and may not be used.

Recommended citation style
Sieker/Tarkowski/Gimpel/Osborne (2025). Public AI – White Paper. Bertelsmann Stiftung. Gütersloh.

DOI 10.11586/2025040
Public AI – White Paper
Dr. Felix Sieker,
Dr. Alek Tarkowski,
Lea Gimpel,
Dr. Cailean Osborne

Reviewer list
Albert Cañigueral, Barcelona Supercomputing Center
Amin Oueslati, The Future Society
Ben Burtenshaw, Hugging Face
Brandon Jackson, Public AI Network
Huw Roberts, Oxford Internet Institute, University of
Oxford
Isabel Hou, Taiwan AI Academy
Jakob Mökander, Digital Ethics Center, Yale University
Jennifer Ding, Boundary Object Studio
Laura Galindo, AI policy expert
Luca Cominassi, AI policy expert
Martin Hullin, Bertelsmann Stiftung
Martin Pompéry, SINE Foundation
Marta Ziosi, Oxford Martin AI Governance Initiative,
University of Oxford
Paul Keller, Open Future
Paul Sharratt, Sovereign Tech Agency
Ravi Iyer, USC Marshall
Yacine Jernite, Hugging Face
Zoe Hawkins, Tech Policy Design Institute

Table of contents

Preface 6

Executive summary 8

Glossary 11

1 | Introduction 13

2 | Technical primer: What are AI technologies and how do they work? 17


Defining artificial intelligence 17
The deep learning paradigm 21
“Attention is all you need”: Transformers and the rise of generative AI 22
The generative AI development process 24
Pretraining phase 24
Post-training phase 26
Deployment 26
AI scaling laws: The contested future of AI 26
What are AI scaling laws? 27
The evolution of AI scaling laws 27
Scaling and AI’s environmental footprint 28
What is the future of AI scaling laws? 29

3 | The generative AI stack 32


Overview of the AI stack 32
Advantages of the AI stack concept 33
Concentrations of power in the AI stack 35
Characteristics of key layers of the stack 37
Compute 37
Data 39
Models 42


4 | The public AI framework 45


The concept of public digital infrastructure 45
Public, private and civic actors in public digital infrastructure 47
Proposals for public AI 47
Public AI Network 47
Mozilla Foundation 48
Vanderbilt Policy Accelerator 49
Defining public AI infrastructure 49
Gradient of publicness of AI systems 52
Goals and governance principles of public AI policies 55
Governance of public AI 56

5 | AI strategy and three pathways to public AI 58


Elements of a public AI strategy 58
The public AI ecosystem and its orchestrating institution 60
Three pathways toward public AI infrastructure: compute, data and model 61
Compute pathway to public AI 61
Compute: bottlenecks 61
Compute: opportunities 63
Data pathway to public AI 65
Data: bottlenecks 65
Data: opportunities 66
Model pathway to public AI 68
Models: bottlenecks 68
Models: opportunities 69
Additional measures 71
Coda: mission-driven public AI policy 73

Preface

Artificial Intelligence stands at a pivotal crossroads. While its potential to transform society is immense, the power to shape its trajectory is becoming increasingly concentrated. Today, a small number of dominant technology firms hold sway not only over the most advanced AI models but also the foundational infrastructure – compute capacity, data resources and cloud platforms – that makes these systems possible. This consolidation of influence represents more than a market imbalance; it poses a direct threat to the principles of openness, transparency and democratic accountability.

When only a handful of actors define how AI systems are built and used, public oversight erodes. These systems increasingly reflect the values and economic incentives of their creators, often at the expense of inclusion, accountability and democratic oversight. Without intervention, these trends risk entrenching structural inequities and shrinking the space for alternative approaches.

This white paper outlines a strategic countervision: Public AI. It proposes a model of AI development and deployment grounded in transparency, democratic governance and open access to critical infrastructure. Public AI refers to systems that are accountable to the public, where foundational resources such as compute, data and models are openly accessible and every initiative serves a clearly defined public purpose.

Grounded in a realistic analysis of the constraints across the AI stack – compute, data and models – the paper translates the concept of Public AI into a concrete policy framework with actionable steps. Central to this framework is the conviction that public AI strategies must ensure the continued availability of at least one fully open-source model with capabilities approaching those of proprietary state-of-the-art systems. Achieving this goal requires three key actions: coordinated investment in the open-source ecosystem, provision of public compute infrastructure, and development of a robust talent base and institutional capacity.

It calls for the continued existence of at least one fully open-source model near the frontier of capability and lays out three imperatives to achieve this: strengthening open-source ecosystems, investing in public compute infrastructure, and building the talent base to develop and use open models.

To guide implementation, the paper introduces the concept of a “gradient of publicness” into AI policy – a tool for assessing and shaping AI initiatives based on their openness, governance structures and alignment with public values. This framework enables policymakers to evaluate where a given initiative falls on the spectrum from private to public and to identify actionable steps to increase public benefit.

We extend our sincere thanks to Alek Tarkowski, Lea Gimpel and Cailean Osborne for their valuable insights and contributions to this work.


As you engage with the ideas presented here, we invite you to consider how this vision can inform your own decision-making and inspire policies that are both inclusive and forward-looking. Together, let us harness AI in ways that keep it from deepening divisions, while ensuring it broadens democratic possibility and strengthens social solidarity.

Dr. Felix Sieker


Project Manager
Digitization and the Common Good
Bertelsmann Stiftung

Martin Hullin
Director
Digitization and the Common Good
Bertelsmann Stiftung

Executive summary

Today’s most advanced AI systems and foundation models are largely proprietary and controlled by a small number of companies. There is a striking lack of viable public or open alternatives. This gap means that cutting-edge AI remains in the hands of a select few, with limited orientation toward the public interest, accountability or oversight.

Public AI is a vision of AI systems that are meaningful alternatives to the status quo. In order to serve the public interest, they are developed under transparent governance, with public accountability, equitable access to core components (such as data and models), and a clear focus on public-purpose functions.

In practice, public AI projects ensure that the public has both insight into and influence over how AI systems are built and used. They aim to make the key building blocks – data, open source software and open models – accessible to all on fair terms. Crucially, public AI initiatives are oriented toward broad societal benefit, rather than private gain.

Over the past year, momentum behind Public AI proposals has been steadily growing, with a series of influential reports and initiatives by Public AI Network, Mozilla and the Vanderbilt Policy Accelerator demonstrating the importance of this approach. Even more importantly, a range of initiatives are already developing individual components and whole AI systems that fulfill this vision of Public AI.

This white paper builds on earlier proposals for Public AI and is aimed at policymakers and funders, with the goal of helping to turn the vision of Public AI into reality. In particular, it advances this timely conversation by making the following two novel contributions.

A vision for Public AI grounded in the reality of the AI stack

A vision for public AI needs to take into account today’s constraints at the compute, data and model layers of the AI stack, and offer actionable steps to overcome these limitations. This white paper offers a clear overview of AI systems and infrastructures conceptualized as a stack of interdependent elements, with compute, data and models as its core layers. It also identifies critical bottlenecks and dependencies in today’s AI ecosystem, where dependency on dominant or even monopolistic commercial solutions constrains the development of public alternatives. It highlights the need for policy approaches that can orchestrate resources and various actors across layers, rather than attempting complete vertical integration of a publicly owned solution.

To achieve this, it proposes three core policy recommendations:

1. Develop and/or strengthen fully open source models and the broader open source ecosystem

2. Provide public compute infrastructure to support the development and use of open models

3. Scale investments in AI capabilities to ensure that sufficient talent is developing and adopting these models


In order to achieve this, complementary pathways for Public AI development need to be pursued, focused on the three core layers of the AI stack: compute, data and models:

1. Compute Pathway: This pathway focuses on providing strategic public computing resources, particularly supporting open-source AI development. Key recommendations include ensuring computing access for fully open projects, expanding compute for research institutions, and improving coordination between public compute initiatives.

2. Data Pathway: This pathway emphasizes creating high-quality datasets as digital public goods through commons-based governance. This includes developing datasets as publicly accessible resources while protecting against value extraction, and establishing public data commons with appropriate governance mechanisms.

3. Model Pathway: This pathway centers on fostering an ecosystem of fully open source AI models, including both a state-of-the-art “capstone model” and specialized smaller models. The strategy emphasizes building sustainable open source AI development capabilities rather than simply competing with commercial labs.

Several additional measures are highlighted that do not fit within one of the three pathways but help secure key public interest goals. These include investing in AI talent and capabilities to develop and deploy AI systems in the public interest, supporting paradigm-shifting innovation toward more efficient technologies, funding open-source software and tools, and building effective deployment pathways for public AI applications.

This approach acknowledges the importance of the various layers and the different paths that can be pursued to attain Public AI. It also argues for coordinated interventions across the entire AI stack, orchestrated by new public institutions capable of managing decentralized AI development ecosystems.

The “gradient of publicness”: A framework for Public AI

The white paper also offers a “gradient of publicness” framework, rooted in Public Digital Infrastructure principles. This framework can guide decision-making around investments in AI infrastructure and help increase public value while acknowledging existing constraints and limitations to building fully Public AI.

This framework maps AI interventions along a continuum – from fully public to fully private – based on their attributes (e.g. accessibility, openness, interoperability), functions (e.g. enabling social or economic goals) and modes of control (e.g. democratic governance and accountability). It serves as both a diagnostic and strategic tool for assessing where an intervention falls along this continuum, and for identifying interventions that could strengthen its public value.

The gradient of publicness consists of the following six distinct levels, each representing different degrees of public attributes, functions and control:

Level 1: Commercial provision of AI components with public attributes
Commercial entities develop and share open source components (e.g., Meta’s open-sourcing of PyTorch) with high public accessibility but limited public function and control.

Level 2: Commercial AI infrastructure with public attributes and functions
Privately controlled platforms like the Hugging Face Hub that democratize access to AI tools while maintaining commercial oversight but serving public interest goals.

Level 3: Public computing infrastructure
Government-funded supercomputers and data centers (e.g., EU AI Factories) that provide computing resources through public-private partnerships with moderate to high public control.


Level 4: Public provision of AI components
Publicly funded datasets, benchmarks and tools (e.g., Mozilla’s Common Voice) developed specifically as digital public goods with high public control and clear public functions.

Level 5: Full-stack public AI infrastructure built with commercial compute
AI systems like the OLMo model by the Allen Institute for AI that are fully open source but rely on commercial computing infrastructure, limiting public control at the compute layer.

Level 6: Full-stack public AI infrastructure
Fully autonomous public AI systems like Spain’s Alia, built with public data, models and computing infrastructure, achieving the highest level of publicness across all layers.

Reading Guide

We encourage readers to explore the full report to gain a comprehensive understanding of the public AI vision and its implications. Depending on your specific interests, we recommend the following entry points.

If you are interested in the technological foundations of AI and the factors contributing to the rise of generative models:

• Begin with the Introduction, then read chapter 2. These sections provide a technical primer on AI technologies and the key breakthroughs that have led to today’s generative AI models.

If you are a policymaker working on AI policy and/or (investment) strategy:

• Start with the Introduction, then read chapter 3 for an overview of the AI stack (compute, data and models). Focus on the “gradient of publicness” in chapter 4, and turn to chapter 5 for an overview of pathways and recommendations for building public AI.

If you are a policymaker or funder seeking concrete policy or funding guidance:

• Begin with the Introduction, then focus on chapter 4 for the gradient of publicness framework and chapter 5 for specific policy recommendations.
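Purely as an illustration (this sketch is ours, not part of the white paper's framework), the six levels of the gradient of publicness and the example initiative cited at each level can be captured in a small lookup table, which makes it easy to compare initiatives at a glance:

```python
# Illustrative sketch only: the six-level "gradient of publicness"
# with the example initiative cited at each level in the white paper.
GRADIENT_OF_PUBLICNESS = {
    1: ("Commercial provision of AI components with public attributes",
        "PyTorch (Meta)"),
    2: ("Commercial AI infrastructure with public attributes and functions",
        "Hugging Face Hub"),
    3: ("Public computing infrastructure", "EU AI Factories"),
    4: ("Public provision of AI components", "Common Voice (Mozilla)"),
    5: ("Full-stack public AI infrastructure built with commercial compute",
        "OLMo (Allen Institute for AI)"),
    6: ("Full-stack public AI infrastructure", "Alia (Spain)"),
}

def describe(level):
    """Return a one-line summary of a gradient level."""
    name, example = GRADIENT_OF_PUBLICNESS[level]
    return f"Level {level}: {name} (e.g. {example})"
```

For example, `describe(5)` summarizes the OLMo case: fully open source at the model layer, but dependent on commercial compute.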

Glossary

AI model
A mathematical and computational system trained to recognize patterns or make decisions based on input data.

AI scaling laws
An observed pattern showing that AI model performance improves predictably with more data, larger models (more parameters) and more compute resources.

AI system
A machine-based system that infers how to generate outputs – like predictions, content recommendations or decisions – to influence physical or virtual environments. It combines models, data, infrastructure and interfaces and can vary in autonomy and adaptiveness.

Artificial general intelligence
A theoretical form of AI capable of understanding, learning and performing any intellectual task that a human can do.

Artificial intelligence
The broad field of computer science focused on creating machines or software capable of tasks that normally require human intelligence (e.g., perception, reasoning, decision-making).

Computer vision
A branch of AI that enables computers to interpret and process visual information from the world, such as images or video.

Deep learning
A subset of machine learning that uses multi-layered neural networks (often with many hidden layers) to automatically learn complex representations of data.

Distillation
A technique where a smaller “student” model is trained to replicate the behavior of a larger “teacher” model, enabling similar performance with lower computational requirements.

Fine-tuning
The process of adapting a pretrained model to specific tasks through additional training on task-specific datasets, preserving general knowledge while optimizing for particular applications.

Foundation model
A large AI model trained on broad datasets and designed for adaptability across many tasks. Foundation models can be unimodal or multimodal; large language models are a subset.

Generative AI
A class of AI systems designed to create new content – such as text, images or music – based on learned patterns from training data.

Graphics processing unit (GPU)
A specialized processor designed for handling parallel computations, originally developed for video games, now crucial for efficiently training and running deep learning models.

Large language model (LLM)
A type of foundation model trained on massive text corpora, capable of generating or understanding natural language, often containing billions of parameters.

Machine learning
A branch of AI where systems improve their performance on tasks over time by learning from data, including methods like supervised, unsupervised and reinforcement learning.

Model parameters or weights
Numeric values within a model (e.g., connection strengths in a neural network) that are adjusted during training to enable the model to perform its tasks.

Moore’s law
A prediction made by Intel co-founder Gordon Moore in 1965 that the number of transistors on a chip would double approximately every two years, leading to exponential growth in computing power.

Natural language processing (NLP)
The field at the intersection of AI and linguistics focused on enabling computers to understand, interpret and generate human language.

Neural network
A type of AI model inspired by the brain’s interconnected neurons, designed to recognize patterns and relationships in data.

Open model
In this white paper, open models refer to AI models for which the model architecture, trained parameters (i.e., weights and biases) and some documentation are released under open-source licenses.

Open source license
A legal agreement that allows users to freely use, modify and distribute software or other works, while specifying certain conditions that must be met when using or redistributing the software.

Open source model
There are competing definitions of open-source models. In this white paper, we define an open-source model as an AI model whose trained parameters (i.e., weights and biases) – referred to as an open model – are released alongside the code, data and accompanying documentation used in its development, all under free and open-source licenses.

Public AI
AI systems developed with transparent governance, public accountability, open and equitable access to core components, such as data and models, and clearly defined public functions at their core.

Quantization
A technique that reduces the computational and memory costs of AI models by representing weights and activations with lower-precision data types, typically without requiring additional training data.

Reinforcement learning
A machine learning paradigm where a system learns by interacting with an environment and improving through rewards. Variants such as reinforcement learning from human feedback (RLHF) and reinforcement learning from AI-generated feedback (RLAIF) are used to align AI models with desired behaviors.

Small model
A deliberately compact AI model with millions to billions of parameters, optimized for resource efficiency and often adapted from larger models through distillation or quantization.

Transformers
A deep learning architecture introduced in 2017, based on attention mechanisms, that underpins the recent surge in AI advancements by enabling models to process and generate sequence data with unprecedented effectiveness.

Transistor
A tiny semiconductor device that acts as a switch or amplifier in electronic circuits, forming the fundamental building block of virtually all modern electronic devices.
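To make the Quantization entry concrete, here is a minimal, self-contained sketch (ours, not from the white paper) of symmetric 8-bit weight quantization; the function names and example values are invented for illustration:

```python
# Illustrative sketch: symmetric int8 quantization of model weights,
# as described in the glossary. Assumes at least one non-zero weight.

def quantize_int8(weights):
    """Map float weights to integer levels in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127  # float value of one int8 step
    return [round(w / scale) for w in weights], scale

def dequantize(levels, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in levels]

weights = [0.82, -0.41, 0.05, -1.27, 0.33]
levels, scale = quantize_int8(weights)
approx = dequantize(levels, scale)
max_err = max(abs(w - a) for w, a in zip(weights, approx))
# Each weight now fits in 1 byte instead of 4; the rounding error per
# weight is bounded by half a quantization step (scale / 2).
```

Production schemes (per-channel scales, 4-bit formats, quantized activations) are more elaborate, but they rest on the same trade of numeric precision for memory and compute savings.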

1 | Introduction

Artificial Intelligence (AI) stands as one of the most prominent and potentially transformative technologies of the past decade. With the rapid ascent of new industry leaders like OpenAI and Anthropic, alongside a strategic pivot toward AI by incumbents such as Microsoft, Google and Meta, AI is increasingly shaping the future of sectors ranging from healthcare and finance to education and government. At the same time, critics warn that AI may be yet another hype-driven, extractive and unsustainable technology lacking a clear social purpose.

Public debate often swings between uncritical enthusiasm for AI and deep concern over its existential risks. Despite these polarized narratives, the underlying reality is that AI technologies are becoming deeply embedded in social, economic and political systems. Hype or not, AI is likely to remain a lasting force – making it all the more urgent to shape its development around clear public values.

In late 2022, just as commercial AI labs were launching the first public-facing applications based on generative AI models, Mariana Mazzucato and Gabriela Ramos published an op-ed arguing for public policies and institutions “designed to ensure that innovations in AI are improving the world.” They warned, instead, that a new generation of digital technologies is being “deployed in a [policy] vacuum.” According to Mazzucato and Ramos, public interventions are essential to steer the technological revolution in a direction that turns technical innovation into outcomes that serve the public interest.¹

Emerging concentrations of power

Three years on, concentrations of power in AI have only deepened. A small group of dominant technology companies now control not only most state-of-the-art AI models, but also the foundational infrastructure that shapes the field. This emerging AI oligopoly builds on existing monopolies in cloud computing and digital platforms, reinforcing the dominance of hyperscalers and platform giants.

This concentration is not merely structural – it has far-reaching consequences. When a handful of actors define how AI is built and deployed, the benefits are captured by the few, while the risks are borne by the public. AI systems increasingly reflect the values, incentives and worldviews of their creators, often at the expense of public inclusion, accountability and democratic oversight.

Compounding the issue is the rapid pace of AI development, characterized by two destabilizing trends. First, corporate competition has triggered a relentless race toward ever more capable AI models, often framed as a pursuit of so-called “superintelligence.” As a result, billions are being invested in systems designed to achieve commercial dominance, often with little regard for societal benefit or safety. These investments in compute infrastructure frequently lack a clear articulation of the public needs they are meant to address, and can have significant environmental costs.

Second, geopolitical shifts are pushing states to treat AI as a zero-sum game. Governments are engaging in AI nationalism – walling off innovation behind borders in the name of digital sovereignty – rather

1 Mariana Mazzucato and Gabriela Ramos. “AI in the Common Interest.” Project Syndicate, 26 Dec. 2022. https://www.project-syndicate.org/commentary/ethical-ai-requires-state-regulatory-frameworks-capacity-building-by-gabriela-ramos-and-mariana-mazzucato-2022-12


than fostering international cooperation. Sovereign AI strategies often mirror the priorities of dominant commercial actors, focusing heavily on large-scale investments in compute infrastructure under the assumption that public benefits will follow. However, this approach risks entrenching existing power asymmetries and deepening dependencies on the providers of critical AI components.

As the AI Now Institute has observed, current AI policy tends to “conflate public benefit with private sector success,”² leading to strategies that prioritize industrial competitiveness over accountability, transparency or equitable access. Without a course correction, this trajectory could lock in structural inequities and limit democratic control over foundational AI systems.

The need for a countervision: Public AI

In response to this growing imbalance, interest is building around alternatives to the dominant commercial AI paradigm – a concept broadly described as public AI. While the definition of public AI remains fluid, it generally refers to AI systems developed with transparent governance, public accountability, open and equitable access to core components such as data and models, and clearly defined public-purpose functions.

A central pillar of public AI is the development of fully open source models, which involves releasing the trained parameters (i.e., weights and biases) of models – commonly referred to as open models – along with the code, data and documentation used in their creation under open source licenses. Open models ensure that foundational AI systems are accessible, inspectable and modifiable by a wide range of actors, including researchers, public institutions and civil society. Unlike proprietary models, fully open source models offer transparency into model weights and training data, allowing for reproducibility and independent auditing. They also lower barriers to experimentation and adaptation, especially for academic researchers, startups and non-commercial applications.

While open source models do not eliminate market concentration, they can reduce dependency on a few dominant industry players and allow public interest applications to be developed without restrictive licenses or opaque constraints. In this way, open source AI serves as critical infrastructure for aligning AI development with democratic values and public goals.

Moreover, visions for public AI emphasize the need to strengthen the broader ecosystem of AI infrastructure and tools. Without a sustainable open source AI ecosystem, efforts to keep AI development transparent and aligned with the public interest will remain fragile and limited.

Two urgent realities demand immediate action. First, the window for intervention is closing. As a handful of corporate and state actors consolidate control over critical infrastructure and resources – such as compute, datasets and talent – the cost of building alternatives continues to rise. Second, the stakes are global. AI’s impact on labor, healthcare, education, the environment and democracy transcends borders. Without proactive measures, its benefits will accrue to a privileged minority, while the risks – from disinformation to algorithmic bias – will disproportionately affect marginalized groups.

2 Amba Kak and Sarah Myers West. “2023 Landscape. Confronting tech power.” AI Now Institute. 2023. https://ainowinstitute.org/wp-content/uploads/2023/04/AI-Now-2023-Landscape-Report-FINAL.pdf


INFOBOX | Open source AI and degrees of openness

The development of publicly controlled, state-of-the-art open source AI models should be a central goal of public AI policies. These models provide a public alternative to commercial offerings and help foster an ecosystem in which smaller, complementary models can thrive.

The benefits and risks of open source AI development have been the subject of intense policy debate in recent years. While early discussions focused largely on the safety risks of open source AI systems, there is growing support for such models and increasing recognition of their potential to democratize AI development.

However, the term "open source AI" is currently used to describe models with varying degrees of openness. In some cases, "open washing" occurs – when models that are not truly open are marketed as meeting open source standards.3 Public AI policy must therefore be grounded in a clear and consistent definition of what constitutes open source AI. The Open Source AI Definition initiative proposed a binary definition: AI systems released under licenses that permit use, study, modification and redistribution. This definition, however, has not yet been widely adopted.4 Alternatively, Irene Solaiman of Hugging Face has conceptualized openness as a gradient – from fully closed to fully open.

In this white paper, we focus on open models – that is, models in which both the architecture and parameters (i.e. weights and biases) are released under permissive licenses. It is important to note, however, that AI models are made up of several components beyond architecture and weights. For example, the Model Openness Framework5 breaks models down into 16 components spanning code, data and documentation, and outlines three tiers of model openness depending on how fully these components are shared under permissive licenses.

Today, many models described as "open" or "open source" share only a limited set of components. In most cases, only model parameters are released, accompanied by some documentation. While this allows for reuse, it offers insufficient transparency into the training data and the development process.6 These models are more accurately described as open-weight models.7 In this white paper, we use the generic term open model to refer to a broad spectrum of models released under open terms, including commercial open-weight models from companies such as Mistral or DeepSeek.

At the same time, a key recommendation of this white paper is the development of fully open source AI models, which we define as the complete release of model parameters, architecture, code, datasets and associated documentation under open source licenses. Currently, very few state-of-the-art models meet this definition – among them, OLMo 2 by the Allen Institute for AI.

3 Sarah Kessler. "Openwashing." New York Times. 19 May 2024. https://www.nytimes.com/2024/05/17/business/what-is-openwashing-ai.html
4 Open Source Initiative. "Open Source AI Definition." https://opensource.org/ai Accessed 27 April 2025.
5 Model Openness Framework. https://isitopen.ai/ Accessed 27 April 2025.
6 Zuzanna Warso, Paul Keller and Max Gahntz. "Towards Robust Training Data Transparency." Open Future. 19 June 2024. https://openfuture.eu/publication/towards-robust-training-data-transparency/
7 "Open weights: not quite what you've been told." Open Source Initiative. https://opensource.org/ai/open-weights Accessed 23 April 2025.
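The terminology above can be made concrete as a small classification sketch. This is a hypothetical illustration only – the component list is invented and far shorter than the Model Openness Framework's actual 16 components, and the tier names simply echo this white paper's vocabulary:

```python
# Hypothetical sketch: classify a model release by which components are
# shared under open licenses. The component set is an invented stand-in.
COMPONENTS = {"architecture", "weights", "training_code", "datasets", "docs"}

def openness(released: set) -> str:
    if released >= COMPONENTS:
        return "fully open source"   # everything released under open licenses
    if {"architecture", "weights"} <= released:
        return "open model"          # the generic term used in this paper
    if "weights" in released:
        return "open-weight"         # parameters only, limited transparency
    return "closed"

print(openness({"weights", "docs"}))   # open-weight
print(openness(set(COMPONENTS)))       # fully open source
```

A release of weights plus documentation, for instance, lands in the open-weight tier rather than counting as open source.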


Why this report?

This white paper serves as a guide for policymakers and funders at a critical juncture. It aims to:

• Demystify AI: By breaking down generative AI technology into a "stack" comprising compute, data and model layers, we cut through the hype, make its complexities more graspable and expose underlying power imbalances.

• Identify bottlenecks and dependencies: We identify choke points in today's AI ecosystem that reinforce concentration, from GPU shortages to data monopolies.

• Outline pathways to public AI: We propose a vision of AI as public digital infrastructure and spotlight policy and funding interventions that could enable a public AI ecosystem to emerge.

In chapter 2, we offer a technical primer that defines AI and distinguishes key types of AI technologies, with a focus on machine learning and generative AI. We also describe the dominant development paradigm for generative AI – the transformer architecture – and explain the scaling laws that underpin it.

In chapter 3, we define the generative AI stack and provide an overview of its three key layers: compute, data and models. These layers shape the conditions for building public AI solutions. We pay particular attention to the concentration of power at each layer and the implications this has for public AI efforts.

In chapter 4, we introduce a framework for public AI grounded in the concept of public digital infrastructure. Drawing on existing proposals, we analyze the attributes, functions and ownership structures of components in public AI systems. We then introduce a gradient of publicness for AI solutions, which accounts for varying degrees of dependency, especially at the compute layer. This chapter also sets out governance principles for public AI.

In chapter 5, we outline three pathways to public AI – starting at the compute, data and model layers respectively. For each layer, we identify key bottlenecks and propose solutions that advance the vision of public AI. The chapter concludes with strategic recommendations to connect these pathways into a cohesive public AI agenda.

2 | Technical primer: What are AI
technologies and how do they work?

A lack of precision around AI technologies and the technical aspects of their development remains a major limitation in current policy debates.

Understanding AI – including both the technical and economic dimensions of its development – is essential for crafting targeted, effective and efficient policy interventions. Yet there is often little clarity in discussions about how these technologies are developed, including the resources required and the scientific or engineering breakthroughs that have made this development possible. The term "AI" is frequently used to refer to a wide array of technologies with very different characteristics, without distinguishing clearly between them.

Greater clarity is needed to design realistic pathways for implementing public AI strategies. This section outlines key technical aspects of AI development and lays the groundwork for the analysis in the following chapters, which explore how to create public infrastructures and solutions in the AI space, with a particular focus on generative AI.

To this end, the section provides a technical explainer of AI technologies. It includes a basic typology, an overview of the transformer paradigm – the dominant development model for generative AI – and a discussion of AI scaling laws, which help explain why generative AI requires such vast amounts of compute. These scaling laws are a critical factor in determining which actors are able to develop generative AI and help explain why creating truly public AI remains so challenging.

Defining artificial intelligence

Broadly speaking, AI research and development is concerned with building computer systems capable of performing tasks that typically require human intelligence.8 The origins of the term are attributed to an academic workshop held in 1956 at Dartmouth College in the United States.9 Since then, the field has gone through multiple cycles of optimism and disillusionment, with various approaches to training machines to replicate or approximate human cognitive tasks. As the field has evolved, so too have definitions of AI – often shifting in response to changing capabilities and expectations. What once marked the cutting edge of AI, such as speech or image recognition, has now become routine in many applications.

A widely accepted definition comes from the OECD, which describes an AI system as "a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations or decisions that can influence physical or virtual environments. AI systems vary in their levels of autonomy and adaptiveness after deployment."10 This definition has been adopted by numerous governments and international institutions in the development of AI governance frameworks.11

Figure 1 | AI terminology
Nested fields, from broadest to narrowest: artificial intelligence ⊃ machine learning ⊃ deep learning ⊃ generative AI. Illustration by: Jakub Koźniewski

While we adopt the OECD's definition as a general framing of AI, this report focuses on machine learning (ML) approaches to AI, and in particular the contemporary paradigm of generative AI. In this section, we define these technical concepts before discussing them in greater detail in the sections that follow.

Machine learning (ML) is central to contemporary AI and refers to the development of computer programs – specifically, statistical models – that learn from data rather than relying on explicitly programmed instructions. Tom Mitchell defines ML as follows: "a computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E."12 ML encompasses a variety of statistical learning approaches, including supervised learning (learning from labeled data), unsupervised learning (identifying patterns in unlabeled data), and reinforcement learning (learning through interaction with an environment).

Deep learning is a branch of ML that uses artificial neural networks with multiple layers – hence "deep" – to detect patterns in raw data.13 Deep learning is especially effective for handling unstructured data such as text or images, making it popular in fields like natural language processing and computer vision. It also plays a key role in generative AI applications, as described below.

Generative AI refers to the application of deep learning techniques to build models that, given an input – typically a natural language prompt – can generate novel outputs such as text, images, audio or code, without being explicitly programmed for each task.14 Recent advances in generative AI have been driven by the emergence of transformer-based architectures and scaling laws, which we explain in more detail below, as well as the rise of user-facing tools, such as OpenAI's ChatGPT.15 Generative AI includes both unimodal models, such as language or vision models, and multimodal models, which can process and generate multiple types of data – such as text, images or audio – either as inputs or outputs.

Foundation models represent a major category within generative AI. They are characterized by their large scale (with up to trillions of parameters), training on vast and diverse datasets and adaptability to a wide range of downstream applications.16 Foundation models can be either unimodal or multimodal. For example, OpenAI's GPT-3 is a unimodal foundation model focused solely on text, while GPT-4o is a multimodal model capable of processing and generating text, audio, images and video. While the terms "large language model" (LLM) and "foundation model" are often used interchangeably, LLMs are technically a subset of foundation models that specialize in language processing. Foundation models more broadly include systems focused on other modalities, such as vision or audio, or combinations of them.

Although generative AI is currently receiving significant attention – with high-profile tools like ChatGPT and Claude capturing public attention – it is important to note that the most common AI use cases still rely on more traditional ML approaches, such as regression models, random forests and clustering algorithms. According to the AI Mapping 2025 report, which studied 750 French AI startups, ML remains the most widely used AI technique (28%), followed by deep learning (20%) and generative AI (15%).17 The relative popularity of such ML approaches is reflected in the download statistics of widely used open source software libraries for AI. For example, scikit-learn, a Python library that implements ML algorithms and is known as "the Swiss army knife for ML," is downloaded up to 3 million times per day.18, 19 It is followed by PyTorch, a popular framework for training deep learning models, and transformers, a library for accessing and fine-tuning models hosted on Hugging Face Hub, which are both downloaded up to 1.5 million times per day.20, 21

8 Stuart Russell and Peter Norvig. "Artificial Intelligence: A Modern Approach," 4th US Ed. (Berkeley: University of Berkeley, 2022).
9 Haroon Sheikh, Corien Prins, and Erik Schrijvers. "Artificial Intelligence: Definition and Background." In Mission AI, Research for Policy (Cham: Springer, 2023), 15–41. https://doi.org/10.1007/978-3-031-21448-6_2
10 "Explanatory memorandum on the updated OECD definition of an AI system." OECD, 2024. https://www.oecd.org/content/dam/oecd/en/publications/reports/2024/03/explanatory-memorandum-on-the-updated-oecd-definition-of-an-ai-system_3c815e51/623da898-en.pdf Accessed 23 April 2025.
11 CEIMIA. "A Comparative Framework for AI Regulatory Policy." https://ceimia.org/en/projet/a-comparative-framework-for-ai-regulatory-policy/ Accessed 23 April 2025.
12 Tom Mitchell. "Machine Learning." (McGraw Hill, 1997). https://www.cs.cmu.edu/~tom/mlbook.html
13 Yann LeCun, et al. "Deep Learning." Nature, vol. 521, no. 7553, May 2015, pp. 436–44. https://doi.org/10.1038/nature14539 Accessed 3 April 2025.
14 McKinsey. "What Is ChatGPT, DALL-E, and Generative AI?" https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai Accessed 23 April 2025.
15 OpenAI. "Introducing ChatGPT." https://openai.com/index/chatgpt/ Accessed 13 March 2024.
16 Rishi Bommasani, et al. "On the Opportunities and Risks of Foundation Models." arXiv:2108.07258, 12 July 2022. https://doi.org/10.48550/arXiv.2108.07258
17 France Digitale. "Mapping Des Startups Françaises de l'IA." https://mappings.francedigitale.org/ia-2025 Accessed 23 April 2025.
18 PyPI Stats. "scikit-learn." https://pypistats.org/packages/scikit-learn Accessed 23 April 2025.
19 Inria. "The 2019 Inria-French Academy of Sciences-Dassault Systèmes Innovation Prize: Scikit-Learn, a Success Story for Machine Learning Free Software." https://www.inria.fr/en/2019-inria-french-academy-sciences-dassault-systemes-innovation-prize-scikit-learn-success-story Accessed 23 April 2025.
20 PyPI Stats. "transformers." https://pypistats.org/packages/transformers Accessed 23 April 2025.
21 PyPI Stats. "torch." https://pypistats.org/packages/torch Accessed 23 April 2025.
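Mitchell's E/T/P framing can be made tangible with a toy supervised learner – here a nearest-centroid classifier written from scratch. The 2-D points and labels are invented for illustration: the point is that performance (P) on the labeling task (T) derives entirely from the labeled examples (E), not from hand-coded rules:

```python
# Toy supervised learning: a nearest-centroid classifier.
# Data below is invented for illustration.

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(examples):  # experience E: (features, label) pairs
    by_label = {}
    for x, y in examples:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}

def predict(model, x):  # task T: label an unseen input
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(model, key=lambda y: dist(model[y], x))

data = [((0.1, 0.2), "A"), ((0.0, 0.4), "A"), ((0.9, 0.8), "B"), ((1.0, 0.7), "B")]
model = train(data)
print(predict(model, (0.95, 0.75)))  # B
```

Feeding the learner more or different examples changes its predictions, which is exactly the "learning from experience" that distinguishes ML from explicitly programmed software.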


Generative AI models are increasingly seen as potential general-purpose technologies (GPTs) – foundational innovations that reshape society and the economy across multiple sectors, much like the printing press, electricity, computers or the internet.22 More broadly, ML techniques are defined by their versatility and capacity to generate innovation across domains, thanks to their core ability to identify patterns and make predictions from data. As such, ML represents a paradigm shift from earlier single-purpose AI systems. At the same time, these models can be easily adapted for domain-specific applications without requiring major redesigns.

This dual nature – general-purpose but also highly adaptable – sets ML apart from other historical GPTs and adds complexity to policy debates. Policymakers must grapple with this duality when designing frameworks that support both broad societal benefits and context-specific applications of ML and generative AI technologies.

22 Sabrina Küspert, Nicolas Moës, Connor Dunlop. "The Value Chain of General-Purpose AI." Ada Lovelace Institute. 10 February 2023. https://www.adalovelaceinstitute.org/blog/value-chain-general-purpose-ai/

Figure 2 | Neural network
An example of a simple neural network capable of recognizing letters of the alphabet: an input layer of numeric features, two hidden layers and an output layer scoring the candidate letters A, B and C (in the illustration, A receives the highest score, 0.955). Illustration by: Jakub Koźniewski


The deep learning paradigm

Various ML paradigms have been developed to train models to learn from data and perform specific tasks. This ability to train models based on data, and the applications that can be built on that basis, are the key technological capacities that public AI policies aim to secure for the public interest. In this section, we focus on deep learning approaches, which involve the use of neural networks. Deep learning has become a central paradigm for generative AI development. It gained significant momentum following a breakthrough research paper published in 2006,23 and the pioneering application of that research in the development of AlexNet in 2012.24 As the paradigm evolved over the last decade, three types of resources emerged as particularly critical to its success: compute, data and model architectures (including model size).

The underlying design of neural networks is based on computational model architectures inspired by biological neurons in the brain. These networks consist of interconnected artificial neurons organized into layers that process and transform data through a series of mathematical operations. Each network has an input layer that receives raw data, hidden layers that process this information and an output layer that produces results. The "deep" in deep learning refers to the presence of multiple hidden layers between the input and output layers, which enables the model to learn increasingly complex and abstract representations of data.

Figure 2 shows an example of a simple neural network capable of recognizing letters of the alphabet. The network learns by adjusting the weights – the strengths of the connections between neurons – through a process called backpropagation. In this process, the network makes predictions, compares them to correct answers, calculates the error and updates the weights to reduce this error.25 Importantly, the total number of weights – often expressed as 2B, 7B or 40B (to indicate billions of parameters) – determines the size of the model and plays a crucial role in determining its capabilities and effectiveness. This architecture has proven powerful for complex data types like images, text and audio, enabling major breakthroughs in fields from computer vision to natural language processing.

Neural networks were first developed in the 1950s and, while promising, were long constrained by two major limitations: insufficient computing power and limited access to data. Over the following decades, the development of machine learning technologies – including recent advances in generative AI – depended on securing these key resources. The field experienced several cycles of enthusiasm and disappointment until three key developments converged in the 2010s:

• Model architectures: The development of novel architectures, such as convolutional neural networks, initially used for computer vision and later adopted for machine translation. Ultimately, the transformer architecture enabled the emergence of generative AI applications.

• Compute: Advances in compute infrastructure, including the use of Graphics Processing Units (GPUs) for machine learning and the development of the CUDA programming platform, allowed for the massive parallel computations required for AI training.

• Data: The creation of large labeled datasets enabled a data-centric approach to AI and reinforced the dominance of supervised learning. Many of these datasets were built from publicly available or openly shared web content, highlighting how open data can facilitate the collection and curation of training examples.

23 Geoffrey E. Hinton, et al. "A Fast Learning Algorithm for Deep Belief Nets." Neural Computation, vol. 18, no. 7, July 2006, pp. 1527–54. https://doi.org/10.1162/neco.2006.18.7.1527
24 Alex Krizhevsky, et al. "ImageNet Classification with Deep Convolutional Neural Networks." Advances in Neural Information Processing Systems, vol. 25, Curran Associates, Inc., 2012. https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html
25 David E. Rumelhart, et al. "Learning Representations by Back-Propagating Errors." Nature, vol. 323, no. 6088, Oct. 1986, pp. 533–36. https://doi.org/10.1038/323533a0
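The training loop described above – predict, compare to the correct answer, compute the error, update the weights – can be sketched for a Figure 2-style network in a few lines of NumPy. Layer sizes, the input values and the learning rate are invented for illustration, and real training iterates over many examples with an optimizer rather than this single-sample loop:

```python
import numpy as np

rng = np.random.default_rng(1)

# A Figure 2-style network: 3 inputs -> 4 hidden units -> scores for A, B, C.
W1 = rng.normal(0.0, 0.5, (3, 4))
W2 = rng.normal(0.0, 0.5, (4, 3))

def forward(x):
    h = np.tanh(x @ W1)              # hidden-layer activations
    z = np.exp(h @ W2)
    return h, z / z.sum()            # softmax scores over A, B, C

x = np.array([0.5, 0.1, 0.8])        # one input example
target = np.array([1.0, 0.0, 0.0])   # correct answer: letter "A"

for _ in range(200):                 # backpropagation loop
    h, p = forward(x)
    err = p - target                              # prediction vs. correct answer
    dW2 = np.outer(h, err)                        # output-layer gradient
    dW1 = np.outer(x, (W2 @ err) * (1 - h ** 2))  # error propagated backward
    W2 -= 0.5 * dW2                               # update weights to reduce error
    W1 -= 0.5 * dW1

h, p = forward(x)
print(p.argmax())  # 0, i.e. the network now scores "A" highest
```

Scaled up from a handful of weights to billions of parameters and trained over massive datasets, this same predict-and-correct cycle is what consumes the compute discussed throughout this chapter.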


A pivotal moment came in 2012 with AlexNet, a deep neural network that significantly outperformed existing methods in image recognition. Its success demonstrated that deep learning models could achieve breakthrough results when trained on large datasets using sufficient computational power. This marked the beginning of the modern deep learning era.

AlexNet would not have been possible without ImageNet, a dataset containing millions of labeled images across thousands of categories. ImageNet provided the scale necessary to train deep neural networks in ways that had not previously been feasible. Its success relied on publicly available web content that could be scraped, as well as on Amazon's Mechanical Turk platform, where tens of thousands of low-paid workers labeled the images. According to Fei-Fei Li, who led the work, the ImageNet project was the first initiative to demonstrate the critical role of large-scale datasets for AI development.26

Equally important was the leap in computational capabilities. A key breakthrough came with GPUs, which were originally designed for computer games but proved transformative for AI due to their superior ability to handle multiple calculations simultaneously. Nvidia's introduction of the CUDA computing platform in 2006 was especially important, as it allowed GPUs to be used for general-purpose machine learning tasks. This convergence of massive datasets and GPU-powered parallel processing capabilities proved decisive in accelerating AI development.

Already with ImageNet, data governance issues that now plague AI development were beginning to surface. The labor of some 49,000 underpaid Mechanical Turk workers remained largely invisible. The work involved in labeling images scraped from the web, and the rights of the individuals depicted in those images, were also poorly acknowledged.27 Later, critical analyses of ImageNet revealed that the labeling process introduced multiple forms of bias, which went unaddressed even as the dataset became foundational to AI research.28 Similar problems applied to other computer vision datasets built at that time, and these challenges have continued with the creation of training datasets for generative AI models.29

"Attention is all you need": Transformers and the rise of generative AI

Among recent technological innovations, the development of the transformer architecture stands out as arguably the most significant breakthrough driving the rapid progress of generative AI over the past decade. Introduced by machine translation researchers at Google in 2017, this architecture marked a major leap forward in machine learning for language processing.30

Unlike earlier models that processed text sequentially, transformers can process entire sequences simultaneously through a mechanism called self-attention. This mechanism allows the model to weigh the relevance of each word in relation to all others in a sentence, greatly enhancing its capacity to understand context and meaning.

The complexity of the self-attention mechanism means that transformers require compute and data resources at a scale not seen in any prior algorithm. This characteristic has key consequences for how AI development is resourced – an issue that public AI policies must take seriously.

26 Timothy B. Lee. "Why the Deep Learning Boom Caught Almost Everyone by Surprise." Understanding AI. 5 November 2024. https://www.understandingai.org/p/why-the-deep-learning-boom-caught
27 Vi Hart. "Changing My Mind about AI, Universal Basic Income, and the Value of Data." The Art of Research. 30 May 2019. https://theartofresearch.org/ai-ubi-and-data/
28 Kate Crawford and Trevor Paglen. "Excavating AI: The Politics of Images in Machine Learning Training Sets." 19 September 2019. https://excavating.ai/
29 Alek Tarkowski and Zuzanna Warso. "AI Commons: Filling the governance vacuum on the use of information commons for AI training." Open Future. 12 January 2023. https://openfuture.eu/publication/ai-commons/
30 Ashish Vaswani, et al. "Attention Is All You Need." arXiv:1706.03762, 2 Aug. 2023. https://doi.org/10.48550/arXiv.1706.03762


INFOBOX | The attention mechanism in transformer architectures

The attention mechanism starts by converting each token – a word or subword unit – into a high-dimensional vector. It then computes three matrices, known as queries, keys and values, from these token vectors. These matrices help the model determine how information flows between tokens, allowing it to capture both short-range and long-range dependencies in the text. This architecture enables models to learn increasingly sophisticated language patterns.

Previous approaches to language processing relied primarily on techniques like recurrent neural networks (RNNs), which processed text sequentially (i.e., one word at a time). Consider the sentence:

"Watching the water flow gently, she sat down by the bank."

RNN-based models would process the sentence word by word. By the time the model reaches the word "bank," it might struggle to recall earlier words like "water" and "flow" because RNNs tend to forget information as the sequence grows longer. While more sophisticated architectures like long short-term memory networks (LSTMs) improved on basic RNNs, they still faced fundamental limitations in processing long sequences.

The key innovation of transformers was the introduction of the attention mechanism, which enables the model to process all words in a sequence simultaneously and compute their relationships. For instance, in the sentence:

"The water continued to flow steadily, gradually eroding the bank."

the attention mechanism allows the model to recognize that "water" and "flow" are highly relevant for interpreting the meaning of "bank," helping it infer that the term refers to a riverbank rather than a financial institution – even though the relevant words appear several tokens earlier. When processing each word, the model calculates attention scores to determine how much weight to assign to every other word in the sequence.

It is important to note that transformer architectures come in different variants that use the attention mechanism in task-specific ways. Encoder-only models process entire inputs at once to build rich contextual representations (as in the example above), making them well suited for tasks such as text classification and comprehension. Decoder-only models generate outputs one token at a time, attending to previously generated tokens, and are used in text generation and completion systems. Encoder-decoder models combine both approaches by first encoding the full input before the decoder produces an output, enabling tasks like translation, summarization and question answering.

Since 2017, the transformer architecture has become the foundation for a wide range of language tasks. Its most prominent application has been in LLMs, notably the Generative Pretrained Transformer (GPT) family developed by OpenAI, beginning with GPT-2 in 2019. Since then, generative AI development has focused on extending and improving model capabilities. In this development paradigm, progress is achieved by scaling three key resources: data, compute and the size of the resulting model. The characteristics of transformer architectures – and their high demands for compute and data – are crucial considerations for public AI policies.

Recent architectural innovations have extended transformer capabilities to handle multiple modalities simultaneously. These multimodal models are able to process combinations of text, images, audio and video through specialized adaptations of the original architecture. At the same time, researchers are exploring novel approaches like domain-specific fine-tuning, which improves model performance for specific applications without relying solely on increasing model size. Advances in quantization and pruning are also making it possible to deploy larger models more efficiently on resource-constrained hardware.

The generative AI development process

The development of generative AI models is a multi-stage process that combines different training approaches to create capable AI systems. This process involves multiple components, including the model architecture, code used to train or evaluate a model, code used to preprocess training datasets, and datasets used for model training, evaluation or alignment. Throughout the process, computing resources are used to train the generative AI model on large volumes of data.

In this section, we provide an overview of the generative AI development process, from the pretraining of base foundation models to their post-training fine-tuning and eventual deployment in user-facing applications. The technical characteristics of this process determine the resources needed to create generative AI. Securing access to these resources is a key objective of public AI policies. The aim here is to provide a conceptual framework that illustrates the core stages and decision points in developing generative AI systems. Understanding how the components interconnect – and where dependencies arise – is essential for designing realistic and actionable pathways toward public AI.

The diagram below offers a simplified illustration of the generative AI development process, divided into three phases: pretraining, post-training and inference.

Pretraining phase

Pretraining is the first step in the model development pipeline, in which models are trained through self-supervised learning to identify general patterns from large datasets.

Before any training begins, the data must be prepared. This process starts with data collection, followed by the cleaning and validation of raw data. At this stage, legal issues – such as licensing – are addressed. The cleaned dataset is then filtered to remove low-quality or inappropriate content. The data is further processed; for example, text data used to train LLMs is tokenized to ensure it can be efficiently processed by the model. If sensitive data is involved, privacy and security measures are also implemented. In parallel, the model architecture, software and other tooling are designed and prepared.

This phase requires substantial computational resources and, since 2020, has been shaped by scaling laws – a trend in which increasing model size, dataset size and compute power leads to improved performance (these laws are described in more detail on page 27). Each new generation of frontier models requires greater quantities of these key resources. However, computational demands vary significantly depending on model scale, and recent advances in training efficiency have begun to reduce these requirements. A new small model paradigm is emerging, shifting focus from ever-larger models to more efficient and resource-conscious approaches.31

In each case, pretraining is the most resource-intensive phase of model development. Its output is a pretrained base model – not yet suitable for deployment, but capable of serving as a general-purpose foundation for further training.

31 Rina Diane Caballar. "What Are Small Language Models (SLM)?" IBM. 31 October 2024. https://www.ibm.com/think/topics/small-language-models Accessed 3 April 2025.
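The data preparation steps described above – collection, cleaning, filtering and tokenization – can be caricatured in a few lines of Python. The filters and word-level tokenizer are deliberately naive stand-ins; production pipelines rely on deduplication, trained quality classifiers and subword tokenizers such as byte-pair encoding:

```python
import re

def clean(doc: str) -> str:
    # Cleaning/validation stand-in: normalize whitespace.
    return re.sub(r"\s+", " ", doc).strip()

def keep(doc: str) -> bool:
    # Quality-filter stand-in: drop very short documents.
    return len(doc.split()) >= 4

def tokenize(doc: str) -> list:
    # Tokenizer stand-in: whole words instead of subword units.
    return doc.lower().split()

raw_corpus = [
    "  The water continued   to flow. ",
    "ok",
    "Public AI needs open training data.",
]
prepared = [tokenize(clean(d)) for d in raw_corpus if keep(clean(d))]
print(len(prepared))  # 2 documents survive filtering
```

Applied to trillions of tokens rather than three sentences, exactly these filtering and tokenization choices shape what a base model can and cannot learn.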


Figure 3 | Development of a generative AI model
Pre-training: training datasets, software & tooling and the model architecture feed into model pre-training, producing a base model. Post-training: further datasets and software & tooling refine the base model through model post-training into a deployed model. Deployment: inference compute runs the deployed model behind user-facing apps. Illustration by: Jakub Koźniewski


Post-training phase

During this phase, the base model is refined through additional training to improve its capabilities for specific domains and align its behavior with specific objectives. At this stage, various datasets are used to further train the model, including validation, instruction and benchmark datasets.

Development methods in this phase have evolved rapidly, with two key approaches becoming prominent in 2024. The first is supervised fine-tuning, in which models are trained on domain-specific or task-specific datasets. For example, a model might be fine-tuned on medical literature to enhance its healthcare-related capabilities. The second approach relies on reinforcement learning to enhance model reasoning and decision-making. Two primary methods are reinforcement learning from human feedback (RLHF), where human preferences guide the learning process, and reinforcement learning from AI feedback (RLAIF), which uses other AI systems to provide training signals.32

This phase also includes evaluation and deployment optimization, ensuring that the models meet requirements for accuracy, reliability, computational efficiency and safety before deployment in real-world applications. This includes testing on standardized benchmarks, which measure model capabilities across diverse domains including reasoning, knowledge and safety.33 Models are also evaluated through specialized testing methodologies like adversarial testing and hallucination detection.

The result of this phase is a fully trained model, ready for deployment.

Deployment

In the so-called inference stage, the trained model is deployed for use in user-facing applications, requiring additional computational resources known as inference compute. Alternatively, a model can be hosted on a cloud platform and made available via an API, enabling third parties to build their own applications on top of it. Inference compute is essential to run the model and, in the case of large frontier models, can involve significant costs and environmental impact. This is also the reason why small, more sustainable models are being developed.

The overview presented here simplifies what is often a far more complex development process. In practice, model development is rarely a one-time effort. AI labs typically aim to improve their models over time, revisiting earlier stages of the process. As a result, development is often iterative, with feedback from later stages informing adjustments to earlier components. Developers frequently cycle between phases as they refine their systems. Finally, compound AI systems combine multiple models to build holistic workflows and applications.

This overview also does not fully reflect the diversity of open model development ecosystems, such as those found on platforms like Hugging Face. Such development entails the combined use of various models. In such cases, work often begins with an openly available base model from which derivative models are produced, such as through fine-tuning. Furthermore, smaller models can be created from large models through techniques like distillation or quantization (see Section 3).
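The supervised fine-tuning step described above can be illustrated with a deliberately tiny stand-in for a language model. The sketch below "pretrains" a bigram model on a general corpus, then continues training it on a small medical corpus; the corpora, the bigram model and the probabilities are toy assumptions, not any lab's actual pipeline:

```python
from collections import Counter, defaultdict

def train_bigram(corpus, counts=None):
    """Update bigram counts from a token sequence; calling again = continued training."""
    counts = counts if counts is not None else defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts

def prob(counts, prev, nxt):
    """Probability the model assigns to `nxt` following `prev`."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# "Pretraining" on a broad, general corpus.
general = "the cat sat on the mat the dog sat on the rug".split()
model = train_bigram(general)
p_before = prob(model, "the", "patient")  # domain-specific continuation is unknown

# "Fine-tuning": continue training the same model on a narrow medical corpus.
medical = "the patient rests the patient recovers the patient rests".split()
model = train_bigram(medical, model)
p_after = prob(model, "the", "patient")   # domain-specific continuation is now likely

print(p_before, p_after)  # probability rises from 0.0 after fine-tuning
```

Real supervised fine-tuning updates billions of transformer weights with gradient descent rather than bigram counts, but the effect is the same in kind: continued training shifts the model's distribution toward the target domain.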

AI scaling laws: The contested future of AI

In the previous section, we provided an overview of how generative AI is developed, and demonstrated how compute, data, software and architectural components are used in this process. In this section we explain AI scaling laws – a defining feature of transformer-based model development. The discovery of these empirical relationships has made AI development increasingly dependent on securing ever-larger amounts of compute power and training data. Ensuring access to these resources must be a core focus of any public AI policy aiming to support state-of-the-art model development.

What are AI scaling laws?

A 2020 research paper by OpenAI researchers observed that there are scaling laws34 inherent in transformer-based model development. The paper identified a power-law relationship between three scalable resources for AI training – model parameter count, training data size and computational power – and model performance. The approach has been described as the "bigger is better" paradigm in AI.35

Scaling laws are not natural laws. They are based on empirical observations: performance on benchmark tasks improves when all three inputs – model size, dataset size and compute – are increased together during the pretraining phase, while increasing any single factor in isolation leads to diminishing returns.36 In other words, transformer models – known for their resource-intensive nature due to their ability to process vast amounts of data in parallel – are central to this paradigm.

The so-called Chinchilla scaling law, introduced in 2022 by DeepMind, refined this model by proposing a compute-optimal ratio between model size, training data and compute.37 The research demonstrated that earlier models, in particular GPT-3, had been undertrained: the models were too large relative to the amount of data and compute used. A new approach, suggested in the paper, allowed smaller models to be as effective as larger ones, if higher quality data was used for training over extended periods of time.

The evolution of AI scaling laws

While scaling laws have driven significant gains in AI performance, recent research suggests that the benefits of continued scaling may be slowing.38 The key constraint is the limited availability of high-quality training data. Ilya Sutskever, one of the co-founders of OpenAI and a key figure in transformer-based model development, argues that we have reached "peak data,"39 as all frontier AI labs rely on scraping web data, and scaling laws are beginning to plateau. The stakes in this debate are high. Proponents of artificial general intelligence (AGI) – a hypothetical form of AI with human-level or greater capabilities – often place their bets on the continued validity of scaling laws. This belief in the power of AI scaling also underpins recent massive investments into compute, such as the $500 billion investment announced by OpenAI, Oracle and SoftBank in January 2025.40 Critics, meanwhile, view the current wave of AI development as yet another hype cycle destined to collapse because of the consequences of scaling laws.41

Opinions about the future of AI scaling laws vary. Some experts argue that scaling laws plateau naturally over time and that the current slowdown is a natural progression of pretraining scaling laws and

32 Nathan Lambert. "Beyond Human Data: RLAIF Needs a Rebrand." Interconnects. 26 April 2023. https://www.interconnects.ai/p/beyond-human-data-rlaif
33 Reginald Martyr. "LLM Benchmarks Explained: Significance, Metrics & Challenges." Orq.ai. 26 February 2025. https://orq.ai/blog/llm-benchmarks
34 Jared Kaplan, et al. Scaling Laws for Neural Language Models. arXiv:2001.08361, arXiv, 23 Jan. 2020. arXiv.org, https://doi.org/10.48550/arXiv.2001.08361
35 Gaël Varoquaux, et al. Hype, Sustainability, and the Price of the Bigger-Is-Better Paradigm in AI. arXiv:2409.14160, arXiv, 1 Mar. 2025. arXiv.org, https://doi.org/10.48550/arXiv.2409.14160
36 Cameron R. Wolfe. "Scaling Laws for LLMs: From GPT-3 to O3." Deep (Learning) Focus, 6 Jan. 2025, https://cameronrwolfe.substack.com/p/llm-scaling-laws
37 Jordan Hoffmann, et al. Training Compute-Optimal Large Language Models. arXiv:2203.15556, arXiv, 29 Mar. 2022. arXiv.org, https://doi.org/10.48550/arXiv.2203.15556
38 Gaël Varoquaux, et al. ibid.
39 Jeffrey Dastin. "AI with Reasoning Power Will Be Less Predictable, Ilya Sutskever Says." Reuters. 14 December 2024. https://www.reuters.com/technology/artificial-intelligence/ai-with-reasoning-power-will-be-less-predictable-ilya-sutskever-says-2024-12-14/
40 Reuters. "SoftBank to Invest $500 Mln in OpenAI, The Information Reports." 30 September 2024. https://www.reuters.com/technology/softbank-invest-500-mln-openai-information-reports-2024-09-30/
41 Gary Marcus. "The Most Underreported and Important Story in AI Right Now Is That Pure Scaling Has Failed to Produce AGI." Fortune. 19 February 2025. https://fortune.com/2025/02/19/generative-ai-scaling-agi-deep-learning/


was always to be expected. Others predict that AI scaling laws will not diminish but rather evolve, as the scaling effects depend on multiple factors.42

An evolution in scaling is already underway. The singular focus on model pretraining – dominant between 2020 and 2023 – is giving way to a new paradigm that focuses on advantages gained in later stages of development, or even in the deployment phase.43 Even if the original pretraining scaling laws are beginning to level off, these other phases are increasingly seen as the next frontier. Some researchers argue that the future of AI capabilities might depend more on finding the right balance between these three scaling dimensions than on pushing any single dimension to its limits.44

Starting in 2024, leading AI companies have shifted their focus to scaling during the post-training phase. In this phase, optimization involves refining models after their initial training through reinforcement learning techniques and other fine-tuning methods.45 These methods help align models with human preferences and specific tasks, and enable the development of models capable of more complex reasoning. Research suggests that the relationship between resources invested in post-training and resulting performance improvements follows its own distinct scaling patterns. Some researchers argue that there may be more headroom for improvement in this phase than in pretraining, particularly as reinforcement learning methods continue to advance.46

A related trend is inference compute scaling, where greater computational resources are allocated during model use to enhance performance. This has emerged as a new frontier, particularly in the development of reasoning models that generate multiple candidate outputs and internally select the best one.47 Early results indicate this approach can significantly boost performance without increasing model size or requiring additional training data.48 The approach gained attention with the release of OpenAI's reasoning models, such as o1 and o3, which demonstrated strong benchmark performance using this method.

Even if the original scaling laws – now referred to as pretraining scaling – begin to plateau, this does not necessarily imply a reduction in computational demand. Inference scaling continues to gain traction and could sustain high resource requirements. Consequently, the development and deployment of transformer-based models may remain costly, even as architectures and training methods evolve.49

As described in chapter 2, AI scaling laws are a key driver of concentration in the AI ecosystem. An AI development approach that is based on the transformer architecture and adheres to AI scaling laws has several repercussions. First, it creates the kind of concentrations of market power outlined in the previous section. Second, it creates new and distinct forms of digital divide, related to uneven access to computing resources. And third, it dramatically increases the environmental footprint of AI systems.

Scaling and AI's environmental footprint

The training and deployment of large AI models comes with a major environmental footprint that extends beyond just the energy consumption of data centers. For example, training a single large language model can emit up to 550 metric tons of CO2.50 Moreover, the energy required for deployment – known as inference – accounts for a substantial share of ongoing AI-related energy use, ranging from one-third at Meta51 to as much as 60% at Google.52 In addition, the cooling systems required to prevent data centers from overheating often rely on large volumes of water, placing further strain on local water resources.53 As AI systems become more widespread, these energy demands continue to grow. It is telling that the dominant AI companies are evolving into energy companies.54 This trajectory is fundamentally unsustainable, as computational demands grow faster than improvements in model performance.55

The environmental costs of AI extend beyond energy consumption. Beyond data centers, the entire AI supply chain raises serious concerns about the environmental costs of contemporary approaches to AI development and deployment56 – from the extraction of raw materials for GPUs to the mounting problem of electronic waste from discarded hardware. This creates what researchers call a Jevons paradox:57 as individual models become more efficient, overall environmental impact increases because improved efficiency leads to broader deployment and more frequent use. This dynamic raises questions about whether the current trajectory of AI development, with its emphasis on scale, is environmentally sustainable in the long term.

These environmental concerns are an important factor that should be integrated into public AI policymaking. Any deployment of AI technologies must address the environmental impact inherent in today's generative AI systems. Commercial AI labs – operating within the paradigm of AI scaling laws and leveraging full-stack approaches – tend to embrace a "bigger is better" model of data center development, often at the expense of environmental sustainability.58 AI's environmental impact, alongside financial limitations, is a major reason why public AI policies should not simply replicate commercial strategies focused on ever-larger models.

What is the future of AI scaling laws?

In late 2024, news about DeepSeek's V3 and R1 models led many media observers to question the scaling paradigm's future. Initial reports suggested that DeepSeek had managed to reduce development costs from hundreds of billions to just several billion dollars. However, it has since become clear that these models do not represent a fundamental shift in AI development, contrary to what some experts initially suggested.

DeepSeek ultimately did not demonstrate that a new generation of state-of-the-art models can be built with dramatically reduced compute requirements. The widely cited $5.6 million training figure referred

42 Cameron R. Wolfe. ibid.
43 Dario Amodei. "On DeepSeek and Export Controls." Dario Amodei (blog). January 2025. https://www.darioamodei.com/post/on-deepseek-and-export-controls Accessed 3 April 2025.
44 Jordan Hoffmann, et al. ibid.
45 Cameron R. Wolfe. ibid.
46 Cameron R. Wolfe. "Basics of Reinforcement Learning for LLMs. Understanding the problem formulation and basic algorithms for RL." Deep (Learning) Focus, 25 September 2023. https://cameronrwolfe.substack.com/p/basics-of-reinforcement-learning/
47 Maxwell Zeff. "Current AI Scaling Laws Are Showing Diminishing Returns, Forcing AI Labs to Change Course." TechCrunch, 20 November 2024. https://techcrunch.com/2024/11/20/ai-scaling-laws-are-showing-diminishing-returns-forcing-ai-labs-to-change-course/
48 Cameron R. Wolfe. ibid.
49 "Scaling Laws – O1 Pro Architecture, Reasoning Training Infrastructure, Orion and Claude 3.5 Opus 'Failures.'" SemiAnalysis, 11 December 2024, https://semianalysis.com/2024/12/11/scaling-laws-o1-pro-architecture-reasoning-training-infrastructure-orion-and-claude-3-5-opus-failures/
50 Gaël Varoquaux, et al. ibid.
51 Carole-Jean Wu, et al. "Sustainable AI: Environmental Implications, Challenges and Opportunities." Proceedings of Machine Learning and Systems, vol. 4, Apr. 2022, pp. 795–813. proceedings.mlsys.org, https://proceedings.mlsys.org/paper_files/paper/2022/hash/462211f67c7d858f663355eff93b745e-Abstract.html
52 David Patterson, et al. "The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink." Computer, vol. 55, no. 7, July 2022, pp. 18–28. IEEE Xplore, https://doi.org/10.1109/MC.2022.3148714
53 Cindy Gordon. "AI Is Accelerating the Loss of Our Scarcest Natural Resource: Water." Forbes. 25 February 2024. https://www.forbes.com/sites/cindygordon/2024/02/25/ai-is-accelerating-the-loss-of-our-scarcest-natural-resource-water/
54 Alex Lawson. "Google to Buy Nuclear Power for AI Datacentres in 'World First' Deal." The Guardian, 15 October 2024. https://www.theguardian.com/technology/2024/oct/15/google-buy-nuclear-power-ai-datacentres-kairos-power
55 Gaël Varoquaux, et al. ibid.
56 Ana Valdivia. "The Supply Chain Capitalism of AI: A Call to (Re)Think Algorithmic Harms and Resistance through Environmental Lens." Information, Communication & Society, Oct. 2024, pp. 1–17. DOI.org (Crossref), https://doi.org/10.1080/1369118X.2024.2420021
57 Alexandra Sasha Luccioni, Emma Strubell, and Kate Crawford. "From Efficiency Gains to Rebound Effects: The Problem of Jevons' Paradox in AI's Polarized Environmental Debate." arXiv, 27 January 2025. https://doi.org/10.48550/arXiv.2501.16548
58 Dara Kerr. "AI Brings Soaring Emissions for Google and Microsoft, a Major Contributor to Climate Change." NPR, 12 July 2024. https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change


only to the final training run. The total cost of DeepSeek's AI infrastructure is estimated at $1.6 billion. The company operates around 50,000 GPUs – comparable to major Western AI labs and consistent with the demands of the AI scaling laws.59

As explained in the technical report, DeepSeek researchers achieved performance on par with existing state-of-the-art models through two major algorithmic improvements.60 The first was an advancement of the mixture of experts technique, which divides the model into specialized submodels. Instead of activating the full model during training and inference, only relevant submodels are engaged, reducing computational requirements.

The second breakthrough was prompted by U.S. export controls, which restrict DeepSeek to using lower-performance Nvidia chips with limited memory bandwidth – a major constraint, since model training requires moving massive volumes of data between memory and processing units. In response, DeepSeek developed methods to reduce memory bandwidth demands during both pretraining and inference.61

Moreover, the DeepSeek researchers demonstrated that the advanced capabilities of state-of-the-art models like DeepSeek-R1 can be distilled into smaller models. These smaller models can match – and on some benchmarks outperform – state-of-the-art models, both open and proprietary. The company has also pledged to release open-weight versions of its models – freely available for use, though without access to the original training data or certain proprietary components.

The release of DeepSeek's models offers several important lessons for public AI strategy. First, under the current scaling paradigm, access to computing power remains a critical requirement for developing state-of-the-art AI models. Even with the innovations introduced by the DeepSeek team, building state-of-the-art models remains a resource-intensive endeavor. Even if costs – and thus environmental impact – are reduced at each stage of training and deployment, the Jevons paradox still applies: efficiency gains may drive wider adoption, ultimately increasing total energy consumption.62

However, computing power is only one part of the equation. The DeepSeek example shows that significant gains can also be achieved through advances in machine learning techniques. Many of these depend on the availability of a state-of-the-art model, which can be used to create derivative, smaller models – enabling capability transfer without needing to replicate the full computational cost of the original model.

Therefore, concentration of compute does not automatically result in a lasting concentration of model capability. Distillation techniques used by DeepSeek show that smaller, more energy-efficient models can be created on the basis of larger models.

The key takeaway from the release of DeepSeek's models is that it ultimately confirms the economics of frontier AI development under the current scaling paradigm. Any public AI strategy must still secure access to significant computing resources – comparable to those of major commercial AI labs – in order to develop state-of-the-art models. This either requires significant investment in public compute infrastructure, which would only yield results over the longer term, or accepting a degree of reliance on commercial compute providers.

59 Anton Shilov. "DeepSeek Might Not Be as Disruptive as Claimed, Firm Reportedly Has 50,000 Nvidia GPUs and Spent $1.6 Billion on Buildouts." Tom's Hardware, 2 February 2025, https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-might-not-be-as-disruptive-as-claimed-firm-reportedly-has-50-000-nvidia-gpus-and-spent-usd1-6-billion-on-buildouts
60 DeepSeek-AI, et al. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv:2501.12948, arXiv, 22 Jan. 2025. arXiv.org, https://doi.org/10.48550/arXiv.2501.12948
61 Ben Thompson. "DeepSeek FAQ." Stratechery. 27 January 2025. https://stratechery.com/2025/deepseek-faq/
62 Jennifer Collins. "What Does DeepSeek Mean for AI's Environmental Impact?" DW.com. 30 January 2025. https://www.dw.com/en/what-does-chinas-deepseek-mean-for-ais-energy-and-water-use/a-71459557
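The computational savings behind the mixture-of-experts technique described above can be made concrete with back-of-the-envelope arithmetic. The expert counts and parameter sizes below are invented for illustration and are not DeepSeek's actual configuration:

```python
def active_fraction(n_experts, experts_per_token, shared_params, expert_params):
    """Fraction of parameters used per token in a top-k mixture-of-experts model.

    A router selects `experts_per_token` of the `n_experts` submodels for each
    token, so compute per token scales with the active parameters (shared
    layers plus the selected experts), not with the total parameter count.
    """
    total = shared_params + n_experts * expert_params
    active = shared_params + experts_per_token * expert_params
    return active / total

# Illustrative configuration: 64 experts of 1B parameters, 4 routed per token,
# plus 2B parameters of shared layers that every token passes through.
frac = active_fraction(n_experts=64, experts_per_token=4,
                       shared_params=2_000_000_000, expert_params=1_000_000_000)
print(f"{frac:.1%} of parameters active per token")  # -> 9.1%
```

Under these made-up numbers, each token touches under a tenth of the model, which is why sparse architectures can cut training and inference compute without shrinking total capacity.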


At the same time, the DeepSeek example illustrates that machine learning techniques are far from reaching their full potential. Algorithmic improvements and post-training methods such as distillation can make training more efficient and enable the development of smaller, more accessible models. A public AI strategy could therefore prioritize a research and innovation agenda aimed at reducing dependence on transformer architectures and their associated scaling laws. Small model development can also complement the creation of frontier models – particularly when models are openly released – by supporting an ecosystem in which a range of models can be adapted and used by diverse actors.
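Distillation, mentioned above as one such technique, trains a small "student" model to reproduce the output distribution of a large "teacher" rather than learning from hard labels. A minimal numerical sketch, with made-up logits standing in for real models:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL divergence: how far the student's distribution q is from the teacher's p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# The teacher's soft output distribution over three classes (illustrative logits).
teacher = softmax([2.0, 0.5, -1.0])

# The student starts uninformed and is trained on the teacher's soft targets.
student_logits = [0.0, 0.0, 0.0]
before = kl(teacher, softmax(student_logits))
for _ in range(1000):
    p = softmax(student_logits)
    # Gradient of cross-entropy(teacher, student) w.r.t. the logits is p - teacher.
    student_logits = [l - 0.5 * (pi - ti)
                      for l, pi, ti in zip(student_logits, p, teacher)]
after = kl(teacher, softmax(student_logits))
print(before, after)  # divergence shrinks as the student mimics the teacher
```

The soft targets carry more information than a single correct label, which is what lets a much smaller model inherit behavior from a larger one without repeating its full training.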

3 | The generative AI stack

Overview of the AI stack

AI systems can be understood to comprise a layered technological stack in which each layer interacts with and supports the others. Each layer typically serves a distinct purpose, as different components – often developed by different actors – contribute to the creation and operation of the overall system. This modular perspective, similar to how the internet is often described as a technological stack, helps clarify the roles, dependencies and forms of collaboration involved in AI development and deployment.63

Figure 4 | Layers of the AI stack (Illustration by: Jakub Koźniewski)

[Diagram: the AI stack as four layers, from bottom to top – COMPUTE, DATA, MODEL, APPS.]

The basic AI stack consists of the following layers, arranged from the bottom up:

• Compute: This foundational layer refers to the physical and software infrastructure that enables AI development and deployment. At its core are specialized processors or chips – primarily GPUs – designed to handle the massive parallel computations required for training and running AI models. To make these chips usable at scale, two elements are essential: software frameworks that optimize GPU performance and the integration of GPUs into data centers, where they are stacked and networked into powerful, scalable compute systems, often delivered via cloud platforms.

• Data: This layer involves storage, processing and transfer of datasets used in both the pretraining and post-training phases of AI development.

• Models: This layer refers to the AI models themselves. Each model consists of an architecture and a set of parameters – its weights and biases – refined through training. These models are typically deployed as cloud-based services.

63 Cole Stryker, "What Is an AI Stack?" IBM. 29 November 2024. https://www.ibm.com/think/topics/ai-stack

• Applications: In this layer, AI models are embedded in user-facing systems and applications. Running these applications requires additional computing power, referred to as inference or test-time compute.

In analyzing public AI development pathways, we focus on the compute, data and model layers. We do not address the applications layer, as it lies downstream from the core layers under discussion. The existence of public AI applications depends on the availability and capabilities of AI systems built through the orchestration of resources at these foundational levels. Therefore, when considering elements of public AI policy, application development pathways will be listed as a key additional measure.

Beyond the fundamental layers – from hardware to applications – additional layers are sometimes identified to emphasize other critical functions. For example, Mozilla introduces a safeguards layer in its AI stack to highlight the importance of tools and mechanisms that ensure the safety of AI systems.64 Similarly, a talent layer is often added to highlight the indispensable role of human talent and know-how in driving AI development.65

Software used in AI development can also be seen as a cross-cutting layer that spans the entire AI stack. At each level, both proprietary and open source solutions are commonly used by developers. At the compute level, software is essential for managing and orchestrating hardware resources during the development and deployment of AI models. It creates abstractions that allow developers to harness computing power without dealing directly with the complexity of low-level hardware. For example, Nvidia's CUDA provides direct access to GPUs and optimizes their use for AI workloads, while PyTorch is a deep learning framework that abstracts hardware complexity and offers high-level APIs for efficient model development.

In the data layer, software manages data ingestion, processing and organization. For models, it provides development environments as well as tools for training, evaluation and fine-tuning. Widely used open source libraries include scikit-learn and PyTorch for training machine learning models, GPT-NeoX for training large language models, vLLM for model inference or serving, transformers for fine-tuning and LM Harness and lighteval for evaluation. At the application layer, software provides APIs, monitoring systems and user interfaces that make AI models accessible and usable by downstream developers and users.

Because software development cuts across all layers of the stack, it is difficult to define a standalone public AI development pathway that focuses solely on software. Instead, support for software development should be considered a key complementary measure within each of the pathways.

The section below outlines the advantages of the stack model for designing public AI policy and highlights how it helps clarify concentrations of power in the AI ecosystem. This is followed by a closer look at the three core layers – compute, data and models. The characteristics of these layers shape dependencies on commercially provided resources that public AI initiatives must navigate. They also present key considerations for any effort to develop independent public AI systems.

64 Adrien Basdevant, et al. "Towards a Framework for Openness in Foundation Models. Proceedings from the Columbia Convening on Openness in Artificial Intelligence." Mozilla. 21 May 2024. https://foundation.mozilla.org/en/research/library/towards-a-framework-for-openness-in-foundation-models/
65 Ganesh Sitaraman and Alex Pascal. "The National Security Case for Public AI." Vanderbilt Policy Accelerator for Political Economy and Regulation. 24 September 2024. https://cdn.vanderbilt.edu/vu-URL/wp-content/uploads/sites/412/2024/09/27201409/VPA-Paper-National-Security-Case-for-AI.pdf

Advantages of the AI stack concept

The stack model can be a useful framework for governing complex technologies. In this context, governance is understood as the exertion of control over various layers of the stack and the orchestration of actors at each layer to achieve specific outcomes through the use of the overall technology. Typical forms of such control include regulation and voluntary norms,66 and key governance questions concern the interplay between various layers of the stack.

This model also goes hand in hand with supply chain analyses, particularly at the hardware level, by revealing how dependencies and power concentrations emerge in technological systems.67 At each layer of the stack, power can accumulate, and the stack model helps clarify these concentrations and their broader impact on the technological system.

This perspective also illustrates the interconnections between AI systems and other digital infrastructures – such as the internet, online platforms and data centers – and shows how different actors (industry, governments, NGOs, academia and communities) both rely on and influence one another.68

Researchers from the Ada Lovelace Institute note that the resource-intensive nature of AI development often renders "downstream" users dependent on "upstream" providers, typically large AI companies. This dynamic underscores the need for policymakers to understand how value is created and distributed across the AI stack.69

In a stack controlled by a commercial actor, the company often pursues vertical integration, aiming to control the entire stack rather than incorporating third-party components. This can lead to monopolistic power, particularly in digital platforms or cloud infrastructure.70 An alternative strategy seeks to "commoditize the complement," that is, to obtain monopoly power in a single layer while fostering competition in – and thus commoditizing – other layers. From a business perspective, the stack model offers a way to analyze where profits are generated, accounting for both dependencies (such as on GPUs) and competition in an environment where many solutions are openly shared.71

A public approach, on the other hand, focuses not on control, but on orchestrating the various components and layers to achieve public interest goals. A layered approach, based on the stack metaphor, allows for better governance of AI.72 It takes into account the complexity of AI systems, while also demonstrating their interdependent nature. It allows for examination of dependencies at different layers, as well as the benefits of sharing key resources, and their impact on model development. For example, policies that consider the entire stack can address more than just compute resources. The European AI Continent Action Plan is an example of such a "full-stack" approach, as it includes measures on computing power, training data, models and deployment of AI systems.73

The stack metaphor also helps answer two key questions for a public AI strategy: whether a fully public AI stack is possible and, if not, what types of interventions across the AI stack can best generate public value while minimizing dependencies. In Chapter 5, we recommend pathways to public AI based on this analysis.

The first question closely relates to sovereign AI strategies, which aim to give nation-states independent control over a domestic AI stack. This idea is strongly promoted by Nvidia, which holds a dominant position at the hardware layer and counts states seeking control over compute power among its key customers.74 A sovereign AI program requires, in principle, full control of the AI stack – making it a highly contested concept. As Pablo Chavez notes, this is difficult to achieve: "In reality, what most countries working toward AI sovereignty are doing is building a Jenga-like AI stack that gives them enough control and knowledge of AI technology to understand and react to changing technology, market and geopolitical conditions but falls short of complete control."75 In the following chapters, we offer a vision of public AI that does not seek sovereign control but instead aims to secure the ability to orchestrate resources across the AI stack in service of the public good.

66 José Van Dijck. "Seeing the Forest for the Trees: Visualizing Platformization and Its Governance." New Media & Society, vol. 23, no. 9, Sept. 2021, pp. 2801–19. DOI.org (Crossref), https://doi.org/10.1177/1461444820940293
67 Eleanor Shearer, Matt Davies and Mathew Lawrence. "The Role of Public Compute." Ada Lovelace Institute. 24 April 2024. https://www.adalovelaceinstitute.org/blog/the-role-of-public-compute/ Accessed 3 April 2025; Ana Valdivia, ibid.
68 Victoria Ivanova et al. "Future Art Ecosystems. Vol. 4: Art x Public AI." Serpentine Labs. 2025. https://reader.futureartecosystems.org/briefing/fae4/
69 Sabrina Küspert, Nicolas Moës, Connor Dunlop. ibid.
70 Cecilia Rikap. "Antitrust Policy and Artificial Intelligence: Some Neglected Issues." Institute for New Economic Thinking. 10 June 2024. https://www.ineteconomics.org/perspectives/blog/antitrust-policy-and-artificial-intelligence-some-neglected-issues
71 Matt Bornstein, Guido Appenzeller, and Martin Casado. "Who Owns the Generative AI Platform?" Andreessen Horowitz. 19 January 2023. https://a16z.com/who-owns-the-generative-ai-platform/
72 Jakob Mökander, et al. "Auditing Large Language Models: A Three-Layered Approach." AI and Ethics, vol. 4, no. 4, Nov. 2024, pp. 1085–115. Springer Link, https://doi.org/10.1007/s43681-023-00289-2
73 "AI Continent Action Plan." European Commission. 9 April 2025. https://commission.europa.eu/topics/eu-competitiveness/ai-continent_en

can afford the costs to train state-of-the-art models. While DeepSeek initially appeared to mark a shift in the economics of AI training, later analysis suggested otherwise.76 In addition to the high cost of acquiring GPUs, building a data center with sufficient networking infrastructure and covering operational expenses – such as electricity for running and cooling hardware – requires major investment. As a result, only a few of the largest AI companies (Amazon, Google, Meta, xAI and Microsoft) are able to pursue a full-stack approach, which demands massive investments in proprietary data centers.77 Among these, Google and Amazon have a fully integrated AI stack, having developed their own chips (Google's TPU and Amazon's Inferentia and Trainium). Others still depend on Nvidia, which holds a monopolistic position
in the GPU market.

Concentrations of power in the The costs, however, do not end with training. De-
AI stack ploying large AI models is also expensive, as it
requires sustained access to significant compute re-
Public AI visions, while not focused on digital sover- sources to process user queries in real time. For in-
eignty per se, must confront the question of whether stance, OpenAI’s ChatGPT reportedly incurred daily
developing AI systems in the public interest requires operating costs of up to $700,000 in 2023, due to the
some form of “sovereignty” – that is, control over need to continuously run thousands of GPUs.78
the AI stack. In other words, public AI must address
the concentrations of power that exist at various lay- This financial burden has pushed leading AI compa-
ers of the stack, where key resources are held by nies without full-stack capabilities – such as OpenAI,
commercial actors with dominant or near-monop- Anthropic and Mistral AI – to form partnerships with
olistic positions. This is especially true at the com- cloud hyperscalers like Amazon Web Services, Mic-
pute layer. The scale of investment needed to develop rosoft Azure and Google Cloud. This has resulted in a
viable alternatives makes such control extremely circular flow of capital between AI startups and scale-
difficult, if not impossible. As a result, the value gen- ups on the one hand and these cloud hyperscalers on
erated by AI systems is increasingly privatized and, the other hand. It is estimated that the three cloud hy-
in some cases, monopolized. Public AI policies aim to perscalers “contributed a full two-thirds of the $27
mitigate this trend. billion raised by fledgling AI companies in 2023”79

Market concentration is an outcome of the im-


76 Dario Amodei. ibid.
mense costs of training and deploying transform-
77 Ben Thompson. “AI Integration and Modularization.”
er-based generative AI models. Only a few companies Stratechery. 29 May 2024. https://stratechery.com/2024/ai-
integration-and-modularization/
78 Aaron Mok. “It Costs OpenAI Millions of Dollars a Day to Run
74 Angie Lee. “What is sovereign AI?” Nvidia. 28 February 2024. ChatGPT, Analyst Estimates.” Business Insider. 25 April 2023.
https://blogs.nvidia.com/blog/what-is-sovereign-ai/ Accessed https://www.businessinsider.com/how-much-chatgpt-costs-
3 April 2025. openai-to-run-estimate-report-2023-4
75 Pablo Chavez. “Sovereign AI in a Hybrid World: National 79 George Hammond. “Big Tech Outspends Venture Capital Firms
Strategies and Policy Responses.” Lawfare, 7 November 2024. in AI Investment Frenzy.” Financial Times, 29 December 2023.
https://www.lawfaremedia.org/article/sovereign-ai-in-a- https://www.ft.com/content/c6b47d24-b435-4f41-b197-
hybrid-world--national-strategies-and-policy-responses 2d826cce9532

35
The generative AI stack
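The scale of the reported inference costs can be sanity-checked with a back-of-envelope calculation. The GPU count and hourly rental rate below are illustrative assumptions, not figures from this paper:

```python
# Rough model of daily inference cost for a chatbot service that must
# keep a GPU fleet running around the clock. Both inputs are assumed,
# illustrative values.

def daily_inference_cost(n_gpus: int, usd_per_gpu_hour: float) -> float:
    """Cost of running n_gpus continuously for 24 hours."""
    return n_gpus * usd_per_gpu_hour * 24

# ~10,000 GPUs at an assumed $3 per GPU-hour:
cost = daily_inference_cost(10_000, 3.0)
print(f"${cost:,.0f} per day")  # $720,000 per day
```

At these assumed rates, a fleet of roughly ten thousand continuously running GPUs lands in the same ballpark as the figure of up to $700,000 per day reported for ChatGPT.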

and that the majority of capital raised by AI startups, “up to 80-90% in early rounds,” was paid back to the same cloud hyperscalers.80

This uneven allocation of AI’s resources – and of the financial returns they generate – is a defining feature of modern AI technologies. The field exhibits characteristics of a natural monopoly, driven by the high cost of training and deploying AI systems. These costs stem from intense demand for computing power, the high price of chips, the effort required to obtain and prepare large datasets, limited access to proprietary data and the high switching costs between cloud platforms. Economies of scale in generative AI create “winner-takes-most” dynamics, which are reinforced by network effects and first-mover advantages, as early, large-scale systems benefit from user-generated data and established customer bases. This inequality is global, not limited to any single jurisdiction.81

These concentrations of power occur, first of all, at the compute layer. AI development efforts are highly dependent on the three dominant cloud providers – Amazon, Microsoft and Google – and on Nvidia, which currently holds an overwhelming share of the chip market.82 At the data layer, leading AI development labs typically also benefit from their privileged access to proprietary data generated on platforms they own or control. This trend is exemplified by the merger of xAI and X, an AI company and a social media platform respectively, both owned by Elon Musk.83

Closely related to this is the emergence of a “compute divide.”84 Outside a small group of hyperscalers and the AI labs that have partnered with them, most companies have to rent compute for AI development and deployment. These costs have resulted in a divide between the “GPU rich” and “GPU poor” companies.85 A similar “computing divide” also exists between commercial labs and academic or nonprofit research institutions.86 On a global scale, the uneven distribution of GPU-equipped data centers has produced a new kind of digital divide. Countries are now classified into three tiers: “Compute North” nations with advanced GPU data centers capable of developing cutting-edge AI, “Compute South” nations with less-powerful facilities suitable for deploying existing AI, and “Compute Desert” nations that lack such infrastructure entirely and must rely on foreign computing resources.87

Across the world, public and non-commercial computing resources are minuscule in comparison to commercial computing power. While the public sector was an early mover – developing the first supercomputers for research purposes – today its investments are outpaced by the growth of commercial compute capacity, as outlined above. In addition, public supercomputers must support a wide range of research unrelated to generative AI and are therefore neither optimized for AI training nor available for providing inference compute to deployed AI systems. As a result, even nonprofit and academic initiatives

80 Matt Bornstein, Guido Appenzeller and Martin Casado. ibid.
81 Competition and Markets Authority. “AI Foundation Models: Update Paper.” GOV.UK. 16 April 2024. https://www.gov.uk/government/publications/ai-foundation-models-update-paper; Anselm Küsters and Matthias Kullas. “Competition in Generative Artificial Intelligence.” CEP. 12 March 2024. https://www.cep.eu/eu-topics/details/competition-in-generative-artificial-intelligence-cepinput.html; Tejas N. Narechania and Ganesh Sitaraman. “Antimonopoly Tools for Regulating Artificial Intelligence.” SSRN. 25 September 2024. https://www.ssrn.com/abstract=4967701
82 Jai Vipra and Sarah Myers West. “Computational Power and AI.” AI Now Institute. 27 September 2023. https://ainowinstitute.org/publication/policy/compute-and-ai
83 Maxwell Zeff. “Elon Musk says xAI acquired X.” TechCrunch. 29 March 2025. https://techcrunch.com/2025/03/29/elon-musk-says-xai-acquired-x/
84 Bridget Boakye et al. “State of Compute Access: How to Bridge the New Digital Divide.” Tony Blair Institute. 7 December 2023. https://institute.global/insights/tech-and-digitalisation/state-of-compute-access-how-to-bridge-the-new-digital-divide
85 Alistair Barr. “The tech world is being divided into ‘GPU rich’ and ‘GPU poor.’ Here are the companies in each group.” Business Insider Nederland. 28 August 2023. https://www.businessinsider.nl/the-tech-world-is-being-dividing-into-gpu-rich-and-gpu-poor-here-are-the-companies-in-each-group/
86 Tamay Besiroglu et al. “The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?” arXiv:2401.02452. 8 January 2024. https://doi.org/10.48550/arXiv.2401.02452
87 Vili Lehdonvirta, Bóxī Wú and Zoe Hawkins. “Compute North vs. Compute South: The Uneven Possibilities of Compute-based AI Governance Around the Globe.” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 2024, pp. 828–838. https://doi.org/10.1609/aies.v7i1.31683

now typically rely on corporate infrastructure. Public investments in compute are also constrained by fiscal, infrastructural and ecological limitations.

A global mapping of public compute initiatives, conducted by the Ada Lovelace Institute, shows that most rely on hybrid provisioning and funding models. They depend on private sector resources and expertise, while aiming to ensure public oversight of commercial compute.88

Cecilia Rikap argues that the power exerted by the largest AI companies goes beyond conventional forms of market concentration, as it is not limited to ownership of technology.89 These companies also employ a range of strategies – including leveraging venture capital to secure preferential access to knowledge and capabilities, entrenching their positions through cloud services and vertical integration, capturing AI talent and influencing research agendas – to consolidate control and privatize the value generated by others.

These concentrations of power pose significant challenges for developing public AI infrastructures, which face dependencies on monopolistic or oligopolistic actors across multiple layers of the stack. Two potential strategies have emerged in response. One seeks independence at the compute layer through massive investments in data centers and computing power on a scale comparable to commercial expenditures. However, this still entails reliance on Nvidia’s GPUs due to the company’s entrenched, monopolistic position. The other strategy accepts dependencies at the compute layer and instead focuses on independence at the model layer and in the development of applications built on top of public models. These pathways to public AI are explored in greater detail in Chapter 5.

Characteristics of key layers of the stack

In the following sections, we outline in more detail the characteristics of key resources and infrastructures at the three core layers of the AI stack: compute, data and models. These characteristics must be considered when designing public AI policies.

Compute

In the absence of a universally accepted definition, “compute” can refer both to a performance metric – measured in terms of calculations or floating-point operations per second (FLOPs) – and to the physical hardware that performs these calculations,90 namely semiconductors. The UK government defines compute as “computer systems where processing power, memory, data storage and network are assembled at scale to tackle computational tasks beyond the capabilities of everyday computers.”91

A 2018 analysis by OpenAI showed that computing power used for AI training had increased 300,000 times since 2012, the beginning of the “deep learning era.” A follow-up study by Epoch.ai, which analyzed 120 training runs of machine learning systems, found a fourfold annual growth rate in recent years – making it one of the fastest technological expansions in decades. Overall, training compute has grown by a staggering factor of 10 billion since 2010.92

Further scaling, however, faces four key constraints: energy consumption, chip manufacturing, data availability and speed limits inherent to AI training.93

To better understand what compute entails – and where bottlenecks arise in the development of generative AI – it is helpful to break this layer into three key components: advanced chips (primarily GPUs) and the two additional elements needed to make them usable at scale – specialized software that enables efficient use of those chips, and data centers where GPUs are networked into large-scale compute systems.

Advanced chips

Chips, or semiconductors, are arguably among the most important technological hardware in use today. They underpin all digital technologies and serve as the backbone of most economic activities. The enormous computational demands of training and deploying AI models – often involving trillions of calculations – depend on modern chips’ ability to coordinate the work of billions of transistors etched into each unit.

There are two distinct categories of chips, each with its own supply chains, production requirements and strategic dependencies:

• Memory chips store and enable access to data, which is essential for high-performance AI workloads and a prerequisite for any computational task.

• Logic (or processing) chips – including CPUs, GPUs and specialized chips like TPUs – carry out computations.

The rapid advancements in AI have been driven largely by improvements in specialized logic chips. As traditional CPUs proved too slow for AI training, GPUs – originally developed for graphics rendering – have been repurposed and optimized for this purpose.94

Software frameworks to run chips

Recognizing the role of specialized software is essential – it acts as the bridge between hardware and infrastructure and helps explain much of the current concentration of power in the generative AI ecosystem.

Effective use of GPUs for training and deploying AI models requires specialized software frameworks and tools. Two well-known examples are Nvidia’s Compute Unified Device Architecture (CUDA) and AMD’s Radeon Open Compute (ROCm). CUDA, in particular, has become the industry standard for GPU-accelerated computing and quickly gained a first-mover advantage. Introduced in 2007, it enables developers to harness the parallel processing power of GPUs for general-purpose tasks critical to AI training and deployment.95

CUDA’s importance became especially clear with the development of AlexNet in 2012, when the framework enabled the training of the neural network and reduced computation time from weeks to hours.96 Today, leading deep learning frameworks like PyTorch and TensorFlow – open sourced by Meta and Google, respectively – are deeply integrated with CUDA, making it difficult for competing platforms to gain traction. While media coverage often highlights Nvidia’s GPUs, CUDA represents an equally powerful competitive “moat” for the company.

In contrast, ROCm, developed by AMD, is an open source alternative designed to offer similar functionality. Although ROCm supports various programming models and provides an open platform, it has struggled to match CUDA’s widespread adoption. Nvidia’s extensive investments in its developer ecosystem have helped cement CUDA as the de facto standard

88 Matt Davies and Jai Vipra. “Mapping global approaches to public compute.” Ada Lovelace Institute. 4 November 2024. https://www.adalovelaceinstitute.org/policy-briefing/global-public-compute/
89 Cecilia Rikap. “Dynamics of Corporate Governance Beyond Ownership in AI.” Common Wealth. 15 May 2024. https://www.common-wealth.org/publications/dynamics-of-corporate-governance-beyond-ownership-in-ai
90 Amlan Mohanty. “Compute for India: A Measured Approach.” Carnegie Endowment for International Peace. 17 May 2024. https://carnegieendowment.org/posts/2024/05/compute-for-india-a-measured-approach?lang=en
91 Department of Science, Innovation and Technology. “Independent Review of The Future of Compute: Final Report and Recommendations.” GOV.UK. https://www.gov.uk/government/publications/future-of-compute-review/the-future-of-compute-report-of-the-review-of-independent-panel-of-experts Accessed 23 April 2025.
92 Jaime Sevilla et al. “Compute Trends Across Three Eras of Machine Learning.” Epoch AI. 16 February 2022. https://epoch.ai/blog/compute-trends
93 Jaime Sevilla. “Can AI Scaling Continue Through 2030?” Epoch AI. 20 August 2024. https://epoch.ai/blog/can-ai-scaling-continue-through-2030
94 More recently, other custom AI accelerators such as Google’s tensor processing units (TPUs) or Amazon’s Trainium chips have been developed or are in development. However, as the dominant logic chips in AI are GPUs primarily supplied by Nvidia, other accelerators are not the focus of this paper.
95 Fatima Hameed Khan et al. “Advancements in Microprocessor Architecture for Ubiquitous AI—An Overview on History, Evolution, and Upcoming Challenges in AI Implementation.” Micromachines, 12(6), 665, 2021. https://doi.org/10.3390/mi12060665
96 Alex Krizhevsky et al. “ImageNet Classification with Deep Convolutional Neural Networks.” NeurIPS Proceedings. https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
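The training-compute figures discussed above can be made concrete with a back-of-envelope sketch. It uses the widely cited 6 × N × D rule of thumb for the FLOPs needed to train a dense transformer (N parameters, D training tokens); the model size, token count and per-GPU throughput below are illustrative assumptions, not numbers from this paper:

```python
# Back-of-envelope: converting a hypothetical training run into GPU-days,
# using the common 6*N*D estimate for dense transformer training FLOPs.
# All concrete numbers are illustrative assumptions.

SECONDS_PER_DAY = 86_400

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total FLOPs to train a dense transformer."""
    return 6 * n_params * n_tokens

def gpu_days(total_flops: float, peak_flops_per_gpu: float, utilization: float) -> float:
    """GPU-days needed at a given peak throughput and sustained utilization."""
    return total_flops / (peak_flops_per_gpu * utilization) / SECONDS_PER_DAY

flops = training_flops(70e9, 15e12)  # hypothetical 70B parameters, 15T tokens
days = gpu_days(flops, 1e15, 0.4)    # ~1 PFLOP/s peak per GPU, 40% utilization
print(f"{flops:.1e} FLOPs -> {days:,.0f} GPU-days")
```

Under these assumptions the run costs roughly 180,000 GPU-days: even spread across ten thousand accelerators it would occupy an entire cluster for weeks, which illustrates why frontier training remains concentrated among a few well-resourced actors.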

preferred by most AI researchers and companies.97 The popularity of this proprietary framework creates a major dependency for AI development.

Data centers

Data centers are complex facilities that provide computing, storage and network infrastructures. Thanks to data centers, GPUs become integrated into a computing resource that is usable at scale – not only the GPUs themselves but also networking interfaces and other hardware elements. The storage infrastructure holds the vast datasets required for AI training, while the network infrastructure connects system components within the data center, transfers data between compute nodes, and links the facility to end users and global cloud networks.98

The cost of building and operating data centers is immense, encompassing hardware, energy and cooling systems, network infrastructure and personnel. State-of-the-art data centers are powered by tens of thousands of high-end GPUs, primarily supplied by Nvidia. For example, the AI training cluster built by xAI in 2024 to train the large language model Grok – one of the largest of its kind – began with 100,000 Nvidia H100 GPUs and expanded to 200,000 by the end of the year.99 The estimated cost of such an investment is several billion dollars, with the company reportedly raising $6 billion to support it.100

Data

Data is a critical resource for AI systems. The development of modern AI became possible only when vast amounts of data became available in digital form on the internet. In this context, the term data is used as a catch-all phrase encompassing all types of information, content and data sources. Rules and norms around access and use have shifted in recent years, as AI training emerged as a disruptive new use of data at massive scale. This has reopened long-standing debates about copyright and sparked new concerns about fair data use and the risk of exploitation by a small group of dominant commercial AI companies.

Under the current AI scaling paradigm, building more capable models requires ever-larger datasets. Initially, the development process appeared open-ended, as developers tapped into as much accessible data as possible. Today, however, data is increasingly viewed as a finite resource. Researchers from Epoch.ai predict that a “peak human data” moment may occur between 2026 and 2032, when further gains within the current paradigm may no longer be possible due to data scarcity.101 Research conducted by the Data Provenance Initiative also shows that, in response to data use for AI development, various actors are taking steps to reduce the availability of content that they publish on the web.102

The development of AI systems faces a paradox when it comes to data: it is at once abundant and scarce. On one hand, the fact that the entire web’s content became a foundational resource for training all of the dominant commercial generative models, whose creators accessed it for free, proves the abundance of data. On the other hand, much of the available data remains proprietary. For the largest commercial AI enterprises, access to restricted, proprietary data – often amassed through consumer-facing digital

97 Serhii Nakonechnij. “ROCm vs CUDA Practical Comparison.” Scimus. 12 August 2024. https://thescimus.com/blog/rocm-vs-cuda-a-practical-comparison-for-ai-developers/
98 Aadya Gupta and Adarsh Ranjan. “A primer on compute.” Carnegie Endowment for International Peace. 30 April 2024. https://carnegieendowment.org/posts/2024/04/a-primer-on-compute?lang=en
99 Mark Mantel. “xAI Has Apparently Completed the World’s Fastest Supercomputer.” Heise Online. 4 September 2024. https://www.heise.de/en/news/xAI-has-apparently-completed-the-world-s-fastest-supercomputer-9857540.html
100 Wayne Williams. “Elon Musk raises USD 6 billion for xAI’s Memphis data center; will purchase 100,000 NVIDIA chips to boost Tesla’s full self-driving FSD capabilities.” 28 November 2024. https://www.techradar.com/pro/elon-musk-raises-usd6-billion-for-xais-memphis-data-center-will-purchase-100-000-nvidia-chips-to-boost-teslas-full-self-driving-fsd-capabilities Accessed 3 April 2025.
101 Jaime Sevilla et al. “Can AI scaling continue through 2030?” Epoch AI. 20 August 2024. https://epoch.ai/blog/can-ai-scaling-continue-through-2030
102 Shayne Longpre et al. “Consent in Crisis: The Rapid Decline of the AI Data Commons.” arXiv:2407.14933. 24 July 2024. https://doi.org/10.48550/arXiv.2407.14933
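The capital costs of GPU clusters of the kind described above can be approximated with a simple estimate. The per-GPU price and the facility-overhead multiplier are illustrative assumptions (H100-class accelerators are commonly reported at tens of thousands of dollars each); neither figure comes from this paper:

```python
# Rough capital-cost estimate for a large GPU training cluster: the
# accelerators themselves, plus a multiplier for networking, power and
# cooling infrastructure. All inputs are assumed, illustrative values.

def cluster_capex(n_gpus: int, usd_per_gpu: float, facility_overhead: float) -> float:
    """Total build-out cost in USD."""
    return n_gpus * usd_per_gpu * facility_overhead

# 100,000 GPUs at an assumed $25,000 each, with a 1.5x facility multiplier:
capex = cluster_capex(100_000, 25_000, 1.5)
print(f"${capex / 1e9:.2f} billion")  # $3.75 billion
```

Even with conservative assumptions, a 100,000-GPU cluster lands in the “several billion dollars” range cited for the xAI build-out.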

platforms controlled by the same companies – serves as a key competitive advantage.103 For others, the lack of access to this high-quality data presents a significant competitive disadvantage.

Data sources for AI training

When discussing data and datasets in the context of AI training, it is important to recognize that data is not a homogeneous concept. Generative AI models rely on diverse data sources for training, which can be categorized by accessibility, licensing, structure and sensitivity.

Accessibility is the most important category, and distinctions should be made between private (proprietary), public and openly shared data. Private data typically includes user-generated content collected by companies that own dominant online platforms and are now building generative AI models (e.g., Meta, Google, Microsoft, AWS). Incumbent AI companies like OpenAI and Anthropic can also use data generated by their own chatbots. Public data includes content from the open internet, either scraped directly by AI companies or aggregated into datasets such as Common Crawl and its derivatives. Finally, openly licensed data – such as that from Wikimedia – is valued by developers for both its quality and the legal certainty it offers for training use.

Data comes in many forms, each governed by different legal frameworks and requiring tailored governance. The use of data for AI training often raises copyright issues,104 leading to a growing number of high-profile infringement lawsuits brought by creators, publishers and rights holders against leading AI firms. These legal challenges question the extent to which copyrighted works can be used for AI training without explicit permission or licensing, especially under the fair use doctrine.

There is also growing evidence that some AI labs have used data without permission, possibly in violation of the law – as demonstrated by multiple court cases involving major AI companies. For example, the Books3 dataset, which included 183,000 books sourced from pirate websites, was used to train early-generation models released in 2022.105 While its use was eventually discontinued under pressure from rights holders, Meta was reported to have trained models on LibGen – a similar pirate repository – as late as 2024.106 In 2025, Anna’s Archive, an aggregator of pirated books and research articles, announced it had granted AI companies, including DeepSeek,107 access to its database. These examples show that AI training often operates in a legal gray area – and sometimes outside the boundaries of the law – when it comes to the use of data.

Several copyright frameworks have been introduced to regulate generative AI training, most notably the European Union’s exception for text and data mining, which includes opt-out provisions for commercial training under the 2019 Digital Single Market Directive. However, there remains a lack of clarity around their interpretation and enforcement, and these frameworks are increasingly contested, as shown by ongoing policy debates about opt-outs in the United Kingdom.108

Other types of training data may consist of personal data (subject to personal data protection laws) and

103 Cade Metz et al. “How Tech Giants Cut Corners to Harvest Data for A.I.” New York Times. 8 April 2024. https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html
104 Daniel J. Gervais. “The Heart of the Matter: Copyright, AI Training, and LLMs.” SSRN. 1 November 2024. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4963711 Accessed 3 April 2025; Matthew Sag. “Fairness and Fair Use in Generative AI.” SSRN. 20 December 2023. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4654875 Accessed 3 April 2025.
105 Alex Reisner. “These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech.” The Atlantic. 25 September 2023. https://www.theatlantic.com/technology/archive/2023/09/books3-database-generative-ai-training-copyright-infringement/675363/
106 Alex Reisner. “The Unbelievable Scale of AI’s Pirated-Books Problem.” The Atlantic. 20 March 2025. https://www.theatlantic.com/technology/archive/2025/03/libgen-meta-openai/682093/
107 Anna’s Archive. “Copyright Reform Is Necessary for National Security.” Anna’s Archive (blog). 31 January 2025. https://annas-archive.li/blog/ai-copyright.html Accessed 3 April 2025.
108 Joseph Bambridge and Dan Bloom. “UK Plans Fresh Round of Talks to Take Sting out of AI Copyright Proposals.” POLITICO. 3 April 2025. https://www.politico.eu/article/uk-plans-fresh-round-talks-lawmakers-ai-copyright-proposals/
non-personal data – industrial, anonymized, statistical and administrative datasets. While general principles can be applied to all types of data, there is no “one size fits all” approach to data sharing. Copyright and privacy or personal data rights are the two most important factors determining how data can be shared and accessed – and how “open” it can be.

The ongoing public debates about AI training datasets often focus exclusively on pretraining data. Copyright and data governance laws attempt to strike a balance that allows publicly available internet data to be reused in training datasets without enabling its exploitation. However, there is a tension between the view – common among AI developers – that data is a raw resource to be used freely, and the perspectives of those who create and steward that data.

It is also important to note that datasets used in post-training stages play an increasingly significant role. For example, methods based on RLHF require data on human preferences, typically sets of generative AI prompts and responses. New types of reasoning models rely on datasets designed to help models follow instructions, solve problems or evaluate results.109 As this type of training becomes increasingly important, public discussions will also need to address the provision and governance of these datasets, which are often constructed differently and do not raise the same legal concerns.110 First, fine-tuning datasets are usually domain- or application-specific and require governance tailored to their context. Many key domains for generative AI – such as health and finance – depend on sensitive data. Second, benchmarks for evaluating model capabilities are key tools that – unlike pretraining data – can often be developed and shared as digital public goods. Since standardized benchmarks guide generative AI development, they benefit from open access and sound governance – though in some cases, particularly for security-related benchmarks, openness may be limited.111

Synthetic data

Synthetic data, generated by generative AI models, is a distinct category of data increasingly used in AI development, and it presents its own governance challenges. Because it is synthetically produced, it can be created quickly and cheaply. It is not subject to intellectual property restrictions and typically avoids privacy and other rights-related concerns. For instance, synthetic health data can be used in place of real patient data to protect privacy. In general, synthetic data can be used to reflect real-world patterns while helping to safeguard privacy or reduce bias. Some models – such as Microsoft’s Phi family of small language models – have been trained entirely on synthetic data.112 Recently, a new model development paradigm, called model distillation, has adopted an approach similar to training with synthetic data. In this paradigm, a “teacher” generative AI model produces outputs that a “student” model is then trained to replicate, bypassing the need for access to the original pretraining data.

Some researchers are optimistic about the potential of synthetic data, particularly for protecting personal data during AI training.113 Others warn of associated risks, most notably model collapse – a hypothesized decline in performance when models are trained on synthetic rather than real data.114 Use of synthetic data remains contested, and the validity of the “model

109 Nathan Lambert. “The State of Post-Training in 2025.” Interconnects. 12 March 2025. https://www.interconnects.ai/p/the-state-of-post-training-2025
110 Nathan Lambert. “Why reasoning models will generalize.” Interconnects. 28 January 2025. https://www.interconnects.ai/p/why-reasoning-models-will-generalize Accessed 3 April 2025.
111 Peter Mattson et al. “Perspective: Unlocking ML requires an ecosystem approach.” MLCommons. 10 March 2023. https://mlcommons.org/2023/03/unlocking-ml-requires-an-ecosystem-approach/
112 Microsoft. “Phi open model family.” https://azure.microsoft.com/en-us/products/phi/ Accessed 23 April 2025.
113 Philippe De Wilde et al. “Recommendations on the Use of Synthetic Data to Train AI Models.” Tokyo: United Nations University, 2024. https://collections.unu.edu/eserv/UNU:9480/Use-of-Synthetic-Data-to-Train-AI-Models.pdf
114 Ilia Shumailov et al. “AI Models Collapse When Trained on Recursively Generated Data.” Nature, vol. 631, no. 8022, July 2024, pp. 755–59. https://doi.org/10.1038/s41586-024-07566-y; University of Oxford. “New Research Warns of Potential ‘Collapse’ of Machine Learning Models.” Department of Computer Science. 25 July 2024. https://www.cs.ox.ac.uk/news/2356-full.html

collapse” hypothesis has been questioned by other researchers.115 It is unclear whether the growing presence of generative AI content on the web will also have an impact on pretraining AI models with web content. Most probably, synthetic data can help with some challenges (such as responsible AI training) but will not overcome others, such as the lack of sufficiently varied and complex data needed to train more capable models.

Models

Generative AI models are statistical models trained to process a given type of input into a given type of output. At their core, models consist of:

• their architecture (e.g., neural networks, transformers or diffusion networks), and

• their parameters (i.e., the weights and biases that have been optimized through training).

In the field of generative AI, models are primarily neural networks built on the transformer architecture or its variants.116

In the following section, we provide an overview of different types of models and their development pathways. This section explains the distinctions between foundation models, LLMs and small models, while detailing various methods of model derivation. The ability to derive new models from existing ones, provided that the latter are shared openly, fosters an ecosystem of collaboration and resource sharing.

There is currently no consistent typology for categorizing generative AI models. Terms such as foundation models, general-purpose AI, large language models and small language models are often used to describe different types of models, typically differentiated by their parameter size.

Foundation models

As defined by Stanford University’s Center for Research on Foundation Models, foundation models “are trained on broad data at scale and are adaptable to a wide range of downstream tasks.”117 These models exemplify AI development in the transformer paradigm and require vast compute and data resources. They have general capabilities that allow them to serve as the basis for developing more specialized models. Because of their adaptability, they function as general-purpose technologies with infrastructural characteristics.118 The legal concept of general-purpose AI, introduced in the European Union’s AI Act, is based on this idea, and the term LLM typically describes the same type of model. While foundation models are not necessarily multimodal, an increasing number of them are multimodal in their capabilities. For example, OpenAI’s GPT-4 is a foundation model that can process both text and images.

Small models

Small models, including small language models and small vision models, are compact, efficient alternatives to LLMs that are designed to balance performance with resource requirements.119 Their size typically ranges from a few million to a few billion parameters, in contrast to larger models, which may contain hundreds of billions. The term small model generally does not apply to task-specific models like BERT, which perform individual tasks such as summarization or categorization, even though such models still account for a large share of industrial AI applications. Like their larger counterparts, small models are usually based on transformer architectures and can be trained from scratch or derived from foundation models using techniques such as distillation, pruning and quantization. Small models offer advantages in efficiency, accessibility and

115 Rylan Schaeffer, et al. “Position: Model Collapse Does Not Mean What You Think.” arXiv:2503.03150, arXiv, 18 Mar. 2025. https://doi.org/10.48550/arXiv.2503.03150
116 Dan Hendrycks, et al. “Measuring Massive Multitask Language Understanding.” arXiv:2009.03300, arXiv, 12 Jan. 2021. https://doi.org/10.48550/arXiv.2009.03300
117 Rishi Bommasani, et al. ibid.
118 Alison Gopnik. “What AI Still Doesn’t Know How to Do.” The Wall Street Journal. 15 July 2022. https://www.wsj.com/tech/ai/what-ai-still-doesnt-know-how-to-do-11657891316
119 https://medium.com/@nageshmashette32/small-language-models-slms-305597c9edf2


customization, making them ideal for deployment on resource-constrained devices, edge computing environments and domain-specific applications. They require less computational power and memory, enabling wider adoption by a range of stakeholders while still maintaining strong performance in targeted use cases.

Open models

Open models are AI models whose architecture and trained parameters (i.e., weights and biases) are released under open source licenses.120 Since EleutherAI’s release of GPT-Neo121 as an open alternative to OpenAI’s GPT in 2021, it has become increasingly common for AI researchers and developers to release open models. For example, in 2023, 66% of foundation models were released as open models, and more than 1.5 million models are hosted on the Hugging Face Hub.122

However, there is currently no standard approach to open releases of AI models, and many so-called open models come with significant limitations – such as withholding training data, using restrictive licenses or prohibiting commercial use – compared to the norms established in open source software development. Often, references to “open source models” are viewed as attempts at open-washing, diluting traditional open source standards in the context of generative AI.123

In recent years, efforts have been made to more precisely define what constitutes an open model. This is a necessary step toward creating standardized methods for governing and sharing AI models. Frameworks such as the Model Openness Framework124 by the Generative AI Commons and the Framework for Openness in Foundation Models125 by the Mozilla Foundation list up to 16 components that extend beyond model architecture and parameters. These include code components (for training, evaluation and inference), data components (for training, post-training and evaluation), and documentation (such as model cards and dataset cards).

A handful of research labs and nonprofit initiatives – such as the Barcelona Supercomputing Center, the Allen Institute for AI and EleutherAI – aim to set a higher bar for open source AI by releasing all model components openly, including trained parameters, code, data and documentation, all under free and open source licenses.

While most leading AI companies keep their models closed and accessible only via commercial APIs, Meta has adopted a model openness strategy. However, its models fall short of commonly accepted open source standards due to limitations imposed by its custom licenses.126 Other AI companies, including Mistral, DeepSeek and Cohere, have also released open models. In recent months, DeepSeek has emerged as the strongest example of an AI lab that combines commercial goals with an open source mission.

Open models enable model derivation – the creation of smaller and more specialized models that offer benefits beyond those associated with scale. This was demonstrated by DeepSeek, whose researchers distilled the reasoning capabilities of the DeepSeek-R1 model (671 billion parameters) into smaller models ranging from 1.5 billion to 70 billion parameters. Within a week of R1’s release on the Hugging Face platform, more than 500 derivative

120 Matt White, et al. “The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency, and Usability in Artificial Intelligence.” arXiv:2403.13784, arXiv, 18 Oct. 2024. https://doi.org/10.48550/arXiv.2403.13784
121 Sid Black, et al. “GPT-NeoX-20B: An Open-Source Autoregressive Language Model.” arXiv:2204.06745, arXiv, 14 Apr. 2022. https://doi.org/10.48550/arXiv.2204.06745
122 Yolanda Gil and Raymond Perrault. “Artificial Intelligence Index Report 2025.” Stanford University Human-Centered Artificial Intelligence. 7 April 2025. https://hai-production.s3.amazonaws.com/files/hai_ai_index_report_2025.pdf
123 David Gray Widder, et al. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI.” SSRN 4543807, 17 Aug. 2023. https://doi.org/10.2139/ssrn.4543807
124 Matt White, et al. ibid.
125 Adrien Basdevant, et al. ibid.
126 Stefano Maffuli. “Meta’s LLaMa License Is Not Open Source.” Open Source Initiative, 20 July 2023. https://opensource.org/blog/metas-llama-2-license-is-not-open-source


versions had been shared, most of them quantized.127


Among these derivative models, one worth noting is an open source derivative created by Perplexity.ai, which “un-censored” the original Chinese model.128
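Quantization – the technique behind most of these derivative releases – reduces the precision used to store a model’s weights. The sketch below is purely illustrative (real toolchains use more elaborate schemes, such as per-channel scales or 4-bit formats): each 32-bit float weight is stored as an 8-bit integer plus one shared scale factor, cutting memory roughly fourfold at the cost of a small, bounded rounding error.

```python
import random

# Illustrative sketch of post-training quantization, the technique used for
# most of the derivative versions mentioned above. This is a toy example,
# not how production toolchains implement it.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # stand-in "weights"

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(all(-127 <= q <= 127 for q in quantized))  # -> True
print(max_error <= scale / 2 + 1e-12)  # rounding error is at most half a step -> True
```

Distillation, by contrast, trains a new, smaller model to imitate a larger model’s outputs rather than compressing the original weights, and pruning removes parameters that contribute little to the result.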

It is useful to view the various AI development proj-


ects as part of a broader ecosystem rooted in the
sharing of knowledge, components or entire models.
Elements of this collaborative approach can be found
across the AI landscape, including in major com-
mercial labs. At the same time, a more narrowly de-
fined ecosystem has emerged, composed of teams
committed to open source AI and aligned approach-
es to trustworthy or responsible AI.129 Distributed de-
velopment efforts and AI collaboratives like the Big
Science project are among the strongest examples
of ecosystem building.130 A mature AI development
ecosystem also includes interoperability and da-
ta-sharing standards, benchmarks, best practices for
responsible AI development and governance frame-
works.131
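The distinction drawn earlier in this chapter between a model’s architecture and its parameters can be made concrete with a toy sketch. The example below is purely illustrative (a single linear layer with a softmax, nothing like a real generative model): the function is the architecture, while the dictionary of numbers stands in for the parameters that training would optimize – and that open-weight releases share.

```python
import math
import random

# Toy illustration of the architecture/parameters split described earlier in
# this chapter. The function below is the architecture (a fixed computation);
# the dictionary of numbers is the parameters. Real generative models differ
# in scale and structure: transformers with billions of trained parameters.

def forward(x, params):
    """The architecture: map an input vector to output probabilities."""
    logits = [
        sum(xi * w for xi, w in zip(x, row)) + b
        for row, b in zip(params["weights"], params["bias"])
    ]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]

# The parameters: in a trained model these values encode everything it has
# learned. Here they are just random placeholders with arbitrary shapes.
random.seed(0)
params = {
    "weights": [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)],
    "bias": [0.0, 0.0, 0.0],
}

probs = forward([1.0, 1.0, 1.0, 1.0], params)
print(len(probs), round(sum(probs), 6))  # -> 3 1.0
```

Sharing only the parameter file is enough for anyone who also has the architecture code to reload and adapt the model, which is why open models are typically distributed as weight files alongside their architecture and inference code.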

127 Florent Daudens. “Yes, DeepSeek R1‘s Release Is Impressive.


But the Real Story Is What….” LinkedIn. https://www.
linkedin.com/posts/fdaudens_yes-deepseek-r1s-release-is-
impressive-activity-7289681233427484672-I9MY Accessed 24
Apr. 2025.
128 AI Team. “Open Sourcing R1 1776.” Perplexity (blog). 18
February 2025. https://www.perplexity.ai/hub/blog/open-
sourcing-r1-1776
129 Mark Surman. “Introducing Mozilla.ai: Investing in trustworthy
AI.” Mozilla (blog). 22 March 2023. https://blog.mozilla.org/en/
mozilla/introducing-mozilla-ai-investing-in-trustworthy-ai/
130 Jennifer Ding, et al. “Towards Openness Beyond Open
Access: User Journeys through 3 Open AI Collaboratives.”
arXiv:2301.08488, arXiv, 20 Jan. 2023. arXiv.org, https://doi.
org/10.48550/arXiv.2301.08488
131 Peter Mattson, et al. ibid.

4 | The public AI framework

In this section, we propose a definition of public AI that builds on the broader concept of a public digital infrastructure. We also draw on three previous papers that define public AI and outline policies that can support its development.

A standard for the publicness of digital infrastructure must go beyond vague notions of the public interest and instead rely on a clear understanding of how public value is created. The definition of a public digital infrastructure achieves this by identifying three key characteristics: attributes, functions and forms of control.

By combining the AI stack framework described in the previous section with this definition of public digital infrastructure, we propose an approach that defines public AI by considering how public attributes, public functions and public control can apply to the different layers of the AI stack. In addition, we offer a gradient of public AI releases that accounts for the fact that public AI can have dependencies on non-public resources, especially at the chip and compute layers.

The concept of public digital infrastructure

Public digital infrastructures are digital infrastructures designed to maximize public value by combining public attributes with public functions and various forms of public control.132 The concept is related to that of digital public infrastructure (DPI), but focuses on the provision of alternatives to key digital platforms and communication services.133 These infrastructures are built and governed with the goal of advancing the public interest and maximizing public value. They stand in contrast to extractive solutions that concentrate power in the hands of a few at the expense of the broader population.

This definition, proposed by Open Future and building on previous work by researchers from the UCL Institute for Innovation and Public Purpose (IIPP), aims to more precisely define the public nature of such infrastructures by describing how they can generate public value. The goal of this “complex unpacking of what ‘public’ means [...] is to shift the focus of the debate from the technical aspects of infrastructure (i.e., making things digital) to its social relevance (i.e., making things public).”134

The IIPP report focuses on the first two characteristics of public digital infrastructure. Public attributes refer to the accessibility, openness or interoperability of infrastructure. These features aim to ensure universal and unrestricted access, often through open licensing or interoperability mechanisms.

Public functions of infrastructure mean that the infrastructure contributes to public goals, rather than

132 Jan Krewer and Zuzanna Warso. “Digital Commons as Providers of Public Digital Infrastructures.” Open Future, 13 November 2024. https://openfuture.eu/publication/digital-commons-as-providers-of-public-digital-infrastructures
133 For an explanation of the two concepts, see: Jan Krewer. “Signs of Progress: Digital Public Infrastructure Is Gaining Traction.” Open Future, 13 March 2024. https://openfuture.eu/blog/signs-of-progress-digital-public-infrastructure-is-gaining-traction
134 Zuzanna Warso. “Toward Public Digital Infrastructure: From Hype to Public Value.” AI Now. 15 October 2024. https://ainowinstitute.org/publication/xii-toward-public-digital-infrastructure-from-hype-to-public-value
providers-of-public-digital-infrastructures infrastructure-from-hype-to-public-value


Table 1 | Defining publicness: Attributes, functions and control of public infrastructures.

Public interest (“For the public”):
• Public attributes: Infrastructure is publicly accessible, open or interoperable.
• Public functions: Infrastructure contributes to attaining public interest goals and creating public goods.

Public control (“Of and by the public”):
• Public control: Infrastructure is governed or overseen by the public.
• Public funding: Infrastructure is funded by the public.
• Public production: Infrastructure is produced by the public.

Source: Own table. Adapted from: https://openfuture.eu/wp-content/uploads/2024/11/241113_Digital-Commons-as-Providers-of-Public-Digital-Infrastructures.pdf

merely serving as an alternative provider of market-based goods or services. These goals can include enabling civic participation, fostering community and social relationships, stimulating economic activity, improving quality of life or securing essential capabilities. Public infrastructure often creates public goods – resources with social rather than purely market value. Brett Frischmann cites research as an example of such a public good.135 Public functions of infrastructure often entail filling supply gaps left by market actors.

The underlying concept of the common good, as framed by Mariana Mazzucato, involves both the pursuit of shared objectives and care for shared processes and relationships. It is a perspective that emphasizes the importance of governance in the process of generating social value and positions the state as both a public entrepreneur and a market shaper. A common good perspective underscores the state’s role in setting direction and coordinating collective action. Through effective governance, states can ensure co-creation and participation, promote collective learning, and secure access, transparency and accountability – all of which are essential to advancing the public interest in digital infrastructure.136

Neither public attributes nor public functions alone are sufficient to define publicness. A focus on attributes can be agnostic with regard to how infrastructure is used and to the outcomes of such uses. For example, open source AI solutions can be used in ways that pursue private rather than public goals. Conversely, a functional focus can overlook accessibility. In other words, public interest goals can also be achieved through closed, private infrastructures.

Public digital infrastructure also needs to meet the criterion of public control.137 This can take various forms, including public oversight, public funding or even public production and provision of infrastructure. Such infrastructure need not be state-owned or produced by public institutions. What matters is the presence of public control, which can also be understood as governance for the common good. This is the minimum necessary condition for digital infrastructure to meet the standard of publicness.

Further on, these three characteristics of public digital infrastructure will be applied to the AI stack to provide an overall definition of public AI. Public control is the most complex characteristic, where instead of a binary choice there are multiple approaches that entail forms of public production, funding or control of infrastructures. Public actors

135 Brett M. Frischmann. “Infrastructure: The Social Value of Shared Resources.” Oxford Academic, 24 May 2012. https://doi.org/10.1093/acprof:oso/9780199895656.001.0001
136 Mariana Mazzucato. “Governing the Economics of the Common Good: From Correcting Market Failures to Shaping Collective Goals.” Journal of Economic Policy Reform, vol. 27, no. 1, Jan. 2024, pp. 1–24. https://doi.org/10.1080/17487870.2023.2280969
137 Open Future’s report on Public Digital Infrastructures describes this characteristic in terms of public ownership. In this report, we rephrase this as public control, which also encompasses forms of public ownership but is not limited to them. See: Jan Krewer and Zuzanna Warso, ibid.


do not necessarily need to fully produce or own such infrastructure – what matters is their ability to orchestrate other actors in support of public digital infrastructures that meet the remaining characteristics.

Public, private and civic actors in public digital infrastructure

Ownership of public digital infrastructures – and the respective roles of public, private and civic actors – is a key issue. In the context of generative AI, this is largely a question of who owns or controls computing power, the key dependency for building public AI. Proper forms of public control can ensure sustainability, while some forms of private control risk creating a situation in which rewards are privatized and risks are socialized.

Governments and public institutions must therefore play a central role in the development and governance of public digital infrastructure, including the public AI stack. As noted by the World Bank, governments should have “a primary role and responsibility in deciding whether and how digital public infrastructure is provided in the interests of the broader society and economy.”138 Deployment of such infrastructures is therefore a collective effort involving various actors, but it is the state that plays a key role in orchestrating collective action and ensuring proper outcomes.

This view of government as an orchestrator of outcomes and public value goes beyond the traditional public/private ownership divide. The idea of orchestrating the actions of various actors entails “government direction, centrally defined public purpose, and large-scale planning [to be] combined – in still-emergent ways – with market mechanisms, private actors and public input.”139 In doing so, the state not only guides collective action but also protects public infrastructure from being co-opted for private gain. At the same time, strategies used by commercial actors to gain control over the AI stack can be repurposed in service of the mission-driven approach that should characterize public AI policies.

This understanding of the state’s role calls for a shift from viewing government as a passive actor or a mere fixer of market failures to recognizing it as an orchestrator capable of coordinating diverse contributors. The deployment of PDIs should be guided by mission-oriented strategies and a market-shaping approach to policy.140 Generating public value through digital infrastructures is not merely meant to fix the market or fill market gaps – it is a goal in itself. Importantly, public value can be created by various actors, including those in the private sector. The state’s ability to steer this co-creation process is more important than its direct production capacity.

Proposals for public AI

Over the past year, several organizations – including the Public AI Network, Mozilla and the Vanderbilt Policy Accelerator – have introduced frameworks for a public AI agenda. Each proposal outlines a set of conditions intended to maximize public value and safeguard the common good through the development and deployment of AI. In our analysis of these proposals, we identify a set of shared characteristics that define public AI.

Public AI Network

The Public AI Network’s policy paper adopts a framing that aligns closely with Mariana Mazzucato’s

138 Vyjayanti T. Desai, et al. “How Digital Public Infrastructure Supports Empowerment, Inclusion, and Resilience.” World Bank Blogs, 15 March 2023. https://blogs.worldbank.org/en/digital-development/how-digital-public-infrastructure-supports-empowerment-inclusion-and-resilience
139 Stephen J. Collier, James Christopher Mizes, and Antina von Schnitzler. “Preface: Public Infrastructures / Infrastructural Publics.” Limn. https://limn.it/articles/preface-public-infrastructures-infrastructural-publics/
140 Mariana Mazzucato. “From Market Fixing to Market-Creating: A New Framework for Innovation Policy.” Industry and Innovation, vol. 23, no. 2, Feb. 2016, pp. 140–56. https://doi.org/10.1080/13662716.2016.1146124


mission-driven approach to public intervention. It argues that public AI initiatives are essential to safeguarding the common good – a goal unlikely to be achieved if “the next generation of infrastructure … is under the control of a few publicly unaccountable Big Tech firms.”141 Public AI is thus defined as a set of alternatives that “make the advancement of the common good their central goal.” To meet this definition, the paper outlines a set of “minimum viable requirements”:

• Public access: providing everyone with affordable, direct access to AI tools

• Public accountability: empowering citizens to shape technological development

• Permanent public goods: establishing sustainable foundations for AI development

The concept of public access here entails offering public options for core AI technologies that are otherwise delivered by the market, thereby providing alternatives to commercial offerings that are prone to becoming natural monopolies. This includes access to essential tools for AI development, such as code libraries, training data and compute resources. Public AI also aims to guarantee access to newly created public goods. Public accountability is understood both in terms of compliance with trustworthy AI principles and in fostering public participation in AI development. This includes mechanisms for oversight and a clearly articulated public purpose, centered on societal needs and capabilities deemed valuable by the public.

Finally, the requirement of permanent accessibility is meant to ensure that public AI provides a stable and reliable foundation that is not constrained by private interests. The Public AI Network emphasizes that this does not necessarily imply direct public ownership. Instead, it advocates for strategies that allow public AI “to be sustainably developed and independently maintained as a public good, guaranteeing public control in perpetuity.”

Mozilla Foundation

The Mozilla Foundation describes public AI efforts as aimed at “reducing the friction for everyone to build and use AI in a trustworthy manner.” Its analysis suggests that the market will prioritize only a narrow set of profitable applications and therefore not build “everything our society needs from AI.”142 In response, Mozilla proposes a public AI agenda centered on building a “robust ecosystem of initiatives” around three core goals:

• Public goods: the creation of open, accessible public goods and shared resources at all levels of the AI technology stack;

• Public orientation: centering the needs of people and communities, particularly those most underserved by market-led development;

• Public use: prioritizing AI applications in the public interest, especially those neglected by commercial incentives or those that are considered inappropriate for private development.

Mozilla frames public AI initiatives in contrast to private AI development, advocating for competitive alternatives to the proprietary models currently dominating the field. At the same time, public AI is not intended to replace private companies, but to coexist with them by offering “a different way of building technology for different needs.” Ultimately, the report argues for a public AI ecosystem that is pluralistic and involves public, civic and commercial actors.

The report also identifies three significant risks that could lead public AI solutions to replicate the harms seen in today’s AI ecosystem. These include: first, the development of an alternative ecosystem focused solely on creating public goods without a clear public orientation; second, the risk of financial unsustainability without adequate funding; and third, the po-

141 Public AI Network. “Public AI: Infrastructure for the common good.” 10 August 2024. https://publicai.network/whitepaper
142 Nik Marda, Jasmine Sun and Mark Surman. “Public AI. Making AI work for everyone, by everyone.” Mozilla. September 2024. https://assets.mofoprod.net/network/documents/Public_AI_Mozilla.pdf


tential overdependence of public AI on governments and public funding.

In this regard, Mozilla’s approach diverges from Mazzucato’s mission-driven model of AI development by framing public AI as an ecosystem that functions independently of both corporate and governmental control. As the foundation puts it, “We need a resilient and pluralistic AI ecosystem, in which no single entity – whether Big Tech or national governments – can unilaterally decide AI’s future.” However, this vision does not fully address the current ecosystem’s structural dependencies or propose concrete strategies for mitigating them.

Vanderbilt Policy Accelerator

The white paper “The National Security Case for Public AI” presents its approach to public AI in the context of the threats AI systems may pose to democracy and national security – specifically, how they “may threaten the resilience of democracies around the world.”143 Public AI is framed as a dual-purpose strategy: it safeguards democratic values, privacy and other fundamental rights while also providing secure and resilient solutions for national defense and homeland security.

The Vanderbilt Policy Accelerator’s public AI framework has two components:

• Developing a publicly funded, publicly owned and publicly operated AI tech stack

• Adopting public-utility-style regulations for layers of the AI tech stack

The report argues for the establishment of public options in AI – publicly provided and managed components of the AI stack and supply chain. This vision calls for substantial government investment to challenge existing monopolies, especially at the hardware layer, and to promote the creation of public data centers, public cloud services, public datasets for AI development and the recruitment of AI talent into government roles.

These public initiatives are to be complemented by stringent public-utility-style regulations aimed at private AI companies with monopolistic or oligopolistic market power. Proposed regulatory measures include structural separation rules that aim to dismantle monopolistic control over multiple, interconnected layers of the AI stack; non-discrimination rules to ensure equal access for all actors; and restrictions on foreign ownership, control and investment. Such regulations are deemed necessary to maintain a competitive private industry that can provide private contractors to governments without undermining innovation, effectiveness or resilience.

The proposal offers a vision of the dynamics between public options in AI and a regulated private AI market. Government development of in-house solutions is expected to drive more competitive pricing from private contractors. At the same time, market competition would be further supported through regulatory measures. The proposal seeks to strike a balance between reliance on private companies and the need to build public sector capacity to tackle societal challenges using AI.

Defining public AI infrastructure

All three proposals can be mapped onto the public digital infrastructure framework described in the previous section. The Public AI Network paper, with its emphasis on the common good, aligns most closely with this model. The Vanderbilt Policy Accelerator offers the strongest argument for addressing concentrations of power within the AI stack and for establishing true public ownership of AI infrastructure.

These proposals also acknowledge that properly establishing the relationship between public and private actors – particularly regarding control and ownership of infrastructure – is a central challenge for public AI initiatives. The Mozilla paper, for instance, offers an important conceptualization of public AI as an ecosystem involving various actors.

143 Ganesh Sitaraman and Alex Pascal. ibid.


Table 2 | Mapping different definitions of public AI onto the Public Digital Infrastructure framework.

• Public attributes – Public AI Network: public access, permanent public goods; Mozilla: public goods; Vanderbilt: public option
• Public functions – Public AI Network: public accountability; Mozilla: public orientation; Vanderbilt: public option
• Public control – Public AI Network: public accountability; Mozilla: public use; Vanderbilt: public option, public utility regulation

Source: Own table.

Drawing on these proposals and the concept of public digital infrastructure, public AI infrastructure can be defined in terms of the following three characteristics of the AI stack:

• Public attributes: Public AI provides universal and unrestricted access to components of the stack, enabled through openness and interoperability. Key components are shared as digital public goods, and solutions are built on open standards. Systems and processes are auditable and transparent. These attributes help reduce market concentration and dependency on dominant commercial actors.

• Public functions: Public AI delivers foundational systems and services that support broad societal and economic functions, particularly by enabling downstream activities and public benefits. It supports essential public capabilities such as knowledge sharing and civic participation. The public AI stack creates an enabling environment for innovation while safeguarding user rights and social values.

• Public control: Public AI involves public control, funding and/or production of the infrastructures underpinning the generative AI stack. This may take various governance forms – from direct government provision to public orchestration of other actors. The goal is to place the AI stack under democratic control, with mechanisms for collective decision-making and accountability. Public ownership should also ensure long-term sustainability, as captured by the idea of “permanent public goods.”

Both a full-stack AI infrastructure and its individual layers or components can exhibit the characteristics of public AI infrastructure. A publicly owned computing resource or an open source AI model, for example, qualifies as public AI infrastructure. However, due to interdependencies between these layers – as discussed in the previous chapter – policy efforts should support full-stack approaches.

The public AI framework

INFOBOX | Public AI infrastructure, stack and systems

In this report, the terms public AI stack, public AI infrastructure and public AI systems are used to describe the goals and preferred outcomes of public AI policies. It is therefore important to clarify how these terms differ, as well as how they interconnect.

Infrastructures are facilities, systems or institutions that serve society-wide functions and provide foundations for downstream activities and social benefits. Frischmann defines infrastructures as “shared means for many ends,” emphasizing that they should be treated as shared resources.144

Public policy should focus on building public AI infrastructures for two reasons. First, AI – as a general-purpose technology – has infrastructural characteristics due to its open-ended applications and societal impacts. In this respect, it is similar to earlier digital technologies like the internet. Second, Frischmann’s concept of infrastructure as a shared resource implies that commons-based management strategies can enable broad, open access. Emphasizing the infrastructural nature of AI supports the idea of managing it as a public resource to ensure wide accessibility. AI infrastructure can also be understood as an AI stack of interconnected layers.

At the same time, AI technologies are often described as AI systems, which suggests specific instances of a technology that not only provide “means” but also operate, perform tasks and achieve “ends.” An AI system is the end product or application that functions on top of the infrastructural layers beneath it. A complete AI system includes the underlying hardware and compute power, data, code and model architecture used to train the model, the trained model itself (i.e., its parameters) and the application layer built on top of the stack. In contrast, a public AI initiative might focus on providing a single type of component, for example by releasing datasets as digital public goods, or by supporting a sustainable model development ecosystem. Each of these components, on its own, should meet the characteristics of a public AI infrastructure.

The public AI policy framework assumes that deploying AI systems alone is not sufficient to achieve social goals. Instead, public AI infrastructure must be built to support the operation of AI systems in the public interest. At the same time, public AI policy can also include the deployment of specific AI systems – also referred to as AI solutions. This focus is especially important at a time when the purpose and impact of AI deployment remain unclear.

The issue is further complicated by the fact that public AI systems can serve infrastructural functions. Generative AI models, for example, are both concrete AI systems and can function as digital infrastructure. Foundation models, also referred to as general-purpose AI models, exhibit such infrastructural characteristics. When open sourced, they can serve as the basis for new or derivative models.

144 Brett M. Frischmann, ibid.


Gradient of publicness of AI systems

In the previous section, we provided an overview of several proposals for defining public AI infrastructure, alongside a broader definition of public digital infrastructure. Today, there are few – if any – examples of public AI infrastructure that fully meet this definition. The most commonly cited reasons include the lack of publicly owned compute resources, dependence on an oligopoly of cloud and data center providers and the near-monopoly on chip production. Dependence on commercial solutions or components is not inherently problematic, as many public infrastructures rely on commercial commodities. The issue lies in the lack of public capacity to orchestrate the development of infrastructure that is independent of market dynamics – able to generate public goods and serve the public interest.

The three characteristics of public AI infrastructure – public attributes, public functions and public control – together influence the degree of publicness. While some examples of fully public AI infrastructure exist, dependencies at the compute layer often make such deployment difficult. Public computing infrastructure is typically developed through some form of public-private partnership. For this reason, it is useful to consider a gradient of publicness for AI systems.

The gradient describes varying levels of publicness based on the three characteristics – attributes, functions and control – and how these apply across the layers of the AI stack.145 These three characteristics, taken together, determine an infrastructure’s ability to support public interest goals and development trajectories independent from market actors. As such, the gradient serves as a practical tool for policymakers to identify strategies that enhance the publicness of AI systems and strengthen the public value of specific initiatives.

This gradient can be understood as a continuum, ranging from fully public AI systems to semi-public ones, and finally to commercial, closed systems with minimal public functions. As solutions move further along toward publicness, they become less dependent on dominant market players and more capable of enabling public agency and supporting independent, mission-oriented development pathways.

The gradient of publicness also illustrates what must change in AI infrastructures, systems or components to increase their alignment with public interest goals. For example, many open models demonstrate strong public attributes – such as open weights – but fail to fulfill public functions because they are not actively deployed in service of public objectives. Models shared on platforms like Hugging Face Hub may be accessible for experimentation, but are not necessarily integrated into solutions for education, healthcare or climate mitigation.

Further examples include prominent open-weight models such as Meta’s Llama, Mistral and DeepSeek. These models indirectly support public functions by enabling downstream innovation and model development. However, their overall publicness remains limited due to a lack of transparency around training data and processes, which restricts broader access and accountability.

The gradient framework is one of the core contributions of this white paper. It introduces a model that maps AI initiatives along a continuum – from fully public to fully private – based on the dimensions of attributes, functions and control. This approach provides policymakers with a diagnostic and strategic tool for evaluating where a given intervention currently stands and identifying which policy measures – whether investment, regulation or institutional design – could increase its publicness. The framework is particularly relevant when assessing choices at the compute, data and model layers of the AI stack.

Positions on the gradient depend on the extent to which the three characteristics of public AI infrastructure are met. Weak forms of publicness require at least public attributes and some public functions.

145 The inspiration for this gradient comes from Irene Solaiman’s work on a gradient of release approaches for generative AI. See: Irene Solaiman. “The Gradient of Generative AI Release: Methods and Considerations.” arXiv:2302.04844, arXiv, 5 Feb. 2023. https://doi.org/10.48550/arXiv.2302.04844 Accessed 3 April 2025.


Figure 5 | Gradient of publicness

(Diagram: AI initiatives arranged from low to high publicness across the compute, data and model layers. Commercial approaches: commercial provision of AI components with public attributes; commercial AI infrastructure with public attributes and functions. Partial public approaches: public computing infrastructure; public provision of AI components. Fully public approaches: full-stack public AI infrastructure built with commercial compute; full-stack public AI infrastructure.)

Illustration by: Jakub Koźniewski
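The gradient's three characteristics and six levels (described in this chapter) lend themselves to a simple schematic encoding, for instance when cataloguing and comparing initiatives. The sketch below is a hypothetical rendering: the numeric scores are our rough translation of the qualitative "low/moderate/high" descriptions, not an official scoring scheme.

```python
from dataclasses import dataclass

# Illustrative ratings: 0 = low, 1 = moderate, 2 = high.
# These values approximate the qualitative descriptions of each
# level in this chapter; they are hypothetical, not prescribed.

@dataclass
class GradientLevel:
    level: int
    name: str
    attributes: int  # public attributes
    functions: int   # public functions
    control: int     # public control

    def publicness(self) -> int:
        # The gradient treats all three characteristics as jointly
        # determining publicness; a plain sum is the simplest aggregate.
        return self.attributes + self.functions + self.control

GRADIENT = [
    GradientLevel(1, "Commercial provision of AI components with public attributes", 2, 0, 0),
    GradientLevel(2, "Commercial AI infrastructure with public attributes and functions", 2, 1, 0),
    GradientLevel(3, "Public computing infrastructure", 0, 0, 2),
    GradientLevel(4, "Public provision of AI components", 2, 2, 2),
    GradientLevel(5, "Full-stack public AI infrastructure built with commercial compute", 2, 2, 0),
    GradientLevel(6, "Full-stack public AI infrastructure", 2, 2, 2),
]

# List levels from lowest to highest aggregate publicness.
for lvl in sorted(GRADIENT, key=lambda l: l.publicness()):
    print(lvl.level, lvl.name, lvl.publicness())
```

One design choice worth noting: a sum treats the three dimensions as interchangeable, whereas the chapter argues that weak publicness requires attributes plus some functions, and strong publicness additionally requires control; a stricter encoding could therefore rank by the tuple (attributes, functions, control) instead.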

Stronger forms require securing public control as well.

These differences can be systematically mapped onto the three foundational layers of the AI stack – compute, data and models. For each level, specific examples vary in the extent to which they meet the characteristics of public AI infrastructure.

Level 1: Commercial provision of AI components with public attributes. These are specific components – typically open models or libraries – developed and shared by commercial actors. They are publicly accessible (and often openly shared) by organizations that combine public interest goals with a commercial interest in sustaining an ecosystem around their solutions. These infrastructures typically exhibit high public attributes, low to moderate public function and low to moderate public control.

Example: PyTorch146 is a deep learning framework that was open sourced by Meta and is currently hosted by the Linux Foundation. It is an openly shared and collectively governed AI component that plays a central role in AI development. Meta and other contributors have built an ecosystem of complementary research and innovation around PyTorch, with Meta maintaining a leading role.

Level 2: Commercial AI infrastructure with public attributes and functions. This refers to privately controlled infrastructure that ensures some level of access and has a public interest orientation. It includes mechanisms such as public access to commercial compute or platforms for sharing data and models. These infrastructures typically have high public attributes, moderate public function and low public control.

Example: Hugging Face147 is a commercial entity with a mission of “democratizing good machine learning.” It operates a model and dataset sharing platform that serves as the backbone of the open source AI ecosystem.

Level 3: Public computing infrastructure. These are public computing resources – such as supercomputers and data centers – funded entirely by the public sector or developed through public-private partnerships. These infrastructures typically have unclear public attributes (unless specific conditions for access are introduced), low to moderate public function and moderate to high public control.

Example: AI Factories148 are public supercomputers in the EU that are financed through a mix of public and private funding. Their purpose is to support startups and research institutions, with generative AI development as one of their goals.149

Level 4: Public provision of AI components. These are individual components – such as datasets, benchmarks or evaluation tools – developed with public funding and/or hosted on public infrastructure. In some cases, such as datasets or software, there is no dependency on compute. These infrastructures typically have high public attributes, moderate to high public function and high public control.

Example: Common Voice150 is a Mozilla initiative that provides an open platform for sharing voice data for AI training. It is widely cited as a best practice in responsible, open data sharing.

Level 5: Full-stack public AI infrastructure built with commercial compute. These are infrastructures that have public attributes and functions but depend on commercial compute during both development and deployment phases. These infrastructures typically have high public attributes, moderate to high public function and low public control, at least with regard to computing power.

Example: OLMo151 is an open model built by the Allen Institute for AI that sets a high bar for transparency in model, code and training data. The institute partnered with Google to train the model on its Augusta computing infrastructure and to deploy it on Vertex AI, Google’s cloud platform.

Level 6: Full-stack public AI infrastructure. These infrastructures integrate data, models and compute resources that all meet the public AI standard. Systems built on such infrastructure benefit from synergies across layers and are free from commercial dependencies. These infrastructures typically have high public attributes, moderate to high public function and moderate to high public control.

Example: Alia152 is a large language model developed by the Barcelona Supercomputing Center, a public research institution in Spain, using its MareNostrum 5 supercomputer. The model sets a high standard for transparency and openness and addresses a linguistic gap by supporting Spanish and four co-official languages in Spain.

Several important points emerge from examining the gradient of publicness. First, most initiatives requiring computing power will remain dependent on commercial providers, except in rare cases where public institutions maintain their own supercomputers. Second, the provision of certain components, especially datasets and open source software for model development, involves fewer such dependencies, making it easier to build highly public AI infrastructure. Third, the ability to orchestrate a full-stack approach – integrating computing, data and software – can significantly enhance publicness by creating synergies across these layers. Full-stack initiatives are thus more critical to public AI strategies than isolated efforts focused on individual components. Conversely, public AI components tend to have greater impact when they are part of a coordinated, full-stack framework.

Goals and governance principles of public AI policies

The goal of public AI policy should not be to create infrastructures that compete with commercial systems. Rather, public intervention should focus on supporting alternatives: offering new options, advancing new development paradigms and strengthening new capacities. In the next chapter, we outline governance principles that support a mission-oriented, public interest approach to AI. This also means that public policy should avoid participating in an AI race driven by commercial interests or fixating on speculative future needs.153

The issue of meaningful interventions at the compute layer illustrates this approach well. A report by the Ada Lovelace Institute on the role of public compute notes that “The aim of these policies should not simply be to build more and faster, but to challenge concentrated power and promote the creation of public value throughout the AI supply chain,” and that “public compute investments should instead be seen as an industrial policy lever for fundamentally reshaping the dynamics of AI development and therefore the direction of travel of the entire sector.”154

Even this framing of public policy goals might be too ambitious or unrealistic in the face of concentrated power in the AI ecosystem. Rather than aiming to fundamentally impact generative AI markets, policy should prioritize reducing dependencies and building independent capacity to generate public value. The key strategic challenge lies in identifying effective public interventions in a landscape where dominant AI firms are consolidating control over digital stacks and networks – and leveraging vast private funding to reinforce their position in the AI stack.155

The goals of public AI policy should be to:

• Develop AI infrastructure with components that are as public as possible by ensuring strong public attributes and functions

• Reduce dependency on dominant commercial providers of computing power and hardware through investments in public compute infrastructure, procurement policies with public interest conditions and other governance mechanisms

• Create incentives to develop AI infrastructure characterized by greater publicness through interventions that support public functions rather than relying solely on market-led development

• Identify, finance and support the development and maintenance of digital public goods critical to public AI development

146 PyTorch. https://pytorch.org/ Accessed 27 April 2025.
147 Hugging Face. https://huggingface.co/ Accessed 27 April 2025.
148 European Commission. “AI Factories.” Digital Strategy. https://digital-strategy.ec.europa.eu/en/policies/ai-factories Accessed 27 April 2025.
149 AI Factories | Shaping Europe’s digital future
150 Common Voice Mozilla. https://commonvoice.mozilla.org/en Accessed 27 April 2025.
151 OLMo. Ai2. https://allenai.org/olmo Accessed 27 April 2025.
152 Alia. https://www.alia.gob.es/eng/ Accessed 27 April 2025.
153 Zuzanna Warso, ibid.
154 Eleanor Shearer, Matt Davies and Mathew Lawrence, ibid.
155 Cecilia Rikap, “Antitrust Policy and Artificial Intelligence: Some Neglected Issues,” Institute for New Economic Thinking, 10 June 2024, https://www.ineteconomics.org/perspectives/blog/antitrust-policy-and-artificial-intelligence-some-neglected-issues


• Promote research and innovation that reduce reliance on proprietary AI components by advancing new paradigms for AI development

• Ensure adequate research talent and institutional capacity within the public sector to participate meaningfully in AI development

Governance of public AI

Governance of AI systems is a necessary component of any public AI agenda. By governance, we refer to the processes, structures and coordinated actions by multiple actors through which decisions related to AI are made and enforced. This concept extends beyond traditional legal frameworks to include various methods for setting and upholding norms – such as standards, codes of practice, voluntary licensing models or community-based rules.

It emphasizes the need for diverse stakeholders to work together to achieve desired outcomes while mitigating potential risks and harms. In the context of public AI, these governance mechanisms are intended to ensure that generative AI systems – and their underlying components – are either publicly owned or, at a minimum, subject to meaningful public oversight. This helps ensure that AI infrastructures meet the criteria of public attributes and public functions. Some of these mechanisms also address foundational conditions necessary for digital infrastructures to serve the public interest.

In what follows, we provide an overview of the core principles for governing public AI. Some principles focus on reinforcing the public character of AI solutions. Others are broader, representing good governance standards for AI systems more generally. To serve a public function, AI systems must meet high governance standards that guarantee accountability, transparency and sustainability.

To achieve these ends, public AI governance should be grounded in several high-level principles:

• Directionality and purpose: Public AI infrastructure should be built with clear intent and direction, ensuring alignment with public values and the principles outlined below. Public actors must orchestrate resources and stakeholders to ensure public AI initiatives generate public value and serve the common good.156

• Commons-based governance: Datasets, software, models and other key components of AI systems should be stewarded as commons. Such a framework encourages open access while establishing responsible use, democratic oversight and collective stewardship. Commons-based governance encompasses approaches that challenge proprietary ownership and promote shared control over resources.157 It balances broad access to resources with governance mechanisms that protect rights, ensure quality and generate public value (including economic value).158

• Open release of models and their components: Key components of AI models – including model weights, architectures and documentation – should be released under open source licenses and made universally accessible for use, study, modification and redistribution. Governance of open models should include strong transparency requirements and, where possible, openly shared training datasets.159

• Open source software: The advancement of AI has relied heavily on open source tools like scikit-learn and PyTorch, which are developed and maintained through community-led processes. Open source AI tools should be governed as digital public goods, in line with the Digital Public Goods Standard,160 employing vendor-neutral and transparent community governance and development processes.

• Conditional computing: Public investments in the AI stack should, wherever possible, be tied to specific conditions and rules that shape how a technology is developed and used. This principle is especially relevant to public investment in computing hardware and infrastructure. Limited compute resources made available through public funding should be used efficiently and in service of public interest goals, such as openness.

• Protecting digital rights: Public AI systems should set a high standard for the protection of digital rights, including privacy and data protection, copyright, freedom of expression and access to information. This requires transparency, accountability, grievance mechanisms and robust frameworks for risk assessment and mitigation.

• Sustainable AI development: AI systems should be developed and deployed in ways that promote environmental sustainability, fair resource use and long-term societal benefit. Public procurement of AI infrastructure should include requirements for sustainable compute provision and the development of responsible supply chains.161

• Reciprocity: Public and private actors that benefit from public AI resources should ensure that downstream applications and derivative products adhere to these governance principles. This helps prevent the privatization of public value and protects against corporate capture.

When incorporating these governance principles, public AI policies must go beyond simply providing compute, data or model capacity for public interest use cases. They must also shape how AI infrastructures are developed and used.

156 Mariana Mazzucato, David Eaves and Beatriz Vasconcellos. “Digital public infrastructure and public value: What is ‘public’ about DPI?.” UCL Institute for Innovation and Public Purpose. 2024. https://www.ucl.ac.uk/bartlett/publications/2024/mar/digital-public-infrastructure-and-public-value-what-public-about-dpi
157 Alek Tarkowski and Jan Zygmuntowski. “Data Commons Primer.” Open Future, 20 September 2022. https://openfuture.eu/publication/data-commons-primer
158 Alek Tarkowski and Zuzanna Warso. “Commons-Based Data Set Governance for AI.” Open Future, 21 March 2024. https://openfuture.eu/publication/commons-based-data-set-governance-for-ai
159 Alek Tarkowski. “Data Governance in Open Source AI.” Open Future, 24 January 2025. https://openfuture.eu/publication/data-governance-in-open-source-ai
160 DPG Alliance. “DPG Standard.” GitHub. https://github.com/DPGAlliance/DPG-Standard Accessed 27 April 2025.
161 Green Screen Coalition, et al. “Within Bounds: Limiting AI‘s environmental impact.” Green Screen Coalition. 5 February 2025. https://greenscreen.network/en/blog/within-bounds-limiting-ai-environmental-impact/

5 | AI strategy and three pathways to public AI

In this chapter, we outline key elements of a public AI strategy by presenting three pathways for developing public AI solutions, based on the three core layers of AI systems: data, compute and models.

The metaphor of a vertically integrated AI stack suggests that the hardware layers at the bottom form the infrastructural foundation of any AI system. We propose extending this concept to also treat datasets and models as forms of public infrastructure – critical building blocks upon which public AI solutions can be developed. In other words, there are three potential pathways to public AI, each focused on developing computing power, datasets or models as public infrastructural resources.

The aim of this chapter is to illustrate what a complete public AI strategy could look like, with interventions at the various layers of the AI stack.

Elements of a public AI strategy

As shown in the previous chapter, full-stack approaches to public AI development offer a higher degree of publicness than partial solutions. A public AI strategy should therefore orchestrate the provision of computing power and training data to support an ecosystem of public AI models. This ecosystem would include a state-of-the-art “capstone” model and a variety of smaller models. Supporting actions should include investment in research and innovation, development of a strong talent base and institutional capacity and programs that enable the deployment of public AI systems for specific solutions and uses.

In doing so, a public AI strategy should shift away from the current “AI race” among major commercial labs. In other words, it should treat AI technologies not as a potential superintelligence, but as a “normal technology.”162 This means pursuing a pragmatic development strategy in which investments in compute are tied to clearly defined goals for model development and deployment. Another pillar of this strategy should focus on fostering a more sustainable path for AI development – one centered on small models and innovations that make AI technologies more sustainable in terms of their energy consumption and environmental footprint.

In the following sections, we describe in more detail the elements of this strategy, divided into three pathways. Each pathway focuses on one layer of the AI stack: compute, data and models. In each case, the goal of public AI policy should be to secure public attributes, functions and control over AI systems and their components.

1. Compute layer: Public AI strategy at the compute layer should aim to reduce dependence on commercial computing power and, where necessary, develop publicly owned compute infrastructure. Supporting measures include research into more efficient AI development paradigms (within the model layer) and research to establish meaningful estimates of the computing needs of public AI initiatives.

162 Arvind Narayanan and Sayash Kapoor. “AI as Normal Technology.” Knight First Amendment Institute. 2025. https://kfai-documents.s3.amazonaws.com/documents/2b27e794d6/AI-as-Normal-Technology---Narayanan---Kapoor.pdf Accessed 3 April 2025.


Figure 6 | Elements of a public AI strategy

(Diagram: an orchestrating institution coordinates elements across the three layers of the AI stack. Models: ecosystem of small and domain-specific models; paradigm-shifting innovation; AI talent and capabilities; AI deployment pathways; software and tools development; public provision of a capstone model. Data: datasets as digital public goods; public data commons. Compute: public compute for research institutions; public compute for open source AI development; better coordination between public compute initiatives.)

Illustration by: Jakub Koźniewski


2. Data layer: At the data layer, public AI strategy should promote the development of high-quality datasets that are both publicly accessible and governed through democratic, commons-based frameworks. These governance models should ensure public attributes and functions while protecting data from inappropriate value extraction, such as free-riding.

3. Model layer: At the model layer, public AI strategy should build on open source generative AI ecosystems that demonstrate how technologies can be developed with strong public attributes. It should also aim to strengthen the public functions of these technologies by setting a clear development agenda focused not on technology for its own sake, but on generating public value – through targeted application development and demand creation.

Taken together, this public AI strategy is relatively complex and requires new forms of governance capable of coordinating the actions of diverse actors to achieve policy goals. It calls for a strong institutional framework able to lead and orchestrate the strategy effectively.

The public AI ecosystem and its orchestrating institution

A public AI strategy should aim to create an ecosystem of public generative AI infrastructures and systems, rather than focus solely on individual initiatives or standalone institutional capacity. Calls for centralized public AI development often overlook the distributed nature of modern AI research, which is supported by open source development norms. Even when investing in centralized capacity – such as public compute resources – a public AI strategy must also support a broader ecosystem centered on public functions and the creation of public value.

The governance mechanisms proposed in chapter 4 support this ecosystem-based approach by promoting the sharing of data, software and knowledge, and by ensuring interoperability among solutions within the ecosystem.

Helping this ecosystem thrive requires leveraging the power of public institutions, including market-shaping tools like industrial policy and public procurement, to support and strengthen it. Public institutions must also be part of this ecosystem, and public AI solutions should be built on infrastructures developed within it.163

A key role in this ecosystem should be played by a public institution capable of orchestrating actions across a decentralized network of actors. Orchestration involves managing how public digital infrastructure is produced and how it aligns with evolving needs and values over time. It also includes the ability to adapt the ecosystem as those needs, values and conditions change.164 In other words, proper orchestration allows an institution to control the generative capability of various infrastructures in the public AI ecosystem.

An orchestrating institution – or, more likely, a network of institutions – thus plays a critical role in the ecosystem. Several blueprints for such institutions have been proposed, drawing inspiration from organizations that have successfully delivered other public goods and coordinated complex ecosystems.

Brandon Jackson, writing for Chatham House, proposes a “BBC for AI” or a British AI Corporation (BAIC), which he describes as “a new institution that would ensure that everyone has access to powerful, responsibly built AI capabilities. Yet the BAIC should be more than just a head-to-head competitor with the private AI companies. It should be set up with an institutional design that empowers it to chart an

163 Katja Bego. “Towards Public Digital Infrastructure: A Proposed Governance Model.” Nesta, 30 March 2022. https://www.nesta.org.uk/project-updates/towards-public-digital-infrastructure-a-proposed-governance-model/; Alek Tarkowski, et al. “Generative Interoperability.” Open Future, 11 March 2022. https://openfuture.eu/publication/generative-interoperability
164 Antonio Cordella and Andrea Paletti. “Government as a platform, orchestration, and public value creation: the Italian case.” Government Information Quarterly, 36 (4). ISSN 0740-624X. https://doi.org/10.1016/j.giq.2019.101409


independent path, building innovative digital infrastructure in the public interest.”165

Another proposal, from the Center for Future Generations, outlines a “CERN for AI”166 – a more centralized model for an institution combining in-house research with a broad mandate to collaborate with academia and industry. A separate proposal by Daniel Crespo and Mateo Valero suggests that European public AI efforts should draw inspiration from Airbus and Galileo, two examples of successful coordination among industry actors to develop innovative, competitive technologies.167 Most recently, the CurrentAI partnership, launched at the Paris AI Action Summit in February 2025, aims to orchestrate public interest AI development at the global level by coordinating the efforts of governments, philanthropies and private companies.168

It should be emphasized that the aim of this publication is not to prescribe any specific institutional model. Rather, it seeks to underscore the importance of thoughtful institutional design as the foundation for any effective public AI strategy.

Three pathways toward public AI infrastructure: compute, data and model

Each of these pathways offers a different approach to meeting public AI objectives and addressing dependencies on market monopolies. For each pathway, we begin with an overview of bottlenecks and opportunities, followed by a list of proposed solutions. At the end, we offer three additional recommendations for supportive measures.

The goal of these sections is not to provide detailed blueprints for every solution. Rather, we aim to outline a comprehensive strategy, coordinated across the three layers and pathways to public AI. These recommendations do not take into account the economic dimensions of deploying public AI infrastructure, which must be shaped by the specific local or regional context where such policies are developed. Nor do we attempt to provide a comprehensive list of public AI efforts; examples are included only to illustrate different solutions.

Compute pathway to public AI

Computing infrastructure forms a foundation for public AI development, yet policy approaches must strike a balance between addressing real infrastructural needs and avoiding inflated investment claims. While compute is essential to AI progress, public initiatives should prioritize targeted, strategic investments rather than attempting to replicate the massive spending patterns of dominant commercial players.

Compute: bottlenecks

Compute is an integral component of AI development
Compute is an integral component of AI development
165 Brandon Jackson. “The UK needs a ‘British AI Corporation’, and deployment, yet its landscape is defined by an
modelled on the BBC.” Chatham House. 10 June 2024. https:// extraordinarily complex supply chain and market dy-
www.chathamhouse.org/2024/06/artificial-intelligence-
and-challenge-global-governance/07-uk-needs-british-ai- namics dominated by a few powerful players. As out-
corporation lined in the previous chapter, compute can be broken
166 Alex Petropoulos, et al. “Building CERN for AI.” Center for down into three critical components: advanced chips,
Future Generations. 30 January 2025. https://cfg.eu/building-
cern-for-ai/
software frameworks to run those chips and data
167 Daniel Crespo and Mateo Valero. “Es la hora de ‘AIbus’: por qué centers.
Europa debe crear una gran empresa de AI.” El Pais. 11 April
2024. https://elpais.com/tecnologia/2024-04-11/es-la-hora-
de-aibus-por-que-europa-debe-crear-una-gran-empresa-
At the hardware level, semiconductor production
de-ai.html is widely considered the most complex and global-
168 “Nouveau partenariat pour promouvoir l‘IA d‘intérêt général.” ly distributed supply chain in the world. It is typically
Élysée. 11 February 2025. https://www.elysee.fr/emmanuel-
macron/2025/02/11/nouveau-partenariat-pour-promouvoir-
divided into three main stages: chip design, front-
lia-dinteret-general

61
end manufacturing and back-end manufacturing.169 Chip design involves creating the architecture and layout of the semiconductor, often using specialized software. Front-end manufacturing refers to the fabrication process in advanced foundries, where billions of transistors are etched onto each chip using photolithography and other precision techniques. Back-end manufacturing involves the assembly, packaging and testing of chips before integration into devices.

Each stage of this supply chain is highly specialized and concentrated in the hands of a few firms. For instance, ASML in the Netherlands is the only company in the world capable of producing extreme ultraviolet (EUV) lithography machines – critical equipment for manufacturing advanced chips. These machines consist of over 100,000 components and cost roughly €350 million each.

Front-end manufacturing is particularly capital- and technology-intensive, leading to extreme market concentration. Only a few firms – most notably Samsung and Taiwan Semiconductor Manufacturing Company (TSMC) – can operate in this space. In 2024, TSMC held a 90% market share in advanced logic chips, which are essential for AI training and deployment.170 That same year, TSMC invested more than $30 billion in capital expenditures and manufactured most of Nvidia’s chips, along with chips for numerous Chinese companies, despite geopolitical tensions.171

Specialized software frameworks also create significant lock-in. Nvidia’s proprietary CUDA platform, designed to run exclusively on its GPUs, dominates the market and has created a ‘walled garden’ ecosystem. The deep integration of CUDA with Nvidia hardware, its robust ecosystem of machine learning libraries and frameworks and its superior multi-GPU scaling capabilities have entrenched the company’s position and raised substantial barriers for competitors.

Data centers – the final layer – integrate chips and software into usable compute systems. These facilities are also marked by high concentration due to their massive cost and operational complexity. In January 2025, OpenAI, Oracle and SoftBank announced the Stargate project, with a planned investment of up to $500 billion in data center infrastructure. While the feasibility of this investment remains uncertain, the scale illustrates the intensifying global race for compute leadership.172 In 2024, companies such as Microsoft, Meta, Amazon and Apple spent approximately $218 billion on physical infrastructure. As a result, state-of-the-art data centers remain the domain of large, well-capitalized corporations or publicly backed initiatives.173

In short, these three components – chips, software and data centers – highlight deep dependencies on a small number of dominant players and reflect the extraordinary capital intensity of the compute market.

Public initiatives aimed at securing compute capacity often focus narrowly on sovereignty rather than public value. These efforts treat compute as a national resource and view expanded capacity as an end in itself. Investments, typically undertaken with commercial partners, are rarely tied to public value conditions and often serve to bolster national commercial actors. The notion of sovereign AI, when built on commercial compute infrastructure, risks reinforcing the market dominance of existing players.

169 Jan-Peter Kleinhans and Julia Christina Hess. “Governments’ role in the global semiconductor value chain #2.” Stiftung Neue Verantwortung. 6 July 2022. https://www.interface-eu.org/publications/eca-mapping
170 “TSMC’s Advanced Processes Remain Resilient Amid Challenges.” Trendforce. 8 April 2024. https://www.trendforce.com/news/2024/04/08/news-tsmcs-advanced-processes-remain-resilient-amid-challenges/
171 Chris Miller. “Chip War. The Fight for the World’s Most Critical Technology.” Simon & Schuster. 4 October 2022. https://www.simonandschuster.com/books/Chip-War/Chris-Miller/9781982172008
172 Steve Holland. “Trump announces private-sector $500 billion investment in AI infrastructure.” Reuters. 22 January 2025. https://www.reuters.com/technology/artificial-intelligence/trump-announce-private-sector-ai-infrastructure-investment-cbs-reports-2025-01-21/
173 Michael Flaherty. “Tech dollars flood into AI data centers.” Axios. 26 December 2024. https://www.axios.com/2024/12/20/big-tech-capex-ai
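The stacked dependencies described above can be read as a strict pipeline: a chokepoint at any single stage (ASML's lithography machines, TSMC's foundries, CUDA, hyperscaler data centers) stalls everything downstream. The sketch below is our own illustration of that structure; the stage list and encoding are invented for the example and not drawn from the paper's sources.

```python
# The compute supply chain as a strict dependency chain: each stage
# requires the output of the previous one. Stage names follow the text;
# the encoding itself is purely illustrative.
PIPELINE = [
    "chip design",              # architecture and layout (specialized design software)
    "front-end manufacturing",  # fabrication in advanced foundries (TSMC, Samsung)
    "back-end manufacturing",   # assembly, packaging and testing
    "software frameworks",      # e.g. CUDA, needed to actually run the chips
    "data centers",             # chips + software integrated into usable compute
]

def blocked_by(bottleneck: str) -> list[str]:
    """Return every stage that stalls when a single stage is choked off -
    the structural effect of concentration anywhere in the chain."""
    return PIPELINE[PIPELINE.index(bottleneck) + 1:]
```

For example, `blocked_by("front-end manufacturing")` returns the three downstream stages, which is why a 90% foundry market share matters to every actor above it in the stack.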
While there is growing consensus among policymakers that public compute provision is essential, the high costs and fast pace of technological change make such interventions challenging. Unlike sovereign AI strategies, the goal should not merely be to expand national capacity, but to ensure that these investments generate public benefit.

Compute: opportunities

The high market concentration and capital intensity of computing infrastructure create significant challenges for developing public compute initiatives. As noted in the Computing Commons report by the Ada Lovelace Institute, “policymakers need to be realistic about what can be achieved through public compute projects alone,” as dependencies in semiconductor supply chains mean that “for most jurisdictions the goal of ‘onshoring’ production will likely be a near impossibility in the short to medium term.”

The report also highlights “a lack of genuinely independent alternatives to AI infrastructures operated by the largest technology firms, which are overwhelmingly headquartered in the USA and China.”174 At the same time, compute costs continue to rise sharply, with spending on AI training runs increasing by a factor of 2.5 annually in recent years.175 Even as efficiency and optimization improve, total expenditures remain massive, with some leading AI companies announcing plans to invest hundreds of billions of dollars in the coming years.

Due to the high capital requirements and rapid pace of technological change, it is more realistic to pursue public-private partnerships, where governments serve as one partner rather than the sole provider. Aurora GPT – an initiative to build a science-focused foundation model in the United States – and the European AI Factories initiative are both examples of such partnerships, resulting in infrastructures with a high degree of publicness.

In addition, public compute initiatives face two major shortcomings. First, they often lack effective allocation mechanisms and operate without a clear vision for which AI components and infrastructures should be prioritized. Not all projects require large-scale compute, and a better understanding of actual computing needs is necessary to inform a sound public compute strategy. Without clear criteria and governance, resources risk being misallocated or underused. Any large-scale initiative – such as a potential “CERN for AI” – should begin with a systematic assessment of demand, identifying which institutions, researchers and projects require which levels of compute. This demand-driven approach should guide allocation policies, access rules and future scaling. A mission-driven strategy should also link compute use to clear public goals.

Second, compute initiatives are often framed primarily as sovereign assets, with a focus on national control and expanding capacity as ends in themselves. This perspective risks prioritizing geopolitical narratives about AI sovereignty over more meaningful questions about who has access to computing power and for what purpose.

Given these structural dependencies, most compute-based pathways toward public AI will likely fall in the middle of the publicness gradient.

Policy recommendations for the compute pathway include:

Public compute for open source AI development

The first recommendation is to ensure that fully open source AI projects – with open, accessible training data – have access to sufficient compute resources. A core interest of any public AI agenda should be to guarantee that at least one open model exists with capabilities comparable to the state-of-the-art, alongside the institutional capacity to develop and work with such models (this is discussed further in the model pathway section). Public supercomputers

174 Matt Davies and Jai Vipra. “Computing Commons.” Ada Lovelace Institute. 7 February 2025. https://www.adalovelaceinstitute.org/report/computing-commons/
175 Ben Cottier et al. “How Much Does It Cost to Train Frontier AI Models?” Epoch. 25 January 2024. https://epoch.ai/blog/how-much-does-it-cost-to-train-frontier-ai-models
and publicly supported data center initiatives play an essential role in making this possible.

A notable example is the collaboration between the Allen Institute for AI and the LUMI supercomputer in Finland – part of the EU’s EuroHPC AI Factories initiative. In 2024, the Allen Institute released OLMo, a fully open source language model, including its pretraining dataset, trained using LUMI’s computing power.176 The EuroHPC initiative represents a form of semi-public infrastructure, funded through a combination of EU funds, national budgets and private contributions.177

Another example is Aurora GPT, a U.S. project led by Argonne National Lab in partnership with Intel, which relies on a public-private collaboration using the Aurora supercomputer and Intel-provided GPUs to develop a foundation model for science.178 In Europe, the Barcelona Supercomputing Center recently released Alia, a Spanish-language model described as open, transparent and public.179

These examples demonstrate the potential of public compute to support open source AI efforts. However, sustaining progress at the frontier – and enabling wide deployment – will require continuous expansion and upgrades to public computing capacity. As proprietary models continue to advance rapidly, there is a growing risk that open models will fall too far behind. To avoid this widening gap, expanded access to public compute for open source AI development is essential.

Public compute for research institutions

A second strategic pillar of a public AI strategy is expanding access to compute for academic institutions and public research organizations. Many universities face serious constraints due to the high cost and limited availability of GPUs, often relying on expensive commercial cloud services. Investing in public supercomputing and data center initiatives is essential to ensure that cutting-edge AI research does not remain confined to closed, private labs but can also take place in universities and research institutes.

Providing compute access to the academic sector is also a way to attract and retain top AI talent within public research institutions. Without adequate resources, researchers and students may be drawn to well-funded corporate labs, limiting the development of fully open source models and public interest applications.

Improved coordination between public compute initiatives

Many existing public compute initiatives tend to frame infrastructure primarily as a sovereign asset, emphasizing national control and domestic capacity. However, this sovereignty-first approach risks fragmenting the broader mission of public AI. A truly public AI agenda should not reinforce narrow national strategies but instead promote cross-border cooperation – particularly among democratic nations – to maximize the impact of public investment. Without coordinated strategies, governments risk duplicating efforts, underutilizing resources and falling short of the scale needed to support open source AI ecosystems and create viable public alternatives to proprietary models.

One area where improved coordination is urgent is the AI Factories initiative in Europe, which established data centers in strategic locations, such as large supercomputing hubs. This initiative would benefit from deeper collaboration across national borders to reach its full potential. One proposal that embodies this approach is the “Airbus for AI”180 model, which advocates pooling resources and building shared capacity to produce high-performing, open AI models. Supporting such collaborative

176 Allen Institute for AI. “Hello OLMo: A Truly Open LLM.” Allen Institute for AI Blog. 9 January 2024. https://allenai.org/blog/hello-olmo-a-truly-open-llm-43f7e7359222
177 European High-Performance Computing Joint Undertaking (EuroHPC JU). “Discover EuroHPC JU.” https://eurohpc-ju.europa.eu/about/discover-eurohpc-ju_en Accessed 27 April 2025.
178 Rusty Flint. “AuroraGPT: Argonne Lab and Intel.” Quantum Zeitgeist. 14 March 2024. https://quantumzeitgeist.com/auroragpt-argonne-lab-and-intel/
179 Alia. https://www.alia.gob.es/eng/ Accessed 27 April 2025.
180 Daniel Crespo and Mateo Valero. ibid.
frameworks is essential to transform scattered infrastructure into a cohesive, strategic and effective public AI ecosystem.

Data pathway to public AI

Current approaches to data sources for AI development oscillate between proprietary control and unrestrained extraction from public sources. Unlike the compute pathway, establishing a data commons is not primarily about investments in technology or hardware – it requires better governance of various types of data. The data pathway offers opportunities to create genuine digital public goods with governance mechanisms that protect against value extraction and ensure equitable access. This requires overcoming bottlenecks related to proprietary control, declining consent for data use and insufficient attention to data quality.

Data: bottlenecks

While much of AI development is fueled by a culture of open sharing, data practices are largely shaped by either proprietary control or unregulated use of publicly available sources. Few AI development teams make meaningful efforts to share high-quality, useful datasets. At the same time, they seek competitive advantages through proprietary sources – such as user-generated and personal data from social networks and online platforms, or data obtained through exclusive agreements. Public web content continues to be crawled and scraped, with attempts to filter and improve quality. Yet there are signs that the social contract underpinning the open web is eroding, as content owners increasingly withdraw consent for their domains to be crawled. These trends contribute to a negative feedback loop, leading to what Stefaan Verhulst has described as a “data winter” – a decline in the willingness to see data as a resource that can serve the common good.181

Governance and ethical concerns related to data use for AI training represent another major bottleneck. The training of commercial models on the entirety of the public internet – often under unclear legal conditions and with minimal transparency – reflects a lack of proper oversight and results in the extraction of value from global knowledge and cultural commons.182 Evidence gathered by the Data Provenance Initiative shows that consent for web crawling is steadily decreasing, especially among domains whose content is used in AI training.183 And recent data from the Wikimedia Foundation shows that rapidly growing automated traffic from the web crawlers of AI companies is becoming a financial burden.184

Data quality presents a further challenge. Unlike compute-related dependencies, poor data quality may not halt AI development, but it does undermine the usefulness and integrity of generative AI solutions. Training models on publicly available content often reflects not only poor governance but also a lack of attention to data quality. Although this issue has been raised by some experts,185 recurring examples demonstrate that governance practices remain insufficient.

As a result, the widespread use of large datasets built from scraped web content risks reinforcing biases – such as the overrepresentation of well-resourced written languages and dominant cultural narratives – exacerbating existing inequalities. In some cases, it has also led to the scaling of harmful content, including explicit imagery.186 On a global scale,

181 Stefaan Verhulst. “Are We Entering a Data Winter?” Policy Labs. 21 March 2024. https://policylabs.frontiersin.org/content/commentary-are-we-entering-a-data-winter
182 Paul Keller. “AI, the Commons and the Limits of Copyright.” Open Future. 7 March 2024. https://openfuture.eu/blog/ai-the-commons-and-the-limits-of-copyright/
183 Shayne Longpre et al. “Consent in Crisis: The Rapid Decline of the AI Data Commons.” arXiv preprint. 24 July 2024. https://arxiv.org/abs/2407.14933
184 Birgit Mueller et al. “How Crawlers Impact the Operations of the Wikimedia Projects.” Diff – Wikimedia Foundation Blog. 1 April 2025. https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/
185 Will Orr and Kate Crawford. “Is AI Computation a Public Good?” SocArXiv preprint. 2024. https://osf.io/preprints/socarxiv/8c9uh_v1
186 Abeba Birhane et al. “Multimodal datasets: misogyny, pornography, and malignant stereotypes.” arXiv preprint. 5 October 2021. https://arxiv.org/pdf/2110.01963 Accessed 3 April 2025.
inadequate data governance is increasingly seen as enabling new forms of data colonialism and extractive practices.187

Data: opportunities

The data pathway to public AI development aims to create a pool of datasets and content collections that function as digital public goods. While data is not typically seen as public infrastructure for AI development, it can in fact possess public attributes, serve public functions and be subject to public control.188 For this to occur, there must be a dual obligation: to expand access and to better protect various data sources. In the case of data, the key challenges are less about upstream dependencies and more about downstream risks – specifically, the risk that public data is extracted for private gain, reinforcing inequality.189 This happens when private actors capture the economic value generated by data without giving back to the people and institutions that created or maintained it as a public good.

This means data-sharing efforts must shift focus – from simply increasing the volume of available data to improving its quality and implementing governance mechanisms that ensure equitable, sustainable sharing protected from value extraction. As a result, both data transparency and novel gated access models (to protect, for example, personal data rights) are becoming central governance issues. In addition, ensuring access to commercial datasets – at least for research purposes – is a vital reciprocal measure to secure private data for public interest uses.

Model developers have always relied on high-quality open datasets. Wikipedia is a prime example of a structured, high-quality dataset, and books remain a foundational data source190 – even when their legal status is ambiguous, as seen with the Books3 dataset. Simply increasing the volume of training data is neither the only strategy nor the most effective one. Developers are already focused on improving the quality of web-scraped data, as shown by projects like the FineWeb datasets, which filter and clean Common Crawl data.

High-quality data sources can support generative AI development at various stages: pretraining, post-training or adaptation, inference and the creation of synthetic data.191 Newer approaches to dataset development rely less on pretraining and focus more on the post-training phase, which requires data that cannot be easily “found in the wild” or scraped from public sources. This includes domain-specific data for fine-tuning specialized models, as well as dialogues and task-specific examples used in instruction tuning. In model distillation, for example, developers rely not on new human-generated data but on synthetic data generated by a “teacher” model. Public interventions must therefore address both the governance of publicly available data and the development of specialized datasets and tools – covered further in the section on additional measures, as part of the software and tool ecosystem for public AI.

Data is not a homogeneous concept, and its use is governed by multiple legal frameworks that protect data rights, including copyright and personal data regulations. As such, data sharing exists on a spectrum – from fully open to gated models – each of which can be understood as a form of commons-based governance. These approaches are underpinned by key principles: sharing as much data as possible while maintaining necessary restrictions; ensuring transparency about data sources; respecting data subjects’ choices; protecting shared resources; maintaining

187 James Muldoon and Boxi A. Wu. “Artificial Intelligence in the Colonial Matrix of Power.” Philosophy & Technology 36, no. 80 (2023). https://doi.org/10.1007/s13347-023-00687-8 Accessed 3 April 2025.
188 Digital Public Goods Alliance, et al. “Exploring Data as and in Service of the Public Good.” https://www.digitalpublicgoods.net/PublicGoodDataReport.pdf Accessed 23 April 2025.
189 Paul Keller and Alek Tarkowski. “The Paradox of Open.” Open Future. https://paradox.openfuture.eu/ Accessed 23 April 2025.
190 Alek Tarkowski et al. “Towards a Books Data Commons for AI Training.” Open Future. 8 April 2024. https://openfuture.eu/publication/towards-a-books-data-commons-for-ai-training/
191 Hannah Chafetz, Sampriti Saxena and Stefaan G. Verhulst. “A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI.” The GovLab. May 2024. https://www.genai.opendatapolicylab.org/
dataset quality; and establishing trusted institutions to steward them.

Policy recommendations for the data pathway include:

Datasets as digital public goods

Open data – such as Wikimedia content – has been a foundational resource in the development of generative AI models. Using openly licensed data and content for AI training offers the advantage of legal certainty when developing models. Research by GovLab suggests that “the intersection of open data – specifically open data from government or research institutions – and generative AI can not only improve the quality of the generative AI output but also help expand generative AI use cases and democratize open data access.”192 Open data therefore has the potential to support public functions of generative AI in solutions that address global challenges like climate change or healthcare.193

Using open datasets for AI training also offers the possibility of moving beyond current norms of model openness, in which model weights are shared but training data remains closed and nontransparent. Open data enables the development of fully open AI models.194 Ongoing efforts to build such models are undertaken by organizations like EleutherAI, Spawning and the Allen Institute for AI.

Public AI policies can build on more than a decade of experience developing open data infrastructure. However, the approach must shift from simply releasing as much data as possible to intentionally creating high-quality, purpose-built datasets for AI training. Beyond just training foundation models, many initiatives already provide open data for computational research and demonstrate the value of open access. Notable examples include the Human Genome Project, CERN’s Open Data portal and NASA’s Earthdata platform.

Public data commons

A public data commons is a data governance framework that aims to secure public interest goals through commons-based management of data.195 These commons complement open data approaches and are particularly well suited for cases involving sensitive data, where rights must be protected, or where economic factors tied to dataset creation and maintenance must be considered.

Public data commons should be governed by three core principles:

• Stewarding access through clear sharing frameworks and permission interfaces

• Ensuring collective governance through defined communities, trusted institutions and democratic control

• Generating public value through mission-oriented goals and public interest-oriented licensing models.

To establish data commons for AI training, dedicated public institutions are needed to act as trusted intermediaries. These institutions must also possess the technical capabilities to build hosting platforms for modern training datasets. Public data commons serve a gatekeeping role, supporting various data types and implementing flexible governance – from open access to gated models that preserve individual and collective data rights. Work on data commons is often motivated by the need to protect community-owned data from exploitation. Notable examples include the Māori language datasets and AI tools developed by Te Hiku Media, as well as African language datasets curated by Common Voice and the African Languages Project.

192 Digital Public Goods Alliance, ibid.
193 Hannah Chafetz, Sampriti Saxena and Stefaan G. Verhulst. ibid.
194 Stefan Baack, et al. “Towards Best Practices for Open Datasets for LLM Training: Proceedings from the Dataset Convening.” Mozilla. 13 January 2025. https://foundation.mozilla.org/en/research/library/towards-best-practices-for-open-datasets-for-llm-training/
195 Alek Tarkowski and Zuzanna Warso. “Commons-based data set governance for AI.” Open Future. 21 March 2024. https://openfuture.eu/publication/commons-based-data-set-governance-for-ai/ Accessed 3 April 2025.
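The “permission interfaces” and gated access models described above can be made programmatic and auditable. The sketch below is a toy illustration: the tier names, allowed purposes and rules are invented for the example and not drawn from any existing commons; a real steward institution would add human review, auditing and appeal on top of any such check.

```python
from dataclasses import dataclass

# Illustrative access tiers for a public data commons. The tiers and
# their allowed purposes are invented for this sketch.
TIER_RULES = {
    "open": {"any"},                           # open access: anyone, any purpose
    "gated": {"research", "public-interest"},  # gated: vetted purposes only
    "protected": {"community-approved"},       # e.g. community-owned language data
}

@dataclass
class AccessRequest:
    tier: str            # which tier the requested dataset sits in
    purpose: str         # declared purpose of use
    accepts_terms: bool  # agreed to the commons' licensing conditions

def grant_access(req: AccessRequest) -> bool:
    """Return True if the request satisfies the tier's sharing framework."""
    if not req.accepts_terms:
        return False
    allowed = TIER_RULES.get(req.tier, set())
    return "any" in allowed or req.purpose in allowed
```

The point is not the lookup table itself, but that gated sharing – unlike both fully open release and fully closed control – can be expressed as explicit, inspectable rules that a trusted intermediary enforces.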
Model pathway to public AI

Bottlenecks related to model development largely stem from previously described challenges at the compute and data layers. Under the transformer paradigm, limited computing power and restricted access to training data constrain the ability to build capable models. At the same time, this layer is characterized by strong norms of open sharing – of models and of related tools and components. However, bottlenecks within this layer primarily relate to the lack of highly capable open models that could serve as a “capstone” for a broader ecosystem of open model development.

In contrast, there are many smaller models that meet high standards of public control and public attributes, yet they often remain primarily research or engineering artifacts. Building state-of-the-art open models – such as in the case of DeepSeek – still requires significant compute resources, which remain largely accessible only to well-funded commercial actors. While notable exceptions exist – such as the fully open source OLMo models released by the Allen Institute for AI – these remain rare. As a result, there are even fewer downstream initiatives focused on developing tailored, public interest applications built on top of open models.

Models: bottlenecks

Public AI policy goals at the model layer should center on ensuring the active development of open AI models that can be deployed for public interest uses. Under the dominant transformer paradigm, model development faces serious constraints related to compute and to the lack of sufficient, high-quality training data.

Building foundation models from scratch is not only extremely expensive but also subject to rapid obsolescence, as state-of-the-art capabilities are advancing quickly. This makes it difficult to justify large-scale public investments aimed at directly competing with commercial AI labs. Even when public compute resources are made available – as in the French government’s support of the BLOOM model through the Jean Zay supercomputer – deployment remains a major hurdle.

At the model layer, openness and access vary widely. There is a gradient of release strategies, ranging from fully open source models to API-based access to fully closed models.196 The dominant AI labs that build state-of-the-art models adopt various strategies, and although some have moved toward greater openness in the last year, none have released models that meet the definition of open source AI. At best, some share open weights but fail to disclose training data or provide transparency around training processes. Among the more open actors, DeepSeek is the only lab that consistently releases open weights for all models. Mistral and Alibaba Labs release some models in this way, and Meta has shared several models under restrictive licenses. Other companies have released only specific models – typically smaller or task-specific ones – such as Google’s Gemma and BERT, OpenAI’s Whisper or Microsoft’s Phi.

As explained previously, model development – especially among labs with limited access to compute – often involves creating derivatives from openly shared models or architectures. This leads to dependencies on major commercial players such as Meta, Mistral, Alibaba Labs or DeepSeek. In each case, incomplete transparency around training data and methods hinders further research and development.

Over the last two years, multiple open small models have been released, often developed to address linguistic or regional gaps left by major AI labs. Examples include the SEA-LION models created by AI Singapore; Aya, a family of multilingual models from Cohere; and Bielik, a Polish LLM built by the grassroots Spichlerz initiative. However, these developers typically lack the resources needed to sustain long-term development or deployment of their models, limiting the impact of these alternatives.

While models themselves are a key component facilitating further AI development, open access to additional AI model components is just as important. As noted in the Model Openness Framework197 and the Framework for Openness in Foundation Models by the Columbia Convening on Openness and AI,198 openness in AI goes beyond the architecture and parameters of individual models. It also includes the code and datasets used to train, fine-tune or evaluate models.199, 200 Their development faces similar constraints, typical for many open source projects, also beyond AI development.

196 Irene Solaiman. ibid.

…stone model” and the creation of derivative small models that are more sustainable and suited to specific needs.

Policy recommendations for the model pathway include:

Provision of a capstone model

Most so-called open models today fall short of genuine open source standards – often omitting training data, critical documentation or transparency around
Models: opportunities the training process. This undermines scientific re-
producibility and makes it difficult to audit or assess
These constraints suggest that public AI initiatives model bias. As a result, critical infrastructure is in-
need a different approach than competing directly creasingly shaped by private actors without demo-
with commercial AI development. Public AI strate- cratic oversight. While this is particularly true at the
gy should not aim to engage in an expensive race to frontier, not all cutting-edge models are developed
build the largest and most capable models – a strat- by private firms – for example, the Allen Institute for
egy that is neither realistic nor sustainable without AI’s OLMo 2 demonstrates that open and transparent
major increases in public compute capacity. In- alternatives are possible.
stead, the focus should be on fostering an ecosystem
of competitive open AI models and the components A robust public AI strategy should foster the devel-
needed to build them. opment and long-term sustainability of high-per-
formance, openly available models, ensuring a rich
The rapid pace of technological development means ecosystem of public alternatives to proprietary sys-
that the resources required to develop or deploy tems. Among these, governments should prioritize
models can shift quickly. With new AI paradigms, the the creation of at least one “capstone model” – a
demands for successful model development or de- permanently open, democratically governed model
ployment may change at any moment. For instance, that aspires to remain at or near the frontier of AI
recent advances in model distillation have made it capabilities. This model would serve not only as a
possible to create small, efficient models at relatively flagship public asset but also as a foundation for
low cost – models that can effectively compete with broad-based research, innovation and deployment.
earlier generations of large models. While multiple such models may emerge, the cap-
stone model would serve as a strategic anchor. Be-
Model-based pathways to public AI must be ground- cause open source is a global endeavor, adherence to
ed in a clear understanding of the technological open source standards should take precedence over
advances that enable more affordable yet capable the model’s country of origin.
models, along with targeted investments that sup-
port the entire open source AI ecosystem and public However, due to the speed of innovation in AI, build-
interest innovation. These pathways should support ing a state-of-the-art model carries the risk of rapid
both the development of a state-of-the-art “cap- obsolescence, as new breakthroughs by leading com-
mercial labs may quickly outpace public efforts. As

197 Matt White, et al. ibid.


discussed below, complementary options include in-
198 Adrien Basdevant, et al. ibid. vestments in small and domain-specific models as
199 Matt White, et al. ibid. well as in capabilities (i.e., human capital) in AI
200 Adrien Basdevant, et al. ibid. research and development.

69
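The distillation technique mentioned in the opportunities discussion above can be made concrete. In knowledge distillation, a small "student" model is trained to match the softened output distribution of a larger "teacher" model, rather than only the raw training labels. The sketch below is a minimal illustration of that objective in plain Python; the logits and the temperature value are toy numbers invented for this example, not taken from any system discussed in this report.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution; a higher
    temperature softens the distribution, exposing more of the
    teacher's relative preferences."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between the teacher's and student's softened
    distributions: the core objective minimized during distillation."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))

# A student that tracks the teacher's preferences incurs a lower
# loss than one that disagrees with them.
teacher = [4.0, 1.0, 0.5]
assert distillation_loss(teacher, [3.5, 1.2, 0.4]) < \
       distillation_loss(teacher, [0.2, 3.8, 1.0])
```

Because the student only needs to reproduce the teacher's output behavior, it can be far smaller than the teacher, which is what makes distilled models comparatively cheap to train and to serve.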
To function as a sustainable and reliable foundation for a broader ecosystem, the capstone model must be provisioned as a permanent public good. A public AI strategy must secure funding not only for model development but also for long-term deployment, particularly to cover inference compute costs.

Development of small and domain-specific models

Limited public compute resources can be strategically directed toward targeted interventions that deliver public value or shape market dynamics. Small language models that address specific linguistic or cultural gaps exemplify public AI initiatives designed not to compete directly in commercial markets but to create alternatives that generate public value.

For instance, the Southeast Asian Languages in One Network (SEA-LION), developed by AI Singapore, focuses on building domain-specific models tailored to Southeast Asian languages and cultural contexts – including Burmese, Chinese, English, Filipino, Indonesian, Khmer, Lao, Malay, Tamil, Thai and Vietnamese – thus addressing needs overlooked by global commercial AI development. Similarly, AINA is a project led by the Catalan government that supports the development of AI models in the Catalan language to contribute to cultural preservation.

There is also a need to conduct research into alternative development approaches beyond the transformer architecture and its scaling laws. Innovation should focus on creating less resource-intensive AI technologies. Investing in alternative model architectures is not merely a technical curiosity – it is a strategic necessity for ensuring the sustainability of AI development. By reducing the computational burden, these alternative models could lower energy demands and operational costs, making advanced AI capabilities more accessible to public institutions and research organizations.

Sustainable and open AI development ecosystem

Public AI strategies should include targeted investments in various digital public goods as key software components of AI development. Despite their critical role, many widely used software tools – from Python libraries for data preparation like pandas to machine learning libraries like scikit-learn and deep learning frameworks like PyTorch – struggle with sustainability, even as they deliver substantial public and economic value.

Governments can play a key role by funding the maintenance, security and advancement of these tools. Some have already begun: Germany’s Sovereign Tech Fund supports core Python libraries, and France’s national AI strategy committed €32 million to the further development of scikit-learn and the broader data science commons.201

There is also a growing need to invest in open-access AI safety research and open source tooling that facilitate safe and responsible development of public AI systems. Projects like Inspect (UK AI Safety Institute), Compl-AI (ETH Zurich), and ROOST (launched at the 2025 AI Action Summit) are examples of publicly oriented tools that help assess and improve AI safety, compliance and alignment.

Finally, open benchmarks are essential for measuring model capabilities and societal impact. While current benchmarks often focus on technical performance, new benchmarks should evaluate how well AI systems serve public goals, particularly within regulated industries where the stakes for public safety or general interest are high.

201 Ministère de l’Enseignement supérieur, de la Recherche et de l’Innovation. “Stratégie nationale pour l’intelligence artificielle – 2e phase.” 8 November 2021. https://www.enseignementsup-recherche.gouv.fr/sites/default/files/2021-11/dossier-de-presse---strat-gie-nationale-pour-l-intelligence-artificielle-2e-phase-14920.pdf Accessed 23 April 2025.
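The benchmark recommendation above can be grounded with a minimal example. At its core, an open benchmark is an openly licensed set of prompts, reference answers and a transparent scoring rule. The harness below is a hypothetical sketch: the two tasks and the stand-in model are invented for illustration, and a real public benchmark would publish far richer task sets together with provenance and licensing metadata.

```python
def exact_match_score(model, tasks):
    """Evaluate any callable (prompt -> answer) against an open task
    set, returning the fraction of exact matches: the simplest
    transparent benchmark metric."""
    hits = sum(1 for t in tasks if model(t["prompt"]).strip() == t["answer"])
    return hits / len(tasks)

# A toy, openly inspectable task set (invented for this example).
TASKS = [
    {"prompt": "Capital of France?", "answer": "Paris"},
    {"prompt": "2 + 2 = ?", "answer": "4"},
]

def toy_model(prompt):
    """A stand-in 'model' used only to exercise the harness."""
    return {"Capital of France?": "Paris", "2 + 2 = ?": "5"}.get(prompt, "")

print(exact_match_score(toy_model, TASKS))  # → 0.5
```

Publishing the tasks and the scoring rule together is what makes results reproducible and auditable; benchmarks aimed at public goals would extend the scoring rule beyond exact match, for example to rubric-based or safety-oriented checks.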
Additional measures

Aside from policies that build public AI infrastructures and capacities through the three pathways, several other measures are necessary for public AI policies to succeed. These are mainly aimed at supporting an ecosystem with sufficient talent and capacity to innovate, tools and resources to sustain collaboration, and the ability to build impactful solutions.

Invest in AI talent and capabilities

Investing in AI capabilities, knowledge and skills across the workforce and future generations is a cornerstone of any effective public AI strategy. These efforts not only enable the development of public generative AI models but also ensure that a wide range of stakeholders – research institutions, SMEs, public agencies and non-digital sectors – can adopt, adapt and influence AI technologies in ways that align with local needs.

Far from being secondary to model development, capability building is one of the most strategic roles governments can play, especially for smaller countries. While these countries may not lead in training state-of-the-art models, they can leverage open foundation models and build locally relevant AI applications – provided they invest in a strong base of local developers, researchers and institutions. Unlike capital-intensive model training, capability building offers sustainable, adaptable foundations for innovation that can evolve with the technology.

This requires public investment in skills, research and institutional support – from grants for public interest research agendas to funding the creation of dedicated institutes with clear mandates. For example, the UK’s AI Safety Institute, launched in 2023, combines public funding, access to national compute resources and applied safety research to support responsible AI development. Similarly, Taiwan’s AI Academy, funded by the private sector, addresses a talent gap in the workforce through training programs and industry-academic collaboration, supported by access to compute. These models demonstrate how cross-sector investments in human capital can anchor a resilient and inclusive public AI ecosystem.

Support paradigm-shifting innovation

Much of this report centers on proposing pathways to develop public generative AI, taking into account the realities of the current development paradigm, based on the transformer architecture and related scaling laws. These are the root cause of dependencies at the compute layer that make full public AI hard to attain. Public AI strategy should therefore also focus on supporting paradigm-shifting innovation. Efforts to design new AI model architectures and make AI solutions more energy-efficient are a vital element of a public AI strategy, as they could change the overall conditions for AI development by shifting dependencies in the AI stack. Changes in AI development paradigms and the underlying economics related to hardware and infrastructure could eventually enable the provision of public compute and cloud infrastructures. Such interventions would allow solutions that are today semi-public to become fully public, as dependencies on commercial hyperscalers and compute would decrease. Investments in public compute capacities, such as supercomputing centers, should be coupled with a research agenda on new, more sustainable paradigms of AI development.

Invest in software and tools for the AI ecosystem

Software has previously been defined as a distinct transversal layer that cuts across the other layers of the AI stack and plays a key role in all of them. Software is necessary to manage computing hardware, build and work with massive datasets and train or deploy generative AI models. For this reason, public AI policy should include funding for open source software development. This could take the form of a public AI infrastructure fund,202 modeled on best practices like the German Sovereign Technology Fund, which has been investing in Open Digital Infrastructure for AI;203 the provision of targeted funding through existing public funding bodies, such as the aforementioned Fund or UK Research and Innovation (UKRI);204 or direct funding for open source software through AI strategies, such as France’s funding for scikit-learn and the broader data science commons in its national AI strategy.205

Unlike hardware-focused computing investments, software development can deliver outsized impacts with relatively modest funding. Such software development supports the open source AI ecosystem, ensures collaboration and supports capacity development.

This recommendation is not limited to software but also includes other tools such as evaluation frameworks, benchmarks and development environments. It can also entail the development of novel tools, such as frameworks for democratic inputs to post-training.206 In each case, public funding of open source software development would lower barriers to entry and ensure that no new bottlenecks are created due to proprietary control over key building blocks (as in the case of CUDA). Special attention should be given to supporting underserved components of the stack, including data processing pipelines, model evaluation suites and tools supporting public accountability in deployment. Support should extend beyond initial development to include the maintenance of key software and tools, treated as digital public goods.

Build AI deployment pathways

In principle, public infrastructures generate spillover effects – also known as positive externalities – through their use by a wide range of actors. At the same time, applications and solutions built on top of these infrastructures represent a more direct realization of public AI’s value, as they can address real-world problems and deliver tangible social benefits. These applications also provide essential feedback that can inform and improve the underlying infrastructure.

Because AI is a general-purpose technology, public AI infrastructure is abstract and broadly applicable across many, if not all, spheres of life. To fulfill the potential of investments in this infrastructure, specific solutions need to be developed. Without them, there is a risk that the capacities and value embedded in public AI – such as datasets – will be captured by commercial actors and repurposed for private gain. Applications built on top of the public AI stack can address problems that are underserved by commercial AI development. The focus should therefore be on advancing public interest goals – areas where private industry lacks incentives to invest, or where there is a risk of value capture by commercial actors benefiting from first-mover advantage and network effects.207

Recent efforts to build public interest applications include GovLab’s New Commons Challenge, which promotes responsible reuse of data for AI-driven local decision-making and humanitarian response; Gooey.ai’s Workflow Accelerator, which helps organizations develop AI assistants for farmers, nurses, technicians and other frontline workers; and EarthRanger, an AI-powered wildlife conservation platform stewarded by the Allen Institute for AI.

202 Paul Keller. “European Public Digital Infrastructure Fund.” Open Future. 27 February 2023. https://openfuture.eu/publication/european-public-digital-infrastructure-fund/
203 Adriana Groh. “AI Sovereignty Starts with Open Infrastructure.” Sovereign Tech Agency. 27 February 2025. https://www.sovereign.tech/news/ai-sovereignty-open-infrastructure/
204 Tom Milton, Cailean Osborne, Matt Pickering. “A UK Open-Source Fund to Support Software Innovation and Maintenance.” Centre for British Progress. 17 April 2024. https://britishprogress.org/uk-day-one/a-uk-open-source-fund-to-support-software-innovati
205 “Stratégie nationale pour l’intelligence artificielle – 2e phase.” 8 November 2021. https://www.enseignementsup-recherche.gouv.fr/sites/default/files/2021-11/dossier-de-presse---strat-gie-nationale-pour-l-intelligence-artificielle-2e-phase-14920.pdf Accessed 23 April 2025.
206 “A Roadmap to Democratic AI.” The Collective Intelligence Project. March 2024. https://www.cip.org/research/ai-roadmap
207 Nik Marda, Jasmine Sun and Mark Surman, ibid.
Coda: mission-driven public AI policy

The characteristics of public AI outlined in chapter 3 remain constant regardless of an AI system’s architecture or capabilities. Public AI should be open and accessible, create public value and remain under public control. In this sense, the vision of public AI is intentionally technologically agnostic.

However, public AI policy must also offer clarity on the kinds of technologies needed to serve the public interest. This requires the policy debate to engage – at least to some degree – with fundamental questions about the types of AI systems being developed and deployed.

Specifically, public AI policy needs to reckon with the idea of Artificial General Intelligence (AGI), a vision promoted by many leading commercial AI labs. The fuzzy and controversial term is typically used to describe AI systems that equal or surpass human intelligence. OpenAI, for instance, defines AGI as “highly autonomous systems that outperform humans at most economically valuable work.”208 Policymakers should approach the concept of AGI with caution, as it often feeds into hype cycles and fosters a sense of technological determinism.209

An alternative is to treat AI as cultural technologies210 – or “normal AI”:211 technologies that may fundamentally shape our societies without necessarily exhibiting superhuman capabilities.

The public AI strategy proposed in this report includes the development of a state-of-the-art foundation model. While avoiding the hype driving much of commercial AI development, public AI policy should be grounded in a careful analysis of both the demand for and supply of AI capabilities. In this case, provisioning a robust, state-of-the-art model as an open source technology would fill a critical market gap, as commercial models typically offer, at best, open weights without transparent documentation on training data or development processes.

As outlined in the model pathway above, we recommend supporting both the creation of a public foundation model and the development of various small models. This two-pronged approach fosters an open source AI ecosystem that benefits from a central, capable model while also investing in more specialized, sustainable solutions. The exact balance between these two directions should be guided by a more detailed analysis of both the economics of AI development and the needs of the public.

Currently, AI strategies often lack specificity on either the societal needs AI should address or the technical capabilities required to meet them. Instead, public investments in AI are frequently motivated by a generalized belief that industrial policy must support disruptive innovation as a remedy for economic stagnation. Too often, these investments follow the demands of industry, rather than focusing on the real, everyday needs of citizens.212

Proposals for public investment in computing power illustrate this problem. These often lack a clear analysis or justification for the types of AI systems to be developed or the computing resources required. As a result, public AI policy risks replicating the same unsustainable “more is better” investment logic that drives much of the commercial AI sector. A shift toward building AI as public digital infrastructure – guided by the principles outlined in this report – offers a way to avoid these pitfalls and align AI development with the public good.

208 Lauren Leffer. “In the Race to Artificial General Intelligence, Where’s the Finish Line?” Scientific American. 25 June 2024. https://www.scientificamerican.com/article/what-does-artificial-general-intelligence-actually-mean/
209 Zuzanna Warso. “The Digital Innovation We Need. Three lessons on EU Research and Innovation funding.” Open Future. 12 November 2024. https://openfuture.eu/publication/the-digital-transformation-we-need/
210 Henry Farrell, et al. “Large AI models are cultural and social technologies.” Science. 13 March 2025, Vol 387, Issue 6739, pp. 1153-1156. https://www.science.org/stoken/author-tokens/ST-2495/full
211 Arvind Narayanan and Sayash Kapoor. ibid.
212 Zuzanna Warso and Meret Baumgartner. “Putting money where your mouth is? Insights into EU R&I funding for digital technologies.” Critical Infrastructure Lab. 2025. https://openfuture.eu/publication/putting-money-where-your-mouth-is/
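The missing justification for compute proposals that the Coda points to can begin with a back-of-the-envelope estimate. A widely used rule of thumb approximates dense transformer training compute as roughly 6 FLOPs per parameter per training token. The sketch below applies it; the per-GPU throughput and utilization figures are illustrative assumptions, not hardware recommendations.

```python
def training_flops(params, tokens):
    """Rule-of-thumb training compute for a dense transformer:
    ~6 FLOPs per parameter per training token (forward + backward)."""
    return 6 * params * tokens

def gpu_hours(flops, peak_flops_per_gpu=1e15, utilization=0.4):
    """Convert a FLOP budget into GPU-hours, assuming a hypothetical
    1 PFLOP/s accelerator sustained at 40% utilization."""
    return flops / (peak_flops_per_gpu * utilization) / 3600

# Example: a 7-billion-parameter model trained on 2 trillion tokens.
budget = training_flops(7e9, 2e12)
print(f"{budget:.1e} FLOPs, about {gpu_hours(budget):,.0f} GPU-hours")
# → 8.4e+22 FLOPs, about 58,333 GPU-hours
```

Even this crude arithmetic forces a proposal to name the model size, data volume and hardware assumptions behind a compute request, which is precisely the specificity the report finds lacking.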
Address | Contact

Bertelsmann Stiftung
Carl-Bertelsmann-Straße 256
33311 Gütersloh
Phone +49 5241 81-0
www.bertelsmann-stiftung.de

Dr. Felix Sieker


Project Manager
Digitalization and the Common Good
Phone +49 30 275788-156
felix.sieker@bertelsmann-stiftung.de
