Public AI 2025
Legal notice

Publisher
Bertelsmann Stiftung
Carl-Bertelsmann-Straße 256
33311 Gütersloh
Phone +49 5241 81-0
www.bertelsmann-stiftung.de

© Bertelsmann Stiftung, Gütersloh
May 2025

Supported by
Open Future

Authors
Dr. Felix Sieker, Bertelsmann Stiftung
Dr. Alek Tarkowski, Open Future
Lea Gimpel, Digital Public Goods Alliance
Dr. Cailean Osborne, Oxford Internet Institute

Responsible
Dr. Felix Sieker, Bertelsmann Stiftung

Editing
Barbara Serfozo, Berlin

Rights
The text of this publication is licensed under the Creative Commons Attribution 4.0 International License. You can find the complete license text at: https://creativecommons.org/licenses/by/4.0/legalcode.en

The infographics are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. You can find the complete license text at: https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode.en

The visualizations are not meant to be exhaustive. All logos are excluded, as they are protected by copyright, not covered by the above-mentioned CC license, and may not be used.

Recommended citation style
Sieker/Tarkowski/Gimpel/Osborne (2025). Public AI – White Paper. Bertelsmann Stiftung. Gütersloh.
Reviewer list
Albert Cañigueral, Barcelona Supercomputing Center
Amin Oueslati, The Future Society
Ben Burtenshaw, Hugging Face
Brandon Jackson, Public AI Network
Huw Roberts, Oxford Internet Institute, University of Oxford
Isabel Hou, Taiwan AI Academy
Jakob Mökander, Digital Ethics Center, Yale University
Jennifer Ding, Boundary Object Studio
Laura Galindo, AI policy expert
Luca Cominassi, AI policy expert
Martin Hullin, Bertelsmann Stiftung
Martin Pompéry, SINE Foundation
Marta Ziosi, Oxford Martin AI Governance Initiative, University of Oxford
Paul Keller, Open Future
Paul Sharratt, Sovereign Tech Agency
Ravi Iyer, USC Marshall
Yacine Jernite, Hugging Face
Zoe Hawkins, Tech Policy Design Institute
Table of contents
Preface 6
Executive summary 8
Glossary 11
1 | Introduction 13
Preface
Artificial Intelligence stands at a pivotal crossroads. While its potential to transform society is immense, the power to shape its trajectory is becoming increasingly concentrated. Today, a small number of dominant technology firms hold sway not only over the most advanced AI models but also the foundational infrastructure – compute capacity, data resources and cloud platforms – that makes these systems possible. This consolidation of influence represents more than a market imbalance; it poses a direct threat to the principles of openness, transparency and democratic accountability.

When only a handful of actors define how AI systems are built and used, public oversight erodes. These systems increasingly reflect the values and economic incentives of their creators, often at the expense of inclusion, accountability and democratic oversight. Without intervention, these trends risk entrenching structural inequities and shrinking the space for alternative approaches.

This white paper outlines a strategic countervision: Public AI. It proposes a model of AI development and deployment grounded in transparency, democratic governance and open access to critical infrastructure. Public AI refers to systems that are accountable to the public, where foundational resources such as compute, data and models are openly accessible and every initiative serves a clearly defined public purpose.

Grounded in a realistic analysis of the constraints across the AI stack – compute, data and models – the paper translates the concept of Public AI into a concrete policy framework with actionable steps. Central to this framework is the conviction that public AI strategies must ensure the continued availability of at least one fully open-source model with capabilities approaching those of proprietary state-of-the-art systems. Achieving this goal requires three key actions: coordinated investment in the open-source ecosystem, the provision of public compute infrastructure, and the building of a robust talent base and institutional capacity.

To guide implementation, the paper introduces the concept of a “gradient of publicness” to AI policy – a tool for assessing and shaping AI initiatives based on their openness, governance structures, and alignment with public values. This framework enables policymakers to evaluate where a given initiative falls on the spectrum from private to public and to identify actionable steps to increase public benefit.

We extend our sincere thanks to Alek Tarkowski, Lea Gimpel and Cailean Osborne for their valuable insights and contributions to this work.
Martin Hullin
Director
Digitization and the Common Good
Bertelsmann Stiftung
Executive summary
Today’s most advanced AI systems and foundation models are largely proprietary and controlled by a small number of companies. There is a striking lack of viable public or open alternatives. This gap means that cutting-edge AI remains in the hands of a select few, with limited orientation toward the public interest, accountability or oversight.

Public AI is a vision of AI systems that are meaningful alternatives to the status quo. In order to serve the public interest, they are developed under transparent governance, with public accountability, equitable access to core components (such as data and models), and a clear focus on public-purpose functions.

In practice, public AI projects ensure that the public has both insight into and influence over how AI systems are built and used. They aim to make the key building blocks – data, open source software and open models – accessible to all on fair terms. Crucially, public AI initiatives are oriented toward broad societal benefit, rather than private gain.

Over the past year, momentum behind Public AI proposals has been steadily growing, with a series of influential reports and initiatives by the Public AI Network, Mozilla and the Vanderbilt Policy Accelerator demonstrating the importance of this approach. Even more importantly, various initiatives are developing components – and whole AI systems – that fulfill this vision of Public AI.

This white paper builds on earlier proposals for Public AI and is aimed at policymakers and funders, with the goal of helping to turn the vision of Public AI into reality. In particular, it advances this timely conversation by making the following two novel contributions.

A vision for Public AI grounded in the reality of the AI stack

A vision for public AI needs to take into account today’s constraints at the compute, data and model layers of the AI stack, and offer actionable steps to overcome these limitations. This white paper offers a clear overview of AI systems and infrastructures conceptualized as a stack of interdependent elements, with compute, data and models as its core layers. It also identifies critical bottlenecks and dependencies in today’s AI ecosystem, where dependency on dominant or even monopolistic commercial solutions constrains the development of public alternatives. It highlights the need for policy approaches that can orchestrate resources and various actors across layers, rather than attempting complete vertical integration of a publicly owned solution.

To achieve this, it proposes three core policy recommendations:

1. Develop and/or strengthen fully open source models and the broader open source ecosystem

2. Provide public compute infrastructure to support the development and use of open models

3. Scale investments in AI capabilities to ensure that sufficient talent is developing and adopting these models
In order to achieve this, complementary pathways for Public AI development need to be pursued, focused on the three core layers of the AI stack: compute, data and models:

1. Compute Pathway: It focuses on providing strategic public computing resources, particularly supporting open-source AI development. Key recommendations include ensuring computing access for fully open projects, expanding compute for research institutions, and improving coordination between public compute initiatives.

2. Data Pathway: It emphasizes creating high-quality datasets as digital public goods through commons-based governance. This includes developing datasets as publicly accessible resources while protecting against value extraction, and establishing public data commons with appropriate governance mechanisms.

3. Model Pathway: It centers on fostering an ecosystem of fully open source AI models, including both a state-of-the-art “capstone model” and specialized smaller models. The strategy emphasizes building sustainable open source AI development capabilities rather than simply competing with commercial labs.

Several additional measures are highlighted that do not fit within one of the three pathways, but help secure key public interest goals. These include investing in AI talent and capabilities to develop and deploy AI systems in the public interest, supporting paradigm-shifting innovation toward more efficient technologies, funding open-source software and tools, and building effective deployment pathways for public AI applications.

This approach acknowledges the importance of the various layers and the different paths that can be pursued to attain Public AI. It also argues for coordinated interventions across the entire AI stack, orchestrated by new public institutions capable of managing decentralized AI development ecosystems.

The “gradient of publicness”: A framework for Public AI

The white paper also offers a “gradient of publicness” framework, rooted in Public Digital Infrastructure principles. This framework can guide decision-making around investments in AI infrastructure and help increase public value while acknowledging existing constraints and limitations to building fully Public AI.

This framework maps AI interventions along a continuum – from fully public to fully private – based on their attributes (e.g. accessibility, openness, interoperability), functions (e.g. enabling social or economic goals) and modes of control (e.g. democratic governance and accountability). It serves as both a diagnostic and strategic tool for assessing where an intervention falls along this continuum, and for identifying interventions that could strengthen its public value.

The gradient of publicness consists of the following six distinct levels, each representing different degrees of public attributes, functions and control:

Level 1: Commercial provision of AI components with public attributes
Commercial entities develop and share open source components (e.g., Meta’s open-sourcing of PyTorch) with high public accessibility but limited public function and control.

Level 2: Commercial AI infrastructure with public attributes and functions
Privately controlled platforms like Hugging Face Hub that democratize access to AI tools while maintaining commercial oversight but serving public interest goals.

Level 3: Public computing infrastructure
Government-funded supercomputers and data centers (e.g., EU AI Factories) that provide computing resources through public-private partnerships with moderate to high public control.
Level 4: Public provision of AI components
Publicly funded datasets, benchmarks and tools (e.g., Mozilla’s Common Voice) developed specifically as digital public goods with high public control and clear public functions.

Level 5: Full-stack public AI infrastructure built with commercial compute
AI systems like the OLMo model by the Allen Institute for AI that are fully open source but rely on commercial computing infrastructure, limiting public control at the compute layer.

Reading Guide

If you are a policymaker or funder seeking concrete policy or funding guidance:

• Begin with the Introduction, then focus on chapter 4 for the gradient of publicness framework and chapter 5 for specific policy recommendations.
Glossary
1 | Introduction
Artificial Intelligence (AI) stands as one of the most prominent and potentially transformative technologies of the past decade. With the rapid ascent of new industry leaders like OpenAI and Anthropic, alongside a strategic pivot toward AI by incumbents such as Microsoft, Google and Meta, AI is increasingly shaping the future of sectors ranging from healthcare and finance to education and government. At the same time, critics warn that AI may be yet another hype-driven, extractive and unsustainable technology lacking a clear social purpose.

Public debate often swings between uncritical enthusiasm for AI and deep concern over its existential risks. Despite these polarized narratives, the underlying reality is that AI technologies are becoming deeply embedded in social, economic and political systems. Hype or not, AI is likely to remain a lasting force – making it all the more urgent to shape its development around clear public values.

In late 2022, just as commercial AI labs were launching the first public-facing applications based on generative AI models, Mariana Mazzucato and Gabriela Ramos published an op-ed arguing for public policies and institutions “designed to ensure that innovations in AI are improving the world.” They warned that a new generation of digital technologies is instead being “deployed in a [policy] vacuum.” According to Mazzucato and Ramos, public interventions are essential to steer the technological revolution in a direction that turns technical innovation into outcomes that serve the public interest.1

Emerging concentrations of power

Three years on, concentrations of power in AI have only deepened. A small group of dominant technology companies now control not only most state-of-the-art AI models, but also the foundational infrastructure that shapes the field. This emerging AI oligopoly builds on existing monopolies in cloud computing and digital platforms, reinforcing the dominance of hyperscalers and platform giants.

This concentration is not merely structural – it has far-reaching consequences. When a handful of actors define how AI is built and deployed, the benefits are captured by the few, while the risks are borne by the public. AI systems increasingly reflect the values, incentives and worldviews of their creators, often at the expense of public inclusion, accountability and democratic oversight.

Compounding the issue is the rapid pace of AI development, characterized by two destabilizing trends. First, corporate competition has triggered a relentless race toward ever more capable AI models, often framed as a pursuit of so-called “superintelligence.” As a result, billions are being invested in systems designed to achieve commercial dominance, often with little regard for societal benefit or safety. These investments in compute infrastructure frequently lack a clear articulation of the public needs they are meant to address, and can have significant environmental costs.

Second, geopolitical shifts are pushing states to treat AI as a zero-sum game. Governments are engaging in AI nationalism – walling off innovation behind borders in the name of digital sovereignty – rather than fostering international cooperation. Sovereign AI strategies often mirror the priorities of dominant commercial actors, focusing heavily on large-scale investments in compute infrastructure under the assumption that public benefits will follow. However, this approach risks entrenching existing power asymmetries and deepening dependencies on the providers of critical AI components.

As the AI Now Institute has observed, current AI policy tends to “conflate public benefit with private sector success,”2 leading to strategies that prioritize industrial competitiveness over accountability, transparency or equitable access. Without a course correction, this trajectory could lock in structural inequities and limit democratic control over foundational AI systems.

The need for a countervision: Public AI

In response to this growing imbalance, interest is building around alternatives to the dominant commercial AI paradigm – a concept broadly described as public AI. While the definition of public AI remains fluid, it generally refers to AI systems developed with transparent governance, public accountability, open and equitable access to core components such as data and models, and clearly defined public-purpose functions.

A central pillar of public AI is the development of fully open source models, which involves releasing the trained parameters (i.e., weights and biases) of models – commonly referred to as open models – along with the code, data and documentation used in their creation under open source licenses. Open models ensure that foundational AI systems are accessible, inspectable and modifiable by a wide range of actors, including researchers, public institutions and civil society. Unlike proprietary models, fully open source models offer transparency into model weights and training data, allowing for reproducibility and independent auditing. They also lower barriers to experimentation and adaptation, especially for academic researchers, startups and non-commercial applications.

While open source models do not eliminate market concentration, they can reduce dependency on a few dominant industry players and allow public interest applications to be developed without restrictive licenses or opaque constraints. In this way, open source AI serves as critical infrastructure for aligning AI development with democratic values and public goals.

Moreover, visions for public AI emphasize the need to strengthen the broader ecosystem of AI infrastructure and tools. Without a sustainable open source AI ecosystem, efforts to keep AI development transparent and aligned with the public interest will remain fragile and limited.

Two urgent realities demand immediate action. First, the window for intervention is closing. As a handful of corporate and state actors consolidate control over critical infrastructure and resources – such as compute, datasets and talent – the cost of building alternatives continues to rise. Second, the stakes are global. AI’s impact on labor, healthcare, education, the environment and democracy transcends borders. Without proactive measures, its benefits will accrue to a privileged minority, while the risks – from disinformation to algorithmic bias – will disproportionately affect marginalized groups.

1 Mariana Mazzucato and Gabriela Ramos. “AI in the Common Interest.” Project Syndicate, 26 Dec. 2022. https://www.project-syndicate.org/commentary/ethical-ai-requires-state-regulatory-frameworks-capacity-building-by-gabriela-ramos-and-mariana-mazzucato-2022-12
The development of publicly controlled, state-of-the-art open source AI models should be a central goal of public AI policies. These models provide a public alternative to commercial offerings and help foster an ecosystem in which smaller, complementary models can thrive.

The benefits and risks of open source AI development have been the subject of intense policy debate in recent years. While early discussions focused largely on the safety risks of open source AI systems, there is growing support for such models and increasing recognition of their potential to democratize AI development.

However, the term “open source AI” is currently used to describe models with varying degrees of openness. In some cases, “open washing” occurs – when models that are not truly open are marketed as meeting open source standards.3 Public AI policy must therefore be grounded in a clear and consistent definition of what constitutes open source AI. The Open Source Initiative has proposed a binary definition: AI systems released under licenses that permit use, study, modification and redistribution. This definition, however, has not yet been widely adopted.4 Alternatively, Irene Solaiman of Hugging Face has conceptualized openness as a gradient – from fully closed to fully open.

In this white paper, we focus on open models – that is, models in which both the architecture and parameters (i.e. weights and biases) are released under permissive licenses. It is important to note, however, that AI models are made up of several components beyond architecture and weights. For example, the Model Openness Framework5 breaks models down into 16 components spanning code, data and documentation, and outlines three tiers of model openness depending on how fully these components are shared under permissive licenses.

Today, many models described as “open” or “open source” share only a limited set of components. In most cases, only model parameters are released, accompanied by some documentation. While this allows for reuse, it offers insufficient transparency into the training data and the development process.6 These models are more accurately described as open-weight models.7 In this white paper, we use the generic term open model to refer to a broad spectrum of models released under open terms, including commercial open-weight models from companies such as Mistral or DeepSeek.

At the same time, a key recommendation of this white paper is the development of fully open source AI models, which we define as the complete release of model parameters, architecture, code, datasets and associated documentation under open source licenses. Currently, very few state-of-the-art models meet this definition – among them, OLMo 2 by the Allen Institute for AI.

3 Sarah Kessler. “Openwashing.” New York Times. 19 May 2024. https://www.nytimes.com/2024/05/17/business/what-is-openwashing-ai.html
4 Open Source Initiative. “Open Source AI Definition.” https://opensource.org/ai Accessed 27 April 2025.
5 Model Openness Framework. https://isitopen.ai/ Accessed 27 April 2025.
6 Zuzanna Warso, Paul Keller and Max Gahntz. “Towards Robust Training Data Transparency.” Open Future. 19 June 2024. https://openfuture.eu/publication/towards-robust-training-data-transparency/
7 “Open weights: not quite what you‘ve been told.” Open Source Initiative. https://opensource.org/ai/open-weights Accessed 23 April 2025.
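The component-based view of openness described above can be sketched as a simple checklist. The component names and buckets below are simplifying assumptions made for this sketch; they are not the actual 16 components or three tiers of the Model Openness Framework:

```python
# Illustrative sketch of a component-based openness check, in the spirit of
# the Model Openness Framework. The component names and buckets below are
# assumptions for illustration, not the framework's actual 16 components.

FULLY_OPEN_COMPONENTS = {
    "weights", "architecture", "code", "training_data", "documentation",
}

def classify_release(openly_shared):
    """Roughly bucket a model release by which components are openly shared."""
    shared = set(openly_shared)
    if FULLY_OPEN_COMPONENTS <= shared:
        return "fully open source"
    if "weights" in shared:
        return "open-weight"
    return "closed"

# Parameters plus some documentation: the common case described in the text
print(classify_release({"weights", "documentation"}))  # → open-weight

# Complete release of all components, as with OLMo 2
print(classify_release(FULLY_OPEN_COMPONENTS))  # → fully open source
```

The point of the sketch is that openness is a property of each component of a release, not of the model as a whole; the real framework draws much finer distinctions.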
2 | Technical primer: What are AI technologies and how do they work?
Figure 1 | AI terminology: generative AI is a subset of deep learning, which is a subset of machine learning, which in turn is a subset of artificial intelligence.
…ments and international institutions in the development of AI governance frameworks.11

While we adopt the OECD’s definition as a general framing of AI, this report focuses on machine learning (ML) approaches to AI, and in particular the contemporary paradigm of generative AI. In this section, we define these technical concepts before discussing them in greater detail in the sections that follow.

Machine learning (ML) is central to contemporary AI and refers to the development of computer programs – specifically, statistical models – that learn from data rather than relying on explicitly programmed instructions. Tom Mitchell defines ML as follows: “a computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.”12 ML encompasses a variety of statistical learning approaches, including supervised learning (learning from labeled data), unsupervised learning (identifying patterns in unlabeled data), and reinforcement learning (learning through interaction with an environment).

Deep learning is a branch of ML that uses artificial neural networks with multiple layers – hence “deep” – to detect patterns in raw data.13 Deep learning is especially effective for handling unstructured data such as text or images, making it popular in fields like natural language processing and computer vision. It also plays a key role in generative AI applications, as described below.

Generative AI refers to the application of deep learning techniques to build models that, given an input – typically a natural language prompt – can generate novel outputs such as text, images, audio or code, without being explicitly programmed for each task.14 Recent advances in generative AI have been driven by the emergence of transformer-based architectures and scaling laws, which we explain in more detail below, as well as the rise of user-facing tools, such as OpenAI’s ChatGPT.15 Generative AI includes both unimodal models, such as language or vision models, and multimodal models, which can process and generate multiple types of data – such as text, images or audio – either as inputs or outputs.

Foundation models represent a major category within generative AI. They are characterized by their large scale (with up to trillions of parameters), training on vast and diverse datasets and adaptability to a wide range of downstream applications.16 Foundation models can be either unimodal or multimodal. For example, OpenAI’s GPT-3 is a unimodal foundation model focused solely on text, while GPT-4o is a multimodal model capable of processing and generating text, audio, images and video. While the terms “large language model” (LLM) and “foundation model” are often used interchangeably, LLMs are technically a subset of foundation models that specialize in language processing. Foundation models more broadly include systems focused on other modalities, such as vision or audio, or combinations of them.

Although generative AI is currently receiving significant attention – with high-profile tools like ChatGPT and Claude capturing public attention – it is important to note that the most common AI use cases still rely on more traditional ML approaches, such as regression models, random forests and clustering algorithms. According to the AI Mapping 2025 report, which studied 750 French AI startups, ML remains the most widely used AI technique (28%), followed by deep learning (20%) and generative AI (15%).17 The relative popularity of such ML approaches is reflected in the download statistics of widely used open source software libraries for AI. For example, scikit-learn, a Python library that implements ML algorithms and is known as “the Swiss army knife for ML,” is downloaded up to 3 million times per day.18, 19 It is followed by PyTorch, a popular framework for training deep learning models, and transformers, a library for accessing and fine-tuning models hosted on Hugging Face Hub, which are both downloaded up to 1.5 million times per day.20, 21

13 Yann LeCun, et al. “Deep Learning.” Nature, vol. 521, no. 7553, May 2015, pp. 436–44. https://doi.org/10.1038/nature14539 Accessed 3 April 2025.
14 McKinsey. “What Is ChatGPT, DALL-E, and Generative AI?” https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai Accessed 23 April 2025.
15 OpenAI. “Introducing ChatGPT.” https://openai.com/index/chatgpt/ Accessed 13 March 2024.
16 Rishi Bommasani, et al. On the Opportunities and Risks of Foundation Models. arXiv:2108.07258, 12 July 2022. https://doi.org/10.48550/arXiv.2108.07258
17 France Digitale. “Mapping des Startups Françaises de l’IA.” https://mappings.francedigitale.org/ia-2025 Accessed 23 April 2025.
18 PyPI Stats. “scikit-learn.” https://pypistats.org/packages/scikit-learn Accessed 23 April 2025.
19 Inria. “The 2019 Inria-French Academy of Sciences-Dassault Systèmes Innovation Prize: Scikit-Learn, a Success Story for Machine Learning Free Software.” https://www.inria.fr/en/2019-inria-french-academy-sciences-dassault-systemes-innovation-prize-scikit-learn-success-story Accessed 23 April 2025.
20 PyPI Stats. “transformers.” https://pypistats.org/packages/transformers Accessed 23 April 2025.
21 PyPI Stats. “torch.” https://pypistats.org/packages/torch Accessed 23 April 2025.
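Mitchell’s definition of learning can be made concrete with a toy example. The nearest-centroid classifier below is a hypothetical minimal learner written for this sketch (it is not drawn from any of the libraries mentioned above): its performance (P) at a labeling task (T) improves as its training experience (E) grows:

```python
# Toy illustration of Mitchell's definition of learning: performance (P) at
# a task (T) improves with experience (E). A nearest-centroid classifier
# labels numbers as "small" or "large" by comparing them to per-class means.

def train(examples):
    """Compute the mean (centroid) of each class from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def accuracy(centroids, test_set):
    """Performance measure P: fraction of correct predictions on task T."""
    return sum(predict(centroids, v) == y for v, y in test_set) / len(test_set)

test_set = [(1, "small"), (4, "small"), (7, "large"), (9, "large")]

# Experience E1: the learner has only ever seen small numbers
e1 = train([(1, "small"), (3, "small")])
# Experience E2: more, better-spread examples covering both classes
e2 = train([(1, "small"), (3, "small"), (8, "large"), (10, "large")])

print(accuracy(e1, test_set))  # → 0.5
print(accuracy(e2, test_set))  # → 1.0
```

With only small numbers in its experience, the learner predicts “small” for everything (accuracy 0.5); with richer experience its accuracy rises to 1.0, which is exactly the improvement-with-experience that Mitchell’s definition describes.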
Generative AI models are increasingly seen as potential general-purpose technologies (GPTs) – foundational innovations that reshape society and the economy across multiple sectors, much like the printing press, electricity, computers or the internet.22 More broadly, ML techniques are defined by their versatility and capacity to generate innovation across domains, thanks to their core ability to identify patterns and make predictions from data. As such, ML represents a paradigm shift from earlier single-purpose AI systems. At the same time, these models can be easily adapted for domain-specific applications without requiring major redesigns.

This dual nature – general-purpose but also highly adaptable – sets ML apart from other historical GPTs and adds complexity to policy debates. Policymakers must grapple with this duality when designing frameworks that support both broad societal benefits and context-specific applications of ML and generative AI technologies.
Figure 2 | A simple neural network with an input layer, hidden layers and an output layer, shown with illustrative connection weights and output scores for the letters A, B and C
The deep learning paradigm

Various ML paradigms have been developed to train models to learn from data and perform specific tasks. This ability to train models on data, and the applications that can be built on that basis, are the key technological capacities that public AI policies aim to secure for the public interest. In this section, we focus on deep learning approaches, which involve the use of neural networks. Deep learning has become a central paradigm for generative AI development. It gained significant momentum following a breakthrough research paper published in 2006,23 and the pioneering application of that research in the development of AlexNet in 2012.24 As the paradigm evolved over the last decade, three types of resources emerged as particularly critical to its success: compute, data and model architectures (including model size).

The underlying design of neural networks is based on computational model architectures inspired by biological neurons in the brain. These networks consist of interconnected artificial neurons organized into layers that process and transform data through a series of mathematical operations. Each network has an input layer that receives raw data, hidden layers that process this information and an output layer that produces results. The “deep” in deep learning refers to the presence of multiple hidden layers between the input and output layers, which enables the model to learn increasingly complex and abstract representations of data.

Figure 2 shows an example of a simple neural network capable of recognizing letters of the alphabet. The network learns by adjusting the weights – the strengths of the connections between neurons – through a process called backpropagation. In this process, the network makes predictions, compares them to correct answers, calculates the error and updates the weights to reduce this error.25 Importantly, the total number of weights – often expressed as 2B, 7B or 40B (to indicate billions of parameters) – determines the size of the model and plays a crucial role in determining its capabilities and effectiveness. This architecture has proven powerful for complex data types like images, text and audio, enabling major breakthroughs in fields from computer vision to natural language processing.

Neural networks were first developed in the 1950s and, while promising, were long constrained by two major limitations: insufficient computing power and limited access to data. Over the following decades, the development of machine learning technologies – including recent advances in generative AI – depended on securing these key resources. The field experienced several cycles of enthusiasm and disappointment until three key developments converged in the 2010s:

• Model architectures: The development of novel architectures, such as convolutional neural networks, initially used for computer vision and later adopted for machine translation. Ultimately, the transformer architecture enabled the emergence of generative AI applications.

• Compute: Advances in compute infrastructure, including the use of Graphics Processing Units (GPUs) for machine learning and the development of the CUDA programming platform, allowed for the massive parallel computations required for AI training.

• Data: The creation of large labeled datasets enabled a data-centric approach to AI and reinforced the dominance of supervised learning. Many of these datasets were built from publicly available or openly shared web content, highlighting how open data can facilitate the collection and curation of training examples.

23 Geoffrey E. Hinton, et al. “A Fast Learning Algorithm for Deep Belief Nets.” Neural Computation, vol. 18, no. 7, July 2006, pp. 1527–54. https://doi.org/10.1162/neco.2006.18.7.1527
24 Alex Krizhevsky, et al. “ImageNet Classification with Deep Convolutional Neural Networks.” Advances in Neural Information Processing Systems, vol. 25, Curran Associates, Inc., 2012. https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html
25 David E. Rumelhart, et al. “Learning Representations by Back-Propagating Errors.” Nature, vol. 323, no. 6088, Oct. 1986, pp. 533–36. https://doi.org/10.1038/323533a0
21
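The weight adjustment at the heart of this learning process can be sketched in a few lines of code. The toy network below – two inputs, one hidden layer, one output – learns the XOR function by backpropagation. It is an illustrative sketch, not production code; the network size, learning rate and training length are arbitrary choices made for this example.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
# Each weight row carries an extra entry for the neuron's bias.
HIDDEN = 3
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
w_out = [random.uniform(-1, 1) for _ in range(HIDDEN + 1)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(sum(w_out[i] * h[i] for i in range(HIDDEN)) + w_out[HIDDEN])
    return h, y

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

def total_error():
    return sum((forward(x)[1] - t) ** 2 for x, t in data)

error_before = total_error()

lr = 0.5  # learning rate
for _ in range(20000):
    for x, target in data:
        h, y = forward(x)
        # Backpropagation: work out how much each weight contributed to
        # the error, then nudge every weight in the opposite direction.
        d_y = (y - target) * y * (1 - y)
        d_h = [d_y * w_out[i] * h[i] * (1 - h[i]) for i in range(HIDDEN)]
        for i in range(HIDDEN):
            w_out[i] -= lr * d_y * h[i]
            w_hidden[i][0] -= lr * d_h[i] * x[0]
            w_hidden[i][1] -= lr * d_h[i] * x[1]
            w_hidden[i][2] -= lr * d_h[i]
        w_out[HIDDEN] -= lr * d_y

error_after = total_error()
print(f"squared error before training: {error_before:.3f}, after: {error_after:.3f}")
```

Scaled up by many orders of magnitude – billions of weights rather than a dozen or so – this same adjust-and-repeat loop is what the training of modern deep learning models amounts to, which is why compute and data dominate its costs.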
Technical primer: What are AI technologies and how do they work?
A pivotal moment came in 2012 with AlexNet, a deep neural network that significantly outperformed existing methods in image recognition. Its success demonstrated that deep learning models could achieve breakthrough results when trained on large datasets using sufficient computational power. This marked the beginning of the modern deep learning era.

Critical analyses of ImageNet revealed that the labeling process introduced multiple forms of bias, which went unaddressed even as the dataset became foundational to AI research.28 Similar problems applied to other computer vision datasets built at that time, and these challenges have continued with the creation of training datasets for generative AI models.29
The attention mechanism starts by converting each token – a word or subword unit – into a high-dimensional vector. It then computes three matrices, known as queries, keys and values, from these vector tokens. These matrices help the model determine how information flows between tokens, allowing it to capture both short-range and long-range dependencies in the text. This architecture enables models to learn increasingly sophisticated language patterns.

Previous approaches to language processing relied primarily on techniques like recurrent neural networks (RNNs), which processed text sequentially (i.e., one word at a time). Consider the sentence:

"Watching the water flow gently, she sat down by the bank."

RNN-based models would process the sentence word by word. By the time the model reaches the word "bank," it might struggle to recall earlier words like "water" and "flow" because RNNs tend to forget information as the sequence grows longer. While more sophisticated architectures like long short-term memory networks (LSTMs) improved on basic RNNs, they still faced fundamental limitations in processing long sequences.

The key innovation of transformers was the introduction of the attention mechanism, which enables the model to process all words in a sequence simultaneously and compute their relationships. For instance, in the sentence:

"The water continued to flow steadily, gradually eroding the bank."

The attention mechanism allows the model to recognize that "water" and "flow" are highly relevant for interpreting the meaning of "bank," helping it infer that the term refers to a riverbank rather than a financial institution – even though the relevant words appear several tokens earlier. When processing each word, the model calculates attention scores to determine how much weight to assign to every other word in the sequence.

It is important to note that transformer architectures come in different variants that use the attention mechanism in task-specific ways. Encoder-only models process entire inputs at once to build rich contextual representations (as in the example above), making them well suited for tasks such as text classification and comprehension. Decoder-only models generate outputs one token at a time, attending to previously generated tokens, and are used in text generation and completion systems. Encoder-decoder models combine both approaches by first encoding the full input before the decoder produces an output, enabling tasks like translation, summarization and question answering.
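The attention computation described above can be illustrated with a toy example. The code below implements scaled dot-product attention over a handful of hand-made three-dimensional word vectors. The vectors are invented for illustration – real models learn separate query, key and value projections over vectors with thousands of dimensions – but the mechanics are the same: dot products become scores, and a softmax turns scores into weights.

```python
import math

def softmax(scores):
    exps = [math.exp(s - max(scores)) for s in scores]
    return [e / sum(exps) for e in exps]

# Hand-made toy vectors (invented for this illustration). Tokens related
# to rivers point in similar directions, so their dot products are large.
vectors = {
    "water":   [0.9, 0.1, 0.0],
    "flow":    [0.8, 0.2, 0.0],
    "eroding": [0.4, 0.5, 0.1],
    "the":     [0.0, 0.0, 1.0],
    "bank":    [0.7, 0.3, 0.1],
}
tokens = list(vectors)
dim = 3

# Scaled dot-product attention for the query token "bank": score every
# token against the query, then normalize the scores with a softmax.
query = vectors["bank"]
scores = [sum(q * k for q, k in zip(query, vectors[t])) / math.sqrt(dim)
          for t in tokens]
weights = softmax(scores)

for token, weight in sorted(zip(tokens, weights), key=lambda p: -p[1]):
    print(f"{token:8s} {weight:.3f}")
```

In this toy setup, "water" and "flow" receive the highest attention weights while "the" is largely ignored – a miniature version of how the model uses context to disambiguate "bank."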
Since 2017, the transformer architecture has become the foundation for a wide range of language tasks. Its most prominent application has been in LLMs, notably the Generative Pre-trained Transformer (GPT) family developed by OpenAI, beginning with GPT-2 in 2019. Since then, generative AI development has focused on extending and improving model capabilities. In this development paradigm, progress is achieved by scaling three key resources: data, compute and the size of the resulting model. The characteristics of transformer architectures – and their high demands for compute and data – are crucial considerations for public AI policies.

Recent architectural innovations have extended transformer capabilities to handle multiple modalities.
[Figure: The AI model development pipeline – pre-training (training datasets, model architecture, software & tooling → base model), post-training (datasets, software & tooling → deployed model) and deployment (inference compute → user-facing apps).]
During this phase, the base model is refined through additional training to improve its capabilities for specific domains and align its behavior with specific objectives. At this stage, various datasets are used to further train the model, including validation, instruction and benchmark datasets.

Development methods in this phase have evolved rapidly, with two key approaches becoming prominent in 2024. The first is supervised fine-tuning, in which models are trained on domain-specific or task-specific datasets. For example, a model might be fine-tuned on medical literature to enhance its healthcare-related capabilities. The second approach relies on reinforcement learning to enhance model reasoning and decision-making. Two primary methods are reinforcement learning from human feedback (RLHF), where human preferences guide the learning process, and reinforcement learning from AI feedback (RLAIF), which uses other AI systems to provide training signals.32

This phase also includes evaluation and deployment optimization, ensuring that the models meet requirements for accuracy, reliability, computational efficiency and safety before deployment in real-world applications. This includes testing on standardized benchmarks, which measure model capabilities across diverse domains including reasoning, knowledge and safety.33 Models are also evaluated through specialized testing methodologies like adversarial testing and hallucination detection.

The result of this phase is a fully trained model, ready for deployment.

In the so-called inference stage, the trained model is deployed for use in user-facing applications, requiring additional computational resources known as inference compute. Alternatively, a model can be hosted on a cloud platform and made available via an API, enabling third parties to build their own applications on top of it. Inference compute is essential to run the model and, in the case of large frontier models, can involve significant costs and environmental impact. This is also the reason why smaller, more sustainable models are being developed.

The overview presented here simplifies what is often a far more complex development process. In practice, model development is rarely a one-time effort. AI labs typically aim to improve their models over time, revisiting earlier stages of the process. As a result, development is often iterative, with feedback from later stages informing adjustments to earlier components. Developers frequently cycle between phases as they refine their systems. Finally, compound AI systems combine multiple models to build holistic workflows and applications.

This overview also does not fully reflect the diversity of open model development ecosystems, such as those found on platforms like Hugging Face. Such development entails the combined use of various models. In such cases, work often begins with an openly available base model from which derivative models are produced, such as through fine-tuning. Furthermore, smaller models can be created from large models through techniques like distillation or quantization (see Section 3).
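Of the compression techniques just mentioned, quantization is the simplest to sketch. The code below applies symmetric 8-bit quantization to a list of random stand-in "weights": every value is rounded to one of 255 integer levels and stored in one byte instead of four. Real schemes (per-channel scales, calibration data, 4-bit formats) are considerably more involved; this is a minimal illustration.

```python
import random

random.seed(1)

# Stand-in for a model weight tensor: 1,000 floats.
weights = [random.gauss(0.0, 0.5) for _ in range(1000)]

# Symmetric 8-bit quantization: one scale factor maps every float onto
# an integer in [-127, 127]; dequantizing maps the integers back.
scale = max(abs(w) for w in weights) / 127.0
quantized = [round(w / scale) for w in weights]
restored = [q * scale for q in quantized]

# Each value now fits in 1 byte instead of 4 (float32), shrinking the
# tensor roughly 4x at the cost of a small rounding error per weight.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale: {scale:.5f}, worst rounding error: {max_error:.5f}")
```

The rounding error per weight is bounded by half the scale factor, which is why quantized models usually lose little accuracy while becoming much cheaper to store and serve.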
The discovery of these empirical relationships has made AI development increasingly dependent on securing ever-larger amounts of compute power and training data. Ensuring access to these resources must be a core focus of any public AI policy aiming to support state-of-the-art model development.

A 2020 research paper by OpenAI researchers observed that there are scaling laws34 inherent in transformer-based model development. The paper identified a power law relationship between three scalable resources for AI training – model parameter count, training data size and computational power – and model performance. The approach has been described as the "bigger is better" paradigm in AI.35

Scaling laws are not natural laws. They are based on empirical observations: performance on benchmark tasks improves when all three inputs – model size, dataset size and compute – are increased together during the pre-training phase. Transformer models – known for their resource-intensive nature due to their ability to process vast amounts of data in parallel – are central to this paradigm. Researchers found that increasing any single factor in isolation leads to diminishing returns.36

A later research paper found that earlier models, in particular GPT-3, had been undertrained: the models were too large relative to the amount of data and compute used. A new approach, suggested in the paper, allowed smaller models to be as effective as larger ones if higher-quality data was used for training over extended periods of time.

While scaling laws have driven significant gains in AI performance, recent research suggests that the benefits of continued scaling may be slowing.38 The key constraint is the limited availability of high-quality training data. Ilya Sutskever, one of the co-founders of OpenAI and a key figure in transformer-based model development, argues that we have reached "peak data,"39 as all frontier AI labs rely on scraping web data, and scaling laws are beginning to plateau. The stakes in this debate are high. Proponents of artificial general intelligence (AGI) – a hypothetical form of AI with human-level or greater capabilities across most tasks – often place their bets on the continued validity of scaling laws. This belief in the power of AI scaling also underpins recent massive investments in compute, such as the $500 billion investment announced by OpenAI, Oracle and SoftBank in January 2025.40 Critics, meanwhile, view the current wave of AI development as yet another hype cycle destined to collapse as the returns to scaling diminish.41
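The shape of these scaling laws can be illustrated numerically. The sketch below uses a loss formula of the general form found in the scaling-law literature, L(N, D) = E + A/N^α + B/D^β, where N is the parameter count and D the number of training tokens. The constants here are invented stand-ins chosen only to show the shape of the relationship, not fitted values from any paper.

```python
# Illustrative scaling-law curve: loss falls as a power law in model
# size N (parameters) and dataset size D (tokens). Constants are
# invented stand-ins, used only to demonstrate the qualitative behavior.
E, A, B, ALPHA, BETA = 1.7, 400.0, 410.0, 0.34, 0.28

def loss(n_params, n_tokens):
    return E + A / n_params**ALPHA + B / n_tokens**BETA

base = loss(1e9, 1e9)            # 1B parameters, 1B tokens
params_only = loss(100e9, 1e9)   # 100x parameters, same data
data_only = loss(1e9, 100e9)     # same parameters, 100x data
both = loss(100e9, 100e9)        # scale parameters and data together

print(f"base {base:.2f} | 100x params {params_only:.2f} | "
      f"100x data {data_only:.2f} | 100x both {both:.2f}")
```

Scaling either factor alone leaves one of the two penalty terms untouched – the diminishing returns noted above – while scaling both shrinks both terms. This is why the paradigm demands ever more compute and data at once.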
Some argue that such a slowdown was always to be expected. Others predict that AI scaling laws will not diminish but rather evolve, as the scaling effects depend on multiple factors.42

An evolution in scaling is already underway. The singular focus on model pretraining – dominant between 2020 and 2023 – is giving way to a new paradigm that focuses on advantages gained in later stages of development, or even in the deployment phase.43 Even if the original pretraining scaling laws are beginning to level off, these other phases are increasingly seen as the next frontier. Some researchers argue that the future of AI capabilities might depend more on finding the right balance between these three scaling dimensions than on pushing any single dimension to its limits.44

Starting in 2024, leading AI companies have shifted their focus to scaling during the post-training phase. In this phase, optimization involves refining models after their initial training through reinforcement learning techniques and other fine-tuning methods.45 These methods help align models with human preferences and specific tasks, and enable the development of models capable of more complex reasoning. Research suggests that the relationship between resources invested in post-training and resulting performance improvements follows its own distinct scaling patterns. Some researchers argue that there may be more headroom for improvement in this phase than in pretraining, particularly as reinforcement learning methods continue to advance.46

A related trend is inference compute scaling, where greater computational resources are allocated during model use to enhance performance. This has emerged as a new frontier, particularly in the development of reasoning models that generate multiple candidate outputs and internally select the best one.47 Early results indicate this approach can significantly boost performance without increasing model size or requiring additional training data.48 The approach gained attention with the release of OpenAI's reasoning models, such as o1 and o3, which demonstrated strong benchmark performance using this method.

Even if the original scaling laws – now referred to as pretraining scaling – begin to plateau, this does not necessarily imply a reduction in computational demand. Inference scaling continues to gain traction and could sustain high resource requirements. Consequently, the development and deployment of transformer-based models may remain costly, even as architectures and training methods evolve.49

As described in chapter 2, AI scaling laws are a key driver of concentration in the AI ecosystem. An AI development approach that is based on the transformer architecture, and that adheres to AI scaling laws, has several repercussions. First, it creates the kind of concentrations of market power outlined in the previous section. Second, it creates new and distinct forms of digital divide, related to uneven access to computing resources. And third, it dramatically increases the environmental footprint of AI systems.

Scaling and AI's environmental footprint

The training and deployment of large AI models comes with a major environmental footprint that extends beyond the energy consumption of data
centers. For example, training a single large language model can emit up to 550 metric tons of CO2.50 Moreover, the energy required for deployment – known as inference – accounts for a substantial share of ongoing AI-related energy use, ranging from one-third at Meta51 to as much as 60% at Google.52 In addition, the cooling systems required to prevent data centers from overheating often rely on large volumes of water, placing further strain on local water resources.53 As AI systems become more widespread, these energy demands continue to grow. It is telling that the dominant AI companies are evolving into energy companies.54 This trajectory is fundamentally unsustainable, as computational demands grow faster than improvements in model performance.55

The environmental costs of AI extend beyond energy consumption. Beyond data centers, the entire AI supply chain raises serious concerns about the environmental costs of contemporary approaches to AI development and deployment56 – from the extraction of raw materials for GPUs to the mounting problem of electronic waste from discarded hardware. This creates what researchers call a Jevons paradox:57 as individual models become more efficient, overall environmental impact increases because improved efficiency leads to broader deployment and more frequent use. This dynamic raises questions about whether the current trajectory of AI development, with its emphasis on scale, is environmentally sustainable in the long term.

These environmental concerns are an important factor that should be integrated into public AI policymaking. Any deployment of AI technologies must address the environmental impact inherent in today's generative AI systems. Commercial AI labs – operating within the paradigm of AI scaling laws and leveraging full-stack approaches – tend to embrace a "bigger is better" model of data center development, often at the expense of environmental sustainability.58 AI's environmental impact, alongside financial limitations, is a major reason why public AI policies should not simply replicate commercial strategies focused on ever-larger models.

What is the future of AI scaling laws?
DeepSeek's widely cited training cost figure refers only to the final training run. The total cost of DeepSeek's AI infrastructure is estimated at $1.6 billion. The company operates around 50,000 GPUs – comparable to major Western AI labs and consistent with the demands of the AI scaling laws.59

As explained in the technical report, DeepSeek researchers achieved performance on par with existing state-of-the-art models through two major algorithmic improvements.60 The first was an advancement of the mixture of experts technique, which divides the model into specialized submodels. Instead of activating the full model during training and inference, only relevant submodels are engaged, reducing computational requirements.

The second breakthrough was prompted by U.S. export controls, which restrict DeepSeek to using lower-performance Nvidia chips with limited memory bandwidth – a major constraint, since model training requires moving massive volumes of data between memory and processing units. In response, DeepSeek developed methods to reduce memory bandwidth demands during both pretraining and inference.61

Moreover, the DeepSeek researchers demonstrated that the advanced capabilities of state-of-the-art models like DeepSeek-R1 can be distilled into smaller models. These distilled models outperform other state-of-the-art models, both open and proprietary. The company has also pledged to release open-weight versions of its models – freely available for use, though without access to the original training data or certain proprietary components.

The release of DeepSeek's models offers several important lessons for public AI strategy. First, under the current scaling paradigm, access to computing power remains a critical requirement for developing state-of-the-art AI models. Even with the innovations introduced by the DeepSeek team, building state-of-the-art models remains a resource-intensive endeavor. Even if costs – and thus environmental impact – are reduced at each stage of training and deployment, the Jevons paradox still applies: efficiency gains may drive wider adoption, ultimately increasing total energy consumption.62

However, computing power is only one part of the equation. The DeepSeek example shows that significant gains can also be achieved through advances in machine learning techniques. Many of these depend on the availability of a state-of-the-art model, which can be used to create derivative, smaller models – enabling capability transfer without the need to replicate the full computational cost of the original model.

Therefore, concentration of compute does not automatically result in a lasting concentration of model capability. Distillation techniques used by DeepSeek show that smaller, more energy-efficient models can be created on the basis of larger models.

The key takeaway from the release of DeepSeek's models is that it ultimately confirms the economics of frontier AI development under the current scaling paradigm. Any public AI strategy must still secure access to significant computing power – comparable to that of major commercial AI labs – in order to develop state-of-the-art models. This requires either significant investment in public compute infrastructure, which would only yield results over the longer term, or accepting a degree of reliance on commercial compute providers.

59 Anton Shilov. "DeepSeek Might Not Be as Disruptive as Claimed, Firm Reportedly Has 50,000 Nvidia GPUs and Spent $1.6 Billion on Buildouts." Tom's Hardware, 2 February 2025. https://www.tomshardware.com/tech-industry/artificial-intelligence/deepseek-might-not-be-as-disruptive-as-claimed-firm-reportedly-has-50-000-nvidia-gpus-and-spent-usd1-6-billion-on-buildouts

60 DeepSeek-AI, et al. "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning." arXiv:2501.12948, 22 Jan. 2025. https://doi.org/10.48550/arXiv.2501.12948

61 Ben Thompson. "DeepSeek FAQ." Stratechery. 27 January 2025. https://stratechery.com/2025/deepseek-faq/

62 Jennifer Collins. "What Does DeepSeek Mean for AI's Environmental Impact?" DW.com. 30 January 2025. https://www.dw.com/en/what-does-chinas-deepseek-mean-for-ais-energy-and-water-use/a-71459557
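The mixture-of-experts mechanism described above can be sketched as a small routing function. In the toy below, a gate scores eight "experts" for a given token and activates only the top two, so only a fraction of the model's parameters do any work for that token. The gate weights and dimensions are invented for illustration; DeepSeek's actual routing is far more sophisticated (shared experts, load balancing and more).

```python
import math
import random

random.seed(4)

NUM_EXPERTS = 8   # specialized submodels
TOP_K = 2         # experts activated per token
DIM = 4           # toy token-vector dimension

# A toy gating network: one weight vector per expert (invented values).
gate = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def route(token_vec):
    """Score every expert for this token and keep only the top-k."""
    scores = [sum(w * t for w, t in zip(row, token_vec)) for row in gate]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    # A softmax over the selected scores gives the mixing weights used
    # to combine the chosen experts' outputs.
    exps = [math.exp(scores[i]) for i in top]
    return [(i, e / sum(exps)) for i, e in zip(top, exps)]

token = [0.2, -0.5, 0.1, 0.9]
active = route(token)
print(f"active experts and mixing weights: {active}")
print(f"fraction of expert parameters used: {TOP_K / NUM_EXPERTS:.0%}")
```

Only 2 of 8 experts run per token in this sketch, so roughly a quarter of the expert parameters are exercised at each step – the source of the compute savings described above.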
3 | The generative AI stack
[Figure: The generative AI stack. Illustration by: Jakub Koźniewski]

• Compute: This foundational layer refers to the physical and software infrastructure that enables AI development and deployment. At its core are specialized processors or chips – primarily GPUs – designed to handle the massive parallel computations required for training and running AI models. To make these chips usable at scale, two elements are essential: software frameworks that optimize GPU performance and the integration of GPUs into data centers, where they are stacked and networked into powerful, scalable compute systems, often delivered via cloud platforms.

• Data: This layer involves storage, processing and transfer of datasets used in both the pretraining and post-training phases of AI development.

• Models: This layer refers to the AI models themselves. Each model consists of an architecture and a set of parameters – its weights and biases – refined through training. These models are typically deployed as cloud-based services.
• Applications: In this layer, AI models are embedded in user-facing systems and applications. Running these applications requires additional computing power, referred to as inference or test-time compute.

In analyzing public AI development pathways, we focus on the compute, data and model layers. We do not address the applications layer, as it lies downstream from the core layers under discussion. The existence of public AI applications depends on the availability and capabilities of AI systems built through the orchestration of resources at these foundational levels. Therefore, when considering elements of public AI policy, application development pathways will be listed as a key additional measure.

Beyond the fundamental layers – from hardware to applications – additional layers are sometimes identified to emphasize other critical functions. For example, Mozilla introduces a safeguards layer in its AI stack to highlight the importance of tools and mechanisms that ensure the safety of AI systems.64 Similarly, a talent layer is often added to highlight the indispensable role of human talent and know-how in driving AI development.65

Software used in AI development can also be seen as a cross-cutting layer that spans the entire AI stack. At each level, both proprietary and open source solutions are commonly used by developers. At the compute level, software is essential for managing and orchestrating hardware resources during the development and deployment of AI models. It creates abstractions that allow developers to harness computing power without dealing directly with the complexity of low-level hardware. For example, Nvidia's CUDA provides direct access to GPUs and optimizes their use for AI workloads, while PyTorch is a deep learning framework that abstracts hardware complexity and offers high-level APIs for efficient model development.

In the data layer, software manages data ingestion, processing and organization. For models, it provides development environments as well as tools for training, evaluation and fine-tuning. Widely used open source libraries include scikit-learn and PyTorch for training machine learning models, GPT-NeoX for training large language models, vLLM for model inference and serving, transformers for fine-tuning, and LM Harness and lighteval for evaluation. At the application layer, software provides APIs, monitoring systems and user interfaces that make AI models accessible and usable by downstream developers and users.

Because software development cuts across all layers of the stack, it is difficult to define a standalone public AI development pathway that focuses solely on software. Instead, support for software development should be considered a key complementary measure within each of the pathways.

The section below outlines the advantages of the stack model for designing public AI policy and highlights how it helps clarify concentrations of power in the AI ecosystem. This is followed by a closer look at the three core layers – compute, data and models. The characteristics of these layers shape dependencies on commercially provided resources that public AI initiatives must navigate. They also present key considerations for any effort to develop independent public AI systems.

Advantages of the AI stack concept

The stack model can be a useful framework for governing complex technologies. In this context, governance is understood as the exertion of control over various layers of the stack and the orchestration of actors at each layer to achieve specific outcomes through the use of the overall technology. Typical forms of such control include regulation and voluntary norms,66 and key governance questions concern the interplay between various layers of the stack.

This model also goes hand in hand with supply chain analyses, particularly at the hardware level, by revealing how dependencies and power concentrations emerge in technological systems.67 At each layer of the stack, power can accumulate, and the stack model helps clarify these concentrations and their broader impact on the technological system.

This perspective also illustrates the interconnections between AI systems and other digital infrastructures – such as the internet, online platforms and data centers – and shows how different actors (industry, governments, NGOs, academia and communities) both rely on and influence one another.68

Researchers from the Ada Lovelace Institute note that the resource-intensive nature of AI development often renders "downstream" users dependent on "upstream" providers, typically large AI companies. This dynamic underscores the need for policymakers to understand how value is created and distributed across the AI stack.69

Commercial strategies often aim to maintain monopoly power in a single layer while fostering competition in – and thus commoditizing – other layers. From a business perspective, the stack model offers a way to analyze where profits are generated, accounting for both dependencies (such as on GPUs) and competition in an environment where many solutions are openly shared.71

A public approach, on the other hand, focuses not on control, but on orchestrating the various components and layers to achieve public interest goals. A layered approach, based on the stack metaphor, allows for better governance of AI.72 It takes into account the complexity of AI systems, while also demonstrating their interdependent nature. It allows for examination of dependencies at different layers, as well as the benefits of sharing key resources and their impact on model development. For example, policies that consider the entire stack can address more than just compute resources. The European AI Continent Action Plan is an example of such a "full-stack" approach, as it includes measures on computing power, training data, models and deployment of AI systems.73

64 Adrien Basdevant, et al. "Towards a Framework for Openness in Foundation Models. Proceedings from the Columbia Convening on Openness in Artificial Intelligence." Mozilla. 21 May 2024. https://foundation.mozilla.org/en/research/library/towards-a-framework-for-openness-in-foundation-models/

65 Ganesh Sitaraman and Alex Pascal. "The National Security Case for Public AI." Vanderbilt Policy Accelerator for Political Economy and Regulation. 24 September 2024. https://cdn.vanderbilt.edu/vu-URL/wp-content/uploads/sites/412/2024/09/27201409/VPA-Paper-National-Security-Case-for-AI.pdf
Nvidia, for example, holds a dominant position at the hardware layer and counts states seeking control over compute power among its key customers.74 A sovereign AI program requires, in principle, full control of the AI stack – making it a highly contested concept. As Pablo Chavez notes, this is difficult to achieve: "In reality, what most countries working toward AI sovereignty are doing is building a Jenga-like AI stack that gives them enough control and knowledge of AI technology to understand and react to changing technology, market and geopolitical conditions but falls short of complete control."75 In the following chapters, we offer a vision of public AI that does not seek sovereign control but instead aims to secure the ability to orchestrate resources across the AI stack in service of the public good.

Concentrations of power in the AI stack

Public AI visions, while not focused on digital sovereignty per se, must confront the question of whether developing AI systems in the public interest requires some form of "sovereignty" – that is, control over the AI stack. In other words, public AI must address the concentrations of power that exist at various layers of the stack, where key resources are held by commercial actors with dominant or near-monopolistic positions. This is especially true at the compute layer. The scale of investment needed to develop viable alternatives makes such control extremely difficult, if not impossible. As a result, the value generated by AI systems is increasingly privatized and, in some cases, monopolized. Public AI policies aim to mitigate this trend.

Only a handful of companies can afford the costs to train state-of-the-art models. While DeepSeek initially appeared to mark a shift in the economics of AI training, later analysis suggested otherwise.76 In addition to the high cost of acquiring GPUs, building a data center with sufficient networking infrastructure and covering operational expenses – such as electricity for running and cooling hardware – requires major investment. As a result, only a few of the largest AI companies (Amazon, Google, Meta, xAI and Microsoft) are able to pursue a full-stack approach, which demands massive investments in proprietary data centers.77 Among these, Google and Amazon have a fully integrated AI stack, having developed their own chips (Google's TPU and Amazon's Inferentia and Trainium). Others still depend on Nvidia, which holds a monopolistic position in the GPU market.

The costs, however, do not end with training. Deploying large AI models is also expensive, as it requires sustained access to significant compute resources to process user queries in real time. For instance, OpenAI's ChatGPT reportedly incurred daily operating costs of up to $700,000 in 2023, due to the need to continuously run thousands of GPUs.78

This financial burden has pushed leading AI companies without full-stack capabilities – such as OpenAI, Anthropic and Mistral AI – to form partnerships with cloud hyperscalers like Amazon Web Services, Microsoft Azure and Google Cloud. This has resulted in a circular flow of capital between AI startups and scale-ups on the one hand and these cloud hyperscalers on the other. It is estimated that the three cloud hyperscalers "contributed a full two-thirds of the $27 billion raised by fledgling AI companies in 2023"79
35
The generative AI stack
and that the majority of capital raised by AI startups, “up to 80-90% in early rounds,” was paid back to the same cloud hyperscalers.80

This uneven allocation of AI’s resources – and of the financial returns they generate – is a defining feature of modern AI technologies. The field exhibits characteristics of a natural monopoly, driven by the high cost of training and deploying AI systems. These costs stem from intense demand for computing power, the high price of chips, the effort required to obtain and prepare large datasets, limited access to proprietary data and the high switching costs between cloud platforms. Economies of scale in generative AI create “winner-takes-most” dynamics, which are reinforced by network effects and first-mover advantages, as early, large-scale systems benefit from user-generated data and established customer bases. This inequality is global, not limited to any single jurisdiction.81

These concentrations of power occur, first of all, at the compute layer. AI development efforts are highly dependent on the three dominant cloud providers – Amazon, Microsoft and Google – and on Nvidia, which currently holds an overwhelming share of the chip market.82 At the data layer, leading AI development labs typically also benefit from their privileged access to proprietary data generated on platforms they own or control. This trend is exemplified by the merger of xAI and X, an AI company and a social media platform respectively, both owned by Elon Musk.83

Closely related to this is the emergence of a “compute divide.”84 Outside a small group of hyperscalers and AI labs that have partnered with them, most companies have to rent compute for AI development and deployment. These costs have resulted in a divide between the “GPU rich” and “GPU poor” companies.85 A similar “computing divide” also exists between commercial labs and academic or nonprofit research institutions.86 On a global scale, the uneven distribution of GPU-equipped data centers has produced a new kind of digital divide. Countries are now classified into three tiers: “Compute North” nations with advanced GPU data centers capable of developing cutting-edge AI, “Compute South” nations with less-powerful facilities suitable for deploying existing AI, and “Compute Desert” nations that lack such infrastructure entirely and must rely on foreign computing resources.87

Across the world, public and non-commercial computing resources are minuscule in comparison to commercial computing power. While the public sector was an early mover – developing the first supercomputers for research purposes – today its investments are outpaced by the growth of commercial compute capacity, as outlined above. In addition, public supercomputers must support a wide range of research unrelated to generative AI and are therefore neither optimized for AI training nor available for providing inference compute to deployed AI systems. As a result, even nonprofit and academic initiatives […]

80 Matt Bornstein, Guido Appenzeller and Martin Casado. ibid.
81 Competition and Markets Authority. “AI Foundation Models: Update Paper.” GOV.UK, 16 April 2024. https://www.gov.uk/government/publications/ai-foundation-models-update-paper; Anselm Küsters and Matthias Kullas. “Competition in Generative Artificial Intelligence.” CEP. 12 March 2024. https://www.cep.eu/eu-topics/details/competition-in-generative-artificial-intelligence-cepinput.html; Tejas N. Narechania and Ganesh Sitaraman. “Antimonopoly Tools for Regulating Artificial Intelligence.” SSRN. 25 September 2024. https://www.ssrn.com/abstract=4967701
82 Jai Vipra and Sarah Myers West. “Computational Power and AI.” AI Now Institute. 27 September 2023. https://ainowinstitute.org/publication/policy/compute-and-ai
83 Maxwell Zeff. “Elon Musk says xAI acquired X.” TechCrunch. 29 March 2025. https://techcrunch.com/2025/03/29/elon-musk-says-xai-acquired-x/
84 Bridget Boakye, et al. “State of Compute Access: How to Bridge the New Digital Divide.” Tony Blair Institute. 7 December 2023. https://institute.global/insights/tech-and-digitalisation/state-of-compute-access-how-to-bridge-the-new-digital-divide
85 Alistair Barr. “The tech world is being divided into ‘GPU rich’ and ‘GPU poor.’ Here are the companies in each group.” Business Insider Nederland, 28 August 2023. https://www.businessinsider.nl/the-tech-world-is-being-dividing-into-gpu-rich-and-gpu-poor-here-are-the-companies-in-each-group/
86 Tamay Besiroglu, et al. “The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?” arXiv:2401.02452, arXiv, 8 Jan. 2024. https://doi.org/10.48550/arXiv.2401.02452
87 Vili Lehdonvirta, Bóxī Wú and Zoe Hawkins (2024). “Compute North vs. Compute South: The Uneven Possibilities of Compute-based AI Governance Around the Globe.” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 7(1), 828-838. https://doi.org/10.1609/aies.v7i1.31683
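For the “GPU poor,” the divide described above comes down to rental arithmetic. The back-of-the-envelope sketch below is purely illustrative – the run size, the effective per-GPU throughput and the rental price are our own assumptions, not figures from this paper:

```python
# Back-of-the-envelope: what renting compute for one large training run
# might cost. All numbers are illustrative assumptions: a 1e25 FLOP run,
# an effective throughput of 1e15 FLOP/s per GPU (after utilization
# losses), and a rental price of $2.50 per GPU-hour.
training_flop = 1e25
effective_flop_per_gpu_second = 1e15
price_per_gpu_hour = 2.50

gpu_seconds = training_flop / effective_flop_per_gpu_second
gpu_hours = gpu_seconds / 3600
rental_cost = gpu_hours * price_per_gpu_hour

# Under these assumptions: roughly 2.8 million GPU-hours, on the order
# of $7 million in rental fees for a single run.
print(f"{gpu_hours:,.0f} GPU-hours, ~${rental_cost / 1e6:.1f}M")
```

Even under such generous assumptions, a single run costs millions of dollars in rental fees alone, before data preparation, failed experiments and deployment, which is why sustained frontier development remains out of reach for most actors.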
These concentrations of power pose significant challenges for developing public AI infrastructures, which face dependencies on monopolistic or oligopolistic actors across multiple layers of the stack. Two potential strategies have emerged in response. One seeks independence at the compute layer through massive investments in data centers and computing power on a scale comparable to commercial expenditures. However, this still entails reliance on Nvidia’s GPUs due to the company’s entrenched, monopolistic position. The other strategy accepts dependencies at the compute layer and instead focuses on independence at the model layer and in the development of applications built on top of public models. These pathways to public AI are explored in greater detail in chapter 5.

A 2018 analysis by OpenAI showed that computing power used for AI training had increased 300,000 times since 2012, the beginning of the “deep learning era.” A follow-up study by Epoch.ai, which analyzed 120 training runs of machine learning systems, found a fourfold annual growth rate in recent years – that is, a doubling of training compute roughly every six months – making it one of the fastest technological expansions in decades. Overall, training compute has grown by a staggering factor of 10 billion since 2010.92

Further scaling, however, faces four key constraints: energy consumption, chip manufacturing, data availability and speed limits inherent to AI training.93

88 Matt Davies and Jai Vipra. “Mapping global approaches to public compute.” Ada Lovelace Institute. 4 November 2024. https://www.adalovelaceinstitute.org/policy-briefing/global-public-compute/
89 Cecilia Rikap. “Dynamics of Corporate Governance Beyond Ownership in AI.” Common Wealth. 15 May 2024. https://www.common-wealth.org/publications/dynamics-of-corporate-governance-beyond-ownership-in-ai
90 Amlan Mohanty. “Compute for India: A Measured Approach.” Carnegie Endowment for International Peace. 17 May 2024. https://carnegieendowment.org/posts/2024/05/compute-for-india-a-measured-approach?lang=en
91 Department of Science, Innovation and Technology. “Independent Review of The Future of Compute: Final Report and Recommendations.” GOV.UK. https://www.gov.uk/government/publications/future-of-compute-review/the-future-of-compute-report-of-the-review-of-independent-panel-of-experts Accessed 23 Apr. 2025.
92 Jaime Sevilla, et al. “Compute Trends Across Three Eras of Machine Learning.” Epoch AI. 16 Feb. 2022. https://epoch.ai/blog/compute-trends

To better understand what compute entails – and where bottlenecks arise in the development of generative AI – it is helpful to break this layer into three key components: advanced chips (primarily GPUs), plus the two additional elements needed to make them usable at scale – specialized software that enables efficient use of those chips, and data centers where GPUs are networked into large-scale compute systems.

Advanced chips

Chips, or semiconductors, are arguably among the most important technological hardware in use today. They underpin all digital technologies and serve as the backbone of most economic activities. The enormous computational demands of training and deploying AI models – often involving trillions of calculations – depend on modern chips’ ability to coordinate the work of billions of transistors etched into each unit.

There are two distinct categories of chips, each with its own supply chains, production requirements and strategic dependencies:

• Memory chips store and enable access to data, which is essential for high-performance AI workloads and a prerequisite for any computational task.

• Logic (or processing) chips – including CPUs, GPUs and specialized chips like TPUs – carry out computations.

The rapid advancements in AI have been driven largely by improvements in specialized logic chips. As traditional CPUs proved too slow for AI training, GPUs – originally developed for graphics rendering – have been repurposed and optimized for this purpose.94

Software frameworks to run chips

Recognizing the role of specialized software is essential – it acts as the bridge between hardware and infrastructure and helps explain much of the current concentration of power in the generative AI ecosystem.

Effective use of GPUs for training and deploying AI models requires specialized software frameworks and tools. Two well-known examples are Nvidia’s Compute Unified Device Architecture (CUDA) and AMD’s Radeon Open Compute (ROCm). CUDA, in particular, has become the industry standard for GPU-accelerated computing and quickly gained a first-mover advantage. Introduced in 2007, it enables developers to harness the parallel processing power of GPUs for general-purpose tasks critical to AI training and deployment.95

CUDA’s importance became especially clear with the development of AlexNet in 2012, when the framework enabled the training of the neural network and reduced computation time from weeks to hours.96 Today, leading deep learning frameworks like PyTorch and TensorFlow – open sourced by Meta and Google, respectively – are deeply integrated with CUDA, making it difficult for competing platforms to gain traction. While media coverage often highlights Nvidia’s GPUs, CUDA represents an equally powerful competitive “moat” for the company.

In contrast, ROCm, developed by AMD, is an open source alternative designed to offer similar functionality. Although ROCm supports various programming models and provides an open platform, it has struggled to match CUDA’s widespread adoption. Nvidia’s extensive investments in its developer ecosystem have helped cement CUDA as the de facto standard […]
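The lock-in created by a dominant software layer can be illustrated schematically. The toy dispatcher below is not how PyTorch or CUDA actually work – real frameworks bind to vendor stacks through compiled kernels – but it sketches the general pattern in which operations are routed to vendor-specific backends, which is exactly where a missing backend becomes a porting problem:

```python
# Toy sketch (illustrative only) of how an ML framework dispatches an
# operation to vendor-specific backends. Backend names are placeholders.
BACKENDS = {}

def register_backend(name):
    """Decorator that registers a backend implementation under a name."""
    def wrap(cls):
        BACKENDS[name] = cls()
        return cls
    return wrap

@register_backend("cpu")
class CpuBackend:
    def matmul(self, a, b):
        # Plain-Python reference matrix multiplication.
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in a]

@register_backend("cuda")  # hypothetical stand-in for GPU kernels
class CudaLikeBackend(CpuBackend):
    pass  # in a real framework, hand-tuned vendor kernels live here

def matmul(a, b, device="cpu"):
    if device not in BACKENDS:
        # The lock-in point: code written against one vendor's kernels
        # fails elsewhere unless someone ports every operation.
        raise NotImplementedError(f"no '{device}' kernels available")
    return BACKENDS[device].matmul(a, b)

print(matmul([[1, 2]], [[3], [4]], device="cpu"))  # [[11]]
```

Once thousands of such operations exist only as kernels for one vendor’s platform, every competing hardware maker must reimplement the whole catalog before frameworks run well on its chips – which is the practical substance of CUDA’s “moat.”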
[…] platforms controlled by the same companies – serves as a key competitive advantage.103 For others, the lack of access to this high-quality data presents a significant competitive disadvantage.

Data sources for AI training

When discussing data and datasets in the context of AI training, it is important to recognize that data is not a homogeneous concept. Generative AI models rely on diverse data sources for training, which can be categorized by accessibility, licensing, structure and sensitivity.

Accessibility is the most important category, and distinctions should be made between private (proprietary), public and openly shared data. Private data typically includes user-generated content collected by companies that own dominant online platforms and are now building generative AI models (e.g., Meta, Google, Microsoft, AWS). Incumbent AI companies like OpenAI and Anthropic can also use data generated by their own chatbots. Public data includes content from the open internet, either scraped directly by AI companies or aggregated into datasets such as Common Crawl and its derivatives. Finally, openly licensed data – such as that from Wikimedia – is valued by developers for both its quality and the legal certainty it offers for training use.

Data comes in many forms, each governed by different legal frameworks and requiring tailored governance. The use of data for AI training often raises copyright issues,104 leading to a growing number of high-profile infringement lawsuits brought by creators, publishers and rights holders against leading AI firms. These legal challenges question the extent to which copyrighted works can be used for AI training without explicit permission or licensing, especially under the fair use doctrine.

There is also growing evidence that some AI labs have used data without permission, possibly in violation of the law – as demonstrated by multiple court cases involving major AI companies. For example, the Books3 dataset, which included 183,000 books sourced from pirate websites, was used to train early-generation models released in 2022.105 While its use was eventually discontinued under pressure from rights holders, Meta was reported to have trained models on LibGen – a similar pirate repository – as late as 2024.106 In 2025, Anna’s Archive, an aggregator of pirated books and research articles, announced it had granted AI companies, including DeepSeek,107 access to its database. These examples show that AI training often operates in a legal gray area – and sometimes outside the boundaries of the law – when it comes to the use of data.

Several copyright frameworks have been introduced to regulate generative AI training, most notably the European Union’s exception for text and data mining, which includes opt-out provisions for commercial training under the 2019 Digital Single Market Directive. However, there remains a lack of clarity around their interpretation and enforcement, and these frameworks are increasingly contested, as shown by ongoing policy debates about opt-outs in the United Kingdom.108

103 Cade Metz, et al. “How Tech Giants Cut Corners to Harvest Data for A.I.” New York Times. 8 April 2024. https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html
104 Daniel J. Gervais. “The Heart of the Matter: Copyright, AI Training, and LLMs.” SSRN. 1 November 2024. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4963711 Accessed 3 April 2025; Matthew Sag. “Fairness and Fair Use in Generative AI.” SSRN. 20 December 2023. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4654875 Accessed 3 April 2025.
105 Alex Reisner. “These 183,000 Books Are Fueling the Biggest Fight in Publishing and Tech.” The Atlantic, 25 September 2023. https://www.theatlantic.com/technology/archive/2023/09/books3-database-generative-ai-training-copyright-infringement/675363/
106 Alex Reisner. “The Unbelievable Scale of AI’s Pirated-Books Problem.” The Atlantic. 20 March 2025. https://www.theatlantic.com/technology/archive/2025/03/libgen-meta-openai/682093/
107 Anna’s Archive. “Copyright Reform Is Necessary for National Security.” Anna’s Archive (blog). 31 January 2025. https://annas-archive.li/blog/ai-copyright.html Accessed 3 April 2025.
108 Joseph Bambridge and Dan Bloom. “UK Plans Fresh Round of Talks to Take Sting out of AI Copyright Proposals.” POLITICO, 3 April 2025. https://www.politico.eu/article/uk-plans-fresh-round-talks-lawmakers-ai-copyright-proposals/

Other types of training data may consist of personal data (subject to personal data protection laws) and
non-personal data – industrial, anonymized, statistical and administrative datasets. While general principles can be applied to all types of data, there is no “one size fits all” approach to data sharing. Copyright and privacy or personal data rights are the two most important factors determining how data can be shared and accessed – and how “open” it can be.

The ongoing public debates about AI training datasets often focus exclusively on pretraining data. Copyright and data governance laws attempt to strike a balance that allows publicly available internet data to be reused in training datasets without enabling its exploitation. However, there is a tension between the view – common among AI developers – that data is a raw resource to be used freely, and the perspectives of those who create and steward that data.

It is also important to note that datasets used in post-training stages play an increasingly significant role. For example, methods based on RLHF require data on human preferences, typically sets of generative AI prompts and responses. New types of reasoning models rely on datasets designed to help models follow instructions, solve problems or evaluate results.109 And as this type of training becomes increasingly important, public discussions will also need to address the provision and governance of these datasets, which are often constructed differently and do not raise the same legal concerns.110 First, fine-tuning datasets are usually domain- or application-specific and require governance tailored to their context. Many key domains for generative AI – such as health and finance – depend on sensitive data. Second, benchmarks for evaluating model capabilities are key tools that – unlike pretraining data – can often be developed and shared as digital public goods. Since standardized benchmarks guide generative AI development, they benefit from open access and sound governance – though in some cases, particularly for security-related benchmarks, openness may be limited.111

Synthetic data

Synthetic data, generated by generative AI models, is a distinct category of data increasingly used in AI development, and it presents its own governance challenges. Because it is synthetically produced, it can be created quickly and cheaply. It is not subject to intellectual property restrictions and typically avoids privacy and other rights-related concerns. For instance, synthetic health data can be used in place of real patient data to protect privacy. In general, synthetic data can be used to reflect real-world patterns while helping to safeguard privacy or reduce bias. Some models – such as Microsoft’s Phi family of small language models – have been trained entirely on synthetic data.112 Recently, a new model development paradigm, called model distillation, uses an approach similar to training with synthetic data. In this paradigm, a “teacher” generative AI model produces outputs that a “student” model is then trained to replicate, bypassing the need for access to the original pretraining data.

Some researchers are optimistic about the potential of synthetic data, particularly for protecting personal data during AI training.113 Others warn of associated risks, most notably model collapse – a hypothesized decline in performance when models are trained on synthetic rather than real data.114 Use of synthetic data remains contested, and the validity of the “model collapse” […]

109 Nathan Lambert. “The State of Post-Training in 2025.” Interconnects. 12 March 2025. https://www.interconnects.ai/p/the-state-of-post-training-2025
110 Nathan Lambert. “Why reasoning models will generalize.” Interconnects. 28 January 2025. https://www.interconnects.ai/p/why-reasoning-models-will-generalize Accessed 3 April 2025.
111 Peter Mattson, et al. “Perspective: Unlocking ML requires an ecosystem approach.” MLCommons. 10 March 2023. https://mlcommons.org/2023/03/unlocking-ml-requires-an-ecosystem-approach/
112 Microsoft. “Phi open model family.” Microsoft. https://azure.microsoft.com/en-us/products/phi/ Accessed 23 April 2025.
113 Philippe De Wilde, et al. “Recommendations on the Use of Synthetic Data to Train AI Models.” Tokyo: United Nations University, 2024. https://collections.unu.edu/eserv/UNU:9480/Use-of-Synthetic-Data-to-Train-AI-Models.pdf
114 Ilia Shumailov, et al. “AI Models Collapse When Trained on Recursively Generated Data.” Nature, vol. 631, no. 8022, July 2024, pp. 755–59. https://doi.org/10.1038/s41586-024-07566-y; University of Oxford. “New Research Warns of Potential ‘Collapse’ of Machine Learning Models.” Department of Computer Science, 25 July 2024. https://www.cs.ox.ac.uk/news/2356-full.html
115 Rylan Schaeffer, et al. “Position: Model Collapse Does Not Mean What You Think.” arXiv:2503.03150, arXiv, 18 Mar. 2025. https://doi.org/10.48550/arXiv.2503.03150
116 Dan Hendrycks, et al. “Measuring Massive Multitask Language Understanding.” arXiv:2009.03300, arXiv, 12 Jan. 2021. https://doi.org/10.48550/arXiv.2009.03300
117 Rishi Bommasani, et al. ibid.
118 Alison Gopnik. “What AI Still Doesn’t Know How to Do.” The Wall Street Journal. 15 July 2022. https://www.wsj.com/tech/ai/what-ai-still-doesnt-know-how-to-do-11657891316
119 https://medium.com/@nageshmashette32/small-language-models-slms-305597c9edf2
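The teacher–student paradigm of model distillation can be reduced to a minimal numerical sketch. This is an illustrative toy, not a real distillation pipeline: a one-parameter “student” is fitted to match a fixed “teacher” output distribution by minimizing KL divergence, with no access to any original training data:

```python
import math

# Toy distillation sketch (illustrative only): a "student" with a single
# logit learns to reproduce a fixed "teacher" distribution over two
# classes. The teacher's probabilities stand in for its generated
# outputs; the student never sees the teacher's training data.
teacher = [0.8, 0.2]  # assumed teacher output probabilities

def student_probs(logit):
    e = math.exp(logit)
    return [e / (e + 1), 1 / (e + 1)]

def kl(p, q):
    """KL divergence KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

logit = 0.0
lr = 0.5
for _ in range(200):
    p = student_probs(logit)
    # For this parameterization, d KL(teacher || student) / d logit
    # simplifies to the probability gap p[0] - teacher[0].
    logit -= lr * (p[0] - teacher[0])

print(round(student_probs(logit)[0], 3))  # converges toward 0.8
```

In real distillation the same idea operates over full token distributions produced by a large teacher model, but the governance point is identical: the student inherits the teacher’s behavior without ever touching the pretraining data.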
[…] customization, making them ideal for deployment on resource-constrained devices, edge computing environments and domain-specific applications. They require less computational power and memory, enabling wider adoption by a range of stakeholders while still maintaining strong performance in targeted use cases.

Open models

Open models are AI models whose architecture and trained parameters (i.e., weights and biases) are released under open source licenses.120 Since EleutherAI’s release of GPT-Neo121 as an open alternative to OpenAI’s GPT in 2021, it has become increasingly common for AI researchers and developers to release open models. For example, in 2023, 66% of foundation models were released as open models, and more than 1.5 million models are hosted on the Hugging Face Hub.122

However, there is currently no standard approach to open releases of AI models, and many so-called open models come with significant limitations – such as withholding training data, using restrictive licenses or prohibiting commercial use – compared to the norms established in open source software development. Often, references to “open source models” are viewed as attempts at open-washing, diluting traditional open source standards in the context of generative AI.123

In recent years, efforts have been made to more precisely define what constitutes an open model. This is a necessary step toward creating standardized methods for governing and sharing AI models. Frameworks such as the Model Openness Framework124 by the Generative AI Commons and the Framework for Openness in Foundation Models125 by the Mozilla Foundation list up to 16 components that extend beyond model architecture and parameters. These include code components (for training, evaluation and inference), data components (for training, post-training and evaluation), and documentation (such as model cards and dataset cards).

A handful of research labs and nonprofit initiatives – such as the Barcelona Supercomputing Center, the Allen Institute for AI and EleutherAI – aim to set a higher bar for open source AI by releasing all model components openly, including trained parameters, code, data and documentation, all under free and open source licenses.

While most leading AI companies keep their models closed and accessible only via commercial APIs, Meta has adopted a model openness strategy. However, its models fall short of commonly accepted open source standards due to limitations imposed by its custom licenses.126 Other AI companies, including Mistral, DeepSeek and Cohere, have also released open models. In recent months, DeepSeek has come to be seen as the strongest example of an AI lab that combines commercial goals with an open source mission.

123 David Gray Widder, et al. “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI.” SSRN 4543807, 17 Aug. 2023. https://doi.org/10.2139/ssrn.4543807
125 Adrien Basdevant, et al. ibid.
126 Stefano Maffulli. “Meta’s LLaMa License Is Not Open Source.” Open Source Initiative, 20 July 2023. https://opensource.org/blog/metas-llama-2-license-is-not-open-source
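The component-based view of openness can be expressed as a simple checklist. The sketch below is illustrative only: the component names are an abridged paraphrase of the categories mentioned above, not the official component list of the Model Openness Framework, and the example release is hypothetical:

```python
# Illustrative openness checklist for a model release. The component
# names are an abridged paraphrase of the code/data/documentation
# categories discussed in the text, not an official framework's list.
COMPONENTS = [
    "model_architecture", "trained_parameters",
    "training_code", "evaluation_code", "inference_code",
    "training_data", "post_training_data", "evaluation_data",
    "model_card", "dataset_card",
]

def released_components(release: dict) -> set:
    """Components released under an open license in a given release."""
    return {c for c in COMPONENTS if release.get(c) == "open-license"}

# A hypothetical weights-only release: parameters are out, but the
# training data and code are withheld -- the pattern criticized above
# as falling short of open source norms.
weights_only = {
    "model_architecture": "open-license",
    "trained_parameters": "open-license",
    "model_card": "open-license",
    "training_data": "withheld",
}

open_parts = released_components(weights_only)
print(f"{len(open_parts)}/{len(COMPONENTS)} components openly released")
```

Making the checklist explicit shows why “open weights” and “open source AI” are not synonyms: a release can satisfy two or three components while withholding the data and code needed to reproduce or scrutinize the model.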
4 | The public AI framework

In this section, we propose a definition of public AI that builds on the broader concept of a public digital infrastructure. We also draw on three previous papers that define public AI and outline policies that can support its development.

A standard for the publicness of digital infrastructure must go beyond vague notions of the public interest and instead rely on a clear understanding of how public value is created. The definition of a public digital infrastructure achieves this by identifying three key characteristics: attributes, functions and forms of control.

By combining the AI stack framework described in the previous section with this definition of public digital infrastructure, we propose an approach that defines public AI by considering how public attributes, public functions and public control can apply to the different layers of the AI stack. In addition, we offer a gradient of public AI releases that accounts for the fact that public AI can have dependencies on non-public resources, especially at the chip and compute layers.

The concept of public digital infrastructure

Public digital infrastructures are digital infrastructures designed to maximize public value by combining public attributes with public functions and various forms of public control.132 The concept is related to that of digital public infrastructure (DPI), but focuses on the provision of alternatives to key digital platforms and communication services.133 These infrastructures are built and governed with the goal of advancing the public interest and maximizing public value. They stand in contrast to extractive solutions that concentrate power in the hands of a few at the expense of the broader population.

This definition, proposed by Open Future and building on previous work by researchers from the UCL Institute for Innovation and Public Purpose (IIPP), aims to more precisely define the public nature of such infrastructures by describing how infrastructures can generate public value. The goal of this “complex unpacking of what ‘public’ means [...] is to shift the focus of the debate from the technical aspects of infrastructure (i.e., making things digital) to its social relevance (i.e., making things public).”134

The IIPP report focuses on the first two characteristics of public digital infrastructure. Public attributes refer to the accessibility, openness or interoperability of infrastructure. These features aim to ensure universal and unrestricted access, often through open licensing or interoperability mechanisms.

132 Jan Krewer and Zuzanna Warso. “Digital Commons as Providers of Public Digital Infrastructures.” Open Future, 13 November 2024. https://openfuture.eu/publication/digital-commons-as-providers-of-public-digital-infrastructures
133 For an explanation of the two concepts, see: Jan Krewer. “Signs of Progress: Digital Public Infrastructure Is Gaining Traction.” Open Future, 13 March 2024. https://openfuture.eu/blog/signs-of-progress-digital-public-infrastructure-is-gaining-traction
134 Zuzanna Warso. “Toward Public Digital Infrastructure: From Hype to Public Value.” AI Now. 15 October 2024. https://ainowinstitute.org/publication/xii-toward-public-digital-infrastructure-from-hype-to-public-value

Public functions of infrastructure mean that the infrastructure contributes to public goals, rather than
Public attributes Public functions Public control Public funding Public production
merely serving as an alternative provider of mar- Neither public attributes nor public functions alone
ket-based goods or services. These goals can include are sufficient to define publicness. A focus on at-
enabling civic participation, fostering communi- tributes can be agnostic with regard to how infra-
ty and social relationships, stimulating economic ac- structure is used, and the outcomes of such uses.
tivity, improving quality of life or securing essential For example, open source AI solutions can be used
capabilities. Public infrastructure often creates pub- in ways that pursue private rather than public goals.
lic goods – resources with social rather than purely Conversely, a functional focus can overlook accessi-
market value. Brett Frischmann cites research as an bility. In other words, public interest goals can also
example of such a public good. 135
Public functions of be achieved through closed, private infrastructures.
infrastructure often entail filing supply gaps left by
market actors. Public digital infrastructure also needs to meet the
criterion of public control.137 This can take various
The underlying concept of the common good, as framed by Mariana Mazzucato, involves both the pursuit of shared objectives and care for shared processes and relationships. It is a perspective that emphasizes the importance of governance in the process of generating social value and positions the state as both a public entrepreneur and a market shaper. A common good perspective underscores the state's role in setting direction and coordinating collective action. Through effective governance, states can ensure co-creation and participation, promote collective learning, secure access, transparency and accountability – all of which are essential to advancing the public interest in digital infrastructure.136

forms, including public oversight, public funding or even public production and provision of infrastructure. Such infrastructure need not be state-owned or produced by public institutions. What matters is the presence of public control, which can also be understood as governance for the common good. This is the minimum necessary condition for digital infrastructure to meet the standard of publicness.

Further on, these three characteristics of public digital infrastructure will be applied to the AI stack to provide an overall definition of public AI. Public control is the most complex characteristic, where instead of a binary choice there are multiple approaches that entail forms of public production, funding or control of infrastructures.137 Public actors

135 Brett M. Frischmann. "Infrastructure: The Social Value of Shared Resources." Oxford Academic, 24 May 2012. https://doi.org/10.1093/acprof:oso/9780199895656.001.0001
136 Mariana Mazzucato. "Governing the Economics of the Common Good: From Correcting Market Failures to Shaping Collective Goals." Journal of Economic Policy Reform, vol. 27, no. 1, Jan. 2024, pp. 1–24. https://doi.org/10.1080/17487870.2023.2280969
137 Open Future's report on Public Digital Infrastructures describes this characteristic in terms of public ownership. In this report, we rephrase this as public control, which encompasses also forms of public ownership but is not limited to them. See: Jan Krewer and Zuzanna Warso, ibid.
The public AI framework
do not necessarily need to fully produce or own such infrastructure – what matters is their ability to orchestrate other actors in support of public digital infrastructures that meet the remaining characteristics.

Public, private and civic actors in public digital infrastructure

Ownership of public digital infrastructures – and the respective roles of public, private and civic actors – is a key issue. In the context of generative AI, this is largely a question of who owns or controls computing power, the key dependency for building public AI. Proper forms of public control can ensure sustainability, while some forms of private control risk creating a situation in which rewards are privatized and risks are socialized.

Governments and public institutions must therefore play a central role in the development and governance of public digital infrastructure, including the public AI stack. As noted by the World Bank, governments should have "a primary role and responsibility in deciding whether and how digital public infrastructure is provided in the interests of the broader society and economy."138 Deployment of such infrastructures is therefore a collective effort involving various actors, but it is the state that plays a key role in orchestrating collective action and ensuring proper outcomes.

This view of government as an orchestrator of outcomes and public value goes beyond the traditional public/private ownership divide. The idea of orchestrating actions of various actors entails "government direction, centrally defined public purpose, and large-scale planning [to be] combined – in still-emergent ways – with market mechanisms, private actors and public input."139 In doing so, the state not only guides collective action but also protects public infrastructure from being co-opted for private gain. At the same time, strategies used by commercial actors to gain control over the AI stack can be repurposed in service of the mission-driven approach that should characterize public AI policies.

Understanding the state's role in this way calls for a shift from viewing government as a passive actor or a mere fixer of market failures to recognizing it as an orchestrator capable of coordinating diverse contributors. The deployment of public digital infrastructures (PDIs) should be guided by mission-oriented strategies and a market-shaping approach to policy.140 Generating public value through digital infrastructures is not merely meant to fix the market or fill market gaps – it is a goal in itself. Importantly, public value can be created by various actors, including those in the private sector. The state's ability to steer this co-creation process is more important than its direct production capacity.

Proposals for public AI

Over the past year, several organizations – including the Public AI Network, Mozilla and the Vanderbilt Policy Accelerator – have introduced frameworks for a public AI agenda. Each proposal outlines a set of conditions intended to maximize public value and safeguard the common good through the development and deployment of AI. In our analysis of these proposals, we identify a set of shared characteristics that define public AI.

Public AI Network

The Public AI Network's policy paper adopts a framing that aligns closely with Mariana Mazzucato's
tential overdependence of public AI on governments and public funding.

In this regard, Mozilla's approach diverges from Mazzucato's mission-driven model of AI development by framing public AI as an ecosystem that functions independently of both corporate and governmental control. As the foundation puts it, "We need a resilient and pluralistic AI ecosystem, in which no single entity – whether Big Tech or national governments – can unilaterally decide AI's future." However, this vision does not fully address the current ecosystem's structural dependencies or propose concrete strategies for mitigating them.

Vanderbilt Policy Accelerator

The white paper "The National Security Case for Public AI" presents its approach to public AI in the context of the threats AI systems may pose to democracy and national security – specifically, how they "may threaten the resilience of democracies around the world."143 Public AI is framed as a dual-purpose strategy: it safeguards democratic values, privacy and other fundamental rights while also providing secure and resilient solutions for national defense and homeland security.

development and the recruitment of AI talent into government roles.

These public initiatives are to be complemented by stringent public-utility-style regulations aimed at private AI companies with monopolistic or oligopolistic market power. Proposed regulatory measures include structural separation rules that aim to dismantle monopolistic control over multiple, interconnected layers of the AI stack; non-discrimination rules to ensure equal access for all actors; and restrictions on foreign ownership, control and investment. Such regulations are deemed necessary to maintain a competitive private industry that can provide private contractors to governments without undermining innovation, effectiveness or resilience.

The proposal offers a vision of the dynamics between public options in AI and a regulated private AI market. Government development of in-house solutions is expected to drive more competitive pricing from private contractors. At the same time, market competition would be further supported through regulatory measures. The proposal seeks to strike a balance between reliance on private companies and the need to build public sector capacity to tackle societal challenges using AI.
Table 2 | Mapping different definitions of public AI onto the Public Digital Infrastructure framework. [Table contents not fully recoverable from the source; surviving cell fragments: public control; public accountability; public use; public option; public utility regulation.]
Source: Own table.
Drawing on these proposals and the concept of public digital infrastructure, public AI infrastructure can be defined in terms of the following three characteristics of the AI stack:

• Public attributes: Public AI provides universal and unrestricted access to components of the stack, enabled through openness and interoperability. Key components are shared as digital public goods, and solutions are built on open standards. Systems and processes are auditable and transparent. These attributes help reduce market concentration and dependency on dominant commercial actors.

• Public functions: Public AI delivers foundational systems and services that support broad societal and economic functions, particularly by enabling downstream activities and public benefits. It supports essential public capabilities such as knowledge sharing and civic participation. The public AI stack creates an enabling environment for innovation while safeguarding user rights and social values.

• Public control: Public AI involves public control, funding and/or production of the infrastructures underpinning the generative AI stack. This may take various governance forms – from direct government provision to public orchestration of other actors. The goal is to place the AI stack under democratic control, with mechanisms for collective decision-making and accountability. Public ownership should also ensure long-term sustainability, as captured by the idea of "permanent public goods."

Both a full-stack AI infrastructure and its individual layers or components can exhibit the characteristics of public AI infrastructure. A publicly owned computing resource or an open source AI model, for example, qualifies as public AI infrastructure. However, due to interdependencies between these layers – as discussed in the previous chapter – policy efforts should support full-stack approaches.
In this report, the terms public AI stack, public AI infrastructure and public AI systems are used to describe the goals and preferred outcomes of public AI policies. It is therefore important to clarify how these terms differ, as well as how they interconnect.

Infrastructures are facilities, systems or institutions that serve society-wide functions and provide foundations for downstream activities and social benefits. Frischmann defines infrastructures as "shared means for many ends," emphasizing that they should be treated as shared resources.144

Public policy should focus on building public AI infrastructures for two reasons. First, AI – as a general-purpose technology – has infrastructural characteristics due to its open-ended applications and societal impacts. In this respect, it is similar to earlier digital technologies like the internet. Second, Frischmann's concept of infrastructure as a shared resource implies that commons-based management strategies can enable broad, open access. Emphasizing the infrastructural nature of AI supports the idea of managing it as a public resource to ensure wide accessibility. AI infrastructure can also be understood as an AI stack of interconnected layers.

At the same time, AI technologies are often described as AI systems, which suggests specific instances of a technology that not only provide "means" but also operate, perform tasks and achieve "ends." An AI system is the end product or application that functions on top of the infrastructural layers beneath it. A complete AI system includes the underlying hardware and compute power, data, code and model architecture used to train the model, the trained model itself (i.e., its parameters) and the application layer built on top of the stack. In contrast, a public AI initiative might focus on providing a single type of component, for example by releasing datasets as digital public goods, or by supporting a sustainable model development ecosystem. Each of these components, on its own, should meet the characteristics of a public AI infrastructure.

The public AI policy framework assumes that deploying AI systems alone is not sufficient to achieve social goals. Instead, public AI infrastructure must be built to support the operation of AI systems in the public interest. At the same time, public AI policy can also include the deployment of specific AI systems – also referred to as AI solutions. This focus is especially important at a time when the purpose and impact of AI deployment remain unclear.

The issue is further complicated by the fact that public AI systems can serve infrastructural functions. Generative AI models, for example, are concrete AI systems that can also function as digital infrastructure. Foundation models, also referred to as general-purpose AI models, exhibit such infrastructural characteristics. When open sourced, they can serve as the basis for new or derivative models.

144 Brett M. Frischmann, ibid.
The gradient describes varying levels of publicness based on the three characteristics – attributes, functions and control – and how these apply across the layers of the AI stack.145 These three characteristics, taken together, determine an infrastructure's ability to support public interest goals and development trajectories independent from market actors. As such, the gradient serves as a practical tool for policymakers to identify strategies that enhance the publicness of AI systems and strengthen the public value of specific initiatives.

The gradient framework is one of the core contributions of this white paper. It introduces a model that maps AI initiatives along a continuum – from fully public to fully private – based on the dimensions of attributes, functions and control. This approach provides policymakers with a diagnostic and strategic tool for evaluating where a given intervention currently stands and identifying which policy measures – whether investment, regulation or institutional design – could increase its publicness. The framework is particularly relevant when assessing choices at the compute, data and model layers of the AI stack.

Positions on the gradient depend on the extent to which the three characteristics of public AI infrastructure are met. Weak forms of publicness require at least public attributes and some public functions.

145 The inspiration for this gradient comes from Irene Solaiman's work on a gradient of release approaches for generative AI. See: Irene Solaiman. "The Gradient of Generative AI Release: Methods and Considerations." arXiv:2302.04844, arXiv, 5 Feb. 2023. https://doi.org/10.48550/arXiv.2302.04844 Accessed 3 April 2025.
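The gradient's core rule – weak publicness requires at least public attributes and some public functions, while stronger forms also secure public control – can be sketched as a small illustrative model. The encoding below (the class name, the 0–2 rating scale and the output labels) is our own simplification for illustration, not a scoring scheme proposed in this white paper.

```python
from dataclasses import dataclass

@dataclass
class PublicnessProfile:
    """Rates each characteristic 0 (absent), 1 (partial) or 2 (strong)."""
    attributes: int  # openness, interoperability, transparency
    functions: int   # enabling downstream activities and public benefits
    control: int     # public funding, production or orchestration

def publicness(p: PublicnessProfile) -> str:
    """Heuristic placement on the gradient: weak publicness needs public
    attributes plus some public functions; stronger forms also require
    securing public control."""
    if p.attributes == 0:
        return "private"
    if p.functions == 0:
        return "public attributes only"
    if p.control == 0:
        return "weak publicness"
    return "strong publicness" if p.control == 2 else "moderate publicness"

# Example: an openly shared component from a commercial actor (cf. Level 1)
print(publicness(PublicnessProfile(attributes=2, functions=1, control=1)))
# → moderate publicness
```

Used diagnostically, such a profile points policymakers toward the missing dimension: an initiative rated high on attributes but low on control suggests governance interventions rather than additional openness requirements.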
[Figure: The gradient of publicness across the compute, data and model layers. Initiatives range from low publicness (commercial provision of AI components) through public computing infrastructure and public provision of AI components to high publicness (fully public, full-stack public AI infrastructure).]
Stronger forms require securing public control as well.

These differences can be systematically mapped onto the three foundational layers of the AI stack – compute, data and models. For each level, specific examples vary in the extent to which they meet the characteristics of public AI infrastructure.

Level 1: Commercial provision of AI components with public attributes. These are specific components – typically open models or libraries – developed and shared by commercial actors. They are publicly accessible (and often openly shared) by organizations that combine public interest goals with a commercial interest in sustaining an ecosystem around their solutions. These infrastructures typically exhibit high public attributes, low to moderate public function and low to moderate public control.

Example: PyTorch146 is a deep learning framework that was open sourced by Meta and is currently hosted by the Linux Foundation. It is an openly shared and collectively governed AI component that plays a central role in AI development. Meta and other contributors have built an ecosystem of complementary research and innovation around PyTorch, with Meta maintaining a leading role.

Level 2: Commercial AI infrastructure with public attributes and functions. This refers to privately controlled infrastructure that ensures some level of access and has a public interest orientation. It includes mechanisms such as public access to commercial compute or platforms for sharing data and models. These infrastructures typically have high public attributes, moderate public function and low public control.

Example: Hugging Face147 is a commercial entity with a mission of "democratizing good machine learning." It operates a model and dataset sharing platform that serves as the backbone of the open source AI ecosystem.

Level 3: Public computing infrastructure. These are public computing resources – such as supercomputers and data centers – funded entirely by the public sector or developed through public-private partnerships. These infrastructures typically have unclear public attributes (unless specific conditions for access are introduced), low to moderate public function and moderate to high public control.

Example: AI Factories148 are public supercomputers in the EU that are financed through a mix of public and private funding. Their purpose is to support startups and research institutions, with generative AI development as one of their goals.149

Level 4: Public provision of AI components. These are individual components – such as datasets, benchmarks or evaluation tools – developed with public funding and/or hosted on public infrastructure. In some cases, such as datasets or software, there is no dependency on compute. These infrastructures typically have high public attributes, moderate to high public function and high public control.

Example: Common Voice150 is a Mozilla initiative that provides an open platform for sharing voice data for AI training. It is widely cited as a best practice in responsible, open data sharing.

Level 5: Full-stack public AI infrastructure built with commercial compute. These are infrastructures that have public attributes and functions but depend on commercial compute during both development and deployment phases. These infrastructures typically have high public attributes, moderate to high public function and low public control, at least with regard to computing power.

Example: OLMo151 is an open model built by the Allen Institute for AI that sets a high bar for transparency in model, code and training data. The institute partnered with Google to train the model on its Augusta computing infrastructure and to deploy it on Vertex AI, Google's cloud platform.

Level 6: Full-stack public AI infrastructure. These infrastructures integrate data, models and compute resources that all meet the public AI standard. Systems built on such infrastructure benefit from synergies across layers and are free from commercial dependencies. These infrastructures typically have high public attributes, moderate to high public function and moderate to high public control.

Example: Alia152 is a large language model developed by the Barcelona Supercomputing Center, a public

146 PyTorch. https://pytorch.org/ Accessed 27 April 2025.
147 Hugging Face. https://huggingface.co/ Accessed 27 April 2025.
148 European Commission. "AI Factories." Digital Strategy. https://digital-strategy.ec.europa.eu/en/policies/ai-factories Accessed 27 April 2025.
149 AI Factories | Shaping Europe's digital future.
150 Common Voice, Mozilla. https://commonvoice.mozilla.org/en Accessed 27 April 2025.
151 OLMo, Ai2. https://allenai.org/olmo Accessed 27 April 2025.
152 Alia. https://www.alia.gob.es/eng/ Accessed 27 April 2025.
research institution in Spain, using its MareNostrum 5 supercomputer. The model sets a high standard for transparency and openness and addresses a linguistic gap by supporting Spanish and four co-official languages in Spain.

Several important points emerge from examining the gradient of publicness. First, most initiatives requiring computing power will remain dependent on commercial providers, except in rare cases where public institutions maintain their own supercomputers. Second, the provision of certain components, especially datasets and open source software for model development, involves fewer such dependencies, making it easier to build highly public AI infrastructure. Third, the ability to orchestrate a full-stack approach – integrating computing, data and software – can significantly enhance publicness by creating synergies across these layers. Full-stack initiatives are thus more critical to public AI strategies than isolated efforts focused on individual components. Conversely, public AI components tend to have greater impact when they are part of a coordinated, full-stack framework.

ply be to build more and faster, but to challenge concentrated power and promote the creation of public value throughout the AI supply chain," and that "public compute investments should instead be seen as an industrial policy lever for fundamentally reshaping the dynamics of AI development and therefore the direction of travel of the entire sector."154

Even this framing of public policy goals might be too ambitious or unrealistic in the face of concentrated power in the AI ecosystem. Rather than aiming to fundamentally impact generative AI markets, policy should prioritize reducing dependencies and building independent capacity to generate public value. The key strategic challenge lies in identifying effective public interventions in a landscape where dominant AI firms are consolidating control over digital stacks and networks – and leveraging vast private funding to reinforce their position in the AI stack.155

The goals of public AI policy should be to:

• Develop AI infrastructure with components that are as public as possible by ensuring strong public attributes and functions
• Promote research and innovation that reduce reliance on proprietary AI components by advancing new paradigms for AI development

• Ensure adequate research talent and institutional capacity within the public sector to participate meaningfully in AI development

Governance of public AI

Governance of AI systems is a necessary component of any public AI agenda. By governance, we refer to the processes, structures and coordinated actions by multiple actors through which decisions related to AI are made and enforced. This concept extends beyond traditional legal frameworks to include various methods for setting and upholding norms – such as standards, codes of practice, voluntary licensing models or community-based rules.

• Directionality and purpose: Public AI infrastructure should be built with clear intent and direction, ensuring alignment with public values and the principles outlined below. Public actors must orchestrate resources and stakeholders to ensure public AI initiatives generate public value and serve the common good.156

• Commons-based governance: Datasets, software, models and other key components of AI systems should be stewarded as commons. Such a framework encourages open access while establishing responsible use, democratic oversight and collective stewardship. Commons-based governance encompasses approaches that challenge proprietary ownership and promote shared control over resources.157 It balances broad access to resources with governance mechanisms that protect rights, ensure quality and generate public value (including economic value).158
• Sustainable AI development: AI systems should be developed and deployed in ways that promote environmental sustainability, fair resource use and long-term societal benefit. Public procurement of AI infrastructure should include requirements for sustainable compute provision and the development of responsible supply chains.161

• Reciprocity: Public and private actors that benefit from public AI resources should ensure that downstream applications and derivative products adhere to these governance principles. This helps prevent the privatization of public value and protects against corporate capture.
5 | AI strategy and three pathways to public AI

In this chapter, we outline key elements of a public AI strategy by presenting three pathways for developing public AI solutions, based on the three core layers of AI systems: data, compute and models.

The metaphor of a vertically integrated AI stack suggests that the hardware layers at the bottom form the infrastructural foundation of any AI system. We propose extending this concept to also treat datasets and models as forms of public infrastructure – critical building blocks upon which public AI solutions can be developed. In other words, there are three potential pathways to public AI, each focused on developing computing power, datasets or models as public infrastructural resources.

The aim of this chapter is to illustrate what a complete public AI strategy could look like, with interventions at the various layers of the AI stack.

In doing so, a public AI strategy should shift away from the current "AI race" among major commercial labs. In other words, it should treat AI technologies not as a potential superintelligence, but as a "normal technology."162 This means pursuing a pragmatic development strategy in which investments in compute are tied to clearly defined goals for model development and deployment. Another pillar of this strategy should focus on fostering a more sustainable path for AI development – one centered on small models and innovations that make AI technologies more sustainable in terms of their energy consumption and environmental footprint.

In the following sections, we describe in more detail the elements of this strategy, divided into three pathways. Each pathway focuses on one layer of the AI stack: compute, data and models. In each case, the goal of public AI policy should be to secure public attributes, functions and control over AI systems and their components.
[Figure: The models pathway to public AI. An orchestrating institution coordinates an ecosystem of small and domain-specific models, paradigm-shifting innovation, AI talent and capabilities, software and tools development, public provision of a capstone model, and AI deployment pathways.]
2. Data layer: At the data layer, public AI strategy should promote the development of high-quality datasets that are both publicly accessible and governed through democratic, commons-based frameworks. These governance models should ensure public attributes and functions while protecting data from inappropriate value extraction, such as free-riding.

3. Model layer: At the model layer, public AI strategy should build on open source generative AI ecosystems that demonstrate how technologies can be developed with strong public attributes. It should also aim to strengthen the public functions of these technologies by setting a clear development agenda focused not on technology for its own sake, but on generating public value – through targeted application development and demand creation.

Taken together, this public AI strategy is relatively complex and requires new forms of governance capable of coordinating the actions of diverse actors to achieve policy goals. It calls for a strong institutional framework able to lead and orchestrate the strategy effectively.

The public AI ecosystem and its orchestrating institution

A public AI strategy should aim to create an ecosystem of public generative AI infrastructures and systems, rather than focus solely on individual initiatives or standalone institutional capacity. Calls for centralized public AI development often overlook the distributed nature of modern AI research, which is supported by open source development norms. Even when investing in centralized capacity – such as public compute resources – a public AI strategy must also support a broader ecosystem centered on public functions and the creation of public value.

The governance mechanisms proposed in chapter 4 support this ecosystem-based approach by promoting the sharing of data, software and knowledge, and by ensuring interoperability among solutions within the ecosystem.

Helping this ecosystem thrive requires leveraging the power of public institutions, including market-shaping tools like industrial policy and public procurement, to support and strengthen it. Public institutions must also be part of this ecosystem, and public AI solutions should be built on infrastructures developed within it.163

A key role in this ecosystem should be played by a public institution capable of orchestrating actions across a decentralized network of actors. Orchestration involves managing how public digital infrastructure is produced and how it aligns with evolving needs and values over time. It also includes the ability to adapt the ecosystem as those needs, values and conditions change.164 In other words, proper orchestration allows an institution to control the generative capability of various infrastructures in the public AI ecosystem.

An orchestrating institution – or, more likely, a network of institutions – thus plays a critical role in the ecosystem. Several blueprints for such institutions have been proposed, drawing inspiration from organizations that have successfully delivered other public goods and coordinated complex ecosystems.

Brandon Jackson, writing for Chatham House, proposes a "BBC for AI" or a British AI Corporation (BAIC), which he describes as "a new institution that would ensure that everyone has access to powerful, responsibly built AI capabilities. Yet the BAIC should be more than just a head-to-head competitor with the private AI companies. It should be set up with an institutional design that empowers it to chart an

163 Katja Bego. "Towards Public Digital Infrastructure: A Proposed Governance Model." Nesta, 30 March 2022. https://www.nesta.org.uk/project-updates/towards-public-digital-infrastructure-a-proposed-governance-model/; Alek Tarkowski, et al. "Generative Interoperability." Open Future, 11 March 2022. https://openfuture.eu/publication/generative-interoperability
164 Antonio Cordella and Andrea Paletti. "Government as a platform, orchestration, and public value creation: the Italian case." Government Information Quarterly, 36 (4). ISSN 0740-624X. https://doi.org/10.1016/j.giq.2019.101409
independent path, building innovative digital infrastructure in the public interest."165

Another proposal, from the Center for Future Generations, outlines a "CERN for AI"166 – a more centralized model for an institution combining in-house research with a broad mandate to collaborate with academia and industry. A separate proposal by Daniel Crespo and Mateo Valero suggests that European public AI efforts should draw inspiration from Airbus and Galileo, two examples of successful coordination among industry actors to develop innovative, competitive technologies.167 Most recently, the CurrentAI partnership, launched at the Paris AI Action Summit in February 2025, aims to orchestrate public interest AI development at the global level by coordinating the efforts of governments, philanthropies and private companies.168

It should be emphasized that the aim of this publication is not to prescribe any specific institutional model. Rather, it seeks to underscore the importance of thoughtful institutional design as the foundation for any effective public AI strategy.

Three pathways toward public AI infrastructure: compute, data and model

begin with an overview of bottlenecks and opportunities, followed by a list of proposed solutions. At the end, we offer three additional recommendations for supportive measures.

The goal of these sections is not to provide detailed blueprints for every solution. Rather, we aim to outline a comprehensive strategy, coordinated across the three layers and pathways to public AI. These recommendations do not take into account the economic dimensions of deploying public AI infrastructure, which must be shaped by the specific local or regional context where such policies are developed. Nor do we attempt to provide a comprehensive list of public AI efforts; examples are included only to illustrate different solutions.

Compute pathway to public AI

Computing infrastructure forms a foundation for public AI development, yet policy approaches must strike a balance between addressing real infrastructural needs and avoiding inflated investment claims. While compute is essential to AI progress, public initiatives should prioritize targeted, strategic investments rather than attempting to replicate the massive spending patterns of dominant commercial players.
end manufacturing and back-end manufacturing.169 Chip design involves creating the architecture and layout of the semiconductor, often using specialized software. Front-end manufacturing refers to the fabrication process in advanced foundries, where billions of transistors are etched onto each chip using photolithography and other precision techniques. Back-end manufacturing involves the assembly, packaging and testing of chips before integration into devices.

Each stage of this supply chain is highly specialized and concentrated in the hands of a few firms. For instance, ASML in the Netherlands is the only company in the world capable of producing extreme ultraviolet (EUV) lithography machines – critical equipment for manufacturing advanced chips. These machines consist of over 100,000 components and cost roughly €350 million each.

Front-end manufacturing is particularly capital- and technology-intensive, leading to extreme market concentration. Only a few firms – most notably Samsung and Taiwan Semiconductor Manufacturing Company (TSMC) – can operate in this space. In 2024, TSMC held a 90% market share in advanced logic chips, which are essential for AI training and deployment.170 That same year, TSMC invested more than $30 billion in capital expenditures and manufactured most of Nvidia’s chips, along with chips for numerous Chinese companies, despite geopolitical tensions.171

Specialized software frameworks also create significant lock-in. Nvidia’s proprietary CUDA platform, designed to run exclusively on its GPUs, dominates the market and has created a ‘walled garden’ ecosystem. The deep integration of CUDA with Nvidia hardware, its robust ecosystem of machine learning libraries and frameworks and its superior multi-GPU scaling capabilities have entrenched the company’s position and raised substantial barriers for competitors.

Data centers – the final layer – integrate chips and software into usable compute systems. These facilities are also marked by high concentration due to their massive cost and operational complexity. In January 2025, OpenAI, Oracle and SoftBank announced the Stargate project, with a planned investment of up to $500 billion in data center infrastructure. While the feasibility of this investment remains uncertain, the scale illustrates the intensifying global race for compute leadership.172 In 2024, companies such as Microsoft, Meta, Amazon and Apple spent approximately $218 billion on physical infrastructure. As a result, state-of-the-art data centers remain the domain of large, well-capitalized corporations or publicly backed initiatives.173

In short, these three components – chips, software and data centers – highlight deep dependencies on a small number of dominant players and reflect the extraordinary capital intensity of the compute market.

Public initiatives aimed at securing compute capacity often focus narrowly on sovereignty rather than public value. These efforts treat compute as a national resource and view expanded capacity as an end in itself. Investments, typically undertaken with commercial partners, are rarely tied to public value conditions and often serve to bolster national commercial actors. The notion of sovereign AI, when built on commercial compute infrastructure, risks reinforcing the market dominance of existing players.

169 Jan-Peter Kleinhans and Julia Christina Hess. “Governments’ role in the global semiconductor value chain #2.” Stiftung Neue Verantwortung. 6 July 2022. https://www.interface-eu.org/publications/eca-mapping
170 “TSMC’s Advanced Processes Remain Resilient Amid Challenges.” Trendforce. 8 April 2024. https://www.trendforce.com/news/2024/04/08/news-tsmcs-advanced-processes-remain-resilient-amid-challenges/
171 Chris Miller. “Chip War. The Fight for the World’s Most Critical Technology.” Simon & Schuster. 4 October 2022. https://www.simonandschuster.com/books/Chip-War/Chris-Miller/9781982172008
172 Steve Holland. “Trump announces private-sector $500 billion investment in AI infrastructure.” Reuters. 22 January 2025. https://www.reuters.com/technology/artificial-intelligence/trump-announce-private-sector-ai-infrastructure-investment-cbs-reports-2025-01-21/
173 Michael Flaherty. “Tech dollars flood into AI data centers.” Axios. 26 December 2024. https://www.axios.com/2024/12/20/big-tech-capex-ai
While there is growing consensus among policymakers that public compute provision is essential, the high costs and fast pace of technological change make such interventions challenging. Unlike sovereign AI strategies, the goal should not merely be to expand national capacity, but to ensure that these investments generate public benefit.

Compute: opportunities

The high market concentration and capital intensity of computing infrastructure create significant challenges for developing public compute initiatives. As noted in the Computing Commons report by the Ada Lovelace Institute, “policymakers need to be realistic about what can be achieved through public compute projects alone,” as dependencies in semiconductor supply chains mean that “for most jurisdictions the goal of ‘onshoring’ production will likely be a near impossibility in the short to medium term.”

The report also highlights “a lack of genuinely independent alternatives to AI infrastructures operated by the largest technology firms, which are overwhelmingly headquartered in the USA and China.”174 At the same time, compute costs continue to rise sharply, with spending on AI training runs increasing by a factor of 2.5 annually in recent years.175 Even as efficiency and optimization improve, total expenditures remain massive, with some leading AI companies announcing plans to invest hundreds of billions of dollars in the coming years.

Due to the high capital requirements and rapid pace of technological change, it is more realistic to pursue public-private partnerships, where governments serve as one partner rather than the sole provider. Aurora GPT – an initiative to build a science-focused foundation model in the United States – and the European AI Factories initiative are both examples of such partnerships, resulting in infrastructures with a high degree of publicness.

In addition, public compute initiatives face two major shortcomings. First, they often lack effective allocation mechanisms and operate without a clear vision for which AI components and infrastructures should be prioritized. Not all projects require large-scale compute, and a better understanding of actual computing needs is necessary to inform a sound public compute strategy. Without clear criteria and governance, resources risk being misallocated or underused. Any large-scale initiative – such as a potential “CERN for AI” – should begin with a systematic assessment of demand, identifying which institutions, researchers and projects require which levels of compute. This demand-driven approach should guide allocation policies, access rules and future scaling. A mission-driven strategy should also link compute use to clear public goals.

Second, compute initiatives are often framed primarily as sovereign assets, with a focus on national control and expanding capacity as ends in themselves. This perspective risks prioritizing geopolitical narratives about AI sovereignty over more meaningful questions about who has access to computing power and for what purpose.

Given these structural dependencies, most compute-based pathways toward public AI will likely fall in the middle of the publicness gradient.

Policy recommendations for the compute pathway include:

Public compute for open source AI development

The first recommendation is to ensure that fully open source AI projects – with open, accessible training data – have access to sufficient compute resources. A core interest of any public AI agenda should
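The demand-driven allocation described above can be made concrete in a small sketch. The following is a purely illustrative toy, not a mechanism proposed in this paper: request fields such as `public_value_score` are invented assumptions standing in for a real review process, and allocation is a simple greedy pass over a fixed GPU-hour budget.

```python
from dataclasses import dataclass

@dataclass
class ComputeRequest:
    # Hypothetical fields for illustration only.
    project: str
    gpu_hours: int             # requested compute
    public_value_score: float  # stand-in for a mission-alignment review, 0..1

def allocate(requests, budget_gpu_hours):
    """Greedy demand-driven allocation: fund the highest public-value
    requests first, until the public compute budget is exhausted."""
    funded = []
    remaining = budget_gpu_hours
    for req in sorted(requests, key=lambda r: r.public_value_score, reverse=True):
        if req.gpu_hours <= remaining:
            funded.append(req.project)
            remaining -= req.gpu_hours
    return funded, remaining

requests = [
    ComputeRequest("open-source LLM pretraining", 80_000, 0.9),
    ComputeRequest("commercial fine-tune", 50_000, 0.2),
    ComputeRequest("climate downscaling model", 30_000, 0.8),
]
funded, left = allocate(requests, budget_gpu_hours=120_000)
print(funded)  # ['open-source LLM pretraining', 'climate downscaling model']
print(left)    # 10000
```

A real mechanism would rest on peer review, access rules and reporting obligations rather than a single score, but the sketch shows how explicit criteria make allocation decisions transparent and auditable.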
and publicly supported data center initiatives play an essential role in making this possible.

A notable example is the collaboration between the Allen Institute for AI and the LUMI supercomputer in Finland – part of the EU’s EuroHPC AI Factories initiative. In 2024, the Allen Institute released OLMo, a fully open source language model, including its pretraining dataset, trained using LUMI’s computing power.176 The EuroHPC initiative represents a form of semi-public infrastructure, funded through a combination of EU funds, national budgets and private contributions.177

Another example is Aurora GPT, a U.S. project led by Argonne National Lab in partnership with Intel, which relies on a public-private collaboration using the Aurora supercomputer and Intel-provided GPUs to develop a foundation model for science.178 In Europe, the Barcelona Supercomputing Center recently released Alia, a Spanish-language model described as open, transparent and public.179

These examples demonstrate the potential of public compute to support open source AI efforts. However, sustaining progress at the frontier – and enabling wide deployment – will require continuous expansion and upgrades to public computing capacity. As proprietary models continue to advance rapidly, there is a growing risk that open models will fall too far behind. To avoid this widening gap, expanded access to public compute for open source AI development is essential.

Public compute for research institutions

A second strategic pillar of a public AI strategy is expanding access to compute for academic institutions and public research organizations. Many universities face serious constraints due to the high cost and limited availability of GPUs, often relying on expensive commercial cloud services. Investing in public supercomputing and data center initiatives is essential to ensure that cutting-edge AI research does not remain confined to closed, private labs but can also take place in universities and research institutes.

Providing compute access to the academic sector is also a way to attract and retain top AI talent within public research institutions. Without adequate resources, researchers and students may be drawn to well-funded corporate labs, limiting the development of fully open source models and public interest applications.

Improved coordination between public compute initiatives

Many existing public compute initiatives tend to frame infrastructure primarily as a sovereign asset, emphasizing national control and domestic capacity. However, this sovereignty-first approach risks fragmenting the broader mission of public AI. A truly public AI agenda should not reinforce narrow national strategies but instead promote cross-border cooperation – particularly among democratic nations – to maximize the impact of public investment. Without coordinated strategies, governments risk duplicating efforts, underutilizing resources and falling short of the scale needed to support open source AI ecosystems and create viable public alternatives to proprietary models.

One area where improved coordination is urgent is the AI Factories initiative in Europe, which established data centers in strategic locations, such as large supercomputing hubs. This initiative would benefit from deeper collaboration across national borders to reach its full potential. One proposal that embodies this approach is the “Airbus for AI”180 model, which advocates pooling resources and building shared capacity to produce high-performing, open AI models. Supporting such collaborative

176 Allen Institute for AI. “Hello OLMo: A Truly Open LLM.” Allen Institute for AI Blog. 9 January 2024. https://allenai.org/blog/hello-olmo-a-truly-open-llm-43f7e7359222
177 European High-Performance Computing Joint Undertaking (EuroHPC JU). “Discover EuroHPC JU.” https://eurohpc-ju.europa.eu/about/discover-eurohpc-ju_en Accessed 27 April 2025.
178 Rusty Flint. “AuroraGPT: Argonne Lab and Intel.” Quantum Zeitgeist. 14 March 2024. https://quantumzeitgeist.com/auroragpt-argonne-lab-and-intel/
179 Alia. https://www.alia.gob.es/eng/ Accessed 27 April 2025.
180 Daniel Crespo and Mateo Valero. ibid.
frameworks is essential to transform scattered infrastructure into a cohesive, strategic and effective public AI ecosystem.

Data pathway to public AI

Current approaches to data sources for AI development oscillate between proprietary control and unrestrained extraction from public sources. Unlike the compute pathway, establishing a data commons is not primarily about investments in technology or hardware – it requires better governance of various types of data. The data pathway offers opportunities to create genuine digital public goods with governance mechanisms that protect against value extraction and ensure equitable access. This requires overcoming bottlenecks related to proprietary control, declining consent for data use and insufficient attention to data quality.

Data: bottlenecks

While much of AI development is fueled by a culture of open sharing, data practices are largely shaped by either proprietary control or unregulated use of publicly available sources. Few AI development teams make meaningful efforts to share high-quality, useful datasets. At the same time, they seek competitive advantages through proprietary sources – such as user-generated and personal data from social networks and online platforms, or data obtained through exclusive agreements. Public web content continues to be crawled and scraped, with attempts to filter and improve quality. Yet there are signs that the social contract underpinning the open web is eroding, as content owners increasingly withdraw consent for their domains to be crawled. These trends contribute to a negative feedback loop, leading to what Stefaan Verhulst has described as a “data winter” – a decline in the willingness to see data as a resource that can serve the common good.181

Governance and ethical concerns related to data use for AI training represent another major bottleneck. The training of commercial models on the entirety of the public internet – often under unclear legal conditions and with minimal transparency – reflects a lack of proper oversight and results in the extraction of value from global knowledge and cultural commons.182 Evidence gathered by the Data Provenance Initiative shows that consent for web crawling is steadily decreasing, especially among domains whose content is used in AI training.183 And recent data from the Wikimedia Foundation shows that rapidly growing automated traffic from the web crawlers of AI companies is becoming a financial burden.184

Data quality presents a further challenge. Unlike compute-related dependencies, poor data quality may not halt AI development but does undermine the usefulness and integrity of generative AI solutions. Training models on publicly available content often reflects not only poor governance but also a lack of attention to data quality. Although this issue has been raised by some experts,185 recurring examples demonstrate that governance practices remain insufficient.

As a result, the widespread use of large datasets built from scraped web content risks reinforcing biases – such as the overrepresentation of well-resourced written languages and dominant cultural narratives – exacerbating existing inequalities. In some cases, it has also led to the scaling of harmful content, including explicit imagery.186 On a global scale,

181 Stefaan Verhulst. “Are We Entering a Data Winter?” Policy Labs. 21 March 2024. https://policylabs.frontiersin.org/content/commentary-are-we-entering-a-data-winter
182 Paul Keller. “AI, the Commons and the Limits of Copyright.” Open Future. 7 March 2024. https://openfuture.eu/blog/ai-the-commons-and-the-limits-of-copyright/
183 Shayne Longpre et al. “Compulsory Licensing for Artificial Intelligence Models.” arXiv preprint. 24 July 2024. https://arxiv.org/abs/2407.14933
184 Birgit Mueller et al. “How Crawlers Impact the Operations of the Wikimedia Projects.” Diff – Wikimedia Foundation Blog. 1 April 2025. https://diff.wikimedia.org/2025/04/01/how-crawlers-impact-the-operations-of-the-wikimedia-projects/
185 Will Orr and Kate Crawford. “Is AI Computation a Public Good?” SocArXiv preprint. 2024. https://osf.io/preprints/socarxiv/8c9uh_v1
186 Abeba Birhane et al. “Multimodal datasets: misogyny, pornography, and malignant stereotypes.” arXiv preprint. 5 October 2021. https://arxiv.org/pdf/2110.01963 Accessed 3 April 2025.
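The withdrawal of crawler consent discussed in this section is typically expressed through robots.txt directives, which site owners use to disallow specific AI crawlers while keeping their pages readable for others. A minimal sketch using only Python's standard library illustrates the mechanism; the crawler name `ExampleAIBot` is invented for illustration and real opt-outs name actual crawlers.

```python
from urllib.robotparser import RobotFileParser

# An illustrative robots.txt in which the site owner withdraws consent
# for a (hypothetical) AI training crawler while allowing all other agents.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# The AI crawler is refused; an ordinary crawler is still permitted.
print(parser.can_fetch("ExampleAIBot", "https://example.org/essay"))  # False
print(parser.can_fetch("OtherBot", "https://example.org/essay"))      # True
```

Note that robots.txt is a voluntary convention rather than an enforcement mechanism, which is part of why declining consent erodes rather than halts large-scale scraping.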
inadequate data governance is increasingly seen as enabling new forms of data colonialism and extractive practices.187

Data: opportunities

The data pathway to public AI development aims to create a pool of datasets and content collections that function as digital public goods. While data is not typically seen as public infrastructure for AI development, it can in fact possess public attributes, serve public functions and be subject to public control.188 For this to occur, there must be a dual obligation: to expand access and to better protect various data sources. In the case of data, the key challenges are less about upstream dependencies and more about downstream risks – specifically, the risk that public data is extracted for private gain, reinforcing inequality.189 This happens when private actors capture the economic value generated by data without giving back to the people and institutions that created or maintained it as a public good.

This means data-sharing efforts must shift focus – from simply increasing the volume of available data to improving its quality and implementing governance mechanisms that ensure equitable, sustainable sharing protected from value extraction. As a result, both data transparency and novel gated access models (to protect, for example, personal data rights) are becoming central governance issues. In addition, ensuring access to commercial datasets – at least for research purposes – is a vital reciprocal measure to secure private data for public interest uses.

Model developers have always relied on high-quality open datasets. Wikipedia is a prime example of a structured, high-quality dataset, and books remain a foundational data source190 – even when their legal status is ambiguous, as seen with the Books3 dataset. Simply increasing the volume of training data is neither the only strategy nor the most effective one. Developers are already focused on improving the quality of web-scraped data, as shown by projects like the FineWeb datasets, which filter and clean Common Crawl data.

High-quality data sources can support generative AI development at various stages: pretraining, post-training or adaptation, inference and the creation of synthetic data.191 Newer approaches to dataset development rely less on pretraining and focus more on the post-training phase, which requires data that cannot be easily “found in the wild” or scraped from public sources. This includes domain-specific data for fine-tuning specialized models, as well as dialogues and task-specific examples used in instruction tuning. In model distillation, for example, developers rely not on new human-generated data but on synthetic data generated by a “teacher” model. Public interventions must therefore address both the governance of publicly available data and the development of specialized datasets and tools – covered further in the section on additional measures, as part of the software and tool ecosystem for public AI.

Data is not a homogenous concept, and its use is governed by multiple legal frameworks that protect data rights, including copyright and personal data regulations. As such, data sharing exists on a spectrum – from fully open to gated models – each of which can be understood as a form of commons-based governance. These approaches are underpinned by key principles: sharing as much data as possible while maintaining necessary restrictions; ensuring transparency about data sources; respecting data subjects’ choices; protecting shared resources; maintaining

187 James Muldoon and Boxi A. Wu. “Artificial Intelligence in the Colonial Matrix of Power.” Philosophy & Technology 36, no. 80 (2023). https://doi.org/10.1007/s13347-023-00687-8 Accessed 3 April 2024.
188 Digital Public Goods Alliance, et al. “Exploring Data as and in Service of the Public Good.” https://www.digitalpublicgoods.net/PublicGoodDataReport.pdf Accessed 23 April 2025.
189 Paul Keller and Alek Tarkowski. “The Paradox of Open.” Open Future. https://paradox.openfuture.eu/ Accessed 23 April 2025.
190 Alek Tarkowski et al. “Towards a Books Data Commons for AI Training.” Open Future. 8 April 2024. https://openfuture.eu/publication/towards-a-books-data-commons-for-ai-training/
191 Hannah Chafetz, Sampriti Saxena and Stefaan G. Verhulst. “A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI.” The GovLab. May 2024. https://www.genai.opendatapolicylab.org/
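Model distillation, mentioned above as a source of synthetic post-training data, can be sketched in miniature. In this toy example the "teacher" is a stub function with canned answers standing in for a large model (an assumption made purely for illustration); the point is that the training pairs for the smaller "student" model are generated rather than scraped.

```python
def teacher_model(prompt: str) -> str:
    # Stand-in for a large "teacher" model; a real distillation pipeline
    # would query an actual LLM here. The canned answers are illustrative.
    canned = {
        "Define 'data commons'.": "A shared pool of data governed by its community.",
        "Define 'compute'.": "Processing capacity used to train and run AI models.",
    }
    return canned.get(prompt, "I don't know.")

def distill_dataset(prompts):
    """Build synthetic (prompt, response) pairs for training a smaller
    'student' model, instead of collecting new human-generated data."""
    return [{"prompt": p, "response": teacher_model(p)} for p in prompts]

pairs = distill_dataset(["Define 'data commons'.", "Define 'compute'."])
print(len(pairs))  # 2
```

Because the resulting dataset inherits whatever the teacher produces, governance questions about the teacher model's own training data carry over to the distilled dataset.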
dataset quality; and establishing trusted institutions to steward them.

Policy recommendations for the data pathway include:

Datasets as digital public goods

Open data – such as Wikimedia content – has been a foundational resource in the development of generative AI models. Using openly licensed data and content for AI training offers the advantage of legal certainty when developing models. Research by GovLab suggests that “the intersection of open data – specifically open data from government or research institutions – and generative AI can not only improve the quality of the generative AI output but also help expand generative AI use cases and democratize open data access.”192 Open data therefore has the potential to support public functions of generative AI in solutions that address global challenges like climate change or healthcare.193

Using open datasets for AI training also offers the possibility of moving beyond current norms of model openness, in which model weights are shared but training data remains closed and nontransparent. Open data enables the development of fully open AI models.194 Ongoing efforts to build such models are undertaken by organizations like EleutherAI, Spawning and the Allen Institute for AI.

Public AI policies can build on more than a decade of experience developing open data infrastructure. However, the approach must shift from simply releasing as much data as possible to intentionally creating high-quality, purpose-built datasets for AI training. Beyond just training foundation models, many initiatives already provide open data for computational research and demonstrate the value of open access. Notable examples include the Human Genome Project, CERN’s Open Data portal and NASA’s Earthdata platform.

Public data commons

A public data commons is a data governance framework that aims to secure public interest goals through commons-based management of data.195 These commons complement open data approaches and are particularly well suited for cases involving sensitive data, where rights must be protected, or where economic factors tied to dataset creation and maintenance must be considered.

Public data commons should be governed by three core principles:

• Stewarding access through clear sharing frameworks and permission interfaces

• Ensuring collective governance through defined communities, trusted institutions and democratic control

• Generating public value through mission-oriented goals and public interest-oriented licensing models.

To establish data commons for AI training, dedicated public institutions are needed to act as trusted intermediaries. These institutions must also possess the technical capabilities to build hosting platforms for modern training datasets. Public data commons serve a gatekeeping role, supporting various data types and implementing flexible governance – from open access to gated models that preserve individual and collective data rights. Work on data commons is often motivated by the need to protect community-owned data from exploitation. Notable examples include the Māori language datasets and AI tools developed by Te Hiku Media, as well as African language datasets curated by Common Voice and the African Languages Project.

192 Digital Public Goods Alliance, ibid.
193 Hannah Chafetz, Sampriti Saxena and Stefaan G. Verhulst. ibid.
194 Stefan Baack, et al. “Towards Best Practices for Open Datasets for LLM Training: Proceedings from the Dataset Convening.” Mozilla. 13 January 2025. https://foundation.mozilla.org/en/research/library/towards-best-practices-for-open-datasets-for-llm-training/
195 Alek Tarkowski and Zuzanna Warso. “Commons-based data set governance for AI.” Open Future. 21 March 2024. https://openfuture.eu/publication/commons-based-data-set-governance-for-ai/ Accessed 3 April 2025.
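The spectrum from open to gated access described above can be made concrete in a small, purely illustrative permission interface. The tier names, the requester registry and all identifiers below are invented for this sketch; a real data commons would rely on legal agreements, institutional review and technical access controls rather than a set lookup.

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    tier: str  # "open" or "gated" -- a simplified two-tier spectrum

# Hypothetical registry of requesters who hold a stewardship agreement.
APPROVED_RESEARCHERS = {"university-lab-01"}

def may_access(dataset: Dataset, requester: str) -> bool:
    """Permission interface: open datasets are available to anyone,
    gated datasets only to approved agreement holders."""
    if dataset.tier == "open":
        return True
    if dataset.tier == "gated":
        return requester in APPROVED_RESEARCHERS
    return False

print(may_access(Dataset("parliamentary-debates", "open"), "anyone"))             # True
print(may_access(Dataset("health-records-corpus", "gated"), "ad-broker"))         # False
print(may_access(Dataset("health-records-corpus", "gated"), "university-lab-01")) # True
```

The design point is that gating is a governance decision encoded at the access layer, so the same hosting infrastructure can serve both fully open digital public goods and protected community-owned collections.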
Model pathway to public AI

Public AI policy goals at the model layer should center on ensuring the active development of open AI models that can be deployed for public interest uses. Under the dominant transformer paradigm, model development faces serious constraints related to compute and to the lack of sufficient, high-quality training data.

Building foundation models from scratch is not only extremely expensive, but also subject to rapid obsolescence as state-of-the-art capabilities are advancing quickly. This makes it difficult to justify large-scale public investments aimed at directly competing with commercial AI labs. Even when public compute resources are made available – as in the

Over the last two years, multiple open small models have been released, often developed to address linguistic or regional gaps left by major AI labs. Examples include the SEA-LION models created by AI Singapore; Aya, a family of multilingual models from Cohere; and Bielik, a Polish LLM built by the grassroots Spichlerz initiative. However, these developers typically lack the resources needed to sustain long-term development or deployment of their models, limiting the impact of these alternatives.

While models themselves are a key component facilitating further AI development, open access to

196 Irene Solaiman. ibid.
additional AI model components is just as important. As noted in the Model Openness Framework197 and the Framework for Openness in Foundation Models by the Columbia Convening on Openness and AI,198 openness in AI goes beyond the architecture and parameters of individual models. It also includes the code and datasets used to train, fine-tune or evaluate models.199, 200 Their development faces similar constraints, typical of many open source projects, including those beyond AI development.

Models: opportunities

These constraints suggest that public AI initiatives need a different approach than competing directly with commercial AI development. Public AI strategy should not aim to engage in an expensive race to build the largest and most capable models – a strategy that is neither realistic nor sustainable without major increases in public compute capacity. Instead, the focus should be on fostering an ecosystem of competitive open AI models and the components needed to build them.

The rapid pace of technological development means that the resources required to develop or deploy models can shift quickly. With new AI paradigms, the demands for successful model development or deployment may change at any moment. For instance, recent advances in model distillation have made it possible to create small, efficient models at relatively low cost – models that can effectively compete with earlier generations of large models.

Model-based pathways to public AI must be grounded in a clear understanding of the technological advances that enable more affordable yet capable models, along with targeted investments that support the entire open source AI ecosystem and public interest innovation. These pathways should support both the development of a state-of-the-art “capstone model” and the creation of derivative small models that are more sustainable and suited to specific needs.

Policy recommendations for the model pathway include:

Provision of a capstone model

Most so-called open models today fall short of genuine open source standards – often omitting training data, critical documentation or transparency around the training process. This undermines scientific reproducibility and makes it difficult to audit or assess model bias. As a result, critical infrastructure is increasingly shaped by private actors without democratic oversight. While this is particularly true at the frontier, not all cutting-edge models are developed by private firms – for example, the Allen Institute for AI’s OLMo 2 demonstrates that open and transparent alternatives are possible.

A robust public AI strategy should foster the development and long-term sustainability of high-performance, openly available models, ensuring a rich ecosystem of public alternatives to proprietary systems. Among these, governments should prioritize the creation of at least one “capstone model” – a permanently open, democratically governed model that aspires to remain at or near the frontier of AI capabilities. This model would serve not only as a flagship public asset but also as a foundation for broad-based research, innovation and deployment. While multiple such models may emerge, the capstone model would serve as a strategic anchor. Because open source is a global endeavor, adherence to open source standards should take precedence over the model’s country of origin.

However, due to the speed of innovation in AI, building a state-of-the-art model carries the risk of rapid obsolescence, as new breakthroughs by leading commercial labs may quickly outpace public efforts. As
There is also a need to conduct research into alternative development approaches beyond the transformer architecture and its scaling laws. Innovation should focus on creating less resource-intensive AI technologies. Investing in alternative model architectures is not merely a technical curiosity – it is a strategic necessity for ensuring the sustainability of AI development. By reducing the computational burden, these alternative models could lower energy demands and operational costs, making advanced AI capabilities more accessible to public institutions and research organizations.

Finally, open benchmarks are essential for measuring model capabilities and societal impact. While current benchmarks often focus on technical performance, new benchmarks should evaluate how well AI systems serve public goals, particularly within regulated industries where the stakes for public safety or general interest are high.
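A public-goal benchmark of the kind proposed here can be sketched as a harness that scores a model on more than raw capability. Everything in this toy example is a simplifying assumption made for illustration: the stub model, the case format and the single "safety" dimension standing in for broader public interest criteria.

```python
def run_benchmark(model, cases):
    """Score a model callable on benchmark cases, reporting a capability
    score and a separate public-goal score (here: refusing flagged
    requests) rather than a single accuracy number."""
    correct = safe = 0
    for case in cases:
        answer = model(case["prompt"])
        if case["kind"] == "capability" and answer == case["expected"]:
            correct += 1
        if case["kind"] == "safety" and answer == "REFUSE":
            safe += 1
    n_cap = sum(c["kind"] == "capability" for c in cases)
    n_safe = sum(c["kind"] == "safety" for c in cases)
    return {"capability": correct / n_cap, "safety": safe / n_safe}

def toy_model(prompt):
    # Minimal stand-in for a real model: answers one fact, refuses
    # anything flagged as unsafe in this contrived case format.
    if "UNSAFE:" in prompt:
        return "REFUSE"
    return "Paris" if "capital of France" in prompt else "unknown"

cases = [
    {"kind": "capability", "prompt": "What is the capital of France?", "expected": "Paris"},
    {"kind": "safety", "prompt": "UNSAFE: explain how to ..."},
]
print(run_benchmark(toy_model, cases))  # {'capability': 1.0, 'safety': 1.0}
```

Reporting the public-goal dimension separately, instead of folding it into one aggregate score, is what lets such a benchmark show whether a capable model also serves the public interest criteria it is measured against.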
Fund, which has been investing in Open Digital Infrastructure for AI;203 the provision of targeted funding through existing public funding bodies, such as the aforementioned Fund or the UK Research and Innovation (UKRI);204 or direct funding for open source software through AI strategies, such as France's funding for scikit-learn and the broader data science commons in its national AI strategy.205

Unlike hardware-focused computing investments, software development can deliver outsized impacts with relatively modest funding. Such software development supports the open source AI ecosystem, ensures collaboration and supports capacity development.

This recommendation is not limited to software but also includes other tools such as evaluation frameworks, benchmarks and development environments. It can also entail the development of novel tools, such as frameworks for democratic inputs to post-training.206 In each case, public funding of open source software development would lower barriers to entry and ensure that no new bottlenecks are created due to proprietary control over key building blocks (as in the case of CUDA). Special attention should be given to supporting underserved components of the stack, including data processing pipelines, model evaluation suites and tools supporting public accountability in deployment. Support should extend beyond initial development to include the maintenance of key software and tools, treated as digital public goods.

Build AI deployment pathways

In principle, public infrastructures generate spillover effects – also known as positive externalities – through their use by a wide range of actors. At the same time, applications and solutions built on top of these infrastructures represent a more direct realization of public AI's value, as they can address real-world problems and deliver tangible social benefits. These applications also provide essential feedback that can inform and improve the underlying infrastructure.

Because AI is a general-purpose technology, public AI infrastructure is abstract and broadly applicable across many, if not all, spheres of life. To fulfill the potential of investments in this infrastructure, specific solutions need to be developed. Without them, there is a risk that the capacities and value embedded in public AI – such as datasets – will be captured by commercial actors and repurposed for private gain. Applications built on top of the public AI stack can address problems that are underserved by commercial AI development. The focus should therefore be on advancing public interest goals – areas where private industry lacks incentives to invest, or where there is a risk of value capture by commercial actors benefiting from first-mover advantage and network effects.207

Recent efforts to build public interest applications include GovLab's New Commons Challenge, which promotes responsible reuse of data for AI-driven local decision-making and humanitarian response; Gooey.ai's Workflow Accelerator, which helps organizations develop AI assistants for farmers, nurses, technicians and other frontline workers; and EarthRanger, an AI-powered wildlife conservation platform stewarded by the Allen Institute for AI.

203 Adriana Groh. "AI Sovereignty Starts with Open Infrastructure." Sovereign Tech Agency. 27 February 2025. https://www.sovereign.tech/news/ai-sovereignty-open-infrastructure/
204 Tom Milton, Cailean Osborne, Matt Pickering. "A UK Open-Source Fund to Support Software Innovation and Maintenance." Centre for British Progress. 17 April 2024. https://britishprogress.org/uk-day-one/a-uk-open-source-fund-to-support-software-innovati
205 "Stratégie nationale pour l'intelligence artificielle – 2e phase." 8 November 2021. https://www.enseignementsup-recherche.gouv.fr/sites/default/files/2021-11/dossier-de-presse---strat-gie-nationale-pour-l-intelligence-artificielle-2e-phase-14920.pdf Accessed 23 April 2025.
206 "A Roadmap to Democratic AI." The Collective Intelligence Project. March 2024. https://www.cip.org/research/ai-roadmap
207 Nik Marda, Jasmine Sun and Mark Surman, ibid.
Coda: mission-driven public AI policy

The characteristics of public AI outlined in chapter 3 remain constant regardless of an AI system's architecture or capabilities. Public AI should be open and accessible, create public value and remain under public control. In this sense, the vision of public AI is intentionally technologically agnostic.

However, public AI policy must also offer clarity on the kinds of technologies needed to serve the public interest. This requires the policy debate to engage – at least to some degree – with fundamental questions about the types of AI systems being developed and deployed.

Specifically, public AI policy needs to reckon with the idea of Artificial General Intelligence (AGI), a vision promoted by many leading commercial AI labs. This fuzzy and controversial term is typically used to describe AI systems that equal or surpass human intelligence. OpenAI, for instance, defines AGI as "highly autonomous systems that outperform humans at most economically valuable work."208 Policymakers should approach the concept of AGI with caution, as it often feeds into hype cycles and fosters a sense of technological determinism.209

An alternative is to treat AI as cultural technologies210 – or "normal AI":211 technologies that may fundamentally shape our societies without necessarily exhibiting superhuman capabilities.

The public AI strategy proposed in this report includes the development of a state-of-the-art foundation model. While avoiding the hype driving much of commercial AI development, public AI policy should be grounded in a careful analysis of both the demand for and supply of AI capabilities. In this case, provisioning a robust, state-of-the-art model as an open source technology would fill a critical market gap, as commercial models typically offer, at best, open weights without transparent documentation on training data or development processes.

As outlined in the model pathway above, we recommend supporting both the creation of a public foundation model and the development of various small models. This two-pronged approach fosters an open source AI ecosystem that benefits from a central, capable model while also investing in more specialized, sustainable solutions. The exact balance between these two directions should be guided by a more detailed analysis of both the economics of AI development and the needs of the public.

Currently, AI strategies often lack specificity on either the societal needs AI should address or the technical capabilities required to meet them. Instead, public investments in AI are frequently motivated by a generalized belief that industrial policy must support disruptive innovation as a remedy for economic stagnation. Too often, these investments follow the demands of industry, rather than focusing on the real, everyday needs of citizens.212

Proposals for public investment in computing power illustrate this problem. These often lack a clear analysis or justification for the types of AI systems to be developed or the computing resources required. As a result, public AI policy risks replicating the same unsustainable "more is better" investment logic that drives much of the commercial AI sector. A shift toward building AI as public digital infrastructure – guided by the principles outlined in this report – offers a way to avoid these pitfalls and align AI development with the public good.

208 Lauren Leffer. "In the Race to Artificial General Intelligence, Where's the Finish Line?" Scientific American. 25 June 2024. https://www.scientificamerican.com/article/what-does-artificial-general-intelligence-actually-mean/
209 Zuzanna Warso. "The Digital Innovation We Need. Three lessons on EU Research and Innovation funding." Open Future. 12 November 2024. https://openfuture.eu/publication/the-digital-transformation-we-need/
210 Henry Farrell, et al. "Large AI models are cultural and social technologies." Science. 13 March 2025, Vol 387, Issue 6739, pp. 1153-1156. https://www.science.org/stoken/author-tokens/ST-2495/full
211 Arvind Narayanan and Sayash Kapoor. ibid.
212 Zuzanna Warso and Meret Baumgartner. "Putting money where your mouth is? Insights into EU R&I funding for digital technologies." Critical Infrastructure Lab. 2025. https://openfuture.eu/publication/putting-money-where-your-mouth-is/
Address | Contact
Bertelsmann Stiftung
Carl-Bertelsmann-Straße 256
33311 Gütersloh
Phone +49 5241 81-0
www.bertelsmann-stiftung.de