CloudFest speaker Brewster Kahle on Public AI and Creating The Internet Archive and Wayback Machine

Brewster Kahle

Brewster Kahle is the founder and Digital Librarian of the Internet Archive, one of the largest digital libraries in the world. He has spent three decades trying to build “a library of everything” for the digital age, bringing us the ever-popular Wayback Machine. Now he wants the Cloud world to help make AI work for the public good.

As CloudFest keynote speaker, Brewster will share his story and look to the future of the Cloud. He’ll discuss how Cloud providers sit right at the fault line between information, infrastructure, and policy.

But we couldn’t wait for that, so we called him to get some of his latest thoughts on everything from Finnish-speaking AI to the need for a European YouTube.

CloudFest attendees will know you as “the Wayback Machine guy.” How did your career lead to that point?

I was trained at the MIT AI Lab around 1980, and the idea back then was to build a “library of everything”—if you wanted a global brain, it should be able to read good books.

We didn’t yet have the pieces, so I helped build a supercomputer that became the first search engine on the internet, worked on a publishing system that predated the web, helped get publishers onto the web, and pushed free and open source software and open protocols.

By 1996, the infrastructure was finally there, and I could focus on building the library itself by founding the Internet Archive.

What does the Internet Archive look like today as a “library of everything”?

We started by collecting the web, and that grew very fast. Then we began systematically archiving television—from Russia, China, Japan, Iraq, Al Jazeera, Fox, ABC, NBC, CBS—and later expanded into music, books and software.

A big current focus is government websites and datasets, under a project we call Democracy’s Library. We also digitize on the order of a million books a year while continuing to collect foreign digital materials, mostly from the web.

You’ve used the phrase “universal access to all knowledge.” Is that your guiding principle?

Yes. If there’s a byline for the whole project, it’s “universal access to all knowledge.” But in the age of AI, that goal now implies “public AI” too—AI that serves research, education and the public good, rather than just commercial interests.

How is the explosion of AI‑generated content changing the job of archiving the web?

We’re archiving AI‑generated materials as well, but they bring new challenges. Every website can now be custom, just like we already saw with JavaScript‑heavy sites in the last decade, and we’re heading into a world where every visit is personalized to what the system thinks you want to hear.

That means we may have to archive many different “shots” of the same site to capture what different people see; otherwise you can’t even be sure you’re all looking at the same storefront.

Why has trust become a strategic issue for Cloud and AI providers, not just a technical one?

Right now, the web feels like it’s failing us—we feel surveilled, and AI risks making that worse if it’s controlled by only a few big companies.

What we want instead is a game with many winners, where small and medium‑sized players can still deliver competitive user experiences and services. It shouldn’t be a stark choice between American and Chinese AI giants; we should see European, local, and contextual systems too, and build a smarter infrastructure rather than more powerful monopolies.

How do you see European policy on AI and data?

European policy on AI and data is a mixed bag. On paper, Europe’s Copyright Act allows cultural heritage and research organizations to do text and data mining—which is basically AI research—over large datasets. But a survey by the University of Amsterdam’s law school couldn’t find many cultural heritage institutions actually using that right, largely because of a perceived lack of legal certainty.

Other jurisdictions are more welcoming: Switzerland lets anyone do text and data mining for scientific research, and you see about 3,000 Google researchers near Zurich, plus 1,000 from Meta and lots of startups. Japan, Singapore and China also have policies that are more favorable to innovation, while in Europe, powerful rightholder lobbies are holding things back despite some good laws.

What do you mean by “public AI”?

Public AI is using AI tools and services for the public good, especially in research and education. One example: Internet Archive Europe is working with a research organization to build “Climate GPT,” models trained on the best climate information we have, to offer free services to people dealing with climate impacts—scientists, policymakers, city planners, and more.

Another project with the Digital Thinking Network focuses on large language models for smaller languages, not by just translating in and out of English‑centric models, but by training foundational models purely on, say, Afrikaans or Finnish so they truly reflect that language’s point of view.

I think this is an exciting direction, especially as LLMs start to shape how the next generation learns and even how we all speak.

You’ve warned about Europe missing waves of innovation before. What’s the risk this time?

Look at the track record: Europe never really had its own major search engine; ill-adapted laws and strong lobbies blocked that. It doesn’t have a homegrown YouTube‑style user‑contributed content platform, and it doesn’t have its own dominant social media stack, so it relies on the American stack, and people are rightly nervous about that.

Unless policies and practices change to make it easy for startups, universities and new services to flourish, the AI opportunity will simply move to more permissive jurisdictions. Friction is not your friend.

What’s your message to the Cloud ecosystem?

We call it “information technology,” but for the last 130 years most of the focus has been on the technology part; now we need to focus much more on the information part. If we can’t bring information and technology together—both in policy and in practice—we’ll keep seeing real damage to innovation, education, and new services.

See Brewster Kahle Live at CloudFest

In his keynote, Brewster Kahle will connect the dots between AI, digital heritage, decentralization, and the business realities facing Cloud and infrastructure providers.

If you care about how policy and architecture choices today will shape who innovates on your platforms tomorrow—from climate‑focused models to language‑local LLMs—his session belongs on your must‑see list.

Join us in Rust, Germany, to be part of the conversation on “public AI,” information policy, and building a Cloud ecosystem with many winners, and to hear directly from the Internet Archive founder helping to write that future.

Miles Kendall
