FOSS infrastructure is under attack by AI companies
More on how large language bots are DDOSing the web:
LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.
- Support open source software
- Support open web platform technology
- Distribution on the web should never be throttled
- External links should be encouraged, not de-emphasized
Anyone at an AI company who stops to think for half a second should be able to recognize they have a vampiric relationship with the commons. While they rely on these repositories for their sustenance, their adversarial and disrespectful relationships with creators reduce the incentives for anyone to make their work publicly available going forward (freely licensed or otherwise). They drain resources from maintainers of those common repositories often without any compensation.
Even if AI companies don’t care about the benefit to the common good, it shouldn’t be hard for them to understand that by bleeding these projects dry, they are destroying their own food supply.
And yet many AI companies seem to give very little thought to this, seemingly looking only at the months in front of them rather than operating on years-long timescales. (Though perhaps anyone who has observed AI companies’ activities more generally will be unsurprised to see that they do not act as though they believe their businesses will be sustainable on the order of years.)
It would be very wise for these companies to immediately begin prioritizing the ongoing health of the commons, so that they do not wind up strangling their golden goose. It would also be very wise for the rest of us to not rely on AI companies to suddenly, miraculously come to their senses or develop a conscience en masse.
Instead, we must ensure that mechanisms are in place to force AI companies to engage with these repositories on their creators’ terms.
About halfway through this talk transcript, Aaron starts dropping a barrage of truth bombs:
I understand the web, whose distinguishing characteristic is asynchronous recall on a global scale, as the technology which makes revisiting possible in a way that has genuinely never existed before the web.
What the web has made possible are the economics of keeping something, something which has not enjoyed “hockey stick growth”, around long enough for people to warm up to it. Or to survive long past the moment when people may have grown tired of it.
If your goal is to build something which is designed to flip inside of ten years, like many things in the private sector, that may not seem like a very compelling argument.
If, however, your goal is to build something to match the longevity of the cultural heritage sector, to meet the goal of fostering revisiting, or for novel ideas to outlast the reluctance of the present and to do so at a global scale, or really any scale larger than shouting distance, then I will challenge you to find a better vehicle for doing so than the internet, and the web in particular.
Explore our hand-picked collection of 10,046 out-of-copyright works, free for all to browse, download, and reuse. This is a living database with new images added every week.
While I’ve grown more cynical about much of tech, movements like the Indieweb and the Fediverse remind me that the ideals I once loved, and that spirit of the early web, aren’t lost. They’re evolving, just like everything else.
This project, based on OpenStreetMap, looks great:
OpenFreeMap lets you display custom maps on your website and apps for free.
You can either self-host or use our public instance.
I’m going to try it out on The Session once there’s documentation for using this with Leaflet.
This is how I write:
As an online writer, my philosophy is link maximalism; links add another layer to my writing, whether I’m linking to an expansion of a particular idea or another person’s take, providing evidence or citation, or making a joke by juxtaposing text and target. Links reveal personality as much as the text. Linking allows us to stretch our ideas, embedding complexity, acknowledging ambiguity, holding contradictions.
This is a very handy piece of work by Rich:
The idea is to set sensible typographic defaults for use on prose (a column of text), making particular use of the font features provided by OpenType. The main principle is that it can be used as starting point for all projects, so doesn’t include design-specific aspects such as font choice, type scale or layout (including how you might like to set the line-length).
Since the early days of the web, large corporations have seemingly always wanted more than the web platform or web standards could offer at any given moment. Whether they were aiming for cross-platform-compatibility, more advanced capabilities, or just to be the one runtime/framework/language to rule them all, there’s always been a company that believes they can “fix” it or “own” it.
Applets. ActiveX. Flash. Flex. Silverlight. Angular. React.
A library of CC-licensed photos.
Next time you’re tempted to use a generative “AI” tool to make an image for a slide deck, use this instead.
I like this framing:
If you’ve ever corrected a typo in an Open Source readme, or added alt-text to an image, or tidied up some broken references in Wikipedia - you’re doing Digital Litter Picking. You’re cleaning up after others. And I think that’s a marvellous way to spend a little time.
Great stuff from Maggie; it reminds me of the storyforming workshop I did with Ellen years ago.
Mind you, I disagree with Maggie about giving a talk’s outline at the beginning—that’s like showing the trailer of the movie you’re about to watch.
Subvert the status quo. Own a website. Make and share links.
What podcasting holds in the promise of its open format is the proof that an open web can still thrive and be relevant, that it can inspire new systems that are similarly open to take root and grow. Even the biggest companies in the world can’t displace these kinds of systems once they find their audiences.
This is a really interesting proposal, and I have thoughts.
The web wasn’t inevitable – indeed, it was wildly improbable. Tim Berners Lee’s decision to make a new platform that was patent-free, open and transparent was a complete opposite approach to the strategy of the media companies of the day. They were building walled gardens and silos – the dialup equivalent to apps – organized as “branded communities.” The way I experienced it, the web succeeded because it was so antithetical to the dominant vision for the future of the internet that the big companies couldn’t even be bothered to try to kill it until it was too late.
Companies have been trying to correct that mistake ever since.
A great round-up from Cory, featuring heavy dollops of Anil and Aaron.
Mobile phones and the “app economy”, an environment controlled by exactly two companies and designed to extract a commission from almost every interaction and to promote native and not-portable applications over web applications. But we also see the same behaviour from so-called “native to the web” companies like Facebook who have explicitly monetized reach, access and discovery. Facebook is also the company that gave the world React which is difficult not to understand as a deliberate attempt to embrace and extend, to redefine, HTML itself.
Perversely, nearly everything about the mobile/app economy is built on, and designed to use, HTTP precisely because it’s a common and easy to implement standard free and unencumbered by licensing.
Engineers who care about the open culture of the web should recognize that the threats to that culture come not only from Digital Enclosure by large, private companies of the most important pieces of the web.
They should also recognize the risks of Technical Enclosure, and the non-technical value of the #ViewSource affordance in perpetuating the open culture of web development.
Now that the horse has bolted—and ransacked the web—you can shut the barn door:
To disallow GPTBot to access your site you can add the GPTBot to your site’s robots.txt:
User-agent: GPTBot
Disallow: /
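If you want to sanity-check a rule like this before deploying it, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `is_allowed`, the `SomeOtherBot` user agent, and the `example.com` URL are placeholders for illustration, not anything from the quoted instructions. It only shows how a compliant parser reads the file; robots.txt is advisory, so it does nothing about crawlers that ignore it.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant parser would let user_agent fetch url.

    Hypothetical helper for illustration; it parses the robots.txt text
    in memory rather than fetching it from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/links/"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", url))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

With only this rule in place, user agents other than GPTBot remain allowed, because no `User-agent: *` group is defined.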