Is Generative AI Destroying the Open Web?

Subscription walls prevent AI scraping, but at what cost? I’m rethinking my whole publishing strategy.


It’s been six months since I moved my writing behind a subscription wall. At the time, it felt like the only way to prevent AI companies from vacuuming my words into their large language models. Today, however, I’m having second thoughts about this strategy and its repercussions for the open web.

Hard as it is to believe, I’ve been writing online for over two decades. I launched my website in the early days of blogging. Back then, there was no paywall or fear of AI stealing content. The web was open, independent, and available to everyone.

I want my writing to be available to everyone, which is why everything I publish behind Medium’s paywall is also available for free on my website. However, with the rise of generative AI, virtually everything ever published online has been stolen to train large language models. To try to stop the bots from scraping my writing, I added an email subscription requirement back in February.

In addition to requiring email addresses, I followed Dark Visitors’ recommendations to try to block AI bots via robots.txt. I emphasize try because robots.txt is purely voluntary; there is no guarantee that AI companies follow the rules. Anthropic, for example, is blatantly ignoring anti-bot policies and running up excessive server bills for site owners.
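
For anyone curious what that looks like in practice, here’s a minimal sketch of the kind of robots.txt rules Dark Visitors recommends. The user agents below are real AI crawlers, but the list changes constantly, so treat this as an illustration rather than a complete blocklist:

```
# Partial example: disallow common AI training crawlers.
# See Dark Visitors for a maintained, up-to-date list.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Of course, rules like these only work if the crawler chooses to honor them, which is exactly the problem.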

Cloudflare, which manages the DNS for my site, recently added a one-click block for AI bots and web crawlers to their service. I’ve enabled it, for whatever it’s worth.

I’m not alone in trying to block these crawlers. 404 Media reports that “8 percent of the ‘most actively maintained, critical sources,’ meaning websites that are regularly updated and are not dormant, have restricted AI scraping in the last year.” That’s an extraordinary shift in a single year, and it introduces a new problem.

The same report from 404 Media explains that researchers and academics are getting caught up in bot blocking. One of the academics interviewed feared that “search engines and web archives” could end up blocked as a result. This loss of access would be an absolute shame for the open web.

These AI companies have backed creators like myself into a corner. We either keep our creations available for their bots or restrict access to everyone. I can’t find a middle ground. AI companies have forced the open web into mutually assured destruction. It’s not right.

Needless to say, I’m conflicted. Coexistence with AI may be the only way forward, but I can’t see how to achieve it while keeping the web open. I want to remove the subscription requirement for my website, but I don’t trust the anti-AI protections that would remain if I did.

This isn’t one of those posts where I have good advice for you to implement. In fact, it’s the exact opposite. I have no solutions and am hoping that maybe you do.

How do we keep the web open while keeping our work out of large language models? Should we even worry about it? Hit reply and let me know what you think.