Cloudflare, along with a majority of the world’s leading publishers and AI companies, is changing the default to block AI crawlers unless they pay creators for content.
Could this conflict with the fair use of AI training rules?
Submitted 8 months ago by wosat@lemmy.world to technology@lemmy.world
https://blog.cloudflare.com/content-independence-day-no-ai-crawl-without-compensation/
No, the fact it’s technically legal doesn’t mean you have to make it easy for them.
Yup. “Legal” just means the government won’t punish you.
This has been tried before many times. The problem is that this exchange can never satisfy all of the parties.
Would a site host take $0.01 for a page? If it's Walmart.com, they'd happily lose even $0.10 or more per page if it means competitors can't analyze their products or cause other perceived IP damage.
For example, let's assume they do the business math and conclude that anything below $5 is no deal. What scraper would pay $5 for a single product page scrape? Maybe OpenAI can pay that, but is this what we want, where public scraping is only accessible to billionaires? What if you're just a user who wants to track Walmart prices to build your own budgeting script? Are you paying $5 on every request?
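To put rough numbers on that, here is a quick back-of-the-envelope sketch. The three per-request prices are the ones floated above; the 20 tracked products, one check per day, and 30-day month are made-up assumptions for a hobby budgeting script, not anything Cloudflare or Walmart has published.

```python
def monthly_cost(price_per_request: float, products: int,
                 checks_per_day: int, days: int = 30) -> float:
    """Total spend if every page fetch is billed at a flat per-request price."""
    return price_per_request * products * checks_per_day * days


# Hypothetical hobby tracker: 20 product pages, each checked once a day.
for price in (0.01, 0.10, 5.00):
    cost = monthly_cost(price, products=20, checks_per_day=1)
    print(f"${price:.2f}/request -> ${cost:,.2f}/month")
```

Under those assumptions, the cheapest tier is pocket change and the $5 tier runs into the thousands per month, which is the gap the comment is pointing at.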
Cloudflare aren’t perfect, but I still use them because for a free account the benefits outweigh negatives like this… However, to say the world’s leading publishers and AI companies are on board is simply not true…
Awesome, clear, you know how to do it, you can do it!
Kind of reminds me of that time CF bait and switched that gambling website. Everyone was wondering who they find more predatory and distasteful.
iirc it came out that the gambling site was intentionally cycling through Cloudflare’s IP range to avoid IP blocking, causing legitimate websites to be flagged as unsafe. Cloudflare said they’d still host them, they just needed their own IP and not one of Cloudflare’s. Horribly communicated to the customer though.
That makes more sense now.
A nice explanation of what’s wrong with web-based AI. I hope other content providers follow suit.
ter_maxima@jlai.lu 8 months ago
If they could stop blocking real users at the same time they block AI crawlers, that would be nice.
AceFuzzLord@lemmy.zip 8 months ago
I absolutely hate cloudflare because they always block me whenever I visit a site while connected to any VPN server on Proton, regardless of country. Even when I’m not connected to a VPN I sometimes have trouble with them deciding my connection is suspicious despite not having anything that should trigger it.
fmstrat@lemmy.nowsci.com 8 months ago
Make a dummy Google Account, and log into it when on the VPN. Having an ad history avoids the blocks usually.
Also, if it’s image captchas that never end, switch to the accessibility option for the captcha.
Imgonnatrythis@sh.itjust.works 8 months ago
You just need to pay!