Often it is, but the problem is that platforms bundle their ordinary crawlers with questionable AI scraping, which effectively blackmails websites into feeding AI.
For example, Googlebot, if allowed, won’t just index you for search but will also scrape your content for Google’s AI. I imagine LinkedInBot, given it’s Microsoft, will feed some AI of theirs as well, on top of generating the link previews.
Until regulation steps in to require AI bots to separately ask for crawling permission, or to actually get a proper license for reuse of the contents, this situation isn’t going to improve.
TeddE@lemmy.world 8 months ago
Kinda, but also not really. Any major tech player that has billions to lose will make a show of respecting robots.txt when presenting that information to third parties, lest they be exposed by basic journalism.
However, they also have separate networks in R&D that sweep the net all the time and do not care about such restrictions. It’s theatre.
And they’re still happy to punish people who have the gall to publicly decline their crawlers. Basically, they get to eat their cake and have it too.
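To make the robots.txt part concrete, here's a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL are made up for illustration; `GPTBot` and `Google-Extended` are, as far as I know, the tokens OpenAI and Google document for AI-related opt-outs. It only shows what the file asks for: nothing in it forces a crawler to comply, and it can't separate search from AI use where both run under the same `Googlebot` token.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow search indexing, decline two AI-related tokens.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def allowed(agent: str, url: str = "https://example.com/post/1") -> bool:
    """Return what this robots.txt requests for the given crawler token.

    This is the site's stated preference only; honouring it is voluntary.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "GPTBot", "Google-Extended"):
        print(agent, "allowed" if allowed(agent) else "disallowed")
```

Here `Googlebot` is let in while the two AI tokens are refused, which is about as much per-bot permission as exists today, and it runs entirely on the honour system.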