Comment on Pinterest changes user terms so it can train AI on user data and photos, regardless of when they were posted

My worry is that these social media alternatives might get scraped by these AI companies as well.

Sure, a company handing it over is much easier (e.g. Reddit). But with the decentralized nature, everyone needs to protect their instances themselves, and I’m not sure how well everyone will be able to do that.

Definitely much more difficult, so it’s a step in the right direction.

Womble@lemmy.world ⁨1⁩ ⁨year⁩ ago
Everything on the Fediverse is almost certainly scraped, and will be repeatedly. You can’t “protect” content that is freely available on a public website.

- ayyy@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
  But uh, I wrote an entire license in every one of my comments so it would be impossible for them to scrape! /s
  
- kane@femboys.biz ⁨1⁩ ⁨year⁩ ago
  I do not entirely agree.
  
  While what you said might be true for content that we post, things like view history and tracking data are much more difficult to get at. That metadata does help with tagging content.
  
  - Womble@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Yeah, fair enough, I was referring to posts and comments, not other metadata, because that isn’t publicly available just as a GET request (as far as I’m aware).
    
Emperor@feddit.uk ⁨1⁩ ⁨year⁩ ago
There are lists of bots that instance admins can block for a range of reasons.

Anything online can be scraped but big firms might run into regulatory trouble if they are caught randomly scraping sites without consent. At the moment, the big social media apps have a tonne of content to train on in tightly controlled conditions, so they don’t really need to go into the wild, yet. However, we need to be vigilant, block them and make a fuss if we catch them at it.
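
As a rough illustration of what blocking from such a list can look like, here is a minimal sketch in Python. It is not how Lemmy or any particular instance does it: the `BLOCKED_AGENT_TOKENS` list and the `is_blocked` helper are made up for this example, and real community lists are much longer. It simply checks the self-reported User-Agent header against known crawler names.

```python
# Hypothetical, deliberately short blocklist of crawler User-Agent tokens.
# Real blocklists are community-maintained and far longer.
BLOCKED_AGENT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot")


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocklisted token.

    Matching is a case-insensitive substring check, since crawlers usually
    embed their name inside a longer header string.
    """
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_AGENT_TOKENS)


if __name__ == "__main__":
    for header in (
        "Mozilla/5.0 (compatible; GPTBot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
    ):
        print(header, "->", "block" if is_blocked(header) else "allow")
```

The header is self-reported, so a check like this only stops crawlers that identify themselves honestly.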

- CosmicTurtle0@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
  What’s to stop a company from standing up their own instance?
  
  If they only create an admin account and then federate to every instance, now they have everyone’s content.
  
  I’m suddenly realizing the anti-AI blurbs people add to their comments now make sense.
  
  - JohnEdwa@sopuli.xyz ⁨1⁩ ⁨year⁩ ago
    IANAL, but federation by necessity copies your posts and information to every instance there is, and for that to be possible it all needs to be under a licence that allows it. So those blurbs are almost certainly legally meaningless. The only thing I can think of is claiming a non-commercial use violation, but that could put every instance that runs on donations under fire as well.
    
- kane@femboys.biz ⁨1⁩ ⁨year⁩ ago
  That’s a very good shout, I wasn’t aware there were pre-existing lists. That’s a great step, and definitely one I will look to add to my own instance.
  
  - Emperor@feddit.uk ⁨1⁩ ⁨year⁩ ago
    We just added it as the old frontend was getting hammered by bots - it helped a lot.
    