'Meta Torrented over 81 TB of Data Through Anna's Archive, Despite Few Seeders' * TorrentFreak

Submitted 1 year ago by misk@sopuli.xyz to technology@lemmy.world

https://torrentfreak.com/meta-torrented-over-81-tb-of-data-through-annas-archive-despite-few-seeders-250206/

Comments

meowmeowbeanz@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
Oh look, another tech giant treating open knowledge initiatives like their personal data buffet. Let me translate this corporate nonsense for you:

Meta: “We need training data for our AI!” Also Meta: Let’s leech 81.7TB from a community project without contributing anything back.

The absolute audacity of downloading terabytes through torrents while their employees were internally admitting it was “legally problematic”. And the best part? They couldn’t even be bothered to seed properly - just grab and go, classic corporate behavior.

Remember when companies actually contributed to open source instead of just parasitically consuming it? But no, they’d rather burden volunteer-run projects with massive bandwidth costs while their lawyers probably bill more per hour than these projects’ entire monthly budget.

Pro tip Meta: If you’re going to pilfer knowledge from the commons, at least seed back properly. Your “move fast and break things” motto isn’t supposed to apply to community archives.

- Anti_Face_Weapon@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Not seeding is crazy …
  
- interdimensionalmeme@lemmy.ml ⁨1⁩ ⁨year⁩ ago
  My seedbox is locked and loaded, please point me to the torrent in need. Archive Team, assemble!
  
  - underwire212@lemm.ee ⁨1⁩ ⁨year⁩ ago
    Yes please support annas-archive!! It is a wonderful project. I can essentially get an epub file for any book (including banned books) I want. They have so much more than that too.
    
  - ILikeBoobies@lemmy.ca ⁨1⁩ ⁨year⁩ ago
    annas-archive.org
    
    This is the website listed in the article
    
- General_Effort@lemmy.world ⁨1⁩ ⁨year⁩ ago
  When you’re shilling for copyright, at least pick a lane. Are they bad for “pirating” or bad for not supporting “piracy”?
  
  I guess it doesn’t matter as long as the owners collect their rent.
  
  - ulterno@programming.dev ⁨1⁩ ⁨year⁩ ago
    They are pirating, while also DOSing the providers.
    
ad_on_is@lemm.ee ⁨1⁩ ⁨year⁩ ago
If buying ain’t owning, then downloading…

oh wait, that’s our slogan

bungalowtill@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
The Pirates of the Crown

drascus@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
Just gotta love these big tech companies and their bullshit double standards.

njordomir@lemmy.world ⁨1⁩ ⁨year⁩ ago
If someone were to acquire a few hundred gigs of books and feed them to something like paperless-ngx, would it work as a sort of Google for books? Are there any software projects better suited to doing this that understand synonyms and perhaps some context? I guess AI search, but aimed at the intermediate user.

Google is so bad lately. Basically every result is official sponsored corporate biased BS. It would be nice to be able to instantly query a bunch of ebooks.

- rumba@lemmy.zip ⁨1⁩ ⁨year⁩ ago
  GPT, Meta, Deepseek and Google have probably all been trained on the data.
  
  The problem is that training on the data and actually training for knowledge of the data are VERY different things.
  
  www.youtube.com/watch?v=_GkHZQYFOGM
  
- werefreeatlast@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Yes. This exactly.
  
SpikesOtherDog@ani.social ⁨1⁩ ⁨year⁩ ago
phys.org/…/2010-11-million-dollar-verdict-music-p…

In all fairness, Meta should be assessed a fee of 250k for EACH pirated work.

This would amount to forfeiting all assets to doge.

- ulterno@programming.dev ⁨1⁩ ⁨year⁩ ago
  And I’d guess all that money would then go to military funding, with Anna’s Archive again getting nothing out of it?
  
  - SpikesOtherDog@ani.social ⁨1⁩ ⁨year⁩ ago
    It would go to… Uh…
    
    HEY SOMEONE PUT A DEAD CAT ON THE TABLE!
    
- Grunt4019@lemm.ee ⁨1⁩ ⁨year⁩ ago
  Assuming 2.6 MB per book.
  
  81 TB would be 32,667,175 books.
  
  At $250k per book that would come out to:
  
  $8.17 trillion.
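
The estimate above can be reproduced with a short sketch. It assumes binary units (1 TB = 1024 × 1024 MB), which is the assumption that yields the 32,667,175 figure; the 2.6 MB average book size and the $250k fee are taken from the comments, and the function names are purely illustrative.

```python
# Rough reproduction of the estimate above.
# Assumption: binary units, i.e. 1 TB = 1024 * 1024 MB.
MB_PER_TB = 1024 * 1024

AVG_BOOK_MB = 2.6            # assumed average size per book (from the comment)
TOTAL_TB = 81                # total downloaded, rounded down as in the comment
FEE_PER_BOOK_USD = 250_000   # hypothetical fee per pirated work


def estimate_books(total_tb: float, avg_book_mb: float) -> int:
    """Estimate how many books fit in total_tb at avg_book_mb each."""
    return round(total_tb * MB_PER_TB / avg_book_mb)


def estimate_fee(books: int, fee_per_book: int) -> int:
    """Total fee in USD if every book is charged fee_per_book."""
    return books * fee_per_book


books = estimate_books(TOTAL_TB, AVG_BOOK_MB)
total = estimate_fee(books, FEE_PER_BOOK_USD)

print(f"{books:,} books")                 # 32,667,175 books
print(f"${total / 1e12:.2f} trillion")    # $8.17 trillion
```

With decimal units (1 TB = 1,000,000 MB) the same inputs give roughly 31.2 million books and about $7.8 trillion, so the order of magnitude does not depend on the unit convention.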
  
  - SpikesOtherDog@ani.social ⁨1⁩ ⁨year⁩ ago
    Image
    
- nyan@lemmy.cafe ⁨1⁩ ⁨year⁩ ago
  They might end up having to pay more money than exists on the planet at that rate.
  
  - ryan_@lemmy.world ⁨1⁩ ⁨year⁩ ago
    I’m a reasonable man so I’ll allow it.
    
  - SpikesOtherDog@ani.social ⁨1⁩ ⁨year⁩ ago
    Good
    
jaybone@lemmy.world ⁨1⁩ ⁨year⁩ ago
What is Anna’s Archive?

- misk@sopuli.xyz ⁨1⁩ ⁨year⁩ ago
  It’s a popular search engine that works with shadow libraries like Sci-Hub or Library Genesis. Shadow libraries are hosts to copies of works of literature and science. Their legal status is murky at best but it’s incredibly impractical to persecute those accessing them.
  
  - MonkderVierte@lemmy.ml ⁨1⁩ ⁨year⁩ ago
    
    it’s incredibly impractical to persecute those accessing them.
    
    Always was. If you’re serious, persecute those hosting it.
    
  - jaybone@lemmy.world ⁨1⁩ ⁨year⁩ ago
    So it’s like thepiratebay or 1337x.to but for books?
    
    Also I think you mean prosecuting, not persecuting.
    
Grimy@lemmy.world ⁨1⁩ ⁨year⁩ ago
Meta has open sourced every single one of their LLMs. They essentially gave birth to the whole open LLM scene.

If they start losing all these lawsuits, the whole scene dies and all those nifty models and their fine-tunes get removed from Hugging Face, to be repackaged and sold to us with a subscription fee. All the other domestic open source players will close down.

The copyright crew aren’t the good guys here, even if it’s spearheaded by Sarah Silverman and Meta has traditionally played the part of the villain.

- LodeMike@lemmy.today ⁨1⁩ ⁨year⁩ ago
  Where is the source content then
  
  - Knock_Knock_Lemmy_In@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Anna’s Archive. Keep up. Pffff.
    
- antonim@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
  If the existence of open source LLMs hinges on the benevolence of one of the few most cancerous tech companies in the world, maybe they’re not really worth it?
  
  This isn’t about “heroes” and “villains”. Facebook has been and has stayed the “villain”, they’ve done something colossally illegal that any mere mortal would be sued to death for (by another “villainous” instance, the media system that has made piracy a necessity in the first place), and they’re hoping to get away with it simply on technicalities and by having more money for better lawyers. Rules are rules; if you don’t like them, maybe Facebook should try to change them (and not just for themselves, but for the rest of us too)?
  
  - Grimy@lemmy.world ⁨1⁩ ⁨year⁩ ago
    The existence hinges on the rewriting and strengthening of copyright laws by data brokers and other cancerous tech companies. It’s not Meta vs us, but Meta and us vs Google and OpenAI.
    
    They are being sued for copyright infringement when it’s clearly highly transformative. The rules are fine as is, Meta isn’t the one trying to change them. You seem to imply I should go against my own interests and support frivolous lawsuits that will negatively impact me just because Meta is a boogeyman.
    
- Telodzrum@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Nope. Get fucked
  
- misk@sopuli.xyz ⁨1⁩ ⁨year⁩ ago
  Meta stole from everyone, including those that struggle to make ends meet, so it doesn’t matter that they gave you back some of it. Any moral qualms should evaporate when you consider that they did it to create shareholder value and the rest is philanthropy (aka pretend tax). As a socialist I believe that man is owed for his work and you can’t take from him even though technology makes it so easy.
  
  - General_Effort@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Calling property labor doesn’t make you a socialist.
    
  - LainTrain@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
    As a socialist I believe intellectual property is a falsehood and technological advancement should be for the public good. Open source LLMs are for the public good.
    
    Given the options between having open source LLMs and the US Govt banning non-corpo non-proprietary LLMs and giving a free pass to people like Musk and Altman and Zucc to monopolize, I happily pick the former.
    
  - Grimy@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Don’t give me that slop. No one except the biggest names is getting a dime out of it once OpenAI buys up all the data and kills off their competition. It’s also highly transformative, which is perfectly legal.
    
    Copyright laws have been turned into a joke, only protecting big money and their interests.
    
daggermoon@lemmy.world ⁨1⁩ ⁨year⁩ ago
Damn leeches

LEVI@feddit.org ⁨1⁩ ⁨year⁩ ago
[deleted]
- mox@lemmy.sdf.org ⁨1⁩ ⁨year⁩ ago
  Facebook: I’ll just ~~torrent what I need~~ burden your underfunded project and volunteers with over 81 TB of bandwidth costs without contributing anything in return, see yaa
  
  FTFY
  
  - C126@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
    Yeah the least they could do is seed forever.
    
shittydwarf@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
Image

- personalthought381@lemm.ee ⁨1⁩ ⁨year⁩ ago
  Rules for thee, not for me
  
akilou@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
But did they keep a good ratio though?

- rottingleaf@lemmy.world ⁨1⁩ ⁨year⁩ ago
  In copyright protection terms the ratio shouldn’t matter. They should pay for all the lost profits from pirating everything they’ve downloaded. Every time someone pirated it should be counted. And every time someone uses the AI trained on the data.
  
  They can become the corporate Jesus of the interwebs, having paid for our sins.
  
  - grue@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Technically, copyright infringement is committed by the entity making and sending the copy, not the entity receiving it. Leeching could indeed remove liability.
    
    I’m not sure if the courts have cared about that nuance when prosecuting the ‘small fish,’ but I bet they would in this ‘big fish’ case.
    
- empireOfLove2@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
  1000% guarantee those mf’s had their upload choked to 20kbps
  
  - Tregetour@lemdro.id ⁨1⁩ ⁨year⁩ ago
    20 was the lead engineer ‘mishearing’ Zuck after he said 2.
    
  - guaraguaito@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
    Nah they used a leeching client
    
- SnotFlickerman@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago
  Asking the real questions.
  
SnotFlickerman@lemmy.blahaj.zone ⁨1⁩ ⁨year⁩ ago

“Meta downloaded millions of pirated books from LibGen through the bit torrent protocol using a platform called LibTorrent. Internally, Meta acknowledged that using this protocol was legally problematic,” the third amended complaint noted.

Just want to make clear that Libtorrent is just the torrent application they were using, while the Libgen torrents are easily accessible on the libgen site, not through a separate “platform” called Libtorrent.

I wish people like us could help with these complaints, because then they might actually get the details more accurate to reality.

libgen.is/repository_torrent/

www.libtorrent.org

The amended complaint makes it sound like Libtorrent is a private tracker website when it’s just the application they were using on the publicly available torrents.

- corsicanguppy@lemmy.ca ⁨1⁩ ⁨year⁩ ago
  People are putting an S on the end of words like ‘traffic’ and ‘email’. They will never understand the semantics of that correction.
  
  - paraphrand@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Meta Horizons
    
- db2@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Totes yeet, yo.
  
Telorand@reddthat.com ⁨1⁩ ⁨year⁩ ago
Do it, Judge. Protect the wealthy and say it’s not piracy. Do it.

- roofuskit@lemmy.world ⁨1⁩ ⁨year⁩ ago
  He already referred them to the Justice Department. This is a civil case, so he cannot sentence them criminally.
  
- Damage@slrpnk.net ⁨1⁩ ⁨year⁩ ago
  They’ll be fined 100k
  
  - Telorand@reddthat.com ⁨1⁩ ⁨year⁩ ago
    And they’ll ham up how punished and sorry they are, and how thankful they are for the judge handing down “fair and impartial” justice.
    
- Lexam@lemmy.world ⁨1⁩ ⁨year⁩ ago
  It’s not piracy. For corporations. For you and me, believe it or not, straight to jail!
  
  - curbstickle@lemmy.dbzer0.com ⁨1⁩ ⁨year⁩ ago
    Just make an LLC, now it’s legal again.
    
- abobla@lemm.ee ⁨1⁩ ⁨year⁩ ago
  Please! Think of the shareholders, we must protect them!
  