
Judge backs AI firm over use of copyrighted books

⁨121⁩ ⁨likes⁩

Submitted ⁨⁨7⁩ ⁨hours⁩ ago⁩ by ⁨Davriellelouna@lemmy.world⁩ to ⁨technology@lemmy.world⁩

https://www.bbc.com/news/articles/c77vr00enzyo

source

Comments

  • BlameTheAntifa@lemmy.world ⁨21⁩ ⁨minutes⁩ ago

    Anakin: “Judge backs AI firm over use of copyrighted books” Padme: “But they’ll be held accountable when they reproduce parts of those works or compete with the work they were trained on, right?” Anakin: “…” Padme: “Right?”

    source
  • gedaliyah@lemmy.world ⁨7⁩ ⁨hours⁩ ago

    I’m not pirating. I’m building my model.

    source
    • QuadratureSurfer@lemmy.world ⁨5⁩ ⁨hours⁩ ago

      To anyone reading this comment without reading through the article: this ruling doesn’t mean that it’s okay to pirate for building a model. Anthropic will still need to go through trial for that:

      But he rejected Anthropic’s request to dismiss the case, ruling the firm would have to stand trial over its use of pirated copies to build its library of material.

      source
      • Artisian@lemmy.world ⁨2⁩ ⁨hours⁩ ago

        I also read through the judgement, and I think it’s better for Anthropic than you describe. He distinguishes three issues:

        A) Use any written material they get their hands on to train the model (and the resulting model doesn’t just reproduce the works).

        B) Buy a single copy of a print book, scan it, and retain the digital copy for a company library (for all sorts of future purposes).

        C) Pirate a book and retain that copy for a company library (for all sorts of future purposes).

        A and B were fair use by summary judgement, meaning this judge thinks it’s clear-cut in Anthropic’s favor. C will go to trial.

        source
  • the_q@lemmy.zip ⁨6⁩ ⁨hours⁩ ago

    An 80 year old judge on their best day couldn’t be trusted to make an informed decision. This guy was either bought or confused into his decision. Old people gotta go.

    source
  • Grimy@lemmy.world ⁨7⁩ ⁨hours⁩ ago

    80% of the book market is owned by 5 publishing houses.

    They want to create a monopoly around AI and kill open source. The copyright industry is not our friend. This is a win, not a loss.

    source
    • OmegaMouse@pawb.social ⁨7⁩ ⁨hours⁩ ago

      What, how is this a win? Three authors lost a lawsuit to an AI firm using their works.

      source
      • ShittyBeatlesFCPres@lemmy.world ⁨46⁩ ⁨minutes⁩ ago

        It would harm the A.I. industry if Anthropic loses the next part of the trial on whether they pirated books. From what I’ve read, Anthropic and Meta are suspected of getting a lot off torrent sites and the like.

        It’s possible they all did some piracy in their mad dash to find training material but Amazon and Google have bookstores and Google even has a book text search engine, Google Scholar, and probably everything else already in its data centers. So, not sure why they’d have to resort to piracy.

        source
      • Grimy@lemmy.world ⁨3⁩ ⁨hours⁩ ago

        The lawsuit would not have benefited their fellow authors, only their publishing houses and the big AI companies.

        source
    • sentient_loom@sh.itjust.works ⁨7⁩ ⁨hours⁩ ago

      How exactly does this benefit “us” ?

      source
      • gaylord_fartmaster@lemmy.world ⁨5⁩ ⁨hours⁩ ago

        Because books are used to train both commercial and open source language models?

        source
    • hendrik@palaver.p3x.de ⁨7⁩ ⁨hours⁩ ago

      Keep in mind this isn't about open-weight vs other AI models at all. This is about how training data can be collected and used.

      source
      • bob_omb_battlefield@sh.itjust.works ⁨6⁩ ⁨hours⁩ ago

        If you aren’t allowed to freely use data for training without a license, then the fear is that only large companies will own enough works or be able to afford licenses to train models.

        source
      • Grimy@lemmy.world ⁨3⁩ ⁨hours⁩ ago

        Because of the vast amount of data needed, there will be no competitive viable open source solution if half the data is kept in a walled garden.

        This is about open weights vs closed weights.

        source
    • SonOfAntenora@lemmy.world ⁨6⁩ ⁨hours⁩ ago

      Cool then, try to do some torrenting out there and don’t hide it. Tell us how it goes.

      source
  • MyOpinion@lemmy.today ⁨3⁩ ⁨hours⁩ ago

    I hate AI with a fire that keeps me warm at night. That is all.

    source
  • AbouBenAdhem@lemmy.world ⁨6⁩ ⁨hours⁩ ago

    IMO the focus should have always been on the potential for AI to produce copyright-violating output, not on the method of training.

    source
    • Artisian@lemmy.world ⁨2⁩ ⁨hours⁩ ago

      Plaintiffs made that argument and the judge shot it down pretty hard: that kind of competition isn’t what copyright protects against. Would love to hear your thoughts on the ruling (it’s linked by Reuters).

      source
    • SculptusPoe@lemmy.world ⁨6⁩ ⁨hours⁩ ago

      If you try to sell “The New Adventures of Doctor Strange, Steven Strange and Magic Man,” existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means, it should be fine.

      source
      • devfuuu@lemmy.world [bot] ⁨6⁩ ⁨hours⁩ ago

        That “freely” there really does a lot of hard work.

        source
  • hendrik@palaver.p3x.de ⁨7⁩ ⁨hours⁩ ago

    Previous discussion: https://programming.dev/post/32802895

    source