Comment on AI was a common theme at Gamescom 2025, and while some indie teams say it's invaluable, it remains an ethical nightmare

Skullgrid@lemmy.world ⁨2⁩ ⁨months⁩ ago

Doesn’t have to be an ethical nightmare. Public domain datasets on local hardware using renewable electricity. Who’s mad now, the artist you already can’t afford to pay because you have no fucking money anyway?

very_well_lost@lemmy.world ⁨2⁩ ⁨months⁩ ago

AI would be fine if we just changed everything about it

lol

- onslaught545@lemmy.zip ⁨2⁩ ⁨months⁩ ago
  Not all LLMs are the same. You can absolutely take a neural network model and train it yourself on your own dataset that doesn’t violate copyright.
  
  - Mika@sopuli.xyz ⁨2⁩ ⁨months⁩ ago
    I can almost guarantee that hundred-billion-parameter LLMs are not trained on that; they are trained on the whole web, scraped to the furthest extent.
    
    The only sane and ethical solution going forward is to force to opensource all LLMs. Use the datasets generated by humanity - give back to humanity.
    
    Skullgrid@lemmy.world ⁨2⁩ ⁨months⁩ ago
    
    "The only sane and ethical solution going forward is to force to opensource all LLMs."
    
    Jesus fucking christ. There are SO GODDAMN MANY open source LLMs, even from fucking scumbags like Facebook. I get that there are subtleties to the argument on the ProAI vs AntiAI side, but you guys just screech and scream.
    
    github.com/eugeneyan/open-llms
    
    Mika@sopuli.xyz ⁨2⁩ ⁨months⁩ ago
    Besides, the article is about image gen AI, not LLMs.
    
  - riskable@programming.dev ⁨2⁩ ⁨months⁩ ago
    Training an AI is orthogonal to copyright since the process of training doesn’t involve distribution.
    
    You can train an AI with whatever TF you want without anyone’s consent. That’s perfectly legal fair use. It’s no different than if you copy a song from your PC to your phone.
    
    Copyright really only comes into play when someone uses an AI to distribute a derivative of someone’s copyrighted work. Even then, it’s really the end user that is even capable of doing such a thing by uploading the output of the AI somewhere.
    
HarkMahlberg@kbin.earth ⁨2⁩ ⁨months⁩ ago
Beyond the copyright issues and energy issues, AI does some serious damage to your ability to do actual hard research. And I'm not just talking about "AI brain."

Let's say you're looking to solve a programming problem. If you use a search engine and look up the question or a string of keywords, what do you usually do? You look through each link that comes up and judge books by their covers (to an extent). "Do these look like reputable sites? Have I heard of any of them before?" You scroll, click a bunch of them, and read through them. Now you evaluate their contents. "Have I already tried this info? Oh this answer is from 15 years ago, it might be outdated." Then you pare down your links to a smaller number and try the solution each one provides, one at a time.

Now let's say you use an AI to do the same thing. You pray to the Oracle, and the Oracle responds with a single answer. It's a total soup of its training data. You can't tell where specifically it got any of this info. You just have to take it on faith. You try it, maybe it works, maybe it doesn't. If it doesn't, you have to write a new prayer and try again.

Even running a local model means you can't discern the source material from the output. This isn't Garbage In Garbage Out, but Stew In Soup Out. You can feed an AI a corpus of perfectly useful information, but it will churn everything into a single liquidy mass at the end. And because the process is destructive, you can't un-soup the output. You've robbed yourself of the ability to learn from the input, and put all your faith into the Oracle.

- Skullgrid@lemmy.world ⁨2⁩ ⁨months⁩ ago
  The topic is: using AIs for game dev.
  
  I’m pretty sure that generating placeholder art isn’t going to ruin my ability to research
  
  AIs need to be used TAKING THEIR FLAWS INTO ACCOUNT and for very specific things.
  
  I’m just going to be upfront: AI haters don’t know how this shit actually works, except that by existing, LLMs drain oceans and create more global warming than the entire petrol industry, and AI bros are filling their codebases with junk code that’s going to explode in their faces anywhere from 6 months to 3 years from now.
  
  - lime@feddit.nu ⁨2⁩ ⁨months⁩ ago
    as someone who has studied ml since around 2015, i’m still not convinced. i run local models, i train on CC data, i triple-check everything, and it’s just not that useful. it’s fun, but not productive.
    
  - HarkMahlberg@kbin.earth ⁨2⁩ ⁨months⁩ ago
    Wild to see you call for a "sane take" when you strawman the actual water problem into "draining the oceans."
    
    Local residents with nearby data centers aren't being told to take fewer showers with salt water from the ocean.
    
    Skullgrid@lemmy.world ⁨2⁩ ⁨months⁩ ago
    Is that a problem with the existence of LLMs as a technology, or with shitty corporations working with corrupt governments to starve local people of resources and make a quick buck?
    
    If you are allowing a data center to be built, you need to make sure you have power etc. to build it without negatively impacting the local people. It’s not the fault of an LLM that they fucked this shit up.
    
- Mika@sopuli.xyz ⁨2⁩ ⁨months⁩ ago
  
  "you can’t be critical about the answer"
  
  You actually can, and you should be. And the process is not destructive, since you can always undo in tools like Cursor, or discard in git.
  
  Besides, you can steer a good coding LLM in the right direction. The better you understand what you are doing, the better.
  
  - HarkMahlberg@kbin.earth ⁨2⁩ ⁨months⁩ ago
    You misunderstood. I wasn't saying you can't Ctrl+Z after using the output, but that the process of training an AI on a corpus yields a black box. This process can't be reverse engineered to see how it came up with its answers.
    
    It can't tell you how much of one source it used over another. It can't tell you what its priorities are in evaluating data... not without the risk of hallucinating on you when you ask it.
    
eldebryn@lemmy.world ⁨2⁩ ⁨months⁩ ago
Out of legit curiosity, how many models do you know trained exclusively on public domain data, which are actually useful?

- lime@feddit.nu ⁨2⁩ ⁨months⁩ ago
  anything trained on Common Corpus. which, oddly, is harder to find than the actual training data.
  
  - eldebryn@lemmy.world ⁨2⁩ ⁨months⁩ ago
    I mean this respectfully, but that wasn’t an actual answer.
    
    lime@feddit.nu ⁨2⁩ ⁨months⁩ ago
    no, it sort of reinforced your point.
    