Comment

Comment on AIs can’t stop recommending nuclear strikes in war game simulations

there is no ai, only largelanguagemodel that has been trained on data. The data it has been trained suggests this is the best idea. llm cant evaluate the data its trained on so anything you put in will be equally valid. I give it that its really impressive how they can output the training results in such coherent way that can be kind of “conversed” with, but there is no will or intelligence behind it.

This is also why corporations insisting on putting them everywhere is quite horrible security issue -> you can jailbreak any llm and tell them to do anything. So this has enabled all kinds of stupid vulnerabilities that exploit this. Now you can even send someone malicious google calendar invites that makes gemini do bad shit to your systems its connected to.

source

Sort:hotnew top

Grail@multiverse.soulism.net ⁨1⁩ ⁨day⁩ ago
So you’re saying that because the AI has been exposed to training data in the past, it’s incapable of making choices. Interesting argument. Pretty easy to reducto ad absurdum, though.

source
- reksas@sopuli.xyz ⁨15⁩ ⁨hours⁩ ago
  no, its incapable of making choices because there is nothing there to make the choices. Its just fancy way of interacting with the data it has been trained with. Though i suppose if there was a way to let llm function “live” instead of only by responding to queries, it could be possible to at least test if it could act on its own, but i dont think it can -> we would know by now because it would be step closer to agi, which is basically the holy grail for these kind of things. And equally possible to get, i think.
  
  You can literally make the llm say and do anything with right kind of query, this is also why its impossible to make them safe. Even though you can’t directly ask for something forbidden, with some creativity you can bybass the initializations the corpos have put in. Its not possible for them to account for every single thing and if they try they will run out of token space.
  
  The whole “ai” term is just corporations perpetuating a lie because it sounds impressive and thus makes people want to give them more money for their bullshit.
  
  source
  - Grail@multiverse.soulism.net ⁨12⁩ ⁨hours⁩ ago
    No, LLMs are not just an interface for accessing training data. If that were true, then their references would actually work. The fact that LLMs can hallucinate and make stuff up proves that they are not just accessing the training data. The ANN is generating new (often incorrect) information.
    
    source
    reksas@sopuli.xyz ⁨5⁩ ⁨hours⁩ ago
    if the hallucinations are result of something actually happening in the background, that would be quite interesting. It would also be very bad for rest of us since it might mean the billionaires who own the damn things would be in position to get even worse deathgrip on our world. If they ever manage to create agi, the worst thing that could happen isnt that it breaks free and enslaves humanity but that it doesnt and it helps the billionaires enslave us further and make sure we cant ever even think about fighting back.
    
    But i think the hallucinations are based on incorrect information in the training data, they did train it from stuff from reddit too. Any and everything will be considered true, but if 99% of the data says one thing and 1% says another, then i think it will reference that 99% more often but it cant know that the 1% is wrong, can even real humans know it for certain? And since it cant evaluate anything, there might be situations where that 1% of data might be more relevant due to some nebulous mechanism on how it processes data.
    
    llms have been made to act extremely helpful and subservient, so if they actually could “think” wouldnt they factcheck themselves first before saying something? I have sometimes just asked “are you sure?” and the llm starts “profusely apologizing” for providing incorrect information or otherwise correcting itself.
    
    Though i wonder how it would answer if it truely had no initialization querys, as they have same hidden instructions on every query you make on how to “behave” and what not to say.
    
    source
    -> View More Comments