Comment

Comment on Rabbit R1 AI box revealed to just be an Android app

The best convincing answer is the correct one. The correlation of AI answers with correct answers is fairly high. Numerous test show that. The models also significantly improved (especially paid versions) since introduction just 2 years ago.
Of course it does not mean that it could be trusted as much as Wikipedia, but it is probably better source than Facebook.

source

Sort:hotnew top

De_Narm@lemmy.world ⁨8⁩ ⁨months⁩ ago
“Fairly high” is still useless (and doesn’t actually quantify anything, depending on context both 1% and 99% could be ‘fairly high’). As long as these models just hallucinate things, I need to double-check. Which is what I would have done without one of these things anyway.

source
- AIhasUse@lemmy.world ⁨8⁩ ⁨months⁩ ago
  Hallucinations are largely dealt with if you use agents. It won’t be long until it gets packaged well enough that anyone can just use it. For now, it takes a little bit of effort to get a decent setup.
  
  source
- TrickDacy@lemmy.world ⁨8⁩ ⁨months⁩ ago
  1% correct is never “fairly high” wtf
  
  Also if you want a computer that you don’t have to double check, you literally are expecting software to embody the concept of God. This is fucking stupid.
  
  source
  - De_Narm@lemmy.world ⁨8⁩ ⁨months⁩ ago
    
    1% correct is never “fairly high” wtf It’s all about context. Asking a bunch of 4 year olds about questions about trigonometry, 1% of answers being correct would be fairly high. ‘Fairly high’ basically only means ‘as high as expected’ or ‘higher than expected’.
    
    Also if you want a computer that you don’t have to double check, you literally are expecting software to embody the concept of God. This is fucking stupid. Hence, it is useless. If I cannot expect it to be more or less always correct, I can skip using it and just look stuff up myself.
    
    source
    TrickDacy@lemmy.world ⁨8⁩ ⁨months⁩ ago
    Obviously the only contexts that would apply here are ones where you expect a correct answer. Why would we be evaluating a software that claims to be helpful against 4 year old asked to do calculus? I have to question your ability to reason for insinuating this.
    
    So confirmed. God or nothing. Why don’t you go back to quills? Computers cannot read your mind and write this message automatically, hence they are useless
    
    source
    -> View More Comments
SpaceNoodle@lemmy.world ⁨8⁩ ⁨months⁩ ago
An LLM has never generated a correct answer to any of my queries.

source
- iAmTheTot@kbin.social ⁨8⁩ ⁨months⁩ ago
  That seems unlikely, unless "any" means two.
  
  source
  - SpaceNoodle@lemmy.world ⁨8⁩ ⁨months⁩ ago
    Perhaps the problem is that I never bothered to ask anything trivial enough, but you’d think that two rhyming words starting with 'L" would be simple.
    
    source
    TimeSquirrel@kbin.social ⁨8⁩ ⁨months⁩ ago
    That might be because of how it works under the hood and how it tokenized words, characters, and sentences. It may not have anything telling it that a specific word starts with a specific letter. It might only have the whole word. That's my guess.
    
    source
    CaptDust@sh.itjust.works ⁨8⁩ ⁨months⁩ ago
    “AI” is a really dumb term for what we’re all using currently. General LLMs are not intelligent, it’s assigning priorities to tokens (words) in a database, based on what tokens were provided before it, to compare and guess the next most logical word and phrase, really really fast. Informed guesses, sure, but there’s not enough parameters to consider all the factors required to identify a rhyme.
    
    That said, honestly I’m struggling to come up with 2 rhyming L words? Lol even rhymebrain is failing me. I’m curious what you went with.
    
    source
    MxM111@kbin.social ⁨8⁩ ⁨months⁩ ago
    Ok, by asking you mean that you find somewhere questions that someone identified as being answered wrongly by LLM, and asking yourself.
    
    source
    -> View More Comments
- magic_lobster_party@kbin.run ⁨8⁩ ⁨months⁩ ago
  I’ve asked GPT4 to write specific Python programs, and more often than not it does a good job. And if the program is incorrect I can tell it about the error and it will often manage to fix it for me.
  
  source
- TrickDacy@lemmy.world ⁨8⁩ ⁨months⁩ ago
  I don’t believe you
  
  source
  - FlorianSimon@sh.itjust.works ⁨8⁩ ⁨months⁩ ago
    You have every right not to, but the “useless” word comes out a lot when talking about LLMs and code, and we’re not all arguing in bad faith. The reliability problem is still a strong factor in why people don’t use this more, and, even if you buy into the hype, it’s probably a good idea to temper your expectations and try to walk a mile in the other person’s shoes. You might get to use LLMs and learn a thing or two.
    
    source
    TrickDacy@lemmy.world ⁨8⁩ ⁨months⁩ ago
    I only “believe the hype” because a good developer friend of mine suggested I try copilot so I did and was impressed. It’s an amazing technical achievement that helps me get my job done. It’s useful every single day I use it. Does it do my job for me? No of fucking course not, I’m not a moron who expected that to begin with. It speeds up small portions of tasks and if I don’t understand or agree with its solution, it’s insanely easy not to use it.
    
    People online mad about something new is all this is. There are valid concerns about this kind of tech, but I rarely see that. Ignorance on the topic prevails. Anyone calling ai “useless” in a blanket statement is necessarily ignorant and doesn’t really deserve my time except to catch a quick insult for being the ignorant fool they have revealed themselves to be.
    
    source
    -> View More Comments
  - SpaceNoodle@lemmy.world ⁨8⁩ ⁨months⁩ ago
    OK
    
    source
k_rol@lemmy.ca ⁨8⁩ ⁨months⁩ ago
I think Meta hates your answer

source