Comment on Rabbit R1 AI box revealed to just be an Android app
De_Narm@lemmy.world 6 months ago
“Fairly high” is still useless (and doesn’t actually quantify anything, depending on context both 1% and 99% could be ‘fairly high’). As long as these models just hallucinate things, I need to double-check. Which is what I would have done without one of these things anyway.
1% correct is never “fairly high” wtf
Also if you want a computer that you don’t have to double check, you literally are expecting software to embody the concept of God. This is fucking stupid.
1% correct is never “fairly high” wtf
It’s all about context. If you asked a bunch of 4-year-olds questions about trigonometry, 1% of the answers being correct would be fairly high. ‘Fairly high’ basically only means ‘as high as expected’ or ‘higher than expected’.
Also if you want a computer that you don’t have to double check, you literally are expecting software to embody the concept of God. This is fucking stupid.
Hence, it is useless. If I cannot expect it to be more or less always correct, I can skip using it and just look stuff up myself.
Obviously the only contexts that would apply here are ones where you expect a correct answer. Why would we be evaluating software that claims to be helpful against a 4-year-old asked to do calculus? I have to question your ability to reason for insinuating this.
So confirmed. God or nothing. Why don’t you go back to quills? Computers cannot read your mind and write this message automatically, hence they are useless.
Obviously the only contexts that would apply here are ones where you expect a correct answer.
That’s the whole point, I don’t expect correct answers. Neither from a 4 year old nor from a probabilistic language model.
AIhasUse@lemmy.world 6 months ago
Hallucinations are largely dealt with if you use agents. It won’t be long until it gets packaged well enough that anyone can just use it. For now, it takes a little bit of effort to get a decent setup.