Comment on OopsGPT - OpenAI just announced a new search tool. Its demo already got something wrong.
Bell@lemmy.world 5 months ago
The hallucinations will continue until the training data is absolutely perfect
In order to get perfect training data, they cannot use any human output.
I’m afraid it is not going to happen anytime soon :)
I’ve started editing my answers/questions on StackExchange. Few characters at a time. I’m doing my part.
Won’t that also make things worse for people looking for answers?
marked as duplicate, closed
Are you improving it, or do you create new errors? ;-)
Errors. In spring I rewrote all my stuff up there with "Fuck you OpenAI" or something like that and got banned for 3 months. So when my calendar reminded me that the 3 months were up, I got to work a little more sophisticatedly.
yes
What other output do you propose?
I do not propose, and it is not necessarily any output.
Their first question is, what do they want the AI to do. And if they want it to be perfect, then they need to use perfect training data, not human output.
This is exactly why Apple uses an API for apps to provide well-structured data as context instead of random screenshot data.
Hallucinations are an unavoidable part of LLMs, and are just as present in the human mind. Training data isn’t the issue. The issue is that the design of the systems that leverage LLMs uses them to do more than they should be doing.
I don’t think that anything short of being able to validate an LLM’s output without running it through another LLM will be able to fully prevent hallucinations.
hendrik@palaver.p3x.de 5 months ago
That's not correct btw. AI is supposed to be creative and come up with new text/images/ideas. Even with perfect training data. That creativity means creativity. We want it to come up with new text out of thin air. And perfect training data is not going to change anything about it. We'd need to remove the ability to generate fictional stories and lots of other answers, too. Or come up with an entirely different approach.
bjorney@lemmy.ca 5 months ago
AI isn’t supposed to be creative; it isn’t even capable of that. It’s meant to min/max its evaluation criterion against a test dataset.
It does this by regurgitating the training data associated with a given input as closely as possible
hendrik@palaver.p3x.de 5 months ago
I've heard people saying that before. But it's not true. You can ask an AI to draw you an astronaut on a horse and it'll do it despite never having seen such a picture. (Now it has.) Same applies to LLMs. They come up with an answer to your exact question, not a similar one they saw on Reddit before. That answer might be wrong (which is my point), but if you try it, you'll regularly find it tries answering your questions and not different ones.
I've also tried some sci-fi story writing with AI, and there it becomes quite obvious that it's able to take things it knows from different contexts and apply them to my setting: ethics questions, basic physics, what characters can and cannot do, rough knowledge about how stories are written. You can tell it to do a plot twist at an arbitrary point and it'll do it. All of that is knowledge about (abstract) knowledge and the ability to apply it to different contexts, which is an important part of creativity.
And I've read papers where the scientists try to look inside of AI and they are able to spot abstract concepts like what a cat is in the weights. It's fascinating how it works. And it turns out it's not just regurgitating its training data. Which isn't surprising because a lot of effort has been put into the computer science behind it to make AI more than that. And it's also why they're useful in the first place.
LunarLoony@lemmy.sdf.org 5 months ago
It’s able to apply those things because it’s read millions of sci-fi stories, and can make an educated guess. It’s also able to produce an image of an astronaut on a horse because it’s seen lots of images of astronauts and horses, and people sitting on horses, so it can once again make an educated guess. I don’t think it’s right to call that creativity.