We have to stop ignoring AI’s hallucination problem

⁨519⁩ ⁨likes⁩

Submitted 9 months ago by misk@sopuli.xyz to technology@lemmy.world

https://www.theverge.com/2024/5/15/24154808/ai-chatgpt-google-gemini-microsoft-copilot-hallucination-wrong


Comments


Voroxpete@sh.itjust.works ⁨9⁩ ⁨months⁩ ago
We not only have to stop ignoring the problem, we need to be absolutely clear about what the problem is.

LLMs don’t hallucinate wrong answers. They hallucinate all answers. Some of those answers will happen to be right.

If this sounds like nitpicking or quibbling over verbiage, it’s not. This is really, really important to understand. LLMs exist within a hallucinatory false reality. They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.

That is the part that’s crucial to understand. A really simple test of this problem is to ask ChatGPT to back up an answer with sources. It fundamentally cannot do it, because it has no ability to actually comprehend and correlate factual information in that way. This means, for example, that AI is incapable of assessing the potential veracity of the information it gives you. A human can say “That’s a little outside of my area of expertise,” but an LLM cannot. It can only be coded with hard blocks in response to certain keywords to cut it from answering and insert a stock response.

This distinction, that AI is always hallucinating, is important because of stuff like this:

But notice how Reid said there was a balance? That’s because a lot of AI researchers don’t actually think hallucinations can be solved. A study out of the National University of Singapore suggested that hallucinations are an inevitable outcome of all large language models. **Just as no person is 100 percent right all the time, neither are these computers.**

That is some fucking toxic shit right there. Treating the fallibility of LLMs as analogous to the fallibility of humans is a huge, huge false equivalence. Humans can be wrong, but we’re wrong in ways that allow us the capacity to grow and learn. Even when we are wrong about things, we can often learn from how we are wrong. There’s a structure to how humans learn and process information that allows us to interrogate our failures and adjust for them.

When an LLM is wrong, we just have to force it to keep rolling the dice until it’s right. It cannot explain its reasoning. It cannot provide proof of work. I work in a field where I often have to direct the efforts of people who know more about specific subjects than I do, and part of how you do that is you get people to explain their reasoning, and you go back and forth testing propositions and arguments with them. You say “I want this, what are the specific challenges involved in doing it?” They tell you it’s really hard, you ask them why. They break things down for you, and together you find solutions. With an LLM, if you ask it why something works the way it does, it will commit to the bit and proceed to hallucinate false facts and false premises to support its false answer, because it’s not operating in the same reality you are, nor does it have any conception of reality in the first place.
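
The "rolling the dice" point can be made concrete with a toy sketch. It assumes that an answer is drawn at random from a probability distribution over possible continuations; the tokens and probabilities below are invented for illustration and do not come from any real model.

```python
import random

# Hypothetical next-token distribution for one fixed prompt.
# Tokens and probabilities are made up for illustration only.
NEXT_TOKEN_PROBS = {"Paris": 0.80, "Lyon": 0.15, "Berlin": 0.05}


def sample_answer(rng):
    """Draw one answer from the fixed distribution (one roll of the dice)."""
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def ask_many_times(n, seed):
    """Ask the same 'question' n times and tally how often each answer appears."""
    rng = random.Random(seed)
    counts = {token: 0 for token in NEXT_TOKEN_PROBS}
    for _ in range(n):
        counts[sample_answer(rng)] += 1
    return counts


if __name__ == "__main__":
    # Same prompt, same distribution, yet the answers differ between draws.
    print(ask_many_times(1000, seed=0))
```

Nothing about the question changes between draws; only the random number does. A real model is vastly more complex than this, but the sampling step is the part the dice metaphor refers to.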

- dustyData@lemmy.world ⁨9⁩ ⁨months⁩ ago
  This right here is also the reason why AI fanboys get angry when they are told that LLMs are not intelligent or even thinking at all. They don’t understand that in order for rational intelligence to exist, LLMs would need an internal, referential inner world of symbols to contrast external input (training data) against, one that is also capable of changing and molding itself to reality and truth criteria. No, tokens are not what I’m talking about. I’m talking about an internally consistent and persistent representation of the world. An identity, which is currently antithetical to the information model used to train LLMs. Let me try to illustrate.
  
  I don’t remember the details or technical terms but essentially, animal intelligence needs to experience a lot of things first hand in order to create an individualized model of the world which is used to direct behavior (language is just one form of behavior after all). This is very slow and labor intensive, but it means that animals are extremely good, when they get good, at adapting said skills to a messy reality. LLMs are transactional: they rely entirely on correlating patterns in their input. As a result they don’t need years of experience, like humans for example, to develop skilled intelligent responses. They can do it in hours of sensing training input instead. But at the same time, they can never be certain of their results, and when faced with reality, they crumble because it’s harder for them to adapt intelligently and effectively to the mess of reality.
  
  LLMs are a solipsism experiment. A child is locked in a dark cave with nothing but a dim light and millions of pages of text; assume immortality and no need for food or water. As there is nothing else to do but look at the text, they eventually develop the ability to understand how the symbols marked on the text relate to each other, how they are usually and typically assembled one next to the other. One day, a slit in a wall opens and the person receives a piece of paper with a prompt, a pencil and a blank page. Out of boredom, the person looks at the prompt, recognizes the symbols and the pattern, and starts assembling the symbols on the blank page with the pencil. They are just trying to continue from the prompt what they think would typically follow or should follow afterwards. The slit in the wall opens again, and the person intuitively pushes the paper they just wrote into the slit.
  
  For the people outside the cave, leaving prompts and receiving the novel piece of paper, it would look like an intelligent linguistic construction: it is grammatically correct, and the sentences are correctly punctuated and structured. The words even make sense, and it says intelligent things in accordance with the training text left inside and the prompt given. But once in a while it seems to hallucinate weird passages. They miss the point that the person is not hallucinating; they just have no sense of reality. Their reality is just the text. When the cave is opened and the person trapped inside is let out into the light of the world, they would still be profoundly ignorant about it. When given the word sun, written on a piece of paper, they would have no idea that the word refers to the bright burning ball of gas above them. They would know the word, and they would know how it is usually used to assemble text next to other words. But they wouldn’t know what it is.
  
  LLMs are just like that, they just aren’t actually as intelligent as the person in this mental experiment. Because there’s no way, currently, for these LLMs to actually sense and correlate the real world, or several sources of sensors, into a mentalese internal model. This is currently the crux and the biggest problem in the field of AI as I understand it.
  
  - Aceticon@lemmy.world ⁨9⁩ ⁨months⁩ ago
    That’s an excellent metaphor for LLMs.
    
  - Cyberflunk@lemmy.world ⁨9⁩ ⁨months⁩ ago
    Wtf are you even talking about.
    
  - UnpluggedFridge@lemmy.world ⁨9⁩ ⁨months⁩ ago
    How do hallucinations preclude an internal representation? Couldn’t hallucinations arise from a consistent internal representation that is not fully aligned with reality?
    
    I think you are misunderstanding the role of tokens in LLMs and conflating them with internal representation. Tokens are used to generate a state, similar to external stimuli. The internal representation, assuming there is one, is the manner in which the tokens are processed. You could say the same thing about human minds, that the representation is not located anywhere like a piece of data; it is the manner in which we process stimuli.
    
- snek@lemmy.world ⁨9⁩ ⁨months⁩ ago
  I fucking hate how OpenAI and other such companies claim their models “understand” language or are “fluent” in French. These are human attributes. Unless they made a synthetic brain, they can take these claims and shove them up their square tight corporate behinds.
  
  - mamotromico@lemmy.ml ⁨9⁩ ⁨months⁩ ago
    I thought I would have an aneurysm reading their presentation page on Sora.
    
    They are saying Sora can understand and simulate complex physics in 3D space to render a video.
    
    How can such bullshit go unchallenged. It drives me crazy.
    
  - EatATaco@lemm.ee ⁨9⁩ ⁨months⁩ ago
    This is circular logic: only humans can be fluent, so the models can’t be fluent because they aren’t human.
    
    And it’s universally upvoted… in response to a claim that because AIs get things wrong, they can’t be doing anything but hallucinating.
    
    And will you learn from this? Nope. I’ll just be down voted and shouted at.
    
- el_bhm@lemm.ee ⁨9⁩ ⁨months⁩ ago
  
  They do not have any comprehension of the truth or untruth of what they are saying, and this means that when they say things that are true, they do not understand why those things are true.
  
  Which can be beautifully exploited with sponsored content.
  
  See Google I/O '24.
  
  - SnipingNinja@slrpnk.net ⁨9⁩ ⁨months⁩ ago
    What specifically in Google I/O?
    
- nucleative@lemmy.world ⁨9⁩ ⁨months⁩ ago
  Well stated and explained. I’m not an AI researcher but I develop with LLMs quite a lot right now.
  
  Hallucination is a huge problem we face when we’re trying to use LLMs for non-fiction. It’s a little bit like having a friend who can lie straight-faced and convincingly. You cannot distinguish whether they are telling you the truth or they’re lying until you rely on the output.
  
  I think one of the nearest solutions to this may be the addition of extra layers or observer engines that are very deterministic and trained on only extremely reputable sources, perhaps only peer reviewed trade journals, for example, or sources we deem trustworthy. Unfortunately this could only serve to improve our confidence in the facts, not remove hallucination entirely.
  
  It’s even feasible that we could have multiple observers with different domains of expertise (i.e. training sources) and voting capability to fact check and subjectively rate the trustworthiness of the LLM’s output.
  
  But all this will accomplish short term is to perhaps roll the dice in our favor a bit more often.
  
  The perceived results from the end users however may significantly improve. Consider some human examples: sometimes people disagree with their doctor so they go see another doctor and another until they get the answer they want. Sometimes two very experienced lawyers both look at the facts and disagree.
  
  The system that prevents me from knowingly stating something as true, despite not knowing, without some ability to back up my claims is my reputation and my personal values and ethics. LLMs can only pretend to have those traits when we tell them to.
  
  - Voroxpete@sh.itjust.works ⁨9⁩ ⁨months⁩ ago
    
    Consider some human examples: sometimes people disagree with their doctor so they go see another doctor and another until they get the answer they want. Sometimes two very experienced lawyers both look at the facts and disagree.
    
    This actually illustrates my point really well. Because the reason those people disagree might be:
    
    Different awareness of the facts (lawyer A knows an important piece of information lawyer B doesn’t)
    
    Different understanding of the facts (lawyer A might have context lawyer B doesn’t)
    
    Different interpretation of the facts (this is the hardest to quantify, as it’s a complex outcome of everything that makes us human, including personality traits such as our biases).
    
    Whereas you can ask the same question to the same LLM equipped with the same data set and get two different answers because it’s just rolling dice at the end of the day.
    
    If I sit those two lawyers down at a bar, with no case on the line, no motivation other than just friendly discussion, they could debate the subject and likely eventually come to a consensus, because they are sentient beings capable of reason. That’s what LLMs can only fake through smoke and mirrors.
    
- UnpluggedFridge@lemmy.world ⁨9⁩ ⁨months⁩ ago
  I think where you are going wrong here is assuming that our internal perception is not also a hallucination by your definition. It absolutely is. But our minds are embodied, thus we are able to check these hallucinations against some outside stimulus. Your gripe that current LLMs are unable to do that is really a criticism of the current implementations of AI, which are trained on some data, frozen, then restricted from further learning by design. Imagine if your mind was removed from all stimulus and then tested. That is what current LLMs are, and I doubt we could expect a human mind to behave much better in such a scenario. Just look at what happens to people cut off from social stimulus; their mental capacities degrade rapidly and that is just one type of stimulus.
  
  Another problem with your analysis is that you expect the AI to do something that humans cannot do: cite sources without an external reference. Go ahead right now and from memory cite some source for something you know. Do not Google search, just remember where you got that knowledge. Now who is the one that cannot cite sources? The way we cite sources generally requires access to the source at that moment. Current LLMs do not have that by design. Once again, this is a gripe with implementation of a very new technology.
  
  The main problem I have with so many of these “AI isn’t really able to…” arguments is that no one is offering a rigorous definition of knowledge, understanding, introspection, etc in a way that can be measured and tested. Further, we just assume that humans are able to do all these things without any tests to see if we can. Don’t even get me started on the free will vs illusory free will debate that remains unsettled after centuries. But the crux of many of these arguments is the assumption that humans can do it and are somehow uniquely able to do it. We had these same debates about levels of intelligence in animals long ago, and we found that there really isn’t any intelligent capability that is uniquely human.
  
  - mindlesscrollyparrot@discuss.tchncs.de ⁨9⁩ ⁨months⁩ ago
    This seems to be a really long way of saying that you agree that current LLMs hallucinate all the time.
    
    I’m not sure that the ability to change in response to new data would necessarily be enough. They cannot form hypotheses and, even if they could, they have no way to test them.
    
- EatATaco@lemm.ee ⁨9⁩ ⁨months⁩ ago
  
  they do not understand why those things are true.
  
  Some researchers compared the results of questions between chat gpt 3 and 4. One of the questions was about stacking items in a stable way. Chat gpt 3 just, in line with what you are saying about “without understanding”, listed the items saying to place them one on top of each other. No way it would have worked.
  
  Chat gpt 4, however, said that you should put the book down first, put the eggs in a 3 x 3 grid on top of the book, trap them in a way with a laptop so they don’t roll around, and then put the bottle on top of the laptop standing up, and then balance the nail on the top of it…even noting you have to put the flat end of the nail down. This sounds a lot like understanding to me and not just rolling the dice hoping to be correct.
  
  Yes, AI confidently gets stuff wrong. But let’s all note that there is a whole subreddit dedicated to people being confidently wrong. One doesn’t need to go any further than Lemmy to see people confidently claiming to know the truth about shit they should know is outside of their actual knowledge. We’re all guilty of this. Including refusing to learn when we are wrong. Additionally, the argument that they can’t learn doesn’t make sense because models have definitely become better.
  
  Now I’m not saying ai is conscious, I really don’t know, but all of your shortcomings you’ve listed humans are guilty of too. So to use it as examples as to why it’s always just a hallucination, or that our thoughts are not, doesn’t seem to hold much water to me.
  
  - insaan@leftopia.org ⁨9⁩ ⁨months⁩ ago
    
    the argument that they can’t learn doesn’t make sense because models have definitely become better.
    
    They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.
    
    all of your shortcomings you’ve listed humans are guilty of too.
    
    LLMs are sophisticated word generators. They don’t think or understand in any way, full stop. This is really important to understand about them.
    
  - AstralPath@lemmy.ca ⁨9⁩ ⁨months⁩ ago
    A source link to what you’re referring to would be nice.
    
- 5gruel@lemmy.world ⁨9⁩ ⁨months⁩ ago
  I’m not convinced about the “a human can say ‘that’s a little outside my area of expertise’, but an LLM cannot.” I’m sure there are a lot of examples in the training data set that contain qualifications of answers and expressions of uncertainty, so why would the model not be able to generate that output? I don’t see why it would require an “understanding” for that specifically. I would suspect that better human reinforcement would make such answers possible.
  
  - dustyData@lemmy.world ⁨9⁩ ⁨months⁩ ago
    Because humans can do introspection and think and reflect about our own knowledge against the perceived expertise and knowledge of other humans. There’s nothing in LLM models capable of doing this. An LLM cannot assess its own state, and even if it could, it has nothing to contrast it to. You cannot develop the concept of ignorance without an other to interact and compare with.
    
- HelloHotel@lemmy.world ⁨9⁩ ⁨months⁩ ago
  Usually, what I see is that the REPL they are using is never introspective enough. The AI can’t on its own revert to a previous state or give notes to itself, because the response being fast and in linear time matters for a chatbot. ChatGPT can make really cool stuff when you ask it to break its thought process into steps, ones it usually fails spectacularly at. It was like pulling teeth to get it to actually do the steps and not just give the bad answer anyway.
  
astreus@lemmy.ml ⁨9⁩ ⁨months⁩ ago
“We invented a new kind of calculator. It usually returns the correct value for the mathematics you asked it to evaluate! But sometimes it makes up wrong answers for reasons we don’t understand. So if it’s important to you that you know the actual answer, you should always use a second, better calculator to check our work.”

Then what is the point of this new calculator?

Fantastic comment, from the article.

- CaptainSpaceman@lemmy.world ⁨9⁩ ⁨months⁩ ago
  It’s not just a calculator though.
  
  Image generation requires no fact checking whatsoever, and some of the tools can do it well.
  
  That said, LLMs will always have limitations and true AI is still a ways away.
  
  - pixel_prophet@lemm.ee ⁨9⁩ ⁨months⁩ ago
    The biggest disappointment in the image generation capabilities was the realisation that there is no object permanence there in terms of the components making up an image, so for any specificity you’re just playing whack-a-mole with iterations that introduce other undesirable shit, no matter how specific you make your prompts.
    
    They are also now heavily nerfing the models to avoid lawsuits by just ignoring anything relating to specific styles that may be considered trademarks. Problem is, those are often industry jargon, so now you’re having to craft more convoluted prompts and get more mid results.
    
  - sudneo@lemm.ee ⁨9⁩ ⁨months⁩ ago
    It does require fact-checking. You might ask for a human and get someone with 10 fingers on one hand, you might ask for people in the background and get blobs merged into each other. The fact check in images is absolutely necessary and consists of verifying that the generated image adheres to your prompt and that the objects in it match their intended real counterparts.
    
    I do agree that it’s a different type of fact checking, but that’s because an image is not inherently correct or wrong, it only is if compared to your prompt and (where applicable) to reality.
    
  - catloaf@lemm.ee ⁨9⁩ ⁨months⁩ ago
    It doesn’t? Have you not seen any of the articles about AI-generated images being used for misinformation?
    
  - elephantium@lemmy.world ⁨9⁩ ⁨months⁩ ago
    
    Image generation requires no fact checking whatsoever
    
    Sure it does. Let’s say IKEA wants to use midjourney to generate images for its furniture assembly instructions. The instructions are already written, so the prompt is something like “step 3 of assembling the BorkBork kitchen table”.
    
    Would you just auto-insert whatever it generated and send it straight to the printer for 20000 copies?
    
    Or would you look at the image and make sure that it didn’t show a couch instead?
    
    If you choose the latter, that’s fact checking.
    
    That said, LLMs will always have limitations and true AI is still a ways away.
    
    I can’t agree more strongly with this point!
    
- lateraltwo@lemmy.world ⁨9⁩ ⁨months⁩ ago
  It’s a nascent stage technology that reflects the world’s words back at you in statistical order by way of parsing user generated prompts. It’s a reactive system with no autonomy to deviate from a template upon reset. It’s no Roko’s Basilisk inherently, just because
  
  - tourist@lemmy.world ⁨9⁩ ⁨months⁩ ago
    am I understanding correctly that it’s just a fancy random word generator
    
- elephantium@lemmy.world ⁨9⁩ ⁨months⁩ ago
  Some problems lend themselves to “guess-and-check” approaches. This calculator is great at guessing, and it’s usually “close enough”.
  
  The other calculator can check efficiently, but it can’t solve the original problem.
  
  Essentially this is the entire motivation for numerical methods.
  
  - Aceticon@lemmy.world ⁨9⁩ ⁨months⁩ ago
    In my personal experience, given that’s how I generally manage to shortcut a lot of labour intensive intellectual tasks, using intuition to guess possible answers/results and then working backwards from them to determine which one is right, and even prove it, is generally faster (I guess how often it’s so depends on how good one’s intuition is in a given field, which in turn correlates with experience in it).
    
    That said, it’s far from guaranteed faster.
    
    Further, the intuition step alone does not yield a result that can be trusted without validation.
    
    Maybe by being used as intuition is in this process, LLMs can help accelerate the search for results in subjects one has not enough experience in to have good intuition on but has enough validation (or there are ways or tools to do it inherent to that domain) to do the “validation of possible results” part.
    
- kuberoot@discuss.tchncs.de ⁨9⁩ ⁨months⁩ ago
  That’s not really right, because verifying solutions is usually much easier than finding them. A calculator that can take in arbitrary sets of formulas and produce answers for variables, but is sometimes wrong, is an entirely different beast than a calculator that can plug values into variables and evaluate expressions to check if they’re correct.
  
  As a matter of fact, I’m pretty sure that argument would also make quantum computing pointless - because quantum computers are probability based and can provide answers for difficult problems, but not consistently, so you want to use a regular computer to verify those answers.
  
  Perhaps a better comparison would be a dictionary that can explain entire sentences, but requires you to then check each word in a regular dictionary and make sure it didn’t mix them up completely? Though I guess that’s actually exactly how LLMs operate…
  
  - assassin_aragorn@lemmy.world ⁨9⁩ ⁨months⁩ ago
    It’s only easier to verify a solution than come up with a solution when you can trust and understand the algorithms that are developing the solution. Simulation software for thermodynamics is magnitudes faster than hand calculations, but you know what the software is doing. The creators of the software aren’t saying “we don’t actually know how it works”.
    
    In the case of an LLM, I have to verify everything with no trust whatsoever. And that takes longer than just doing it myself. Especially because an LLM is writing something for me, it isn’t doing complex math.
    
- Zerfallen@lemmy.world ⁨9⁩ ⁨months⁩ ago
  It would be a great comment if it represented reality, but as an analogy it’s completely off.
  
  LLM-based AI represents functionality that nothing other than the human mind and painstaking research or singular expertise can replicate. There is no already existing ‘second, better calculator’ that has the same breadth of capabilities, particularly in areas involving language.
  
  If you’re only using it as a calculator (which was never the strength of an LLM in the first place), for problems you could already solve with a calculator because you understand what is required, then uh… yeah i mean use a calculator, that is the appropriate tool.
  
  - ramirezmike@programming.dev ⁨9⁩ ⁨months⁩ ago
    do you know what an analogy is??
    
- RecluseRamble@lemmy.dbzer0.com ⁨9⁩ ⁨months⁩ ago
  The problem is people thinking the tool is a “calculator” (or fact-checker or search engine) while it’s just a text generator. It’s great for generating text.
  
  But even then it can’t keep a paragraph stable during the conversation. For me personally, the best antidote against the hype was to use the tool.
  
MentalEdge@sopuli.xyz ⁨9⁩ ⁨months⁩ ago
Altman going “yeah we could make it get things right 100% of the time, but that would be boring” has such “my girlfriend goes to another school” energy it’s not even funny.

FalseMyrmidon@kbin.run ⁨9⁩ ⁨months⁩ ago
Who's ignoring hallucinations? It gets brought up in basically every conversation about LLMs.

- 14th_cylon@lemm.ee ⁨9⁩ ⁨months⁩ ago
  People who suggest, let’s say, firing the employees of a crisis intervention hotline and replacing them with LLMs…
  
  - SkyezOpen@lemmy.world ⁨9⁩ ⁨months⁩ ago
    “Have you considered doing a flip as you leap off the building? That way your death is super memorable and cool, even if your life wasn’t.”
    
    -Crisis hotline LLM, probably.
    
  - Voroxpete@sh.itjust.works ⁨9⁩ ⁨months⁩ ago
    Less horrifying conceptually, but in Canada a major airline tried to replace their support services with a chatbot. The chatbot then invented discounts that didn’t actually exist, and the courts ruled that the airline had to honour them. The chatbot was, for all intents and purposes, no more or less official a source of data than any other information they put out, such as their website and other documentation.
    
  - L_Acacia@lemmy.one ⁨9⁩ ⁨months⁩ ago
    They know the tech is not good enough, they just don’t care and want to maximise profit.
    
- Neato@ttrpg.network ⁨9⁩ ⁨months⁩ ago
  It really needs to be a disqualifying factor for generative AI. Even using it for my hobbies is useless when I can’t trust it knows dick about fuck. Every time I test the new version out it gets things so blatantly wrong and contradictory that I give up; it’s not worth the effort. It’s no surprise everywhere I’ve worked has outright banned its use for official work.
  
  - DdCno1@kbin.social ⁨9⁩ ⁨months⁩ ago
    I agree. The only application that is fine for this in my opinion is using it solely for entertainment, as a toy.
    
    The problem is of course that everyone and their mothers are pouring billions into what clearly should only be used as a toy, expecting it to perform miracles it currently can not and might never be able to pull off.
    
Lmaydev@programming.dev ⁨9⁩ ⁨months⁩ ago
Honestly I feel people are using them completely wrong.

Their real power is their ability to understand language and context.

Turning natural language input into commands that can be executed by a traditional software system is a huge deal.

Microsoft released an AI powered auto complete text box and it’s genius.

Currently you have to type an exact text match in an auto complete box. So if you type cats but the item is called pets you’ll get no results. Now the ai can find context based matches in the auto complete list.

This is their real power.
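
The cats/pets case can be sketched in a few lines. This is only a toy under stated assumptions: the item list is invented, and the `RELATED` table is a hypothetical hand-written stand-in for whatever learned model a real product would use. It is not how Microsoft's text box is implemented. Typo-tolerant matching via Python's standard `difflib` is included for comparison.

```python
import difflib

# Hypothetical list of autocomplete entries.
ITEMS = ["pets", "plants", "furniture"]

# Hypothetical stand-in for a learned model: a tiny hand-written
# table mapping a typed word to the entry it is conceptually related to.
RELATED = {"cats": "pets", "dogs": "pets", "sofa": "furniture", "ferns": "plants"}


def exact_match(query, items):
    """Classic autocomplete: keep entries that start with the typed text."""
    q = query.lower()
    return [item for item in items if item.lower().startswith(q)]


def fuzzy_match(query, items):
    """Typo-tolerant matching based on character similarity only."""
    return difflib.get_close_matches(query.lower(), items, n=3, cutoff=0.7)


def context_match(query, items):
    """Try the literal match first, then fall back to the concept table."""
    hits = exact_match(query, items)
    if hits:
        return hits
    concept = RELATED.get(query.lower())
    return [concept] if concept in items else []


if __name__ == "__main__":
    for query in ("pe", "petz", "cats"):
        print(query, exact_match(query, ITEMS),
              fuzzy_match(query, ITEMS), context_match(query, ITEMS))
```

In this toy, a typo like "petz" is rescued by character similarity, but "cats" shares too few characters with "pets" to match; only the lookup by meaning finds it.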

Also they’re amazing at generating non factual based things. Stories, poems etc.

- noodlejetski@lemm.ee ⁨9⁩ ⁨months⁩ ago
  
  Their real power is their ability to understand language and context.
  
  …they do exactly none of that.
  
  - breakingcups@lemmy.world ⁨9⁩ ⁨months⁩ ago
    No, but they approximate it. Which is fine for most use cases the person you’re responding to described.
    
  - Lmaydev@programming.dev ⁨9⁩ ⁨months⁩ ago
    They do it much better than anything you can hard code currently.
    
- Blue_Morpho@lemmy.world ⁨9⁩ ⁨months⁩ ago
  
  So if you type cats but the item is called pets you’ll get no results. Now the ai can find context based matches in the auto complete list.
  
  Google added context search to Gmail and it’s infuriating. I’m looking for an exact phrase that I even put in quotes but Gmail returns a long list of emails that are vaguely related to the search word.
  
  - Lmaydev@programming.dev ⁨9⁩ ⁨months⁩ ago
    That is indeed a poor use. Searching traditionally first and falling back to it would make way more sense.
    
- Voroxpete@sh.itjust.works ⁨9⁩ ⁨months⁩ ago
  That’s called “fuzzy” matching, it’s existed for a long, long time. We didn’t need “AI” to do that.
  
  - Lmaydev@programming.dev ⁨9⁩ ⁨months⁩ ago
    No it’s not.
    
    Fuzzy matching is a search technique that uses a set of fuzzy rules to compare two strings. The fuzzy rules allow for some degree of similarity, which makes the search process more efficient.
    
    That allows for mistyping etc. It doesn’t allow context based searching at all.
    
    Also it is an AI technique itself.
    
- hedgehogging_the_bed@lemmy.world ⁨9⁩ ⁨months⁩ ago
  Searching with synonym matching is almost decades old at this point. I worked on it as an undergrad in the early 2000s and it wasn’t new then, just complicated. Google’s version improved over other search algorithms for a long time and then trashed it by letting AI take over.
  
  - Lmaydev@programming.dev ⁨9⁩ ⁨months⁩ ago
    Google’s algorithm has pretty much always used AI techniques.
    
    It doesn’t have to be a synonym. That’s just an example.
    
    Typing diabetes and getting medical services as a result wouldn’t be possible with that technique unless you had a database of every disease to search.
    
- Th4tGuyII@kbin.social ⁨9⁩ ⁨months⁩ ago
  Exactly. The big problem with LLMs is that they're so good at mimicking understanding that people forget that they don't actually have understanding of anything beyond language itself.
  
  The thing they excel at, and should be used for, is exactly what you say - a natural language interface between humans and software.
  
  Like in your example, an LLM doesn't know what a cat is, but it knows what words describe a cat based on training data - and for a search engine, that's all you need.
  
  source
- not_amm@lemmy.ml ⁨9⁩ ⁨months⁩ ago
  That’s why I only use Perplexity. ChatGPT can’t give me sources unless I pay, so I can’t trust the information it gives me. It also hallucinated a lot when coding; it was faster to search the official documentation than to correct and debug code “generated” by ChatGPT.
  
  I use Perplexity + SearXNG, so I can search a lot faster and cite sources. It also makes summaries of your search, so it saves me time while writing introductions and so on.
  
  It sometimes hallucinates too and cites weird sources, but it’s faster for me to correct and search for better sources given the context and more ideas. In summary, when/if you’re correcting the prompts and searching apart from Perplexity, you already have something useful.
  
  BTW, I try not to use it a lot, but it’s way better for my workflow.
  
  source
Wirlocke@lemmy.blahaj.zone ⁨9⁩ ⁨months⁩ ago
I’m a bit annoyed at all the people being pedantic about the term hallucinate.

Programmers use preexisting concepts as allegory for computer concepts all the time.

Your file isn’t really a file, your desktop isn’t a desk, your recycling bin isn’t a recycling bin.

[Insert the entirety of Object Oriented Programming here]

Neural networks aren’t really neurons, genetic algorithms aren’t really genetics, and the LLM isn’t really hallucinating.

But it easily conveys what the bug is. It only personifies the LLM because the English language almost always personifies the subject. The moment you apply a verb to an object you imply it performed an action, unless you limit yourself to esoteric words/acronyms or you use several words to overexplain every time.

source
- calcopiritus@lemmy.world ⁨9⁩ ⁨months⁩ ago
  It’s easily the worst problem of Lemmy. Sometimes one guy has an issue with something and suddenly the whole thread is about that thing, as if everyone thought about it. No, you didn’t think about it, you just read another person’s comment and made another one instead of replying to it.
  
  I never heard anyone complain about the term “hallucination” for AIs, but suddenly in this one thread there are 100 cloned comments instead of a single upvoted one.
  
  I get it, you don’t like “hallucinate”, just upvote the existing comment about it and move on. If you have anything to add, reply to that comment.
  
  I don’t know why this specific thing is so common on Lemmy though, I don’t think it happened on Reddit.
  
  source
  - emptiestplace@lemmy.ml ⁨9⁩ ⁨months⁩ ago
    
    I don’t know why this specific thing is so common on Lemmy though, I don’t think it happened on Reddit.
    
    When you’re used to knowing a lot relative to the people around you, learning to listen sometimes becomes optional.
    
    source
  - ZILtoid1991@lemmy.world ⁨9⁩ ⁨months⁩ ago
    “Hallucination” pretty well describes my opinion on AI generated “content”. I think all of their generation is a hallucination at best.
    
    Garbage in, garbage out.
    
    source
- abrinael@lemmy.world ⁨9⁩ ⁨months⁩ ago
  What I don’t like about it is that it makes it sound more benign than it is. Which also points to who decided to use that term - AI promoters/proponents.
  
  source
  - zalgotext@sh.itjust.works ⁨9⁩ ⁨months⁩ ago
    The term “hallucination” has been used for years in AI/ML academia. I was reading about AI hallucinations ten years ago when I was in college. The term was originally coined by researchers and mathematicians, not the snake oil salesmen pushing AI today.
    
    source
- ZILtoid1991@lemmy.world ⁨9⁩ ⁨months⁩ ago
  Nowadays they’re using it to humanize neural networks, and thus oversell their capabilities.
  
  source
lectricleopard@lemmy.world ⁨9⁩ ⁨months⁩ ago
The Chinese Room thought experiment is a good place to start the conversation. AI isn’t intelligent, and it doesn’t hallucinate. It’s not sentient. It’s just a computer program.

People need to stop using personifying language for this stuff.

source
- TubularTittyFrog@lemmy.world ⁨9⁩ ⁨months⁩ ago
  that’s not fun and dramatic and clickbaity though
  
  source
- TheDarksteel94@sopuli.xyz ⁨9⁩ ⁨months⁩ ago
  Technically, humans are just bio machines, running very complicated software. AI just isn’t there yet.
  
  source
ALostInquirer@lemm.ee ⁨9⁩ ⁨months⁩ ago
Why do tech journalists keep using the businesses’ language about AI, such as “hallucination”, instead of glitching/bugging/breaking?

source
- Danksy@lemmy.world ⁨9⁩ ⁨months⁩ ago
  It’s not a bug, it’s a natural consequence of the methodology. A language model won’t always be correct when it doesn’t know what it is saying.
  
  source
- machinin@lemmy.world ⁨9⁩ ⁨months⁩ ago
  …wikipedia.org/…/Hallucination_(artificial_intell…
  
  The term “hallucinations” originally came from computer researchers working with image producing AI systems. I think you might be hallucinating yourself 😉
  
  source
KillingTimeItself@lemmy.dbzer0.com ⁨9⁩ ⁨months⁩ ago
it’s only going to get worse, especially as datasets deteriorate.

With things like reddit being overrun by AI, and also selling AI training data, i can only imagine what mess that’s going to cause.

source
SolNine@lemmy.ml ⁨9⁩ ⁨months⁩ ago
The simple solution is not to rely upon AI. It’s like a misinformed relative after a jar of moonshine: they might be right some of the time, or they might be totally full of shit.

I honestly don’t know why people are obsessed with relying on AI. Is it that difficult to look up the answer from a reliable source?

source
Hugin@lemmy.world ⁨9⁩ ⁨months⁩ ago
Prisencolinensinainciusol is an Italian song that is complete gibberish but made to sound like an English-language song. That’s what AI is right now.

www.youtube.com/watch?v=RObuKTeHoxo

source
xia@lemmy.sdf.org ⁨9⁩ ⁨months⁩ ago
Yeah! Just like water’s “wetness” problem. It’s kinda fundamental to how the tech operates.

source
SulaymanF@lemmy.world ⁨9⁩ ⁨months⁩ ago
We also have to stop calling it hallucinations. The proper term in psychology for making stuff up like this is “Confabulations.”

source
Zier@fedia.io ⁨9⁩ ⁨months⁩ ago
AI making things up?
So someone finally invented an electronic replacement for politicians.

source
JackGreenEarth@lemm.ee ⁨9⁩ ⁨months⁩ ago
Don’t let perfect be the enemy of good.

source
CrayonRosary@lemmy.world ⁨9⁩ ⁨months⁩ ago
More importantly, we need to stop ignoring criminal case eyewitnesses’ hallucinatory testimony.

source
possiblylinux127@lemmy.zip ⁨9⁩ ⁨months⁩ ago
What do you think we are working on?

source
autotldr@lemmings.world [bot] ⁨9⁩ ⁨months⁩ ago
This is the best summary I could come up with:

All of Silicon Valley — of Big Tech — is focused on taking large language models and other forms of artificial intelligence and moving them from the laptops of researchers into the phones and computers of average people.

But if I type “show me a picture of Alex Cranz” into the prompt window, Meta AI inevitably returns images of very pretty dark-haired men with beards.

Earlier this year, ChatGPT had a spell and started spouting absolute nonsense, but it also regularly makes up case law, leading to multiple lawyers getting into hot water with the courts.

In a commercial for Google’s new AI-ified search engine, someone asked how to fix a jammed film camera, and it suggested they “open the back door and gently remove the film.” That is the easiest way to destroy any photos you’ve already taken.

An AI’s difficult relationship with the truth is called “hallucinating.” In extremely simple terms: these machines are great at discovering patterns of information, but in their attempt to extrapolate and create, they occasionally get it wrong.

This idea that there’s a kind of unquantifiable magic sauce in AI that will allow us to forgive its tenuous relationship with reality is brought up a lot by the people eager to hand-wave away accuracy concerns.

The original article contains 1,211 words, the summary contains 212 words. Saved 82%. I’m a bot and I’m open source!

source
Cyberflunk@lemmy.world ⁨9⁩ ⁨months⁩ ago
Holy shit. Dunning-Kruger is fully engaged in these post comments.

source
oldfemboy@lemmy.ml ⁨9⁩ ⁨months⁩ ago
I remember getting gaslit about AIs lying. So glad it’s getting attention.

source
OozingPositron@feddit.cl ⁨9⁩ ⁨months⁩ ago
>The verge

Don’t take away the hallucinations, how am I supposed to do ERP with the models then?

source
muntedcrocodile@lemm.ee ⁨9⁩ ⁨months⁩ ago
AI’s real power is its ability to use tools and understand context from existing tools. For a FOSS tool that uses an LLM to do web searches and generate accurate (not guaranteed) results, try my tool: github.com/muntedcrocodile/Sydney

source
kromem@lemmy.world ⁨9⁩ ⁨months⁩ ago
It’s not hallucination, it’s confabulation. Very similar in its nuances to stroke patients.

Likewise, the pretrained model trying to nuke people in wargames wasn’t malicious so much as behaving the way anyone sitting in front of a big red button labeled ‘Nuke’ might without a functioning prefrontal cortex to inhibit that exploratory thought.

Human brains are a delicate balance between fairly specialized subsystems.

Right now, ‘AI’ companies are mostly trying to do it all in one at once. Yes, the current models are typically a “mixture of experts,” but it’s still all in one functional layer.

Hallucinations/confabulations are currently fairly solvable for LLMs. You just run the same query a bunch of times and see how consistent the answer is. If the model is making it up because it doesn’t know, the answers will be stochastic. If it knows the correct answer, they will be consistent. If it only partly knows, they will be somewhere in between (but in a way that can be fine-tuned to be detected by a classifier).

This adds a second layer across each of those variations. If you want to check whether something is safe, you’d also need to verify that answer isn’t a confabulation, so that’s more passes.

It gets to be a lot quite quickly.

As the tech scales (what’s being done with servers today will happen around 80% as well on smartphones in about two years), those extra passes aren’t going to need to be as massive.

This is a problem that will eventually go away, just not for a single pass at a single layer, which is 99% of the instances where people are complaining this is an issue.
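
For anyone who wants to try the "run it a bunch of times" check described above, here is a rough sketch under loud assumptions: `ask` is a hypothetical callable standing in for whatever model API you use, the two stub "models" are invented for the demo, and agreement is measured by exact match after light normalization, which is far cruder than the classifier mentioned above.

```python
import itertools
from collections import Counter

def normalize(answer):
    """Lowercase and collapse whitespace so trivially different answers compare equal."""
    return " ".join(answer.lower().split())

def consistency_score(answers):
    """Fraction of samples that agree with the most common answer (1.0 = fully consistent)."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    return counts.most_common(1)[0][1] / len(answers)

def sample_and_score(ask, query, n=5):
    """Ask the same query n times and score how consistent the answers are."""
    return consistency_score([ask(query) for _ in range(n)])

# Hypothetical stand-ins for a real model, for illustration only.
def model_that_knows(query):
    return "Paris"

_guesses = itertools.cycle(["Lyon", "Nice", "Paris", "Lille", "Metz"])

def model_that_guesses(query):
    return next(_guesses)

print(sample_and_score(model_that_knows, "Capital of France?"))    # 1.0
print(sample_and_score(model_that_guesses, "Capital of France?"))  # 0.2
```

Every extra sample is another full pass through the model, which is where the cost mentioned above comes from. Free-form answers would also need a smarter comparison than exact matching.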

source
Alllo@lemmy.world ⁨9⁩ ⁨months⁩ ago
without reading the article, this is the best summary I could come up with:

Mainstream government-tied media keeps hallucinating up facts. Republican, Democrat, doesn’t matter; they hallucinate up facts. Time to stop ignoring humans’ hallucination problem. At least with AI, they don’t have some subversive agenda beneath the surface when they do it. Time to help AI take over the world, bbl

source
HawlSera@lemm.ee ⁨9⁩ ⁨months⁩ ago
The AI isn’t alive, it’s not hallucinating… We will likely never have true AI until we figure out the Hard Problem.

source