Think of LLMs as the person who gets good marks in exams because they memorized the entire textbook.
For small, quick problems you can rely on them (“Hey, what’s the syntax for using rsync between two remote servers?”), but the moment the problem gets slightly complicated, they fail, because they don’t actually understand what they’ve learnt; if the answer isn’t in the original textbook, they fall apart.
Now, if you are familiar with the source material, or if you are decently proficient in coding, you can spot their incorrect responses, correct them, and make the result your own. Instead of creating the solution from scratch, LLMs can give you a push in the right direction.
However, DON’T treat their output as gospel truth. LLMs can augment good coders, but they can lead poor coders astray.
This is not specific to LLMs; if you don’t know how to use Stack Overflow, you can pick the wrong answer from the list of proposed solutions. You need to be technically proficient to even tell which solution fits your use case. Having a strong base will help you in the long run.
josefo@leminal.space 8 months ago
Great summary. I would add: don’t use LLMs to learn something new. As OP mentioned, when you know your stuff, you are aware of how much it bullshits. What happens when you don’t know? You eat all the bullshit because it sounds good. Or you end up with a vibed codebase you can’t fully understand, because you didn’t do the reasoning to produce it. It’s like driving a car with a shitty copilot that sometimes hallucinates roads; if you don’t know where you’re supposed to be going, wherever that copilot takes you looks good. You lack the context to judge the results or advice.
I basically use it nowadays as a semantic search engine over documentation. Talking with documentation is the coolest. If the response doesn’t come with a doc link, it’s probably not worth it. Make it point to the human input, make it help you find things you don’t know the name of, but never trust the output without judging it. In my experience, making it generate code that you end up correcting is a heavier cognitive load than writing it yourself from scratch.