Comment on How to turn off Gemini in Gmail — and why you should | Proton
hperrin@lemmy.ca 2 days ago
I played around with it a lot yesterday, giving it documentation and asking it to write some code based on the API documentation. Just like every single other LLM I’ve ever tried, it bungled the entire thing. It made up a bunch of functions and syntax that just don’t exist. After I told it the code was wrong and gave it the right way to do it, it told me that I got it wrong and converted it back to the incorrect syntax. LLMs are interesting toys, but shouldn’t be used for real work.
IEatDaFeesh@lemmy.world 2 days ago
Sounds like a skill issue. I guess you don’t know how to prompt correctly. 🤷
hperrin@lemmy.ca 2 days ago
Feel free to try. Here’s the library I use: nymph.io
It’s open source, and all the docs and code are available at that link and on GitHub. I always ask it to make a note entity, which is just incredibly simple. Basically the same thing as the ToDo example.
The reason I use this library (other than that I wrote it, so I know it really well) is that it isn’t widely known and there aren’t many example projects of it on GitHub, so the LLM has to actually read and understand the docs and code in order to use it properly. For something like React, there are a million examples online, so for basic things the LLM isn’t really understanding anything; it’s just producing something similar to its training data. That’s not how actual high-level programming works, so making it follow an API it isn’t already trained on is a good way to test whether it comes anywhere near the abilities of an actual entry-level SWE.
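To give a sense of how simple the ask is, the note entity I want is roughly the sketch below. It’s simplified, and the import and type names are illustrative rather than necessarily the actual nymph.io API, so check the docs linked above for the real definitions.

```typescript
// Rough sketch of a minimal "Note" entity, modeled on the ToDo example.
// The package name, base class, and static `class` property here are
// from memory and illustrative; the authoritative version is in the
// nymph.io docs.
import { Entity } from '@nymphjs/client';

// The data this entity carries.
export type NoteData = {
  title: string;
  body: string;
};

// A Nymph-style entity: extend the client Entity class and declare
// which server-side entity class it maps to.
export class Note extends Entity<NoteData> {
  public static class = 'Note';
}
```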
SuspciousCarrot78@lemmy.world 2 days ago
Yeah. I had ChatGPT (more than once) take the code I gave it, cut it in half, scramble it, and then claim “see? I did it! Code works now”.
When you point out what it did by pasting its own code back in, it will say “oh, why did you do that? There’s a mistake in your code at XYZ”. No… there’s a mistake in your code, buddy.
When you paste in what you want it to add, it “fixes” XYZ… and, surprise surprise, it’s either your OG code again or something new breaks.
The only one I’ve seen that doesn’t do this (or does it a lot less) is Claude.
I think Lumo is, for the most part, really just Mistral, Nemotron, and OpenHands in a trench coat. ICBW.
I think Lumo’s value proposition is around data retention and privacy, not SOTA LLM tech.