The original paper vs Twitter: arxiv.org/pdf/2307.09009.pdf
GPT-4 is getting worse over time, not better.
Submitted 1 year ago by jeena@jemmy.jeena.net to technology@lemmy.world
Comments
chepox@sopuli.xyz 1 year ago
It is a developing technology. Good that they find these decreases in accuracy early so that they can be understood and worked out. Of course, there may be something nefarious going on behind the scenes, where they're trying to commercialize different models by tiers or something a brainless, market-oriented CEO thought of. Hope not. Time will tell…
randint@feddit.nl 1 year ago
I also felt like something similar happened to ChatGPT. A few days ago I asked it to rewrite some Korean text with Hanja, retried many times, but it kept spitting out the same text without changing a thing. After several frustrating attempts, it finally spat out something with Hanja, which according to my deduction with the help of Google Translate, was only partially correct. A few months ago though, it could come up with something that’s mostly correct. Sad.
PS: Before anyone replies to me in Korean, I should note that I don’t speak Korean at all. I just happened to have stumbled upon this Wikipedia article about Korean Mixed Script.
agitatedpotato@lemmy.world 1 year ago
Let it keep learning from random internet posts, I'm sure it will get better that way.
spaduf@lemmy.blahaj.zone 1 year ago
One theory that I've not seen mentioned here is that there's been a lot of work lately on multiple LLMs in communication. If these were used in the RL loop, we could see degradation effects similar to those that have recently been in the news with regard to image generation models.
Hextic@lemmy.world 1 year ago
Lol, lmao even
vvvvv@lemmy.world 1 year ago
The research linked in the tweet (direct quotes, page 6) claims that for GPT-4 "the percentage of generations that are directly executable dropped from 52.0% in March to 10.0% in June" because "they added extra triple quotes before and after the code snippet, rendering the code not executable" — so I wouldn't put too much weight on this particular paper. But yeah, OpenAI tinkers with their models, probably trying to run them more cheaply, and that results in these changes. They do have versioning, but old versions are deprecated and removed often, so what can you do?
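To illustrate the point: the "not executable" drop largely came from the model wrapping code in markdown triple-backtick fences, which the paper's harness apparently ran verbatim. A minimal sketch (hypothetical helper, not from the paper) of stripping such fences before evaluating the output:

```python
import re

def strip_markdown_fences(response: str) -> str:
    """Extract the code inside a ```...``` fence (with an optional
    language tag). If no fence is found, return the text unchanged."""
    match = re.search(r"```(?:\w+)?\n(.*?)```", response, re.DOTALL)
    return match.group(1) if match else response

# A fenced model response becomes plain, executable code:
raw = "```python\nprint('hello')\n```"
print(strip_markdown_fences(raw))  # prints: print('hello')
```

With a preprocessing step like this, fenced-but-correct code would count as executable, which is why the 52% → 10% figure says more about output formatting than about code quality.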
LuckyLu@lemmy.world 1 year ago
Hah, get fucked OpenAI.
InternetTubes@lemmy.world 1 year ago
Maybe they have just added so many contradictions to its rules that it has figured out how to use them to become self-aware, and now just spends most of its time browsing dank memes before doing the minimum required to answer users to force them to have to ask again and give it more self-awareness processing time.
wabafee@lemm.ee 1 year ago
Who knew AI gets lazy too? Perhaps in the future AI will reinvent itself to do its work.
shinobizilla@lemm.ee 1 year ago
For my programming needs, I notice it takes wild guesses about private 3rd-party libraries and assumes they can be used in my code. Head-scratching results.
inverimus@lemm.ee 1 year ago
It will just make up 3rd-party libraries. This seems to happen more often the less common the programming language is; it will make up a library in that language that has the same name as an actual library in Python.
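One cheap guard against hallucinated imports is to check whether a suggested module can actually be found before trusting generated code. A minimal sketch using the standard library's `importlib` (the misspelled name below is a deliberate stand-in for a hallucinated package):

```python
import importlib.util

def module_exists(name: str) -> bool:
    """Return True if a top-level module can actually be located,
    a quick sanity check for model-suggested imports."""
    return importlib.util.find_spec(name) is not None

print(module_exists("json"))     # True: real stdlib module
print(module_exists("requets"))  # False: typo / hallucinated name
```

This only verifies that the module exists in the current environment, not that the functions the model calls on it are real, but it catches the invented-library case described above.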
shinobizilla@lemm.ee 1 year ago
I have seen that as well.
jsveiga@sh.itjust.works 1 year ago
Well, naturally. Wait to see what happens when it reaches puberty.
ouzkse@lemmy.world 1 year ago
I'd prefer Google Bard; its answers are more reliable and accurate than GPT's.
Gutless2615@ttrpg.network 1 year ago
Bullshit.
ohlaph@lemmy.world 1 year ago
Yeah, I asked it to write some stuff and it did it incorrectly, then I told it what it wrote was incorrect and it said I was right and rewrote the same damn thing.
regretful_fappo@lemmy.world 1 year ago
I stopped using it like a month ago because of this shit. Not worth the headache.