I do think part of it is expectation creep, but it has also got better at some harder things that aren’t as noticeable. It used to invent functions that should exist but don’t, and I haven’t seen it do that in a while, though it does seem to have narrowed the scope it can work with. I think it’s probably like images: you can have it make great images OR strictly obey the prompt, but the more you want one, the less you get of the other.
I’ve been using 3.5 to help me code and it’s incredibly useful for the things it’s good at, like reminding me what a certain function call does and what my options are with it. It’s got much better at that, and at tiny scripts like ‘a python script that reads all the files in a folder and sorts the big images into a separate folder’. It’s got worse at handling anything more complex. It was never great at that tbh, so I think maybe it’s reached a point where it knows it can’t do it, so it rejects answers with critical failures (like when it makes up a standard library function because it’d be useful) and settles on a weaker but less wrong one. A lot of the made-up-function errors were easy to fix, because you could just say ‘PIL doesn’t have a function to do that, can you write one’.
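For reference, the kind of tiny script I mean looks roughly like this. It’s a minimal sketch, not something the model actually produced: it assumes ‘big’ means file size over a threshold (rather than pixel dimensions) and picks images by file extension, and the names `move_big_images`, `min_bytes` and `big_images` are made up for the example. It only uses the standard library.

```python
import shutil
from pathlib import Path

# Extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def move_big_images(folder, min_bytes=1_000_000, dest_name="big_images"):
    """Move image files of at least min_bytes into folder/dest_name.

    Returns the list of new paths for the files that were moved.
    """
    folder = Path(folder)
    dest = folder / dest_name
    moved = []
    # sorted() snapshots the listing before anything is moved.
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue  # skips subfolders, including the destination
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # not an image by extension
        if path.stat().st_size >= min_bytes:
            dest.mkdir(exist_ok=True)  # create only when first needed
            target = dest / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

You’d call it with something like `move_big_images("photos", min_bytes=2_000_000)`. Sorting by pixel dimensions instead would need an imaging library such as PIL to read each file’s size.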
So yeah, I don’t think it’s really getting worse, but there are tradeoffs. If only OpenAI lived by any of the principles they claimed when setting up and naming themselves, we’d be able to experiment and explore different usage methods for different tasks, just like people do with Stable Diffusion. But capitalists are going to lie, cheat, and try to monopolize, so we’re stuck guessing.
Linkerbaan@lemmy.world 11 months ago
Maybe they’re crippling it so that when GPT-5 releases it looks better, like Apple did with CPU throttling on older iPhones.
tagliatelle@lemmy.world 11 months ago
They probably have to scale down the resources used for each query because they can’t scale up their infrastructure to handle the load.
backgroundcow@lemmy.world 11 months ago
This is my guess as well. They have been limiting new signups for the paid service for a long time, which must mean they are overloaded, so it makes a lot of sense to just degrade the quality of GPT-4 so they can serve all paying users. I just wish there was a way to know the “quality level” the service is operating at.
monkeyslikebananas2@lemmy.world 11 months ago
This is most likely the answer. Management saw the revenue and cost and said, “whoa! Turn all that unnecessary stuff off!”