Comment on: Bill Gates feels Generative AI has plateaued, says GPT-5 will not be any better
raptir@lemdro.id 11 months ago
ChatGPT doesn’t learn like that though, does it? I thought it was “static” with its training data.
HiggsBroson@lemmy.world 11 months ago
You can fine-tune LLMs on smaller datasets, or with RLHF (reinforcement learning from human feedback), where people rate responses and the model is “rewarded” or “penalized” based on those ratings for a given output. This retrains the LLM to produce outputs that people prefer.
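A minimal sketch of that reward/penalty idea, assuming a toy PyTorch model and a stand-in human_rating() function (real RLHF trains a separate reward model on human comparisons and uses PPO-style updates; every name and size here is a hypothetical placeholder):

```python
# REINFORCE-style sketch of reward-based fine-tuning, not the actual
# ChatGPT RLHF pipeline. Model, vocab size, and reward are toy stand-ins.
import torch
import torch.nn as nn

VOCAB, HIDDEN = 100, 32

policy = nn.Sequential(nn.Embedding(VOCAB, HIDDEN), nn.Linear(HIDDEN, VOCAB))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def human_rating(token_ids):
    # Placeholder for human feedback: +1 for a preferred response,
    # -1 for a rejected one.
    return 1.0 if token_ids.sum().item() % 2 == 0 else -1.0

prompt = torch.randint(0, VOCAB, (8,))            # toy "prompt" tokens
logits = policy(prompt)                           # (8, VOCAB) next-token scores
dist = torch.distributions.Categorical(logits=logits)
response = dist.sample()                          # sampled "response" tokens

reward = human_rating(response)                   # rating from a person
loss = -(dist.log_prob(response).sum()) * reward  # reward good, penalize bad
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Scaling the log-probability loss by the rating is what nudges the model toward outputs people prefer and away from ones they reject.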
niisyth@lemmy.ca 11 months ago
Active learning models. Though public exposure can easily fuck it up without adult supervision. With proper supervision, though, there’s promise.
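A rough sketch of what that supervised loop can look like, assuming uncertainty sampling as the selection rule; the toy classifier and data pool are hypothetical placeholders, not any production system:

```python
# Active learning with a human in the loop: the model flags its
# least-confident predictions and a supervisor labels them before
# they re-enter training, instead of raw public input retraining it.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 3)                   # toy classifier
unlabeled = torch.randn(100, 10)                 # unlabeled pool

with torch.no_grad():
    probs = F.softmax(model(unlabeled), dim=1)
    entropy = -(probs * probs.log()).sum(dim=1)  # uncertainty per sample

# Queue only the most uncertain samples for human review.
ask_human = entropy.topk(10).indices
print("indices queued for human labeling:", ask_human.tolist())
```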
BearOfaTime@lemm.ee 11 months ago
So it will always have the biases of the supervisors
niisyth@lemmy.ca 11 months ago
Bias is inevitable, whether it’s AI or any other knowledge-based system. We just have to be cognizant of it and try to remedy it.
grabyourmotherskeys@lemmy.world 11 months ago
I was speculating about how you can overcome hallucinations, etc., by supplying additional training data. Not specific to ChatGPT or even LLMs…
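A hedged sketch of that speculation, assuming supervised fine-tuning on reviewer-corrected examples; the tiny model and corrections list are illustrative stand-ins only:

```python
# Patch specific bad outputs by fine-tuning on corrected pairs:
# tokens the model saw, plus the corrected tokens a reviewer supplied.
import torch
import torch.nn as nn

VOCAB = 100
model = nn.Sequential(nn.Embedding(VOCAB, 32), nn.Linear(32, VOCAB))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical correction data: (input tokens, corrected target tokens).
corrections = [(torch.randint(0, VOCAB, (8,)), torch.randint(0, VOCAB, (8,)))]

for inputs, target in corrections:
    logits = model(inputs)          # (8, VOCAB) predictions
    loss = loss_fn(logits, target)  # push outputs toward the correction
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```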