Musk probably heard about “synthetic data” training, which is where you use machine learning to generate thousands of examples that are typical enough to be good training data. Microsoft uses it to take documents users upload to Office 365, train an ML model on them, and then use that model’s output to train an LLM, so they can technically say “no, your data isn’t used to train an LLM.” Because it trained the thing that trained the LLM.
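To make the laundering step concrete, here’s a toy sketch (not Microsoft’s actual pipeline; the bigram model stands in for whatever intermediate ML model they use): a small model is fit on the real documents, and only its sampled output becomes training data for the downstream model, which never sees the originals.

```python
import random

def fit_bigram_model(docs):
    """Fit a toy bigram model -- the stand-in for the intermediate ML model
    trained directly on the real user documents."""
    model = {}
    for doc in docs:
        words = doc.split()
        for a, b in zip(words, words[1:]):
            model.setdefault(a, []).append(b)
    return model

def sample_synthetic_doc(model, start, length=8, rng=random):
    """Sample one synthetic document from the intermediate model.
    This output, not the original documents, is what the downstream
    model (the LLM in the comment above) would train on."""
    words = [start]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)

# Hypothetical "user documents" (placeholder data).
real_docs = [
    "the quarterly report shows growth",
    "the quarterly report shows losses",
    "growth shows the quarterly trend",
]

model = fit_bigram_model(real_docs)
# The downstream model trains only on this synthetic corpus.
synthetic_corpus = [sample_synthetic_doc(model, "the") for _ in range(5)]
```

Every synthetic document is statistically "typical" of the real corpus, but no original document has to be handed to the downstream trainer.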
However, you can’t do that with LLM output for stuff like… history. WTF evidence and documents are the basis for the crap he wants to add? The hallucinations will just compound, because who’s going to cross-check this other than Grok anyway?
Voroxpete@sh.itjust.works 1 day ago
There are, as I understand it, ways you can train on AI-generated material without inviting model collapse, but those are more to do with distilling the output of a model. What Musk is describing is wholesale confabulation being fed back into the next generation of their model. It’s also a total pipe dream. Getting an AI to rewrite something like the entire training data set to your exact requirements, and verifying that it had done so satisfactorily, would be an absolutely monumental undertaking. The compute time alone would be staggering, and the human labour (to check the output) many times higher than that.
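For what distillation actually means here, a minimal numeric sketch (all values are made up for illustration): the student model is trained against the teacher’s full, temperature-softened output distribution rather than against raw sampled text, which is one reason distillation degrades less than training on free-form generated output.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw model scores into a probability distribution;
    higher temperature flattens it, exposing near-miss classes."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened
    distribution -- the student learns the teacher's relative confidence
    across all options, not just its single sampled answer."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# Hypothetical logits over three options for one training example.
teacher = [4.0, 1.0, 0.2]
matching_student = [3.8, 1.1, 0.1]    # agrees with the teacher's ranking
mismatched_student = [0.1, 4.0, 1.0]  # confidently wrong

# The loss is lower when the student tracks the teacher's distribution.
loss_good = distillation_loss(teacher, matching_student)
loss_bad = distillation_loss(teacher, mismatched_student)
```

The contrast with Musk’s plan: distillation copies a model’s probability judgments under supervision, while feeding free-form rewritten “history” back in just recycles whatever the model happened to confabulate.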
But the whiny little piss baby is mad that his own AI keeps fact-checking him, and his engineers have already explained that coding it to lie doesn’t really work because the training data tends to outweigh the initial prompt, so this is the best theory he can come up with for how he can “fix” his AI expressing reality’s well-known liberal bias.
Deflated0ne@lemmy.world 1 day ago
Model collapse is the ideal.