Comment on Are there any AI services that don't work on stolen data?
Treczoks@lemmy.world 4 days ago
There are no legal sources big enough to train an AI on the level required to even perform basic interaction.
AmbitiousProcess@piefed.social 4 days ago
This is very true.
I was part of the OpenAssistant project, voluntarily submitting my personal writing to train open-source LLMs without having to steal data, in the hopes it would stop these companies from stealing people's work and make "AI" less of a black box.
Even after thousands of people had submitted millions of prompt-response pairs, and after some researchers said it was the highest-quality natural language dataset they'd seen in a while, the base model was almost always incoherent. You only got a functioning model if you used the data to fine-tune an existing, larger model, which at the time was Llama.