Comment on Are there any AI services that don't work on stolen data?

AmbitiousProcess@piefed.social 2 weeks ago

This is very true.

I was part of the OpenAssistant project, voluntarily submitting my personal writing to train open-source LLMs without having to steal data, in the hopes it would stop these companies from stealing people's work and make "AI" less of a black box.

Even after thousands of people had submitted millions of prompt-response pairs, and some researchers had said it was the highest-quality natural language dataset they'd seen in a while, the base model was almost always incoherent. You only got a functioning model if you used the data to fine-tune an existing, larger model, which at the time was Llama.
