BetaDoggo_@lemmy.world 2 months ago
How many times is this same article going to be written? Model collapse from synthetic data is not a concern at any scale when human data is in the mix. We now have entire series of models trained mostly on synthetic data: huggingface.co/docs/transformers/main/…/phi3. When a model is trained entirely on unassisted outputs, error accumulates with each generation, but this isn’t a concern in any real scenario.
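For anyone who wants to see the difference, here is a rough toy sketch, not a claim about how any real model is trained. It assumes the "model" is just a Gaussian fitted to its training data, and that each generation draws as many synthetic samples as there are human ones. The names `fit` and `run`, the sample size of 10, and the 1000 generations are all made up for illustration.

```python
import random
import statistics


def fit(data):
    """'Train' the toy model: estimate mean and standard deviation."""
    return statistics.fmean(data), statistics.stdev(data)


def run(human, generations, keep_human, rng):
    """Refit a Gaussian repeatedly on samples drawn from the previous fit.

    If keep_human is True, the original human data is added to every
    generation's training set; otherwise only synthetic samples are used.
    Returns the fitted standard deviation after the last generation.
    """
    mu, sigma = fit(human)
    for _ in range(generations):
        # Sample from the previous generation's model (assumed toy setup).
        synthetic = [rng.gauss(mu, sigma) for _ in range(len(human))]
        training = human + synthetic if keep_human else synthetic
        mu, sigma = fit(training)
    return sigma


rng = random.Random(0)  # fixed seed so reruns give the same numbers
human = [rng.gauss(0.0, 1.0) for _ in range(10)]
human_sigma = statistics.stdev(human)

pure_sigma = run(human, 1000, keep_human=False, rng=rng)
mixed_sigma = run(human, 1000, keep_human=True, rng=rng)

print(f"human data std:        {human_sigma:.3f}")
print(f"synthetic only std:    {pure_sigma:.3g}")
print(f"human + synthetic std: {mixed_sigma:.3f}")
```

In this toy setup the synthetic-only spread shrinks toward zero because each refit loses a little variance on average, while the run that keeps the human samples cannot drop far below the spread of the human data, since those points are in every training set.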
SomethingBurger@jlai.lu 2 months ago
As the number of articles about this exact subject increases, so does the likelihood of AI only being able to write about this very subject.