Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates
fancyl@lemmy.world 4 months ago
Are the models that OpenAI creates open source? I don’t know enough about LLMs, but if ChatGPT wants exemptions from the law, it should result in a public good (emphasis on public).
QuadratureSurfer@lemmy.world 4 months ago
The STT (speech-to-text) model that they created (Whisper) is open source, as well as a few others:
WalnutLum@lemmy.ml 4 months ago
Those aren’t open source, neither by the OSI’s Open Source Definition nor by the OSI’s Open Source AI Definition.
The important part for the latter being a published listing of all the training data. (Trainers don’t have to provide the data, but they must provide at least a way to recreate the model given the same inputs).
They are model-available if anything.
QuadratureSurfer@lemmy.world 4 months ago
I did a quick check on the license for Whisper:
“Whisper’s code and model weights are released under the MIT License. See LICENSE for further details.”
So that definitely meets the Open Source Definition on your first link.
And it looks like it also meets the definition of open source as per your second link.
“Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of the paper, as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.”
WalnutLum@lemmy.ml 4 months ago
“Whisper’s code and model weights are released under the MIT License. See LICENSE for further details. So that definitely meets the Open Source Definition on your first link.”
Model weights by themselves do not qualify as “open source”, as the OSAID specifies. Weights are not source.
“Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of the paper, as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.”
This is not training data. These are testing metrics.
masterspace@lemmy.ca 4 months ago
OpenAI does not publish their models openly. Other companies like Microsoft and Meta do.
graycube@lemmy.world 4 months ago
Nothing about OpenAI is open-source. The name is a misdirection.
If you use my IP without my permission and profit from it, then that is IP theft, whether or not you republish a plagiarized version.
dariusj18@lemmy.world 4 months ago
So I guess every reaction and review on the internet that is ad-supported or behind a paywall is theft too?
RicoBerto@lemmy.blahaj.zone 4 months ago
No, we have rules on fair use and derivative works. Sometimes they fall on one side, sometimes another.
InvertedParallax@lemm.ee 4 months ago
Fair use by humans.
There is no fair use by computers, otherwise we couldn’t have piracy laws.