Comment on Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
Womble@lemmy.world 3 days ago
www.analyticsvidhya.com/blog/2024/…/deepseek-v3/
Huh, I guess 6 million USD is not millions, eh? The innovation is that it’s comparatively cheap to train, compared to the billions OpenAI et al. are spending (and that’s with the cost of acquiring thousands of H800s not included).
UnderpantsWeevil@lemmy.world 3 days ago
Smaller builds with less comprehensive datasets take less time and money. Again, this doesn’t have to be encyclopedic. You can train your model entirely on a small sample of historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.
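To put a rough shape on that: a narrow fine-tune over a tiny corpus is only a few lines with off-the-shelf tooling. This is just a sketch under my own assumptions (Hugging Face transformers/datasets installed, gpt2 as a stand-in small base model, placeholder documents as the data), not anything DeepSeek actually did:

```python
# Sketch only: fine-tune a small causal LM on a deliberately narrow corpus.
# Assumptions (mine, not from the thread): transformers + datasets installed,
# "gpt2" as a stand-in base model, placeholder documents as the data.
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "dataset": a handful of documents on one narrow topic.
texts = ["placeholder document one...", "placeholder document two..."]

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]  # causal LM: targets = inputs
    return enc

ds = Dataset.from_dict({"text": texts}).map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="narrow-run", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
)
trainer.train()  # minutes on a single GPU; cost scales with data and model size
```

The point being that compute cost tracks the scope of the data, not some fixed “encyclopedic” floor.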
Womble@lemmy.world 3 days ago
Oh, by the way, as to your theory of “maybe it just doesn’t know about Tiananmen, it’s not an encyclopedia”…
[Image: screenshot of the model’s reply, with its internal chain of thought visible]
Dhs92@programming.dev 3 days ago
I don’t think I’ve seen that internal dialogue before with LLMs. Do you get that with most models when running them with Ollama?
Womble@lemmy.world 3 days ago
That’s the innovation of the “chain of thought” models like OpenAI’s o1 and now this DeepSeek model: they narrate an internal dialogue first to try to produce more consistent answers. It isn’t perfect, but it helps with things like logical reasoning, at the cost of taking a lot longer to reach an answer.
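If you want to poke at that monologue locally, it’s easy to split out. A minimal sketch, assuming a local Ollama server on its default port and a pulled reasoning-model tag like deepseek-r1 (both assumptions of mine, and the <think> tag convention can vary by model):

```python
# Sketch: query a local reasoning model via Ollama and separate its "thinking".
# Assumptions (not from the screenshot above): Ollama on localhost:11434,
# a model tagged "deepseek-r1", and reasoning wrapped in <think>...</think>.
import json
import urllib.request

payload = json.dumps({
    "model": "deepseek-r1",
    "prompt": "Which is larger, 9.11 or 9.9?",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    text = json.loads(resp.read())["response"]

if "</think>" in text:
    thought, answer = text.split("</think>", 1)
    print("Internal dialogue:\n", thought.replace("<think>", "").strip())
    print("\nFinal answer:\n", answer.strip())
else:
    print(text)  # model didn't emit a reasoning block
```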
Womble@lemmy.world 3 days ago
OK sure, as I said before, I am grateful that they have done this and open-sourced it. But it is still deliberately politically censored, and no, “just train your own, bro” is not a reasonable reply to that.
Rai@lemmy.dbzer0.com 3 days ago
They know less than I do about LLMs if that’s something they think you can just DO… and that’s saying a lot.