Ok sure, as I said before, I am grateful that they have done this and open-sourced it. But it is still deliberately politically censored, and no, “just train your own, bro” is not a reasonable reply to that.
UnderpantsWeevil@lemmy.world 3 days ago
The innovation is it’s comparatively cheap to train, compared to the billions
Smaller builds with less comprehensive datasets take less time and money. Again, this doesn’t have to be encyclopedic. You can train your model entirely on a small sample of historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.
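For a rough sense of what that looks like in practice, here’s a minimal sketch of fine-tuning a small causal LM on one narrow text file, assuming the Hugging Face transformers/datasets stack; the base model name and the data file are placeholders for illustration, not anything DeepSeek actually used:

```python
# Minimal sketch: fine-tune a small causal LM on a narrow, domain-specific corpus.
# Assumes `pip install transformers datasets` and a plain-text training file.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # small base model, chosen only to keep the sketch cheap
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# A tiny, narrow corpus — e.g. one file of historical documents (hypothetical path).
dataset = load_dataset("text", data_files={"train": "beijing_1989.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="narrow-model", num_train_epochs=3),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language modeling labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The point of the sketch is just scale: a run like this finishes on a single consumer GPU, versus the billions in compute behind frontier models.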
Womble@lemmy.world 3 days ago
Rai@lemmy.dbzer0.com 3 days ago
They know less than I do about LLMs if that’s something they think you can just DO… and that’s saying a lot.
Womble@lemmy.world 3 days ago
Oh, by the way, as to your theory of “maybe it just doesn’t know about Tiananmen, it’s not an encyclopedia”…
[image]
Dhs92@programming.dev 3 days ago
I don’t think I’ve seen that internal dialogue before with LLMs. Do you get that with most models when running them with ollama?
Womble@lemmy.world 3 days ago
That’s the innovation of the “chain of thought” models like OpenAI’s o1 and now this DeepSeek model: it narrates an internal dialogue first in order to try to create more consistent answers. It isn’t perfect, but it helps it do things like logical reasoning, at the cost of taking a lot longer to get to the answer.
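If you want to poke at it programmatically, here’s a rough sketch using the ollama Python client, assuming a local ollama server with a reasoning model such as deepseek-r1 pulled; in my experience it wraps the narrated reasoning in `<think>` tags, though that format isn’t guaranteed for every build:

```python
# Sketch: call a local chain-of-thought model via ollama and split the
# narrated reasoning from the final answer. Assumes `pip install ollama`
# and `ollama pull deepseek-r1` have been run.
import re

import ollama

response = ollama.chat(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)
content = response["message"]["content"]

# deepseek-r1 (as served by ollama) emits its internal dialogue inside
# <think>...</think>; everything after that is the user-facing answer.
match = re.search(r"<think>(.*?)</think>", content, re.DOTALL)
reasoning = match.group(1).strip() if match else ""
answer = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()

print("Reasoning:", reasoning[:200])
print("Answer:", answer)
```

That split is also why these models feel slow: all of the `<think>` text is generated token by token before the answer even starts.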