Comment on Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
Womble@lemmy.world 3 days ago
www.analyticsvidhya.com/blog/2024/…/deepseek-v3/
Huh, I guess 6 million USD is not millions, eh? The innovation is that it’s comparatively cheap to train, compared to the billions OpenAI et al. are spending (and that’s with the cost of acquiring thousands of H800s not included).
UnderpantsWeevil@lemmy.world 3 days ago
Smaller builds with less comprehensive datasets take less time and money. Again, this doesn’t have to be encyclopedic. You can train your model entirely on a small sample of historical events in and around Beijing in 1989 if you are exclusively fixated on getting results back about Tiananmen Square.
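To put a rough shape on that: a narrow fine-tune over a tiny corpus is only a few lines with off-the-shelf tooling. This is just a sketch under my own assumptions (Hugging Face transformers/datasets installed, gpt2 as a stand-in small base model, placeholder documents as the data), not anything DeepSeek actually did:

```python
# Sketch only: fine-tune a small causal LM on a deliberately narrow corpus.
# Assumptions (mine, not from the thread): transformers + datasets installed,
# "gpt2" as a stand-in base model, placeholder documents as the data.
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "dataset": a handful of documents on one narrow topic.
texts = ["placeholder document one...", "placeholder document two..."]

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
    enc["labels"] = [ids.copy() for ids in enc["input_ids"]]  # causal LM: targets = inputs
    return enc

ds = Dataset.from_dict({"text": texts}).map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="narrow-run", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
)
trainer.train()  # minutes on a single GPU; cost scales with data and model size
```

The point being that compute cost tracks the scope of the data, not some fixed “encyclopedic” floor.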
Womble@lemmy.world 3 days ago
Oh, by the way, as to your theory of “maybe it just doesn’t know about Tiananmen, it’s not an encyclopedia”…
[Image: screenshot of the model’s reply, with its internal chain of thought visible]
Dhs92@programming.dev 3 days ago
I don’t think I’ve seen that internal dialogue before with LLMs. Do you get that with most models when running them with Ollama?
Womble@lemmy.world 3 days ago
That’s the innovation of the “chain of thought” models like OpenAI’s o1 and now this DeepSeek model: they narrate an internal dialogue first to try to produce more consistent answers. It isn’t perfect, but it helps with things like logical reasoning, at the cost of taking a lot longer to reach an answer.
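If you want to poke at that monologue locally, it’s easy to split out. A minimal sketch, assuming a local Ollama server on its default port and a pulled reasoning-model tag like deepseek-r1 (both assumptions of mine, and the <think> tag convention can vary by model):

```python
# Sketch: query a local reasoning model via Ollama and separate its "thinking".
# Assumptions (not from the screenshot above): Ollama on localhost:11434,
# a model tagged "deepseek-r1", and reasoning wrapped in <think>...</think>.
import json
import urllib.request

payload = json.dumps({
    "model": "deepseek-r1",
    "prompt": "Which is larger, 9.11 or 9.9?",
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    text = json.loads(resp.read())["response"]

if "</think>" in text:
    thought, answer = text.split("</think>", 1)
    print("Internal dialogue:\n", thought.replace("<think>", "").strip())
    print("\nFinal answer:\n", answer.strip())
else:
    print(text)  # model didn't emit a reasoning block
```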
Womble@lemmy.world 3 days ago
OK sure, as I said before, I am grateful that they have done this and open-sourced it. But it is still deliberately politically censored, and no, “just train your own, bro” is not a reasonable reply to that.
Rai@lemmy.dbzer0.com 3 days ago
They know less than I do about LLMs if that’s something they think you can just DO… and that’s saying a lot.