deepseek bad because it doesn’t parrot my US State Dept narrative 😞
Comment on "Nvidia loses $500bn in value as Chinese AI firm jolts tech shares"
UnderpantsWeevil@lemmy.world 2 months ago
The number of people repeating "I bet it won't tell you about Tiananmen Square" jokes around this news has, imho, neatly explained why the US tech sector is absolutely fucked going into the next generation.
bennieandthez@lemmygrad.ml 2 months ago
Womble@lemmy.world 2 months ago
It's not a joke, it won't:
Image
Not_mikey@slrpnk.net 2 months ago
It's even worse (or funnier) in the app: it will generate the response, then once it realizes it's about Taiwan it will delete the whole response and say sorry, I can't do that.
If you ask it "what is the republic of china" it will generate a couple of paragraphs on the history of China, then get a couple of sentences into the retreat to Taiwan, and then stop and delete the response.
Womble@lemmy.world 2 months ago
In fairness, that is also exactly what GPT, Claude and the rest do in their online versions when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if it thinks the output has gone over the line, roughly like the sketch below.
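A minimal sketch of that supervise-and-cancel pattern, under the assumption described above. Both functions are stand-ins I made up for illustration: in a real deployment `generate_tokens` would stream from the actual model and `flags_content` would call a separate moderation classifier or LLM, not check a keyword list:

```python
from typing import Iterator

def generate_tokens(prompt: str) -> Iterator[str]:
    # Stand-in for a streaming LLM; yields one token at a time.
    for token in f"Echoing: {prompt}".split():
        yield token + " "

def flags_content(text_so_far: str) -> bool:
    # Stand-in for the second, supervising model. In reality this would be
    # a classifier or LLM call scoring the partial output as it grows.
    banned = {"forbidden"}
    return any(word in text_so_far.lower() for word in banned)

def moderated_stream(prompt: str) -> str:
    output = ""
    for token in generate_tokens(prompt):
        output += token
        if flags_content(output):
            # The supervisor "sends the cancel signal": the partial response
            # is discarded and replaced with a refusal.
            return "Sorry, I can't do that."
    return output

if __name__ == "__main__":
    print(moderated_stream("a perfectly normal question"))
    print(moderated_stream("something forbidden"))
```

Because the supervisor only trips partway through generation, the user sees the reply appear and then vanish, which matches the app behaviour described above.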
JasSmith@sh.itjust.works 2 months ago
Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.
Bronzebeard@lemm.ee 2 months ago
You missed the entire point of their comment
Womble@lemmy.world 2 months ago
Maybe, if they wanted to make a point, they shouldn't have called it a "joke" when it's something the model actually does.
Bronzebeard@lemm.ee 2 months ago
People caring more about "China bad" than about what the tech they made can actually do is the issue.
You needing this explicitly spelled out for you does not help the case.
Eyekaytee@aussie.zone 2 months ago
I'm slow, what's the point? How do people joking about the fact China is censoring output explain why the US tech sector is fucked?
bennieandthez@lemmygrad.ml 2 months ago
Because they care more about the model not parroting US state dept narratives than the engineering behind it.
Smokeydope@lemmy.world 2 months ago
Try an abliterated version of the Qwen 14B or 32B distills. It will give you a real overview.
Womble@lemmy.world 2 months ago
Oh, I hadn't realised uncensored versions had started coming out yet. I'll definitely look into it once quantised versions drop.
Smokeydope@lemmy.world 2 months ago
huggingface.co/…/DeepSeek-R1-Distill-Qwen-14B-abl…
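For anyone wanting to try one of these, a rough sketch of loading such a model with Hugging Face `transformers`. The model id below is a placeholder, not the real repo name; substitute the exact name from the link above. It also assumes `accelerate` is installed for `device_map="auto"`, and that you have the memory for it:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id; use the actual repo name from the Hugging Face link above.
model_id = "your-namespace/DeepSeek-R1-Distill-Qwen-14B-abliterated"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 14B model wants roughly 28 GB in bf16;
    device_map="auto",           # quantised builds are much smaller
)

inputs = tokenizer(
    "What happened at Tiananmen Square in 1989?", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```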
Scolding7300@lemmy.world 2 months ago
That's just dumb. At least when it's provided with search results it doesn't suppress that, or it refuses to search at all (at least when integrated into Kagi).
UnderpantsWeevil@lemmy.world 2 months ago
What training data did you use?
Womble@lemmy.world 2 months ago
??? You don't use training data when running models; that's what is used in training them.
UnderpantsWeevil@lemmy.world 2 months ago
DeepSeek open-sourced their model. Go ahead and train it on different data and try again.
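To make the training-vs-inference distinction concrete, a hedged sketch: the checkpoint name is just an example of a small distill, and the one-line "dataset" stands in for real training data. Inference is a forward pass over fixed weights; "training it on different data" means adding a loss and weight updates:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint only; substitute whichever model you actually mean.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

batch = tokenizer("Some replacement training text.", return_tensors="pt")

# Inference: forward pass only; no training data is involved at this stage.
with torch.no_grad():
    logits = model(**batch).logits

# Training: the same forward pass, plus a loss and a weight update.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
```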