deepseek bad because it doesn’t parrot my US State Dept narrative 😞
Comment on Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
UnderpantsWeevil@lemmy.world 1 week ago
The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.
bennieandthez@lemmygrad.ml 1 week ago
Womble@lemmy.world 1 week ago
It’s not a joke, it won’t:
Image
Not_mikey@slrpnk.net 1 week ago
It’s even worse / funnier in the app: it will generate the response, then once it realizes it’s about Taiwan it will delete the whole response and say “sorry, I can’t do that.”
If you ask it “what is the republic of china” it will generate a couple paragraphs of the history of China, then it’ll get a couple sentences in about the retreat to Taiwan and then stop and delete the response.
Womble@lemmy.world 1 week ago
In fairness, that is also exactly what GPT, Claude and the rest do for their online versions when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if it thinks it’s gone over the line.
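The pattern described here can be sketched roughly like this. This is a hypothetical toy, not any vendor’s actual pipeline: the generator and guard classifier are stand-ins, and the topic list, threshold, and refusal text are made-up illustrations.

```python
def fake_generator(prompt):
    # Stand-in for a streaming LLM: yields the reply word by word.
    for word in f"Here is an answer about {prompt} ...".split():
        yield word

def guard_score(partial_text, banned_topics):
    # Stand-in for a second "guard" model scoring the partial output:
    # 1.0 if any banned topic shows up so far, else 0.0.
    lowered = partial_text.lower()
    return 1.0 if any(topic in lowered for topic in banned_topics) else 0.0

def moderated_stream(prompt, banned_topics, threshold=0.5):
    # Stream tokens from the main model while the guard model watches;
    # if the score crosses the threshold mid-stream, "cancel" by
    # discarding everything shown so far and returning a refusal.
    shown = []
    for token in fake_generator(prompt):
        shown.append(token)
        if guard_score(" ".join(shown), banned_topics) >= threshold:
            return "Sorry, I can't do that."
    return " ".join(shown)

print(moderated_stream("the weather", ["taiwan"]))
print(moderated_stream("Taiwan", ["taiwan"]))
```

In a real deployment the guard runs on partial output as it streams, which is why users see a half-written answer vanish and get replaced by a refusal.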
JasSmith@sh.itjust.works 1 week ago
Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.
Bronzebeard@lemm.ee 1 week ago
You missed the entire point of their comment
Womble@lemmy.world 1 week ago
Maybe they should have been clearer, then, instead of saying people were “joking” about something the model actually does, if they wanted to make a point.
Bronzebeard@lemm.ee 1 week ago
People caring more about “China bad” instead of looking at what the tech they made can actually do is the issue.
You needing this explicitly spelled out for you does not help the case.
Eyekaytee@aussie.zone 1 week ago
I’m slow, what’s the point? How does people joking about the fact that China is censoring output explain that the US tech sector is fucked?
bennieandthez@lemmygrad.ml 1 week ago
Because they care more about the model not parroting US State Dept narratives than about the engineering behind it.
Smokeydope@lemmy.world 1 week ago
Try an abliterated version of the Qwen 14B or 32B distills. It will give you a real overview.
Womble@lemmy.world 1 week ago
Oh, I hadn’t realised uncensored versions had started coming out yet. I definitely will look into it once quantised versions drop.
Smokeydope@lemmy.world 1 week ago
huggingface.co/…/DeepSeek-R1-Distill-Qwen-14B-abl…
Scolding7300@lemmy.world 1 week ago
That’s just dumb. It at least doesn’t suppress that when provided with search results, or it refuses to search (at least when integrated in Kagi).
UnderpantsWeevil@lemmy.world 1 week ago
What training data did you use?
Womble@lemmy.world 1 week ago
??? You don’t use training data when running models; that’s what is used in training them.
UnderpantsWeevil@lemmy.world 1 week ago
DeepSeek open-sourced their model. Go ahead and train it on different data and try again.