deepseek bad because it doesn’t parrot my US State Dept narrative 😞
Comment on Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
UnderpantsWeevil@lemmy.world 10 months ago
The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.
bennieandthez@lemmygrad.ml 10 months ago
Womble@lemmy.world 10 months ago
It’s not a joke, it wont:
Image
Not_mikey@slrpnk.net 10 months ago
It’s even worse / funnier in the app: it will generate the response, then once it realizes it’s about Taiwan it will delete the whole response and say sorry, I can’t do that.
If you ask it “what is the republic of china” it will generate a couple paragraphs of the history of China, then it’ll get a couple sentences in about the retreat to Taiwan and then stop and delete the response.
Womble@lemmy.world 10 months ago
In fairness, that is also exactly what GPT, Claude, and the rest do for their online versions when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if it thinks it’s gone over the line.
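The monitor-and-cancel pattern described above can be sketched in a few lines. This is a toy illustration, not any vendor’s actual implementation: the `moderate` function and its keyword blocklist stand in for what would really be a second model scoring the partial output.

```python
BLOCKLIST = {"taiwan"}  # stand-in for a learned policy classifier


def moderate(text: str) -> bool:
    """Pretend 'second LLM': flags output that crosses the line."""
    return any(word in text.lower() for word in BLOCKLIST)


def stream_with_monitor(chunks):
    """Stream generated chunks, but retract everything and apologize
    if the monitor flags the accumulated output mid-generation."""
    shown = []
    for chunk in chunks:
        shown.append(chunk)
        if moderate("".join(shown)):
            # cancel signal: delete the partial response the user saw
            return "Sorry, I can't do that."
    return "".join(shown)
```

This reproduces the behaviour described upthread: the response starts streaming, then vanishes once the monitor trips partway through.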
JasSmith@sh.itjust.works 10 months ago
Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.
Bronzebeard@lemm.ee 10 months ago
You missed the entire point of their comment
Womble@lemmy.world 10 months ago
Maybe they should have been clearer than calling it a joke when the model actually does the thing people were joking about, if they wanted to make a point.
Bronzebeard@lemm.ee 10 months ago
People caring more about “China bad” instead of looking at what the tech they made can actually do is the issue.
You needing this explicitly spelled out for you does not help the case.
Eyekaytee@aussie.zone 10 months ago
I’m slow, what’s the point? How does people joking about the fact China is censoring output explain that?
bennieandthez@lemmygrad.ml 10 months ago
Because they care more about the model not parroting US State Dept narratives than about the engineering behind it.
Smokeydope@lemmy.world 10 months ago
Try an abliterated version of the Qwen 14B or 32B distills. It will give you a real overview.
Womble@lemmy.world 10 months ago
Oh, I hadn’t realised uncensored versions had started coming out yet. I’ll definitely look into it once quantised versions drop.
Smokeydope@lemmy.world 10 months ago
huggingface.co/…/DeepSeek-R1-Distill-Qwen-14B-abl…
Scolding7300@lemmy.world 10 months ago
That’s just dumb. It at least doesn’t suppress that when provided with search results/refuses to search (at least when integrated in Kagi)
UnderpantsWeevil@lemmy.world 10 months ago
What training data did you use?
Womble@lemmy.world 10 months ago
??? You don’t use training data when running models; that’s what’s used to train them.
UnderpantsWeevil@lemmy.world 10 months ago
DeepSeek open-sourced their model. Go ahead and train it on different data and try again.