deepseek bad because it doesn’t parrot my US State Dept narrative 😞
Comment on Nvidia loses $500 bn in value as Chinese AI firm jolts tech shares
UnderpantsWeevil@lemmy.world 1 year ago
The number of people repeating “I bet it won’t tell you about Tiananmen Square” jokes around this news has - imho - neatly explained why the US tech sector is absolutely fucked going into the next generation.
bennieandthez@lemmygrad.ml 1 year ago
Womble@lemmy.world 1 year ago
It’s not a joke, it won’t:
Image
Not_mikey@slrpnk.net 1 year ago
It’s even worse / funnier in the app: it will generate the response, then once it realizes it’s about Taiwan it will delete the whole response and say “sorry, I can’t do that.”
If you ask it “what is the Republic of China” it will generate a couple of paragraphs on the history of China, then it’ll get a couple of sentences in about the retreat to Taiwan and then stop and delete the response.
Womble@lemmy.world 1 year ago
In fairness, that is also exactly what GPT, Claude, and the rest do for their online versions when you hit their limits (usually around sex). IIRC they work by having a second LLM monitor the output and send a cancel signal if it thinks the response has gone over the line.
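That “second model watches the stream and cancels” setup can be sketched roughly like this (a minimal illustration, all function names hypothetical; in a real deployment the moderator would be a separate classifier or LLM, not a keyword list):

```python
def generate_tokens(prompt):
    # Stand-in for a streaming LLM: yields tokens one at a time.
    for token in f"some history of {prompt} ...".split():
        yield token

def moderator_flags(partial_text, blocklist):
    # Stand-in for the second model: returns True if the partial
    # output has crossed the line.
    return any(word.lower() in partial_text.lower() for word in blocklist)

def moderated_generate(prompt, blocklist):
    # Stream tokens, re-checking after each one; on a flag, discard
    # everything generated so far - the "delete the whole response
    # and apologise" behaviour users see in the app.
    out = []
    for token in generate_tokens(prompt):
        out.append(token)
        if moderator_flags(" ".join(out), blocklist):
            return "Sorry, I can't do that."
    return " ".join(out)
```

So the base model happily starts answering, and the visible deletion happens only once the monitor trips partway through the stream.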
JasSmith@sh.itjust.works 1 year ago
Okay but one is about puritanical Western cultural standards about sex, and one is about government censorship to maintain totalitarian power. One of these things is not like the other.
Bronzebeard@lemm.ee 1 year ago
You missed the entire point of their comment
Womble@lemmy.world 1 year ago
Maybe, if they wanted to make a point, they should have been clearer than saying people were “joking” about it doing something that it actually does.
Bronzebeard@lemm.ee 1 year ago
People caring more about “China bad” instead of looking at what the tech they made can actually do is the issue.
You needing this explicitly spelled out for you does not help the case.
Eyekaytee@aussie.zone 1 year ago
I’m slow, what’s the point? How do people joking about the fact that China is censoring output explain that the US tech sector is fucked?
bennieandthez@lemmygrad.ml 1 year ago
Because they care more about the model not parroting US state dept narratives than the engineering behind it.
Smokeydope@lemmy.world 1 year ago
Try an abliterated version of the Qwen 14B or 32B distills. It will give you a real overview.
Womble@lemmy.world 1 year ago
Oh, I hadn’t realised uncensored versions had started coming out yet. I’ll definitely look into it once quantised versions drop.
Smokeydope@lemmy.world 1 year ago
huggingface.co/…/DeepSeek-R1-Distill-Qwen-14B-abl…
Scolding7300@lemmy.world 1 year ago
That’s just dumb. At least it doesn’t suppress that when provided with search results, or it refuses to search at all (at least when integrated in Kagi).
UnderpantsWeevil@lemmy.world 1 year ago
What training data did you use?
Womble@lemmy.world 1 year ago
??? You don’t use training data when running models; that’s what is used to train them.
UnderpantsWeevil@lemmy.world 1 year ago
DeepSeek open-sourced their model. Go ahead and train it on different data and try again.