Comment on MiniMax M1 model claims Chinese LLM crown from DeepSeek - plus it's true open-source

xcjs@programming.dev ⁨1⁩ ⁨day⁩ ago

That’s not how distillation works, if I understand what you’re trying to explain.

If you distill model A into a smaller model, you get a smaller student trained to approximate model A’s output distribution: fewer parameters, but roughly the same behavior. You can’t distill Llama into Deepseek R1.
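To illustrate the point, here is a minimal sketch of the standard distillation objective (all numbers are made up; this is a toy, not any real model). The student is optimized to match the teacher’s softened output distribution, which is why the teacher’s behavior, censorship included, carries over:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution, softened by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft targets to the student's predictions.

    Minimizing this pulls the student's distribution toward the teacher's.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits for a single token position (hypothetical values):
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
loss = distillation_loss(teacher, student)  # positive; zero only if they match
```

Training drives this loss toward zero across the dataset, so the student ends up reproducing the teacher’s outputs, whatever those are.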

I’ve been able to run distillations of Deepseek R1 up to 70B, and they’re all still censored. There is, however, a version of Deepseek R1 “patched” with western values, called R1-1776, that will answer questions on topics censored by the Chinese government.
