Comment on [JS Required] MiniMax M1 model claims Chinese LLM crown from DeepSeek - plus it's true open-source

LWD@lemm.ee 1 day ago

If you’re talking about the distillations, AFAIK they take somebody else’s model and run it through their (actually open-source) distiller. I tried a couple of those models because I was curious. The distilled Qwen model is cagey about Tiananmen Square, but Qwen was made by Alibaba. The distillation of a US-made model did not have this problem.

I don’t have enough RAM to run the full DeepSeek R1, but AFAIK it doesn’t have this problem. Maybe it does.

In case it isn’t clear, BTW, I do despise LLMs and AI in general. The biggest issue with them isn’t the glaring lies (not Tiananmen Square, and certainly not the “it’s woke!” complaints about generating images of black founding fathers), but the subtle and insidious little details like agreeableness - trying to get people to spend a little more time with them, which apparently turns once-reasonable people into members of micro-cults.
