Comment on New York state bans DeepSeek from government devices
count_dongulus@lemmy.world 13 hours ago
Lol have you not used o1/o3? They show the inner monologue too. Fun little pretend detail to keep you entertained while the model takes 30 seconds to respond.
Hackworth@lemmy.world 6 hours ago
o1/o3 use a smaller model to summarize the reasoning, but they don’t show the actual CoT generation the way DeepSeek does.