Comment on New York state bans DeepSeek from government devices
count_dongulus@lemmy.world 1 month ago
Lol have you not used o1/o3? They show the inner monologue too. Fun little pretend detail to keep you entertained while the model takes 30 seconds to respond.
Hackworth@lemmy.world 1 month ago
o1/o3 use a smaller model to summarize the reasoning, but they don’t show the actual CoT generation the way DeepSeek does.