
Comment on Over just a few months, ChatGPT went from correctly answering a simple math problem 98% of the time to just 2%, study finds. Researchers found wild fluctuations—called drift—in the technology’s abi...


bric@lemm.ee 1 year ago

This. It is able to tap into plugins and call functions though, which is what it really should be doing. For math, the Wolfram Alpha plugin will always be more capable than ChatGPT alone, so we should be benchmarking how often it can correctly reformat your query, call Wolfram Alpha, and correctly format the result, not whether the statistical model behind ChatGPT happens to predict the right token.
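
To make that concrete, here is a rough sketch of what such a benchmark could look like. Everything in it is a hypothetical stand-in: `stub_reformat` plays the role of the model rewriting the prompt, `stub_tool` plays the role of the Wolfram Alpha call (it only does two-operand integer arithmetic), and `stub_format` plays the role of the model formatting the result. None of this is the real plugin API; the point is only that the query step and the final answer get scored separately.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    prompt: str           # natural-language question from the user
    expected_query: str   # query the tool should receive
    expected_answer: str  # final text the user should see


def benchmark(
    cases: List[Case],
    reformat: Callable[[str], str],
    call_tool: Callable[[str], object],
    format_result: Callable[[object], str],
) -> Dict[str, float]:
    """Score the query-reformatting step and the end-to-end answer separately."""
    hits = {"query": 0, "answer": 0}
    for case in cases:
        query = reformat(case.prompt)
        if query == case.expected_query:
            hits["query"] += 1
        try:
            answer = format_result(call_tool(query))
        except Exception:
            answer = None  # a failed tool call counts as a wrong answer
        if answer == case.expected_answer:
            hits["answer"] += 1
    return {stage: count / len(cases) for stage, count in hits.items()}


# --- Hypothetical stand-ins, NOT the real model or the Wolfram Alpha API ---
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def stub_reformat(prompt: str) -> str:
    # Pretend the model only knows how to strip "What is ...?" phrasing.
    return prompt.replace("What is ", "").rstrip("?")


def stub_tool(query: str) -> int:
    # Pretend tool: evaluates "<int> <op> <int>" and raises on anything else.
    left, op, right = query.split()
    return OPS[op](int(left), int(right))


def stub_format(result: object) -> str:
    return str(result)


CASES = [
    Case("What is 17 * 23?", "17 * 23", "391"),
    Case("What is 100 - 58?", "100 - 58", "42"),
    Case("Compute 2 + 2", "2 + 2", "4"),  # phrasing the stub cannot handle
]

scores = benchmark(CASES, stub_reformat, stub_tool, stub_format)
print(scores)
```

With the stubs above, the third case fails at the reformatting step, so the failure shows up in the query score rather than being blamed on the arithmetic. Swapping in a real model and a real tool call would mean replacing the three stub functions and nothing else.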



Gork@lemm.ee 1 year ago
It sounds like it’s time to merge Wolfram Alpha’s and ChatGPT’s capabilities together to create the ultimate calculator.
