Comment on Microsoft launches ‘vibe working’ in Excel and Word

InEnduringGrowStrong@sh.itjust.works ⁨6⁩ ⁨months⁩ ago

Microsoft says its Agent Mode in Excel has an accuracy rate of 57.2 percent in SpreadsheetBench, a benchmark for evaluating an AI model’s ability to edit real world spreadsheets.

It generates 42.8% bullshit.



sparky@lemmy.federate.cc ⁨6⁩ ⁨months⁩ ago
defector.com/it-took-many-years-and-billions-of-d…

potoo22@programming.dev ⁨6⁩ ⁨months⁩ ago
Just keep regenerating data until it’s something the stockholders like. Doesn’t matter if it’s BS. They’re already accustomed to that.

SkaveRat@discuss.tchncs.de ⁨6⁩ ⁨months⁩ ago
Nice. Basically a coin flip

- GasMaskedLunatic@lemmy.dbzer0.com ⁨6⁩ ⁨months⁩ ago
  Slightly better than Vegas. Unfortunately, plenty of people are okay with Vegas odds.
  
MadMadBunny@lemmy.ca ⁨6⁩ ⁨months⁩ ago
So it achieved the actual proficiency of a middle manager…

- MonkderVierte@lemmy.zip ⁨6⁩ ⁨months⁩ ago
  Decades ago. The company that replaced its CEO with an LLM thrives.
  
Imgonnatrythis@sh.itjust.works ⁨6⁩ ⁨months⁩ ago
Not enough accuracy to be useful. Not enough bullshit for politics.

jubilationtcornpone@sh.itjust.works ⁨6⁩ ⁨months⁩ ago
They probably view that as a statistic worth bragging about. It’s not. If Excel got calculations right 57.2% of the time it would be completely worthless.

- PerogiBoi@lemmy.ca ⁨6⁩ ⁨months⁩ ago
  I asked Copilot to look through every one of my spreadsheets and find how many instances of a category occurred. I was curious to see if it was any good. It gave me 2 different numbers. Neither was correct.
  
  - jubilationtcornpone@sh.itjust.works ⁨6⁩ ⁨months⁩ ago
    Copilot: Putting the “Artificial” in Artificial Intelligence.
    
    sirboozebum@lemmy.world ⁨6⁩ ⁨months⁩ ago
    Fartificial Intelligence
    
    PerogiBoi@lemmy.ca ⁨6⁩ ⁨months⁩ ago
    The tech behind LLMs could have just been Clippy and everyone would be happy.
    
- FreedomAdvocate@lemmy.net.au ⁨6⁩ ⁨months⁩ ago
  Did you read the next sentence? Humans only get like 72% right. It’s not far off at all.
  
  - MountingSuspicion@reddthat.com ⁨6⁩ ⁨months⁩ ago
    I wonder where that “human accuracy” statistic is coming from. Plenty of people don’t know how to read and interpret data, much less use Excel in the first place. There’s a difference between 1/4 of people in the workforce not being able to complete a task, and a specialized AI not being able to complete a task. Additionally, this is how you get into the “KPI as a goal rather than a proxy” issue. AI will never understand context that isn’t directly provided in the workbook. If you introduced a new drink at your restaurant in 2020, AI will tell you that the introduction of the drink caused a 100% decrease in foot traffic, since there’s no line item for “global pandemic”. I’m not saying AI will never be there, but people using this version of AI instead of actual analysis don’t care about the facts; they just want an answer, and they want that answer to be cheap.
    
    FreedomAdvocate@lemmy.net.au ⁨6⁩ ⁨months⁩ ago
    As I’ve said many times, though not in this topic - AI is a tool to be used, and using it is a skill that needs to be learned.
    
    For your pandemic example, that’s context you would need to provide to the AI. The joke of a “prompt engineer” being a job soon actually has merit, in that you want people who know how to use their tools the best. It’s a matter of constantly learning, through iteration, how to give the AI a specific instruction set to get the results you want/need.
    
  - FutileRecipe@lemmy.world ⁨6⁩ ⁨months⁩ ago
    Depending on where you go to school, 70% is passing while 50% is not. While “not far off,” one is a C, the other an F.
    
    FreedomAdvocate@lemmy.net.au ⁨6⁩ ⁨months⁩ ago
    That’s not at all what this means. In this instance, 70% is basically “human level”.
    