Comment

Comment on It's Breathtaking How Fast AI Is Screwing Up the Education System

How is this kind of testing relevant anymore? Isn’t it creating an unrealistic situation, given the brave new world of AI everywhere?

source

Sort:hotnew top

DoPeopleLookHere@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
Because it test what you actually retained, not what you can convince an AI to tell you.

source
- FourWaveforms@lemm.ee ⁨5⁩ ⁨months⁩ ago
  But what good is that if AI can do it anyway?
  
  That is the crux of the issue.
  
  Years ago the same thing was said about calculators, then graphing calculators. I had to drop a star class and take it again later because the dinosaur didn’t want me to use a graphing calculator.
  
  Naturally they were all full of shit.
  
  But this? This is different. AI is currently as good as a graphing calculator for some engineering tasks, horrible for some others, excellent at still others. It will get better over time. And what happens when it’s awesome at everything?
  
  What is the use of being the smartest human when you’re easily outclassed by a machine?
  
  If we get fully automated yadda yadda, do many of us turn into mush-brained idiots who sit around posting all day? Everyone retires and builds Adirondack chairs and sips mint juleps and whatever? (That would be pretty sweet. But how to get there without mass starvation and unrest?)
  
  Alternately, do we have to do a Butlerian Jihad to get rid of it, and threaten execution to anyone who tries to bring it back… only to ensure we have capitalism and poverty forever?
  
  These are the questions. You have to zoom out to see them.
  
  source
  - Natanael@infosec.pub ⁨5⁩ ⁨months⁩ ago
    Because if you don’t know how to tell when the AI succeeded, you can’t use it.
    
    To know when it succeeded, you must know the topic.
    
    source
    FourWaveforms@lemm.ee ⁨5⁩ ⁨months⁩ ago
    I’m not sure what you’re implying. I’ve used it to solve problems that would’ve taken days to figure out on my own, and my solutions might not have been as good.
    
    I can tell whether it succeeded because its solutions either work, or they don’t. The problems I’m using it on have that property.
    
    source
    -> View More Comments
  - DoPeopleLookHere@sh.itjust.works ⁨5⁩ ⁨months⁩ ago
    
    but what good is that if AI can do it anyway?
    
    It can’t. It just fucking can’t. We’re all pretending it does, but it fundamentally can’t.
    
    appleinsider.com/…/apples-study-proves-that-llm-b…
    
    Creative thinking is still a long way beyond reasoning as well. We’re not close yet.
    
    source
    FourWaveforms@lemm.ee ⁨5⁩ ⁨months⁩ ago
    It’s already capable of doing a lot, and there is reason to expect it will get better over time. If we stick our fingers in our ears and pretend that’s not possible, we will not be prepared.
    
    source
    -> View More Comments
    Tiresia@slrpnk.net ⁨5⁩ ⁨months⁩ ago
    It can and it has done creative mathematical proof work. Nothing spectacular, but at least on par with a mathematics grad student.
    
    source
    -> View More Comments
    pinkapple@lemmy.ml ⁨5⁩ ⁨months⁩ ago
    
    The faulty logic was supported by a previous study from 2019
    
    This directly applies to the human journalist, studies on other models 6 years ago are pretty much irrelevant and this one apparently tested very small distilled ones that you can run on consumer hardware at home (Llama3 8B lol).
    
    Anyway this study seems trash if their conclusion is that small and fine-tuned models (user compliance includes not suspecting intentionally wrong prompts) failing to account for human misdirection somehow means “no evidence of formal reasoning”. Which means using formal logic and formal operations and not reasoning in general, we use informal reasoning for the vast majority of what we do daily and we also rely on “sophisticated pattern matching” lmao, it’s called cognitive heuristics. Kahneman won the Nobel prize for recognizing type 1 and type 2 thinking in humans.
    
    Why don’t you go repeat the experiment yourself on huggingface (accounts are free, over ten models to test, actually many are the same ones the study used) and see what actually happens? Try it on model chains that have a reasoning model like R1 and Qwant and just see for yourself and report back. It would be intellectually honest to verify things since we’re talking about critical thinking in here.
    
    Oh add a control group here, a comparison with average human performance to see what the really funny but hidden part is. Pro-tip: CS STEMlords catastrophically suck when larping being cognitive scientists.
    
    source
    -> View More Comments
  - HobbitFoot@thelemmy.club ⁨5⁩ ⁨months⁩ ago
    If you want to compare a calculator to an LLM, you could at least reasonably expect the calculator result to be accurate.
    
    source
    Zexks@lemmy.world ⁨5⁩ ⁨months⁩ ago
    Why. Because you put trust into the producers of said calculators to not fuck it up. Or because you trust others to vet those machines or are you personally validating. Unless your disassembling those calculators and inspecting their chips sets your just putting your trust in someone else and claiming “this magic box is more trust worthy”
    
    source
    -> View More Comments
    FourWaveforms@lemm.ee ⁨5⁩ ⁨months⁩ ago
    It often is. I’ve got a lot of use out of it.
    
    source
BakerBagel@midwest.social ⁨5⁩ ⁨months⁩ ago
What do you think testing is for? It’s to show what you know/have learned

source
- Skellysgirl@lemm.ee ⁨5⁩ ⁨months⁩ ago
  Education and learning are two different things. School tests are to repeat back what has been educated to you. Meaningful learning tends to be internally motivated and AI is unlikely to fulfill that aspect.
  
  source
  - MutilationWave@lemmy.dbzer0.com ⁨5⁩ ⁨months⁩ ago
    This is the purpose of essay questions.
    
    source