ChatGPT generates cancer treatment plans that are full of errors — Study finds that ChatGPT provided false information when asked to design cancer treatment plans

Submitted 1 year ago by L4s@lemmy.world [bot] to technology@lemmy.world

https://www.businessinsider.com/chatgpt-generates-error-filled-cancer-treatment-plans-study-2023-8

Researchers at Brigham and Women’s Hospital found that cancer treatment plans generated by OpenAI’s revolutionary chatbot were full of errors.


Comments


zeppo@lemmy.world ⁨1⁩ ⁨year⁩ ago
I’m still confused that people don’t realize this. It’s not an oracle. It’s a program that generates sentences word by word based on statistical analysis, with no concept of fact checking. It’s even worse that someone actually did a study instead of simply acknowledging or realizing that ChatGPT is happy to just make stuff up.

- Zeth0s@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Publish or perish, that’s why
  
  - agressivelyPassive@feddit.de ⁨1⁩ ⁨year⁩ ago
    I’m trying really hard for the latter.
    
- fubo@lemmy.world ⁨1⁩ ⁨year⁩ ago
  
  > It’s even worse that someone actually did a study instead of simply acknowledging or realizing that ChatGPT is happy to just make stuff up.
  
  Sure, the world should just trust preconceptions instead of doing science to check our beliefs. That worked great for tens of thousands of years of prehistory.
  
  - zeppo@lemmy.world ⁨1⁩ ⁨year⁩ ago
    It’s not merely a preconception. It’s a rather obvious and well-known limitation of these systems. What I am decrying is that some people, from apparent ignorance, think things like “ChatGPT can give a reliable cancer treatment plan!” or “here, I’ll have it write a legal brief and not even check it for accuracy”. But sure, I agree with you, minus the needless sarcasm. It’s useful to prove or disprove even absurd hypotheses.
    
  - PetDinosaurs@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Why the hell are people downvoting you?
    
    This is absolutely correct. We need to do the science. Always. Doesn’t matter what the theory says. Doesn’t matter that our guess is probably correct.
    
    Plus, all these studies tell us much more than just the conclusion.
    
  - yiliu@informis.land ⁨1⁩ ⁨year⁩ ago
    “After an extensive three-year study, I have discovered that touching a hot element with one’s bare hand does, in fact, hurt.”
    
    “That seems like it was unnecessary…”
    
    “Do U even science bro?!”
    
    Not everything automatically deserves a study. Were there any non-rando people out there claiming that ChatGPT could totally generate legit cancer treatment plans that people could then follow?
    
  - Takumidesh@lemmy.world ⁨1⁩ ⁨year⁩ ago
    It’s not even a preconception, it’s willful ignorance: the website itself tells you multiple times that it is not accurate.
    
    The bottom of every chat has this text: “Free Research Preview. ChatGPT may produce inaccurate information about people, places, or facts. ChatGPT August 3 Version”
    
    And when you first use it, a modal pops up explaining the same thing.
    
- net00@lemm.ee ⁨1⁩ ⁨year⁩ ago
  Yeah, this stuff was always marketed as a way to automate the simple and repetitive things we do daily. It’s mostly the media, I guess, who started misleading everyone into thinking this was AI like Skynet. It’s still useful, just not as an all-knowing AI god.
  
- inspxtr@lemmy.world ⁨1⁩ ⁨year⁩ ago
  While I agree it has become more or less common knowledge that they’re unreliable, this adds to the myriad examples that should lead corporations, big organizations and governments to abstain from using them, or at least to be informed about these cases and their nuances so they know how to integrate them.

  Why? Partly, I think, because many of these organizations are racing to adopt them for cost-cutting purposes or to chase the hype, or are too slow to regulate them, … and there are (or could be) very good uses that justify adoption in the first place.
  
  I don’t think it’s good enough to have a blanket conception to not trust them completely. I think we need multiple examples of the good, the bad and the questionable in different domains to inform the people in charge, the people using them, and the people who might be affected by their use.
  
  It’s kinda like the recent event at DefCon that tried to exploit LLMs: it’s not enough that we have some intuition about their harms, so the people at the event aimed to demonstrate the extremes of such harms, AFAIK. These efforts can help developers/researchers mitigate them, and show concretely to anyone trying to adopt them how harmful they could be.

  Regulators also need these examples in specific domains so they can make informed policies on them, sometimes by building on or modifying the existing policies of those domains.
  
  - zeppo@lemmy.world ⁨1⁩ ⁨year⁩ ago
    This is true and well-stated. Mainly what I wish people would understand is there are current appropriate uses, like ‘rewrite my marketing email’, but generating information that could result in great harm if inaccurate is an inappropriate use. It’s all about the specific model, though - if you had a ChatGPT system trained extensively on medical information, it would result in greater accuracy, but still the information would need expert human review before any decision were made. Mainly I wish the media had been more responsible and accurate in portraying these systems to the public.
    
  - jvisick@programming.dev ⁨1⁩ ⁨year⁩ ago
    
    > I don’t think it’s good enough to have a blanket conception to not trust them completely.
    
    On the other hand, I actually think we should, as a rule, not trust the output of an LLM.
    
    They’re great for generative purposes, but I don’t think there’s a single valid case where the accuracy of their response should be outright trusted. Any information you get from an AI model should be validated outright.
    
    There are many cases where a simple once-over from a human is good enough, but any time it tells you something you didn’t already know you should not trust it and, if you want to rely on that information, you should validate that it’s accurate.
    
- iforgotmyinstance@lemmy.world ⁨1⁩ ⁨year⁩ ago
  I know university professors struggling with this concept. They are so convinced using an LLM is plagiarism.
  
  It can lead to plagiarism if you use it poorly, which is why you control the information you feed it. Then proofread and edit.
  
  - zeppo@lemmy.world ⁨1⁩ ⁨year⁩ ago
    Another related confusion in academia recently is the ‘AI detector’. These can easily be defeated with minor rewrites, if they were even accurate in the first place. My favorite misconception is the story of a professor who told students “I asked ChatGPT if it wrote this, and it said yes”, which is just really not how it works.
    
- nfsu2@feddit.cl ⁨1⁩ ⁨year⁩ ago
  True. I tried to explain this to my parents because they were scared of it, and they seemed skeptical.
  
imperator3733@lemmy.world ⁨1⁩ ⁨year⁩ ago
No duh - why would it have any ability to do that sort of task?

- xkforce@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Part of the reason for studies like this is to debunk people’s expectations of AI’s capabilities. A lot of people are under the impression that ChatGPT can do ANYTHING and can think and reason, when in reality it is a bullshitter that does nothing more than mimic what it thinks a suitable answer looks like. Just like a parrot.
  
- PeleSpirit@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Because if it’s able to crawl all of the science pubs, then it would be able to try different combos until it works. Isn’t that how it could/is being used, to test stuff?
  
  - Ranessin@feddit.de ⁨1⁩ ⁨year⁩ ago
    It doesn’t check the stuff it generates, other than for grammatical and orthographic errors. It isn’t intelligent and has no knowledge outside of how to create text. The text looks useful, but it doesn’t know what it contains in the way something intelligent would.
    
  - stephen01king@lemmy.zip ⁨1⁩ ⁨year⁩ ago
    If you want an AI that can create cancer treatment, you need to train it on creating cancer treatment, and not just use one that is trained on general knowledge. Even if you train it on science publications, all it can now reliably do is mimic a science journal since it has not been trained on how to parse the knowledge in the journal itself.
    
Uncaged_Jay@lemmy.world ⁨1⁩ ⁨year⁩ ago
“Hey, program that is basically just regurgitating information, how do we do this incredibly complex thing that even we don’t understand yet?”

“Here ya go.”

“Wow, this is wrong.”

“No shit.”

- JackbyDev@programming.dev ⁨1⁩ ⁨year⁩ ago
  “Be aware that ChatGPT may produce wrong or inaccurate results, what is your question?”
  
  How best cancer
  
  “”
  
  😱
  
sentient_loom@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
Why the fuck would anybody think a chat bot could create a cancer treatment plan?

- 5BC2E7@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Because it’s been hyped. They had announced it could pass the medical licensing exam with good scores. The belief that it can replace a doctor has already been put forward
  
  - Touching_Grass@lemmy.world ⁨1⁩ ⁨year⁩ ago
    It did pass it, didn’t it? But who said it can replace doctors?
    
- solstice@lemmy.world ⁨1⁩ ⁨year⁩ ago
  
  On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
  
  Charles Babbage
  
Kodemystic@lemmy.kodemystic.dev ⁨1⁩ ⁨year⁩ ago
Who tf is asking chatgpt for cancer treatments anyway?

- playerwhoplayyes@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Probably people who want to check AI accuracy, or people who don’t want to search or go to the doctor and ask ChatGPT instead. Even if I asked about a cure, I would use another AI such as the Bing AI, and I would still go to the doctor. I will never ask an AI or search the internet for cures for cancer, and I never self-medicate.
  
- solstice@lemmy.world ⁨1⁩ ⁨year⁩ ago
  It’s hilarious to me that people need to be told word for word that ChatGPT is NOT literally the cure for cancer.
  
elboyoloco@lemmy.world ⁨1⁩ ⁨year⁩ ago
Scientist: Asks magic conch a question about cancer.

Conch: “Try shoving bees up your ass.”

Scientist: 😡

Pyr_Pressure@lemmy.ca ⁨1⁩ ⁨year⁩ ago
ChatGPT is a language model / chatbot. Not a doctor. Has anyone claimed that it’s a doctor?

- Agent641@lemmy.world ⁨1⁩ ⁨year⁩ ago
  ChatGPT fails at basic math, and lies about the existence of technical documentation.

  I mostly use it for recipe inspiration and discussing books I’ve read recently. Just banter, you know? Nothing mission-critical.
  
  - IDontHavePantsOn@lemm.ee ⁨1⁩ ⁨year⁩ ago
    Just a couple of days ago it kept telling me it was possible to re-tile the broken part of my shower without cutting tiles, but none of the math added up (18.5H x 21.5W area): “Place a 9" tile vertically. Place another 9" tile vertically on top on the same side. Place another 9" tile on top vertically to cover the remainder of the area.”

    I told ChatGPT it was wrong, which it admitted, and it spit out another wrong answer. I tried specifying a few more times before I started a new chat and dumbed it down to just a simple math problem. The first part of the chat said it was possible and laid out the steps, and then the last sentence said it wasn’t possible.

    I surely wouldn’t trust ChatGPT to advise on my healthcare, but after seeing it spit out very wrong answers to a basic math question, I’m just wondering why anyone would try to have it advise on anyone’s healthcare.
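
For what it's worth, the tile arithmetic can be checked in a few lines. This is only a sketch: it assumes square 9-inch tiles and no grout gap, neither of which is stated above, and `whole_tiles_fit` is a made-up helper name.

```python
# Hypothetical check: how many whole tiles fit along a span without cutting?
# Assumes square 9-inch tiles and zero grout width (not stated in the comment).

def whole_tiles_fit(span_in: float, tile_in: float) -> tuple[int, float]:
    """Return (number of whole tiles that fit, leftover gap in inches)."""
    count = int(span_in // tile_in)
    leftover = span_in - count * tile_in
    return count, leftover

height, width, tile = 18.5, 21.5, 9.0

rows, gap_h = whole_tiles_fit(height, tile)
cols, gap_w = whole_tiles_fit(width, tile)

print(f"height: {rows} whole tiles, {gap_h} in left over")
print(f"width: {cols} whole tiles, {gap_w} in left over")
print(f"three stacked tiles need {3 * tile} in, but the area is {height} in tall")
```

Under those assumptions, only two whole tiles fit in each direction and both directions leave a gap, so the suggested third stacked tile cannot fit without a cut.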
    
LazyBane@lemmy.world ⁨1⁩ ⁨year⁩ ago
People really need to get it into their heads that AI can “hallucinate” random information and that any implementation of an AI needs a qualified human overseeing it.

- grabyourmotherskeys@lemmy.world ⁨1⁩ ⁨year⁩ ago
  Exactly, it’s stringing together information in a series of iterations, each time adding a new inference consistent with what came before. It has no way to know if that inference is correct.
  
TenderfootGungi@lemmy.world ⁨1⁩ ⁨year⁩ ago
The computer science classroom in my high school had a poster stating: “Garbage in garbage out”

sturmblast@lemmy.world ⁨1⁩ ⁨year⁩ ago
Why is anyone surprised by this? It’s not meant to be your doctor.

30mag@lemmy.world ⁨1⁩ ⁨year⁩ ago
I am shocked.

alienanimals@lemmy.world ⁨1⁩ ⁨year⁩ ago
Clickbait written by an idiot who doesn’t understand technology. I guess they give out journalism degrees to anyone who can write a top 10 buzzfeed article.

eager_eagle@lemmy.world ⁨1⁩ ⁨year⁩ ago
Image

autotldr@lemmings.world [bot] ⁨1⁩ ⁨year⁩ ago
This is the best summary I could come up with:

According to the study, which was published in the journal JAMA Oncology and initially reported by Bloomberg – when asked to generate treatment plans for a variety of cancer cases, one-third of the large language model’s responses contained incorrect information.

The chatbot sparked a rush to invest in AI companies and an intense debate over the long-term impact of artificial intelligence; Goldman Sachs research found it could affect 300 million jobs globally.

Famously, Google’s ChatGPT rival Bard wiped $120 billion off the company’s stock value when it gave an inaccurate answer to a question about the James Webb space telescope.

Earlier this month, a major study found that using AI to screen for breast cancer was safe, and suggested it could almost halve the workload of radiologists.

A computer scientist at Harvard recently found that GPT-4, the latest version of the model, could pass the US medical licensing exam with flying colors – and suggested it had better clinical judgment than some doctors.

The JAMA study found that 12.5% of ChatGPT’s responses were “hallucinated,” and that the chatbot was most likely to present incorrect information when asked about localized treatment for advanced diseases or immunotherapy.

The original article contains 523 words, the summary contains 195 words. Saved 63%. I’m a bot and I’m open source!

NigelFrobisher@aussie.zone ⁨1⁩ ⁨year⁩ ago
People really need to understand what LLMs are, and also what they are not. None of the messianic hype or even use of the term “AI” helps with this, and most of the ridiculous claims made in the space make me expect Peter Molyneux to be involved somehow.

- dx1@lemmy.world ⁨1⁩ ⁨year⁩ ago
  LLMs fit in the “weak AI” category. I’d be inclined to not call them “AI” at all, since there is no intelligence, just the illusion of intelligence. It’s possible to build intelligent AI, but probabilistic text construction isn’t even close.
  
  - fsmacolyte@lemmy.world ⁨1⁩ ⁨year⁩ ago
    
    > It’s possible to build intelligent AI
    
    What does intelligent AI that we can currently build look like?
    
MrSlicer@lemmy.world ⁨1⁩ ⁨year⁩ ago
So does my dog, this isn’t news.

j4yt33@feddit.de ⁨1⁩ ⁨year⁩ ago
Why would you ask it to do that in the first place??

- dmonzel@lemmy.world ⁨1⁩ ⁨year⁩ ago
  To prove to all of the tech bros that ChatGPT isn’t an actual AI, perhaps. At least that’s the feeling I get based on what the article says.
  
unreachable@lemmy.my.id ⁨1⁩ ⁨year⁩ ago
chatgpt/bard is only the next iteration of MegaHAL

that’s why they called it “large language model”, not “artificial intelligence”

Sanctus@lemmy.world ⁨1⁩ ⁨year⁩ ago
These studies are for the people out there who think ChatGPT thinks. It’s a really good email assistant, and it can even get basic programming questions right if you are detailed with your prompt. Now everyone stop trying to make this thing like Finn’s mom in Adventure Time and just use it to help you write a long email in a few seconds. Jfc.

quadropiss@lemmy.world ⁨1⁩ ⁨year⁩ ago
😱😱😱😱😱😱😱😱😱 /j

IceMan@lemmy.one ⁨1⁩ ⁨year⁩ ago
Google is full of bullshit too - what’s the big deal?

GBU_28@lemm.ee ⁨1⁩ ⁨year⁩ ago
No one is building document-traversal LLMs in the healthcare space with off-the-shelf tools

Sanctus@lemmy.world ⁨1⁩ ⁨year⁩ ago
I thought it was released in 2021. Maybe it was on the cusp. I was basically using it to find what I couldn’t seem to find in the docs. It’s definitely replaced my rubber ducky, but I still have to double check it after my Unity experience.
