Why I am not impressed by A.I.
Submitted 4 weeks ago by joel1974@lemmy.world to technology@lemmy.world
https://lemmy.world/pictrs/image/7a041790-7877-4ce9-a155-1b4fcba9d01b.png
Comments
whotookkarl@lemmy.world 4 weeks ago
I’ve already had more than one conversation where people quote AI as if it were a source, like quoting Google as a source. When I show them how it can sometimes lie and explain that it’s not a primary source for anything, I just get that blank stare, like I have two heads.
schnurrito@discuss.tchncs.de 4 weeks ago
Me too. It’s happened more than once on a language-learning subreddit for my first language: “I asked ChatGPT whether this was correct grammar in German; it said no, but I read this counterexample”, and then everyone correctly responded “why the fuck are you asking ChatGPT about this”.
muntedcrocodile@lemm.ee 4 weeks ago
I use AI like that, except I’m not using the same shit everyone else is on. I use a Dolphin fine-tuned model with tool use, hooked up to an embedder and SearXNG. Every claim it makes is sourced.
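Roughly, the search half of that setup looks like this (a minimal sketch, assuming a local SearXNG instance with its JSON API enabled; the URL and the model hookup are illustrative, not my exact stack):

    # a rough sketch: fetch sources from a local SearXNG instance, then hand
    # them to whatever model you run so every claim can cite a result.
    # assumes format=json is enabled in the SearXNG settings.
    import requests

    def search_sources(query: str, instance: str = "http://localhost:8888"):
        resp = requests.get(
            f"{instance}/search",
            params={"q": query, "format": "json"},
            timeout=10,
        )
        resp.raise_for_status()
        # keep just enough of each hit for the model to quote and cite
        return [
            {"title": r["title"], "url": r["url"], "snippet": r.get("content", "")}
            for r in resp.json()["results"][:5]
        ]

    sources = search_sources("how many r's in strawberry")
    context = "\n".join(f"[{i+1}] {s['title']} - {s['url']}" for i, s in enumerate(sources))
    # 'context' gets prepended to the prompt so the model answers with citations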
Traister101@lemmy.today 4 weeks ago
Sure buddy
eggymachus@sh.itjust.works 4 weeks ago
A guy is driving around the back woods of Montana and he sees a sign in front of a broken down shanty-style house: ‘Talking Dog For Sale.’
He rings the bell and the owner appears and tells him the dog is in the backyard.
The guy goes into the backyard and sees a nice looking Labrador Retriever sitting there.
“You talk?” he asks.
“Yep” the Lab replies.
After the guy recovers from the shock of hearing a dog talk, he says, “So, what’s your story?”
The Lab looks up and says, “Well, I discovered that I could talk when I was pretty young. I wanted to help the government, so I told the CIA. In no time at all they had me jetting from country to country, sitting in rooms with spies and world leaders, because no one figured a dog would be eavesdropping. I was one of their most valuable spies for eight years running… but the jetting around really tired me out, and I knew I wasn’t getting any younger, so I decided to settle down. I signed up for a job at the airport to do some undercover security, wandering near suspicious characters and listening in. I uncovered some incredible dealings and was awarded a batch of medals. I got married, had a mess of puppies, and now I’m just retired.”
The guy is amazed. He goes back in and asks the owner what he wants for the dog.
“Ten dollars,” the owner says.
“Ten dollars? This dog is amazing! Why on Earth are you selling him so cheap?”
“Because he’s a liar. He’s never been out of the yard.”
whynot_1@lemmy.world 4 weeks ago
I think I have seen this exact post word for word fifty times in the last year.
clay_pidgin@sh.itjust.works 4 weeks ago
Has the number of "r"s changed over that time?
ElectroLisa@lemmy.blahaj.zone 4 weeks ago
Yes
pulsewidth@lemmy.world 4 weeks ago
And apparently they still can’t get an accurate result with such a basic query.
Grandwolf319@sh.itjust.works 4 weeks ago
There is an alternative reality out there where LLMs were never marketed as AI and were instead marketed as random text generators.
In that world, tech-savvy people would embrace the tech instead of having to constantly educate people that it is in fact not intelligence.
Static_Rocket@lemmy.world 4 weeks ago
That was this reality. Very briefly. Remember AI Dungeon and the other clones that were popular prior to the mass ML marketing campaigns of the last two years?
Tgo_up@lemm.ee 4 weeks ago
This is a bad example… If I ask a friend “is strawberry spelled with one or two r’s?” they would think I’m asking about the last part of the word.
The question seems to be specifically made to trip up LLMs. I’ve never heard anyone ask how many of a certain letter are in a word. I’ve heard people ask how you spell a word, and whether it’s with one or two of a specific letter, though.
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.
renegadespork@lemmy.jelliefrontier.net 4 weeks ago
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.
This is exactly the problem, though. They don’t have “intelligence” or any actual reasoning, yet they are constantly being used in situations that require reasoning.
sugar_in_your_tea@sh.itjust.works 4 weeks ago
Maybe if you focus on pro- or anti-AI sources, but if you talk to actual professionals or hobbyists solving actual problems, you’ll see very different applications. If you go into it looking for problems, you’ll find them; likewise, if you go into it looking for use cases, you’ll find them.
Grandwolf319@sh.itjust.works 4 weeks ago
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed
Artificial sugar is still sugar.
Artificial intelligence implies there is intelligence in some shape or form.
corsicanguppy@lemmy.ca 4 weeks ago
Artificial sugar is still sugar.
Because it contains sucrose, fructose or glucose? Because it metabolises the same and matches the glycemic index of sugar?
Because those are all wrong. What’s your criteria?
JohnEdwa@sopuli.xyz 4 weeks ago
Something that pretends to be or looks like intelligence, but actually isn’t at all, is a perfectly valid interpretation of the word.
Scubus@sh.itjust.works 4 weeks ago
That’s because it wasn’t originally called AI. It was called an LLM. Techbros trying to sell it and articles wanting to fan the flames started calling it AI, and eventually it became common parlance. No one in the field seriously calls it AI; they generally reserve that term for general AI or at least narrow AI, of which an LLM is neither.
gerryflap@feddit.nl 4 weeks ago
These models don’t get single characters but rather tokens representing multiple characters. While I also don’t like the “AI” hype, this image is very one-dimensional hate and misrepresents the usefulness of these models by picking one adversarial example.
Today ChatGPT saved me a fuckton of time by linking me to the exact issue on GitLab that discussed the problem I was having (full system freezes using Bottles installed with Flatpak on Arch). This was the URL it came up with: gitlab.archlinux.org/archlinux/packaging/…/110
This issue is one day old. When I looked this shit up myself I found exactly nothing useful on either DDG or Google. After this, ChatGPT also provided me with the information that the LTS kernel exists and how to install it. Obviously I verified that stuff before using it, because these LLMs have their limits. Now my system works again, and figuring this out myself would’ve cost me hours because I had no idea what broke. Was it Flatpak, Nvidia, the kernel, Wayland, Bottles, some random shit I changed in a config file 2 years ago? Well, thanks to ChatGPT, I know.
They’re tools, and they can provide new insights that can be very useful. Just don’t expect them to always tell the truth, or to actually be human-like
lennivelkant@discuss.tchncs.de 4 weeks ago
Just don’t expect them to always tell the truth, or to actually be human-like
I think the point of the post is to call out exactly that: people preaching AI as replacing humans
desktop_user@lemmy.blahaj.zone 4 weeks ago
It can, in the same way a loom did, just for more language-y tasks. A multimodal system might be better at answering that type of question by first detecting that it’s a question of fact, and that running a bucket-sort algorithm on the word “strawberry” will answer the question better than its questionably obtained correlations.
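The counting step itself is trivial to do deterministically; a minimal sketch of the kind of subroutine such a system could delegate to:

    # deterministic letter tally - the trivial subroutine a tool-using model
    # could call instead of guessing from token statistics
    from collections import Counter

    def letter_count(word: str, letter: str) -> int:
        return Counter(word.lower())[letter.lower()]

    print(letter_count("strawberry", "r"))  # 3
    print(letter_count("strawbery", "r"))   # 2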
FourPacketsOfPeanuts@lemmy.world 4 weeks ago
It’s predictive text on speed. The LLMs currently in vogue hardly qualify as A.I. tbh…
TeamAssimilation@infosec.pub 4 weeks ago
Still, it’s kinda insane how two years ago we didn’t imagine we would be instructing programs like “be helpful but avoid sensitive topics”.
That was definitely a big step in AI.
Grabthar@lemmy.world 4 weeks ago
Doc: That’s an interesting name, Mr…
Fletch: Babar.
Doc: Is that with one B or two?
Fletch: One. B-A-B-A-R.
Doc: That’s two.
Fletch: Yeah, but not right next to each other, that’s what I thought you meant.
Doc: Isn’t there a children’s book about an elephant named Babar?
Fletch: Ha, ha, ha. I wouldn’t know. I don’t have any.
Doc: No children?
Fletch: No elephant books.
Zess@lemmy.world 4 weeks ago
You asked a stupid question and got a stupid response, seems fine to me.
interdimensionalmeme@lemmy.ml 4 weeks ago
Yes, nobody asking that question is wondering about the “straw” part of the word. They’re asking: is the “berry” part one “r” or two?
Kolanaki@pawb.social 4 weeks ago
“strawbery” has 2 R’s in it while “strawberry” has 3.
Fucking AI can’t even count.
artificialfish@programming.dev 4 weeks ago
This is literally just a tokenization artifact. If I asked you how many r’s are in /0x5273/0x7183 you’d be confused too.
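You can see the chunking for yourself with OpenAI’s tiktoken library (a quick sketch; the exact splits vary by tokenizer, so treat the printed pieces as illustrative):

    # show how a BPE tokenizer chops "strawberry" into multi-character chunks;
    # the model sees these token ids, not individual letters
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")  # the GPT-4-era tokenizer
    ids = enc.encode("strawberry")
    print(ids, [enc.decode([i]) for i in ids])  # a few multi-letter chunks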
pulsewidth@lemmy.world 4 weeks ago
Fair enough - sounds like they might not be ready for prime time though.
Oh well, at least while the bugs get ironed out we’re not using them for anything important.
dan1101@lemm.ee 4 weeks ago
It’s like someone who has no formal education but has a high level of confidence and eavesdrops on a lot of random conversations.
zipzoopaboop@lemmynsfw.com 4 weeks ago
You rang?
HoofHearted@lemmy.world 4 weeks ago
The terrifying thing is everyone criticising the LLM as being poor, when in fact it excelled at the task.
The question asked was how many Rs are in “strawbery”, and it answered: 2.
It also detected the typo and offered the correct spelling.
What’s the issue I’m missing?
Tywele@lemmy.dbzer0.com 4 weeks ago
The issue that you are missing is that the AI answered that there is 1 ‘r’ in ‘strawbery’ even though there are 2 'r’s in the misspelled word. And the AI corrected the user with the correct spelling of the word ‘strawberry’ only to tell the user that there are 2 'r’s in that word even though there are 3.
TomAwsm@lemmy.world 4 weeks ago
Sure, but for what purpose would you ever ask about the total number of a specific letter in a word? This isn’t the gotcha that so many think it is. The LLM answers like it does because it makes perfect sense for someone to ask if a word is spelled with a single or double “r”.
TeamAssimilation@infosec.pub 4 weeks ago
Uh oh, you’ve blown your cover, robot sir.
Fubarberry@sopuli.xyz 4 weeks ago
There’s also an “r” in the first half of the word, “straw”, so it was completely skipping over that r and just focusing on the r’s in “berry”.
catloaf@lemm.ee 4 weeks ago
It wasn’t focusing on anything. It was generating text per its training data. There’s no logical thought process whatsoever.
jj4211@lemmy.world 4 weeks ago
It doesn’t see “strawberry” or “straw” or “berry”. It’s closer to think of it as seeing 🍓, an abstract token representing the same concept that the training data associated with the word.
rumba@lemmy.zip 4 weeks ago
Yeah, and you know, I always hated this: screwdrivers make really bad hammers.
Allero@lemmy.today 4 weeks ago
Here’s my guess:
We all know LLMs train on human-generated data. And when we ask something like “how many R’s” or “how many L’s” are in a given word, we don’t mean to count them all - we normally mean something like “how many consecutive letters are there, so I can spell it right”.
Yes, the word “strawberry” has 3 R’s. But what most people are interested in is whether it is “strawberry” or “strawbery”, and their “how many R’s” refers to this exactly, not the entire word.
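That doubled-letter check is a different computation entirely; a quick sketch of what the question usually means:

    # what "how many R's?" usually means: is any letter doubled?
    import re

    def doubled_letters(word: str) -> list[str]:
        return [m.group(1) for m in re.finditer(r"(.)\1", word)]

    print(doubled_letters("strawberry"))  # ['r'] - the "berry" part doubles it
    print(doubled_letters("strawbery"))   # [] - no doubled letters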
jj4211@lemmy.world 4 weeks ago
It doesn’t even see the word ‘strawberry’, it’s been tokenized in a way to no longer see the ‘text’ that was input.
It’s more like it sees a question like: How many 'r’s in 草莓?
And it spits out an answer not based on analysis of the input, but a model of what people might have said.
Opisek@lemmy.world 4 weeks ago
But to be fair, as people we would not ask “how many Rs does strawberry have”, but “with how many Rs do you spell strawberry” or “do you spell strawberry with 1 R or 2 Rs”
humorlessrepost@lemmy.world 4 weeks ago
Works fine for me in o3-mini-high:
Counting letters in “strawberry”
Alright, I’m checking: the word “strawberry” is spelled S T R A W B E R R Y. Let me count the letters: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). There are three R’s: in positions 3, 8, and 9. So, the answer is 3. Even if we ignore case, the count still holds. Therefore, there are 3 r’s in “strawberry.”
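For comparison, the check the model narrates there is trivial to reproduce deterministically:

    # the same positional check, done in one pass
    word = "strawberry"
    positions = [i + 1 for i, c in enumerate(word) if c == "r"]
    print(len(positions), positions)  # 3 [3, 8, 9]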
sheogorath@lemmy.world 4 weeks ago
Finally! With a household’s energy consumption for one day, we can count how many Rs are in strawberry.
interdimensionalmeme@lemmy.ml 4 weeks ago
A normal person would say ‘strawberry with two "r"s’
LovableSidekick@lemmy.world 4 weeks ago
What would have been different about this if it had impressed you? It answered the literal question and also the question the user was actually trying to ask.
Wrrzag@lemmy.ml 4 weeks ago
It didn’t? StRawbeRy has 2 rs. StRawbeRRy has 3.
genuineparts@infosec.pub 4 weeks ago
But you realize that it’s wrong on both counts, right?
Strawberry has three Rs, or two in the wrong spelling.
Fubarberry@sopuli.xyz 4 weeks ago
I asked mistral/brave AI and got this response:
How Many Rs in Strawberry
The word “strawberry” contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
jj4211@lemmy.world 4 weeks ago
Yes, at some point the meme becomes the training data and the LLM doesn’t need to answer because it sees the answer all over the damn place.
zipzoopaboop@lemmynsfw.com 4 weeks ago
I asked Gemini if the Quest has an SD slot. It doesn’t, but Gemini said it did. Checking the source, it was pulling info from the Vive user manual.
winkly@lemmy.world 4 weeks ago
How many strawberries could a strawberry bury if a strawberry could bury strawberries 🍓
seven_phone@lemmy.world 4 weeks ago
[deleted]
pr06lefs@lemmy.ml 4 weeks ago
there are two 'r’s in ‘strawbery’
otp@sh.itjust.works 4 weeks ago
From a linguistic perspective, this is why I am impressed by (or at least, astonished by) LLMs!
Lazycog@sopuli.xyz 4 weeks ago
I can already see it…
Ad: CAN YOU SOLVE THIS IMPOSSIBLE RIDDLE THAT AI CAN’T SOLVE?!
With OP’s image. And then it will have the following once you solve it: “congratz, send us your personal details and you’ll be added to the hall of fame at CERN Headquarters”
AA5B@lemmy.world 4 weeks ago
I’ve been avoiding this question up until now, but here goes:
Hey Siri …
- how many r’s in strawberry? 0
- how many letter r’s in the word strawberry? 10
- count the letters in strawberry. How many are r’s? ChatGPT … 2
autonomoususer@lemmy.world 4 weeks ago
Skill issue
sentient_loom@sh.itjust.works 4 weeks ago
I know, right? It’s not a fruit, it’s a vegetable!
VintageGenious@sh.itjust.works 4 weeks ago
Because you’re using it wrong. It’s good for generative text and chains of thought, not symbolic calculations including math or linguistics
Grandwolf319@sh.itjust.works 4 weeks ago
No, I think you mean to say it’s because you’re using it for the wrong use case.
Well this tool has been marketed as if it would handle such use cases.
I don’t think I’ve actually seen any AI marketing that was honest about what it can do.
I personally think image recognition is the best use case as it pretty much does what it promises.
scarabic@lemmy.world 4 weeks ago
Really? AI has been marketed as being able to count the r’s in “strawberry?” Please link to this ad.
joel1974@lemmy.world 4 weeks ago
Give me an example of how you use it.
L3s@lemmy.world 4 weeks ago
Writing customer/company-wide emails is a good example. “Make this sound better: we’re aware of the outage at Site A, we are working as quick as possible to get things back online”
Another is feeding it an article and asking for a summary, hackingne.ws does that for its Bsky posts.
Coding is another good example: “write me a Python script that moves all files in /mydir to /newdir” (see the sketch after this list).
Asking for it to summarize a theory, “explain to me why RIP was replaced with RIPv2, and what problems people have had since with RIPv2”
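For the coding example above, the kind of script that prompt might come back with looks something like this sketch (/mydir and /newdir are the hypothetical paths from the prompt):

    # move all regular files from /mydir to /newdir, creating the target first
    import shutil
    from pathlib import Path

    src, dst = Path("/mydir"), Path("/newdir")
    dst.mkdir(parents=True, exist_ok=True)

    for item in src.iterdir():
        if item.is_file():  # move files only, leave subdirectories in place
            shutil.move(str(item), dst / item.name)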
lime@feddit.nu 4 weeks ago
i’m still not entirely sold on them but since i’m currently using one that the company subscribes to i can give a quick opinion:
i had an idea for a code snippet that could save me some headache (a mock for primitives in lua, to be specific) but i foresaw some issues with commutativity (aka how to make sure that a + b == b + a). so i asked about this, and the llm created some boilerplate to test this code. i’ve been chatting with it for about half an hour, and had it expand the idea to all possible metamethods available on primitive types, together with about 50 test cases with descriptive assertions. i’ve now run into an issue where the __eq metamethod isn’t firing correctly when one of the operands is a primitive rather than a mock, and after having the llm link me to the relevant part of the docs, that seems to be a feature of the language rather than a bug.

so in 30 minutes i’ve gone from a loose idea to a well-documented proof-of-concept to a roadblock that can’t really be overcome. complete exploration and feasibility study, fully tested, in less than an hour.
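for the curious, here’s a rough python analogue of that commutativity check - lua metamethods map onto python’s dunder methods, and unlike lua (where __eq only fires when both operands are tables), python retries the reflected operand, which is roughly the roadblock described above. a sketch of the idea, not the actual lua code:

    # rough analogue of the lua experiment: a mock wrapping a primitive that
    # forwards arithmetic both ways so we can assert a + b == b + a
    class Mock:
        def __init__(self, value):
            self.value = value

        def __add__(self, other):  # mock + x
            return self.value + getattr(other, "value", other)

        __radd__ = __add__  # x + mock, so addition stays commutative

        def __eq__(self, other):
            # python tries this even for 5 == Mock(5); lua's __eq would not
            # fire there, hence the roadblock with primitive operands
            return self.value == getattr(other, "value", other)

    a, b = Mock(2), 3
    assert a + b == b + a  # both sides evaluate to 5
    assert 5 == Mock(5)    # works in python via the reflected __eq__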
chiisana@lemmy.chiisana.net 4 weeks ago
Ask it for a second opinion on medical conditions.
Sounds insane, but they are leaps and bounds better than blindly Googling and self-prescribing every condition under the sun when the symptoms only vaguely match.
Once the LLM helps you narrow in on a couple of possible conditions based on the symptoms, then you can dig deeper into those specific ones, learn more about them, and have a slightly more informed conversation with your medical practitioner.
They’re not a replacement for your actual doctor, but they can help you learn and have better discussions with your actual doctor.
TheHobbyist@lemmy.zip 4 weeks ago
One thing I find useful is turning installation/setup instructions into Ansible roles and tasks. If you’re unfamiliar, Ansible is a tool for automated configuration of large-scale server infrastructure. In my case I only manage two servers, but it is useful to parse instructions and convert them to Ansible, helping me learn and understand Ansible at the same time.
Here is an example of instructions which I find interesting: how to set up Docker on Alpine Linux: wiki.alpinelinux.org/wiki/Docker
Results are actually quite good even for smaller 14B self-hosted models like the distilled versions of DeepSeek, though I’m sure there are other usable models too.
I find it helpful for programming too (both to execute and to learn).
I would not rely on it for factual information, but it usually does a decent job of pointing in the right direction. Another use I have is helping with spell-checking in a foreign language.
chaosCruiser@futurology.today 4 weeks ago
Here’s a bit of code that’s supposed to do stuff. I got this error message. Any ideas what could cause this error and how to fix it? Also, add this new feature to the code.
Works reasonably well as long as you have some idea how to write the code yourself. GPT can do it in a few seconds; debugging it would take like 5-10 minutes, but that’s still faster than my best. Besides, GPT is also fairly fluent in many functions I have never used before. My approach would be clunky and convoluted, while the code generated by GPT is a lot shorter.
slaacaa@lemmy.world 4 weeks ago
I have it write emails for me in German. I moved there not too long ago; it works wonders for getting doctor’s appointments, car service, etc. I also have it explain the text, so I’m learning the language.
I also use it as an alternative to internet search, which is now terrible. It’s not going to help you find something super location-specific, but I can ask it to tell me something about a game/movie without spoilers, or list Metacritic scores in a table, etc.
It also works great at summarizing long texts.
An LLM is a tool; what matters is how you use it. It is stupid, it doesn’t think, and it’s mostly hype to call it AI. But it definitely has its benefits.
scarabic@lemmy.world 4 weeks ago
We have one that indexes all the wikis and GDocs and such at my work and it’s incredibly useful for answering questions like “who’s in charge of project 123?” or “what’s the latest update from team XYZ?”
I even asked it to write my weekly update for MY team once and it did a fairly good job. The one thing I thought it had hallucinated turned out to be something I just hadn’t heard yet. So it was literally ahead of me at my own job.
I get really tired of all the automatic hate over stupid bullshit like this OP. These tools have their uses. It’s very popular to shit on them. So congratulations for whatever agreeable comments your post gets. Anyway.
verdigris@lemmy.ml 4 weeks ago
I mean, I would argue that the answer in the OP is a good one. No human asking that question honestly wants to know the sum total of Rs in the word, they either want to know how many in “berry” or they’re trying to trip up the model.
dreadbeef@lemmy.dbzer0.com 4 weeks ago
“You’re holding it wrong”
Voyajer@lemmy.world 4 weeks ago
This, but actually. Don’t use an LLM to do things LLMs are known to not be good at. As tools, the various companies would do well to list out specifically what they’re not good at, to eliminate requiring background knowledge before even using them - not unlike needing to know that one corner of those old iPhones was an antenna and not to bridge it.
TheGrandNagus@lemmy.world 4 weeks ago
I think there’s a fundamental difference between someone saying “you’re holding your phone wrong, of course you’re not getting a signal” to millions of people and someone saying “LLMs aren’t good at that task you’re asking it to perform, but they are good for XYZ.”
If someone is using a hammer to cut down a tree, they’re going to have a bad time. A hammer is not a useful tool for that job.
Prandom_returns@lemm.ee 4 weeks ago
So for something you can’t objectively evaluate? Looking at Apple’s garbage generator, LLMs aren’t even good at summarising.
balder1991@lemmy.world 3 weeks ago
For reference:
AI chatbots unable to accurately summarise news, BBC finds
It reminds me that I basically stopped using LLMs for any summarization after this exact thing happened to me. I realized that without reading the source text, I wouldn’t be able to know whether the output had all the info or whether some of it was made up.