It is far better than a modern search engine, although that is in part because of all of the SEO slop that Google has ingested. The fact that you need to think critically is not something new and it’s never going to go away either.
Very much disagree with that. Google got significantly worse, but LLM results are worse still. You do need to think critically about it, but with an LLM blurb there is no way to check for validity other than to do another search without the LLM to find sources, in which case why even bother with the generator in the first place? Otherwise you accept that some of your new info can be incorrect, and you don’t know which part.
With conventional search you have all the context of your result: the reputation of the website itself, the info about who wrote the article or whatever, the tone of the article, the comments, all the subtle clues that we learnt to pick up on, both from our lifetime experience on the internet and from a civilisation's worth of experience with human interaction. With the generator you have zero of that. You have something that is stated as fact, everything has the same weight and the same validity, and even when it cites sources, those can be just outright lies.
hoppolito@mander.xyz 1 hour ago
I think you really nailed the crux of the matter.
With the ‘autocomplete-like’ nature of current LLMs the issue is precisely that you can never be sure of any answer’s validity. Some approaches try to address this by showing ‘sources’ next to the answer, but that doesn’t mean those sources’ findings actually match the text output, and it’s not a given that the sources themselves are reputable. So you’re back to perusing those to make sure anyway.
If there were a meter of certainty next to the answers, this would be much more meaningful for serious use cases, but of course such a thing seems impossible to implement by design with the current approaches.
I will say that in my personal (hobby) projects I have found a few good use cases of letting the models spit out some guesses, e.g. for the causes of a programming bug or proposing directions to research in, but I am just not sold that the heaviness of all the costs (cognitive, social, and of course environmental) is worth it for that alone.