Comment on Turns out Generative AI was a scam

partial_accumen@lemmy.world 19 hours ago

LLMs are not capable of creating anything, including code. They are enormous word-matching search engines that try to find and piece together the closest existing examples of what is being requested. If what you’re looking for is reasonably common, that may be useful.

Just for common understanding, you’re making blanket statements about LLMs as though they apply to all LLMs. You’re not wrong if you’re speaking generally of the LLMs deployed for retail consumption, such as ChatGPT. None of what I’m saying here is a defense of how these giant companies are using LLMs today. I’m just posting from a Data Science point of view on the technology itself.

However, if you’re talking about LLM technology itself, from a Data Science view, your statements may not apply. The common sampling hyperparameters for LLMs favor the most likely next token (as in the ChatGPT example), but nothing about the technology requires that. In fact, you can set a model to specifically exclude the top result, or even to choose the least likely result. What comes out with these settings is truly strange and looks like absolute garbage, but it is unique. The result is something that likely hasn’t existed before. I’m not saying this is a useful exercise. It’s the most extreme version, to illustrate the point. There’s also the “temperature” hyperparameter, which controls how much randomness goes into each selection. If you crank it up, the probabilities flatten out and the model starts picking unlikely tokens far more often, resulting in pretty wild (and potentially useless) results.
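
To make the sampling options above concrete, here is a minimal toy sketch in plain Python. It is not any real library's API: the function names (`softmax`, `pick_token`), the strategy labels, and the example logits are all made up for illustration, and a real model would produce scores over tens of thousands of tokens rather than four.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, strategy="greedy", temperature=1.0, rng=random):
    """Pick a token index from hypothetical next-token scores.

    Strategies (illustrative names, not a standard):
      greedy       -> always the most likely token
      sample       -> random draw weighted by probability
      exclude_top  -> weighted draw with the most likely token removed
      least_likely -> always the least likely token
    """
    probs = softmax(logits, temperature)
    # Token indices ordered from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if strategy == "greedy":
        return ranked[0]
    if strategy == "least_likely":
        return ranked[-1]
    if strategy == "exclude_top":
        candidates = ranked[1:]  # needs at least two tokens in the vocabulary
        weights = [probs[i] for i in candidates]
        return rng.choices(candidates, weights=weights, k=1)[0]
    if strategy == "sample":
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up scores for a four-token vocabulary.
example_logits = [4.0, 2.0, 1.0, 0.5]
print(pick_token(example_logits, "greedy"))
print(pick_token(example_logits, "exclude_top"))
print(pick_token(example_logits, "sample", temperature=5.0))
```

Running `pick_token` in a loop, feeding each choice back in as context, is roughly how text gets generated one token at a time. Swapping the strategy or raising the temperature changes nothing about the model itself, only which of its scored candidates gets picked.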

What many Data Scientists do when trying to get LLMs to generate something truly new and unique is balance these settings so that new, useful combinations come out rather than absolute useless garbage.
