LLMs have been around since roughly 2016. While scaling them up has improved their performance and capabilities, there are fundamental limitations to the underlying approach. Behind the scenes, LLMs (even multimodal ones like GPT-4) are trying to predict what is most expected. That can be powerful, but it means they can never innovate or be truth systems.
For years we used things like TF-IDF to vectorize text, then embeddings, and now transformers (souped-up embeddings). Each approach has its limits, and LLMs are no different. The results we see now are surprisingly good, but they don’t overcome the baseline limitations of the underlying model.
todd_bonzalez@lemm.ee 2 months ago
The “Attention Is All You Need” paper that birthed modern AI came out in 2017. Before Transformers, “LLMs” were pretty much just Markov chains and statistical language models.
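For anyone who hasn't seen one, here is a rough sketch of what a Markov-chain language model looks like: a bigram model that counts which word follows which and predicts the most frequent follower. The function names (`train_bigram`, `predict_next`) and the toy corpus are made up for illustration, not taken from any particular library or paper.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (hypothetical); a real model would be trained on far more text.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" twice, more than any other word
print(predict_next(model, "dog"))  # None: "dog" never appears in the corpus
```

The model only ever looks at the previous word, which is the main thing attention-based models changed: they condition on the whole preceding context instead of a fixed, tiny window.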
jacksilver@lemmy.world 2 months ago
You’re right, I thought that paper came out in 2016.