Comment on The Value of NVIDIA Now Exceeds an Unprecedented 16% of U.S. GDP

TropicalDingdong@lemmy.world 1 day ago

I mean, they aren’t a technological dead end, just as they aren’t a technological panacea.

You can absolutely use them as coding assistants. They can be used to fool people, sometimes quite effectively. There is definitely “something” going on under the hood, even if we don’t want to use words traditionally applied to the human experience like “learning” or “intelligence”. There is a surprising amount of consilience in current models, where you train to get good at task A but also get good at task B, for no obvious reason.

It’s clear to me that no amount of papier-mâché smeared over the half-glass-of-wine issue fixes it. There is something fundamental to the “gappiness” present in both LLMs’ knowledge and their appearance of logic. It’s becoming clear this is intrinsic to the architecture, and gluing in hot fixes isn’t going to change that. There is some very real underlying weirdness (the seahorse emoji, for example). Context windows still only create the mirage of a global state (and maybe, with a large enough window, that doesn’t matter relative to a human perspective). It’s also clear that nothing about LLMs or transformers overcomes basic principles of entropy or information theory: you can’t just model noise like some kind of infinite training cheat code.

From where we were (LSTMs) to where we are, they are easily a 100x improvement. ML now is MUCH better than ML 10 years ago, and it has everything to do with transformers.

When LLMs came onto the scene, attention and transformers were not new. What was new was the approach to training them, some clever tricks to get them to generalize, and making them utterly massive. But “Attention Is All You Need” had been published quite a while before this generation, and I promise you, if Google had seen the potential, they would not have released that research.
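
(For anyone unfamiliar with what “attention” actually computes, here is a minimal sketch of scaled dot-product attention, the core operation from that paper. It’s illustrative only; the function name and toy shapes are my own, not code from any of the systems discussed here.)

```python
# Minimal scaled dot-product attention (Vaswani et al., 2017), illustrative sketch.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_model)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                       # pairwise similarity between positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over each row
    return weights @ V                                    # each output is a weighted mix of values

# Toy usage: 4 tokens with 8-dimensional embeddings, attending to themselves.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```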

There will be stepwise and generational improvements to AI and ML. Even though transformers are what broke through to the mainstream, the progress is much more linear and continuous than it might at first appear. So we shouldn’t expect transformers to be the end state, nor should we expect the next major jump to come from them, or even necessarily from something novel. It may be that the tools for the next big jump are already here, just waiting to be applied in a clever way.
