Comment on The Inventor Behind a Rush of AI Copyright Suits Is Trying to Show His Bot Is Sentient

hedgehog@ttrpg.network 1 year ago

> What stupid bullshit. There is nothing remotely close to an artificial general intelligence in a large language model.

Correct, but I haven’t seen anything suggesting that DABUS is an LLM. My understanding is that it’s basically made up of two components:

  1. An array of neural networks
  2. A supervisor component (which its creator calls a “thalamobot”) that manages those networks and notices when they’ve come up with something worth exploring further. The supervisor can also direct the neural networks and trigger other algorithms.

Other than using machine vision and machine hearing (“acoustic processing algorithms”) to supervise the neural networks, I haven’t found any description of how the thalamobot functions. Machine vision / hearing could leverage ML but might not, and either way I’d be more interested in how it determines what to prioritize and which additional algorithms to trigger than in how it integrates with the supervised system.
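
For illustration only, here’s a minimal Python sketch of that two-component shape: an array of generator networks plus a supervisor that watches their outputs and flags candidates worth pursuing. Every name in it (make_toy_network, Supervisor, the novelty score) is my invention; none of it comes from Thaler’s actual implementation, which isn’t publicly described at this level of detail.

```python
# Illustrative toy only: names and structure are assumptions, not
# Thaler's design. It just shows the "array of networks + supervisor" shape.
import random
from typing import Callable, List, Tuple


def make_toy_network(seed: int) -> Callable[[], Tuple[str, float]]:
    """Stand-in for one neural network: emits a (candidate, novelty) pair."""
    rng = random.Random(seed)

    def generate() -> Tuple[str, float]:
        candidate = f"idea-{seed}-{rng.randint(0, 999)}"
        novelty = rng.random()  # placeholder for a learned novelty signal
        return candidate, novelty

    return generate


class Supervisor:
    """Toy analogue of the 'thalamobot': watches the networks' outputs
    and flags candidates worth exploring further."""

    def __init__(self, networks: List[Callable[[], Tuple[str, float]]],
                 threshold: float = 0.8) -> None:
        self.networks = networks
        self.threshold = threshold

    def step(self) -> List[str]:
        promising = []
        for net in self.networks:
            candidate, novelty = net()
            if novelty >= self.threshold:
                # This is where a real supervisor would redirect the
                # network or trigger other algorithms.
                promising.append(candidate)
        return promising


supervisor = Supervisor([make_toy_network(s) for s in range(8)])
print(supervisor.step())
```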

> This person is a crackpot fool.

As far as I can tell, probably, but not necessarily.

> There is no way for an LLM to have persistent memory. Everything outside of the model that pre- and post-processes info is where the smoke and mirrors exist. This is just databases and standard code.

Ignoring Thaler’s claims, a supervisor could theoretically be used in conjunction with an LLM to “learn” by re-training or fine-tuning the model. That’s expensive and doesn’t provide a ton of value, though.
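
As a rough sketch of what that supervisor loop could look like (with fine_tune() as a purely hypothetical stand-in for a real training job, e.g. a LoRA run):

```python
# Hypothetical sketch of a supervisor that "learns" by periodically
# re-training; fine_tune() is a stand-in, not a real library call.
from typing import List


def fine_tune(dataset: List[str]) -> None:
    """Placeholder for an actual fine-tuning job (e.g., a LoRA run)."""
    print(f"fine-tuning on {len(dataset)} new examples...")


class RetrainingSupervisor:
    def __init__(self, retrain_threshold: int = 1000) -> None:
        self.buffer: List[str] = []
        self.retrain_threshold = retrain_threshold

    def observe(self, new_fact: str) -> None:
        """Accumulate new information; re-train once enough has piled up."""
        self.buffer.append(new_fact)
        if len(self.buffer) >= self.retrain_threshold:
            fine_tune(self.buffer)  # the expensive step
            self.buffer.clear()


sup = RetrainingSupervisor(retrain_threshold=3)
for fact in ["fact 1", "fact 2", "fact 3"]:
    sup.observe(fact)
```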

That said, a database / external process for retaining and injecting context into an LLM isn’t smoke and mirrors when it comes to persistent memory; the main difference compared to re-training is that the LLM itself doesn’t change. There are other limitations, too. But suppose I have an LLM that can handle an 8k-token context, where the first 4k is used (including during training) to inject summaries of situational context and of currently relevant topics/concepts, and the last 4k is used like traditional context. That gives you a lot of what persistent memory would provide. Combine it with the ability for the system to re-train as needed to assimilate new knowledge bases and you’re all the way there.
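
A minimal sketch of that split-context scheme, assuming a naive whitespace “tokenizer” and an in-memory list standing in for the model’s real tokenizer and the summary database:

```python
# Minimal sketch of the split-context scheme above. A naive whitespace
# "tokenizer" and an in-memory list stand in for the model's real
# tokenizer and the summary database.
from typing import List

CONTEXT_TOKENS = 8192
MEMORY_BUDGET = CONTEXT_TOKENS // 2  # first 4k: injected summaries
RECENT_BUDGET = CONTEXT_TOKENS // 2  # last 4k: traditional context


def count_tokens(text: str) -> int:
    return len(text.split())  # crude placeholder


def take_within_budget(chunks: List[str], budget: int) -> List[str]:
    """Keep the most recent chunks that fit inside the token budget."""
    kept: List[str] = []
    used = 0
    for chunk in reversed(chunks):
        cost = count_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return list(reversed(kept))  # restore chronological order


def build_prompt(memory_summaries: List[str], recent_turns: List[str]) -> str:
    memory = take_within_budget(memory_summaries, MEMORY_BUDGET)
    recent = take_within_budget(recent_turns, RECENT_BUDGET)
    return "\n".join(["[persistent memory]", *memory,
                      "[conversation]", *recent])


print(build_prompt(["User prefers concise answers."],
                   ["User: hi", "Assistant: hello"]))
```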

That’s still not an AGI or even an attempt at one, of course.
