
The delta between an LLM and consciousness

March 13, 2023

With Facebook’s release of LLaMA, and the subsequent work the open community has done with its models, it’s now possible to run a state-of-the-art “GPT-3 class” LLM on regular consumer hardware. The 13B model, quantized to 4 bits, runs fine on a GPU with ~9 GB of free VRAM.
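
For reference, here is a minimal sketch of how such a 4-bit quantized model can be run locally, using the llama-cpp-python bindings. The model file name, prompt and parameters are placeholder assumptions for illustration; the post itself doesn’t say which tooling was used.

```python
# Minimal sketch: running a locally stored, 4-bit quantized 13B model
# via llama-cpp-python. The model path and settings below are assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-13b.q4_0.gguf",  # placeholder: any 4-bit quantized model file
    n_ctx=2048,        # context window size
    n_gpu_layers=-1,   # offload all layers to the GPU (the quantized 13B fits in ~9 GB VRAM)
)

out = llm("Q: What is consciousness?\nA:", max_tokens=128, stop=["Q:"])
print(out["choices"][0]["text"])
```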

I spent 40 minutes chatting with one yesterday, and the experience was almost flawless.

Image generated by locally hosted Stable Diffusion

So why is Troed playing around with locally hosted LLM “chatbots”?

No, not just because they’re hilarious ;) A decade ago I spent a good amount of time reading the then-current research on consciousness. In particular, Susan Blackmore’s books and Douglas Hofstadter’s “I Am a Strange Loop” had a large impact on what I consider to be “my” theory of what consciousness is, and of what the difference is between “more or less smart”, both among humans and between humans and other animals.

I believe the way these LLMs work is, in a way, close to how humans store and recall “memories”. Since these bots work with language, and language is how we communicate, they can partly capture “memories” through how those memories are described.

What – I think – would be the steps from an LLM to something that could be … conscious?

  1. Crystallization: An LLM today is trained on a dataset which isn’t then updated with use. We humans acquire new knowledge into our working memory, and then (likely while sleeping) this knowledge modifies our “trained dataset” for subsequent use.
  2. Exploration: This is one of the differences between animals and humans (and among humans): how many “future possibilities” do we explore before we act or answer? “If I do/say this, then they might do/say that …”. Exploration shapes future interactions. An LLM can “explore” answering a question differently using different seeds, but there’s no feedback on the value of the likely responses.
  3. Noise: An idle LLM does nothing. A human brain is never idle. We’re constantly getting noisy input from sounds, air moving against the hairs on our body, etc. There’s a stream of low-level noise into our neural networks, which causes thoughts and dreams. Those thoughts and dreams cause other thoughts and dreams, in a loop, and all of them modify our experiences. Likewise, an LLM needs to “experience” things happening even when idle to be able to evolve a persona.
  4. Persistence: An LLM today is used by booting it up from its trained dataset, generating a session of interaction, and then turning it off again. To be able to hold on to a consistent persona the LLM would need to … not be killed over and over. (A toy sketch combining these four points follows this list.)
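
To make the four points less abstract, here is a very rough Python sketch of what such a loop might look like. Everything in it – the helper functions, the scoring step, the “crystallization” pass – is a hypothetical stand-in for the real mechanism, not an implementation of any existing system.

```python
# Hypothetical toy sketch of an "always-on" LLM persona combining the four points above.
# Every function here is an illustrative stand-in; none of this refers to a real library.
import random

def generate(state, event, seed):
    """Stand-in for sampling one candidate response from the LLM with a given seed."""
    random.seed(seed)
    return f"response-{random.randint(0, 999)} to {event!r}"

def score(state, event, candidate):
    """2. Exploration: stand-in for valuing 'if I say this, what might happen next?'."""
    return random.random()

def sense_noise():
    """3. Noise: stand-in for the constant low-level input an idle brain receives."""
    return random.choice(["hum", "draft of air", "distant voice"])

def crystallize(state):
    """1. Crystallization: stand-in for folding recent episodes back into the 'trained dataset'."""
    state["long_term"].extend(state["episodes"])
    state["episodes"].clear()

def persona_loop(state, turns=6):
    """4. Persistence: the same state object survives every turn instead of being reset."""
    for _ in range(turns):
        event = sense_noise()                      # idle input when no one is talking
        candidates = [generate(state, event, seed=i) for i in range(5)]
        best = max(candidates, key=lambda c: score(state, event, c))
        state["episodes"].append((event, best))
        if len(state["episodes"]) >= 3:            # "sleep" every few episodes
            crystallize(state)

state = {"episodes": [], "long_term": []}          # persisted across sessions, never thrown away
persona_loop(state)
print(state["long_term"])
```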

I think the four points above will give rise to something that would be “conscious” in some respects, and I don’t think we’re too far off from seeing it happen.
