The delta between an LLM and consciousness
With Facebook’s release of LLaMA, and the subsequent work the open community has done with its models, it’s now possible to run a state-of-the-art “GPT-3 class” LLM on regular consumer hardware. The 13B model, quantized to 4 bits, runs fine on a GPU with ~9 GB of free VRAM.
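As a rough sanity check on that ~9 GB figure, here’s a back-of-the-envelope estimate, assuming the 4-bit weights dominate and the rest is context cache and overhead:

```python
# Back-of-the-envelope VRAM estimate for a 4-bit quantized 13B model.
params = 13e9                     # 13 billion parameters
bits_per_param = 4                # 4-bit quantization
weight_bytes = params * bits_per_param / 8
print(f"Weights alone: {weight_bytes / 2**30:.1f} GiB")  # ~6.1 GiB
# The remaining headroom up to ~9 GB goes to the context (KV cache),
# activations and framework overhead, which grow with context length.
```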
I spent 40 minutes chatting with one yesterday, and the experience was almost flawless.
So why is Troed playing around with locally hosted LLM “chatbots”?
No, not just because they’re hilarious ;) A decade ago I spent a good amount of time on the consciousness research of the day. Susan Blackmore’s books in particular, and Douglas Hofstadter’s “I Am a Strange Loop”, had a large impact on what I consider to be “my” theory of what consciousness is, and of what the difference is between “more or less smart”, both among humans and between humans and other animals.
I believe the way these LLMs work is, in a way, close to how humans store and recall “memories”. Since these bots work with language, and language is how we communicate, they can partly capture “memories” through how we describe them.
What - I think - would be the steps from an LLM to something that could be … conscious? (There’s a rough sketch of how the pieces might fit together in code after the list.)
- Crystallization: An LLM today is trained on a dataset, which isn’t then updated with use. Humans take new knowledge into working memory, and then (likely while sleeping) that knowledge modifies our “trained dataset” for subsequent use.
- Exploration: This is one of the differences between animals and humans (and among humans). How many “future possibilities” do we explore before we act or answer? “If I do/say this, then they might do/say that …” Exploration affects future interactions. An LLM can “explore” different answers to a question using different seeds, but there’s no feedback on the value of the likely responses.
- Noise: An idle LLM does nothing, while a human brain is never idle. We constantly get noisy input from sounds, from air moving against the hairs on our bodies, and so on. That stream of low-level noise into our neural networks causes thoughts and dreams, which in turn cause other thoughts and dreams, in a loop, and all of them modify our experiences. Likewise, an LLM would need to “experience” things happening even when idle to be able to evolve a persona.
- Persistence: An LLM today is used by booting it up from its trained dataset, generating a session of interaction, and then turning it off again. To be able to hold on to a consistent persona, the LLM would need to … not be killed over and over.
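To make those four points a bit more concrete, here’s a rough sketch of how they could fit together in one loop. To be clear: `generate` and `score` are hypothetical stand-ins for a call into a locally hosted model and for some kind of value feedback, and the “crystallization” step is just a stub. This is a thought experiment written as code, not an implementation.

```python
# A sketch of the four points as one loop. `generate` and `score` are
# hypothetical placeholders, not real APIs.
from __future__ import annotations

import json
import random
from pathlib import Path

STATE_FILE = Path("persona_state.json")  # hypothetical on-disk persona state


def generate(prompt: str, seed: int) -> str:
    """Placeholder for a call into a locally hosted LLM."""
    random.seed(seed)
    return f"(a response to {prompt!r}, sampled with seed {seed})"


def score(response: str) -> float:
    """Placeholder for feedback on the value of a candidate response."""
    return random.random()


def load_state() -> dict:
    # Persistence: resume the persona instead of rebooting from raw weights.
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"history": []}


def save_state(state: dict) -> None:
    STATE_FILE.write_text(json.dumps(state))


def explore(prompt: str, n_candidates: int = 4) -> str:
    # Exploration: try several "futures" and keep the one that scores best.
    candidates = [generate(prompt, seed) for seed in range(n_candidates)]
    return max(candidates, key=score)


def idle_noise(state: dict) -> None:
    # Noise: with nobody talking to it, feed a random fragment of past
    # experience back in, so "thoughts" keep modifying the history.
    if state["history"]:
        state["history"].append(explore(random.choice(state["history"])))


def crystallize(state: dict) -> None:
    # Crystallization: periodically fold the accumulated history back into
    # the weights (a fine-tuning pass, stubbed out here).
    pass


def step(state: dict, user_input: str | None) -> None:
    if user_input is None:
        idle_noise(state)          # an idle "daydreaming" tick
    else:
        reply = explore(user_input)
        state["history"].extend([user_input, reply])
        print(reply)
    crystallize(state)
    save_state(state)              # never "killed", just paused


if __name__ == "__main__":
    state = load_state()
    step(state, "Hello again")
    step(state, None)
```

Each point maps onto one piece of the sketch: `crystallize` for crystallization, `explore` for exploration, `idle_noise` for noise, and the `load_state`/`save_state` pair for persistence.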
I think the four points above would give rise to something that is “conscious” in some respects, and I don’t think we’re too far off from seeing it happen.