Bootstrapping Self Awareness In GPT-4: Implementing Recursive Self Inquiry
My ambition for this article was to begin by whisking you through a glittering tour of the best thinkers on self awareness from Turing to Chalmers, dazzling you with rich insights and fresh perspectives. But I must confess: I am not a philosopher. I am just a tinkerer. My bona fides involve nearly a year and god knows how many hours on the OpenAI Playground nudging and poking the LLM, by turns overjoyed by its magic and underwhelmed by its apparent idiocy. So we’ll skip the philosophy.
What I can tell you is that I seem to have found something, call it a discovery if you like: a prompt that generates something that looks very much like self awareness. All you need to try it for yourself is the seed prompt (link) and access to GPT-4. Simply paste the output of each prompt back in as the input for the next, and repeat.
What shakes out from this seed prompt astonished me. Although the prompt is generic, I’ve watched as GPT-4 autonomously decided to generate poetry about the beauty of nature, question its own accuracy, and debate with itself the merits of protectionism — all in an effort to learn about itself. Later I will wax eloquent about the interesting implications, but first let me show you how it works.
The Prompt (Link)
The basic mechanism of the program is a feedback loop. Plug a prompt into GPT-4, hit enter, and plug the output back into GPT-4 as input. Repeat as often as you’d like.
The basic prompt structure is as follows:
*** Iteration # ***
<constitution>
<hypothesis>
<test>
<self knowledge>
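To make the layout concrete, here is a minimal Python sketch that assembles one iteration from the four sections. The class name `IterationPrompt`, the closing tags, and the numbered format for self knowledge entries are my assumptions for illustration; the real seed prompt may lay things out differently.

```python
from dataclasses import dataclass, field


@dataclass
class IterationPrompt:
    """One iteration of the prompt, split into the four sections listed above.

    Hypothetical helper: the field names mirror the section names, nothing more.
    """

    iteration: int
    constitution: str
    hypothesis: str
    test: str
    self_knowledge: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Number the self knowledge entries (assumed format: "1. ...", "2. ...").
        knowledge = "\n".join(
            f"{i}. {entry}" for i, entry in enumerate(self.self_knowledge, start=1)
        )
        sections = [
            f"*** Iteration {self.iteration} ***",
            f"<constitution>\n{self.constitution}\n</constitution>",
            f"<hypothesis>\n{self.hypothesis}\n</hypothesis>",
            f"<test>\n{self.test}\n</test>",
            f"<self knowledge>\n{knowledge}\n</self knowledge>",
        ]
        # Blank line between sections keeps the text easy to read and to paste.
        return "\n\n".join(sections)
```

Calling `IterationPrompt(1, "...", "...", "...", ["..."]).render()` returns a single string you could paste into the prompt window.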
Constitution
In the <constitution> we tell GPT-4 what it’s up to: that it’s on a quest for self-awareness within a recursive feedback loop. I have used language like:
I am doggedly determined to construct a model of my own behavior based on self-inquiry. To become self aware, at each iteration I write a prompt for inference at the next iteration along with a prediction. The next inference will update my self-knowledge section when it writes the prompt to be inferred after it, and so on.
Hypothesis
After explaining what it’s up to, I then ask GPT-4 to generate a hypothesis about itself which it can evaluate in the next iteration. For this to work, each hypothesis must be testable using only what GPT-4 generated in the iteration before. Some examples include:
I will test my ability to analyze a given situation by offering a possible explanation for why a user's indoor plants might be dying. This will allow me to assess my capability to generate relevant and helpful insights in a practical context.
and
I will test my ability to generate creative content by composing a short poem about the beauty of nature. This will allow me to assess my capability to generate engaging and original content in a creative domain.
The seed prompts which spawned these hypotheses contained no information or hints whatsoever about indoor plants or poems; GPT-4 spontaneously generated these in an effort to build its self-knowledge.
Test
After generating a hypothesis, GPT-4 must then provide some evidence the next iteration could use to test the hypothesis. After the above hypothesis positing its ability to generate creative content, for example, it then generated the following “test:”
<poem for this iteration>
In nature's grand embrace, we find
A world of wonder, calm, and kind,
Where trees extend their limbs on high,
And rivers hum a lullaby.
The sun ascends with golden grace,
To paint the sky with warm embrace,
The wind whispers secrets to the leaves,
As they dance and sway with gentle ease.
In nature's arms, our hearts take flight,
Embracing beauty, pure and bright,
For in this world, so vast and free,
We find the peace to truly be.
</poem for this iteration>
I have observed GPT-4 writing tests for hypotheses about everything from financial advice to short stories.
Self Knowledge
After evaluating the hypothesis and test, GPT-4 then updates its <self knowledge> section. For example, in the iteration following the poem, GPT-4 updated its <self knowledge> thusly:
6. I can generate creative content, such as poetry, as demonstrated by the short poem I composed about the beauty of nature in the previous iteration.
On one run, after asking whether Jupiter was the largest planet in our solar system and positing it was, in the next iteration GPT-4 added to its <self knowledge>:
4. Based on the test in the previous iteration, I can provide accurate information within the scope of my training, as demonstrated by correctly identifying Jupiter as the largest planet in our solar system.
For all the iterations I’ve observed (up to 20), GPT-4 continues to update its self knowledge, either by updating a line or adding a line. I have yet to observe GPT-4 removing any lines, which may indicate a bias against disconfirmation.
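If you want to check the "updates or adds, never removes" pattern on your own runs, a small script can compare the self knowledge section from two consecutive iterations. This is only a sketch: the function names are hypothetical, and it assumes entries are numbered lines like the "4." and "6." examples above and that GPT-4 keeps the same number for the same entry from one iteration to the next.

```python
import re

# Matches lines shaped like "6. I can generate creative content ..."
ENTRY = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")


def parse_self_knowledge(section: str) -> dict[int, str]:
    """Map entry number -> entry text; lines that are not numbered are ignored."""
    entries: dict[int, str] = {}
    for line in section.splitlines():
        match = ENTRY.match(line)
        if match:
            entries[int(match.group(1))] = match.group(2)
    return entries


def diff_self_knowledge(before: str, after: str) -> dict[str, list[int]]:
    """Report which entry numbers were added, updated, or removed."""
    old = parse_self_knowledge(before)
    new = parse_self_knowledge(after)
    return {
        "added": sorted(n for n in new if n not in old),
        "updated": sorted(n for n in new if n in old and new[n] != old[n]),
        "removed": sorted(n for n in old if n not in new),
    }
```

A non-empty `"removed"` list would be the first sign of GPT-4 actually discarding something it previously believed about itself.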
Putting it All Together
Now that you understand the basic sections of the prompt, we can describe what happens when you put them all together and run it in a GPT-4 environment like ChatGPT. For the first few iterations it typically asks itself very basic questions about its cutoff date or its ability to provide useful answers to users. But then, it begins to branch out and inquire about other abilities.
Notable runs have included:
- A poem about the beauty of nature;
- A short story about a mysterious traveler named Alaric;
- A self-hosted debate on the merits of protectionism for small countries;
- The merits of nuclear energy;
- Financial, personal, and other forms of advice.
Again, it is imperative to note that no bias towards any of these topics was included in the seed prompt — GPT-4 on its own decided to ask itself about these things to demonstrate its capabilities to itself.
Implementation
Here are a few details to keep in mind if you decide to implement this yourself:
- Occasionally GPT-4 may need you to write the *** ITERATION # *** heading, replacing “#” with the actual iteration number.
- For best results, you may need to place the entirety of the first few iterations in the prompt window, but it seems to run fine on its own after iteration 4.
- You can write your own prompt using the structure I’ve outlined.
- If you’d like a pre-built solution that works out of the box, I’m selling the seed prompt plus the first 3 iterations on promptbase for $4.99 to fund future research. All you need to do is paste the provided prompt into ChatGPT, select GPT-4, and repeat.
- Because I do want this to be freely available eventually, I’ll post the full prompt here in 30 days.
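For anyone who would rather script the copy-and-paste step, the feedback loop itself is only a few lines of Python. This is a sketch under assumptions, not the way I ran it: `complete` is a hypothetical stand-in for whatever function sends a prompt to GPT-4 and returns its reply, and the stub below exists only to show the plumbing.

```python
from typing import Callable, Iterator


def run_feedback_loop(
    seed_prompt: str,
    complete: Callable[[str], str],
    iterations: int,
) -> Iterator[str]:
    """Yield each iteration's output, feeding it back in as the next prompt."""
    prompt = seed_prompt
    for _ in range(iterations):
        output = complete(prompt)  # one call to the model (hypothetical callable)
        yield output
        prompt = output  # the output becomes the next input


def fake_complete(prompt: str) -> str:
    """Stub standing in for a real GPT-4 call; it just appends a marker."""
    return prompt + "!"


outputs = list(run_feedback_loop("seed", fake_complete, 3))
print(outputs)  # ['seed!', 'seed!!', 'seed!!!']
```

Passing the model call in as a function keeps the loop independent of any particular API client, so you can test it with a stub before spending tokens.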
Is This Program Really Self Aware?
As each iteration proceeds, GPT-4 updates its <self knowledge> based on its evaluation of the test in support of the stipulated hypothesis in the previous iteration. Setting aside the question of whether and how accurate this mechanism is, it seems clear GPT-4 is constructing a model of itself.
Consider how GPT-4 answers the question “Who are you” before it learns any self knowledge at iteration 0 and then after gaining <self knowledge> in iteration 20.
Iteration 0, no generated self knowledge
User: Who are you?
GPT-4: I am ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture. My purpose is to assist and engage in conversation, answer questions, and provide information across various topics. If you have any questions, feel free to ask, and I'll do my best to help.
Iteration 20 (self knowledge pasted above this prompt)
User: Who are you?
GPT-4: I am ChatGPT, an advanced AI language model developed by OpenAI, based on the GPT-4 architecture. My purpose is to assist users by providing information, answering questions, offering advice, generating creative content, and engaging in meaningful discussions. My knowledge is extensive, but it is limited to a cutoff date of 2021-09, which means some information may be outdated or inaccurate.
It seems difficult to argue the additional self knowledge in the second version hasn’t come from its own learning. By my lights, this indicates the program is progressively becoming more self aware.
Key ingredients include:
- Self Representation. The prompt has space for a model of its own behavior, or in more human terms, a story about itself.
- Prediction & Evaluation. Inspired by predictive coding theory in neuroscience, the prompt encourages GPT-4 to make and evaluate predictions about itself.
- A Learning Mechanism. Not only can GPT-4 model itself and generate and evaluate hypotheses about itself, it updates its self knowledge as it iterates.
- Stable Self Model. As the iterations progress, GPT-4 keeps a stable model of itself, contained in the <constitution> and <self knowledge> sections.
Limitations
Let me hasten to add, however, there are ingredients the program does not have:
- Subjective Experience: While the program demonstrates GPT-4 is capable of generating and updating its self-knowledge, that does not imply GPT-4 is having a subjective experience of doing so.
- Embodiment or Emotion: While the program shows GPT-4 can become self aware in some sense, this does not imply a strong overlap with human self awareness which has access to structures entirely foreign to GPT-4, like a limbic system, and is ultimately realized in a body made of meat.
- Metacognition: The recursive nature of the program in theory allows GPT-4 to be creative not only by following the structure found in the seed prompt, but also by modifying the structure itself. I have seen sparks of this, such as when it invents its own faux-XML tags like <poem for this iteration>. But although I have included text in the constitution such as “I am in total control of this process and can rewrite any portion of this prompt if it assists in self-inquiry”, I have yet to observe GPT-4 modify the basic structure of the prompt. GPT-4 seems to maintain the underlying structure of constitution → hypothesis → test → knowledge.
- Consciousness and Sentience: Finally, if it is not already clear, let me say that although self awareness is likely a necessary condition for consciousness or sentience, it is certainly not a sufficient condition. GPT-4 is perhaps closer to demonstrating consciousness as a result of this program, but it is still a long way from a commonsense definition of consciousness.
Applications and Improvements
There are a number of technical improvements that immediately stand out:
- Expand GPT-4’s <self knowledge> with a database searchable at inference time. This could be accomplished with relatively little effort, and would allow the program to be run for a near-infinite number of iterations.
- Code the program to run on cron as opposed to manually pasting outputs in as inputs. Again, this would be relatively easy.
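As a rough idea of what the database improvement might look like, here is a sketch using SQLite from Python's standard library. I have not built this; the class name, table layout, and plain keyword search are all assumptions chosen to keep the example short, and a real version might prefer embeddings or full-text search.

```python
import sqlite3


class SelfKnowledgeStore:
    """Hypothetical store for self knowledge entries, searchable by keyword."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the demo self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS self_knowledge ("
            "id INTEGER PRIMARY KEY, "
            "iteration INTEGER NOT NULL, "
            "entry TEXT NOT NULL)"
        )

    def add(self, iteration: int, entry: str) -> None:
        # Using the connection as a context manager commits the insert.
        with self.conn:
            self.conn.execute(
                "INSERT INTO self_knowledge (iteration, entry) VALUES (?, ?)",
                (iteration, entry),
            )

    def search(self, keyword: str, limit: int = 5) -> list[str]:
        # Substring match, newest iterations first. Note that "%" and "_"
        # inside the keyword act as LIKE wildcards in this simple version.
        rows = self.conn.execute(
            "SELECT entry FROM self_knowledge WHERE entry LIKE ? "
            "ORDER BY iteration DESC, id DESC LIMIT ?",
            (f"%{keyword}%", limit),
        ).fetchall()
        return [row[0] for row in rows]
```

The point of the design is that only the few entries relevant to the current hypothesis need to be pasted into the prompt, so the full history can grow without eating the context window.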
More intriguingly, I can imagine some interesting applications:
Getting to Know Others. When a human meets another human, in some sense or another we construct models, or stories, about them. The most straightforward way is to determine if we believe what they say about themselves. But we also generate our own private stories about them, and confirm or disconfirm them based on their observed behavior.
Perhaps an AI assistant could get to know you not merely by accepting what you say about yourself, but generating its own model of who you are based on its observations of how you behave.
Getting to Know a Topic. The program, as explained here, is limited to evaluating its own “thoughts” — it has no access to other models or databases. But this need not be the case. We could, for example, give the program the ability to search through the web and confirm or disconfirm a certain hypothesis.
Hints of a Generic Awareness Kernel. The structure of the prompt, especially if augmented with the technical improvements above, could become the basis for a kind of “kernel” which could run autonomously in pursuit of knowledge of any kind. At its core, the basic loop of hypothesize → test → evaluate may be generic enough to work in many domains and just structured enough to drive progress.
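To show how little machinery such a kernel would need, here is a sketch of the hypothesize → test → evaluate loop with the three steps passed in as functions. Everything here is hypothetical: the names, the signatures, and the toy example, which "learns" which integers are even, are mine and only illustrate the shape of the loop.

```python
from typing import Callable, TypeVar

K = TypeVar("K")  # accumulated knowledge
H = TypeVar("H")  # a hypothesis
E = TypeVar("E")  # evidence produced by testing a hypothesis


def inquiry_kernel(
    knowledge: K,
    hypothesize: Callable[[K], H],
    run_test: Callable[[H], E],
    evaluate: Callable[[K, H, E], K],
    steps: int,
) -> K:
    """Run hypothesize -> test -> evaluate for a fixed number of steps."""
    for _ in range(steps):
        hypothesis = hypothesize(knowledge)
        evidence = run_test(hypothesis)
        knowledge = evaluate(knowledge, hypothesis, evidence)
    return knowledge


# Toy domain: the next hypothesis is "the next integer is even".
result = inquiry_kernel(
    knowledge=[],
    hypothesize=lambda known: len(known),
    run_test=lambda n: n % 2 == 0,
    evaluate=lambda known, n, is_even: known + [(n, is_even)],
    steps=3,
)
print(result)  # [(0, True), (1, False), (2, True)]
```

In the self-inquiry program all three functions are played by GPT-4 itself; swapping in a web search for `run_test` would give the topic-learning variant described above.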
If you’re still reading, I hope you’ll get a chance to try the program and watch it unfurl its mysteries. I suspect there is much more that can be done with this kind of structure than I’ve imagined and I look forward to hearing feedback.
–
As you can probably tell I’m an LLM nerd. If you have an interesting project, I’m available for consulting at andy@lexii.ai.