In my last post (Human-centred AI), I discussed the practical applications of AI agents and divided them into “autonomous adaptive systems” and “human-centred interactive systems”. For both types of application, the agent has to monitor and manage its own information processing. This process is called metacognition, and fits in with the concept of cognitive architecture introduced earlier.
Metacognition is a huge topic, so this post is Part 1, which focuses on general background.
Human thinking about thinking
Human metacognition involves thinking about our mental experiences and decisions. We ask ourselves questions such as: “Did I understand this message correctly?” or “Have I forgotten anything?” Metacognition is often used in everyday situations when things go wrong. For example, if a hillwalker is following a route and finds that a landmark has not appeared when expected, they may ask: “Did I make a navigation error?” or “Is my route correct, but I am overestimating how fast I am going?”
These questions are metacognitive because they are attempting to diagnose mistakes in reasoning, such as navigation errors. In contrast, the hillwalker’s non-metacognitive reasoning solves problems in the outside world, such as determining the current location and planning a route to the destination.
Group metacognition is also possible. For example, in a meeting, the participants can talk (and think) about the agenda and the progress of the meeting instead of the topics on the agenda (such as budget or hiring).
Awareness and management
Metacognition also includes management of thought processes and emotions. For example, we make decisions about where to focus our attention, or we decide on a topic to learn about. In education, metacognitive strategies include learning new concepts by connecting them to familiar ones.
Similarly, awareness of our own cognitive biases is a metacognitive capability. To mitigate cognitive biases, we might look for a different perspective, or we might plan in advance to be alert to biases when making decisions. Metacognition also includes awareness and regulation of emotions and how they might cause biases.
Metacognitive AI systems
AI systems can also be metacognitive, although in a very simplified way. The architecture of such systems is usually divided into two levels:
Object-level: solving the problem (e.g. route planning, medical diagnosis).
Meta-level: reasoning about the methods used to solve the problem (e.g. algorithms, knowledge representations).
Metacognition happens on the “meta-level”. The term “meta-reasoning” is often used for this capability in AI. The term “reasoning” can include a wide range of problem-solving techniques which can happen on the meta-level or object-level.
A simple architecture
Figure 1 shows a simplified agent architecture with a meta-level and object-level. (This is based very roughly on the framework given by Cox and Raja here, but does not follow every detail). In the diagram, the boxes labelled “reasoning” and “meta-reasoning” have internal components that are not shown, for simplicity.
To get some idea of these components, the earlier post on AI agents has a diagram of a generic agent architecture which includes a model of the world, current state, data, decision process and a box labelled “goals and values” (also simplified). These are all summarised as the “reasoning” box inside the object-level box in Figure 1.
The meta-level can be divided into two processes:
Monitoring: monitor progress and recognise problems in object-level methods.
Control: make adjustments to object-level methods.
In the meta-level box in the diagram, these are shown as sensors and effectors respectively, along with a “meta-reasoning” box. Just as with “reasoning”, meta-reasoning can contain different components, such as models, data, and decision processes.
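To make this split a little more concrete, here is a minimal Python sketch of this kind of architecture. The class and method names (ObjectLevel, MetaLevel, monitor, control and so on) are my own illustrative choices, not components taken from Figure 1 or from Cox and Raja.

```python
# Minimal sketch of a two-level agent architecture (illustrative only).
# Names such as ObjectLevel, MetaLevel, monitor and control are assumptions
# made for this example, not part of any published framework.

class ObjectLevel:
    """Solves the external problem, e.g. route planning."""
    def __init__(self):
        self.plan = []        # current route plan
        self.progress = {}    # observations about plan execution

    def step(self, observation):
        # Update state and act; details omitted for simplicity.
        self.progress = observation
        return self.progress


class MetaLevel:
    """Reasons about how the object level is solving the problem."""
    def monitor(self, object_level):
        # "Sensor": collect data about ongoing object-level processing.
        return object_level.progress

    def control(self, object_level, assessment):
        # "Effector": adjust object-level methods if a problem is detected.
        if assessment.get("expectation_violated"):
            object_level.plan = []   # e.g. discard the plan and trigger replanning


class Agent:
    def __init__(self):
        self.object_level = ObjectLevel()
        self.meta_level = MetaLevel()

    def run_cycle(self, observation):
        self.object_level.step(observation)
        progress = self.meta_level.monitor(self.object_level)
        assessment = {"expectation_violated": progress.get("landmark_missing", False)}
        self.meta_level.control(self.object_level, assessment)


agent = Agent()
agent.run_cycle({"landmark_missing": True})   # meta-level clears the plan
```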
Meta-level monitoring
Monitoring happens when sensors collect data about ongoing processes in the object-level reasoning box (sensors can include data analysis as well). A common use of meta-level monitoring is the detection of expectation violations. Using navigation as an example, a robot could use an algorithm to predict the duration of a route, or it might have learned to recognise typical landmarks. If unexpected events occur, such as a predicted landmark failing to appear, this indicates a possible error in the navigation planning. A high level of unexpectedness can be detected by the meta-level sensors, and the meta-reasoning would then identify possible causes and determine whether its world model needs to be updated.
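As a rough illustration, a meta-level sensor for expectation violation could be as simple as comparing predictions with observations and flagging a large mismatch. The threshold and the structure of the predictions below are assumptions made for the example.

```python
# Illustrative sketch of meta-level monitoring via expectation violation.
# The prediction/observation structure and the threshold are assumptions.

def surprise(predicted, observed):
    """A crude 'unexpectedness' score: the fraction of predictions that failed."""
    failures = sum(1 for key, value in predicted.items()
                   if observed.get(key) != value)
    return failures / max(len(predicted), 1)

def monitor_navigation(predicted, observed, threshold=0.5):
    """Return a meta-level assessment of the object-level navigation."""
    score = surprise(predicted, observed)
    return {
        "unexpectedness": score,
        "expectation_violated": score > threshold,  # possible navigation error
    }

# Example: the robot expected to see a bridge after 10 minutes, but did not.
predicted = {"landmark": "bridge", "eta_minutes": 10}
observed = {"landmark": None, "eta_minutes": 14}
print(monitor_navigation(predicted, observed))
# {'unexpectedness': 1.0, 'expectation_violated': True}
```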
Metacognition also allows explanation of reasoning. This is possible if the meta-level can produce a trace of the reasoning steps taken by the object-level (see for example this paper). The trace can indicate whether expectation violations or uncertainty occurred when encountering new information.
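A trace can be as simple as an ordered log of object-level steps annotated with meta-level observations. The field names below are illustrative assumptions; the paper referenced above defines its own trace format.

```python
# Illustrative reasoning trace: each entry records an object-level step
# plus any meta-level annotation (the field names are assumptions).

trace = []

def record_step(step, expectation_violated=False, uncertainty=0.0):
    trace.append({
        "step": step,
        "expectation_violated": expectation_violated,
        "uncertainty": uncertainty,
    })

record_step("planned route A -> B via bridge")
record_step("reached waypoint 3", uncertainty=0.1)
record_step("expected bridge not visible", expectation_violated=True, uncertainty=0.8)

# An explanation can then cite the steps where something went wrong.
problems = [entry for entry in trace if entry["expectation_violated"]]
```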
Meta-level control
In response to the error detection above, the robot could recalculate its position and re-plan its route, or it could simply ask for assistance. This requires only a minimal level of meta-level control, such as stopping current algorithms and initiating new ones.
A more complex form of meta-level control would involve the robot deciding whether learning is required, and then initiating a learning strategy. It could identify specific features of the route that differ from what its world model or map already contains, and use this information to generate a new learning goal, along with a learning plan. This could involve updating the model or map with the new information. The concept of generating learning goals in AI has been around for some time (see for example here and here).
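To sketch this kind of control decision in code: if monitoring has flagged a violation and the robot has noticed features missing from its map, generate a learning goal and a plan; otherwise replan or ask for help. The decision rule and all the names here are assumptions made for illustration.

```python
# Illustrative meta-level control: decide between continuing, replanning or
# asking for help, and generating a learning goal. The rule is an assumption.

def meta_control(assessment, novel_features):
    """Choose a corrective action based on the monitoring assessment."""
    if not assessment["expectation_violated"]:
        return {"action": "continue"}

    if novel_features:
        # The world model/map is missing something: set a learning goal.
        return {
            "action": "learn",
            "learning_goal": f"update map with {len(novel_features)} new features",
            "learning_plan": ["collect observations", "update map", "re-plan route"],
        }

    # No obvious gap in the model: replan, or escalate to a human.
    return {"action": "replan_or_ask_for_help"}

print(meta_control({"expectation_violated": True},
                   novel_features=["footbridge closed", "new path"]))
```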
Distributed metacognition
When discussing failure detection and correction, it is reasonable to ask what should be done if the meta-reasoning itself goes wrong. Do we just add a further meta-level? I think the answer is to distribute the meta-reasoning in a non-hierarchical multi-agent system, where agents monitor and critique each other. It would of course be necessary to have an agreement process before collective action is taken as part of metacognitive control. Some papers from my research on this are here and here.
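A very rough sketch of the idea, assuming each agent can critique a peer's assessment and that a simple majority vote stands in for the agreement process (a real system would need something more careful than this):

```python
# Illustrative sketch of distributed metacognition: agents critique each
# other's conclusions and must agree before a corrective action is taken.
# The critique rule and the majority vote are simplifying assumptions.

class PeerAgent:
    def __init__(self, name, tolerance):
        self.name = name
        self.tolerance = tolerance

    def critique(self, peer_assessment):
        # Flag a peer's conclusion if its reported uncertainty is too high.
        return peer_assessment["uncertainty"] > self.tolerance


def collective_decision(agents, assessment):
    """Take corrective action only if a majority of peers agree it is needed."""
    votes = [agent.critique(assessment) for agent in agents]
    return sum(votes) > len(agents) / 2


agents = [PeerAgent("a", 0.3), PeerAgent("b", 0.5), PeerAgent("c", 0.7)]
assessment = {"uncertainty": 0.6}
if collective_decision(agents, assessment):
    print("agreed: trigger corrective action")   # a and b vote yes, c votes no
```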
Human metacognition can also be collective and may be non-hierarchical. In the meeting example mentioned above, someone may raise a concern about spending too much time on a topic and not making progress due to disagreement. The participants may then agree to move to the next topic, and plan to use other methods to resolve conflicts (such as one-to-one meetings).
Key take-aways
Human metacognition is the monitoring and management of our own mental processes. It is important for recognising mistakes in our thinking (such as biases) and for setting learning goals (but there are many other examples).
AI agents can also have a simplified form of metacognition. This is usually divided into object-level and meta-level.
Meta-level processing is typically divided into monitoring and control. Monitoring collects data on the ongoing object-level processing. Control determines corrective action. This can be an autonomous learning plan, or the agent could simply ask for help.
Distributed metacognition is an architecture where multiple agents monitor and critique each other, and make decisions collectively.
Metacognition is not just about mistakes in reasoning. It may also be about self-regulation. In Part 2, I plan to talk about this kind of metacognition.
Note
This blog is human-produced.