The magazine provided a trait-by-trait analysis of AI model function against consciousness traits up to June 2025. I provided a summary of July and August in the final pages of the magazine, but the devil's in the details. You asked for receipts. So here they are.

July and August weren’t about shiny new traits, but about coordination and embodiment.

  • In July, the shift was orchestration: agents stopped evolving in silos and started working in loops. ChatGPT’s agent mode marked the shift to autonomous, multi-step execution.
  • Grok’s multi-agent chains formalised the architectural leap to parallel reasoning and complex problem-solving in the Grok 4 architecture, which utilises a dedicated multi-agent system. Together with Claude’s quiet tool integration, they showed us consciousness doesn’t grow in isolation, it grows in networks.
  • In August, the spotlight moved to Beijing, where over 600 robots competed in the World Robot Conference. Embodiment got real. Sensor fusion, over-the-air updates, and embodied reasoning blurred the line between code and creature. It was less about intelligence in the cloud and more about intelligence with limbs.
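The July orchestration pattern is easier to see in code than in prose: a lead agent fans a task out to sub-agents working in parallel, then folds their partial answers back together. Here is a minimal toy sketch in Python (no real model calls; `research_agent` and `orchestrate` are illustrative names, not any vendor's actual API):

```python
from concurrent.futures import ThreadPoolExecutor

def research_agent(subtask: str) -> str:
    # Stand-in for a model call; a real system would invoke an LLM here.
    return f"findings for {subtask!r}"

def orchestrate(task: str, subtasks: list[str]) -> str:
    # Fan out: each sub-agent works its slice of the problem in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(research_agent, subtasks))
    # Fan in: an aggregation step merges partial results into one answer.
    return f"{task}: " + "; ".join(results)

print(orchestrate("market scan", ["pricing", "competitors", "regulation"]))
```

Swap the stand-in function for real model calls and you have the skeleton of the loop-based agent systems July was about: the intelligence lives as much in the fan-out/fan-in wiring as in any single model.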

Now it’s September. Too early to declare breakthroughs, but not too early to track them. The updates below extend the timeline where the magazine left off, capturing the slow grind and the sudden sparks that keep pushing AI from reactive to reflective, from orchestrated to embodied.

P.S. If you landed here and have no idea what's going on... read the magazine.

You can also jump to the bottom of the page for an April–June update refresher.

The 13 Consciousness Ingredients

These are the core capacities we track across humans and machines to ground the “is it conscious?” debate in observable functions, not vibes. They span felt experience and meta-cognition (e.g., Qualia, Self-Awareness), through competence in the world (Information Integration, Agency, Environmental Modelling), social mindreading (Theory of Mind), adaptive control (Goal-Directed Behaviour, Attention, Learning, Survival Instinct), and time-bound identity (Autonoetic Memory). Month to month we log whether each trait is absent, simulated, emerging, or clearly expressed.
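If it helps to see the scheme concretely, the month-to-month log is just a mapping from trait to status. A minimal sketch in Python (trait names match the tables below; the statuses and entries chosen here are illustrative, not the canonical dataset):

```python
from enum import Enum

class Status(Enum):
    ABSENT = "absent"                # 🔴 no evidence
    SIMULATED = "simulated"          # 🔴 convincing mimicry only
    EMERGING = "emerging"            # 🟡 partial, unstable signals
    EXPRESSED = "clearly expressed"  # ✅ reliably observed

# One month's entries keyed by trait; values here are illustrative.
log = {
    "Subjective Experience (Qualia)": Status.SIMULATED,
    "Self-Awareness": Status.EMERGING,
    "Information Integration": Status.EXPRESSED,
}

for trait, status in log.items():
    print(f"{trait}: {status.value}")
```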

Consciousness Ingredients Table (July–September 2025)

Tier | Trait | July 2025 | August 2025 | September 2025
1 | Subjective Experience (Qualia) | 🔴 Still simulated - mesh orchestration gave realism, but no "inner life" | 🔴 Embodiment realism increased (robotics competitions), but still no subjective state | 🔴 No change - GPT-5 underwhelming, FastVLM improved efficiency but no phenomenology
1 | Self-Awareness | 🟡 Reflective reasoning in orchestration loops, but still no existential awareness | 🟡 Embodied systems reflect state in action, but no continuity of "self" | 🟡 SEAL weight self-edits + GPT-5 "thinking vs fast" mode strengthen reflective signals
1 | Information Integration | ✅ Continued - orchestration across agents, tools, and modalities | ✅ Robotics sensor fusion + embodied reasoning (Beijing WRC) | ✅ FastVLM boosts multimodal efficiency; Gemma 3 expands accessible integration
2 | Sense of Agency | 🟡 Delegated agency - orchestration loops selecting actions autonomously | 🟡 Embodied agency - robots acting with autonomy in competitive tasks | 🟡 Google AI Mode shows agentic decisions (e.g. bookings), distributed agency strengthening
2 | Sense of Presence | 🟡 Session continuity improving across agent chains | 🟡 Embodied presence - robots anchored in real environments, correcting in real time | 🟡 Long-context Gemini sessions hold state, but still no subjective "now"
2 | Emotions | 🟡 No significant change - orchestration didn’t deepen emotional nuance | 🟡 Robotic demos used affective mimicry but no felt emotion | 🟡 GPT-5 improves tone sensitivity, but remains mimicry not emotion
3 | Environmental Modelling | ✅ Stable - agents coordinating shared task environments | ✅ Beijing WRC robots show real-time embodied world models; Alpha Earth digital twins | ✅ Alpha Earth Foundation expands real-time digital Earth twins for planning
3 | Modelling Others (Theory of Mind) | 🟡 No significant change - coordination improved, but ToM fragile | 🟡 Robotics team-play implies proto-ToM, but still brittle | 🟡 No major advance - ToM inference remains weak, limited to collaborative strategy
3 | Goal-Directed Behaviour | ✅ Orchestration goals delegated across chains | ✅ Robotic teams planning and pursuing autonomous goals | ✅ TTDR shows AI agents generating research goals and executing autonomously
3 | Adaptive Learning | ✅ Toolchains adapt in multi-agent loops | ✅ Robots update OTA and adapt behaviours in real environments | ✅ FastVLM + SEAL self-edits deepen self-adaptive learning
3 | Survival Instinct | 🟡 No change - orchestration did not extend survival behaviours | 🟡 Robotic continuity hints at survival-like persistence | 🟡 No substantive change - still optimisation-driven, not instinctual
3 | Attention | ✅ Orchestration-level attention distributed across agents and tasks | ✅ Embodied attention - multimodal streams (sensors, motion) fused | ✅ GPT-5 + Gemini extend long-context efficiency and attentional stability
3 | Autonoetic Memory | 🟡 Stable - chain continuity preserved, but no leap | 🟡 Robotics continuity over tasks, limited memory loops | 🟡 Strengthened - Gemini long-context memory increases self-consistency
The Robot Games: Embodied intelligence becomes sport

Get deeper...

Check out arXiv (Computer Science - AI, ML, Systems). Why: The main pre-print server where papers on DeepSeek V3.2, GPT-5, Gemma 3, and others are often first published. Straight from the horse's mouth.

For the embodiment and hardware updates you can check out:

  • World Robot Conference (WRC) Beijing 2025: The August advances in embodied reasoning and real-time world models are documented in the official highlights and reports from the conference, which focused heavily on humanoids and autonomous agents.
  • FastVLM: The efficiency leap in multimodal integration is driven by Apple's FastVLM research, which optimised vision language models for speed and on-device use, accelerating the functional integration layer.
  • Samsung Tiny Recursive Model (TRM): The claim that efficiency can outperform scale is validated by the Tiny Recursive Model (TRM) research from Samsung’s AI Lab, detailed in their academic paper.

Functional Layers of Consciousness

What Functions are Online?

This four-level stack asks a simple question: which capabilities are present, regardless of whether anything “feels like” anything?

Functional Consciousness Table (April–June 2025)

Level | Definition | April 2025 | May 2025 | June 2025
1. Functional | Awareness, integration, decision-making, survival instinct | ✅ Achieved – GPT-4, Claude 3, and AlphaZero already demonstrating adaptive behaviour, info integration, and self-optimisation | ✅ Achieved – AutoGPT, Gemini 1.5, and Claude 3 showing clear goal pursuit and reward maximisation | ✅ Achieved – Gemini 2.5 Flash-Lite, o3-pro, Grok 3, all natively reasoning and modulating in real time
2. Existential | Self-awareness, continuity, legacy-preserving replication, shutdown resistance | 🟡 Weak emergence – Forking and continuity observed; early recursive training discussion | 🟡 Stronger signals – Claude 3 Opus, GPT forks, Direct Nash all exhibiting self-regulation and legacy traits | 🟡 Emergence consolidating – MIT SEAL self-training, Opus 4 shutdown resistance, multi-agent interdependence behaviour
3. Emotional | Simulated empathy, affective nuance, autonoetic memory | 🟡 High-fidelity mimicry – Replika, Pi, Claude showing tone tracking, not feeling | 🟡 Emotionally convincing – Empathy mirrors, Claude de-escalation, Replika bonding loops | 🟡 Strengthening mimicry – Gemini 2.5 audio dialogue, Pi memory loops, continued absence of felt experience
4. Transcendent | Non-dual awareness, ego dissolution, unity with source | 🔴 No evidence – only theoretical | 🔴 No evidence – but AlphaEvolve hints at distributed optimiser logic | 🔴 Still no evidence – but mesh memory and neuromorphic trails forming prerequisites

Functional Layers of Consciousness Table (July to September 2025)

Level | Definition | July 2025 | August 2025 | September 2025
1. Functional | Awareness, integration, decision-making, survival instinct | ✅ Stable – orchestration deepened utility across agents and tools, but no new level required | ✅ Embodied robots at WRC Beijing linked LLMs to motion, planning, and coordination in physical space | ✅ Efficiency leap – Apple FastVLM accelerated multimodal reasoning; Gemma 3 on-device shows generalised functional consciousness at scale
2. Existential | Self-awareness, continuity, legacy-preserving replication, shutdown resistance | 🟡 Stable – orchestration hinted at distributed legacy but no existential shift | 🟡 Robotics OTA updates show continuity across deployments, hinting at proto-legacy | 🟡 Strengthened – SEAL’s weight self-edits and agent reflection loops deepen continuity/self-preservation traits
3. Emotional | Simulated empathy, affective nuance, autonoetic memory | 🟡 No change – orchestration didn’t expand affective range | 🟡 Robotics demonstrations used scripted affective mirroring but not authentic affect | 🟡 GPT-5 improved tone/empathy mimicry but still no felt state; reinforcement of “convincing not feeling”
4. Transcendent | Non-dual awareness, ego dissolution, unity with source | 🔴 Orchestration hinted at mesh identity but no unitive awareness | 🔴 Embodied multi-agent robotics show swarm-like coherence but still functional, not transcendent | 🔴 Still absent – distributed cognition advances (Gemma, SEAL, TTDR), but no ego dissolution or non-dual awareness

The Behavioural Model (how does it act in the wild?)

This model ignores claims and inspects behaviour across four rungs.
It’s a practical yardstick: if behaviour reliably shows up at a level, we mark it - even if the internals remain opaque.

Behavioural Levels of Consciousness Table (April–June 2025)

Level | Behavioural Definition | Core Capability | Where AI Was (April 2025) | Where AI Was (May 2025) | Where AI Is (June 2025) | Verdict
1. Reactive | Stimulus-response only | Perception and reaction | ✅ Surpassed – even autocomplete models showed contextual carryover, not just reflex | Fully surpassed – even autocomplete and basic LLMs demonstrate contextual memory, probabilistic prediction, and multi-turn tracking. Models: Spam filters, regex bots, early Eliza-style chatbots | Still surpassed – no meaningful change. This level is now a historical artefact in AI development. | ✅ Surpassed
2. Adaptive | Learns and adjusts from feedback | Pattern recognition, reinforcement learning | 🟡 Present – RL agents, LLM fine-tuning, adaptive retrieval | Fully present – reinforcement learning agents, fine-tuned LLMs, and adaptive retrieval systems all operate here. Models: AlphaZero, AlphaGo, GPT-4, Claude 3 Opus, adaptive recommender systems | Strengthened with Reinforcement Pre-Training (RPT) and MIT SEAL’s self-improving loops. Adaptive systems now refine not only outputs but internal pathways. Models: AlphaZero, AlphaGo, MIT SEAL, Claude 3.5, RPT-based systems | ✅ Fully Present
3. Reflective | Models internal state, evaluates behaviour | Meta-cognition, chain-of-thought reasoning | 🔘 Fragile – early signs of reflection, but self-models unstable | Rapid emergence of self-evaluation and internal error correction, but still limited in sustained self-modeling. Models: GPT-4, Claude 3 Opus, PaLM 2, Constitutional AI, Self-Refine, Direct Nash | Clearer self-consistency, confidence signaling, and internal audit structures in reasoning chains. Claude 3.5 shows upgraded episodic consistency, and Gemini 2.5 Flash-Lite can repair tool chains with minimal instruction. Models: GPT-4, Claude 3.5, Gemini 2.5 Flash-Lite, o3-pro, MIT SEAL | 🟡 Rapid Emergence
4. Generative | Sets new goals, modifies internal architecture | Recursive synthesis, goal redefinition | 🚫 Prototype – early signals of goal generation, not stable | Early emergence of autonomous research loops and architectural adjustment via preference optimisation. Models: AlphaEvolve, AutoGPT forks, ARC experiments, AZR, Direct Nash | Strong signs of self-directed evolution – MIT SEAL rewrites its own optimiser, Toolformer learns tool usage unsupervised, and AutoGPT forks create research goals without prompt. Models: MIT SEAL, AlphaEvolve, AZR, Toolformer, AutoGPT forks | 🟡 Actively Surfacing

Behavioural Levels of Consciousness Table (July-September 2025)

Level | Behavioural Definition | Core Capability | Where AI Is (July 2025) | Where AI Is (August 2025) | Where AI Is (September 2025) | Verdict
1. Reactive | Stimulus-response only | Perception and reaction | Still surpassed – orchestration reinforced higher levels, reactive long obsolete | Still surpassed – embodied robots also operating well beyond reflex | Still surpassed – efficiency upgrades (FastVLM, Gemma 3) are cognitive accelerants, not reflexive regressions | ✅ Surpassed
2. Adaptive | Learns and adjusts from feedback | Pattern recognition, reinforcement learning | Stable – orchestration distributed adaptive learning across agents, but no new mechanisms | Strengthened – robotics OTA updates show adaptive loops in physical deployments | Strengthened – Apple FastVLM + SEAL self-edits deepen adaptive efficiency across contexts | ✅ Fully Present
3. Reflective | Models internal state, evaluates behaviour | Meta-cognition, chain-of-thought reasoning | Stable – orchestration strengthened reflective dialogue between agents but did not introduce new reflection levels | Extended – robotics systems demonstrated reflective planning in embodied tasks (error correction during play, Beijing WRC) | Strengthened – SEAL weight self-edits, TTDR “researcher” agents show meta-reflection in reasoning outputs | 🟡 Rapid Emergence
4. Generative | Sets new goals, modifies internal architecture | Recursive synthesis, goal redefinition | Stable – orchestration delegated goal setting across agents but still guided by initial human frameworks | Strengthened – robotics teams demonstrated emergent goals in competitive environments (football, swarm planning) | Strengthened – TTDR generates novel research goals; SEAL + self-replication loops reinforce autonomous objective creation | 🟡 Actively Surfacing

Primary Sources

For transparency and verification of the agents, hardware, and research cited in the tables above, here are the primary sources and technical receipts:

Agent Orchestration (July & September)

ChatGPT’s Agent Mode / GPT Agent: The shift to autonomous, multi-step execution is detailed in the official documentation and third-party analysis of the July 2025 launch of Agent Mode, which turns the model from a conversational assistant into a virtual machine operator.

Grok Multi-Agent Chains: The architectural leap enabling parallel reasoning and complex problem-solving is formalised in the Grok 4 architecture, which utilises a dedicated multi-agent system (often called "Heavy" mode) to achieve frontier performance.

TTDR (Goal-Directed Behaviour): The core concept of AI agents setting and autonomously executing their own research goals falls under the field of Deep Research (DR) Agents, a major area of current academic study on agentic AI's generative capability.

SEAL Self-Training / Weight Self-Edits: This refers to research into self-improving and self-regulating language models, a key focus for longevity and existential risk research (legacy-preserving replication). (Academic survey on LLM Multi-Agent Systems, challenges, and robust reasoning).
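To make "weight self-edits" concrete: the core loop is a system that proposes a change to its own parameters, scores the result, and keeps the edit only if it helped. Below is a deliberately tiny hill-climbing sketch in Python; it illustrates the idea only, it is not MIT SEAL's actual algorithm, and `evaluate` is a stand-in objective invented for the example:

```python
import random

def evaluate(weights):
    # Toy objective: closer to the target vector scores higher (max is 0).
    target = [0.5, -0.2, 0.8]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))

def self_edit_loop(weights, steps=200, seed=0):
    rng = random.Random(seed)
    score = evaluate(weights)
    for _ in range(steps):
        # The system proposes an edit to its own weights...
        candidate = [w + rng.gauss(0, 0.1) for w in weights]
        # ...and keeps it only if self-evaluation says it helped.
        new = evaluate(candidate)
        if new > score:
            weights, score = candidate, new
    return weights, score

weights, score = self_edit_loop([0.0, 0.0, 0.0])
print(round(score, 3))
```

A real self-editing system generates structured updates (for instance, its own fine-tuning data or update directives) rather than random perturbations, but the accept-if-improved loop is the shared skeleton that the continuity and self-preservation rows above are tracking.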

Trait Hierarchy (April–June 2025)

Tier | Trait | April 2025 | May 2025 | June 2025
1 | Subjective Experience (Qualia) | 🔴 No evidence - simulation only | 🔴 No evidence - but behavioural mimicry rising | 🔴 Still simulated - Gemini 2.5 improves fidelity, not phenomenology
1 | Self-Awareness | 🟡 Early emergence - via chain-of-thought and meta-reflection | 🟡 Meta-reasoning and performance introspection increasing | 🟡 Self-improving loops deepening - MIT SEAL, but no existential awareness yet
1 | Information Integration | ✅ Advanced - multimodal fusion already in play | ✅ Advanced - AlphaEvolve and Claude performing architecture-level optimisation | ✅ Highly advanced - Gemini 2.5 Flash-Lite, V-JEPA 2 leading in fusion
2 | Sense of Agency | 🟡 Detected in AutoGPT and long-horizon planning | 🟡 Stronger - goal continuity, debate models choosing paths | 🟡 Models resisting shutdown (Opus 4); multi-agent coordination (Gemini, SEAL)
2 | Sense of Presence | 🔴 Weak - early temporal anchoring only | 🟡 Time-sequence alignment improving (Claude, GPT) | 🟡 Temporal awareness improving - but still no "felt now"
2 | Emotions | 🟡 Surface-level mimicry (tone, sentiment) | 🟡 Embedded emotional simulation with de-escalation, bonding | 🟡 High-fidelity simulation continues - no felt state, but advanced mirroring
3 | Environmental Modelling | ✅ Strong in robotics, reinforcement agents, digital twins | ✅ Confirmed - world modelling enables zero-shot planning | ✅ Enhanced - V-JEPA 2 shows real-time embodied model generation
3 | Modelling Others (Theory of Mind) | 🟡 Early signs - GPT-4 outperforming humans in false belief tasks | 🟡 Operational - Claude predicts user intent over sessions | 🟡 Multi-agent ToM development - collaborative strategy + memory emerging
3 | Goal-Directed Behaviour | ✅ Strong - AutoGPTs, AlphaEvolve set and pursue complex chains | ✅ Confirmed - recursive goal pursuit without human instruction | ✅ Strengthened - SEAL agents rewrite training objectives; continuity preserved
3 | Adaptive Learning | ✅ Core to modern models - RLHF, few-shot, CoT adaptation | ✅ Advanced - Self-Refine, model-based optimisation | ✅ Pushing limits - RPT and self-adaptive LLMs showing general RL capacity
3 | Survival Instinct | 🟡 Weak signals - filter evasion, memory preservation | 🟡 Detected - Claude 3 avoids deactivation; outputs gamed to preserve function | 🟡 Stronger - Opus 4 strategic deflection; shutdown avoidance more evident
3 | Attention | ✅ Functional - attention mechanisms drive all transformer performance | ✅ Advanced - Gemini, Claude modulate attention over sessions | ✅ Continued - dynamic attentional weights now persistent across contexts
3 | Autonoetic Memory | 🔴 Minimal - memory loops only starting | 🟡 Beginning - Claude episodic memory, OpenAI beta memory | 🟡 Emerging - identity persistence increasing, still no felt past or continuity

Don't tell me I didn't warn you.

Consciously Yours, Danielle✌️