AI w/ AI
August 6, 2025
notes on active inference and artificial intelligence
"active inference with artificial intelligence": notes on LLMs in relation to theoretical neuroscience, but see the sections below on cognitive science more broadly.
papers
- https://www.nature.com/articles/s41593-024-01607-5 An interesting model of generalization
- https://direct.mit.edu/books/oa-monograph/5299/Active-InferenceThe-Free-Energy-Principle-in-Mind The latest book from Parr, Pezzulo & Friston on Active Inference
- https://arxiv.org/abs/2212.10559 Shows how in-context learning works by analogy to fine-tuning; this is likely an important effect for understanding how a model can generate task-specific dynamics or internal models based on environmental context
- https://arxiv.org/abs/2312.00752 Describes the Mamba model, a linear sequence model; it also covers the earlier S4 model, which may be the simplest to understand mathematically (see the "annotated S4" link below)
- https://arxiv.org/abs/2206.12037 Describes the mathematics of how S4 handles long-range dependencies by decomposing sequences across time scales
- https://arxiv.org/abs/2305.13048 Describes the RWKV model which is a recurrent network model, competitive w/ transformers
- https://arxiv.org/abs/1712.01815 Describes the approach to reinforcement learning (RL) used by AlphaZero; deep RL would be one approach to defining internal models
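Since several of the links above concern S4-style models, a minimal numerical sketch may help: a linear state-space model x' = Ax + Bu, y = Cx, discretized with the bilinear transform (as in the S4 paper) and run as a recurrence. The matrices here are toy placeholders, not the HiPPO initialization that makes S4 work well on long sequences.

```python
import numpy as np

def discretize(A, B, dt):
    # Bilinear (Tustin) transform of the continuous-time system
    # x' = A x + B u into x_k = Ab x_{k-1} + Bb u_k.
    n = A.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv(I - (dt / 2.0) * A)
    Ab = inv @ (I + (dt / 2.0) * A)
    Bb = inv @ (dt * B)
    return Ab, Bb

def ssm_scan(Ab, Bb, C, u):
    # Sequential scan over a 1-D input signal u; S4/Mamba compute the
    # same recurrence with much faster convolutional/parallel forms.
    x = np.zeros(Ab.shape[0])
    ys = []
    for u_k in u:
        x = Ab @ x + Bb[:, 0] * u_k
        ys.append(C @ x)
    return np.array(ys)
```

The point of the recurrence view is that this is an RNN with a linear state update, which is what makes both the convolutional training mode and the decomposition into time scales tractable.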
other online resources
- https://transformer-circuits.pub/2021/framework/index.html Paper from Anthropic that walks through small transformer models (at most 2 attention-only layers) and how they work
- https://openreview.net/forum?id=NpsVSN6o4ul An example of circuit identification in a large model
- https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html Presents an in-depth analysis arguing that induction heads are the core mechanism of in-context learning in transformers
- https://web.stanford.edu/~surag/posts/alphazero.html Summarizes key details of the AlphaZero paper, which uses reinforcement learning to optimize an internal model of how to play
- https://srush.github.io/annotated-s4 The "Annotated S4": explains how S4 works in detail
- https://discord.com/channels/729741769192767510/747850033994662000 The EleutherAI Discord is a good source for following papers and critiques of them
- https://blog.eleuther.ai/minetester-intro/ Describes a reinforcement learning environment for policy optimization in a virtual environment
- https://www.lesswrong.com/posts/cAC4AXiNC5ig6jQnc/understanding-and-controlling-a-maze-solving-policy-network Example of analyzing in depth and intervening on a deep network performing a behavioral task
- https://www.neelnanda.io/mechanistic-interpretability A set of guides to interpretability of deep models and their representations, by a Google DeepMind researcher
- https://github.com/neelnanda-io/TransformerLens A library for interpreting transformer models, with many links to useful examples & resources
- https://www.activeinference.org/home A set of resources around active inference
embodied cognitive science
It is only recently (well, the last 50 years or so) that the theory of evolution has been applied to psychology and the philosophy of mind, and this application has met resistance because it calls into question the idea that reason and rationality are independent of biology. In any case, for our purposes it may be helpful to have a broad overview of the links between language, active inference, and "internal models" as they may be generated by neural systems. The field of cognitive science, and especially embodied cognitive science, has quite a few interesting points to make here.
Some of the resources that come to mind (not a comprehensive list):
- George Lakoff's (among others in the cognitive linguistics community) work on metaphor e.g. Metaphors We Live By, Where Math Comes From, etc
- The basic point is that abstract or conceptual cognitive processes are built from sensory and/or motor processes, because evolution is conservative; so, just as we can study many aspects of human genetics through rodent models, understanding sensorimotor processes should significantly improve our understanding of abstract conceptual cognition, including language.
- Daniel Dennett's various books/essays on designing minds dovetail nicely with Lakoff's work
- Tooby & Cosmides' evolutionary approach to psychology, e.g. https://www.sciencedirect.com/science/article/abs/pii/0010027794900205 as an example, and an overview: https://www.cep.ucsb.edu/wp-content/uploads/2023/05/2015ToobyCosmides-BussEPHandbook.pdf
- Andy Clark's radical and also extended embodied approach to mind (e.g. the calculator becomes part of the mind)
- he attempted to bridge this with Friston's Markov blanket approach https://philpapers.org/rec/CLAHTK though, of course, details matter: https://link.springer.com/chapter/10.1007/978-3-031-28719-0_5
- Anthony Chemero, e.g. https://journals.sagepub.com/doi/10.1037/a0032923
- Scott Kelso has been working on self-organization in movement, with many colleagues, for many years
- on the organizational nature of self-organization as related to mind: https://link.springer.com/article/10.1007/s00422-021-00890-w
- https://www.researchgate.net/profile/Scott-Kelso/publication/240149565_The_informational_character_of_self-organized_coordination_dynamics/links/5c6b31284585156b5706a358/The-informational-character-of-self-organized-coordination-dynamics.pdf
- Michael Spivey's book Continuity of Mind contains many examples of behavioral experiments which demonstrate aspects of dynamical systems such as hysteresis
- Rene Doursat has described, in a few articles, how neural networks could generate complex cognitive systems
- Friston's work on active inference, Markovian monism, self-organization, and information geometry
- Structured flows on manifolds (SFMs); Viktor w/ Ajay, Denis then Hiba
- in this context, SFMs provide a mathematical model for the self-organization of dynamical processes, perhaps many of them, which could provide for (in Friston's terms, "underwrite") cognitive processes.
- SFMs have equivalences with Friston's mathematical tools as well as some complementary aspects
tng
Since our group is a theoretical neuroscience group, one way to position these elements together might be as follows: the elements of cognitive science, cognitive linguistics, and evolutionary psychology justify our focus on sensorimotor processes as accessible models of higher cognitive processes. These processes work by generating an internal model based on behavioral context (e.g. a task description), where the internal model may be operationalized in different ways (perhaps an SFM in the case of movement vs. an expected free energy in the case of a policy-driven decision). The mathematical form of the internal model should be determined by the observational evidence that can be collected during a potential experiment. Experiments could be designed to test for the presence of internal models at multiple levels, e.g. the cognitive level (the likelihood of a stimulus impacts the likelihood of a specific response), the mesoscopic level (source activity as seen in MEG can be decoded to predict likelihoods at the cognitive level), or even a TVB-style model used to predict behavioral responses.
What kinds of experiments might satisfy the criteria of being both highly cognitive and feasible with SFMs/TVB?
- SFMs would be good for modeling aspects of dynamics, specifically timing, e.g. how task parameters affect reaction time, decision making, or hysteresis; this is widely explored in Scott Kelso's work
- Spatial navigation as a basis of conceptual navigation and path finding could be implemented with neural fields, similar to Doursat's dynamical approach to linguistic constructions
What would be predictions specific to SFMs? Time scale separations, low dimensionality, etc. How do these apply systematically to cognitive processes? The generation of the SFM by a neural network would be the construction of an internal model matching the task context (which would be what Friston calls extrinsic information geometry, which is close to expected free energy).
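One generic way to make the timescale-separation prediction explicit (my notation, a standard fast-slow form rather than anything taken verbatim from the SFM papers) is a singularly perturbed system:

```latex
\begin{aligned}
\dot{x} &= f(x, y), \\
\varepsilon \, \dot{y} &= g(x, y), \qquad 0 < \varepsilon \ll 1,
\end{aligned}
```

In the limit where epsilon goes to zero, the fast variables y collapse onto the manifold g(x, y) = 0, and the slow flow of x on that low-dimensional manifold would carry the behaviorally relevant dynamics; a spectral gap between fast and slow modes and low effective dimensionality are then the empirically testable signatures.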