Future shapes present: autonomous goal-directed and sensory-focused mode switching in a Bayesian allostatic network model


Introduction

Continual behavioral adaptation in an ever-changing world involves continuous mode switching between moving to achieve a goal (“goal-directed”) and attentively perceiving the environment (“sensory-focused”). Moving to obtain food or water and scanning the surroundings for resources or danger clearly demonstrate the significance of these dual modes for survival. Notably, cognitive agents exhibit predictive regulation of nutritional (energy) status in anticipation of future environmental changes1,2,3, suggesting that the predicted future shapes present behavior. For example, hibernating or migrating animals increase their food intake prior to periods when food is scarce or unavailable4,5. This phenomenon, referred to as behavioral allostasis, shows that both modes play a profound role in maintaining homeostasis in a dynamic environment. However, trade-offs between goal-directed movement and sensory-focused perception (e.g., to listen carefully to surrounding sounds, one must stop moving), as well as the phenomenon of goal emergence, complicate the mechanistic understanding of autonomous mode switching6,7. The mechanisms by which the brain resolves the computational conflict between movement and perception, the core elements of cognition, remain unclear. Therefore, this study proposes a multimodal neural network model of autonomous behavioral allostasis based on the Bayesian brain theory and investigates how mode switching between goal-directed movement and sensory-focused perception can emerge through autonomous behavioral allostasis. Exploring these questions may also be important for understanding the relationship between life and cognition8,9.

Autonomous mode switching between goal-directed movement and sensory-focused perception exposes critical issues in the computational modeling of the brain and the construction of autonomous intelligence. The notion that the brain is a predictive or generative machine has been formalized in neuroscience as predictive coding and active inference (or the free-energy principle, FEP)10,11,12,13,14. It has also been applied to architectures in machine learning (deep learning) and robotics15,16,17, as well as to the computational understanding of psychiatric disorders18,19,20,21,22. This neurobiologically and systematically plausible computational basis converges on the Bayesian framework, or prediction error minimization (minimization of the mismatch between real and predicted sensations), within which a Bayesian agent infers the causes of sensations as internal beliefs (latent variables), influenced by prior experiences. The conflict between the computational processes of perception and movement is a critical issue in explaining autonomous switching between the two modes. Specifically, in the context of perception (perceptual inference), Bayesian beliefs should be adapted to reduce sensory prediction errors by fitting the sensations. In contrast, during the execution of movements (active inference), these beliefs should remain unchanged and be realized by changing the sensations. In other words, Bayesian beliefs serve as “recognition” in sensory-focused perception but as “intention” in goal-directed movement. Furthermore, goal-directed movement itself is a risk factor for prediction errors because of interactions with the environment. This may be related to the argument about “the dark-room problem”23.
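To make this conflict concrete, the following is a minimal, hypothetical scalar sketch (illustrative names and gains; not the authors' implementation) of how the same Gaussian belief can reduce a prediction error by two different routes: updating the belief (perceptual inference) or acting so that the sensation approaches the belief (active inference).

```python
# A single scalar belief about a sensory cause.
mu = 0.0       # Bayesian belief: "recognition" in perception, "intention" in movement
sigma2 = 0.25  # predicted sensory variance (inverse precision)
x = 1.0        # actual sensation

def perceptual_inference(mu, x, sigma2, lr=0.1):
    """Sensory-focused mode: update the belief to fit the sensation."""
    return mu + lr * (x - mu) / sigma2

def active_inference(mu, x, gain=0.3):
    """Goal-directed mode: act so that the sensation approaches the (fixed) belief."""
    return x - gain * (x - mu)

# Route 1: perception changes the belief toward the sensation.
mu_percept = perceptual_inference(mu, x, sigma2)

# Route 2: action changes the sensation toward the belief, leaving the belief intact.
x_act = x
for _ in range(20):
    x_act = active_inference(mu, x_act)

print(f"perception: mu 0.0 -> {mu_percept:.2f};  action: x 1.0 -> {x_act:.2f} (mu unchanged)")
```

The same precision-weighted error term drives both routes; what differs is whether it is discharged by changing the belief or by changing the world, which is exactly the ambiguity that autonomous mode switching must resolve.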
Previous computational modeling and neurorobotic studies have heuristically avoided these problems by operating or modeling only one process, movement or perception, at a time24,25, by explicitly training an agent to move in a specific manner26,27,28, and/or by preparing separate latent variables and cost functions for perception and movement (action planning)29,30,31. However, these heuristic solutions neglect the autonomous mode switching between goal-directed movement and sensory-focused perception that occurs in the face of the computational conflicts experienced in daily life. Furthermore, the same latent variable (Bayesian belief) can serve as both “recognition” and “intention” (e.g., the experience of seeing or recognizing a certain food in a forest later leads to the formation of an intention to find that food in the forest). Behavioral allostasis through autonomous switching between the two modes may therefore require an account beyond prediction error minimization that explains how “recognition” changes into “intention” and vice versa. This may involve the dynamic self-organization of future sensorimotor goals and their precision (or strength), which modulates the balance between sensory information and goal representation. However, the mechanisms behind autonomous goal generation for behavioral allostasis, and behind the mode switching between goal-directed movement and sensory-focused perception within autonomous behavioral allostasis, remain unclear.

In this study, we propose that a cognitive agent not only minimizes the sensory prediction errors at hand but also minimizes the predicted future sensory entropy (uncertainty). This hypothesis aims to explain the continual adaptation of homeostatic behaviors by linking the Bayesian brain hypothesis with a physical view of homeostasis. As physical systems, biological cognitive agents naturally decay without a cycle of nutrition and excretion, eventually leading to death. This inevitable tendency toward disorder may be associated with an increase in entropy owing to irreversible physical processes within the system (body)32. Thus, the homeostatic behaviors of cognitive agents may be regarded as a means of resisting the tendency toward disorder, as Schrödinger originally stated that life feeds on “negative entropy” or free energy33. We consider this physical constraint in brain information processing given that sensation is the interface between the brain's information processing system and the bodily physical system. Minimization of predicted future sensory entropy can then be considered a “meta-goal” concerning second-order prediction (of uncertainty), which may explain the dynamic self-organization of future sensorimotor goals and their precision or strength for behavioral allostasis.

The basic premise of this hypothesis is that appropriate physiological (interoceptive) conditions in the body are required to maintain homeostasis. Conversely, unusual physiological conditions increase the influence of the physical tendency toward disorder (an increase in physiological entropy). This is supported by findings that certain dietary habits (e.g., an inappropriate caloric content of the diet) and thermal stress increase entropy production34,35,36,37. Unusual physiological conditions can affect or disrupt all aspects of the body, including the sensory organs, leading to noisy sensory signals, which may be experienced as poor physical condition, such as dizziness and sluggishness in severe cases. Therefore, sensory (information) entropy, or uncertainty, is a potential source from which the brain can recognize the homeostatic context of the system through sensorimotor experience.
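As a rough formalization (a minimal sketch assuming the Gaussian sensory predictions of the model introduced below; the exact objective used in the paper may be defined differently in the Methods), the predicted entropy of a future sensation \({\hat{{\boldsymbol{x}}}}_{\tau }\) with predicted standard deviations \({\hat{\sigma }}_{x,\tau ,i}\), and the corresponding meta-goal over a future window \({{win}}_{f}\) (the future time window used during behavior generation in the Results), can be written as

$$H[{\hat{{\boldsymbol{x}}}}_{\tau }]=\sum _{i}\frac{1}{2}\,\mathrm{ln}\,(2\pi e\,{\hat{\sigma }}_{x,\tau ,i}^{2}),\qquad \mathop{\min }\limits_{{\rm{goals}},\,{\rm{actions}}}\ \sum _{\tau ={t}_{c}+1}^{{t}_{c}+{{win}}_{f}}H[{\hat{{\boldsymbol{x}}}}_{\tau }]$$

so that, under Gaussian predictions, reducing the predicted future sensory entropy amounts to reducing the predicted sensory standard deviations over the anticipated future.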
To test our hypothesis, we developed a Bayesian neural network framework for homeostatic behavioral adaptation via hierarchical multimodal integration. Behavioral allostasis is associated with various aspects of cognitive processing, including the learning of predictive models, the perception of internal and external environments, movement generation, and goal generation. A computational model that can explain behavioral allostasis therefore requires a principle that covers these various aspects of cognitive processing as well as multimodal integration involving interoception, exteroception, and proprioception. Thus, we incorporated the Bayesian brain theory and multimodal predictive processing into the neural network model. In addition, we show that the meta-goal of Bayesian allostasis (minimizing entropy) can be derived by extending the framework of variational Bayes, a formal tractable solution of Bayesian inference, into future predictive processing. Our simulation experiment demonstrates how autonomous mode switching emerges through Bayesian allostasis. Our modeling framework provides a basis for exploring continual behavioral adaptation and the underlying computational mechanisms in the brain network.

Results

Constraints on environment and sensation

We designed a navigation survival task in a dynamic environment in which a cognitive agent was required to survive through predictive interoceptive (energy) regulation. In the environment of the simulated square space \(\left[-1.0,1.0\right]\times \left[-1.0,1.0\right]\), food was placed at a random position (Fig. 1a). The agent’s energy state increased if the agent approached the food. In a 100-timestep cycle, the nutritional value of the food gradually decreased, and the food reappeared at another random position (Fig. 1b). This roughly mimics the changes in the locations and amount of food in a natural environment. In addition, a cue was placed at a fixed position (0.0, 0.4) that temporarily informed the agent of the x-y coordinates of the current food position when the agent approached it. This cue setting enables us to consider behavioral adaptation in an uncertain environment38.

Fig. 1: Constraints on environment and sensation. a Environmental setup. In the environment of the simulated square space \(\left[-1.0,1.0\right]\times \left[-1.0,1.0\right]\), a food is placed at a random position, and a cue is placed at a fixed position (0.0, 0.4). An unusual interoceptive state causes large sensory (information) uncertainty, and movements lead to energy expenditure. b Time series of the proprioceptive state, exteroceptive state, interoceptive state, and food nutrition during random-exploration learning. Note that the agent does not sense the current food position or food nutrition directly. Each sensation takes values between −1.0 and 1.0, corresponding to the range of the neural network output. The first 200 steps of the learning data (100,000 steps) are shown.
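The following is a minimal, hypothetical sketch of such an environment (class name, noise scaling, eating radius, and energy dynamics are illustrative assumptions consistent with the constraints described above and in the following paragraph, not the authors' exact C++ implementation; the within-cycle decay of food nutrition is omitted for brevity).

```python
import numpy as np

class SurvivalEnvironment:
    """Toy navigation-survival world: food relocating on a 100-step nutrition cycle,
    a fixed cue, intero-dependent sensory noise, and energy expenditure for movement."""

    def __init__(self, cycle=100, noise_gain=0.2, move_cost=0.01, seed=0):
        self.rng = np.random.default_rng(seed)
        self.cycle, self.noise_gain, self.move_cost = cycle, noise_gain, move_cost
        self.cue_pos = np.array([0.0, 0.4])
        self.reset()

    def reset(self):
        self.t = 0
        self.body = np.zeros(2)   # proprioceptive state (x-y position)
        self.intero = 0.0         # interoceptive (energy) state; death when |intero| >= 1
        self.food = self.rng.uniform(-1.0, 1.0, size=2)

    def step(self, move):
        self.t += 1
        if self.t % self.cycle == 0:            # food reappears at a random position each cycle
            self.food = self.rng.uniform(-1.0, 1.0, size=2)
        self.body = np.clip(self.body + move, -1.0, 1.0)
        self.intero -= self.move_cost * np.linalg.norm(move)   # movement costs energy
        if np.linalg.norm(self.body - self.food) < 0.2:        # approaching food raises energy
            self.intero += 0.05
        # the cue reveals the food position only nearby; default value (-0.8, -0.8) otherwise
        near_cue = np.linalg.norm(self.body - self.cue_pos) < 0.2
        extero = self.food if near_cue else np.array([-0.8, -0.8])
        # unusual interoceptive states (far from 0) make all sensory modalities noisier
        noise_sd = self.noise_gain * abs(self.intero)
        sensation = np.concatenate([self.body, extero, [self.intero]])
        sensation = np.clip(sensation + self.rng.normal(0.0, noise_sd, size=5), -1.0, 1.0)
        dead = abs(self.intero) >= 1.0
        return sensation, dead
```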
At each time step, the agent received the two-dimensional x-y coordinates of its body position, a two-dimensional cue signal, and a one-dimensional energy state, which were regarded as proprioceptive, exteroceptive, and interoceptive sensations, respectively. Each sensation took a value between −1.0 and 1.0, corresponding to the range of the neural network output. The exteroceptive state is meaningful only when the agent is near the cue, taking the default value (−0.8, −0.8) otherwise. If the food happens to be located at (−0.8, −0.8), the agent can deduce this because the exteroceptive state remains unchanged while it is near the cue. As a physiological constraint of the agent, a low or high interoceptive state (a negative or positive deviation from the value 0.0) causes large sensory (information) uncertainty, implemented as Gaussian noise added to all sensory modalities (Fig. 1a). We assume that this corresponds to the effects of deviations from homeostasis, or an increase in physiological entropy, due to an excessively low or high interoceptive state. We regard the agent as dead when the interoceptive state exceeds 1.0 or drops below −1.0. In addition, we assume that the movements of the agent lead to energy expenditure (a decrease in the interoceptive state).

Multimodal Bayesian homeostatic recurrent neural network

For behavioral allostasis, the agent must have an internal predictive model of the environment that represents how multimodal sensations and their uncertainties are changed by internal and external causes. Here, we propose the multimodal Bayesian homeostatic recurrent neural network (MBH-RNN), which integrates multimodal sensorimotor processing and homeostatic processing based on variational Bayes (Fig. 2a). The MBH-RNN has a structural and temporal hierarchy consisting of sensorimotor modules (lower-perceptual modules distributed for each modality and multimodal-associative modules) and homeostatic modules (an unexpected-uncertainty-cause module and a higher-cognitive module). The sensorimotor modules process information from each sensory modality and its associations; they may be associated with the functionality of brain regions such as the sensory areas and the insular cortex39. In contrast, the higher-cognitive module integrates multimodal information with uncertainty (homeostatic) information, which may be analogous to cognitive control regions in the brain, such as the prefrontal and anterior cingulate cortices, which are thought to control a wide range of motivational, goal-directed, and uncertainty-related behaviors40,41. Moreover, the unexpected-uncertainty-cause module is primarily involved in perceiving the cause of uncertainty, including unexpected causes, and generates predictions about the sensory uncertainty of all sensory modalities. The function of this module is essential for computing the predicted future sensory entropy related to the homeostatic context and for driving feeding behavior through interactions with the higher-cognitive module. We hypothesize that the unexpected-uncertainty-cause module may be associated with brain regions such as the amygdala and basal forebrain, which are thought to represent a range of uncertainty information, including unexpected uncertainty, and to modulate feeding behavior42,43,44,45,46.

Fig. 2: Multimodal Bayesian homeostatic recurrent neural network. a Structure and information processing of the multimodal Bayesian homeostatic recurrent neural network (MBH-RNN). Each module \(m\) is based on a predictive-coding-inspired variational RNN (PV-RNN). The prior and posterior distributions of the latent state are represented as single Gaussian distributions for clarity, although each of them is a multivariate Gaussian distribution. NLL: negative log-likelihood. KLD: Kullback-Leibler divergence.
b An example of the effect of ablating a latent variable in the multimodal-associative module on the generation of predictions about the proprioceptive state, exteroceptive state, interoceptive state, and sensory uncertainty. c Probability of trained latent variables representing proprioceptive, exteroceptive, interoceptive, or sensory-uncertainty information. A simultaneous display of colors or lines indicates the representation of multiple pieces of information. Proprio: proprioceptive module; Extero: exteroceptive module; Intero: interoceptive module; Multi: multimodal-associative module; Uncer: unexpected-uncertainty-cause module; Cog: higher-cognitive module.

Each module of the MBH-RNN is based on a predictive-coding-inspired variational RNN (PV-RNN)47,48. A brief explanation of the information processing in the MBH-RNN is given below (see Fig. S1 and the “Methods” section for details). At each time step \(t\), the latent states \({{\boldsymbol{z}}}_{t}^{\left(m\right)}\) of the MBH-RNN represent Bayesian beliefs regarding the causes of sensations or their uncertainty as (multivariate) Gaussian distributions individually assigned to each module \(m\). Based on the Bayesian brain hypothesis10, each latent state has prior and posterior probability distributions that correspond to the estimated hidden causes before and after observing the current sensations, respectively. Based on the latent states, the MBH-RNN generates top-down predictions about the mean \({\hat{{\boldsymbol{x}}}}_{t}\) and standard deviation (uncertainty) \({\hat{{\boldsymbol{\sigma }}}}_{x,t}\) of the sensations \({{\boldsymbol{x}}}_{t}\) as a Gaussian distribution. Here, the deterministic recurrent states \({{\boldsymbol{d}}}_{t}^{\left(m\right)}\) transform latent states into sensory predictions via synaptic connections that represent the dynamic relationships between sensations and their causes. The MBH-RNN uses a multiple-timescale RNN (MT-RNN) as the transformation function, which represents the temporal hierarchy through a multiple-timescale property of neural activation, inspired by the biological brain26. Specifically, higher-level modules (e.g., the multimodal-associative module) have slower neural dynamics than lower-level modules (e.g., the lower-perceptual modules). Importantly, \({\hat{{\boldsymbol{x}}}}_{t}\) and \({\hat{{\boldsymbol{\sigma }}}}_{x,t}\) are generated through different network pathways, in which the higher-cognitive module serves as the information hub. This distinction of the prediction pathways enables us to dissociate the sensorimotor processing from the homeostatic processing and ensures that the predicted sensory uncertainty \({\hat{{\boldsymbol{\sigma }}}}_{x,t}\) is the uncertainty caused by the homeostatic context that affects all sensory modalities.
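For intuition, the following is a minimal, hypothetical sketch of one PV-RNN-style module step under stated assumptions: a leaky MT-RNN update with a time constant, a Gaussian prior latent state generated from the previous deterministic state, and linear readouts for the predicted sensory mean and standard deviation. Sizes, weights, and readouts are illustrative; only the prior/generative pass is shown, and posterior inference via backpropagated errors is sketched together with the free energy below. The actual MBH-RNN architecture and parameterization are given in the paper's Methods.

```python
import numpy as np

rng = np.random.default_rng(0)

def module_step(d_prev, z_sample, W_rec, W_z, tau):
    """One MT-RNN update: leaky integration with time constant tau (larger tau = slower dynamics)."""
    h = W_rec @ np.tanh(d_prev) + W_z @ z_sample
    return (1.0 - 1.0 / tau) * d_prev + (1.0 / tau) * h

def gaussian_prior(d_prev, W_mu, W_sigma):
    """Prior over the latent state z_t, generated from the previous deterministic state d_{t-1}."""
    mu = W_mu @ np.tanh(d_prev)
    sigma = np.exp(W_sigma @ np.tanh(d_prev))    # positive standard deviation
    return mu, sigma

# Illustrative sizes: 10 deterministic units, 2 latent units, 5 sensory dimensions.
n_d, n_z, n_x = 10, 2, 5
W_rec, W_z = rng.normal(0, 0.1, (n_d, n_d)), rng.normal(0, 0.1, (n_d, n_z))
W_mu, W_sigma = rng.normal(0, 0.1, (n_z, n_d)), rng.normal(0, 0.1, (n_z, n_d))
W_out_mu, W_out_sigma = rng.normal(0, 0.1, (n_x, n_d)), rng.normal(0, 0.1, (n_x, n_d))

d = np.zeros(n_d)
for t in range(100):
    mu_p, sigma_p = gaussian_prior(d, W_mu, W_sigma)   # prior belief before observing x_t
    z = mu_p + sigma_p * rng.normal(size=n_z)          # reparameterized latent sample
    d = module_step(d, z, W_rec, W_z, tau=4.0)         # a slower (higher) module would use a larger tau
    x_hat = np.tanh(W_out_mu @ d)                      # predicted sensory mean in [-1, 1]
    sigma_x = np.exp(W_out_sigma @ d)                  # predicted sensory uncertainty
```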
The cost function of the MBH-RNN is the variational free energy (VFE), which is introduced as a tractable quantity within variational Bayes that bounds the surprisal (or the negative log model evidence) of sensations48 (see “Methods” for the derivation).

$${\mathrm{VFE}}_{t}=\underbrace{\frac{({{\boldsymbol{x}}}_{t}-{\hat{{\boldsymbol{x}}}}_{t})^{2}}{2{\hat{{\boldsymbol{\sigma }}}}_{x,t}^{2}}+\frac{\mathrm{ln}\,(2\pi {\hat{{\boldsymbol{\sigma }}}}_{x,t}^{2})}{2}}_{{\rm{Negative}}\ {\rm{accuracy}}}+\underbrace{\sum _{m}{D}_{\mathrm{KL}}\left[q({{\boldsymbol{z}}}_{t}^{(m)}\,|\,{{\boldsymbol{e}}}_{t:T})\,\Vert \,p({{\boldsymbol{z}}}_{t}^{(m)}\,|\,{{\boldsymbol{d}}}_{t-1}^{(m)})\right]}_{{\rm{Complexity}}}\qquad (1)$$

The first term (the negative accuracy term) of the VFE is the negative log-likelihood (NLL), or precision-weighted sensory prediction error. The second term (the complexity term) is the Kullback-Leibler divergence (KLD) between the posteriors \(q({{\boldsymbol{z}}}_{t}^{\left(m\right)}|{{\boldsymbol{e}}}_{t:T})\) and the priors \(p({{\boldsymbol{z}}}_{t}^{\left(m\right)}|{{\boldsymbol{d}}}_{t-1}^{\left(m\right)})\). The priors are generated from prior experience through the previous deterministic recurrent states \({{\boldsymbol{d}}}_{t-1}^{\left(m\right)}\), whereas the posteriors are determined by the backpropagated errors \({{\boldsymbol{e}}}_{t:T}\) (\(T\): the last time step of the time window).

Learning of internal predictive model

The learning of the MBH-RNN was performed using data acquired through random exploration of the environment for 100 food-nutrition cycles (10,000 time steps). Figure 1b shows the time series of the proprioceptive, exteroceptive, and interoceptive sensations during random exploration. The learning data did not contain any organized movements, and the proprioception data for random movements were generated automatically using a predefined algorithm. For convenience, in the learning phase, the agent was assumed to be rescued when it was near death, whereby the interoceptive state was constrained so that it could not exceed 1.0 or drop below −1.0. The MBH-RNN learned to reconstruct the experienced sensations \({{\boldsymbol{x}}}_{1:T}\) (\(T\) = 10,000) by repeating prediction generation and parameter updates within time steps \(1:T\) 100,000 times offline (Fig. S1). In the learning process, the posteriors at each time step \({{\boldsymbol{z}}}_{q,1:T}\) and the synaptic weights \({\boldsymbol{\omega }}\) are updated through gradient descent on the VFE accumulated over time steps, where the partial derivative of the VFE with respect to each parameter is calculated by the backpropagation through time (BPTT) algorithm49.
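As a concrete illustration, the per-step VFE of Eq. (1) can be computed for Gaussian sensory predictions and diagonal-Gaussian latent distributions as in the following simplified sketch (hypothetical shapes and names; the optimizer, exact parameterization, and C++ implementation used by the authors are described in the Methods and repository).

```python
import numpy as np

def gaussian_nll(x, x_hat, sigma_x):
    """Negative accuracy term: precision-weighted prediction error plus log-uncertainty."""
    return np.sum((x - x_hat) ** 2 / (2.0 * sigma_x ** 2)
                  + 0.5 * np.log(2.0 * np.pi * sigma_x ** 2))

def gaussian_kld(mu_q, sigma_q, mu_p, sigma_p):
    """Complexity term: KL divergence between diagonal-Gaussian posterior q and prior p."""
    return np.sum(np.log(sigma_p / sigma_q)
                  + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2) - 0.5)

def vfe_t(x, x_hat, sigma_x, posteriors, priors):
    """Per-step variational free energy of Eq. (1), with the KLD summed over all modules m."""
    kld = sum(gaussian_kld(*q, *p) for q, p in zip(posteriors, priors))
    return gaussian_nll(x, x_hat, sigma_x) + kld

# Illustrative usage for one time step with two modules:
x     = np.array([0.1, -0.3, 0.0, -0.8, 0.2])            # observed sensation (5 dims)
x_hat = np.array([0.0, -0.2, 0.1, -0.8, 0.1])            # predicted sensory mean
sig_x = np.full(5, 0.2)                                    # predicted sensory uncertainty
post  = [(np.zeros(2), np.full(2, 0.5)), (np.full(3, 0.1), np.full(3, 0.8))]
prior = [(np.zeros(2), np.ones(2)),      (np.zeros(3),     np.ones(3))]
print("VFE_t =", vfe_t(x, x_hat, sig_x, post, prior))
# During learning, the VFE accumulated over time steps 1:T is minimized by gradient
# descent (BPTT) with respect to the posterior parameters and the synaptic weights.
```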
We investigated how the latent states in the trained MBH-RNN represented multimodal information. This was analyzed through an ablation study, removing each latent variable one by one and evaluating the influence on the generation of \({\hat{{\boldsymbol{x}}}}_{t}\) and \({\hat{{\boldsymbol{\sigma }}}}_{x,t}\) (see “Methods” for the evaluation). Figure 2b shows an example of the effect of ablating a latent variable in the multimodal-associative module. The ablation of this latent variable impacted the generation of predictions about the proprioceptive and interoceptive states (the solid red lines deviate from the black dashed lines), but not those about the exteroceptive state and sensory uncertainty. This suggests that the ablated latent variable represents bimodal proprioceptive and interoceptive information. Based on the ablation analysis of all latent variables of the 10 trained MBH-RNNs, we categorized the latent variables according to the information they represented. As shown in Fig. 2c, each of the lower-perceptual modules developed a corresponding modality-specific representation, whereas the multimodal-associative module developed bimodal and trimodal latent variables and represented interoceptive information and its integration with proprioceptive and exteroceptive information. In contrast, the unexpected-uncertainty-cause module was involved only in the sensory-uncertainty predictions, while the higher-cognitive module represented the integration of multimodal information and sensory uncertainty. This suggests that the unexpected-uncertainty-cause module represents the estimated cause of sensory uncertainty that is independent of the internal and external environmental states, whereas the higher-cognitive module represents the internal and external environmental causes of sensory uncertainty. Thus, we confirmed that the MBH-RNN can be used to develop a hierarchical multimodal predictive model of the environment.

Autonomous goal-directed and sensory-focused mode switching

Using the trained MBH-RNNs, we analyzed the autonomous behavior generation of the agent in a dynamic environment, in which the random changes in food position in a 100-timestep cycle differed from those in the learning experience. Ten test trials were performed by each of the 10 trained MBH-RNNs. Autonomous behavior generation was implemented as the simultaneous operation of perception, action generation, and goal modulation based on VFE minimization from past to future (Figs. 3a and S2). Specifically, at the current sensorimotor time step \({t}_{c}\) \((\ge\! 0)\), the MBH-RNN, with fixed synaptic weights, repeats prediction generations and posterior updates within the time steps from \({t}_{c}-{{win}}_{p}+1\) to \({t}_{c}+{{win}}_{f}\) 200 times in an online manner. The length of the past time window is \({{win}}_{p}=10\) (or \({{win}}_{p}={t}_{c}+1\) if \({t}_{c}\) 25), the averages of the moving distances over the time steps from \(t-25\) to \(t-1\) and from \(t\) to \(t+24\) were regarded as MDbefore and MDafter, respectively. If MDbefore was lower than one-third of MDafter and the moving distance at every time step from \(t-25\) to \(t-1\) was smaller than 0.5, we considered that the switching from resting to moving started at time step \(t\). In contrast, in Fig. 4b, if MDafter was lower than one-third of MDbefore and the moving distance at every time step from \(t\) to \(t+24\) was smaller than 0.5, we considered that the switching from moving to resting started at time step \(t\). Finally, in each trained network, we averaged the extracted network behaviors over all behavior switchings observed in the 10 test trials.
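A minimal sketch of this switching-detection rule (illustrative function and variable names; the window length, one-third ratio, and 0.5 threshold are taken from the description above):

```python
import numpy as np

def detect_switches(move_dist, win=25, ratio=1.0 / 3.0, rest_thresh=0.5):
    """Label time steps at which resting-to-moving or moving-to-resting switches start.

    move_dist: 1-D array of per-step moving distances from one test trial.
    """
    rest_to_move, move_to_rest = [], []
    for t in range(win, len(move_dist) - win):
        md_before = move_dist[t - win:t].mean()     # average over steps t-25 .. t-1
        md_after = move_dist[t:t + win].mean()      # average over steps t .. t+24
        if md_before < ratio * md_after and np.all(move_dist[t - win:t] < rest_thresh):
            rest_to_move.append(t)                  # resting before, clearly moving afterwards
        if md_after < ratio * md_before and np.all(move_dist[t:t + win] < rest_thresh):
            move_to_rest.append(t)                  # moving before, clearly resting afterwards
    return rest_to_move, move_to_rest
```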
Statistical analysis

We used paired t-tests to compare the proposed allostasis model and the setpoint model (control condition), which were tested using the same trained MBH-RNNs. All statistical tests were two-tailed, and the significance level was set at \(p < 0.05\). In this study, an unprecedented computational simulation was conducted; thus, it was difficult to estimate the effect size, and no statistical methods were used to pre-determine the sample size. Considering the high reproducibility of the computational simulation, we set the minimum sample size that seemed statistically testable (10 samples). Indeed, even with 10 samples, the paired t-tests revealed clear differences in the analyses. Therefore, we conclude that a larger sample size would not have significantly influenced our main results. Data analyses were conducted using R software (version 3.3.2).

Data availability

All data are available in the manuscript and the supplementary information.

Code availability

Computer code for the neural network model was written in C++ and is available at https://github.com/h-idei/mbhrnn.git.

References

1. McEwen, B. S. & Wingfield, J. C. The concept of allostasis in biology and biomedicine. Hormones Behav. 43, 2–15 (2003).
2. Schulkin, J. & Sterling, P. Allostasis: a brain-centered, predictive mode of physiological regulation. Trends Neurosci. 42, 740–752 (2019).
3. Corcoran, A. W., Pezzulo, G. & Hohwy, J. From allostatic agents to counterfactual cognisers: active inference, biological regulation, and the origins of cognition. Biol. Philos. 35, 32 (2020).
4. Körtner, G. & Heldmaier, G. Body weight cycles and energy balance in the alpine marmot (Marmota marmota). Physiol. Zool. 68, 149–163 (1995).
5. Sapir, N., Butler, P. J., Hedenström, A. & Wikelski, M. Energy gain and use during animal migration. In Animal Migration: A Synthesis (eds Milner-Gulland, E. J., Fryxell, J. M. & Sinclair, A. R. E.) 52–67. https://doi.org/10.1093/acprof:oso/9780199568994.003.0005 (Oxford University Press, 2011).
6. Corbetta, M. & Shulman, G. L. Control of goal-directed and stimulus-driven attention in the brain. Nat. Rev. Neurosci. 3, 201–215 (2002).
7. van Ede, F., Board, A. G. & Nobre, A. C. Goal-directed and stimulus-driven selection of internal representations. Proc. Natl. Acad. Sci. USA 117, 24590–24598 (2020).
8. De Jesus, P. Autopoietic enactivism, phenomenology and the deep continuity between life and mind. Phenomenol. Cogn. Sci. 15, 265–289 (2016).
9. Sander Oest, S. Life-mind continuity: untangling categorical, extensional, and systematic aspects. Synthese 203, 187 (2024).
10. Friston, K. The free-energy principle: a unified brain theory. Nat. Rev. Neurosci. 11, 127–138 (2010).
11. Seth, A. K. & Tsakiris, M. Being a beast machine: the somatic basis of selfhood. Trends Cogn. Sci. 22, 969–981 (2018).
12. Friston, K. et al. The free energy principle made simpler but not too simple. Phys. Rep. 1024, 1–29 (2023).
13. Isomura, T., Kotani, K., Jimbo, Y. & Friston, K. J. Experimental validation of the free-energy principle with in vitro neural networks. Nat. Commun. 14, 4547 (2023).
14. Lao-Rodríguez, A. B. et al. Neuronal responses to omitted tones in the auditory brain: a neuronal correlate for predictive coding. Sci. Adv. 9, eabq8657 (2023).
15. Ito, H., Yamamoto, K., Mori, H. & Ogata, T. Efficient multitask learning with an embodied predictive model for door opening and entry with whole-body control. Sci. Robot. 7, 65 (2022).
16. Prescott, T. J. & Wilson, S. P. Understanding brain functional architecture through robotics. Sci. Robot. 8, eadg6014 (2023).
17. Yuan, K., Sajid, N., Friston, K. & Li, Z. Hierarchical generative modelling for autonomous robots. Nat. Mach. Intell. 5, 1402–1414 (2023).
18. Idei, H. et al. A neurorobotics simulation of autistic behavior induced by unusual sensory precision. Comput. Psychiatry 2, 164–182 (2018).
19. Idei, H., Murata, S., Yamashita, Y. & Ogata, T. Homogeneous intrinsic neuronal excitability induces overfitting to sensory noise: a robot model of neurodevelopmental disorder. Front. Psychiatry 11, 762 (2020).
20. Takahashi, Y., Murata, S., Idei, H., Tomita, H. & Yamashita, Y. Neural network modeling of altered facial expression recognition in autism spectrum disorders based on predictive processing framework. Sci. Rep. 11, 14684 (2021).
21. Idei, H., Murata, S., Yamashita, Y. & Ogata, T. Paradoxical sensory reactivity induced by functional disconnection in a robot model of neurodevelopmental disorder. Neural Netw. 138, 150–163 (2021).
22. Idei, H. & Yamashita, Y. Elucidating multifinal and equifinal pathways to developmental disorders by constructing real-world neurorobotic models. Neural Netw. 169, 57–74 (2024).
23. Sun, Z. & Firestone, C. The dark room problem. Trends Cogn. Sci. 24, 346–348 (2020).
24. Rao, R. P. N. & Ballard, D. H. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2, 79–87 (1999).
25. Brown, H., Adams, R. A., Parees, I., Edwards, M. & Friston, K. Active inference, sensory attenuation and illusions. Cogn. Process. 14, 411–427 (2013).
26. Yamashita, Y. & Tani, J. Emergence of functional hierarchy in a multiple timescale neural network model: a humanoid robot experiment. PLoS Comput. Biol. 4, e1000220 (2008).
27. Inoue, K., Nakajima, K. & Kuniyoshi, Y. Designing spontaneous behavioral switching via chaotic itinerancy. Sci. Adv. 6, eabb3989 (2020).
28. Matsumoto, T., Ohata, W. & Tani, J. Incremental learning of goal-directed actions in a dynamic environment by a robot using active inference. Entropy 25, 1506 (2023).
29. Friston, K. et al. Active inference and epistemic value. Cogn. Neurosci. 6, 187–214 (2015).
30. Parr, T. & Friston, K. J. Generalised free energy and active inference. Biol. Cybern. 113, 495–513 (2019).
31. Millidge, B. Deep active inference as variational policy gradients. J. Math. Psychol. 96, 102348 (2020).
32. Kondepudi, D. & Prigogine, I. Modern Thermodynamics: From Heat Engines to Dissipative Structures 2nd edn. https://doi.org/10.1002/9781118698723 (Wiley, 2014).
33. Schrödinger, E. What is Life? The Physical Aspect of the Living Cell (Cambridge University Press, 1944).
34. Aoki, I. Effects of exercise and chills on entropy production in human body. J. Theor. Biol. 145, 421–428 (1990).
35. Silva, C. A. & Annamalai, K. Entropy generation and human aging: lifespan entropy and effect of diet composition and caloric restriction diets. J. Thermodyn. 2009, 186723 (2009).
36. Bienertová-Vašků, J., Zlámal, F., Nečesánek, I., Konečný, D. & Vasku, A. Calculating stress: from entropy to a thermodynamic concept of health and disease. PLoS ONE 11, e0146667 (2016).
37. Semerciöz-Oduncuoğlu, A. S., Mitchell, S. E., Özilgen, M., Yilmaz, B. & Speakman, J. R. A step toward precision gerontology: lifespan effects of calorie and protein restriction are consistent with predicted impacts on entropy generation. Proc. Natl. Acad. Sci. USA 120, e2300624120 (2023).
38. Schwartenbeck, P. et al. Computational mechanisms of curiosity and goal-directed exploration. eLife 8, e41703 (2019).
39. Craig, A. D. How do you feel—now? The anterior insula and human awareness. Nat. Rev. Neurosci. 10, 59–70 (2009).
40. Monosov, I. E. Anterior cingulate is a source of valence-specific information about value and uncertainty. Nat. Commun. 8, 134 (2017).
41. Friedman, N. P. & Robbins, T. W. The role of prefrontal cortex in cognitive control and executive function. Neuropsychopharmacology 47, 72–89 (2022).
42. Herman, A. M. et al. A cholinergic basal forebrain feeding circuit modulates appetite suppression. Nature 538, 253–256 (2016).
43. Soltani, A. & Izquierdo, A. Adaptive learning under expected and unexpected uncertainty. Nat. Rev. Neurosci. 20, 635–644 (2019).
44. Zhang, K., Chen, C. D. & Monosov, I. E. Novelty, salience, and surprise timing are signaled by neurons in the basal forebrain. Curr. Biol. 29, 134–142.e3 (2019).
45. Peters, C. et al. Transcriptomics reveals amygdala neuron regulation by fasting and ghrelin thereby promoting feeding. Sci. Adv. 9, eadf6521 (2023).
46. Crimmins, B. E. et al. Basal forebrain cholinergic signaling in the basolateral amygdala promotes strength and durability of fear memories. Neuropsychopharmacology 48, 605–614 (2023).
47. Ahmadi, A. & Tani, J. A novel predictive-coding-inspired variational RNN model for online prediction and recognition. Neural Comput. 31, 2025–2074 (2019).
48. Idei, H., Ohata, W., Yamashita, Y., Ogata, T. & Tani, J. Emergence of sensory attenuation based upon the free-energy principle. Sci. Rep. 12, 14542 (2022).
49. Werbos, P. J. Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 1550–1560 (1990).
50. Millidge, B., Tschantz, A. & Buckley, C. L. Whence the expected free energy. Neural Comput. 33, 447–482 (2021).
51. Stephan, K. E. et al. Allostatic self-efficacy: a metacognitive theory of dyshomeostasis-induced fatigue and depression. Front. Hum. Neurosci. 10. https://doi.org/10.3389/fnhum.2016.00550 (2016).
52. Tschantz, A. et al. Simulating homeostatic, allostatic and goal-directed forms of interoceptive control using active inference. Biol. Psychol. 169, 108266 (2022).
53. Modell, H. et al. A physiologist’s view of homeostasis. Adv. Physiol. Educ. 39, 259–266 (2015).
54. Keramati, M. & Gutkin, B. Homeostatic reinforcement learning for integrating reward collection and physiological stability. eLife 3, e04811 (2014).
55. Libet, B., Gleason, C. A., Wright, E. W. & Pearl, D. K. Time of conscious intention to act in relation to onset of cerebral activity (readiness-potential): the unconscious initiation of a freely voluntary act. Brain 106, 623–642 (1983).
56. Soon, C. S., Brass, M., Heinze, H.-J. & Haynes, J.-D. Unconscious determinants of free decisions in the human brain. Nat. Neurosci. 11, 543–545 (2008).
57. Fried, I., Mukamel, R. & Kreiman, G. Internally generated preactivation of single neurons in human medial frontal cortex predicts volition. Neuron 69, 548–562 (2011).
58. Montebelli, A., Herrera, C. & Ziemke, T. On cognition as dynamical coupling: an analysis of behavioral attractor dynamics. Adapt. Behav. 16, 182–195 (2008).
59. Kilteni, K. & Ehrsson, H. H. Predictive attenuation of touch and tactile gating are distinct perceptual phenomena. iScience 25. https://doi.org/10.1016/j.isci.2022.104077 (2022).
60. Parr, T., Da Costa, L. & Friston, K. Markov blankets, information geometry and stochastic thermodynamics. Philos. Trans. R. Soc. A 378, 20190159 (2019).
61. Friston, K. J., Da Costa, L. & Parr, T. Some interesting observations on the free energy principle. Entropy 23, 1076 (2021).
62. Kiverstein, J., Kirchhoff, M. D. & Froese, T. The problem of meaning: the free energy principle and artificial agency. Front. Neurorobot. 16, 844773 (2022).
63. Vijayaraghavan, P., Queißer, J. F., Flores, S. V. & Tani, J. Development of compositionality through interactive learning of language and action of robots. Sci. Robot. 10, eadp0751 (2025).
64. Aguilera, M., Millidge, B., Tschantz, A. & Buckley, C. L. How particular is the physics of the free energy principle? Phys. Life Rev. 40, 24–50 (2022).
65. Di Paolo, E. A. A test run of the free energy principle: all for naught? Comment on “How particular is the physics of the free energy principle?” by Miguel Aguilera et al. Phys. Life Rev. 41, 61–63 (2022).
66. Di Paolo, E., Thompson, E. & Beer, R. Laying down a forking path: tensions between enaction and the free energy principle. Philos. Mind Sci. 3, 2 (2022).
67. Fernandez-Leon, J. A. & Acosta, G. A heuristic perspective on non-variational free energy modulation at the sleep-like edge. Biosystems 208, 104466 (2021).
68. Fernandez-Leon, J. A., Arlego, M. & Acosta, G. G. Is free energy an organizational principle in spiking neural networks? In From Animals to Animats 16 (eds Cañamero, L., Gaussier, P., Wilson, M., Boucenna, S. & Cuperlier, N.) 79–90 (Springer International Publishing, 2022).
69. Isomura, T., Shimazaki, H. & Friston, K. J. Canonical neural networks perform active inference. Commun. Biol. 5, 55 (2022).
70. Kleckner, I. R. et al. Evidence for a large-scale brain system supporting allostasis and interoception in humans. Nat. Hum. Behav. 1, 0069 (2017).
71. Migeot, J. A., Duran-Aniotz, C. A., Signorelli, C. M., Piguet, O. & Ibáñez, A. A predictive coding framework of allostatic–interoceptive overload in frontotemporal dementia. Trends Neurosci. 45, 838–853 (2022).
72. Petzschner, F. H., Garfinkel, S. N., Paulus, M. P., Koch, C. & Khalsa, S. S. Computational models of interoception and body regulation. Trends Neurosci. 44, 63–76 (2021).
73. Edvardsen, V., Bicanski, A. & Burgess, N. Navigating with grid and place cells in cluttered environments. Hippocampus 30, 220–232 (2020).
74. Han, D., Doya, K., Li, D. & Tani, J. Synergizing habits and goals with variational Bayes. Nat. Commun. 15, 4461 (2024).
75. Khalsa, S. S. et al. Interoception and mental health: a roadmap. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 3, 501–513 (2018).
76. Paulus, M. P., Feinstein, J. S. & Khalsa, S. S. An active inference approach to interoceptive psychopathology. Annu. Rev. Clin. Psychol. 15, 97–122 (2019).
77. Barrett, L. F., Quigley, K. S. & Hamilton, P. An active inference theory of allostasis and interoception in depression. Philos. Trans. R. Soc. B 371, 20160011 (2016).
78. Liu, L. et al. On the variance of the adaptive learning rate and beyond. In The Eighth International Conference on Learning Representations. https://iclr.cc/virtual_2020/poster_rkgz2aEKDr.html (2020).
79. Glorot, X. & Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proc. Thirteenth International Conference on Artificial Intelligence and Statistics Vol. 9 (eds Teh, Y. W. & Titterington, D. M.) 249–256 (PMLR, 2010).
Acknowledgements

H.I. acknowledges support from a Grant-in-Aid for Japan Society for the Promotion of Science Research Fellows (Nos. JP22J01708, JP22KJ3167) and the Japan Science and Technology Agency ACT-X (No. JPMJAX24C2). Y.Y. acknowledges support from the Japan Science and Technology Agency Moonshot R&D (No. JPMJMS2031), grants from the Japan Science and Technology Agency Core Research for Evolutional Science and Technology (Nos. JPMJCR16E2, JPMJCR21P4), and AMED Multidisciplinary Frontier Brain and Neuroscience Discoveries (Brain/MINDS 2.0) (No. JP24wm0625407). T.O. acknowledges support from the Japan Science and Technology Agency Moonshot R&D (No. JPMJMS2031). The funders had no role in the writing of the manuscript or in the decision to submit the manuscript for publication.

Author information

Authors and Affiliations

Department of Information Medicine, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Tokyo, Japan: Hayato Idei & Yuichi Yamashita

Cognitive Neurorobotics Research Unit, Okinawa Institute of Science and Technology, Okinawa, Japan: Jun Tani

Department of Intermedia Art and Science, Waseda University, Tokyo, Japan: Tetsuya Ogata

Contributions

Conceptualization: H.I.; methodology: H.I.; software: H.I.; investigation: H.I.; resources: H.I. and Y.Y.; visualization: H.I., J.T., and Y.Y.; funding acquisition: H.I., T.O., and Y.Y.; project administration: H.I. and Y.Y.; supervision: J.T., T.O., and Y.Y.; writing—original draft: H.I.; writing—review and editing: H.I., J.T., T.O., and Y.Y.

Corresponding authors

Correspondence to Hayato Idei or Yuichi Yamashita.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information; Supplementary Video.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.