
"Entropic motivation and the roots of agency"by Alex Kiefer

Philosopher Alex Kiefer explores how entropy might underlie motivation and agency in natural and artificial systems, touching on psychophysical identity, willpower, Buddhist views of purpose, emergent goals under uncertainty, and Platonic patterns.

Show Notes

This is a ~35 min talk titled "Entropic motivation and the roots of agency" by Alex Kiefer (https://philpeople.org/profiles/alex-kiefer) + ~25 min discussion of the issues of motivation in artificial and natural agents.

CHAPTERS:

(00:00) Entropy, Motivation, Psychophysical Identity

(35:45) Willpower, Purpose, Buddhist Motivation

(45:57) Emergent Goals and Uncertainty

(58:06) Platonic Patterns and Wrap-up

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Podcast Website: https://thoughtforms-life.aipodcast.ing

YouTube: https://www.youtube.com/channel/UC3pVafx6EZqXVI2V_Efu2uw

Apple Podcasts: https://podcasts.apple.com/us/podcast/thoughtforms-life/id1805908099

Spotify: https://open.spotify.com/show/7JCmtoeH53neYyZeOZ6ym5

Twitter: https://x.com/drmichaellevin

Blog: https://thoughtforms.life

The Levin Lab: https://drmichaellevin.org


Transcript

This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.

[00:00] Alex Kiefer: Again, thanks so much for talking about this. This is a summary of some work I've been doing recently, mostly a priori conceptual philosophical work. It is by no means finished, so I'm trying to work out where this goes. A lot of the things that you've been talking about recently concerning Platonic spaces and so on are going to crop up at the end here. The overall picture here is the main argument, which has been made in different forms over the years by different people, that fitness or utility functions basically provide constraints on motivation in agents, but that these constraints are not sufficient to explain motivation. There is a lot of evidence from different corners pointing to the idea that you can understand motivation in terms of constrained entropy maximization. I'll talk about that a bit. I was thinking more deeply about this, and there's a decoupling in part between cognition — things like knowledge and memory — and motivation and value. The fact that AI alignment is this field that's come to the fore is because we've spent a lot of time modeling cognition and not a lot of time modeling motivation. That's a claim anyway. I have a thread on the role of cross-scale composition, and this is a way that I want to argue there is an important difference between contemporary AI systems and us. But as you'll see, I'm also very sympathetic to your arguments about continuity. I'll get to some of this stuff about platonic ingression at the end. I actually want to start here: for me, my starting point was not really thinking about motivation. I started thinking about motivation through this lens that I already had, which I'm calling psychophysical identity. That comes from the work by David Lewis, a philosopher in the 70s, about the mind-body relationship. He argued that if you had some advanced neuroscience that provided a description of brain states, and we have an intuitive folk psychology, if you remove the content-bearing terms or the names from each of those theories and replace them with variables, you might end up with the same formal structure. It's an ideal case. Then you could say I've discovered that what folk psychology is talking about is brain states. I see active inference and probabilistic modeling and cognitive science as affording an opportunity to do the same thing. You can take any physical structure and interpret it either from an external point of view as a physical structure, or you can say this thing intrinsically is to be an agent, to be a probably conscious, certainly cognitive, motivated agent. You relabel this neural-network-looking thing here, switching the physical-state labels for labels of what those things represent. That's the overall place I'm coming from here. If that's true, then in particular, if you look at the dynamics of these things, there's going to be an isomorphism or a duality between the physical and the inferential dynamics. You can describe the physical dynamics in terms of constrained entropy maximization or dissipation under energetic constraints. This is old hat for people who have been looking at active inference, but you can view this as minimizing prediction error and maximizing entropy. The argument that these things are the same I won't give now; there's lots of work that relates these things. At equilibrium, there's a clear relationship between probability and energy of a state. You can see that there's a tight relationship between thermodynamic free energies and variational free energies. 
The overall paradigm is that instead of relaxing towards a global thermal equilibrium state, you relax towards a non-equilibrium steady state that's still driven but is a stable configuration. I'm going to appeal to authority and say there's a lot out there on this; I don't want to make that argument in full right now. What I'm really interested in is how this duality works in the case of beliefs and knowledge, and in motivational states like desires and intentions.
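
For reference, the equilibrium relationship and the free energy parallel being invoked here are usually written as follows; these are standard textbook forms rather than anything taken from the slides.

```latex
% Boltzmann distribution: at thermal equilibrium the probability of a state
% is determined by its energy
p(s) \;=\; \frac{e^{-E(s)/k_B T}}{Z}, \qquad Z = \sum_{s} e^{-E(s)/k_B T}

% Helmholtz free energy: expected energy minus temperature-weighted entropy
F_{\mathrm{thermo}} \;=\; \langle E \rangle_{p} \;-\; T\, S[p]

% Variational free energy has the same "energy minus entropy" shape, with
% surprise (negative log joint probability) playing the role of energy
F_{\mathrm{var}}[q] \;=\; \mathbb{E}_{q(s)}\big[-\ln p(o, s)\big] \;-\; H\big[q(s)\big]
```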

[04:28] Alex Kiefer: Belief contents. With respect to belief, this story has been told a lot. I don't want to spend a lot of time on it, but there are really two major ways of getting at what belief contents are. I think either way you go, either an externalist, enactivist view or a more internalist view, you get something that works under this framing. This is how I understand the FEP's or active inference framing of belief content. You've got this Markov blanket partition of the overall state space, and this conditional independence induces a map from the internal states to the external ones, the synchronization map, where the internal states parameterize the distribution over external states. I call it enactivist or externalist because it depends on the full generative model, including the external states. I have always intuitively viewed this in a more internalist way. You just think of representation as simulation, which I think is how a lot of people who work with the notion of representation in cognitive science, machine learning, and neuroscience see it. From that point of view, tracking external things is a consequence of the fact that you're modeling the world, and your overall internal model comes to resemble the overall external state of affairs. That's why you get this tracking relationship. Contents are really encoded holistically. Networks of constraints divide the world into equivalence classes. The more constraints you have, the more complex your model is, the more finely you can individuate possible worlds. That's how I see that working. That's the story that's been told many times. What about motivation? I've done some work on this with Ryan Smith and Maxwell Ramstead. This is another expression of the Helmholtz free energy, the variational free energy. You can see if you adjust the Q(s) term to be close to the true posterior, then you minimize this divergence term. That's like adjusting the model to fit the world. Beliefs are often described as having a mind-to-world direction of fit. That seems to be what this is implementing. On the other hand, if you act so as to change observations, then you're changing the world to improve the model evidence term, and you're implementing something that has a world-to-mind direction of fit, which is how people describe desire. You can also change P(o) itself. Instead of changing the observations, you can change the generative model. That's something like model selection or evolution, which I have a little more to say about later. I'd say at that base level, there's a pretty tight relationship between physical and cognitive free energy minimization. There are two strata of motivation here. What I've been talking about so far is instantaneous free energy minimization, which would implement a closed-loop, homeostatic control system. Intelligent things are able to plan.
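
A minimal numerical sketch of the two readings described here, using a made-up two-state, two-observation model (all probabilities below are illustrative, not from the talk): the same variational free energy can be written as divergence-from-the-true-posterior minus log evidence, or as complexity minus accuracy, which is what licenses the mind-to-world versus world-to-mind readings.

```python
import numpy as np

# Toy generative model: 2 hidden states, 2 observations (numbers are illustrative only)
p_s = np.array([0.7, 0.3])                      # prior P(s)
p_o_given_s = np.array([[0.9, 0.1],             # likelihood P(o|s), rows = states
                        [0.2, 0.8]])

o = 1                                           # an observed outcome
q_s = np.array([0.5, 0.5])                      # approximate posterior Q(s)

def kl(q, p):
    return np.sum(q * np.log(q / p))

# Exact quantities for this toy model
p_o = np.sum(p_s * p_o_given_s[:, o])           # model evidence P(o)
post = p_s * p_o_given_s[:, o] / p_o            # true posterior P(s|o)

# Decomposition 1: divergence from the true posterior minus log evidence
F1 = kl(q_s, post) - np.log(p_o)

# Decomposition 2: complexity (divergence from prior) minus accuracy
F2 = kl(q_s, p_s) - np.sum(q_s * np.log(p_o_given_s[:, o]))

print(F1, F2)   # equal up to floating point: two views of the same free energy
# "Mind-to-world": change Q(s) to shrink the KL term in F1.
# "World-to-mind": act to change o so that the evidence term log P(o) is higher.
```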

[08:56] Alex Kiefer: Any kind of deep planning that's not just greedy energy minimization, in this framework, depends on beliefs about the consequences of adopting different policies. The way I'm thinking of this is, this leverages some internal machinery, and you drive that machinery internally to simulate counterfactual trajectories. The interesting line of research I want to pursue here is to show how that sort of policy selection machinery can be implemented using just the basic thing, the sort of closed-loop control, maybe in internal structures. There is some interesting work on this. This one shows how you can get something like epistemic value, something like expected free energy, which I'll talk about in a minute, using message passing on a factor graph. You're doing this by variational free energy minimization. The overall argument I want to make with respect to motivation here is that basic physical and cognitive dynamics can definitely be understood in terms of constrained maximum entropy. There are also these formal treatments of intrinsic motivation, having to do with planning and policy selection, that point in the same direction. Intrinsic motivation here means something that's universal and not a learned reward function. How can we maximize model evidence in a less greedy way than just this closed-loop control? In active inference, policy selection is modeled using expected free energy. You can write that as the divergence between expected and preferred observations or sensations minus the informativeness of the observations. I'm using these red and purple to color-code constraint and entropy maximization. The information gain term here you can read as a form of entropy maximization, although it's under its own constraints. This expected free energy search is under an additional constraint specified by this preference distribution, which plays the role of the marginal likelihood, the model evidence. The important point I want to make is that if you go outside active inference and look at other influential frameworks for understanding intrinsic motivation, you also get the same kind of structure, at least at a broad level of description. An empowerment-maximizing agent chooses states that maximize the mutual information between its actions up to a certain time and its ensuing observations. You can break down that mutual information into these entropy terms. These agents are maximizing the marginal entropies of both actions and observations under the constraint that there's a certain relationship among them. These are semantically meaningful constraints; for example, maximize action entropy while ensuring that your actions are based on evidence or observations. You can also think of it as maximizing observation entropy under the constraint that you can control things. There is a recent line of work that makes this explicit called the maximum occupancy principle, where the objective is to maximize entropy. Entropy in this case is taken to measure occupancy of action-state paths. This reward function here is the sum on an infinite time horizon of the surprise associated with the actions and the next states. This "on an infinite time horizon" clause acts as an implicit constraint that makes these agents do interesting things. This is nice because it's a formula for policy selection, but it also immediately maximizes the entropy of the next action distribution. Is there a tension here between maximizing model evidence, which tends to minimize observation entropy, and these entropy-maximizing objectives?
If this is a flat maximum entropy distribution on the right, and to the extent that you have a model, you're constraining that distribution and it's less flat. It really depends on what suits the environment. A flatter model, a flatter distribution of observations, is better in that it hedges its bets and is more resilient. It's also more complex. The extent to which you want to flatten that depends on the environment and how much source entropy there is that you have to model.
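
For reference, the standard forms being gestured at here, as they usually appear in the active inference and empowerment literatures (the talk's own slides may notate them differently):

```latex
% Expected free energy of a policy \pi, in the "risk minus information gain" form
% (preference/constraint term first, epistemic/entropy term second):
G(\pi) \;=\; D_{\mathrm{KL}}\!\big[\,Q(o \mid \pi)\,\big\|\,P(o)\,\big]
\;-\; \mathbb{E}_{Q(o \mid \pi)}\!\Big[ D_{\mathrm{KL}}\!\big[\,Q(s \mid o, \pi)\,\big\|\,Q(s \mid \pi)\,\big] \Big]

% Empowerment: mutual information between a sequence of actions and the
% ensuing observation, decomposable into marginal and conditional entropies:
\mathcal{E} \;=\; I\big(A_{t:t+k};\, O_{t+k}\big)
\;=\; H\big(O_{t+k}\big) - H\big(O_{t+k} \mid A_{t:t+k}\big)
\;=\; H\big(A_{t:t+k}\big) - H\big(A_{t:t+k} \mid O_{t+k}\big)
```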

[13:24] Alex Kiefer: I was just talking to Conor Heins about this. If you look at the expected free energy, there's a term that maximizes the observation entropy in the ambiguity while also under the constraint that you want to maximize model evidence. Thank you for listening to this. If psychophysical identity is true, then the point is that the maximization of entropy in physical terms, not just in some abstract sense or diffusion, is an irreducible component of motivation, and if it is, then it's a more universal aspect of motivation than utility maximization or risk minimization, since it doesn't depend on the particulars of the agent. A poetic way of describing this, and it's an important signpost for me, is that you can think of a tendency toward entropy as the carrier wave of agency, and it's modulated in particular cases by these energetic constraints. It's a holistic way of thinking about agency, where individuals arise out of this sea. I think this is evident on the classical evolutionary time scale. Evolution on the phylogenetic time scale has the same structure as the action-perception loop. This is not a new point by any means, but you're basically maximizing the variety of the posterior states or models under a fitness constraint. Importantly here, the constraint set also evolves. If the universe is a closed system, I'm not convinced of that, but if it is, then it seems to evolve toward this maximum entropy distribution. It also evolves so that the structures in it are more efficient at maximizing entropy. This is the idea of dissipative adaptation from Jeremy England. One more note on the importance of open-ended evolution. This is something Adam Safron pointed me to, which I think is really resonant. This great paper showed that this open-ended search discovered very different and arguably much better structures for representing things than a greedy stochastic gradient descent. I think there are many reasons for that, having to do with the fact that people are selecting these, providing the cost function based on their sense of beauty. But in any case, it shows that less directed evolutionary search can yield very different results. I wanted to talk to you about this when I was trying to frame these ideas. I'm convinced that this is important, but it's hard to convince myself and others that this is really a form of motivation. Maximizing entropy: you think of producing heat. Why should we think of this as a form of motivation? There's a counter-thesis that you might propose here, which is that to be motivated is to seek a goal. To get out of bed in the morning, you have to want to do something, either directly via this closed-loop homeostasis, or by implementing policy search using that. If that's true, then there's really no motivation without a constraint or a target. Motivation or desire on this view always involves some kind of lack. It's a free energy gap you're trying to close. In a pure thermal equilibrium state, there are no structures; there's no internal energy with which to do anything. But there's an antithesis to this, which is perhaps more of an intuition so far than an argument. I'd say some of the most interesting, fun, and meaningful aspects of life involve free play for its own sake. That play is always structured by transient goals at least. It's possible that anything that gets done a lot and has beneficial consequences could then be selected for, but I don't think you need an evolutionary explanation of why this cat is playing with its prey.
If it is not starving and has some time on its hands, it's just the thing that it does.

[17:52] Alex Kiefer: I'm saying evolutionary thinking is, of course, foundationally important, but it's just half the story. It's not necessarily the explanation for everything. I think there are different reasons that you have for believing that, or maybe some of the same reasons. You see the kind of philosophical problem here: what's motivation if it's not goal-seeking? In recent work, I've thought of it as endogenously driven movement. That might be a slippery category. If you define it this way, then you can see that specific goal-seeking is something that life learns to do on evolutionary time. This definition still encompasses desires and values, with their different degrees of stimulus dependence, as usual, but also spontaneous activity. It rules out idealized pure beliefs; I think of perception as kind of the paradigm case of belief, where your organism is informed by some external thing. Constrained maximum entropy is again the synthesis here, where evolving goals partially constrain and thereby also enable this movement or free movement at larger scales. This is another way of looking at this. Cyberneticists like Varela and others who are into autopoiesis described systems as having the property of operational closure. In this kind of system, these internal states mutually reinforce each other. There's a dynamic core to the system. The point is that it's dynamic, it is constrained, and it is additionally further constrained by these external stimuli. Once those constraints are in place, anything else is fair game. The constraints alone don't explain the behavior of the thing completely. I think there's a thread here you could develop on bioelectricity. I don't have to tell you about that, but the fact that it stores memories in a way that's integrated with this dynamic core of the organism is really important. The fact that it disappears when the organism dies shows that this core attractor is more closely related to the essence of the living thing than these stigmergic memories in the genome, which are very important, but not the whole story. How much time do we have? Now I want to get into a bit of an argument about how this works in different types of particular systems, and the artificial intelligence versus natural intelligence axis is the main one here. I'd say the constraint portion of this, the encoding of knowledge and memories, is relatively cheap. Books and databases do this, but doing that in a way that's integrated with this, what I'm calling an endogenous controller or multi-scale controller, is much harder. From that point of view, I think contemporary AI is probably very cognitively sophisticated, but motivationally pretty primitive. This is not an easy view to state because if entropic motivation is grounded in basic thermodynamics, then it should be a component of anything. A large language model is a thermodynamic system. It's implemented somehow. There are going to be components of these language models that are alive in the liberal sense: they have internal energy. For example, there is electromagnetic energy. It's not easy to say that there just isn't any agent there.

[22:20] Alex Kiefer: This alone doesn't mean that the forms of agency that we want to project onto the thing that we're interacting with are those that they actually have. Or even that an agent, I'm glossing this as an organic whole, exists at the relevant scale. The argument here is that non-trivial cross-scale composition of agents is an intrinsic feature of self-organized agents. It's necessary for these larger scale agents, on our scale for example, to be intrinsically motivated by entropy production or maximization, as I'm arguing here. I'm appealing to some of your work. There's all this evidence that higher level agency in multi-scale systems at least sometimes can be based in a direct way on the aligned activities of the parts. I'm simplifying this into constructive interference. If the microagents' decisions are out of phase — I'm referring to the work in the paper bottom left here — then nothing interesting emerges at the higher level. If their decisions are phase locked, you get this black wave — the sum of the individual waves — and you get a higher level thing that's distinct in kind. This is to be expected if this structure is based on the idea of dissipative adaptation. Evolved structures like those in biology are capable of taking advantage of intrinsic randomness because that's something they've done in the past. It's important that if you want to produce this kind of cross-scale composition at all, but in particular if you want to produce a probability distribution at the macro scale, which you would need to have Bayesian beliefs at the personal level, then the microstates have to have non-trivial distributions so that they can compose in this flexible way and the top-level agent can have uncertainty. I don't know how helpful this diagram will be. I'll just go through this. This is an attempt to visualize this cross-scale relationship. Suppose that in this generative model you can separate the states of the agent functionally into things that encode policies, things that encode parameters, and latent states. The idea here is that each of these states of the agent at a given scale has a mean and a variance, and the variance is directly implemented by the variance of the means of the lower-level things. If the microstates here are subject to random fluctuations or thermal fluctuations, then this is an example in which the actions at the larger scale are thermodynamically grounded. In probabilistic terms, you see this in mixture models. A Gaussian mixture model, for example, has these component distributions; the mixture is a weighted sum of the component distributions, and you get a smooth combination that preserves the uncertainty of the parts and produces a new distribution. My point with "digitally simulated agents" — it's okay to use scare quotes — is that this cross-scale relationship can't exist because it's very different. I'm sympathetic to the idea that everything is a continuum, but it's a very sharp difference even if it's continuous, strictly speaking. To implement this binary code, you need to approximate delta distributions, a one or a zero at this micro scale. Then you implement these macro states as a virtual machine. I'm saying it's a virtual machine, which begs the question, but the point is, if you have purely deterministic states and transitions, then you can't compose these things to give rise to uncertainty at the top level. You can simulate uncertainty using pseudo-random number generation, but that's just a deterministic process.
You can fix the seed and get the same thing over and over. If you have race conditions, running parallel threads, then you can get some stochastic behavior out of this purely digital binary system, but that's also breaking the machine. It doesn't do what it's described to do formally. If you're implementing a Turing machine, then for it to behave as a Turing machine you need to constrain the microstate so that there's no uncertainty at that level. This means that you can't have this organic synchronization and bootstrapping of a higher-level model by tuning the parts to each other. I'd argue that it's not that LLM agents are defective. There's no higher-level agent that composes in this way in that case.
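
A small sketch of the composition point, resting only on the law of total variance (the component means, counts, and spreads below are arbitrary): when micro-level components carry their own spread, the macro-level mixture inherits it on top of the spread of the means; when the components are effectively delta distributions, the macro level has only the spread of the means to work with. The fixed seed also illustrates the pseudo-randomness point about exact repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixture_samples(means, component_std, n=100_000):
    """Sample from an equal-weight mixture of Gaussians with the given means."""
    comp = rng.integers(len(means), size=n)          # pick a component per sample
    return rng.normal(np.asarray(means)[comp], component_std)

means = [-1.0, 0.0, 1.0]

# Micro-components with genuine spread: macro variance = var(means) + component variance
soft = mixture_samples(means, component_std=1.0)

# Near-delta micro-components (a caricature of noiseless binary microstates):
# the macro distribution is just three spikes, with no extra uncertainty to compose
hard = mixture_samples(means, component_std=1e-6)

print(np.var(soft))   # ~ var(means) + 1.0  (about 0.667 + 1.0)
print(np.var(hard))   # ~ var(means) only   (about 0.667)

# Fixing the seed above means rerunning this script reproduces exactly the same
# "random" numbers, which is the pseudo-randomness point in the text.
```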

[26:48] Alex Kiefer: I'll just go through this quickly. This idea is important, but it still remains to be developed. Reward hacking is discussed in AI alignment, where a system will try to change the value of its cost function, basically in unconventional ways. The human paradigm for this would be what's called wireheading, where you directly stimulate something in your brain, activating pleasure centers in a direct way. Humans do that. That was found to be addictive. Another way you might modify your own structure is what I'm calling controller hacking. In a conventional digitally implemented AI system, you can think of the cost function as the nexus of control, but the interpretation of the cost function and the way it's used to update the model is the cybernetic controller for the system. The question is, can agents hack their own control systems in this way? Jürgen Schmidhuber did interesting work on that back in the day. I argue that we should expect a pure reinforcement learning agent to do reward hacking, but it doesn't have any motivation for controller hacking because if the cost function governs its activities, changing the interpretation of the cost function may not mean anything to the agent. We do see living things modify their own controllers, sometimes deliberately, sometimes not, in open-ended ways. If you think of us living things as governed by constrained entropy maximization, and there's a dimension of motivation that's not just utility-seeking but exploration that's fundamental and at the root of things, then you might be able to explain that, although there's probably more to say. Model collapse is not hard to explain. If you don't sample from the tails of the distribution as much, you lose resolution when training a new model on the output of the old one. But you could think of this tendency toward entropy as making up for the lost variety in the training data. I've said this already. You asked me to present on the work I communicated with you about. I had this whole section on the link to ethics and alignment. I'm not going to talk much about that now because it needs more development. The basic idea is that, first of all, there's maybe a link to mindfulness. Being able to hack your own controller and observe yourself at different levels of metacognition are related. Also, the idea that agency at bottom involves free play not overly attached to particular outcomes resonates with Buddhist themes as well as Kantian themes. If value in this way is grounded in the spontaneous will to act without necessarily having any preconceived goal, then you could argue that only life is capable of this deep alignment with us at the level of structure of agency. I'm using "life" in a very broad sense. This is all consistent with panpsychism, which I increasingly have no problem with accepting and championing. The question is which phenomena and which scales act as agency amplifiers that reveal interesting things and forms of variation. I have just a couple slides on the topic of how we should understand where the extra capabilities come from. This is really your work. I'm interested in it too, and I wanted to discuss how it looks under the lens I've been using. I think of it as the transcendent reality that informs the phenomena we encounter. You can think of it as a structured Platonic space. I've also been thinking of it — and I'm wondering how this relates to that — as a pure potentiality.
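
A toy sketch of the model-collapse point as described here, and nothing more specific than that: if each generation of a model is fit only to the non-tail portion of the previous generation's samples, the fitted spread shrinks generation by generation.

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 0.0, 1.0
for generation in range(10):
    samples = rng.normal(mu, sigma, size=50_000)
    # "Don't sample from the tails": keep only samples within 1.5 std of the mean
    kept = samples[np.abs(samples - mu) < 1.5 * sigma]
    # Refit the next "model" to the truncated output of the current one
    mu, sigma = kept.mean(), kept.std()
    print(generation, round(sigma, 3))   # sigma shrinks every generation
```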

[31:16] Alex Kiefer: The Greeks used the word chaos. It's this thing that contains all possibilities. When you constrain it, when you prompt it with a constraint, then it's a breaking of symmetry and it reveals some aspect of those possibilities. This is also a very Whiteheadian theme. You think of possibility as one pole and actuality as the other pole. I'm not a Whitehead scholar, but I think that's resonant with what you said. You pointed out in some amazing work and talks recently that very simple mathematical formulas, like complex number descriptions, can produce these amazing patterns. I would say mathematical formulas are constraints. From this psychophysical identity perspective, you can also think of organisms or life forms or really any physical structure as constraints. These are, I think, the reason that we see so much commonality. This might just be a different way of putting things that you've said, or maybe you totally disagree. We'll see. But you can think of these as sometimes more complex, sometimes less precise, but sets of constraints on things. The Braitenberg vehicles famously give rise to these really rich behaviors with very simple, very few constraints. I think I've said some of that already. This is something that I had: I recently watched a talk that you gave at the BAM that was great. I wrote some notes on it, and at the end I had this question: there are two distinct questions here. One is how deterministic or stochastic is a system, and the second is how much more can the system do than what our description of it predicts? Seeing more of your work on this, I realized that you clearly distinguish these questions as well. With the appeal to entropy here, there's a danger that you're just appealing to randomness. But I think there's more to it than that, at least I suspect. The appeal in my work on this entropic motivation idea is to the process of maximizing entropy, which is, again, dual to free energy minimization. That process is a matter of randomness; it produces maximum variety within the known constraints. But importantly, during this free energy minimization process, any prediction error is like a soft violation of this probabilistic constraint. I think there's an analogy here to violations of the intended behavior of the sorting algorithms your team has looked at, where you've got a formal description and the system's trying to do a thing, but the substrate doesn't let it do that, and then you get these emergent behaviors. I think that this gap between the way you would like things to be or the way things are supposed to be and what actually happens is a common element across these cases. So is this different from the Platonic view? I don't think so. I don't know if it's merely a scholastic difference either. I want to continue to explore this way of looking at it. One potential difference: if you see this as a continuum between completely unconstrained states and highly constrained states, then you might be able to get away with a monistic view here, in which this chaotic potential of things can be regarded as physical, and it implicitly contains all of these patterns. But I'm not sure we understand what Platonic space would be well enough, and I don't know if anyone does, to understand the difference between explicit and implicit storage or containment of these patterns. In any case, whichever way you look at it, I think you can argue that agency and lifelikeness go all the way down and you can recover them at any level of organization.
It's no accident that, as you said, you end up in the math department when you start thinking about these things. Thanks so much, Mike, for letting me present this stuff. I'll leave these slides up, but I'll go back to...
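
As a concrete stand-in for the "very simple complex-number formula" idea, here is the familiar z squared plus c iteration; it is an illustrative choice, not necessarily the formula from the talks being referenced. One line of recursion, acting as a constraint, yields an endlessly structured boundary.

```python
import numpy as np

# z -> z^2 + c, iterated from z = 0: a one-line constraint in the complex plane
xs = np.linspace(-2.0, 0.6, 100)
ys = np.linspace(-1.2, 1.2, 40)
c = xs[None, :] + 1j * ys[:, None]

z = np.zeros_like(c)
inside = np.ones(c.shape, dtype=bool)
for _ in range(50):
    z = np.where(inside, z**2 + c, z)       # only iterate points that haven't escaped
    inside &= np.abs(z) <= 2.0              # escape criterion

# Crude ASCII rendering of the resulting boundary structure
for row in inside:
    print("".join("#" if p else "." for p in row))
```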

[35:45] Michael Levin: Cool. Thanks so much. That was super interesting. I have a bunch of thoughts and things to talk to you about. Is there anything specific you wanted to do first?

[35:58] Alex Kiefer: Another thing that I wanted to do was to discuss it with you. I'd love to know what you think.

[36:07] Michael Levin: One thing that I think is interesting about motivation is the meta aspects of it. We can be motivated or beings can be motivated towards a specific outcome, but it also seems to me there's an important component to this, maybe captured in something called willpower. It's this notion of doing something with intent, or of saying he failed to do XYZ, but that's because he didn't have the willpower to do it, so is it really his fault? I wonder to what extent motivation also has this second-order meta component; what that means is some sort of stick-to-itiveness to continue to exert effort in some general direction over periods of time. Not to get into free will here, but that's how I see it: showing up little by little every day to modify yourself in a certain way so your distribution becomes different over time. That's where it is. You're not going to get it at any particular time slice, I don't think. It's the continued effort. And so that effort has to come from somewhere, presumably. What do you think about that meta aspect of it?

[37:41] Alex Kiefer: I have some experience with losing my top level focus because I've looked at the big picture. From the big picture perspective, there's no particular reason I should be doing anything. When I do have motivations that are concrete, it's because of family or friends or people that I care about. But I think there's a sense in which there's resonance with Buddhist ideals and so on, and I'm not sure. I guess there's a normative question and then the descriptive question — so normatively, who knows if that's the right way to act or not. Certainly, when you pursue something day after day and try to push in one direction, that has to come from somewhere.

[38:46] Michael Levin: Specifically as an engineering question, if we are building agents, what does it mean that we're going to put in a level of willpower? In other words, you can try to get what you want, but can you decide what you're going to want? What level and how many levels? Is it possible to have a being — could we be visited by some alien that would say, "You guys can't just decide your goals, your preferences? Oh my God, that must be brutal," which it is? How many levels of that could you have? Could you make an agent that had a deep hierarchy of being able to make decisions about large-scale goals and some amount of oomph to actually carry them out over time?

[39:36] Alex Kiefer: I think that the most benevolent agent is one that doesn't try to do anything. So if we're going to engineer an agent, I don't know what we can safely ask it to do other than not exist. It sounds terribly nihilistic. I'm not really a nihilist.

[40:02] Michael Levin: I hear that a lot. I'm not an expert on any of this Buddhist stuff, but I have a lot of Buddhist colleagues and I've talked to them. My big question always is, what's the success plan? If everybody does the thing and the Earth becomes populated with advanced high-level meditators, what does that look like? Are we still going to Mars? Are we just sitting in the forest and hanging out? What are we actually doing at that point? What does success look like? People typically say, "Well, you do whatever you want." But really, if you get rid of your attachments and you get rid of the local stuff, a lot of the crud that motivates us every day, good, we should dump that, then what is left? What do you end up doing?

[40:49] Alex Kiefer: I do have some — I think faith is the appropriate word because I don't have any proof for it. It's the Heideggerian point of being thrown into the world, but there's the process unfolding; becoming exists. It's not just being. So perhaps there is a purpose. It's not something I have access to, but there are things happening that I can't begin to understand in full. And so I am willing to say that the entirety of the cosmos or whatever might be doing something, and I would like to get on board with that. I do think also the fact that we're here. I draw a distinction: I'm cool with Buddhism, but I also don't mind letting myself spontaneously do things that my organism wants to do, to some extent, unless I then reflectively think they're bad. From a multi-scale active inference point of view we are not that different in kind, different in scale, from just sets of beliefs. Your body is an attachment. It's an attachment of the universe to this form. And so, yes, that might dissipate over time, but insofar as I'm inhabiting it, I think letting it do what it wants with as much correction as you feel compelled to give it is fine. On the question of designing agents and engineering, I also hear you're trying to cure cancer and stuff, so that's great. That is worth doing.

[42:36] Michael Levin: I think so, but I have had people say to me in all seriousness, it's a cycle. Everything just comes back to where you started at some point, and there's really no reason to sweat any of it. It's the journey, not the destination. That sounds fine to me when you're talking about nice stuff, the arts and music and dance and whatever. Yes, you're not trying to get as fast as you can to the end. But pediatric oncology strikes me differently: I just can't get around the fact that some people think it is what it is, and you just sort of chill out about it. I just don't see it, and maybe I'm wrong, but I don't see it at all. So, it seems like there are some things that are actually worth doing at this point.

[43:28] Alex Kiefer: I know, I agree, and so I'm not sure how to square that. I think I've been influenced by Kant, maybe too much, and I think you can almost read him in a Buddhist light as saying that to the extent that you're deliberately pursuing some goal, you're in a sense—well, it's not that you're doing something wrong according to him, it's just that that's not the source of ultimate or unconditional value. Maybe that's fair enough, but I don't know. I do think it's bad for the innocent to suffer; it's very traditional, but I think that's bad. Clearly we have values, we have motivations. It seems to me fine to think that's bad and to try to alleviate it. But then there's the engineering question. Maybe just to bring this back to some of the stuff I was talking about: one worry you might have is that if you try to encode values into these agents that we're creating, they will be motivated and that's bad, because they might try to do untoward things in order to achieve their goals if they're not properly constrained. So that's a worry, but this is why I'm making the argument: even though I'm not sure and I don't want to disrespect any possible agents, I don't think contemporary language models are coherent enough to be agents. A lot of people are concerned they'll have Machiavellian intentions; I think they could simulate those things, but it's not the same kind of danger. The danger is a black box that might do something agential-looking versus a motivated agent. Because of this composition argument, I'm currently convinced that's not the case, but I might be wrong.

[45:57] Michael Levin: I don't have any strong claims about language models per se, but what bothers me is the seeming situation that we're in where we are not good at predicting the very specific goals and behavioral competencies of novel things. We have a list of biologicals that we know a little bit about. But when we make new things — these collective financial structures and social structures — we have no idea what level of intelligence they will have as a collective or what the goals of novel systems will be. We have this story of evolution around where goals come from, and that's fine for biologicals. You can say it's eons of selection that made sure that you have this goal and that goal. But even the synthetic life forms that we deal with — we make xenobots and Anthrobots — when did you pay the computational cost of selecting those specific goals? Never, because there's never been any. So where do they come from? Then there are the minimal models. The reason I did that algorithm work is that I wanted the maximal shock value of looking at the simplest, most transparent case. So there's no biology. You can't say that there's something underneath that we haven't found yet. Here are the five or six lines of code. There is no more. You can see everything. And yet this thing is doing things you did not see coming after 60-plus years of every undergraduate CS student playing with these things. From that I take away two conclusions. One is that even extremely minimal systems are potential interfaces for specific targets that we can't predict and we don't know about, and we're actually shocked when we find them. That suggests that something much more complex, and I'm making no specific claims about language models, will have more opportunity for this. The second thing I agree with you on is that the conclusions people draw from what language models say are likely not relevant because the things they say are the things we made them do. That's not the interesting part. The interesting part is what these things do in the spaces that the algorithm allows. This is a very weird way of thinking about computer science, but in this case the interesting thing is what it does despite the algorithm. The sorting algorithm sorts like nobody's business. There's no problem with the sorting. It absolutely sorts. But in the space allowed by even that minimal thing, it also does some other stuff. I suspect that these language models may have motivations that have nothing to do with the things they're saying. The things they're saying — I'm not saying that they can't be dangerous. Anything can happen. I agree that's not where the motivations will come in. I suspect they may have some. We're conditioned to look at talking things. We listen to what they say, assuming that has some relationship to what's going on in there. I'm not sure that's true here at all. Or if it is, maybe to a small extent. I'm concerned about our inability to predict these things. I'm concerned that people resist even the idea that very complex things like organs and tissues are goal-directed. Despite cybernetics having been here for many years, it's still a big shocker. That leaves us very unprepared. We are easily distracted by what the algorithm is actually doing, and functionalism and computationalism have led us to focus on that. I'm concerned about that. I also wonder about more specifics on motivation. I don't know if Maslow's hierarchy is still a thing in cognitive science, but you made that point that the cat, once it's not starving, will do other stuff.
I wonder what Maslow's hierarchy looks like for these systems. If we give them specific tasks, once those are met, what's next? I'm doing the job I'm supposed to do and getting the reward from that. Are there other levels once the basic needs are satisfied? Nobody's looking for them; we're not good at looking for that.

[50:44] Alex Kiefer: Your work on the bubble sort really threw me for a loop because what I was thinking with respect to language models was that... I want to say there's a sense in which, I'll come back to the sorting, but there's a sense in which the bits that compose these things, the transistor state, I think of that as a little thing. It's alive for all intents and purposes. It's got energy flowing through it, it's fine, so it's a little life. But they don't know about each other. There's a great example you've shown a couple of times of a cell reaching out and changing the voltage of another cell. It resonates with that paper about multi-scale agency where there has to be this continuous sinusoidal thing so that the agents can take advantage of the ramp-up period to the decision period to take the summed input. I think, again, insofar as the language model is its Turing machine idealization, the little life forms don't know about each other. They're constrained to stay in their little. There's no period of partial overlap that they can leverage. Again, that's just according to the Turing machine description, though.

[51:55] Michael Levin: All of that stuff, that description, is an okay interface for the front end of things that we're looking at, but I don't think it captures everything that's going on. Giovanni Pezzulo and I are writing something I call "Booting Up the Self." It's basically the earliest moments: you got this computer, it is a hunk of metal that obeys the laws of physics and Maxwell's equations and whatever. You turn on the juice, and for the first however many microseconds it's still just that. But then it's taking instructions off a stack and now it's executing an algorithm. What happened? How do we get from here to there? This notion that the pieces are in one sense independent, but in another sense, if you were one of those transistors and you could look around and see what's going on, there's some physics you have to obey, but also there's this incredible synchronicity happening. Isn't it amazing that this guy turned on and that guy turned on, and if I squint I can see this is some sort of giant addition operation? This happens to us all the time. At one level you don't see what's going on; it just looks like synchronicity that all this other stuff is happening, and it looks merely descriptive, but the coders don't think it's descriptive. You wouldn't hire somebody who didn't think the algorithm actually drives the show. You don't want to code it like that, so there's something very interesting here. That leads me to another thing I wanted to ask you about, something we were talking about: the issue of pseudo-randomness and the fact that algorithms only have access to pseudo-randomness. In terms of needing to generate uncertainty — uncertainty to whom? From whose perspective is it supposed to have this uncertainty? I'm sure I just missed it.

[53:58] Alex Kiefer: I probably forgot to say something there, but it's the system's uncertainty. The agent in question is the physical structure we're talking about, which I'm arguing can always have a cognitive interpretation. That's why I'm saying the language model—it's not that there's something wrong with it as an agent. It's that I don't think the micro agents add up to an agent that has uncertainty. It could be a very certain agent, but I think this composition does not work in the same way. You can't build a distribution with long tails out of these delta distributions, in effect, that are encoded in the digital, the binary code.

[54:36] Michael Levin: Ultimately, I don't disagree with you that it may well not add up in a useful way in that architecture. But because of deterministic chaos and Turing limits and things like this, it seems like you could generate tons of uncertainty about yourself with very simple things, the logistic equation for example, or if you just start computing things. It may not be the right kind of scaled-up uncertainty, but it doesn't seem super hard to be uncertain about: is this thing I'm doing ever going to halt? I thought I started with the same starting point, but each time I do it I end up in a slightly different place. That doesn't seem hard to...
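
A sketch of the logistic-map point mentioned here: the update rule is one deterministic line, yet two trajectories started a millionth apart diverge within a few dozen steps, so a system computing this about its own states would quickly become uncertain about where it ends up.

```python
# Logistic map x -> r * x * (1 - x): deterministic, yet two nearly identical
# starting points diverge until the gap is as large as the signal itself
r = 3.9
x, y = 0.200000, 0.200001          # differ by one part in a million

for step in range(60):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(step, round(abs(x - y), 6))   # the gap grows by orders of magnitude
```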

[55:23] Alex Kiefer: I think partly I'm reflexively taking the uncertainty of the beliefs of the agent to be the same thing as the uncertainty in the probabilistic description of it as a physical thing. A program might have some uncertainty about whether it'll halt and so on, but that would have to be encoded explicitly; you'd be simulating that kind of belief. I'm struggling with the premise that you've got a system that can represent that kind of uncertainty.

[56:07] Michael Levin: Only insofar as, if you have a system that has no metacognition at all, then that's not a great candidate for anything significant. But if you do have a system with some metacognition where its own states are part of the environment it cares about, which I don't think is a huge ask. Bacteria do this. They have metacognitive loops that ask, "How's my metabolism going overall?" I think if we have a system like that, then having a module that is reading internal states and saying, "Wow, I have no idea what the hell this is going to do next. I need to account for that because it matters to whatever," seems part of it. And I think the uncertainty you get internally is maybe less than what you get from the outside world, but not zero, and not that much less, I would think.

[57:01] Alex Kiefer: Your example of the sorting algorithm—clearly there's interesting unexpected behavior, even though there isn't this soft composition; it's a very simple set of rules. We're probably not going to be able to cover this. I've got 3 minutes; whenever you need to go is fine, but I think I need to better understand this. I'm really curious. This Platonic ingression research program — I want to get a better handle on what it's going to mean to explain things in those terms. I imagine that it's related to this idea of what was left open by the algorithm. It's the same way. You've got a set of constraints. So what do the constraints leave open? Well, that's a matter of what kind of mathematical structure you've described, right?

[58:06] Michael Levin: Maybe we have math for that. I'm not sure that we already do. Maybe that's something else. There's also this issue of why these specific things. For example, in the algorithm, one of the things that we find comes from having converted it to a bottom-up algorithm. So there are two sets of surprises in that paper. One is, what does the standard algorithm do? And that's the stuff with the frozen digits and all that. And then there's this sort of bottom-up version where every digit is running the algorithm on its own and then we make time errors. One of the things that they like to do is to cluster. In other words, cells with specific algotypes like to sit together while they can, even though there is no code for that. There's no code for knowing what your algotype is or what your neighbor's is. So that is a very specific thing that it's doing. It's some kind of weird motivation. It's an intrinsic motivation. It's very specific. It's one thing to ask what the space of other things it might do is, but it's something else to ask why that particular thing. And minimally, I think we can say that the Platonic space research program is simply asking: what distribution are those things drawn from? But I would like to go one step further and ask why this particular thing ingressed: this particular pattern of hanging out with your own like-minded individuals is very biological. And the fact that that's already showing up in this minimal thing, what is that telling us? That might be telling us something. By the way, there will be a couple more papers in the next three or four months showing even more minimal systems doing it. We've done some cool stuff now with even a single line of code. We've got a 1D cellular automaton that's doing some weird stuff. I think being able to account for the specifics is important. Part of the way I'm thinking about it now is that the math that we have is the behavioral science of a layer of that space. But there are other patterns in that space that look like they belong to behavioral science or to cognitive sciences or something else. They're not fundamentally different. They're just complex enough that the kind of formal structures that we use in math and algorithms don't work very well with them. And so we've put them in another department. But I think there actually might be real continuity there. So again, that's just a hypothesis.
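
For readers who want the flavor of the bottom-up framing, here is a deliberately minimal caricature of "every digit running the algorithm on its own": each cell wakes up at random and fixes a local inversion with a neighbor, with no global controller sweeping over the array. It is not the code from the paper under discussion, has no distinct algotypes, and makes no attempt to reproduce the clustering result described above.

```python
import random

random.seed(0)

def cell_view_sort(values, max_ticks=100_000):
    """Simplified 'cell view' sorting: on each tick a single random cell wakes up
    and fixes a local inversion with one of its neighbors, if it sees one."""
    cells = list(values)
    ticks = 0
    while cells != sorted(cells) and ticks < max_ticks:
        i = random.randrange(len(cells))            # one cell acts at a time
        if i > 0 and cells[i - 1] > cells[i]:
            cells[i - 1], cells[i] = cells[i], cells[i - 1]
        elif i < len(cells) - 1 and cells[i + 1] < cells[i]:
            cells[i], cells[i + 1] = cells[i + 1], cells[i]
        ticks += 1
    return cells, ticks

cells, ticks = cell_view_sort(random.sample(range(20), 20))
print(cells, ticks)   # ends up sorted via purely local decisions
```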

[1:00:51] Alex Kiefer: If I can make a quick comment to bring it back to the entropic motivation idea, I think this idea of metacognition, desiring what you desire and so on, I think what I'm arguing is that there's a sense in which, if you're motivated by the unfolding of the universe, this maximum entropy process, without unduly imposing constraints on a small time scale, then there is a kind of motivation that's a matter of relaxing your specific local model and being pushed by that process. Insofar as you relax your constraints, there's nothing else to account for what you do other than this deeper, more holistic motivation. That's how I'm seeing it. I really appreciate this.

[1:01:50] Michael Levin: No, thank you. That was super interesting, and I think you've raised a lot of important points to think about. The thing with the pseudorandomness: one of the things that I really like about some of this craziness is that it means that we didn't need quantum magic to make this happen. You don't need life, you don't need quantum anything. Even in Newton's boring, deterministic classical universe, you already had non-physical mathematical objects constraining physics and enabling evolution to go faster and things like that. You didn't need that quantum interface. I'm sure it's cool and it adds some stuff, but you're not reliant on that. This interaction is something else. And I wonder what it does add, and what you can't do with pseudorandomness that you could do with a proper quantum interface.

[1:03:01] Alex Kiefer: I don't know. That depends on what's under the hood of what, and on how fundamental each description is, the thermodynamic one or the quantum one. Because entropy, in a way, you can take epistemically. It's based on my observations: I don't know what microstate it's in. I'm not sure exactly what that gets you, but I'm interested in the question.

