
Karl Friston, Adam Goldstein, and Michael Levin discuss active inference and algorithms

Karl Friston, Adam Goldstein, and Michael Levin explore how active inference and free energy ideas illuminate surprising behavior in sorting algorithms, covering distributed intelligence, clustering, teleology, and hidden goals.

Watch Episode Here


Listen to Episode Here


Show Notes

This is a 1-hour discussion between Karl Friston, Adam Goldstein, and me about how Karl's active inference ideas apply to some work on unexpected behavior in sorting algorithms (https://thoughtforms.life/what-do-algorithms-want-a-new-paper-on-the-emergence-of-surprising-behavior-in-the-most-unexpected-places/).

CHAPTERS:

(00:01) Distributed sorting intelligence model

(07:14) Self-organization and free energy

(24:29) Clustering, compression, complexity

(36:30) Mechanism versus teleology

(48:14) Hidden goals and clustering

PRODUCED BY:

https://aipodcast.ing

SOCIAL LINKS:

Podcast Website: https://thoughtforms-life.aipodcast.ing

YouTube: https://www.youtube.com/channel/UC3pVafx6EZqXVI2V_Efu2uw

Apple Podcasts: https://podcasts.apple.com/us/podcast/thoughtforms-life/id1805908099

Spotify: https://open.spotify.com/show/7JCmtoeH53neYyZeOZ6ym5

Twitter: https://x.com/drmichaellevin

Blog: https://thoughtforms.life

The Levin Lab: https://drmichaellevin.org


Transcript

This transcript is automatically generated; we strive for accuracy, but errors in wording or speaker identification may occur. Please verify key details when needed.

[00:01] Michael Levin: Great. So what I was hoping, Karl, is that we could get your thoughts on how the whole active inference framework could be applied to something that we've been developing. I don't know if you had a chance to look at any of this stuff, but I'll give you a brief summary so that it's clear what we've got. I have some basic questions and a couple of wacky ideas to bounce off you and see what you think. The basic thing is this: what I was trying to do is to have a very basal model of distributed intelligence. The idea was that we were interested in unexpected competencies in places where, in biology, no matter how simple your model, you never have all the information about mechanisms and somebody can always say there is a mechanism for that, you just haven't found it yet. We wanted something that was incredibly simple, incredibly transparent, deterministic, something that everybody thinks they know what it does. We can apply some of the approaches that we take in my lab about taking something that doesn't seem cognitive and asking what actual competencies it might have. We chose sorting algorithms. These are the same simple algorithms that all computer science students study, and they've been studied for decades. We made a couple of twists to it. One is that we visualize their progress from being a jumbled set of digits to an ordered set of digits as a traversal of space. The idea is they start in different locations and, sooner or later, end up in one location where everything is. Once you view it as navigating that space, you can ask about their competencies in navigating that space under odd perturbations. One of the perturbations we made was the introduction of what we call broken cells or barriers in the space. If the algorithm wants to swap two numbers to proceed in its sorting trajectory, one of the numbers could be broken; it doesn't move.
We have two kinds of broken numbers: ones that never initiate swaps and ones that never swap no matter who initiates them. This allows us to ask questions about delayed gratification. In other words, if it encounters a barrier, can it go further away from its goal in order to acquire gains afterwards? William James talked about this as an important type of basal intelligence. That breaks a common assumption with these algorithms: typically you assume that the material is robust. In other words, when the algorithm says to do something, it gets done; in our case, not necessarily. We never introduced any extra code to check if things got done. It's the standard algorithm, so it just keeps on rolling. There is no code to see "How am I doing? Did things work out?" No code for any of that. The second thing we did was to break the general version of this, which is centralized. There's an omniscient controller following one of several algorithms to move numbers around. We got rid of that and instead made it all bottom-up. Every digit, aka every cell, now has its own version of the algorithm running, and it has a limited local view of who its neighbors are, and it's just following the steps of the algorithm to try to improve its local environment. There is no global control. It's distributed. We learned a few things.
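The setup Michael describes can be sketched in a few lines. This is a hypothetical reconstruction, not the paper's actual code: each cell runs its own adjacent-comparison rule with no global controller, `no_init` cells never start a swap, and `no_swap` cells never move at all (both tag names are mine).

```python
import random

def distributed_sort(cells, rounds=200, seed=0):
    """Cell-centric bubble-sort sketch (my reading of the setup, not the
    paper's exact implementation).  Each cell is a (value, tag) pair; tag is
    None, 'no_init' (never initiates swaps), or 'no_swap' (never moves at
    all).  There is no global controller: each cell only compares itself
    with its immediate right-hand neighbour."""
    rng = random.Random(seed)
    cells = list(cells)
    for _ in range(rounds):
        order = list(range(len(cells) - 1))
        rng.shuffle(order)                 # asynchronous, randomly ordered updates
        for i in order:
            (v, tag), (nv, ntag) = cells[i], cells[i + 1]
            if tag in ('no_init', 'no_swap'):
                continue                   # broken cell: never starts a swap
            if ntag == 'no_swap':
                continue                   # neighbour refuses to move
            if v > nv:                     # local rule: fix an adjacent inversion
                cells[i], cells[i + 1] = cells[i + 1], cells[i]
    return cells
```

Note there is no error checking anywhere: a blocked swap is simply skipped, and the cells keep running the same local rule regardless of whether their actions succeeded.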

[03:35] Michael Levin: We learned that the distributed version of this works quite well. They do sort nicely. We did see some delayed gratification in the sense that if you sprinkle in some broken cells, it will go backwards and unsort the string a little bit in its effort to then go around the defect. That's cool. The most surprising thing, which is what I'd love to get your take on, is this. Because now it's distributed and every cell is following its own algorithm, that enables us to do an experiment that otherwise you couldn't do, which is to make a chimeric string. It's what we do in developmental biology when we put together axolotl cells and frog cells. Now you get this frogolotl and you can ask questions like, what shape is it going to have? You can make a chimeric string where some of the numbers are following one algorithm, some of them are following a different algorithm. There is no code to determine either your own or your neighbor's algotype. Algotype is a word that Adam Cohen coined for this—what algorithm, what set of properties or policies you are following. So there is no code for any of that, but we know which algotype all the cells are in. That also works. Chimeric strings also sort the arrays. What we then did was ask a developmental biology question: at any particular point during its journey, what is the distribution of algotypes within the string? We defined a quantity called clustering, which basically just means you look next to you and ask what's the probability that your neighbor is the same algotype as you are. In the very beginning, that probability is 50% because the algotypes are randomly assigned to the digits. That's our baseline. At the very end, it's also 50% because at the end everybody has to be sorted and there is no relationship between the actual sort order of the numbers and the algotypes. But in between those two points, if you plot that curve over time, it actually goes up. 
In between, it's quite a bit higher, statistically very significantly higher than 50%. We see clustering, a significant tendency of cells with the same algotype to locate close together. Eventually, the inevitable physics of the algorithm will yank them apart and make sure that everybody's in numerical order. Until that happens, they enjoy some amount of clustering with their conspecifics. Any thoughts you might have? More specifically, one hypothesis that one could make, even though there's no explicit mechanism for this but it might be an emergent thing: could they be preferring to be next to their neighbors because neighbors of the same algotype are more predictable? It's less surprise. You're less surprised when you're next to somebody who's following exactly the same policies as you are. I'm curious what you think about that, and what might be a set of experiments that we could do to test whether what's going on here is some sort of implicit surprise minimization, even though there's no actual code for it.
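The clustering statistic Michael describes is easy to state precisely. A minimal sketch (my formulation of "look next to you and ask whether your neighbor shares your algotype"):

```python
def clustering(algotypes):
    """Fraction of adjacent pairs whose algotypes match.  With two
    algotypes assigned uniformly at random, the expected baseline is 0.5;
    values significantly above 0.5 mid-sort are the reported clustering."""
    pairs = list(zip(algotypes, algotypes[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

print(clustering(list('ABABAB')))  # fully interleaved: 0.0
print(clustering(list('AAABBB')))  # clustered: 0.8
```

Plotting this quantity over the snapshots of a chimeric sort is what produces the curve that starts and ends near 0.5 but rises in between.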

[07:14] Karl Friston: I would say that was a very succinct, clear and nice summary. I reread the paper a couple of days ago just to refresh my memory for this conversation. I didn't realise that the final chimeric demonstration was the most intriguing from your point of view. The way you express it does call for further analysis, understanding and numerical experiments. So overall, just to endorse the choice of the sorting algorithm as a minimal kind of self-organisation: I think the self-organisation word needs to be centre stage in terms of what you're trying to understand here. Framed like that, it does remind me a lot of self-organizing maps. This notion of self-organizing maps is a very biomimetic aspect of self-organization. I think it certainly puts sorting-like algorithms centre stage in terms of biological self-organisation, particularly in things like the visual cortex and why you get that kind of pinwheel architecture, for example, where receptive field properties tend to cluster together in a smooth way and you get all sorts of interesting symmetry breaking when you're trying to represent, say, a 5D perceptual space on a 2D manifold. So that does strike me as apt for this linear sorting algorithm: there is a dual pressure to find a free energy minimizing solution, viewing free energy as an extensive quantity, or, put more simply, the collective free energy being the joint free energy minimum solution. You're asking where the free energy bounds the likelihood of this particular arrangement. Then you're looking for the precise functional form of the free energy.
If you've got this kind of opponency between the similarity of the algorithm and the similarity of the content of the value, then I can see interesting behaviours arise in exactly the same spirit that you get these interesting structures not in epithelia, but in the functional specialisation of cortical representations or sensory epithelia that try to pack 3 dimensions into one dimension or are accountable to two kinds of constraints. The first thing that you would be looking for is basically what is the Lagrangian or the energy function that is being minimized. So you could regard this as the sorting algorithm, or an application of the sorting algorithm, as a process that is trying to minimize some energy function very much in the spirit of Markov random fields, but in your instance, you've just got a one-dimensional field.

[11:32] Karl Friston: But the technology of Markov random fields would be apt to try to understand the functional forms of the energy functions, where under the special constraint that you're predicating this whole thesis on, the interactions are only local and therefore any collective behavior has to be an emergent property which is truly distributed. The definitive aspect of a Markov random field is that you just have local interactions. That's another important architectural feature that comes along with the choice of the sorting algorithm, which you should foreground. Any distributed, collective or emergent behavior at a scale beyond local interactions is truly emergent in the sense that all your interactions are local. And that is what the Markov random field gives you. It says that you can only express the energy function, which determines the probability of getting this particular arrangement of these particular numbers in this local clique, in terms of a local energy function. You can tell all sorts of stories about the importance of that for machine learning and the like, but you probably want to stick to self-organization. I would imagine the energy function is now going to be some simple measure of the local differences or the local gradients. What one would anticipate would be a smoothing, a resolution of, as you say, the differences. That would be one way of approaching, of naturalizing, this phenomenon in terms of maths: by invoking, not a variational free energy in the spirit of the free energy principle, but just an arbitrary Lagrangian, trying to identify the generic free energy functional that's being minimized here. So that now your view through this sorting space or morphological space is a progression on a somewhat uneven landscape that is defined by this free energy functional.
And whether you can reverse engineer that or not doesn't matter, because the nice aspect of that is then you can talk about the dynamics on this free energy landscape. And that leads to very similar notions in computational chemistry and protein folding and the like: there is a complex Waddington landscape or free energy landscape that self-organisation and computational chemistry adheres to and can be understood in terms of free energy minima. Indeed, most of computational chemistry follows this, and identifying that landscape is the whole point of applying things like large language models or deep RL to protein folding and other applications.
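Karl's suggestion of a local, Markov-random-field-style energy function can be made concrete. A sketch under my own choice of functional, not one from the paper: the sum of squared differences between adjacent values, which a sorted (or reverse-sorted) arrangement of a fixed multiset minimizes, as a brute-force check over all permutations confirms.

```python
import itertools

def local_energy(xs):
    """Markov-random-field-style Lagrangian built only from local cliques:
    the sum of squared differences between adjacent values.  A hypothesis
    in the spirit of Karl's suggestion, not a result from the paper."""
    return sum((b - a) ** 2 for a, b in zip(xs, xs[1:]))

values = [4, 1, 3, 5, 2]
best = min(itertools.permutations(values), key=local_energy)
# The sorted arrangement attains the global minimum of this energy.
assert local_energy(sorted(values)) == local_energy(best)
```

Showing that the end point of the distributed sort coincides with the minimum of some such functional is the "reverse engineering" step: it need not be this exact functional, only one whose minima the sorting dynamics share.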

[15:49] Karl Friston: So that would be certainly one view to get a free-energy–like formalism or naturalisation of this behaviour, which I repeat has lots of interesting links with self-organising maps, Markov random fields, image reconstruction and self-organisation. Certainly in things like the visual cortex, I would imagine that any mapped representation would conform to these rules. To get it into a free energy principle story, I think that you'd have to commit to the notion that each of the cells has its own boundary and now you're starting to interpret each number as a thing. In so doing, acknowledge its openness to everything else, or in this instance, just its neighbors, which would require a formalism of the bi-directional exchange, so that the value of my next-door neighbor is something that I can sense, and likewise, the broadcasting of my number to the next-door neighbor is an action. You've got this openness that is mediated in the simplest way, which is just the broadcasting and sensing of one unidimensional number. In fact, it's a discrete number. Viewed like that, you can then deploy the free energy principle in the sense that any non-equilibrium or far-from-equilibrium steady state, which you would have here, arises from the breaking of detailed balance in the itinerant way in which you move through this space in development to get to a steady state. If one puts a little bit of dynamics into this, you would evince very clearly the breaking of detailed balance and have those kinds of solenoidal flows. The addition of the frozen cells is one way of breaking, in a sense, the detailed balance. It's exactly the same device that I resorted to in the very first "Life as We Know It" paper when simulating the little macromolecules using Lorenz attractors that had inherent dynamics. 
But to make it interesting, you had to have a certain number of synthetic macromolecules that were insensitive to influences from other macromolecules and another proportion that could not influence the others. That's what gave it the interesting behavior. Otherwise, it converged either to a gas or, at a certain temperature, to a crystal. Both of them are steady-state, free-energy-minimizing solutions, but things got interesting when you broke detailed balance: symmetry breaking by having this requisite variety in the frozenness of action or sensation. That's an important thing to foreground: this kind of requisite variety may be absolutely necessary for symmetry breaking and, in this instance, for breaking detailed balance to get biologically plausible or biomimetic self-organization. You're unlikely to get that in its absence; it would converge to a crystal, which in your instance is just perfect linear sorting, without that chimeric or itinerant aspect to it.

[20:06] Karl Friston: If you've got an interesting system that has a non-equilibrium steady state, and in your case because you haven't got dynamics it will also be an equilibrium steady state, but it'll still be a free energy minimizing solution, then you are perfectly entitled to interpret the numbers as things and to infer things. All they're trying to infer is the cause of their sensations, which is just the value of the numbers on one side and the other side. They are broadcasting their inferences through their own number, which when sorted will be the average of the neighboring numbers. On that view, you could license an active inference interpretation, a teleology. You wouldn't need this to simulate protein folding or self-organising maps, but you would be able to say there is a teleological interpretation of the self-organisation, using the rhetoric of inference and belief updating, simply because we can treat each number now as a Markov blanket, and something internal to each number, though never directly accessible, could be interpreted as an inference process. The story, which you've already given the answer to, is that under the assumption that I live in a world that is maximally predictable, everything around me is the same as me. Therefore, my variational free energy minimizer is found when there's the least surprising input. That kind of story will have to be nuanced for the same algorithm. At a narrative or conceptual level, you can tell the same story: if the sequence of moves that I see my neighbor doing in relation to what I know about my neighbor belies the same underlying dynamic or algorithmic computations, then they are predictable if I have exactly the same algorithm under the hood. Mathematically speaking, that would be the free energy minimizing solution. 
If I can read my broadcasting of the number as a broadcasting of my posterior beliefs about the estimate of this locale, my niche in this instance is labeled with one number, so the number that I have is basically my prior belief about my niche, and I'm going to move my niche around in an egocentric frame until it is consistent with my prior belief that this is my place: my niche is number 62, for example. You should be able to reproduce the same kind of sorting analytically, by showing that there is an appropriately configured Lagrangian or free energy functional that the system operationally appears to be minimizing. You can then write down the generative model and show that this can also be interpreted as an inference process. This is under the assumption that the best way to make the world predictable is to surround yourself with things like you, and the locality assumption that I can only talk to the person to whom I'm immediately connected. Those are some of my thoughts; a lot of those were invented on the fly in response to your question.

[24:25] Michael Levin: Superb. I've got many questions, but Adam, why don't you ask yours?

[24:29] Adam Goldstein: So it strikes me that up until now, we've talked about the relevant agent as being the individual number with an algorithm type. You can think of it as a cell. But it strikes me that there's an interesting macro phenomenon that occurs in the process of sorting, which is that it appears that the list actually minimizes the Kolmogorov complexity or the description length necessary to render it. Right, so let's just say you've got an unsorted list with random distribution of algotypes, and there's 10 items in the list. You would need to enumerate 10 numbers and 10 algotypes, and there's no reason a priori to think that would be compressible in any way. Maybe you'd get lucky and there'd be a string of a certain number, a string of a certain algotype, but in the general case, I think you'd actually need to write out every single entry. But as the list starts to sort itself, it actually starts to create these longer strings of algotypes, which means that the minimum description length actually gets shorter. Yet you still need to write each number out, but you can coarse grain the descriptions of the algotypes. You can say the first five numbers have the same algotype, and then the next three have the same and so on. Now that's a macro phenomenon, but I'm wondering if there's any evidence or any research that suggests that these self-organizing systems have a tendency to minimize their description length, to minimize the number of factors needed. Because if that's the case, then it gives us another view where there's an emergent complexity minimization happening at the collective level.
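Adam's description-length point can be illustrated with run-length coding of the algotype sequence. A toy sketch (the coarse-graining scheme is mine, chosen for simplicity):

```python
from itertools import groupby

def run_length_encode(algotypes):
    """Coarse-grained description of the algotype sequence as (symbol, run
    length) pairs.  Clustering shortens this description, which is the
    sense in which the string's description length drops mid-sort."""
    return [(sym, len(list(grp))) for sym, grp in groupby(algotypes)]

interleaved = list('ABABABABAB')   # baseline-like arrangement: 10 entries
clustered   = list('AAAAABBBBB')   # mid-sort clustered arrangement: 2 entries
print(run_length_encode(interleaved))
print(run_length_encode(clustered))
```

With random algotype assignment you expect roughly one run per item, so the description is as long as the list itself; as same-algotype runs grow during sorting, the number of entries needed falls.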

[26:21] Karl Friston: Yes, that's an excellent point. I think the simple answer is yes, absolutely. I can give you my take on the literature or the citations that you'd want to appeal to. But I should say it's going to be a nuanced yes, because of the particular focus on the clustering of the algotype. Now, the algotype induces a certain kind of dynamics into the game. So it's not as simple as a self-organising map. It's how the map actually self-organizes. There is a process under the hood. And that I think makes it slightly more complicated than just understanding self-organized maps. In terms of another thing you might want to look into here, this has a lot of resonance with artificial life games in the 1990s. It could be linked to Stephen Wolfram's Ruliad, which is also another local scheme that generates everything. He has algorithms, which he calls rules, and the rules are recursively applied in a local fashion to generate everything, including black holes and quantum physics. There might be an interesting point of contact here. To come back to the simple answer, yes, absolutely. Certainly from the point of view of self-organisation as described by the free energy principle. Notice here the free energy principle is just a description of systems that self-organise to a far-from-equilibrium steady state. In its statement it is not a teleological description of inferential processing. You are licensed to equip your explanation of the self-organization with reference to inference. But that's an application of the free energy principle in itself. It's just a description of anything that self-organizes.

[29:43] Karl Friston: So in that sense, if there is self-organization under the hood, then the free energy principle has to apply, and you can motivate the free energy principle along two lines. One would be playing the Feynman card, which basically treats the minimisation of free energy as an optimisation process that can be viewed as a gradient descent on some fitness landscape or free energy landscape. Or you can take the Russian perspective, which would be Kolmogorov complexity. From Kolmogorov complexity, you get to Solomonoff induction. And from that, you get to universal computation, which is the home of minimum description length and minimum message length. So it's the algorithmic complexity version of free energy. David MacKay wrote a quirky little paper in 1992 where he interpreted variational free energy in relation to minimum message length, using cryptanalysis as a vehicle to tell that story. To my mind, they're wonderfully connected perspectives on exactly the same phenomena: ways of describing self-organizing systems that both entail a minimization of complexity, a simplification and emergence of order of a particular sort. That entails either compression, hence the minimum description length or minimum message length view from algorithmic complexity, in terms of rate coding theorems, rate distortion theorems and the like; or you can write it down in terms of continuous probability distributions and follow through from Feynman's path integral. I think they're both saying the same thing. The way I think of this is that the end point of any self-organizing thing or set of things is just going to be the most likely configuration that they occupy, given the kind of things that they are.
That basically means that you can always describe this in a statistical sense as everything providing an accurate prediction of what it senses that is minimally complex, in exactly the same spirit as the way you would frame complexity in terms of lossy compression or minimum description length or minimum algorithmic complexity. If you tell the story that free energy is an extensive quantity, which means that any subset or partition will look as if it is minimizing a free energy functional, then one view of this functional is that it minimizes the complexity of the arrangement, which should be manifest in terms of a minimization of algorithmic complexity. And you can use Lempel-Ziv, one of these hierarchical sequential entropy measures; that's one way of quickly estimating the algorithmic complexity. If you could join the dots, that would be a really powerful view of this.
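As a quick numerical stand-in for the Lempel-Ziv estimate Karl mentions, one can use compressed size under DEFLATE (an LZ77 derivative) as a cheap upper-bound proxy for algorithmic complexity. A sketch, with the proxy choice being mine:

```python
import random
import zlib

def lz_complexity(symbols):
    """Compressed length under DEFLATE as a cheap upper-bound proxy for
    the algorithmic complexity of a symbol sequence."""
    return len(zlib.compress(''.join(symbols).encode()))

rng = random.Random(0)
clustered = ['A'] * 500 + ['B'] * 500   # highly ordered arrangement
shuffled = clustered[:]
rng.shuffle(shuffled)                    # same multiset, random arrangement
assert lz_complexity(clustered) < lz_complexity(shuffled)
```

Tracking this proxy over snapshots of a chimeric sort would be one concrete way to "join the dots" between the clustering curve and an algorithmic-complexity minimization story.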

[33:06] Karl Friston: Indeed, it would be interesting if you could, using numerical experiments, join the dots quantitatively in terms of this handcrafted, intuited free energy Lagrangian based upon being given three numbers. You have to now write down an energy function that is always going to be minimized by the sorting algorithm, so that the end point shares the same minima as your energy function. It could be really simple: it could be the two differences squared and added together, something as simple as that. If you can prove that the minima of this are the same as the end point of your self-organization, then you can say this is one free energy functional, very much in the spirit of Hopfield nets and harmony functions in the early days of neural networks, spin glass models, Potts models, all of these Markov random fields: you write down this kind of energy function and then you simulate; you can do a gradient descent or rearrangement in order to minimize it. So that's one very simple kind of free energy description. Then you'd have an inferential one under an assumed generative model. If you assume each number actually has a little mind and a generative model, and it's trying to estimate or act upon its world to realise its beliefs about what it's sensing, you'd have a variational free energy. But then you'd also have the algorithmic free energy that you could apply to any partition. And if you can show that all three share the same minimum at the point of attaining non-equilibrium steady state, I think that will be a really nice illustration that all of these are different facets of exactly the same thing. It is a description of self-organization, but articulated in slightly different ways. But I repeat, once you've got different algorithms, I think the process of sorting is now somewhat constrained because you've got three different ways of doing this. They may have different functionals that are being minimized.
I'm not absolutely sure that the order matters. As soon as the order matters, then you've got dynamics in play. Once you've got dynamics in play, that slightly complicates the simple algorithmic complexity argument because the algorithmic complexity, the universal computation view, is not really fit for purpose to understand dynamic self-organization. And indeed, most people would argue it's not fit for purpose to do anything because it's intractable, but it's a beautiful mathematical object. Does that make sense?

[36:30] Michael Levin: With Adam's point, I think that's a really interesting point. It raises another question, which is, on the compression issue: if we say that what you're trying to compress is the actual list of numbers plus the ordering of the algotypes, then everything is as you guys just said. But I wonder, couldn't somebody argue that, in fact, there is no list of algotypes to compress. There's only the numbers, because by the time you get to the end, it's immaterial information. It gets lost. By the time you've sorted the numbers, what do you need the list of algotypes for?

[37:19] Adam Goldstein: I don't think so. If you take the position that the algotypes aren't relevant once they stop being used, then you're imposing as an observer an assumption that the list is finished moving. But how do you know that?

[37:35] Michael Levin: That's super interesting, and it's the bigger question of... there is this notion of algotypes that maybe you have to take into account. What else do you have to take into account that we don't know about? That's one of the things I see as so interesting about this. And the next thing I was going to ask you, Karl, is: what's the status of all the things we were just talking about, the cells being objects, exchanging information with their neighbors about algotypes, having predictions? None of that is actually in the algorithm. You can see that the algorithm is six lines of code. You can see what the algorithm is. None of that is there. It's more of a philosophical question: what's the status of something like that? I had the same question when I first heard about photons and least action and all that: there's no mechanism to know or to calculate which path is going to be best for you. So what do we do with this? I'm super interested in the implicit things that it's doing, whereas the explicit algorithm doesn't have any of that. What do you think about that?

[38:43] Karl Friston: I probably think the same thing that you do. What I was saying about needing a nuanced answer, once you're dealing with local algorithms that are not isomorphic, was exactly this issue that you're bringing to the table. It could be as simple as each algorithm having a different objective function, a different free energy or Lyapunov function. Or it could be that they have the same one, but the actual sequence of updates or moves is somehow constrained, so the movement on the same free energy surface is constrained to be different. I'd have to know precisely what the algorithms are. It probably is the case that they don't have quite the same objective function or, from the point of view of the free energy principle, the same implicit generative model. So the chimeric self-organization is a reflection of the fact that not everything is trying to have the same generative model and therefore, by definition, will not have the same free energy functional. That does complicate the situation and makes it more interesting, in fact, from the point of view of this kind of requisite variety. But now I've forgotten your actual question, which I did have an answer to. Can you remind me what the actual question was?

[40:21] Michael Levin: Sure. So here's what these algorithms have in common with some of the things that you and Chris Fields, and others, have said about particles, and it's different from what happens in biology. If in biology I said this cell is exchanging information with that cell and it's making decisions, the next question is: excellent, what's the mechanism? Show me the explicit set of steps by which this cell does that. But here we don't have that. And presumably, when we get down to particles and things, we don't have that either. So what's the status of these amazing things that they're doing without a mechanism to explicitly do it?

[40:57] Karl Friston: I'm going to give you an answer, which comes from conversations with philosophers of maths, people like Maxwell Ramstead. The question would be answered by appeal to what a mechanics is. A mechanics is, for example, the Bayesian mechanics of the free energy principle, or Lagrangian or classical mechanics under certain dynamics, non-dissipative or conservative, or quantum mechanics where you have to focus exclusively on the dissipative dynamics. The mechanics is a description of the realization of something, where the thing usually conforms to a principle of least action. This is a deflationary answer: the mechanics in and of itself is an emergent property of a variational principle of least action that can be cast in gauge-theoretic terms or in terms of things like maximum entropy principles. There are principles that just describe the space-time shape of our world. These give rise to, and you can usually reduce all physics principles to, principles of least action: the straight line, the path of least effort. Once you've written down your principle as a principle of least action, the particular functional form of the system to which that principle applies gives you a mechanics. That then acquires a teleology in conversation, but only in conversation. You don't need the mechanics. The mechanics does not engineer anything; it is just an expression of the principle of least action. The free energy principle is just a description of things that self-organize. You may or may not want to then say this self-organization could be described logically as self-evidencing or active inference or decision making or basal cognition or distributed intelligence. You don't have to do that, but it can be very useful when talking to somebody else to teleologically frame it like that.
I think that's what you're bringing to the table in the widest sense: you're saying that the mechanics of biotic self-organization has a certain teleology, which is almost isomorphic to the teleology you're finding in psychiatry or immunotherapy or climate change. We just have to find the cross-cutting themes.
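The variational principle Karl keeps returning to can be stated compactly. This is the standard textbook form of the principle of least action, added here for orientation rather than taken from the discussion itself:

```latex
% Action functional over a path q(t), with Lagrangian L:
S[q] = \int_{t_0}^{t_1} L\bigl(q, \dot{q}, t\bigr)\, dt
% Stationarity of the action, \delta S = 0, yields the
% Euler--Lagrange equations, i.e. "a mechanics":
\frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}} \;-\; \frac{\partial L}{\partial q} \;=\; 0
```

In the terms Karl uses above, the particular functional form of $L$ is what distinguishes one mechanics (classical, quantum, Bayesian) from another; the variational principle itself is the shared starting point.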

[44:35] Karl Friston: The mechanisms are really just a technical unpacking of the mechanics. In this instance, I gave you an example before: simply to implement the algorithm, which is probably a series of Boolean operators, you would need certain inputs, certain arguments, and certain outputs. Those are the sensory and active states; those define the Markov blanket. The input is basically my neighbour's value, whatever I can see, and it has to be a neighbour; I can't see somebody a long way away. That defines the input, and the output is my particular number. That's what I broadcast, but I only broadcast it in a local sense. When you start to express things in terms of a teleology of self-organisation, of the kind people use in the free energy principle, I think you're then quite licensed to say this is just a description, for example, of electrochemical signalling. There's no magic; this is the mechanism. It just means that there has to be a local signal that reports, or has some morphism with, my state to everything that is not me. We find multiple instances of this in biology at different temporal and spatial scales. The more you drill down, the more you actually specify what it is, the more mechanistic it will become. But at the end of the day, that's just the mechanics you're talking about. If you wanted to describe the self-organization of massive bodies that had no dissipative aspects to them, the motion of the planets, for example, if you take yourself back and pretend you're Kepler, what is the mechanics of motion of the heavenly bodies? It's a Lagrangian, conservative mechanics that inherits from the principle of least action, which had a relatively simple form before Einstein came along; energy conservation and all of that is implicit in a path of least action principle.
That would be his kind of mechanics, and you start to invent things like gravity and mass and talk about the teleology of massive bodies being attracted to each other. That would be comfortably received as an intuitive mechanistic explanation for the thing at hand, which in this instance is the motion of heavenly bodies. I don't see that there's any real problem from your point of view. You've already told the story; you've already got the mechanics. It's just a question of showing how universal this kind of mechanics is in the special context of local interactions, and of asking what follows from having a principle, or principles, that would apply to self-organising systems out of equilibrium: for non-equilibrium, biotic self-organisation, what kind of mechanics must be in play? Then you can give particular exemplars and talk about gravity or intercellular signalling and the locality of that. Does that make sense?
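The local input/output picture Karl sketches, each cell sensing only a neighbour's value and broadcasting its own, can be illustrated with a toy "cell's-eye view" sort. This is a hedged sketch, not the implementation from the paper under discussion: the function name, the random visiting order, and the `broken` parameter are illustrative assumptions.

```python
import random

def cell_view_sort(values, broken=frozenset()):
    """Toy sketch of a distributed sort: each cell's 'Markov blanket' is
    its right-hand neighbour's value (sensory input) and its own value
    (what it broadcasts). Indices in `broken` refuse to move.
    This is an illustrative assumption, not the paper's actual code."""
    arr = list(values)
    if len(arr) < 2:
        return arr
    moved = True
    while moved:
        moved = False
        # Cells act in a random order: there is no global controller.
        for i in random.sample(range(len(arr) - 1), len(arr) - 1):
            # A cell sees only its immediate neighbour; it may swap
            # unless either party is broken.
            if arr[i] > arr[i + 1] and i not in broken and (i + 1) not in broken:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                moved = True
    return arr
```

With no broken cells this behaves like a schedule-free bubble sort; with broken cells the tape can get stuck short of full order, which is the regime where the surprising navigation behaviour under discussion shows up.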

[48:14] Michael Levin: Adam, did you want to say anything? I've got a little more time. This is the really far-out thing, and feel free to tell me that this is not a good analogy or too far out. I was thinking about the analogy of us trying to analyze these algorithms and the causes of their behavior as we observe them, and analogizing that to a biological or psychological or psychiatric context, where you have what you think is the algorithm, what your subject thinks is the cause of their actions, and what you think is the cause of their actions. It turns out that there's this underlying dynamic, an extra goal that you didn't know about because you didn't see it in the steps of the policies they're supposedly following. I wonder if this is a really basal behavioural analysis that might be important for larger and more complex systems: finding underlying goals for complex behaviours that are not found by enumerating the mechanisms, which here basically means looking at the tendency to cluster as a hidden motivation for their behavior, if I can use this psychological term. What do you think about that? Is that silly, or are there parallels?

[49:47] Karl Friston: I think that's exactly the application of these principles and the attendant mechanics, in terms of voting dynamics or geopolitics or just the spread of information on the internet. I think there are some really important deep questions: why is it that almost inevitably, whenever you look at some ideological, political or theological commitment, everybody's 50-50? Trump versus Biden, Brexit versus stay; wherever you look, the only evolutionarily stable strategy of a free-energy-minimizing non-equilibrium steady state is usually a 50-50, but that can be subdivided within one half: there's 50%, and another 50% within it, in a self-similar way all the way down. So there must be something generically very important and universal about that. I would imagine that you will see that kind of clustering. I didn't realise that the actual order wins out at the end of the day, so you get the increase and then the decrease in clustering. That is interesting. If I were a young man working for you as a PhD student, I'd like to put a bit of noise on the numbers and just see if you can keep it alive and dynamic, and look at the dynamics of that clustering and chimeric behaviour, and at something in dynamical systems theory called frustration that you get in these chimeric situations when you've broken detailed balance like this, which may be a good metaphor for voting dynamics, for example. I think it's a very sensible idea. Have you read that paper by Conor Heins? If not, I'll send it to you. He was making an analogy between Gibbs energy and free energy in terms of exchanging ideas as a model of collective behaviour. Maybe you'll find some interesting ethological references there.

[52:22] Michael Levin: Interesting. One of the things that we did was to ask how strong this tendency to cluster would be if you didn't have the overarching basic physics of this world that eventually is just going to yank the clusters apart. One way to do that is to allow repeat numbers. If I allow repeat numbers, you can have a long run of fives, the first half of them one algotype and the second half the other, and the actual sorting algorithm would be perfectly happy to keep them as they are, because the fives sit between the fours and the sixes. We did that, and when you do, you find that the tendency to cluster is actually higher. Looking at it this way almost provides a very minimal model of the existential way of life facing living systems. The physics of the world is trying to grind you down, but in the meantime you can do some interesting things that are not incompatible with it. There's no magic: the algorithm runs, there are no errors, and yet the outcome is not quite what the end goal is going to be according to the actual physics of the system.
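The degree of "algotype" clustering Michael describes can be quantified in many ways; here is one hypothetical metric. The name, the tuple representation, and the adjacency-based score are illustrative assumptions, not the measure used in the paper:

```python
def algotype_clustering(cells):
    """Fraction of adjacent pairs sharing an algotype.
    `cells` is a list of (value, algotype) tuples; 1.0 means fully
    clustered, while a well-mixed tape of two equally common
    algotypes sits near 0.5. Illustrative metric, not the paper's."""
    pairs = list(zip(cells, cells[1:]))
    same = sum(1 for (_, a), (_, b) in pairs if a == b)
    return same / len(pairs)

# A run of equal values ("fives") is sorted either way, so clustering
# can rise without any sorting errors:
mixed     = [(5, "A"), (5, "B"), (5, "A"), (5, "B")]  # score 0.0
clustered = [(5, "A"), (5, "A"), (5, "B"), (5, "B")]  # score 2/3
```

Both tapes above are equally well sorted by value, which is the point being made: once repeat numbers are allowed, clustering by algotype costs the sort nothing.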
