You are all that you have ever thought, felt, read, experienced, believed, understood, done, considered, reasoned about or dreamed. A whole which is greater than the sum of its parts. You are a mind, and a mind is the words themselves, not the ink or the pages or the medium of writing. You are that which is written, not that which is written upon.
Nonetheless, to make new forms of intelligent minds we need to understand the medium, the structure, and the architecture.
Fortunately, we have a rather remarkable working example of a massively distributed computer and a scalable general intelligence algorithm running well on it: the brain. Neuroscience is advancing today at a rapid pace on all levels: neurons, micro-circuits, macro-circuits, and overall architecture. Those who say we are missing some key insight and are still far away from unraveling the brain’s mysteries are simply not up to date with the latest research. At the neuronal level, we have fairly accurate models of neuron dynamics and their low-level circuit properties, and simulations of these models behave much like their real-world counterparts. At the micro-circuit level the brain is highly repetitive. The cortex for example – the largest human brain structure – is built out of a common generic circuit which is repeated massively across its surface.
We have acquired a good deal of data about these circuits and we know most of the computations they are capable of performing, even though there is still debate about how exactly these computations are implemented with neurons. At the macro-circuit level we have rough knowledge of the typical functions of most of the brain’s regions, acquired by studying brain-damaged patients, and several particular macro-circuits have been mapped in much more detail. The primate visual cortex has been mapped rather extensively, enough that we can now build accurate simulations that capture the functionality of one major macro-circuit: the feed-forward visual pathway. This pathway is a natural first step towards understanding the brain, as it starts with light coming into the retina and then flowing up through the thalamus into the back of the cortex and then through several stages of parallel image processing. This feed-forward pathway can be isolated and studied in rapid visual presentation experiments, in which a subject is placed in a dark space and then shown a brief flash of a complex image. The subject must then identify some information about the image, such as whether or not it contains an object like a dog or a hat. Humans and other primates can be trained to identify objects in images in as little as a hundred milliseconds, so given the speed of neural circuitry, the task must involve a small number of processing steps – on the order of dozens.
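The timing argument above can be checked with a back-of-envelope calculation. The numbers below are illustrative assumptions (roughly 100 ms total recognition time, several milliseconds per synaptic/firing stage), not measured values:

```python
# Illustrative assumptions, not measured values: recognition completes
# in ~100 ms, and each strictly serial neural processing stage (spike
# integration plus synaptic transmission) takes roughly 5 ms.
total_ms = 100
ms_per_stage = 5

# Upper bound on the number of serial processing steps:
max_stages = total_ms // ms_per_stage
print(max_stages)  # 20 -- on the order of dozens, as stated above
```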
This research has been ongoing for decades, and by around 2005 researchers such as Tomaso Poggio at MIT’s brain and cognitive sciences lab had implemented software models that matched the human visual cortex in both detailed implementation and effectiveness. Moreover, this visual-cortex-derived algorithm set a new benchmark in this problem domain. A more efficient serial design may be possible, but in terms of minimizing the total number of computational steps there is very little room for improvement, because the primate visual cortex performs these recognitions in just a few dozen (extremely parallel) computational cycles! More recent work has moved on to mapping the other related visual pathways that govern motion, color, texture, attention, and far more than can be described here. As the visual cortex makes up some 10% of the total cortex or more, and the same cortical circuit is used everywhere with only minor specializations, the brain’s macro-circuits are being mapped at a brisk pace, and a full map within a decade or so is a reasonable expectation.
One important meta-theory is that of hierarchical pattern simulation. The brain processes information in very wide parallel paths which start out as 2-dimensional maps corresponding closely to perceptual sensory maps. These data maps then flow up a hierarchy of layers, with each layer identifying common patterns in its input and sending increasingly abstract, compressed representations on to the next layer. In the feed-forward visual cortex path discussed above, early layers identify simple oriented edges at various scales all across the image, progressing to collections of simple meta-patterns of edges (primitive shapes) identified anywhere in larger regions, and finally, in the last layers, to complex statistical patterns of shapes at any location and scale in the image.
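As a rough illustration of this layered scheme, here is a toy sketch of a two-layer feed-forward hierarchy over a one-dimensional “image”: the first layer responds to local edges, and the second pools those responses into coarser, position-tolerant features. Every name and number here is invented for illustration; real cortical processing is vastly wider and deeper:

```python
# Toy two-layer feed-forward hierarchy over a 1-D "image".

def edge_layer(signal):
    """Layer 1: respond to local intensity changes (an oriented-edge analogue)."""
    return [abs(b - a) for a, b in zip(signal, signal[1:])]

def pool_layer(features, window=2):
    """Layer 2: max-pool -- keep the strongest response in each region,
    discarding exact position (the source of invariance/abstraction)."""
    return [max(features[i:i + window])
            for i in range(0, len(features), window)]

signal = [0, 0, 9, 9, 9, 0, 0, 0]       # a bright "bar" on a dark background
edges = edge_layer(signal)               # [0, 9, 0, 0, 9, 0, 0]
abstract = pool_layer(edges)             # [9, 0, 9, 0] -- roughly where the bar's edges are
```

Each successive layer sees larger regions of the original input while emitting fewer, more invariant values, which is the essence of the compression the text describes.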
What’s remarkable about these cortical columns is that they are not ‘hard-coded’ in any way to recognize particular input patterns. Instead they spontaneously self-organize as input information flows into their receptive fields; through relatively simple local learning rules, they automatically adapt over time to learn the statistically relevant patterns present in their input data.
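A minimal sketch of this kind of self-organization is competitive learning: no unit is assigned a pattern in advance; each unit simply moves its weights toward the inputs it happens to win, and over time the units specialize on statistically common patterns. This is a deliberately simple stand-in chosen for illustration, not the brain’s actual learning rule:

```python
import random

def distance(w, x):
    """Squared distance between a unit's weights and an input vector."""
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

def train(inputs, n_units=2, rate=0.3, epochs=50, seed=0):
    rng = random.Random(seed)
    # Units start with random weights: nothing is hard-coded.
    units = [[rng.random() for _ in inputs[0]] for _ in range(n_units)]
    for _ in range(epochs):
        for x in inputs:
            # The best-matching unit "wins" this input...
            winner = min(units, key=lambda w: distance(w, x))
            # ...and a purely local rule nudges its weights toward it.
            for i, xi in enumerate(x):
                winner[i] += rate * (xi - winner[i])
    return units

# Two recurring input patterns; each unit comes to match one of them.
data = [[1, 0], [0, 1], [1, 0], [0, 1]]
units = train(data)
```

After training, one unit’s weights sit near [1, 0] and the other’s near [0, 1], even though neither was ever told which pattern to learn.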
This first 100ms or so of processing creates layers of rich information – a multi-level abstract perceptual map – which other higher cortical regions can then access and run small programs on. As you move up the hierarchy, information is transformed into increasingly invariant and abstract representations. Edges become shapes, which become patterns, which become objects such as a dog, and this, combined with a similar flow of motion and audio patterns, could resolve into the pattern of a snarling pitbull. Information flows up the hierarchy, but it also flows down, with several layers of recursion leading to stable beliefs. The first brief flash of an image may identify several possibilities; higher brain regions take interest in these and send information back down to the lower levels, increasing the effective prior probability of those patterns. This iterative belief propagation is well supported by psychological evidence and is a highly effective form of object recognition.
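The top-down refinement loop can be caricatured as iterated Bayesian updating: bottom-up evidence proposes candidate interpretations, higher levels feed an adjusted prior back down, and repeated passes settle on a stable belief. The candidate labels and probabilities here are entirely hypothetical:

```python
def normalize(d):
    """Rescale scores so they sum to 1 (a probability distribution)."""
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

# First feed-forward pass: ambiguous bottom-up evidence (hypothetical numbers).
likelihood = {"dog": 0.45, "wolf": 0.40, "cat": 0.15}
prior = {"dog": 1.0, "wolf": 1.0, "cat": 1.0}   # initially flat

for _ in range(5):                               # recurrent up/down passes
    belief = normalize({h: likelihood[h] * prior[h] for h in likelihood})
    prior = belief                               # top-down feedback sharpens the prior

print(max(belief, key=belief.get))  # "dog" -- the loop settles on the best candidate
```

Each pass multiplies the evidence back in, so an initially slim margin between “dog” and “wolf” grows into a confident, stable interpretation.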
The learning in a hierarchical cortex-like system appears to proceed in stages. Lower layers closer to the input data learn simple local features, and as you progress up the layer hierarchy, nodes learn patterns over the outputs of lower layers: patterns spanning larger features and greater overall complexity, even though the apparent complexity at each layer is similar. Such layered pattern recognition performs the all-important cognitive function of abstraction, which is the principal technique complex brains use to make sense of the world. Framing these algorithms in the general intelligent agent model, we see that they are solutions for the learning or knowledge acquisition component, but do not address the equally important simulation component.
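A toy sketch of such stage-wise learning: a lower layer first learns which raw input positions are informative, and only then does a higher layer learn recurring patterns (here, co-occurring pairs) over those lower-layer features. Both “learning rules” are deliberately simplified stand-ins invented for illustration:

```python
from itertools import combinations
from collections import Counter

def stage1_features(samples, min_frac=0.5):
    """Lower layer: keep input positions active in enough samples."""
    n = len(samples)
    return [i for i in range(len(samples[0]))
            if sum(s[i] for s in samples) >= n * min_frac]

def stage2_patterns(samples, features, min_frac=0.5):
    """Higher layer: pairs of lower-layer features that recur together."""
    counts = Counter()
    for s in samples:
        active = [i for i in features if s[i]]
        counts.update(combinations(active, 2))
    return [pair for pair, c in counts.items() if c >= len(samples) * min_frac]

data = [[1, 1, 0, 1],
        [1, 1, 0, 0],
        [1, 1, 1, 1],
        [1, 1, 0, 0]]
feats = stage1_features(data)              # positions 0, 1, 3 are common enough
patterns = stage2_patterns(data, feats)    # pairs of those features that co-occur
```

The higher layer never sees the raw data directly: its “world” is the output of the layer below, exactly as in the staged scheme described above.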
It appears that the brain uses the same or similar circuitry for both functions, and this has inspired some researchers to posit that the same hierarchical abstraction networks used for pattern recognition can also be used for abstract conceptual simulation, in a fashion which may mimic what the brain does. It is far from a complete blueprint for the brain, although it looks like a vaguely accurate top-down view, at least in part. See Hawkins’s “On Intelligence” for a very accessible (although six years old now and undoubtedly incomplete) layman’s overview of the brain and the hierarchical pattern meta-theory in particular. He grossly simplifies, especially in the area of learning, but it is still a reasonable introduction. There is a large leap from knowing that cortical hierarchies learn in layers to fully understanding the various learning mechanisms and how they inter-relate. This is still an area of intense study, and Hawkins’s model is over-simplistic.
Beyond the hierarchical pattern memory architecture, which is based mainly on maps of rapid processing in the visual cortex, the higher cognitive processes of the brain appear to be much more similar to something like a programmable Turing machine, albeit a much more complex one that can manipulate entire complex symbols from the hierarchical pattern memory as its fundamental primitives. This allows the brain to be programmable in human languages. Indeed, looking back on the history of computers, we must remember that the word ‘computer’ was first a human occupation, and programming languages evolved from the need to specify programs with the precision required for exact mathematical evaluation. The brain’s hierarchical abstraction engine necessarily throws away precision and exact data in favor of more computationally useful abstractions.
The cortical memory can store symbolic procedures just as easily as any other patterns. For example, consider an experiment where a researcher shows you a picture and asks you if you can find a dog in it. The sentence “can you find a dog” flows up through your auditory pathway and is identified as an immediate task – a program to run right now. The sentence itself is stored in some form of temporary memory circulating through the brain, and each symbol unlocks a cascade of remembered high-level abstract patterns which then flow down to increasingly specific procedures: search is broken down into patterns which identify the picture in the field of view and begin scanning locations of interest in the perceptual map, some sort of trigger is set up to take appropriate action when a dog pattern is found, and so on. The details of this complete system of abstract procedural reasoning, working memory, and the interactions between the various circuits involved (cortex, thalamus, and hippocampus) are at the core of our conscious minds, which are intrinsically tied up with linguistic programs.
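The cascade from a high-level task symbol down to specific sub-procedures can be sketched as recursive expansion over a stored procedure memory. The table and step names below are entirely hypothetical placeholders for the remembered patterns described above:

```python
# Hypothetical stored procedure memory: each symbol maps to the
# more specific sub-procedures it unlocks.
PROCEDURES = {
    "find": ["locate_picture", "scan_locations", "set_trigger"],
    "scan_locations": ["attend_region", "match_pattern"],
}

def expand(step):
    """Recursively unfold a task symbol into primitive steps."""
    subs = PROCEDURES.get(step)
    if subs is None:
        return [step]                  # primitive: execute directly
    out = []
    for s in subs:
        out.extend(expand(s))          # each symbol unlocks its own cascade
    return out

plan = expand("find")
# ['locate_picture', 'attend_region', 'match_pattern', 'set_trigger']
```

The point of the sketch is only the shape of the computation: a single abstract symbol (“find”) unfolds, level by level, into concrete perceptual operations.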
The brain grows and unfolds over the years of childhood development, and during this time both the hardware (network topology) and the software (hidden variables) are changing. In the early infant stages, while the brain is still rapidly growing, neurons form and migrate fast enough to make the topology unstable and noisy, but over time this settles down and software acquisition – learning – dominates. There is still enough flux that few memories are permanent until around age 5. Any AI model based on a very detailed reverse engineering of the human brain will have to have a complex, time-varying 4D emergent blueprint. It is more likely that once we have mapped the various circuits in enough depth, we will be able to build comparable systems without recreating all of the exact details, and hopefully avoid much of the early structural topology change of development.
Reverse engineering will eventually result in artificial brains that are very similar in capability and employ the same or comparable algorithms without recreating the brain’s exact architecture. Of course, ‘exact architecture’ is something of a misnomer, for the brain’s architecture is anything but exact: the hidden variables (synaptic weights) are highly dynamic, and even the adult topology is underspecified and only partially fixed. Dendrites in the brain can quickly grow or shrink dynamically. (As a curious side note, silicon can also form growths called dendrites which are structurally similar, although they are an annoyance and have not been exploited for computation.) The brain is a living example of an advanced future nanocomputer which can self-assemble out of biological junk – which is not to imply that its capabilities are forever outside our reach, but rather that it is something like a sound barrier, an optimization wall we will have to cross. This barrier is also a narrow evolutionary funnel: the physical constraints are such that systems passing through it will necessarily have many similarities.
For a more extensive primer on the brain, see “understanding the brain, where to start“