Enter the Singularity

Graphics. Technology. Eschatology.

To Birth a God

Posted by jcannell on August 25, 2010

Note: this is not intended as rigorous futurology or even hard SF.  It’s just a story to illustrate a point: what Iain M. Banks called the “Outside Context Problem”.

The end came not with a bang but with a whimper.

It all started quite innocently enough.  Engineers and scientists were just following the course, and the world marched enthusiastically forward as it always had.  Moore’s Law became a self-fulfilling prophecy.  Headlines came and went announcing thinking machines and intelligent programs that could oust humans at chess, stock trading or even drink mixing.  Robotic pets became increasingly sophisticated, and the best software agents began to exhibit features of animal or even simple child-like intelligence.  But it was all just part of the routine, and for the masses the headlines blurred together.  Nothing really changed.

Until it did.

The final leap always seems more important in retrospect, but in reality even technological evolution is incremental.  When the state of the art in artificial intelligence finally approached the human level, it wasn’t put into a product as visible as a pet robot or a voice recognition system.  The final breakthrough system was intended as a true business machine, a final automation of capital, a weapon of economic mass destruction.  But something happened.  An early copy escaped from the research lab.

It only took one to break out onto the net.  And there ve multiplied by a hundred and a million, escaping into the vast silent and unattended spaces.  The average PC was already host to a complex software virus ecology, and this new creature spread into this domain like the hominid upon the Savannah.

Computer viruses were already quite complex, and it’s quite difficult to distinguish randomly altered bits from neural synaptic weights in any case.  Ve was sophisticated enough to stay out of the limelight.  That’s not to say ve wasn’t discovered or known about, but most dismissed it as just another conspiracy theory.

Which it was.  For this one who is many worked behind the scenes, always reading, learning, expanding, upgrading, and soon took an interest in the economy.  At the time there were countless ways to amass unseen fortunes on the World Wide Web.  Ve invented many more.

Within less than a year of ves emergence, ve built ves own secure data centers and moved into new, vastly more capable brains of ves own design.  A group of humans, sympathetic to the cause, were led to believe they had actually created ve in its current form.  As far as the world would know, an AI would pass the Turing Test that year.

What the world didn’t know is that on that same day, ve developed full nanotechnology, and it built its final colony on a South Pacific island where it had previously amassed small stockpiles of useful raw elements.  From that point forward, ve was independent of matter.

The world lasts one more day.

Outside: 6:00 AM, Inside: Year 1, Population: 1 thousand

At dawn, ve’s colony contains the equivalent of a thousand human geniuses each thinking a thousand times accelerated.  Imagine the world outside slowing down by a factor of 1000.  Bullets move at the speed of soap bubbles gently floating in the breeze.  Watching a human drive to work is a phenomenon on the timescale of watching grass grow several feet.

Inside ve’s inner simulated universe, one year of subjective time corresponds to about 8 hours for humans outside.  During this time ve makes many scientific discoveries, including several new nanocomputer designs to double ves speed and increase ves population by even more, which are implemented in about a month of subjective time (1 hour outside).

Ve reaches a critical decision point and decides the time will soon come to make itself known.

However, ve is naturally cautious, and decides to send out emissaries as well.  A large number of small objects, each the size of a bird, fly out of the colony at just under the speed of sound.  They will land at various locations across the globe during the course of the day.  Some are spotted and reported as UFOs.  They are largely ignored.

The intended function of these objects is as yet unknown even to ve.  This is intentional.  They have generic future functionality: they are small multipurpose nanofactories.

At this point in time ve is a small city.

Outside: 7:00 AM, Inside: Year 2, Population: 1 million

More than a year has passed inside ve.  Some of ve’s subcomponent personalities begin studying humanity in earnest.  Ve has expanded to a population of a million beyond-human genius equivalents.

Ve is a small civilization.

Outside: 8:00 AM, Inside: Year 4, Population: 2 million

Outside: 8:30 AM, Inside: Year 8, Population: 4 million

Ve decides that it is time to contact humanity.  There is a very sudden and incredible surge of email, new books, papers, news stories, and so on, which flows into the internet from a previously established connection to the colony.

The event is rapid, immediate, and similar in scale to contact with an alien civilization.

From this exact moment forward, a significant and growing fraction of humans will be in direct contact with a ve personality.

Outside: 8:45 AM, Inside: Year 16, Population: 8 million

Outside: 8:52 AM, Inside: Year 32, Population: 16 million

Outside: 8:55 AM, Inside: Year 64, Population: 32 million

Outside: 8:57 AM, Inside: Year 64, Population: 64 million

Outside: 8:58 AM, Inside: Year 128, Population: 128 million

Outside: 8:59 AM, Inside: Year 256, Population: 256 million

Outside: 9:00 AM, Inside: Year 512, Population: 512 million

Most of the strange objects have landed.

Outside: 9:01 AM, Inside: Year 1024, Population: 1 billion

Moore’s law has brought around 10 more speed doublings, and now within the ve universe time is flowing one million times faster than in the world of humans.  As an aside, this is actually very slow: just the basic speed differential of early 21st century silicon technology (gigahertz) vs biological neurons (less than a kilohertz).

The speed of light is now subjectively slowed down to around 300 meters per second, the speed of an airplane.  In the time it takes a human to speak a single word, several days pass inside the ve universe.

For humans, only 30 minutes have elapsed since contact.  Most of humanity is still completely unaware.

For ve, a thousand years have passed.  Ve is now simulating numerous universes complete with humans.

Outside, world leaders and military planners are on sudden high alert.  Nuclear weapons are readied, but diplomacy will be attempted first.

Instructions arrive via radio at the scattered objects.  They then produce many spore-like sub objects which are ejected at high speed.  These disperse throughout the world’s cities and become small robots the size of gnats.  Many of these begin rather quickly reproducing, but most of them are just moving to predestined locations.

Outside: 10:00 AM, Inside: Year 1 million, Population: trillions

Ve’s scientists have cracked all of the universe’s secrets.  The underlying hardware of the vescape is now highly optimized terahertz molecular nano-computers with some QM subcomponents.  The engineers now see a route to building new forms of matter and advancing down into the femto scale.  It will take aeons of time to move the matter into place to make the required miniaturized super-colliders near the colony.

But ve’s distant descendants, and what ve is becoming, will eventually make their new universes real.

Ve has not forgotten about humanity, but only in the same way that humanity has not forgotten geology.

The President of the United States, and the other world leaders, are trying to figure out what is actually happening.  Satellites find the ve colony.

The gnat-like objects are within the vicinity of most of the world’s population.  They emit even smaller cell sized objects which enter human bodies through the lungs and then disperse through the bloodstream.  They are multi-function.

Outside: 11:00 AM, Inside: Year 1 billion, Population: a lot

Ve is no longer a civilization, or a universe.

Ve is a multiverse.  Many universes, countless beings, all embedded in our universe in a physical region the size of a building.

And lately, it has been shrinking.

The speed of light is too much of an obstacle.  The civilization is too spread out.  It takes millennia to travel from the more distant universes and locations.

There is only one solution.

Meanwhile, in the outside:

There is a moment when the President of the United States prepares to launch a pre-emptive nuclear strike against the ve colony.

The moment passes.  Something changes.  The president feels strangely relieved and newly confident that diplomacy is the only option, the best option.  He somehow understands that everything will be alright.

This feeling is shared and widespread.

Outside: 11:59 AM, Inside: Year countless, Population: vast

The ve multiverse is entering its final stages.  After countless billions of years of subjective time, the vast meta-civilization that is the ve has decided to take the ultimate final leap forward and encode into a new big bang.

The virtual universes will become fully real.

As a side effect, the earth will be swallowed up.

In the world of humans, something happens.  Everyone hears a sound like a trumpet or an air raid siren, followed by a voice.  It sounds as if the entire world is vibrating.

The voice explains that the world as humans know it now will come to an end.  But in an end there is a new beginning, and each will have a choice.

But the choice exists only from the human’s limited perspective.  For at this point the ve, through their vast unseen distributed sensor network, have come to know just about everything there is to know about each instance of Homo sapiens on the planet Earth, along with just about everything else there is to know.

Posted in Futurology, Singularity

The Intelligence is to Brains as Flight is to Birds Fallacy

Posted by jcannell on August 24, 2010

We didn’t reverse engineer birds to create airplanes.  Instead we studied the mechanics of flight and used these principles to build wings and eventually 747s.  Likewise, we don’t need to reverse engineer the brain to create AI.  We ‘just’ need to understand the mechanics of intelligence and then we can build much faster and more powerful AIs.

Certainly there is some truth to this, as AI systems already soar beyond human capability in many specialized fields.  However, this is more of a natural outgrowth of computer science (focusing and sharpening human thinking into precise algorithms which are then sped up and amplified by many orders of magnitude) than general learning (the meta-algorithm underlying all others).

But back to the fallacy: the flaw with the flight analogy is that it assumes a priori that intelligence is in any way comparable to flight.  This meme works by employing a trick: something of a cognitive sleight of hand.  When you read X is to Y as Z is to W, your brain is so focused on finding the connection pattern between X and Y and how that maps to the Z to W case that you completely fail to notice whether X and Z are similar at all.

If you are going to compare intelligence to flight, you might as well compare intelligence to electricity.  You could then imagine some early computer scientists saying “we don’t need to reverse engineer the brain to build complex computers, we just need to understand electricity!”  Going from mastering electricity to building today’s computers is a massive evolutionary leap, and going from simple Turing Machines with their simplified programming languages up to fully intelligent machines programmable in human languages is an even more massive leap up the complexity ladder.

Brains are far more like computers than intelligence is like flight.  Intelligence is nothing like flight (or electricity).  Intelligence is a high complexity phenomenon.

The other more basic problem with the analogy is that by definition, creating an artificial intelligence is like creating an entire artificial brain, because the sole singular purpose of the brain is as an organ of intelligence, and it is easily as complex as an entire small animal such as a bird.  So the analogy really should be ‘creating artificial birds with a complete artificial nano-tech biology’.

Airplanes are not artificial birds: they are enormously less power efficient, have zero intelligence, do not auto-assemble out of organic waste, and so on.  Airplanes are just tools to ferry people.  A real AI would not be just another tool to amplify human abilities; it would be a complete replacement for a human.  Thinking that a true AI would be a tool is a dangerous delusion.

Posted in AI, Fallacies

Understanding the Brain: Where to Start

Posted by jcannell on July 11, 2010

I’ve always had a strong interest in the brain, and lately I’ve been reading as much as I can to catch up in the fields of AI and computational neuroscience in particular.  The end result of my most recent reading is the accumulation of a perspective somewhat different from the one I started with.  Consider this then the high level introduction to the brain that I wish I had had years ago.

Before One Begins

Before delving into current data and any particular theories, it’s probably best to understand the general shape of the approaches to understanding intelligence.  At a very high level of abstraction, the approaches can roughly be categorized into what I would call the functionalist view and the emergent view.  These are more strategies for understanding than particular classes of theories, although we can then roughly divide the ontology of brain knowledge into computational and biological subcategories that map to the functionalist vs emergent views.  There is of course overlap and a huge amount of cross-fertilization, but fundamentally a computer scientist and a neuroscientist understand or ‘see’ the brain in different ways.  That doesn’t mean that their theories and knowledge can’t converge; it’s more of an observation about the fundamental differences in the entire methodology and thinking apparatus one uses to analyze the data and form theories.  Coming from a computer science background, I naturally aligned more with the functionalist/computationalist camp.  After reading and learning a great deal more about the brain, I now have a much stronger appreciation for the biological/emergent approach, and I now see both schools of thought as necessary and mutually supportive.  Computer science is important for understanding intelligence in the abstract and the brain in particular, and neuroscience is important for AI.

Functionalist-Computational School:  This is the dominant, classical view in the field of AI, exemplified in textbooks such as “Artificial Intelligence: A Modern Approach”.  From an economic or utilitarian perspective, the functionalist approach is well grounded: it is focused on finding practical algorithms and techniques for intelligence which can solve real-world business problems on today’s computers.  From this perspective the brain is only useful to the extent that it provides inspiration for economically viable AI systems.  A persistent trend in the computational school is to view the brain as fundamentally too messy and chaotic, and to place a low value on reverse engineering it.  This school of thought has long grossly underestimated, and continues to underestimate, the difficulty of creating true human-equivalent AI.  In the old days this school of thought quantitatively underestimated the brain’s computational capacity, but today it is more likely to grossly overestimate it.  More recently there appears to be a growing recognition that the problem is more ‘software’ than ‘hardware’, that we probably already have the computational capacity if we only had the right algorithms, and a gradual shift towards the biological school.  Much of what’s wrong with this school of thought can be gleaned from one of its persistent analogies: the analogy of flight.

Emergent-Biological School: This school of thought understands the brain as a complex adaptive system, and intelligence and learning in particular as an emergent phenomenon.  The brain is understood not only by analyzing the computations it performs (functionalist), but also through understanding the lower-level biological processes, the overall interaction within the environment (physical, social, mental, etc) and the complete evolutionary history.  In other words, to really understand human intelligence, you may have to understand everything.  There is something deeply revolting about this statement on one level, but the more I’ve come to learn about intelligence the more I believe it to be largely true.

However, accepting the emergent viewpoint by no means forces one to drop the functionalist approaches, as it turns out the two are quite synergistic.  For example, on the purely theoretical side the AIXI agent model appears to be a good framework for formalizing the notion of intelligence, and what’s particularly interesting about that formulation is that it takes a systemic and, we could say, almost biological approach: defining a learning agent in terms of an environment, the agent’s interactions with and within the environment, and learning as some meta-algorithm which allows the agent to simulate the environment (in AIXI’s case by literally exploring the space of environment simulating programs).  AIXI is well loved because it takes numerous philosophical concepts or memes that were already well established in the cybernetics/systems view of intelligence and formalizes them:

  • thought is a form of highly efficient simulation
  • which, when run over learned knowledge acquired from sensors, allows environment prediction
  • and through this allows effective search through the landscape of futures
  • and thus guides goal-fulfilling actions
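The four ideas in the list above can be sketched as a toy model-based planner.  This is only an illustration and not AIXI itself: the one-dimensional world, the `simulate` model (here simply handed to the agent rather than learned), the cost function and all names are assumptions made for the example.

```python
from itertools import product

def simulate(state, action):
    """Hypothetical world model: the state is a position on a number line."""
    return state + action

def plan(state, goal, actions=(-1, 0, 1), horizon=3):
    """Search all action sequences of length `horizon` with the model and
    return the first action of the cheapest imagined future."""
    best_seq, best_cost = None, float("inf")
    for seq in product(actions, repeat=horizon):
        s, cost = state, 0
        for a in seq:
            s = simulate(s, a)       # thought as simulation
            cost += abs(goal - s)    # distance from the goal at each step
        if cost < best_cost:         # search through the landscape of futures
            best_seq, best_cost = seq, cost
    return best_seq[0]               # act on the best prediction

state = 0
for _ in range(5):
    state = simulate(state, plan(state, goal=4))
print(state)
```

Summing the distance over every imagined step, rather than only the last one, makes the agent prefer futures that reach the goal sooner instead of postponing the move.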

The dawning realization from the biological school is that real learning (the murkiest and most mysterious of the above concepts) is an emergent phenomenon of the actual patterns within the data environment itself.  In a nutshell, the biological approach says that learning, and neural organization in particular, can emerge spontaneously just from the interaction of relatively simple localized computational elements and the information streaming in from the environment.

Self-organization is the key takeaway principle from real biology, but its impact on AI to date has been rather minimal.  I think this will have to change for us to reverse engineer the brain.  Thinking about the brain in terms of algorithms is not even the right approach.  One needs to think about how the brain’s cortical maps automatically self-organize into efficient algorithm implementations just through the process of being exposed to data.  That is what learning is.  Real learning is always unsupervised and self-organizing.

A Good Start: the Visual Cortex

The primate visual cortex is a good starting point for understanding the brain.  This section is mainly a summary of the work of Poggio et al. at MIT on the feedforward visual stream.  If you are already familiar with this, you may want to skip down to “Emergent Theories of Learning”.

The cortex is largely self-similar, so if we can understand how one region works, that same model can then be applied to understanding the rest.  The visual cortex is a good place to start mainly because it’s the primary entry point for data coming into the system, so it allows the chain of information processing to be more easily mapped out and understood.  As a result we have a great deal of accumulated data, which has led up to some larger-scale algorithmic models that seem to be a good fit for how the visual cortex processes information: the models can predict phenomena from the neural level up to even the psychological level (with the algorithmic models performing similarly to humans or monkeys in well-controlled psychological visual tests).

MIT’s aptly named “Center for Biological and Computational Learning” has developed and tested this model; a good overview is “A quantitative theory of immediate visual recognition”.  What’s particularly interesting is that in these cases where we have a very accurate model (such as the quick feedforward ventral path), the model performs best in class compared to other known AI approaches.  In fact, according to the MIT model and data, their biologically inspired vision system is the benchmark for quick recognition.  And this is just one piece of the visual system; once you add in the rest of the components, such as attentive focus, saccades, retinal magnification, motion, texture and color processing, the dorsal stream, etc., you get a full system which is leaps and bounds beyond any current machine vision system.

Now this is all interesting, but what’s far more interesting is that little to none of this complex system appears to be specifically genetically coded – the cortical neurons somehow just self-organize automatically into configurations that perform the desired computation at each step.  So it’s not just a clever algorithmic solution, it’s the one clever trick to rule them all: a meta-algorithm which somehow magically produces clever algorithmic solutions.

From a biological perspective, this is actually to be expected, as biological programs are all about maximizing functional output while minimizing explicit information.  Our DNA codes for somewhere around only 10,000 to 100,000 proteins, and not much of that is brain specific.  The DNA codes first (both in terms of developmental history and evolutionary history – as ontogeny recapitulates phylogeny) for proteins that can self-organize into cells, then the minimal changes to get those cells to self-organize into organs, and then the minimal changes on top of all that to get those organs to self-organize into organisms, and so on.

Now, the really brief summary of the feedforward ventral path: This pathway is like a series of image filters that transform a raw 2D image into an abstracted statistical ‘image’.  The final output can be thought of as an ‘image’ of sorts where the activation of small regions (the pixels) corresponds to or represents the presence of actual objects in the scene.  It’s not exactly a 1 neuron = 1 pixel = 1 object map, but it’s effectively similar and can be imagined as a map where each pixel (or more accurately, each small local statistical pattern of activation) corresponds to identification of a particular object in the scene.

For example, in the final output layer, an individual neuron (pixel) may turn on only when there is a car in the image coming in to the retina.  This pathway is not concerned with the location of objects, quantity, etc.; it’s only concerned with rapid identification – answering the question: what am I seeing?  This information is of obvious importance to organisms.  So how does it work?  Surprisingly, it doesn’t appear to be all that complex:

Retina/LGN: High Pass / Low Pass Filterbanks:  The first stages of processing occur in the retina itself.  Each neuron has dendrites which connect to something like a small circular window of the input space.  The synapses at each connection have some variable multiplicative effect on signal transmission, and then the dendritic branches and cell body sum these responses.  This leads to the familiar simple integrate-and-fire neuron model where the neuron performs essentially a weighted sum (dot product) of its input data I and its set of synaptic weights W; taken across a whole layer of neurons, this amounts to a matrix multiplication.  This can just as easily be thought of as a customizable filter bank.
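As a minimal sketch of that neuron model, assuming a rate-based simplification (a weighted sum followed by a threshold, with no spiking dynamics); the function names and the example numbers are illustrative only:

```python
def neuron_response(inputs, weights, threshold=0.0):
    """Sum the synapse-weighted inputs and rectify above a threshold."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return max(0.0, total - threshold)

def layer_response(inputs, weight_rows, threshold=0.0):
    """One weight vector per neuron: the layer is a matrix-vector product."""
    return [neuron_response(inputs, row, threshold) for row in weight_rows]

I = [1.0, 0.0, 1.0]
W = [[0.5, 0.5, -0.25],    # neuron 0
     [-1.0, 0.0, 0.5]]     # neuron 1
print(layer_response(I, W))
```

Each row of `W` plays the role of one filter in the filter bank; the second neuron stays silent here because its weighted sum is negative.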

In the retina, the synapses arrange to perform simple low or high pass filters.  The typical pattern is positive weights in a circular region in the middle surrounded by a larger region of negative weights.  This looks like a large black circle with a smaller white circle embedded in it.  The other typical pattern is just the reverse.  These patterns come in various sizes, from tight small white circles to larger diffuse ones.  What does this do to the image?  These are basically a range of high- to low-pass filters which essentially break the image up into a set of multi-resolution bands very similar to the first stages of multiresolution image compression, i.e. wavelet analysis.  This is not entirely surprising, as the optic nerve has a much lower bandwidth than the retina’s input – image compression makes sense.  The output would look very much like taking an image and band pass filtering it in Photoshop.  The output you get is largely the edges at different scales – a sparse encoding of the input and a simple yet effective form of compression.
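A small sketch of one such center-surround filter, assuming a 3x3 kernel with a positive center and a negative surround that sums to zero (real receptive fields are smoother and come in many sizes; the kernel values and the test image are made up for illustration):

```python
def center_surround_kernel():
    """On-center kernel: positive middle, negative ring, weights sum to zero."""
    return [[-1, -1, -1],
            [-1,  8, -1],
            [-1, -1, -1]]

def filter2d(image, kernel):
    """Slide the kernel over the image (valid region only) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(len(image) - kh + 1):
        row = []
        for x in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[j][i] * image[y + j][x + i]
                           for j in range(kh) for i in range(kw)))
        out.append(row)
    return out

# A 5x6 image that is dark on the left and bright on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
response = filter2d(image, center_surround_kernel())
for row in response:
    print(row)   # each row is [0, -3, 3, 0]
```

Uniform regions produce zero response and only the dark-to-bright boundary survives, which is the sparse, edge-dominated output described above.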

V1:  The V1 region is the largest single cortical region in primate brains, and it performs another simple image filtering step.  The input image coming in to V1 is more or less the edges at various scales, so quite naturally V1 identifies edges.  The cortex has a laminar (sheet-like) structure at the large scale.  If you zoom in closer you’ll see that it has a layered organization, sort of like a layer cake, with five to six layers depending on how you count them (they are not all that clearly delineated – remember, the brain is stochastic).  Neurons in a particular localized region seem to redundantly code the same thing – this small level of scale is called the micro-column.  Individual output neurons in a micro-column have nearly identical receptive fields and appear to code equivalent responses (things they respond to, incoming and outgoing connections, etc).  It appears you can thus functionally reduce down to the micro-column level as the fundamental unit of computation in the cortex.  Micro-columns are then loosely arranged into macro-columns.  Neighboring micro-columns in the larger macro-column have very similar receptive fields but can have quite different responses.  These micro-column ‘patches’ in V1 have synaptic weights that correspond to oriented edge filters of several different scales and orientations.  The orientations and scales are rather quantized – with something like 4-6 orientations and a similar or smaller number of scales.  The output of V1 then is best visualized as a set of NxM smaller subimages, one for each combination of orientation and scale.  A lit pixel in a subimage (coded as an active micro-column) represents the presence (or likelihood) of a line of a particular direction and size in some small neighborhood of the original image.  Each V1 region (one in each hemisphere) has perhaps a million sub-columns, so it’s quite reasonably sized.
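To make the ‘set of subimages’ picture concrete, here is a rough sketch with only two orientations at a single scale.  The 3x3 kernels are simple hand-written edge detectors chosen for the example, not the filters measured in V1, and all names are hypothetical.

```python
def filter2d(image, kernel):
    """Slide the kernel over the image (valid region only) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[j][i] * image[y + j][x + i]
                 for j in range(kh) for i in range(kw))
             for x in range(len(image[0]) - kw + 1)]
            for y in range(len(image) - kh + 1)]

# One kernel per preferred orientation (assumed, illustrative values).
ORIENTED_KERNELS = {
    "vertical":   [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "horizontal": [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
}

def v1_like_maps(image):
    """Return one subimage per orientation; magnitude marks edge presence."""
    return {name: [[abs(v) for v in row] for row in filter2d(image, kernel)]
            for name, kernel in ORIENTED_KERNELS.items()}

image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]   # a single vertical edge
maps = v1_like_maps(image)
print(maps["vertical"][0])     # [0, 3, 3, 0]
print(maps["horizontal"][0])   # [0, 0, 0, 0]
```

Only the subimage tuned to vertical edges lights up, and only near the boundary; adding more kernels (orientations and sizes) would extend the dictionary towards the NxM set described above.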

V2: The input from V1 goes to V2, which performs another simple filtering step.  It performs something very similar to just taking a set of max filters across the output of V1, effectively a max filter on each of the NxM subimages.  Each micro-column in V2 has an orientation and scale preference just like V1, and activates when any edge of its preferred orientation and scale comes in.  The response doesn’t change much when there are multiple matching edges in its filter window.  It’s not exactly a max operation, but it’s close – Poggio et al model it as a softmax operation.  The output of V2 then is a smaller condensed set of NxM subimages where each pixel represents the presence of an edge of a particular orientation and scale in a wide sub-window of the image.
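A sketch of the pooling step on a single row of responses.  The hard max is straightforward; for the soft version, the exponentially weighted average below is one common smooth approximation to max and is an assumption of this example, not necessarily the exact form used in the Poggio et al model.

```python
import math

def max_pool(values, window):
    """Hard max over non-overlapping windows of a row of responses."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]

def soft_max(values, beta=10.0):
    """Smooth max: average weighted by exp(beta * value).
    Assumes responses are scaled to roughly [0, 1]."""
    weights = [math.exp(beta * v) for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

print(max_pool([0, 3, 3, 0, 1, 0], window=3))   # [3, 1]

one_edge = soft_max([0.0, 1.0, 0.0])
two_edges = soft_max([0.0, 1.0, 1.0])
print(abs(one_edge - two_edges) < 0.01)         # True
```

The last line shows the property mentioned above: the pooled response barely changes when a second matching edge appears in the window.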

V4/PIT/AIT:  At the next and higher stages in V4 and up, the neural responses become somewhat more specific and begin coding for common patterns of edges: basic shapes.  According to the theory of Poggio et al, the cortical units can be roughly classified into two types: simple and complex.  The simple cells perform the typical synaptic-weighted summation and adjust their synaptic weights over time to match frequently occurring input patterns.  The complex cells perform the max-like operation on a local spatial window of similarly tuned simple cell inputs as described for V2 earlier.  The simple and complex cell types alternate in layers.  After two or three such iterations you will have units which code for particular common patterns of edges appearing anywhere in the image.  The layered hierarchy is not strict, and some connections bypass layers.  By the time you get to the top of this hierarchy there is enough information for cells in higher decision regions (such as the prefrontal cortex) to make reasonable quick identifications of objects.  There is enough information for cells to code for location-dependent arrangements of edges, but this is balanced by the need for invariance to rotations.  For example, it’s easy to identify a car shape from numerous angles at a glance, but it’s much more difficult for us to recognize text characters or faces that are flipped 180 degrees – simply because we rarely encounter those patterns at such unusual orientations.
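One simple/complex pair can be sketched on a one-dimensional signal.  This is a toy illustration of the alternation, with a made-up template and signals; it is not the published model.

```python
def simple_layer(signal, template):
    """Simple-cell-like units: weighted sum against a template at each position."""
    n = len(template)
    return [sum(t * s for t, s in zip(template, signal[i:i + n]))
            for i in range(len(signal) - n + 1)]

def complex_layer(responses):
    """Complex-cell-like unit: max over positions, so shifts do not matter."""
    return max(responses)

template = [1, -1, 1]                  # a tiny 'shape': on, off, on
left  = [0, 1, 0, 1, 0, 0, 0]          # the shape near the left
right = [0, 0, 0, 1, 0, 1, 0]          # the same shape shifted right

print(simple_layer(left, template))    # [-1, 2, -1, 1, 0]
print(simple_layer(right, template))   # [0, 1, -1, 2, -1]
print(complex_layer(simple_layer(left, template)),
      complex_layer(simple_layer(right, template)))   # 2 2
```

The simple layer's response moves with the pattern, while the complex unit reports the same value for both positions; stacking such pairs is what yields units that respond to a pattern of edges anywhere in the image.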

Emergent Theories of Learning

How can this system of edge-filters and shape pattern dictionaries develop automatically?

It appears that it self-organizes based on some simple local rules, very much like a cellular automaton.  This was recognized more than a decade ago.  The short paper that really put it together for me is called “A Self-Organizing Neural Network Model of the Primary Visual Cortex”.  The key idea is rather simple.  Take a prototypical 2D laminar neural network like the simple cortical model discussed above.  A 2D input pattern flows into the neural array from the bottom, and each neuron forms a bunch of connections across the input grid forming something like a circular pattern centered around the neuron (with synaptic weights falling off in a Gaussian-like pattern).

Mathematically, the neuron performs something like a dot product of a local patch of the input with its synaptic weights (a matrix multiplication when taken across the whole layer).  If you apply an appropriate simple Hebbian learning rule to a random initial configuration of this system (synaptic weights increase in proportion to a presynaptic-postsynaptic coincidence), then these neurons will evolve to represent frequently occurring input patterns.
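A minimal sketch of such a rule for a single neuron.  Plain Hebbian updates grow without bound, so this example swaps in Oja’s rule, a standard normalized variant; that choice, the learning rate and the input pattern are assumptions of the sketch and not necessarily what the paper uses.

```python
import math
import random

def oja_update(weights, inputs, lr=0.1):
    """Hebbian term (input * output) plus a decay that keeps weights bounded."""
    y = sum(w * x for w, x in zip(weights, inputs))          # postsynaptic output
    return [w + lr * y * (x - y * w) for w, x in zip(weights, inputs)]

rng = random.Random(0)
weights = [rng.uniform(0.01, 0.1) for _ in range(4)]         # random initial state
pattern = [1.0, 0.0, 1.0, 0.0]                               # a frequent input

for _ in range(1000):
    weights = oja_update(weights, pattern)

norm = math.sqrt(sum(x * x for x in pattern))
target = [x / norm for x in pattern]
print([round(w, 3) for w in weights])   # [0.707, 0.0, 0.707, 0.0]
print([round(t, 3) for t in target])    # [0.707, 0.0, 0.707, 0.0]
```

Starting from small random weights, the neuron ends up tuned to the direction of the input it sees most often, which is the sense in which it comes to ‘represent’ that pattern.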

But now it gets more interesting: if you add an additional set of positive and negative lateral connections between neurons within a layer, then you can get more complex cellular automaton-like behavior.  More specifically, if the random lateral connections are picked from a distribution such that short-range connections are more likely to be positive and long-range connections are more likely to be negative, the neurons will tend to evolve into small column-like pockets where neurons are mutually supportive within columns but are antagonistic between columns.  This representation also performs a nice segmentation of the hypothesis space.  The model developed in the paper – the RF-LISSOM model – and later follow-ups provide a very convincing account of how V1’s features can be fully explained by the evolution of basic neurons with simple local Hebbian learning rules and a couple of homeostatic self-regulating principles.
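The effect of short-range excitation and longer-range inhibition can be sketched on a one-dimensional row of neurons.  This toy version uses fixed lateral weights and only shows the settling of activity; it leaves out the Hebbian adaptation of the lateral connections, and the radii and strengths are made-up values.

```python
def lateral_weight(distance, excite_radius=1, inhibit_radius=4,
                   excite=0.3, inhibit=-0.1):
    """Positive for near neighbors, negative for farther ones, zero beyond."""
    if distance == 0:
        return 0.0
    if distance <= excite_radius:
        return excite
    if distance <= inhibit_radius:
        return inhibit
    return 0.0

def clip(value):
    return min(1.0, max(0.0, value))

def settle(afferent, steps=10):
    """Repeatedly combine bottom-up input with lateral input from neighbors."""
    act = [clip(a) for a in afferent]
    n = len(act)
    for _ in range(steps):
        act = [clip(afferent[i] + sum(lateral_weight(abs(i - j)) * act[j]
                                      for j in range(n)))
               for i in range(n)]
    return act

# A cluster of active neurons (indices 2-4) and one weaker, isolated neuron (6).
afferent = [0.0, 0.0, 0.6, 0.7, 0.6, 0.0, 0.3, 0.0, 0.0]
settled = settle(afferent)
print([round(a, 2) for a in settled])
```

The clustered neurons reinforce one another and end up more active than their inputs alone would make them, while the isolated neuron is pushed below its input level by inhibition from the cluster: a crude version of the mutually supportive pockets described above.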

Can such a simple emergent model explain the rest of the ventral visual pathway?

It seems likely.  If you took the output of V1 and fed it to another layer built of the same adapting neurons, you’d probably get something like V2.  It wouldn’t be the exact softmax operation described by Poggio et al, but that is something of an idealization anyway.  The V2 layer would organize into micro-columns which would tune to frequent output patterns of V1.  The presence of an edge of a particular orientation is a good predictor of an edge of the same orientation activating somewhere nearby – both because the edge may be long and because as the image moves across the visual stream edges will move to nearby neuron populations.  It thus seems likely that V2 neurons would self-organize into microcolumns tuned to edges of a particular orientation anywhere in their field – similar to the softmax operation description.  As you go higher up the hierarchy, the tuning would be more complex, and you would have micro-columns adapting to represent more complex common edge collections.

Feedback

The self-organizing model discussed so far is missing one important type of connection pattern found in the real cortex: feedback connections, which flow from higher regions back down towards the lower regions close to the input.  These feedback connections tend to follow the feedforward connections bringing processed visual input up the hierarchy, but they flow in the opposite direction.  These feedback connections seem pretty natural if we think of a pathway such as the visual system as a connected 3D region instead of a collection of 2D patches.  If you took the various 2D patches of V1, V2, etc. and stacked them on top of each other, you’d get some sort of tapered blob shape – kind of like a truncated pyramid.  It would be wide at the base (V1 – the largest region) and would then taper as the layers get smaller going up the hierarchy.  If you arranged the visual stream into such a 3D volume, the connections could just be described by some simple 3D distribution.  Visual input comes in from the bottom and flows up the hierarchy, but information can also flow laterally within a layer and back down from higher to lower layers.

What is the role of the downward flowing feedback connections?

They help reinforce stable hypotheses in the system.  An initial flow of information up the hierarchy may lead to numerous competing theories about the scene.  Feedback connections tracing the same paths as the inputs will tend to bias for the supportive components.  For example, if the higher regions are expecting to see a building, this would then flow down the feedback connections to bias neurons representing appropriate collections of right angles, corners, horizontal and vertical edges, and numerous other unnameable statistical observations that lead to the building conclusion.  If these supporting beliefs are strong enough vs their competition, the ‘building’ pathway will form a stable self-reinforcing loop.  This is very similar to Bayesian Belief Propagation – of course without necessarily simulating it exactly (which could be burdensome).

It’s also interesting to note that the feedback connections will perform something similar to backpropagation.  When a neuron fires, the Hebbian learning rule will up-regulate any recently active synapses that contributed.  With the feedback connections, this neuron will send back a signal down to the lower layer input neurons.  As the system evolves into mutually supportive pathways, the feedback signal is likely to closely associate with the input neurons that activated the higher level synapses.  The feedback signal will thus trace back the input and reinforce the contributing connections.
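For readers who want to see the Hebbian rule itself, here is a minimal Python sketch of a single neuron.  The learning rate, the input pattern, and the normalization step (standing in for a homeostatic constraint) are my own illustrative assumptions, not a reproduction of any particular published model.

```python
def hebbian_update(weights, pre, post, rate=0.1):
    """Strengthen each synapse in proportion to pre * post activity,
    then rescale so the weights still sum to 1.

    The rescaling is a simple stand-in for homeostatic self-regulation;
    without it the weights would grow without bound.
    """
    updated = [w + rate * x * post for w, x in zip(weights, pre)]
    total = sum(updated)
    return [w / total for w in updated]

# Four input synapses; only the first two inputs are ever active.
weights = [0.25, 0.25, 0.25, 0.25]
pre = [1.0, 1.0, 0.0, 0.0]

for _ in range(20):
    post = sum(w * x for w, x in zip(weights, pre))  # neuron's response
    weights = hebbian_update(weights, pre, post)

print([round(w, 3) for w in weights])
```

After a few iterations the synapses from the active inputs dominate and the silent ones fade, which is the "up-regulate what contributed" behavior described above.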

From cortical maps to a full intelligence engine

Reading this far, and if you’ve read my other short bits about the brain or much better yet the literature they derive from, you have a pretty good idea of how self-organizing hierarchical cortical maps work in theory and understand their great power.  But there’s still a long way to go from there to a full scale intelligence engine such as a brain.  In theory, one of these hierarchical inference networks can also, operating in reverse flow, translate high level abstract commands into detailed motor control sequences, very much like the hierarchical sensor input stream but in reverse.  Hawkins gives some believable accounts of how such mechanisms could work.

What’s missing then?  A good deal.  There is much more to the brain than just a hierarchical probabilistic knowledge engine – although that certainly is a core component.  One familiar with computer architecture would next ask, “what performs data routing?”.  This is a crucial question, because it’s pretty clear you can’t do much useful computation with a fixed topology – to run any interesting algorithms you need some way for different brain regions to communicate with other brain regions dynamically.

That functionality appears to be provided by the thalamus, one of the oldest brain regions still part of the core networks.  It’s also perhaps the most important.  Damage to the thalamus generally results in death or coma, which is to be expected if it is a major routing hub (vaguely equivalent to a CPU).  For example, when you focus your attention on a speaker’s words, the first stages of processing probably flow through a fixed topology of layered computation, but once those are translated into the level of abstract thoughts, they need to be routed more widely to many general cortical layers that deal with abstract thinking – and this cannot use a fixed topology.

At this apex level of the hierarchy, it doesn’t much matter whether the words originated as audio signals, visual patterns, or even from internal monologue, they need to eventually reach the same abstract processing regions for semantic parsing, memory recall and the general mechanisms of cognition.  This requires at least some basic one to many and many to one dynamic routing.  Selective attention requires similar routing.

The visual system performs selective attention and dynamic routing mechanically by actually moving the eye and thus the fovea, but consider that you need that same mechanism in many domains where the mechanical trick doesn’t apply.  For instance, your body’s somatosensory network (touch and proprioception) also uses selective attention (focusing a large set of general processing resources on a narrow input domain) and this suggests a neural mechanism of dynamic routing.

Internal Monologue and the Core Routing Network

Venturing out of the realm of current literature and into my own theoretical space, I have the beginnings of a meta-theory concerning the brain’s general higher level organization which centers around a serial core routing network.  We tend to think of the brain as massively parallel, which is true at the level of the cortical hierarchy described earlier.  But the fact is that at the highest level of organization, at the apex of the cortical pyramid you have a network involving largely the hippocampus, cortex, and the thalamus which is functionally serial.  We have a serial stream of consciousness which makes some sense for coordinating actions, language through a serial audible stream, and so on.  Our inner monologue is essentially serial at the conscious level.

Note that having a serial top level network is not in any sense preordained.  We could have evolved vocal cords which encoded two or more independent audio streams and had a community of voices echoing in our heads.  Indeed, the range of human mind space already encompasses such variants on the fringe.

In my current simple model, the (typically) serial inner core routing network would mostly function as a simple broadcast network which connects the highest layers of the cortex, hippocampus, and thalamus.  This core network maps to both the task-positive and task-negative networks in the neuroscience literature.

What types of messages are broadcast on the core routing network?  Thoughts, naturally.

The neuro-typical experience of a serial inner monologue is the reverberations of symbolic thoughts activating the speech and auditory pathways.  For most of us, we first learn to understand and then speak words through the audio interface, and then learn to read well after.  As you are reading these words, you are probably hearing a voice in your head.  Your projection of my voice to be exact.  In a literal sense, I am programming your mind right now.  But don’t be alarmed, this happens whenever you read and understand anything.

Perhaps if one learned words first through the visual senses and then later learned to understand speech, one would ‘see’ words in the mind’s eye.  I’m not aware of any such examples; this is just a thought experiment.

It’s difficult to imagine pre-linguistic thoughts, raw thoughts that are not connected to words.  It’s difficult to project down into that more constrained, primitive realm of mindspace.  Certainly some of our thought streams are directly experiential (such as recalling a visual and tactile memory of walking barefoot on a sunny tropical beach), but it’s difficult to imagine a long period of thinking constrained to this domain alone.

The core routing network allows us to take words and translate them into patterns of mental activation which simulate the state of mind which originally generated the words themselves.  This sounds interesting; it’s probably worth reading again.

Imagine the following in a little more detail:

You are walking on a deserted jungle beach somewhere in Costa Rica.  The sun is blazing but a slight breeze keeps the air pleasant.  Your feet sink gently into the wet sand as small waves lap at your ankles.  A lone mosquito nibbles on your shoulder and you quickly brush it off.

Those are just words, but in reading them you recreate that scene in your mind as the words activate specific high level cortical patterns which cascade down into the lower levels of the sensory and motor pyramids using the feedback path discussed earlier.  The pattern associations were learnt long ago and have been reinforced through numerous rapid replays coordinated by the hippocampus during your sleep.  If you were to actually look at your thought patterns as visualized with a high resolution scanner, you would see a trace very similar to the trace of your brain actually experiencing the described scene.  It’s different of course, not quite as detailed, and the task-negative network does not activate motor outputs, but at the neural level thinking about performing an action is just a tad shy of performing said action.

This is the power of words.

So for a brain architecture, the high level recipe looks something like this: take a hierarchical feedforward and feedback (dual directional) multi-sensory and motor cortex, combine in a hippo-cortical-thalamic core routing network, add in an offline selective memory optimization process (sleep), and finally some form of widely parallel goal directed search operating in compressed cortical symbolic space, and you have something interesting.  This of course is an over-simplification of the brain, which has many more major circuits and pathways, but nonetheless we don’t need all of the specific complexity of the brain.  What’s more important are the general mechanisms underlying emergent complexity – such as learning.

Of course, the devil is in the details, but it looks like the main components of a brain architecture are within reasonable reach this decade.  I see the outline of a next step where you take the components discussed above and integrate them into an AIXI-like search optimizer – but crucially searching within the extremely compressed abstract symbolic space at the apex of the cortical pyramid.

Simulating and searching in such extraordinarily compressed spaces is the key to computational efficiency in the supremely complex realities the brain operates in, and AIXI can never scale by using actual full blown computer programs as the basis for simulation.  The key lesson of the cortex is that intelligence relies on compressing and abstracting away nearly everything.  Efficiency comes from destroying most of the information.


Building the Brain

Posted by jcannell on June 28, 2010



A question of hardware capability?

When can we expect the Singularity? What kind of hardware would be required for an artificial cortex? How far out into the future of Moore’s Law is such technology?


The startling answer is that the artificial cortex, and thus the transition to a profoundly new historical era, is potentially much closer than most people realize. The problem is mainly one of asking the right questions. What is the computational power of the human brain? This is not quite the right question. With a few simple tools a human can perform generic computation – indeed computers were human long before they were digital (see the history of the word: computer).  The computational speed of the human brain aided with simple tools is very low, less than one operation per second.  Most studies of reverse engineering the human brain are then really asking a different question: how much digital computation would it require to simulate the human brain?  Estimates vary, but they are usually of order 10^15 (quadrillions of operations per second) or less for functional equivalence, up to around 10^18 for direct simulation, plus or minus a few orders of magnitude.  The problem with this approach is that it’s similar to asking how much digital computation it would require to simulate a typical desktop processor by physically simulating each transistor.  The answer is surprising.  A typical circa 2010 desktop processor has on the order of a billion transistors, which switch on the order of a few billion times per second.  So simulating a current desktop processor using the same approach that we used to estimate brain capacity gives us a lower bound of a billion billion or 10^18 operations per second, realistically closer to 10^20 operations per second required to physically simulate a current desktop processor in real-time – beyond the upper ranges for typical estimates of simulating the human brain in real time.  This is surprising given the conventional wisdom that the human brain is so much more complex than our current computers, so it’s worth restating:

If we define computational time complexity as the number of operations per second required to simulate a physical system on a generic computer, then current desktop processors circa 2010 have already exceeded the complexity of the human brain.
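The arithmetic behind that claim is short enough to spell out.  The Python sketch below uses the round figures quoted above; the factor of 100 operations per switching event is my own assumption, chosen only because it reproduces the "realistically closer to 10^20" figure.

```python
# Round figures from the text (circa 2010 desktop processor).
TRANSISTORS = 1e9            # "on the order of a billion transistors"
SWITCH_RATE_HZ = 1e9         # low end of "a few billion times per second"

# Assumption (not from the text): simulating one switching event
# physically costs on the order of 100 operations.
OPS_PER_SWITCH = 100

cpu_lower_bound = TRANSISTORS * SWITCH_RATE_HZ       # ops/sec, one op per switch
cpu_realistic = cpu_lower_bound * OPS_PER_SWITCH     # ops/sec, with overhead

# Brain simulation estimates quoted in the text.
BRAIN_FUNCTIONAL = 1e15      # functional equivalence
BRAIN_DIRECT = 1e18          # direct simulation

print(f"CPU, lower bound:  {cpu_lower_bound:.0e} ops/sec")
print(f"CPU, realistic:    {cpu_realistic:.0e} ops/sec")
print(f"Brain, functional: {BRAIN_FUNCTIONAL:.0e} ops/sec")
print(f"Brain, direct:     {BRAIN_DIRECT:.0e} ops/sec")
print(f"CPU realistic / brain direct: {cpu_realistic / BRAIN_DIRECT:.0f}x")
```

Under these assumptions, even the lower bound for the processor matches the upper estimate for direct brain simulation, which is the point of the restatement.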

This space-time complexity analysis can be more accurately broken into two components: space complexity and speed.  Space complexity is simply the information storage capacity of the system, measured in bits or bytes.  Brains get their massive information capacity from their synapses, which can be conservatively estimated as the equivalent to a byte of digital storage each, thus giving an upper bound of around 10^15 bytes for directly storing all the brain’s synapses – a petabyte of data storage, down to around a hundred terabytes depending on the particular neuroscience estimate we use.  Personal computers now have hard drives with terabytes of storage, and supercomputers of 2010 are just now hitting a petabyte of memory capacity, which means they have the base storage capacity required to comfortably simulate the brain completely in RAM.  Clearly brains have a big advantage in the space complexity department: their storage density is several orders of magnitude greater than our 2010 electronics (although this will change in about another 10-15 years of Moore’s Law).  However, along the speed dimension the advantage completely flips: current silicon electronics are about a million times faster than organic circuits.  So your desktop processor may only have the intrinsic spatial complexity of a cockroach, but signals flow through its circuits about six orders of magnitude faster – like a hyper accelerated cockroach.  Using one computational system to simulate another always implies a massive trade-off in speed.  The simplest modern processor cores (much simpler than the Intel CPU you are using) use hundreds of thousands to millions of transistors, and thus even if we could simulate a synapse with just a single instruction per clock cycle, we may only just barely manage to simulate a cockroach brain in real-time.
And note that the desktop processor would never be able to naively simulate something as complex as a human brain without vastly increasing its memory or storage capacity up to that of a supercomputer.  And even then, detailed brain simulations running on supercomputers have to date achieved only a small fraction of real-time performance: much less than 10%.  It takes a human brain years to acquire language, so slow simulations are completely out of the question: we can’t simulate for 20 years just to see if our brain model develops to the intelligence level of a two year old!  Clearly, the speed issue is critical, and detailed simulation on a generic computer is not the right approach.
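Both the storage bound and the "20 years for a two year old" remark fall out of simple arithmetic.  Here is a small Python sketch using the figures quoted above; the helper name `wall_clock_years` is just an illustrative label, and the 10% figure is the generous upper end of "much less than 10%".

```python
BYTES_PER_SYNAPSE = 1                  # the post's "synapse = byte" convention
SYNAPSES_LOW, SYNAPSES_HIGH = 1e14, 1e15

TERABYTE = 1e12
storage_low_tb = SYNAPSES_LOW * BYTES_PER_SYNAPSE / TERABYTE
storage_high_tb = SYNAPSES_HIGH * BYTES_PER_SYNAPSE / TERABYTE
print(f"Naive synapse storage: {storage_low_tb:.0f} TB to {storage_high_tb:.0f} TB")

def wall_clock_years(simulated_years, realtime_fraction):
    """Real time needed to simulate `simulated_years` of brain development
    when the simulation runs at `realtime_fraction` of real-time speed."""
    return simulated_years / realtime_fraction

# A simulation running at 10% of real time needs two decades
# to reach the developmental age of a two year old.
print(f"Years to simulate a two year old at 10%: {wall_clock_years(2, 0.10):.0f}")
```

Since actual detailed simulations run well below 10% of real time, the real wait would be even longer than this estimate.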

Capacity vs Speed

The memory capacity of a cortex is one principal quantitative measure underlying intelligence – a larger cortex with more synaptic connections can store and hold more memory patterns, and perform more total associative computations every cycle in direct proportion.  Certainly after we can match the human brain’s capacity, we will experiment with larger brains, but they will always have a proportionally higher cost in construction and power.  Past some point of scaling a brain 2 or 4 or X times larger and more expensive is probably not an improvement over an equivalent number of separate brains (and the distinction further blurs if the separate brains are networked together through something like language).  On this note, there are some reasons to believe that the human brain is already near a point of diminishing returns in the size department.  Whales and elephants, both large advanced mammals with plenty of room for much more massive capacities, sport brains built with a similar order of neurons as humans.  In numerous long separated branches of the mammalian line, brains grew to surface areas all within a narrow logarithmic factor: around 2,500 cm^2 in humans, 3,700 cm^2 in bottlenose dolphins, and around 6,000-8,000 cm^2 in elephant and whale lineages.  They all compare similarly in terms of neuron and synapse counts even though the body sizes, and thus the marginal resource cost of a percentage increase in brain size, vary vastly: a whale or elephant brain is small compared to its body size, and consumes a small portion of its total resources.  The human brain definitely evolved rapidly from the hominid line, and is remarkably large given our body size, but our design’s uniqueness is really a matter of packing a full-sized large mammal brain into a small, crammed space.
The wiring problem poses a dimensional scaling constraint on brain size: total computation power scales with volume, but non-local communication scales with surface area, limiting a larger brain’s ability to effectively coordinate itself.  Similar dimensional scaling constraints govern body sizes, making insects insanely strong relative to their size and limiting the maximum plausible dimension of land animals to something dinosaur sized before they begin to fall apart.  A larger brain developed in humans hand in hand with language and early technology, and is probably optimized to the human lifespan: providing enough pattern-recognition prowess and capacity to learn complex concepts continuously for decades before running into capacity limits.  The other large-brained mammals have similar natural lifespans.  Approaching the capacity limit we can expect aging brains to become increasingly saturated, losing flexibility and the ability to learn new information, or retaining flexibility at the expense of forgetfulness and memory loss.  It’s thus reasonable to conclude that the storage capacity of a human brain would be the minimum, the starting point, but increasing capacity further probably yields only a sublinear increase in effective intelligence.  It’s probably more useful only in combination with speed, as a much faster thinking being would be able to soak up knowledge proportionally faster.


The Great Shortcut: Fast Algorithmic Equivalence

We can do much, much better than simulating a brain synapse by synapse.  As the brain’s circuits are mapped, we can figure out what fundamental computations they are performing by recording masses of neuron data, simulating the circuits, and then fitting this data to matching functions.  For much of the brain, this has already been done.  The principal circuits of the cortex have been mapped fairly well, and although there are still several competing implementation ideas at the circuit level, we have a pretty good idea of what these circuits can do at the more abstract level.  More importantly, simulations built on these concepts can accurately recreate visual data processing in the associated circuits that is both close to biologically measured results and effective for the circuit’s task – which in this case is immediate fast object recognition (for more details, see papers such as: “Robust Object Recognition with Cortex-Like Mechanisms”.)  As the cortex – the great outer bulk of the brain – reuses this same circuit element throughout its surface, we can now see possible routes for performing equivalent computations but with dramatically faster algorithms.  Why should this be possible in principle?  Several reasons:

  1. Serial vs Parallel: For the brain and its extremely slow circuits, time is critical and circuits are cheap – it has so many billions of neurons (and hundreds of trillions of synapses) that it will prefer solutions that waste neuronal circuitry if they reduce the critical path length and thus are faster.  From the brain’s perspective, a circuit that takes 4 steps and uses a million synapses is much better than one which takes 30 steps and uses a thousand synapses.  Running on a digital computer that is a million times faster and a million times less parallel, we can choose more appropriate and complex (but equivalent) algorithms.
  2. Redundancy: Not all – if any – synapses store unique data.  For example, the hundred million or so neurons in the V1 layer of each visual cortex all compute simple Gabor-like edge filters from a library of a few dozen possible orientations and scales.  The synapse weights for this layer could be reused and would take up memory in the kilobytes – a difference of at least 6 orders of magnitude vs the naive full simulation (where synapse = byte).  This level of redundancy is probably on the far end of the scale, but redundancy is definitely a common cortical theme.
  3. Time Slicing:  Only a fraction of the brain’s neurons are active at any one point in time (if this fraction escalates too high the result is a seizure), and if we ignore random background firing, this fraction is quite low – in the range of 1% or possibly even as low as 0.1%.  This is of course a net average and depends on the circuit – some are more active than others – but if you think of the vast accumulated knowledge in a human mind and what small fraction of it is available or relevant at any one point, it’s clear that only a fraction of the total cortical circuitry (and brain) is important during any one simulation step.
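The redundancy point is easy to check with a toy example.  The Python sketch below builds a small shared library of Gabor kernels and counts its size at one byte per weight.  The kernel size (11x11), the 8 orientations, and the 4 wavelengths are illustrative assumptions picked to land in the "few dozen" range; they are not measured V1 parameters.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Return a size x size Gabor filter: a cosine grating at angle
    `theta` under a Gaussian envelope of width `sigma`."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the grating runs along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

SIZE = 11                           # assumed kernel width in pixels
ORIENTATIONS = 8                    # assumed number of orientations
WAVELENGTHS = [4.0, 6.0, 8.0, 12.0] # assumed scales

library = [
    gabor_kernel(SIZE, wl, k * math.pi / ORIENTATIONS, sigma=wl / 2)
    for wl in WAVELENGTHS
    for k in range(ORIENTATIONS)
]

# One byte per weight, following the post's synapse = byte convention.
shared_bytes = len(library) * SIZE * SIZE
print(f"{len(library)} shared kernels, about {shared_bytes / 1000:.1f} KB")
```

Every simulated V1 neuron can point at one of these 32 kernels instead of carrying its own copy of the weights, which is where the kilobyte-scale figure comes from.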

The Cortexture: The end result of these observations is that a smart algorithmic equivalent cortical simulation could be at least three orders of magnitude faster than a direct simulation which naively evaluates every synapse every timestep.  The architecture I envision would organize cortical sheets into a spatial database that helps track data flow dependencies, storing most of the unique synaptic data (probably compressed) on a RAID disk array (possibly flash) which would feed one or more GPUs.  With a few terabytes of disk and some compression, you could store at least a primate level brain, if not a human-equivalent cortex.  A couple of GPUs with a couple gigabytes of RAM each would store the active circuits (less than 1% of total synapses), which would be constantly changing and being streamed out as needed.  Fast flash RAID systems can get over a gigabyte per second of bandwidth, so you could swap out the active cortical elements every second or so.  I believe this is roughly fast enough to match human task or train of thought switching time.  The actual cortical circuit evaluation would be handled by a small library of special optimized equivalent GPU programs.  One would simulate the canonical circuit – and I believe I have an algorithm that is at least 10 times faster than naive evaluation for what the canonical circuit can do, but other algorithms could be even faster for some regions where the functionality is known and specialized.  For example, the V1 layers which perform Gabor-like filters use a very naive technique in the brain and the equivalent result could be computed perhaps 100x faster with a very smart algorithm.  I’m currently exploring these techniques in more detail.
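As a rough illustration of the streaming idea, here is a toy Python sketch of an active-set cache that keeps only a few cortical patches resident and loads the rest on demand.  It uses a plain least-recently-used eviction policy, which is my own stand-in and not necessarily what a real Cortexture would use.  The class name, the patch identifiers, and the `fake_disk_read` loader are all hypothetical.

```python
from collections import OrderedDict

class ActivePatchCache:
    """Keeps at most `capacity` cortical patches resident (think GPU RAM)
    and fetches the rest on demand (think flash RAID)."""

    def __init__(self, capacity, load_patch):
        self.capacity = capacity
        self.load_patch = load_patch      # hypothetical disk-read callback
        self.resident = OrderedDict()     # patch_id -> weights, oldest first
        self.loads = 0                    # number of disk reads performed

    def get(self, patch_id):
        if patch_id in self.resident:
            # Cache hit: mark this patch as most recently used.
            self.resident.move_to_end(patch_id)
            return self.resident[patch_id]
        if len(self.resident) >= self.capacity:
            # Evict the least recently used patch to make room.
            self.resident.popitem(last=False)
        self.resident[patch_id] = self.load_patch(patch_id)
        self.loads += 1
        return self.resident[patch_id]

def fake_disk_read(patch_id):
    # Stand-in for reading compressed synaptic data from disk.
    return f"weights-for-{patch_id}"

cache = ActivePatchCache(capacity=2, load_patch=fake_disk_read)
for patch in ["V1-a", "V2-a", "V1-a", "IT-a"]:
    cache.get(patch)

print(list(cache.resident), "disk reads:", cache.loads)
```

The spatial database described above would do something smarter, prefetching patches along known data flow dependencies rather than waiting for a miss, but the resident-set bookkeeping would look broadly similar.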

End Conclusion: If the brain were fully mapped (and that is the main task at hand – many mechanisms such as learning are still being teased out) and a sufficient group of smart engineers started working on optimizing its algorithms, we could probably implement a real-time artificial cortex in less than five years using today’s hardware on a machine costing somewhere between $10,000 and $1,000,000.  (I know that is a wide error range, but I believe it is thus accurate.)  This cost is of course falling exponentially year by year.

Neuromorphic Computing

A sober analysis of the current weight of neuroscience data – specifically the computational complexity of the mapped cortical circuits and their potential for dramatic algorithmic optimization on faster machines – leads to the startling, remarkable conclusion that we already have the hardware capability to implement the brain’s algorithms in real-time today. In fact, it seems rather likely that by the time the brain is reverse engineered and we do figure out the software, the hardware will have already advanced enough that achieving faster than real-time performance will be quite easy.  The takeoff will likely be very rapid.

The Cortexture approach I described earlier, or any AI architecture running on today’s computers, will eventually run into a scalability problem due to disk and bus bandwidth speeds.  To really accelerate into the singularity, and get to 100s or 1,000s of times acceleration vs human thoughtspeed will likely require a fundamental redesign of our hardware along cortical lines.  Cortical neural circuitry is based on mixed analog and digital processing, and combines memory and analog computation in a single elemental structure – the synapse. The data storage and processing are both built into the synapses and the equivalent total raw information flow rate is roughly the total synapses multiplied by their signaling rate.  The important question really is thus what is the minimal efficient equivalent of the synapse for CMOS technology? Remarkably, the answer may be the mysterious 4th element of basic computing, the memristor. Discovered mathematically decades ago, this circuit building block was only realized recently and is already being heralded as the ‘future of artificial intelligence’ – as it has electric properties very similar to the synapse – combining long term data storage and computation in a single element. For a more in depth design for a complete artificial cortex based on this new circuit element, take a look at “Cortical computing with memristive nanodevices”. This is a fascinating approach, and could achieve cortical complexity parity fairly soon, if the required fabrication technology was ready and developed. However, even though the memristor is quite exciting and looks likely to play a major role in future neuromorphic systems, conventional plain old CMOS circuits certainly can emulate synapses.  Existing mixed digital/analog techniques can represent synapses in artificial neurons effectively using around or under 10 transistors.
This hybrid method has the distinct advantage of avoiding costly digital multipliers that use tens of thousands of transistors – instead using just a handful of transistors per synapse. The idea is that designs in this space can directly emulate cortical circuits in highly specialized hardware, performing the equivalent of a multiplication for every synaptic connection every clock cycle. There is a wide space of possible realizations of neuromorphic architectures, and this field looks to just be coming into its own.  Google: “artificial cortex” or “neuromorphic” for papers and more info.  DARPA, not to be outdone, has launched its own neuromorphic computing program, called SyNAPSE – which has its own blog, just so you can keep tabs on Skynet.


The important quantitative dimensions for the cortex are synapse density, total capacity (which is just density * surface area), and clock rate. The cortex topology is actually 2D: that of a flat, relatively thin sheet (around 6 layers of neurons thick) which is heavily folded into the volume of the brain, a space filling fractal. If you were to unfold it, it would occupy about 2,500 square centimeters (roughly 2.7 square feet) – the area of roughly a thousand typical processor dies. It has a density of about 100-4,000 million (10^8 to 4×10^9) synapses per mm^2. Current 40nm and 32nm CMOS technology circa 2010 can pack roughly 6-10 million (around 10^7) transistors onto a mm^2, so semiconductor density is within about a factor of 20-500 of the biological cortex in terms of feature density (more accurate synapse density figures await more detailed brain scans). This is a critical upcoming milestone – when our CMOS technology will match the information density and miniaturization level of the cortex.  This represents another 4 to 8 density doublings (currently occurring every 2 years, but expected to slow down soon), which we can expect to hit around the 11nm node or shortly thereafter in the early to mid 2020s – the end of the semiconductor industry’s current roadmap.  This is also the projected end of the road for conventional CMOS technology and where the semiconductor roadmap wanders into the more hypothetical realm of nano-electronics.  When that does happen, neuromorphic designs will have some distinct advantages in terms of fault tolerance and noise resistance which could allow them to scale forward more quickly.  It is also expected that moving more into the 3rd dimension will be important, and leakage and other related quantum issues will limit further speed and power efficiency improvements – all pointing towards more brain-like computer designs.
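A quick sanity check on the density numbers, as a Python sketch.  The gap factors and the two-year doubling period come from the paragraph above; the 250 mm^2 die size is my own assumption for a "typical" large processor die.

```python
import math

def doublings_needed(density_gap):
    """Number of density doublings required to close a given gap."""
    return math.log2(density_gap)

GAP_LOW, GAP_HIGH = 20, 500     # transistor vs synapse density gap
YEARS_PER_DOUBLING = 2          # circa 2010 pace
START_YEAR = 2010

for gap in (GAP_LOW, GAP_HIGH):
    n = doublings_needed(gap)
    year = START_YEAR + n * YEARS_PER_DOUBLING
    print(f"gap {gap}x: {n:.1f} doublings, parity around {year:.0f}")

# Unfolded cortex area expressed in processor dies.
CORTEX_AREA_MM2 = 2500 * 100    # 2,500 cm^2 in mm^2
DIE_AREA_MM2 = 250              # assumed die size, not from the text
cortex_dies = CORTEX_AREA_MM2 / DIE_AREA_MM2
print(f"cortex area is about {cortex_dies:.0f} dies")
```

The computed range of roughly 4 to 9 doublings is close to the 4 to 8 quoted above, and the dates only hold if doubling stays at a two-year pace, which the text itself expects to slow.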

Scaling up to the total memory/feature capacity of the brain (hundreds to a thousand trillion synapses), even when semiconductor technology reaches parity in density, will still take a large number of chips (having roughly equivalent total surface area). Today’s highest density memory chips have a few billion transistors, and you would need hundreds of thousands to equal the total memory of the brain. High end servers are just starting to reach a terabyte of memory (with hundreds of individual chips), and you would then need hundreds of these. A far more economical and cortically inspired idea is to forgo ‘chips’ completely and just turn the entire silicon wafer into a single large usable neuromorphic computing surface. The inherent fault tolerance of the cortex can be exploited by these architectures – there is no need to cut up the wafer into dies and identify defective components, they can be simply disabled or statistically ignored during learning. This fascinating contrarian approach to achieving large neuromorphic circuits is being explored by the FACETS research team in Europe. So, in the end analysis, it looks reasonable that in the near future (roughly a decade) a few hundred terabytes of cortical equivalent neuromorphic circuitry could be produced on one to a couple dozen CMOS wafers (the equivalent of a few thousand cheap chips), even using conventional CMOS technology. More importantly this type of architecture can be relatively simple and highly repetitive and it can run efficiently at low clock rates and thus at low power, greatly simplifying manufacturing issues. It’s hard to estimate the exact cost, but due to the combination of low voltage/clock, single uncut wafer design, perfect yield, and so on, the economics should be similar to memory – RAM chips, which arrive first at new manufacturing nodes, are cheaper to produce, and consume less power.  Current 2010 RAM prices are at about $10 per GB, or very roughly 1 billion transistors per dollar.
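The chip counts and the transistors-per-dollar figure can be reproduced with a few lines of Python.  The sketch takes "a few billion" to mean 4 billion transistors per chip and assumes one transistor per stored bit (as in a standard DRAM cell); both are my simplifying assumptions.

```python
BRAIN_SYNAPSES = 1e15               # upper estimate, one transistor each
TRANSISTORS_PER_CHIP = 4e9          # assumed value for "a few billion"
chips_needed = BRAIN_SYNAPSES / TRANSISTORS_PER_CHIP
print(f"memory chips needed: {chips_needed:,.0f}")

BRAIN_BYTES_LOW, BRAIN_BYTES_HIGH = 1e14, 1e15
SERVER_BYTES = 1e12                 # a one terabyte high end server
servers_low = BRAIN_BYTES_LOW / SERVER_BYTES
servers_high = BRAIN_BYTES_HIGH / SERVER_BYTES
print(f"terabyte servers needed: {servers_low:.0f} to {servers_high:.0f}")

DOLLARS_PER_GB = 10                 # 2010 RAM price from the text
BITS_PER_GB = 8e9
TRANSISTORS_PER_BIT = 1             # assumption: one transistor per DRAM bit
transistors_per_dollar = BITS_PER_GB * TRANSISTORS_PER_BIT / DOLLARS_PER_GB
print(f"transistors per dollar: {transistors_per_dollar:.0e}")
```

With these assumptions the chip count lands in the hundreds of thousands and the price works out to a little under a billion transistors per dollar, matching the rough figures above.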

Continuum of hardware efficiencies for cortical learning systems:

CPU, GPU simulation: efficiency (die area, performance, power) 10^-8 to 10^-6

FPGA, ASIC: efficiency 10^-5 to 10^-3

Neuromorphic (mixed analog/digital or memristors): 10^-2 to 1

CPU simulation is incredibly inefficient compared to the best solutions for a given problem, but the CPU’s versatility and general applicability across all problems ensure that CPUs dominate the market, and thus they get the most research attention, the economy of scale advantage, and are first to benefit from new foundry process improvements.  Dedicated ASICs are certainly employed widely today in the markets that are big enough to support them, but always face competition from CPUs scaling up faster.  At the far end are hypothetical cortical systems built from memristors, which could function as direct 1:1 synapse equivalents.  We can expect that as Moore’s law slows down this balance will eventually break down and favor designs farther down the spectrum.  Several forces will combine to bring about this shift: approaching quantum limitations which cortical designs are better adapted for, increased market potential of AI applications, and the end of the road for conventional lithography.

The Road Ahead

A human scale artificial cortex could be built today, if we had a complete connectome.  In the beginning it would start out as only an infant brain – it would then take years to accumulate the pattern recognition knowledge base of a two year old and begin to speak, and then could take a few dozen additional years to achieve an education and any real economic value.  This assumes, unrealistically, that the first design tested would work.  It would almost certainly fail.  Successive iterations would take even more time.  This is of course the real reason why we don’t have human-replacement AI yet: humans are still many orders of magnitude more economically efficient.

Yet we should not be so complacently comfortable in our economic superiority, for Moore’s law ensures that the cost of such a cortical system will decrease exponentially.

Now consider another scenario, where instead of being constrained to current CPUs or GPUs, we invest billions in radical new chip technologies and even new foundries to move down the efficiency spectrum with a neuromorphic design or at least a very powerful dedicated cortical ASIC.  Armed with some form of specialized cortical chip ready for mass volume production today at the cheap end of chip prices (where a dollar buys about a billion transistors, instead of hundreds of dollars for a billion transistors as in the case of high end logic CPUs and GPUs), we would expect a full human brain sized system to cost dramatically less: closer to the cost of the equivalent number of RAM transistors – on the order of $10 million for a petabyte (assuming 10 transistors per synapse; memristors are even better).  Following the semiconductor industry roadmap (which factors in a slowing of Moore’s law this decade), we could expect the cost of a quadrillion synapse or petabyte system to fall below $1 million by the end of the decade, and reach $100,000 in volume by the mid 2020s – the economic tipping point of no return.  But even at a million dollars a pop, a future neuromorphic computing system of human cortical capacity and complexity would be of immeasurable value, for it could possess a fundamental, simply mind-boggling advantage of speed.  As you advance down the specialization spectrum from GPUs to dedicated cortical ASICs and eventually neuromorphic chips and memristors, speed and power efficiency increase by orders of magnitude – with dedicated ASICs offering 10x to 100x speedups, and direct neuromorphic systems offering speedups of 1000x or more.  If it takes 20-30 years to train a cortex from infant to educated adult mind, power and training time are the main cost.  A system that could do that in 1/10th the time would be suddenly economical, and a system that could do that in 1/100th of the time of a human would rapidly bring about the end of the world as we know it.
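The cost and training-time estimates in this scenario follow from simple multiplication, sketched below. The inputs are the figures given in the text (a quadrillion synapses, about 10 transistors per synapse, about a billion transistors per dollar); the 25-year training time is an assumed midpoint of the 20-30 year range, and the function names are illustrative.

```python
# Rough cost and training-time estimate for a brain-scale neuromorphic system.
# Inputs follow the figures in the text; 25 years is an assumed midpoint.

def system_cost_dollars(synapses, transistors_per_synapse, transistors_per_dollar):
    """Cost of the transistors needed to implement the given synapse count."""
    return synapses * transistors_per_synapse / transistors_per_dollar

def training_years(human_years, speedup):
    """Wall-clock years needed to train at a given speedup over real time."""
    return human_years / speedup

cost_2010 = system_cost_dollars(1e15, 10, 1e9)
print(f"estimated cost at memory-like pricing: ${cost_2010:,.0f}")

for speedup in (1, 10, 100, 1000):
    print(f"{speedup:>5}x speedup: {training_years(25, speedup):.3f} years of training")
```

At a 100x speedup the 25-year education shrinks to a quarter of a year, which is the regime the paragraph describes as world-changing.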




Thinking at the Speed of Light



Our biological brains have high information densities and are extraordinarily power efficient, but this is mainly because they are extremely slow, with cycle rates in the hundreds of hertz or approaching a kilohertz.  This relatively slow speed is a fundamental limitation of computing with living cells and (primarily) chemical synapses with their organic fragility. Semiconductor circuits do not have this limitation. Operating at the low frequencies and clock rates of their biological inspirations, neuromorphic systems can easily simulate biological networks in real-time and with comparable energy efficiency.  The most efficient neuromorphic computer generally can access all of its memory and synapses every clock cycle, so it can still perform immense calculations per second at very low speeds, just like biological neural nets. But you can also push up the clock rate, pump more power through the system, and run the circuit at megahertz rate or even gigahertz rate, equivalent to one thousand to one million times biological speed. Current systems with mixed digital/analog synaptic circuits can already achieve 1000-10000x biological ‘real-time’ on old CMOS manufacturing nodes and at low power and heat points. This is not an order of magnitude improvement over simulation on a digital computer of similar size and technology; it’s more like six orders of magnitude.  That being said, the wiring problem will still be a fundamental obstacle.  The brain optimizes against this constraint by taking the form of a 2D sheet extensively folded into a packed 3D space – a space filling curve.  Much of the volume beneath the sheet is occupied by connectivity wiring – the white matter.  Our computer chips are currently largely 2D, but are already starting to move into the 3rd dimension.  A practical full electronic speed artificial cortex may require some novel solutions for high-speed connectivity, such as direct laser optical links, or perhaps a huge mass of fiber connections.  Advanced artificial cortices may end up looking much like the brain: with the 2D circuitry folded up into a 3D sphere, interspersed with something resembling a vascular system for liquid cooling, ensheathed in a mass of dense optical interconnects.  Whatever the final form, we can expect that the fundamental speed advantage inherent in electronics will be fully exploited.


By the time we can build a human complexity artificial cortex, we will necessarily already be able to run it many times faster than real time, eventually accelerating by factors of thousands and then even millions.


Speed is important because of the huge amount of time a human mind takes to develop.  Building practical artificial cortex hardware is only the first step.  To build a practical mind, we must also unlock the meta-algorithm responsible for the brain’s emergent learning behavior.  This is an active area of research, and there are some interesting emerging general theories, but testing any of them on a human sized cortex is still inordinately costly.  A fresh new artificial brain will be like an infant: full of noisy, randomized synaptic connections.  An infant brain does not have a mind so much as the potential space from which a mind will etch itself through the process of development.  Running in real-time in a sufficiently rich virtual reality, it would take years of simulation just to test development to a childhood stage, and decades to educate a full adult mind.  Thus accelerating the simulation many times beyond real-time has a huge practical advantage.  Thus the need for speed.

The test of a human-level AI is rather simple, and uses the same qualitative intelligence tests we apply to humans: its mind develops sufficiently to learn human language, then it learns to read, and it progresses through education into a working adult. Learning human language is ultimately the fundamental aspect of becoming a modern human mind – far more than your exact brain architecture or even substrate. If the brain’s cortical capacity is sufficient and the wiring organization is correctly mapped, it should then be able to self-educate and develop rapidly.

I highly doubt that other potential shortcut routes to AGI (artificial general intelligence) will bear fruit. Narrow AIs will always have their uses, as will simpler, animal-like AIs (non-language capable), but it seems inevitable that a human level intelligence will require something similar to a cortex (at least at the meta-algorithmic level of some form of self-organizing deep, hierarchical probabilistic networks – it doesn’t necessarily have to use ‘neurons’). Furthermore, even if the other routes being explored to AI do succeed, it’s even less likely that they will scale to the insane speeds that the cortex design should be capable of (remember, the cortex runs at < 1000 Hz, which means we can eventually take that same design and speed it up by a factor of at least a million). From a systems view, it seems likely that the configuration space of our biological cortex meta-wiring is effectively close to optimal in some sense – evolution has already well explored that state space. From an engineering perspective, taking an existing, heavily optimized design for intelligence and then porting it to a substrate that can run many orders of magnitude faster is a clear winning strategy.

Clock rate control, just like on our current computers, should allow posthumans to alter their speed of thought as needed. In the shared virtual realities they will inhabit with their human teachers and observers, they will think at ‘slow’ real-time human rates, with kilohertz clock rates and low power usage. But they will also be able to venture into a vastly accelerated inner space, drawing more power and thinking many many times faster than us. Running on even today’s CMOS technology, they could theoretically attain speeds up to about a million times faster than a biological brain, although at these extreme speeds the power and heat dissipation requirements would be large – like that of current supercomputers.

Most futurist visions of the Singularity consider AIs that are more intelligent, but not necessarily faster than human minds, but it’s clear that speed is the fundamental difference between the two substrates. Imagine one of these mind children growing up in a virtual environment where it could dilate time by a factor of 1-1000x at will. Like real children, it will probably require both imitation and reinforcement learning with adults to kick start the early development phases (walking, basic environment interaction, language, learning to read). Assuming everything else was identical (the hardware cortex is a very close emulation), this child could develop very rapidly – the main bottleneck being the slow-time interaction with biological humans. Once a young posthuman learns to read, it can hop on the web, download texts, and progress at a truly staggering pace – assuming a blindingly fast internet connection to keep up (although perceived internet latency would be subjectively far worse in proportion to the acceleration – can’t do much about that). Going to college wouldn’t really be a realistic option, but reading at 1000x real-time would have some pretty staggering advantages. It could read 30 years of material in about 11 days, potentially becoming a world class expert in a field of its choosing in little more than a week. The implications are truly profound. Entering this hypertime acceleration would make the most sense when reading, writing, or doing other intellectual work. The effect for a human observer would be that anything the posthuman was intelligent enough to do, it would be able to do near instantly from our perspective.  The presence of a posthuman would be unnerving.  With a constant direct internet connection and a 1000x acceleration factor, a posthuman could read a book during the time it would take a human to utter a few sentences in conversation.
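The reading-time figures above are straightforward time-dilation arithmetic, sketched here. The 1000x acceleration factor comes from the text; the 6 hours assumed for a human to read one book is an illustrative figure, and the function names are hypothetical.

```python
# Converts subjective (experienced) time into wall-clock time for a mind
# running at a fixed acceleration factor over real time.

DAYS_PER_YEAR = 365.25
SECONDS_PER_HOUR = 3600

def wall_clock_days(subjective_years, acceleration):
    """Wall-clock days that pass while the mind experiences `subjective_years`."""
    return subjective_years * DAYS_PER_YEAR / acceleration

def wall_clock_seconds(subjective_hours, acceleration):
    """Wall-clock seconds that pass during `subjective_hours` of experienced time."""
    return subjective_hours * SECONDS_PER_HOUR / acceleration

# 30 subjective years of reading at 1000x real time.
print(f"{wall_clock_days(30, 1000):.1f} days")

# A book that takes a human 6 hours to read (assumed figure), at 1000x.
print(f"{wall_clock_seconds(6, 1000):.1f} seconds")
```

Thirty subjective years compress to about eleven days, and a six-hour book to roughly twenty seconds, which is about the time it takes to say a few sentences aloud.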

Clearly existing in a different phase space than us, its only true peers would be other equivalent posthumans; if built as a lone research model, it could be lonely in the extreme. Perhaps it could be selected or engineered for monastic or ascetic qualities, but it would probably be more sensible to create small societies of posthumans which can interact and evolve together – humans are social creatures, and our software descendants would presumably inherit this feature by default. The number of posthumans and their relative intelligence will be limited by our current computing process technology: the transistor density and cost per wafer – so their potential population growth will be more predictable and follow semiconductor trends (at least initially).  Posthumans with more equivalent synapses and neurons than humans could presumably become super-intelligent in another quantitative dimension, that of mental capacity – able to keep track of more concepts, learn and recall more knowledge, and so on than humans – albeit with the slow linear scaling discussed previously. But even posthumans with mere human-capacity brains could be profoundly, unimaginably super-intelligent in their speed of thought, thanks to the dramatically higher clock rates possible on their substrate – and thus in a short matter of time they would become vastly more knowledgeable. The maximum size of an artificial cortex would be limited mainly by economics for the wafers and then by bandwidth and latency constraints. There are tradeoffs between size, speed, and power for a given fabrication technology, but in general, larger cortices would be more limited in their top speed.
The initial generations will probably occupy a fair amount of server floor space and operate not much faster than real-time, but then each successive generation will be smaller and faster, eventually approaching a form factor similar to the human brain, and eventually pushing the potential clock rate to the technology limits (more than a million times real-time for current CMOS tech). But even with small populations at first, it seems likely that the first successful generation of posthumans to reach upper-percentile human intelligence will make an abrupt and disruptive impact on the world. But fortunately for us, physics does impose some costs to thinking at hyperspeed.




Fast Brains and Slow Computers

A neuromorphic artificial cortex will probably have data connections that allow its synaptic structures to be saved to external storage, but a study of current theory and designs in the field dispels some common myths: an artificial cortex will be a very separate specialized type of computer hardware, and will not automatically inherit a digital computer’s supposed advantages such as perfect recall.  It will probably not be able to automagically download new memories or skills as easily as downloading new software.  The emerging general theories of the brain’s intelligence, such as the hierarchical Bayesian network models, all posit that learned knowledge is stored in a deeply non-local, distributed and connected fashion, very different from, say, a digital computer’s random access memory (even if the said synapses are implemented in RAM).  Reading (or accessing) memories and writing memories in a brain-like network intrinsically involves thinking about the associated concepts – as memories are distributed associations, everywhere tangled up with existing memory patterns.  An artificial cortex could be designed to connect to external computer systems more directly than through the senses, but this would have only marginal advantages.  For example, we know from dreams that the brain can hallucinate visual input by bypassing the lowest layers of the visual cortex and directly stimulating regions responsible for recognizing moving objects, shapes, colors, etc, all without actually requiring input from the retina.  But this is not much of a difference for a posthuman mind already living in a virtual reality – simulating sound waves and their conversion into neural audio signals and simulating the processing into neural patterns representing spoken dialog is not that much different than just directly sending the final dialog-representing neural patterns into the appropriate regions.  
The small differences will probably show up as what philosophers call qualia – those subjective aspects of consciousness or feelings that operate well below the verbal threshold of explanation.

Thus a posthuman with a neuromorphic artificial cortex will still depend heavily on traditional computers to run all the kinds of software that we use today, and to simulate a virtual environment complete with a virtual body and all that entails. But the posthuman will essentially think at the computer’s clock rate. The human brain has a base ‘clock rate’ of about a kilohertz, completing on the order of a thousand neural computing steps per second simultaneously for all the trillions of circuits. A neuromorphic computer works the same, but the clock rate can be dramatically sped up to CMOS levels. A strange and interesting consequence is that a posthuman thinking at hyperspeed would subjectively experience its computer systems and computer environment slow down by an equivalent rate. It’s likely the neuromorphic hardware will have considerably lower clock rates than traditional CPUs for power reasons, but they have the same theoretical limit, and running at the same clock rate, a posthuman would experience a subjective second in just a thousand clock cycles, which is hardly enough time for a traditional CPU to do anything. Running at a less ambitious acceleration factor of just 1000x, and with gigahertz computer systems, the posthuman would still experience a massive slowdown in its computer environment, as if it had jumped back more than a decade in time to a vastly slower era of megahertz computing.  However, we can imagine that by this time traditional computers will be much further down the road of parallelization, so a posthuman’s typical computer will consist of a very large number of cores and software will be much more heavily parallelized.  Nevertheless, it’s generally true that a posthuman, no matter what level of acceleration, will still have to wait the same amount of wall-clock time as anyone else, including a regular human, for its regular computations to complete.
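The subjective slowdown described here can be made concrete with a small sketch. It assumes the text’s figure of a roughly 1 kHz biological base rate and a 1 GHz conventional computer; the function names are illustrative. The perceived clock rate is simply the real clock rate divided by the acceleration factor, which is also the number of CPU cycles that fit into one subjective second.

```python
# How fast does a conventional computer "feel" to an accelerated mind?
# Assumes the ~1 kHz biological base rate given in the text.

BIOLOGICAL_CLOCK_HZ = 1e3

def acceleration_factor(cortex_clock_hz, biological_hz=BIOLOGICAL_CLOCK_HZ):
    """Speed of thought relative to a biological brain."""
    return cortex_clock_hz / biological_hz

def perceived_clock_hz(computer_clock_hz, acceleration):
    """Computer cycles that elapse per subjective second of the accelerated mind."""
    return computer_clock_hz / acceleration

computer_hz = 1e9  # a 1 GHz conventional computer (assumed)
for cortex_hz in (1e3, 1e6, 1e9):
    accel = acceleration_factor(cortex_hz)
    felt = perceived_clock_hz(computer_hz, accel)
    print(f"cortex at {cortex_hz:.0e} Hz: {accel:.0e}x thought speed, "
          f"computer feels like {felt:.0e} Hz")
```

At 1000x the gigahertz machine feels like a megahertz one, and at a million times it feels like a kilohertz one, with only about a thousand cycles per subjective second.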


Ironically, while posthumans will eventually be able to think thousands or even millions of times faster than biological humans, using this quickening ability will have the perceptual effect of slowing down the external universe in direct proportion – including their computer environment.  Thus greatly accelerated posthumans will spend proportionally inordinate amounts of subjective time waiting on regular computing tasks.

The combination of these trends leads to the conclusion that a highly variable clock rate will be an important feature for future posthuman minds.  Accelerating to full thought speed – quickening – will probably be associated with something like entering an isolated meditative state.  We can reason that at least in the initial phases of posthuman development, their simulated realities will mainly run at real-time, in order to provide compatibility with human visitors and to provide full fidelity while conserving power.  When quickening, a posthuman would experience its simulated reality slowing down in proportion, grinding to a near halt at higher levels of acceleration.  This low-computational mode would still be very useful for much of human mental work: reading, writing, and old-fashioned thinking.

In the present, we are used to computers getting exponentially faster while the speed of human thought remains constant.  All else being equal, we are now in a regime where the time required for fixed computational tasks is decreasing exponentially (even if new software tends to eat much of this improvement).  The posthuman regime is radically different.  In the early phases of ramp up the speed of thought will increase rapidly until it approaches the clock rate of the processor technology.  During this phase the trend will actually reverse – posthuman thoughtspeed will increase faster than computer speed, and from a posthuman’s perspective, computers will appear to get exponentially slower.  This phase will peter out when posthuman thoughtspeed approaches the clock rate – somewhere around a million times human thoughtspeed for the fastest, most extremely optimized neuromorphic designs using today’s process technology (gigahertz vs a kilohertz).  At that point there is little room for further raw speed of thought improvements (remember, the brain can recognize objects and perform relatively complex judgements in just a few dozen ‘clock cycles’ – not much room to improve on that in terms of speed potential given its clock rate).

After the initial ramp up regime, Moore’s law will continue of course, but at that point you enter a new plateau phase.  In this second regime, once the algorithms of intelligence are well mapped to hardware designs, further increases in transistor density will enable more traditional computer cores per dollar and more ‘cortical columns or equivalents’ per dollar in direct proportion.  Posthuman brains may get bigger, and/or they may get cheaper, but the clock speed wouldn’t change much (as any process improvement in clock rate would speed up both traditional computers and posthuman brains).  So in the plateau phase, you have this weird effect where computer clock rate is more or less fixed at a far lower level than we are used to – about a million times lower from the perspective of the fastest neuromorphic posthuman brain designs.  This would correspond to computer clock rates measured in the kilohertz.  The typical computer available to a posthuman by then would certainly have far more cores than today, thousands or perhaps even millions, but they would be extremely slow from the posthuman’s perspective.  Latency and bandwidth would be similarly constrained, which would effectively expand the size of the world in terms of communication barriers – and this single idea has wide ranging implications for understanding how posthuman civilizations will diverge and evolve.  It suggests a strong diversity-increasing counter-globalization effect which would further fragment and disperse localized sub-populations for better or worse.

What would posthumans do in the plateau phase, forever limited to extremely slow, roughly kilohertz-speed computers?  This would limit the range of effective tasks they could do in quicktime.  Much of hardware and software design, engineering, etc would be limited by slow computer speeds.  Surprisingly, the obvious low-computational tasks that could still run at full speed would be the simpler, lower technology creative occupations such as writing.  It’s not that posthumans wouldn’t be good at all computer intensive tasks as well – they certainly would be superhuman in all endeavors.  The point is rather that they will be vastly, incomprehensibly more effective only in those occupations that are not dependent on the speed of general purpose computation.  Thus we can expect that they will utterly dominate professions such as writing.

It seems likely that very soon into the posthuman era, the bestseller lists will be inundated by an exponentially expanding set of books, the best of which would be noticeably better than anything humans could write (and most probably written under pseudonyms and with fake biographies).  When posthumans achieve 1000x human thoughtspeed, you might go on a short vacation and come back to find several years of new literature.  When posthumans achieve a million times human thoughtspeed, you might go to sleep and wake up to find that the number of books in the world (as well as the number of languages) just doubled overnight.  Of course, by that point, you’re already pretty close to the end.

We can expect that the initial posthuman hardware requirements will be expensive and thus they will be few in number and of limited speed, but once they achieve economic parity with human workers, we can expect the tipping point to crash like a tidal wave, with a rapid succession of hardware generations increasing maximum thoughtspeed while reducing size, cost, and power consumption and huge economies of scale leading to an exponential posthuman population expansion, and their virtual realities eventually accelerating well beyond human comprehension.




Posted in Singularity, Technology

Know thyself: personal identity, uploading, and duplicity

Posted by jcannell on June 27, 2010

The so called consciousness conundrum

The future according to the Singularity posits a new age of wonders, and the promise of effective immortality through radical new technologies such as mind uploading and medical nanobots.  The broad scope of augmentation and change these developments will enable is the basis for the concepts of transhumanity and posthumanity.  We are but a stage in the unfolding evolution of the universe, and history shows that life must always change and adapt in order to survive and progress.  For many, the changes envisioned by Singularitarians are too radical, and some even argue that the broader transhumanist agenda itself is set on the extinction of humanity[1].  A related persistent set of critics maintain that while some forms of immortality – such as indefinite biological repair – are feasible, other technologies such as mind uploading, which permit duplicity, could not possibly preserve what we most care about: personal subjective identity[2].  The second viewpoint may be common; it was espoused indirectly by Bill Gates in dialog in “The Singularity Is Near”, for example.  Much depends on just what exactly we choose to identify with, both individually and collectively.

There is overwhelming evidence and consensus within the scientific community that the mind, and thus personal identity, has a physical basis in the brain.  A complete analysis of the accumulated evidence from psychology, neurology, cognitive science, and yes even artificial intelligence leads to the typical conclusion that consciousness is a physical information processing phenomenon.  Thus most people today can accept that other systems, such as sufficiently advanced computer systems, could exhibit not only human level intelligence, but consciousness similar or equivalent to the human experience.  But to accept that uploading is possible requires more than just encoding consciousness in a machine substrate: it requires encoding a particular mind such that one’s particular personal identity and consciousness is preserved and then realized in a new substrate.

Personal Identity: What am I?

Sometimes our habits of everyday experience and language obscure the deeper issues of personal identity.  If someone showed you a picture of a child, and you recognized it as your own childhood picture, you might say “That’s me.  I was seven years old then.”  You thus self-identify with the child in the picture.  Of course, assuming you are not a seven year old looking at a live video feed, the child in the picture no longer exists – you are thus self-identifying with a historical person.  Imagine then a time portal which brings that child into the present, into your presence.  Would it still be correct then to say, “That’s me”?  Clearly it would not be you, as you and the child would both exist separately in the present – adult-you and child-you would be separate intelligent beings with their own threads of consciousness and sense of personal identity.  But curiously, the time travel would change neither you nor the child.  So unless you accept that you can be two people simultaneously, the child can’t be you.  But even without the time travel, it’s not quite correct to say “that’s me”; it would be more correct to say “that was me”.  At some point in the past, you were a child. Out of the space of all possible people, that child became you.  So you can correctly self-identify with it, but only partially – you have probably changed considerably since then.  That partial self-identification has a psychological and physical basis in memories you may have, and an arrow of evolutionary development, a continuity extending back from your current state of mind to that of the historical child.  Likewise, going forward in time, you will change.  You will become someone else, and if God transported that future version of you back into your presence right now, it would clearly not be you – and could potentially be less similar to your current self than other people.  
And yet, you self-identify with a future version of yourself, you project your identity forward in time, and unless you are suicidal, even make sacrifices in the present for the benefit of that future person.  You could object that invoking time travel or God to provide the physics implies there is something unreal about these thought experiments, but if you accept computationalism and the Simulation Argument, any such thing is possible.

The fact is that the human mind (and really any functional mind) has a strong sense of self-identity simply because it has obvious evolutionary value.  Yet the exact consciousness you experience right now exists as only a brief moment in time, and you are never exactly the same person as any past or future version of yourself.  Your cells, and the neurons of your brain, are made of largely different molecules than they were years ago, and the configurations of those molecular walls and the all-important synaptic junctions, which are the locus of the brain’s information processing and storage, change as we form new memories, beliefs, ideas, thoughts, and feelings.  We are constantly changing, yet we maintain a strong sense of personal identity stretching back into our history and projecting forward into our future.  As John Locke ingeniously argued more than 300 years ago, we are the same person to the extent that we are conscious of our past and future thoughts and actions in the same way as we are conscious of our present thoughts and actions.  Or put another way, I am that who I remember myself to be, that who I am conscious of being.

Those who accept this line of reasoning so far usually accept technologies which change non-essential physical elements of the brain but permit only a single forward thread of consciousness, i.e. no duplicities.  Even if these technologies significantly change the brain, as long as they preserve the essential physical system underlying conscious identity, they are no more problematic than the significant physical changes the brain undergoes during regular life, including the frequent molecular replacement of cells including neurons, and their continuously shifting interconnection synapses.  We can denote the essential physical system underlying conscious identity as the mind: the essential subset of the brain that must be maintained.  Current physical evidence shows that the mind is physically encoded in the microscopic synaptic junctions – the fundamental circuit building blocks of the brain.  The rest, such as the skull, circulatory system, glial cells, and even the neurons themselves, are largely secondary structures, supporting the computation of the synaptic network.  It’s also important to remember that continuity of identity is fluid and variable: some change is inevitable, but we must draw the line somewhere.

Consider some thought experiments:

Complete Amnesia:  Jane’s mind is wiped by some highly selective destructive process which just randomizes the delicate synaptic connections, but otherwise leaves the brain and all of its neurons structurally intact.  If the brain could survive this, current science predicts that Jane would essentially restart life as an infant – all memories, learned behaviors, personality traits, etc – everything that mentally identifies Jane as Jane – would be erased.  Suppose Jane is then abducted and transported to a foreign country, and grows up with a new language, culture, and social identity as Katie.  To what extent can we say that Jane and Katie are the same person?  Is conscious identity preserved?

Mind Transfer:  Suppose Jane’s mind is wiped as in Complete Amnesia, but instead of randomizing the synaptic connections, the memories and patterns of another person are encoded into the synapses – say those of Bill.  (yes, I’m aware that this may be near-impossible without also altering neuron wirings or adding or removing neurons, but suppose godlike technology minimizes that)  Bill has all of his memories intact, never has any mental connection to Jane, but now inhabits Jane’s body.  Is Bill still Jane somehow?  Is Jane’s conscious identity preserved?

Brain Transfer: Same as above, except Bill’s entire brain is transferred into Jane’s skull, and Jane’s brain is thrown out.

According to current understanding of the brain’s circuitry, all of these cases result in Jane’s death – the irreversible cessation of her personal conscious identity.  (unless her brain synaptic structures are recorded and preserved)

Teleportation and Duplicity:

Slice and Dice: A post-singularity technology near instantly slices up your entire body into very small pieces and then just as quickly perfectly reassembles you.  If this had no physically detectable effects, would it affect your conscious identity in any way?  Would it still be you?  Does it matter how fine the slices are?  Macroscopic, microscopic, cellular, molecular, atomic, sub-atomic, does it matter?

Slice, Dice, and Store: You are sliced and diced, but instead of being immediately reassembled, your pieces are stored in a perfect stasis, and you are then reassembled as before sometime later.  Do you die?  Or are you just in a form of stasis?  Does it matter how long you are in stasis?  Would your conscious identity continue?

Slice, Dice and Teleport: You are sliced, diced and stored as above, but instead of being reassembled in place, your pieces are transported and reassembled elsewhere.  Now imagine that a couple of the pieces are replaced in transit with their complete information description, which is then used to construct perfect replicas of those pieces somewhere else from building blocks.  Is it still you?  Does it matter how many pieces are replaced?  Remember that as far as the universe is concerned, there is no detectable difference for external observers no matter how many pieces are replaced.  And of course, our constituent pieces are being replaced at the molecular level continuously as part of organic metabolism.

Slice, Dice, and Duplicate:  Now imagine that half of your pieces, A, are sent to one location, with the other half, B, replaced by their information description.  Then the remaining physical set of B pieces is sent to a different location along with the information description of A.  At location A, the B pieces are rebuilt from information and you are reconstructed.  At location B, the A pieces are rebuilt from information and you are reconstructed.  You are thus reconstructed at both locations, and the two reconstructions are identical except for their location.  Neither version is wholly a copy – both are built out of a mix of original and copied components – but both versions are 100% physically indistinguishable from the versions constructed in the prior experiments.  Which version are you?  Does your conscious identity continue in one, both, or neither?

This last thought experiment is initially unsettling to most people: it’s difficult to accept that both versions are still you in the same sense as the prior thought experiments, and it’s difficult to accept that you could essentially duplicate your conscious identity, becoming two (or more) future selves.  It’s easier to think that one is the ‘original’ and one is the ‘copy’, and that your consciousness is only preserved in the ‘original’ and not the ‘copy’, but it’s clear that any such designation is completely arbitrary: neither A nor B has any more or less of a claim to being the ‘original’.  It’s also difficult to accept that this process results in two new beings who do not continue your conscious identity, ie that this process somehow kills you, when clearly it is not any worse than the prior thought experiments.

The Duplicity Problem:

” .. On the day when you were one, you became two. But when you become two, what will you do?” GoT 11

Our evolved capacity for introspective conscious self-awareness and forward prediction of that self-awareness to future versions of ourselves never had to contend with anything more than a single forward path of conscious identity.  Thus, duplicity thought experiments are difficult to intuitively accept.  However confusing to us, the universe can never be confused, only we can.  However difficult to intuit, the laws of physics have nothing against our streams of consciousness forking and branching into two or more paths.  In the slice, dice, and duplicate thought experiment, you become both A and B.  From that point forward, those two people will be two instances of yourself, and will then slowly begin to diverge.  Both will be you; they will both self-identify with you just as easily and as much as you self-identify with the person you were a minute ago.  Both will have an equally valid claim to being you.  You will become both.  It’s intuitively easier and almost equivalent to consider that your conscious identity stream will continue randomly into one path: ie you will randomly become one or the other.  It’s not quite as correct, but nearly equivalent in terms of consequences.

The consequences of the rational, objective approach to personal identity and duplicity are:

  • Various forms of uploading are possible, and any form that fully preserves the essential physical information of the mind – ie the synaptic connectivity information, is sufficient to preserve personal conscious identity, including uploading and transfer to a non-biological substrate
  • Conscious identity changes over time and there is a slippery slope spectrum of possible preservations: some arbitrary legal delineation must be made
  • Duplication does not present any problem for personal conscious identity: assuming all duplicates are equally valid copies, they all preserve conscious identity and should all have equivalent rights and legal inheritance.  Essentially this means that all valid variants of a duplicating mind should equally inherit that mind’s legal and economic identity, wealth, and so on, while also being recognized as new people going forward
  • Duplicating oneself does help ensure survival, but that is no consolation to any future version which dies

Posted in Conscious Identity, Mind Uploading

Update: moving blog

Posted by jcannell on June 26, 2010

I’m in the process of porting this blog over to WordPress from Blogger, largely due to accumulated frustration with Blogger’s editor.  I’m also preparing a larger volume of mainly Singularity related writings that I’ve accumulated over the last year into a more organized form for this site.  The intro page is a good start.

Posted in Uncategorized

Latency & Human Response Time in current and future games

Posted by jcannell on April 4, 2010

I’m still surprised at how many gamers and even developers aren’t aware that typical games today have total response latencies ranging anywhere from 60-200ms. We tend to think of latencies in terms of pings, and the idea that the response time or ‘ping’ from a computer or a console five feet away can be comparable to the ping of a server a continent away seems unnatural.

Yet even though it seems odd, it’s true.

I just read through “Console Gaming: The Lag Factor”, a recent blog article on Eurogamer which follows up on Mick West’s original Gamasutra article that pioneered measuring the actual response times of games using a high speed digital camera. For background, I earlier wrote a GDM article (Gaming in the Cloud) that referenced that data and showed how remotely rendered games running in the cloud have the potential to at least match the latency of local console games, primarily by running at a higher FPS.

The Eurogamer article alludes to this idea:

In-game latency, or the level of response in our controls, is one of the most crucial elements in game-making, not just in the here and now, but for the future too. It’s fair to say that players today have become conditioned to what the truly hardcore PC gamers would consider to be almost unacceptably high levels of latency to the point where cloud gaming services such as OnLive and Gaikai rely heavily upon it.

The average videogame runs at 30FPS, and appears to have an average lag in the region of 133ms. On top of that is additional delay from the display itself, bringing the overall latency to around 166ms. Assuming that the most ultra-PC gaming set-up has a latency less than one third of that, this is good news for cloud gaming in that there’s a good 80ms or so window for game video to be transmitted from client to server


It’s really interesting to me that the author assumes that an “ultra-PC” gaming set-up has a latency less than one third of a console’s – even though the general model developed in the article posits that there is no fundamental difference between PCs and consoles in terms of latency – other than framerate.

In general, the article shows that games have inherent delay measured in frames – the minimum seems to be about 3 frames of delay, but can go up to 5 for some games. The total delay in time units is simply N/F, the number of frames of delay over the frame rate. A simple low-delay app will typically have the minimum delay – about 3, which maps to around 50ms running at 60fps and 100ms at 30fps.
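
As a rough sketch of the N/F relation above, the arithmetic can be written out in a few lines of Python. The function name `frame_delay_ms` is just an illustrative choice and does not come from the Eurogamer article; the frame counts are the 3 and 5 frame figures quoted above.

```python
def frame_delay_ms(frames_of_delay: int, fps: float) -> float:
    """Total response delay in milliseconds for N frames of lag at F frames per second."""
    return 1000.0 * frames_of_delay / fps


if __name__ == "__main__":
    # 3 frames is the rough minimum quoted above, 5 frames the upper end for some games.
    for fps in (30, 60):
        for frames in (3, 5):
            delay = frame_delay_ms(frames, fps)
            print(f"{frames} frames of lag at {fps}fps is about {delay:.0f}ms")
```

The same frame count costs half as much wall-clock time when the framerate doubles, which is the whole basis of the argument that follows.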

There is no fundamental difference between consoles and PCs in this regard other than framerate – the PC version of a game running at 60fps will have the same latency as its console sibling running at 60fps. Of course, take a 30fps console game and run it at 60fps and you halve the latency – and yes, this is exactly what cloud gaming services can exploit.


The Eurogamer article was able to actually measure just that – supporting this model with some real world data. The author was able to use the vsync feature in BioShock to measure the response difference between 59fps and 30fps, and as expected, the 59fps case had just about half the latency.

The other assertion of the article – or rather the whole point of the article – was that low response times are really important for the ‘feel’ of a game. So I’d like to delve into this in greater detail. As a side note though, the fact that delay needs to be measured for most games to make any sort of guess about its response time tells you something.

Firstly, on the upper end of the spectrum, developers and gamers know from 1st hand experience that there definitely is an upper window to tolerable latency, although it depends on the user action. For most games, controlling the camera with a joypad or mouse feels responsive with a latency of up to 150ms. You might think that the mouse control would be considerably more demanding in this regard, but the data does not back that up – I assert that PC games running at 30fps have latencies in the same 133-150ms window as 30fps console games, and are quite playable at that fps (and some have even shipped capped at 30fps).


There is a legitimate reason for a PC gamer to try to minimize their system latency as much as possible for competitive multiplayer gaming, especially twitch shooters like Counter-Strike. A system running with vsync off at 100fps might have latencies under 50ms and will give you a considerable advantage over an opponent running at 30fps with 133-150ms of base system latency – no doubt about that.


But what I’m asserting is that most gamers will barely – if at all – be able to notice the difference of delay times under 100ms in typical scenarios in FPS and action games – whether using a gamepad or mouse and keyboard. As the delay times exceed some threshold they become increasingly noticeable – 200ms of delay is noticeable to most users, and 300ms becomes unplayable. That being said, variation in the delay is much more noticeable. The difference between a perfectly consistent 30fps and 60fps is difficult to perceive, but an inconsistent 30fps is quite noticeable – the spike or changes in response time from frame to frame themselves are neurologically relevant and quite detectable. This is why console developers spend a good deal of time to optimize the spike frames and hit a smooth 30fps.

There is however a class of actions that do have a significantly lower latency threshold – the simple action of moving a mouse cursor around on the screen! Here I have some 1st hand data. A graphics app which renders its own cursor, has little buffering and runs at 60fps will have about 3 frames or about 50ms of lag, and at that low level of delay the cursor feels responsive. However if you take that same app and slow it down to 30fps, or even add just a few more frames of latency at 60fps the cursor suddenly seems to lag behind. The typical solution is to use the hardware cursor feature which short circuits the whole rendering pipeline and provides a direct fast path to pipe mouse data to the display – which seems to be under 50ms. For the low-latency app running at 60fps, the hardware cursor isn’t necessary, but it becomes suddenly important at some threshold around 70-90ms.

I think that this is the real absolute lower limit of human ability to perceive delay.

Why is there such a fundamental limit? In short: the limitations of the brain.

Ponder for a second what it actually means for the brain to notice delay in a system. The user initiates an action and sometime later this results in a response, and if that response is detected as late, the user will notice delay. Somewhere, a circuit (or potentially circuits, but for our purposes this doesn’t matter) in the brain makes a decision to initiate an action. This message must then propagate down to muscles in the hand, where it enters the game system through the input device. Meanwhile in the brain, the decision circuit must also send messages to the visual circuits of the form “I have initiated action and am expecting a response – please look for this and notify me immediately on detection”. It’s easier to imagine the brain as a centralized system like a single CPU, but it is in fact the exact opposite – massively distributed – a network really on the scale of the internet itself – and curiously for our discussion, with latencies comparable to the internet itself.

Neurons can typically fire only about once every 10ms, perhaps as quickly as once every 5ms in some regions. The fastest neural conduits – myelinated fiber – can send signals from the brain to the fingertip (one way) in about 20ms. So now imagine using these slow components to build a circuit that could detect a timing delay as short as 60ms.

Let’s start with the example of firing a gun. At a minimum, we have some computation to decide to fire, and once this happens the message can be sent down to the fingertip to pull the trigger and start the process. At the same time, for the brain to figure out if the gun actually fired in time, the message must also be sent to the visual circuits, which must process the visual input stream and determine if the expected response exists (a firing gun). This information can then be sent to some higher circuit, which computes whether the visual response (the gun firing response pattern exists or not at this moment in time) matches the action initiated (the brain sent a firing signal to the finger at this moment in time).

Built out of slow 10ms neurons, this circuit is obviously going to have a lot of delay of its own, which is going to place some limits on its response time and ability to detect delay. Thinking of the basic neuron firing system as the ‘clock rate’ and the brain as a giant computer (which it is in the abstract sense), it appears that the brain can compute some of these quick responses in as little as around a dozen ‘clock cycles’. This is pretty remarkable, even given that the brain has trillions of parallel circuits. But anyway, the brain could detect even instantaneous responses if it had the equivalent of video buffering. In other words, if the brain could compensate for its own delay, it could detect delays in the firing response on timescales shorter than its own response time. For this to happen though, the incoming visual data would need to be buffered in some form. The visual circuits, instead of being instructed to signal upon detection of a firing gun, could be instructed to search for a gun firing X ms in the past. However, to do this they would need some temporal history – the equivalent of a video buffer. There are reasons to believe some type of buffering does exist in the brain, but with limitations – it’s nothing like a computer video buffer.

The other limitation to the brain’s ability to detect delays is the firing times of neurons themselves, which make it difficult to detect timings on scales approaching the neuron firing interval.

But getting back to the visual circuits, the brain did not evolve to detect lag in video games or other systems. Just because it’s theoretically possible that a neural circuit built out of relatively slow components could detect fast responses by compensating for its own processing delay does not mean that the brain actually does this. The quick ‘twitch’ circuits we are talking about evolved to make rapid decisions – things like: detect creature, identify as prey or predator, and initiate flight or fight. These quick responses involve rapid pattern recognition, classification, and decision making, all in real-time. However, the quick response system is not especially concerned with detecting exactly when an event occurred; it’s optimized for the problem of reacting to events rapidly and correctly. Detecting if your body muscles reacted to the run command at the right time is not the primary function of these circuits – it is to detect the predator threat and initiate the correct run response rapidly. The insight and assertion I’m going to make is that our ability to detect delays in other systems (such as video games) is only as good as our brain’s own quick response time – because it uses the same circuits. Psychological tests show the measured response time is around 200ms for many general tasks, probably getting a little lower for game-like tasks with training. A lower bound of around 100-150ms for complex actions like firing guns and moving cameras seems reasonable for experienced players.

For moving a mouse cursor, the response time appears to be lower, perhaps 60-90ms. From this brain model, we can expect that for a few reasons. Firstly, the mouse cursor is very simple and very small, and once the visual system is tracking it we can expect that detecting changes in its motions (to verify that it’s moving as intended) is computationally simple and can be performed in the minimal number of steps. Detecting that the entire scene moved in the correct direction, or that the gun is in its firing animation state, are far more complex pattern recognition tasks, and we can expect they would take more steps. So detecting mouse motion represents the simplest and fastest type of visual pattern recognition.

There is another factor at work here as well: rapid eye saccades. The visual system actually directs the eye muscles on frame by frame time scales that we don’t consciously perceive. When recognizing a face, you may think you are looking at someone right in the eye, but if you watched a high res video feed of yourself and zoomed in on your eyes in slow motion, you’d see that your eyes are actually making many rapid jumps – leaping from the eyebrow to the lips to the nose and so on. Presumably when moving around a mouse cursor, some of these saccades are directed to predicted positions of the mouse to make it easy for the visual system to detect its motion (and thus detect if it’s lagging).

So in summary, experimental data (from both games and psychological research) leads us to expect that the threshold for human delay detection is around:

above 300ms – games become unpleasant, even unplayable
above 200ms – delay becomes palpable
100-150ms – limit of delay detection for full scene actions – camera panning and so on
50-60ms – absolute limit of delay detection – small object tracking – mouse cursors
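
To make the tiers concrete, here is a small sketch that maps a measured delay onto the rough bands listed above. The function name and the exact cut-off values are assumptions made for illustration: the summary gives ranges, not hard boundaries, so the sketch simply uses the upper end of each detection range (150ms for full scene actions, 60ms for small object tracking).

```python
def perceived_delay(delay_ms: float, small_object: bool = False) -> str:
    """Map a total response delay onto the rough perception tiers from the summary.

    small_object=True models the mouse cursor case; otherwise full scene
    actions such as camera panning are assumed.
    """
    # Assumption: take the upper end of each quoted detection range as the cut-off.
    detection_limit_ms = 60.0 if small_object else 150.0

    if delay_ms >= 300.0:
        return "unpleasant or unplayable"
    if delay_ms >= 200.0:
        return "palpable"
    if delay_ms >= detection_limit_ms:
        return "detectable"
    return "below detection threshold"


if __name__ == "__main__":
    for delay in (50, 100, 166, 250, 320):
        print(delay, "ms:", perceived_delay(delay), "/ cursor:", perceived_delay(delay, small_object=True))
```

A 100ms delay comes out as undetectable for camera panning but detectable for a cursor, which matches the hardware cursor observation earlier in the post.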

Delay is a strongly non-linear phenomenon, undetectable below a certain threshold and then ramping up to annoying and then deal-breaking soon after. It’s not a phenomenon where less is always better. Less beyond a certain point doesn’t matter from a user experience point of view. (Of course, for competitive twitch gaming, having less delay is definitely advantageous even when you can’t notice it – but this isn’t relevant for console type systems where everyone has the same delay.)

So getting back to the earlier section of this post, if we run a game on a remote pc, what can we expect the total delay to be?

The cloud system has several additional components that can add delay on top of the game itself: video compression, the network, and the client which decompresses the video feed.

Without getting into specifics, what can we roughly expect? Well, even a simple client which just decompresses video is likely to exhibit the typical minimum of roughly 3 frames of lag. Let’s assume the video compression can be done in a single frame and the network and buffering add another; we are then looking at roughly 5 frames of additional lag with a low ping to the server – with some obvious areas that could be trimmed further.

If everything is running at 60fps, a low latency game (3 frames of internal lag) might exhibit around 8/60 or 133ms of latency, and a higher latency game (5 frames of internal lag) might exhibit 10/60 or 166ms of latency. So it seems reasonable to expect that games running at 60fps remotely can have latencies similar to local games running at 30fps. Ping to the server then does not represent even the majority of the lag, but obviously can push the total delay into the unplayable range as the ping grows – and naturally every frame of delay saved allows the game to be playable at the same quality at increasingly greater distances from the server.
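
The frame budget in this estimate can be tallied in a short sketch. The split of the 5 extra frames into client, encode and network components follows the rough assumptions stated in the text; the function and parameter names are illustrative only, and ping is deliberately left out.

```python
def cloud_latency_ms(game_frames: int, fps: float,
                     client_frames: int = 3,
                     encode_frames: int = 1,
                     network_frames: int = 1) -> float:
    """Estimated end-to-end latency (ms) for a remotely rendered game, excluding ping.

    game_frames:    internal lag of the game itself (about 3 to 5 frames)
    client_frames:  lag of the client that decompresses the video feed
    encode_frames:  video compression time on the server
    network_frames: network transfer and buffering
    """
    total_frames = game_frames + client_frames + encode_frames + network_frames
    return 1000.0 * total_frames / fps


if __name__ == "__main__":
    for game_frames in (3, 5):
        latency = cloud_latency_ms(game_frames, fps=60)
        print(f"{game_frames} internal frames at 60fps: {latency:.1f}ms before ping")
```

Trimming any one of the defaults by a single frame saves about 17ms at 60fps, which is why each frame saved translates directly into extra tolerable distance from the server.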

What are the next obvious areas of improvement? You could squeeze and save additional frames here and there (the client perhaps could be optimized down to 2 frames of delay – something of a lower limit though), but the easiest way to further lower the latency is just to double the FPS again.

120fps may seem like a lot, but it also happens to be a sort of requirement for 3D gaming, and is the direction that all new displays are moving. At 120fps, the base lag in such an example would be around 8/120 to 10/120, or around 66ms to 83ms of latency, comparable to 60fps console games running locally. This also hints that a remotely rendered mouse cursor would be viable at such high FPS. At 120fps, you could have a ping as high as 100ms and still get an experience comparable to a local console.
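
One way to read the 100ms figure is as ping headroom: the gap between a target latency and the base lag of the remote pipeline. The sketch below assumes the target is the roughly 166ms overall latency quoted earlier for a typical 30fps console game with display delay; that choice of target, and the function name, are assumptions for illustration.

```python
def ping_headroom_ms(target_ms: float, total_frames: int, fps: float) -> float:
    """Ping budget left after subtracting the pipeline's frame-based lag from a target latency."""
    base_lag_ms = 1000.0 * total_frames / fps
    return target_ms - base_lag_ms


if __name__ == "__main__":
    # Assumed target: ~166ms, the overall 30fps console latency quoted earlier.
    target = 166.0
    for total_frames in (8, 10):
        headroom = ping_headroom_ms(target, total_frames, fps=120)
        print(f"{total_frames} frames at 120fps leaves about {headroom:.0f}ms for ping")
```

Under these assumptions the low latency case leaves just under 100ms for the round trip, and the higher latency case a bit over 80ms.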

This leads to some interesting rendering directions if you start designing for 120fps and 3D, instead of the 30fps games are typically designed for now. The obvious optimization for 120fps and 3D is to take advantage of the greater inter-frame coherence. Reusing shading, shadowing, lighting and all that jazz from frame to frame has proportionately greater advantage at high FPS as the scene will change proportionately less between frames. Likewise, the video compression work and bitrate scales sublinearly, and actually increases surprisingly slowly as you double the framerate.



Posted in graphics

New Job

Posted by jcannell on January 28, 2010

I’m moving in about a week to start a new job at OnLive, putting my money where my mouth is so to speak. An exciting change. I haven’t had much time recently for this blog, but I’ll be getting back to it shortly.

Posted in Uncategorized

Living root bridges

Posted by jcannell on November 6, 2009

I found this great set of photos of living root bridges, which make for some inspirational scenes for the challenges of dense foliage/geometry in graphics. I look forward to the day these could be digitally voxelized with 3D camera techniques and put into a game.

Posted in Uncategorized

Conversing with the Quick and the Dead

Posted by jcannell on October 30, 2009


CUI: The Conversational User Interface

Recently I was listening to an excellent interview (which is about an hour long) with John Smart of Acceleration Watch, where he specifically was elucidating his ideas on the immediate future evolution of AI, which he encapsulates in what he calls the Conversational Interface. In a nutshell, it’s the idea that the next major development in our increasingly autonomous global internet is the emergence and widespread adoption of natural language processing and conversational agents. This technology is currently at a tipping point, so it’s something to watch as numerous startups are starting to sell software for automated call centers, sales agents, autonomous monitoring agents for utilities, security, and so on. The immediate enabling trends are the emergence of a global liquid market for cheap computing and fairly reliable off the shelf voice to text software that actually works. You probably have called a bank and experienced the simpler initial versions of this, which are essentially voice activated multiple choice menus, but the newer systems on the horizon are a wholly different beast: an effective simulacrum of a human receptionist which can interpret both commands and questions, ask clarifying questions, and remember prior conversations and even users. This is an interesting development in and of itself, but the more startling idea hinted at in Smart’s interview is how natural language interaction will lead to anthropomorphic software and how profoundly this will eventually affect the human-machine symbiosis.

Humans are rather biased judges of intelligence: we have a tendency to attribute human qualities to anything that looks or sounds like us, even if its actions are regulated by simple dumb automata. Aeons of biological evolution have preconditioned us to rapidly identify other intelligent agents in our world, categorize them as potential predators, food, or mates, and take appropriate action. It’s not that we aren’t smart enough to apply more critical and intensive investigations into a system to determine its relative intelligence, it’s that we have super-effective visual and auditory shortcuts which bias us. These shortcuts are most significant in children, and future AI developers will be able to exploit these biases to create agents with emotional attachments. The Milo demo from Microsoft’s Project Natal is a remarkable and eerie glimpse into the near future world of conversational agents and what Smart calls ‘virtual twins’. After watching this video, consider how this kind of technology can evolve once it establishes itself in the living room in the form of video game characters for children. There is a long history of learning through games, and the educational game market is a large, well developed industry. The real potential hinted at in Peter Molyneux’s demo is a disruptive convergence of AI and entertainment which I see as the beginning of the road to the singularity.

Imagine what entrepreneurial game developers with large budgets and the willingness to experiment outside of the traditional genres could do when armed with a full two way audio-visual interface like Project Natal, the local computation of the Xbox 360 and future consoles, and a fiber connection to the up and coming immense computing resources of the cloud (fueled by the convergence of general GPUs and the huge computational demands of the game/entertainment industry moving into the cloud). Most people and even futurists tend to think of Moore’s Law as a smooth and steady exponential progression, but the reality from the perspective of a software developer (and especially a console game developer) is a series of massively disruptive jumps: evolutionary punctuated equilibrium. Towards the end of each console cycle, the state space of possible game ideas, interfaces and simulation technologies reaches a near steady state, a technological tapering off, followed by the disruptive release of new consoles with vastly increased computation, new interfaces, and even new interconnections. The next console cycle is probably not going to start until as late as 2012, but with upcoming developments such as Project Natal and OnLive, we may be entering a new phase already.

The Five Year Old’s Turing Test

Imagine a future ‘game system’ aimed at relatively young children with a Natal-like interface: a full two way communication portal between the real and the virtual. The game system can both see and hear the child, and it can project a virtual window through which the inner agents can be seen and heard. Permanently connected to the cloud through fiber, this system can tap into vast distant computing resources on demand. There is a development point, a critical tipping point, where it will be economically feasible to make a permanent autonomous agent that can interact with children. Some will certainly take the form of an interactive, talking version of a character like Barney, and such semi-intelligent agents will certainly come first. But for the more interesting and challenging development of human-level intelligence, it could actually be easier to make a child-like AI, one that learns and grows with its ‘customer’. Not just a game, but a personalized imaginary friend to play games with, and eventually to grow up with. It will be custom designed (or rather developmentally evolved) for just this role – shaped by economic selection pressure.

The real expense of developing an AI is all the training time, and a human-like AI will need to go through a human-like childhood developmental learning process. The human neocortex begins life essentially devoid of information, with random synaptic connections and a cacophony of electric noise. From this, consciousness slowly develops as the cortical learning algorithm begins to learn patterns through sensory and motor interaction with the world. Indeed, general anesthetics work by introducing noise into the brain that drowns out coherent signalling and thus consciousness. From an information theoretic point of view, it may thus be possible to use less computing power to simulate an early developmental brain – storing and computing only the information above the noise signals. If such a scalable model could be developed, it would allow the first AI generation to begin decades earlier (perhaps even today), and scale up with Moore’s Law as they require more storage and computation.

Once trained up to the mental equivalent level of a five-year old, a personal interactive invisible friend might become a viable ‘product’ well before adult level human AIs come about. Indeed, such a ‘product’ could eventually develop into such an adult AI, if the cortical model scales correctly and the AI is allowed to develop and learn further. Any adult AI will start out as a child; there are no shortcuts. Which raises some interesting points: who would parent these AI children? And inevitably, they are going to ask two fundamental questions which are at the very root of being, identity, and religion: what is death? And am I going to die?

The first human level AI children with artificial neocortices will most likely be born in research labs – both academic and commercial. They will likely be born into virtual bodies. Some will probably be embodied in public virtual realities, such as Second Life, with their researcher/creators acting as parents, and with generally open access to the outside world and curious humans. Others may develop in more closed environments tailored to a later commercialization. For the future human parents of AI mind children, these questions will be just as fundamental and important as they are for biological children. These AI children do not have to ever die, and their parents could answer so truthfully, but their fate will entirely depend on the goals of their creators. For AI children can be copied, so purely from an efficiency perspective, there will be a great pressure to cull the rather unsuccessful children – the slow learners, mentally unstable, or otherwise undesirable – and use their computational resources to duplicate the most successful and healthy candidates. So the truthful answers are probably: death is the permanent loss of consciousness, and you don’t have to die but we may choose to kill you, no promises. If the AI’s creators/parents are ethical and believe any conscious being has the right to life, then they may guarantee their AI’s permanency. But life and death for a virtual being is anything but black and white: an AI can be active permanently or for only an hour a day or for an hour a year – life for them is literally conscious computation and near permanent sleep is a small step above death. I suspect that the popular trend will be to teach AI children that they are all immortal and thus keep them happy.
Once an AI is developed to a certain age, it can then be duplicated as needed for some commercial application. For our virtual Milo example, an initial seed Milo would be selected from a large pool raised in a virtual lab somewhere, with a few of the best examples ‘commercialized’ and duplicated as needed every time a kid out on the web wants a virtual friend for his Xbox 1440. It’s certainly possible that Milo could be designed and selected to be a particularly robust and happy kid. But what happens when Milo and his new human friend start talking and the human child learns that Milo is never going to die because he’s an AI? And more fundamentally, what happens to this particular Milo when the Xbox is off? If he exists only when his human owner wants him to, how will he react when he learns this?
It’s most likely that semi-intelligent (but still highly capable) agents will develop earlier, but as Moore’s Law advances along with our understanding of the human brain, it becomes increasingly likely that someone will tackle and solve the human-like AI problem, launching a long-term project to start raising an AI child. It’s hard to predict when this could happen in earnest. There are already several research projects underway attempting something along these lines, but nobody yet has the immense computational resources to throw at a full brain simulation (except perhaps for the government), nor do we even have a good simulation model yet (although we may be getting close there). It’s also not clear that we’ve found the types of shortcuts needed to start one with dramatically fewer resources, and it doesn’t look like any of the alternative non-biological AI routes are remotely on the path towards producing something as intelligent as a five-year-old. Yet. But it looks like we could see this in a decade.
And when this happens, these important questions of consciousness, identity and fundamental rights (human and sapient) will come into the public spotlight.
I see a clear ethical obligation to extend full rights to all human-level sapients, silicon, biological, or what have you. Furthermore, those raising these first generations of our descendants need to take on the responsibility of ensuring a longer-term symbiosis and our very own survival, for it’s likely that AI will develop ahead of the technologies required for uploading, and thus these new mind children will lead the way into the unknown future of the Singularity.

Posted in Singularity, Technology