How the Singularity Makes us Dumber
David Eubanks
2013-05-29


The particular extrapolation we’ll consider in this article is the effect of ubiquitous communication and control, driven by increasingly fast and smart technology. What effect does that have on our ability to navigate the world intelligently, and to solve the cause-effect problems that arise in daily life?

In On Intelligence, Jeff Hawkins and Sandra Blakeslee argue that minds are prediction machines, constantly comparing what we expect to happen to what our perceptions reveal as fact [3]. I’m going to argue here that the actual problem our brains try to solve is far bigger than our problem-solving capacity, and that we have gotten by only by means of heuristics. Finally, I’ll argue that these customary approximate solutions won’t work in the new environment a technological singularity may produce.

The first part of the argument is easiest to understand in the abstract. Imagine that some intelligent system’s environment is 100 lights that can be on or off. As time goes on, they follow some pattern of change, which we might characterize as a time series of bit strings representing the states of the lights, like {0001101011101…1}, where the string has a hundred binary values. In general, the problem of predicting what the lights will do next is unsolvable because there are an infinite number of possibilities. In other words, it’s possible that the next state is simply unpredictable.

Heuristic 1 (Ignorance). We assume much of the past doesn’t matter. This simplifies the problem to a finite one, and it’s practical because memory is finite. For example, if we think only the last five states are needed to predict the next one, then there are N = 2^500 possible histories to consider when choosing a future state, and M = 2^100 possibilities for that next state. Any possible mapping from past/present to future has to fill in 2^500 x 100 blanks with ones and zeros. There are 2^(2^500 x 100) ways to do this (we can also write it as M^N). This is a super-exponential quantity that can’t practically be written out as a number. According to Wolfram Alpha, it has

98 588 876 050 149 285 588 166 874 964 421 874 878 800 877 950 912 890 525 721 225 982 841 978 651 956 898 798 982 028 988 052 117 717 919 696 747 788 905 718 950 894 025 089 100 188 980 041 885 026 259

decimal digits. So even with a small amount of data (100 bits) and a short memory (five states deep), we can see that the general solution space for predicting the world is incomprehensibly vast. This doesn’t mean we can’t potentially understand the cause-effect relationships in the world, but it does mean that finding them could be practically impossible. Still, limiting the time span under consideration is a step in the right direction. Vast is better than infinite, if not by much.
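
As a sanity check on that figure, here is a minimal Python sketch (purely illustrative, assuming the 100-light, five-state-memory setup above) that computes the digit count without ever writing out the giant number itself, using the fact that 2^k has floor(k * log10(2)) + 1 decimal digits:

    from decimal import Decimal, getcontext

    getcontext().prec = 200                  # plenty of precision for a ~152-digit result
    k = 100 * 2**500                         # bits needed to specify one past-to-future mapping
    digit_count = int(k * Decimal(2).log10()) + 1
    print(digit_count)                       # a 152-digit number beginning 98588876050...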

This, by the way, is one reason why Darwin’s Origin of Species is a monumental triumph of human reason—it penetrates the fog of the near past to extract causes from ancient history.

Heuristic 2 (Motivation). Fortunately, we can be indifferent about most of what goes on. Instead of caring about all 100 lights, we may only be concerned with lights one and two (pleasure and pain, perhaps). So instead of predicting all 100, we only have to find the causal network that affects the first two. This personalizes the intelligence problem. Now instead of M = 2^100, we have M = 2 x 2 = 4 (that’s how many possible combinations of the pleasure/pain lights there are). This reduces the space of possible causal nets to 4^(2^500), but that number still has over 10^150 digits just to write it down. We’ve cut the exponent by a factor of fifty, shrinking the count by far more than a googol, and still haven’t made a dent.
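
The same digit-count trick (same illustrative assumptions as before) shows how little ground this gains:

    from decimal import Decimal, getcontext

    getcontext().prec = 200
    log10_2 = Decimal(2).log10()
    digits_h1 = int(100 * 2**500 * log10_2) + 1   # digits of 2^(100 * 2^500), Heuristic 1 alone
    digits_h2 = int(2 * 2**500 * log10_2) + 1     # digits of 4^(2^500), after adding Heuristic 2
    print(digits_h1, digits_h2)                   # roughly 9.85e151 vs 1.97e150 digits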

This paucity of motivational interests compared to the set of all observables is still important, however, because it combines with the other heuristics to allow reasonably-sized problems. Intuitively, this means that naked general intelligence is probably too big a problem to be solved. That is, hooking up a general problem solver like a PAC-learning implementation [4] to a particular objective function is quite sensible, and this is how we might get a program that’s really good at playing chess, or one that can mine asteroids for minerals. This focuses the problem of creating artificial intelligence that rivals human capacity on the question “what should this machine care about?” Overlaying “what is” with “what is desired” creates interesting effects beyond the scope of the present article. See [5] for one version of the question, and the series at IEET.org starting with [6] for another.

Heuristic 3 (Localization). It is a stroke of luck (or perhaps the anthropic cosmological principle [7]) that physics mostly acts locally, and when it doesn’t, it has the good grace to diminish in intensity as it spreads through space and time, as with an inverse-square law. Three-dimensional space sorts things out nicely so they don’t all overlap. This means that we can generally get away with caring only about the small bit of space in our own neighborhood at any one moment. In terms of the analogy we started with, maybe there’s only one other light close to the ones we care about, in which case the cause-effect problem becomes manageable.

As an example, a virus might be abstracted as having a single bit of intelligence wired into it in the form of a match for a receptor site on a cell. If it bumps into such a thing, it can inject its genetic material to commandeer the cell’s machinery and continue the cycle of viral life. In the scheme above, it only “cares” about that one thing, and the only part of the environment that matters is whatever it is bumping into. It is a very limited intelligence, but also a very localized one, so the problem is solvable.

If there are K observables within range to affect a particular motivator, there are still many ways they can interact; the problem is still exponential, but of a much smaller order than before. However, we can simplify further by means of an incremental approach.

Heuristic 4 (Identification). By properly naming observables, we can often find predictive relationships between them and the motivators we care about in an incremental fashion: assume first that the causal relationship is one-to-one (a single observable affects a motivator). If this doesn’t work, we can try two observables at a time, which makes the problem harder. The complexity grows as we consider more combinations of observables as causes, so we should be clever in how we identify observables, to better find one-to-one relationships.
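
Here is a minimal sketch of that incremental strategy (the function name, data layout, and example values are hypothetical): try single observables first, and widen the search to pairs and triples only if no single observable predicts the motivator consistently.

    from itertools import combinations

    def find_causes(history, max_size=3):
        """history: list of (observables, motivator) pairs, where observables is a
        tuple of bits and motivator is a single bit. Returns the smallest set of
        observable indices whose values determine the motivator, or None."""
        n = len(history[0][0])
        for size in range(1, max_size + 1):          # cheapest hypotheses first
            for subset in combinations(range(n), size):
                seen = {}
                consistent = True
                for obs, motivator in history:
                    key = tuple(obs[i] for i in subset)
                    if seen.setdefault(key, motivator) != motivator:
                        consistent = False           # same inputs, different outcomes
                        break
                if consistent:
                    return subset
        return None

    # Toy example: the third observable (index 2) by itself predicts the motivator.
    history = [((0, 1, 0, 1), 0), ((1, 0, 0, 0), 0), ((0, 0, 1, 1), 1), ((1, 1, 1, 0), 1)]
    print(find_causes(history))                      # -> (2,)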

In Theory of Self-Reproducing Automata, Arthur W. Burks summarizes John von Neumann’s description of the intuitive effect of identification and localization, in the context of debugging computers [8]:




Von Neumann then explained why computing machines are designed to stop when a single error occurs. The fault must be located and corrected by the engineer, and it is very difficult for him to localize a fault if there are several of them. If there is only one fault, he can often divide the machine into two parts and determine which part made the error. The process can be repeated until he isolates the fault. This general method becomes much more complicated if there are two or three faults and breaks down when there are many faults.




If there is a single cause, then localizing it can often be done using a binary search, as described in the quote. Ideally this takes about log(N) tests, and even checking each part individually is still only N checks (one for each component). But the number of pairs of parts grows like N^2/2, the number of ways to choose three parts at a time like N^3/6, and in the most general case the problem is exponential (the number of subsets of parts that can fail is 2^N).
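
A minimal sketch of the single-fault binary search from the quote (the part names and the fault_in test are hypothetical stand-ins for whatever diagnostic the engineer actually has):

    def locate_single_fault(parts, fault_in):
        """parts: list of part identifiers; fault_in(block) -> True if the one fault
        lies somewhere in that block. Finds the faulty part in about log2(N) tests,
        assuming exactly one fault exists."""
        lo, hi = 0, len(parts)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if fault_in(parts[lo:mid]):     # each test halves the search space
                hi = mid
            else:
                lo = mid
        return parts[lo]

    # Toy example with 16 parts and one bad unit.
    parts = ["unit-%d" % i for i in range(16)]
    print(locate_single_fault(parts, lambda block: "unit-11" in block))   # -> unit-11 in 4 tests

With two or more simultaneous faults, the halving test above is no longer reliable, which is the breakdown von Neumann describes; the combinatorial counts in the previous paragraph measure how quickly the search degrades.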

The history of science includes heroic feats of identification and localization, where someone cared about something very specific and reduced the scope of factors to consider to the absolute minimum in order to find causal relationships. This “reductionism” through identification is the art of limiting problem complexity so that you have a fighting chance of finding the solution.

It is the localization assumption that the technological singularity is breaking. Contrast the following two stories:




The Shaman describes why the locusts have come to destroy the harvest. A distant wizard, he says, who lives further away than the sun sets, has cast a powerful spell on our village. This draws the insects from all over, and they come straight here and eat our crops.

A small company’s storefront website stops functioning. A call to the internet provider obtains the explanation that a distributed denial of service (DDoS) attack has been launched from a botnet, which has overwhelmed the network hardware.




We would dismiss the first story as pre-modern mysticism. The wizard’s metaphysical non-local influence does not seem to exist in the real world. The second story happens all the time. The difference is that the DDoS metaphysics works—we created it. The transmission of information is a whole new kind of thing, because it doesn’t have to diminish in potency with range.

The magic has been here all along, as Darwin correctly guessed. But the slow transmission of genes is very different from the speed and precision of modern packet-switched communication. Darwinism breaks temporal locality by invoking ancient causes, and the new metaphysics breaks spatial locality. Instances of DDoS attacks, cyberbullying, and someone hacking your bank account from halfway around the world are only the beginning. Imagine a time when every man-made object is networked, including brain implants that allow us to communicate with and perceive the world in new ways. This is essentially non-physical, in that there’s no inverse-square law to dilute the force of a message. The localization heuristic isn’t going to work anymore, and that puts individuals (and organizations) in a complex problem space they are not evolved for.

It is interesting to consider the evolution of military capabilities as a race between localization (which is defensive) and de-localization (which is offensive). The earliest case of de-localization may be a thrown rock, which breaks the assumption that threats only come from the immediate environment, making the battlefield more complex. This can be countered with a shield, which physically reinforces locality. Or it could be countered with scouts who watch for rock-throwers, which expands the scope of locality by increasing the range of perception (at the cost of making the problem more complex: compare finding a long-range sniper with binoculars to looking for a guy with a club right next to you).

In the twentieth century, this arms race between de-localization and localization evolved to ICBMs and stealth technology for the former, and surveillance satellites, radar, and sonar for the latter. Together these made warfare much more computationally complex than before by expanding the scope of localization.

Viewed through this lens, the 9/11 attacks in the United States were primarily informational, consisting essentially of a change to aircraft navigation. One of the responses was to localize the pilots by reinforcing cockpit doors, better protecting access to the navigational controls.

Nowadays ‘local’ is increasingly indistinguishable from ‘global’. For a time one could rely on the US area code on a caller ID to predict the location of the caller. Not anymore. If the rate of innovation keeps up its asymptotic climb, we will have de-localized cause and effect so thoroughly that we may as well call it magic. Magical warfare won’t be just an add-on to physical war (and it’s not just cyberwar either); it’s instant action at a distance. And because the complexity increases exponentially with the size of the locality, at some point the solution space will be too big for effective causal analysis.

Imagine the kid next door who spends all his time on the 3-D printer putting together his own protein-folder so he can download the latest mySARS virus signature and print it out to infect a school rival. [9] That scenario is at least imaginable. The point is that when the magic is real, most of the possibilities will be literally unimaginable until they happen.

Note: I’ve been loose with the difference between prediction and causality to keep this short. For a definitive treatment of the matter, see [10].

References:

[1] Turing’s Cathedral, by George Dyson. This is a wonderful book. The quote is from page 299 of the first print edition, and itself refers to Stanislaw Ulam’s article “John von Neumann, 1903-1957” in Science, Computers, and People, 1986, pp. 169-214.

[2] Cybernetics or control and communication in the animal and the machine. 2nd. Edition by Norbert Wiener. Cambridge, Mass: MIT Press, 1961. The quote is from the introduction, pages 27-29. You can also find it in “Apology for a Cybernetic Future” at Transhumanity.net.

[3] On Intelligence by Jeff Hawkins and Sandra Blakeslee.

[4] An Introduction to Kolmogorov Complexity and Its Applications, by Ming Li and Paul M.B. Vitányi, Chapter 5.

[5] “Nominal Reality and the Subversion of Intelligence,” by David Eubanks, presented at the Society for Comparative Literature and Arts 2012 annual meeting. Available at https://www.dropbox.com/s/fv3e5t6xgpe4m8y/Lit%20Talk.docx

[6] “Is Intelligence Self-Limiting?” by David Eubanks, IEET.org, available at http://ieet.org/index.php/IEET/more/eubanks20120310

[7] The Anthropic Cosmological Principle, by John D. Barrow and Frank J. Tipler, with a foreword by John A. Wheeler.

[8] Theory of Self-Reproducing Automata, by John von Neumann, edited and compiled by Arthur W. Burks.

[9] This is a plot device from my series of novels at lifeartificial.com. The localization counter-move is to externalize the immune system, outboarding detection and interdiction of biological threats.

[10] Causality by Judea Pearl