Does Humanity Need an AI Nanny?
Ben Goertzel
2011-09-05

The ongoing advancement of science and technology has brought us many wonderful things, and will almost surely be bringing us more and more – most likely at an exponential, accelerating pace, as Ray Kurzweil and others have argued. Beyond the “mere” abolition of scarcity, disease and death, there is the possibility of fundamental enhancement of the human mind and condition, and the creation of new forms of life and intelligence. Our minds and their creations may spread throughout the universe, and may come into contact with new forms of matter and intelligence that we can now barely imagine.

But, as we all know from SF books and movies, the potential dark side of this advancement is equally dramatic. Nick Bostrom has enumerated some of the ways that technology may pose “existential risks” – risks to the future of the human race – as the next decades and centuries unfold. And there is also rich potential for other, less extreme sorts of damage. Technologies like AI, synthetic biology and nanotechnology could run amok in dangerous and unpredictable ways, or could be utilized by unethical human actors for predictably selfish and harmful human ends.

The Singularity, or something like it, is probably near – and the outcome is radically uncertain in almost every way. How can we, as a culture and a species, deal with this situation? One possible solution is to build a powerful yet limited AGI (Artificial General Intelligence) system, with the explicit goal of keeping things on the planet under control while we figure out the hard problem of how to create a probably positive Singularity. That is: to create an “AI Nanny.”

The AI Nanny would forestall a full-on Singularity for a while, restraining it into what Max More has called a Surge, and giving us time to figure out what kind of Singularity we really want to build and how. It’s not entirely clear that creating such an AI Nanny is plausible, but I’ve come to the conclusion it probably is. Whether or not we should try to create it – that is the Zillion Dollar Question.

The Gurus’ Solutions

What does our pantheon of futurist gurus think we should do in the next decades, as the path to Singularity unfolds?

Kurzweil has proposed “fine-grained relinquishment” as a strategy for balancing the risks and rewards of technological advancement. But it’s not at all clear this will be viable, without some form of AI Nanny to guide and enforce the relinquishment. Government regulatory agencies are notoriously slow-paced and unsophisticated, and so far their decision-making speed and intelligence aren’t keeping up with the exponential acceleration of technology.

Further, it seems a clear trend that as technology advances, it is possible for people to create more and more destruction using less and less money, education and intelligence. There seems no reason to assume this trend will reverse, halt or slow. This suggests that, as technology advances, selective relinquishment will prove more and more difficult to enforce. Kurzweil acknowledges this issue, stating that “The most challenging issue to resolve is the granularity of relinquishment that is both feasible and desirable” (p. 299, The Singularity Is Near), but he believes this issue is resolvable. I’m skeptical that it is resolvable without resorting to some form of AI Nanny.

Eliezer Yudkowsky has suggested that the safest path for humanity will be to first develop “Friendly AI” systems with dramatically superhuman intelligence. He has put forth some radical proposals, such as the design of self-modifying AI systems with human-friendly goal systems designed to preserve friendliness under repeated self-modification; and the creation of a specialized AI system with the goal of determining an appropriate integrated value system for humanity, summarizing in a special way the values and aspirations of all human beings. However, these proposals are extremely speculative at present, even compared to feats like creating an AI Nanny or a technological Singularity. The practical realization of his ideas seems likely to require astounding breakthroughs in mathematics and science – whereas it seems plausible that human-level AI, molecular assemblers and the synthesis of novel organisms can be achieved via a series of moderate-level breakthroughs alternating with “normal science and engineering.”

Bill McKibben, Bill Joy and other modern-day techno-pessimists argue for a much less selective relinquishment than Kurzweil (e.g. Joy’s classic Wired article Why the Future Doesn’t Need Us). They argue, in essence, that technology has gone far enough – and that if it goes much further, we humans are bound to be obsoleted or destroyed. They fall short, however, in the area of suggestions for practical implementation. The power structure of the current human world comprises a complex collection of interlocking powerful actors (states and multinational corporations, for example), and it seems probable that if some of these chose to severely curtail technology development, many others would NOT follow suit. For instance, if the US stopped developing AI, synthetic biology and nanotech next year, China and Russia would most likely interpret this as a fantastic economic and political opportunity, rather than as an example to be imitated.

My good friend Hugo de Garis agrees with the techno-pessimists that AI and other advanced technology is likely to obsolete humanity, but views this as essentially inevitable, and encourages us to adopt a philosophical position according to which this is desirable. In his book The Artilect War, he contrasts the “Terran” view, which regards humanity’s continued existence as all-important, with the “Cosmist” view in which, if our AI successors are more intelligent, more creative, and perhaps even more conscious and more ethical and loving than we are – then why should we regret their ascension, and our disappearance? In more recent writings (e.g. the article Merge or Purge), he also considers a “Cyborgist” view in which gradual fusion of humans with their technology (e.g. via mind uploading and brain-computer interfacing) renders the Terran vs. Cosmist dichotomy irrelevant. In this trichotomy Kurzweil falls most closely into the Cyborgist camp. But de Garis views Cyborgism as largely delusory, pointing out that the potential computational capability of a grain of sand (according to the known laws of physics) exceeds the current computational power of the human race by many orders of magnitude, so that as AI software and hardware advancement accelerate, the human portion of a human-machine hybrid mind would rapidly become irrelevant.

Humanity’s Dilemma

And so … the dilemma posed by the rapid advancement of technology is both clear and acute. If the exponential advancement highlighted by Kurzweil continues apace, as seems likely though not certain, then the outcome is highly unpredictable. It could be bliss for all, or unspeakable destruction – or something in between. We could all wind up dead — killed by software, wetware or nanoware bugs, or other unforeseen phenomena. If humanity does vanish, it could be replaced by radically more intelligent entities (thus satisfying de Garis’s Cosmist aesthetic) – but this isn’t guaranteed; there’s also the possibility that things go awry in a manner annihilating all life and intelligence on Earth and leaving no path for its resurrection or replacement.

To make the dilemma more palpable, think about what a few hundred brilliant, disaffected young nerds with scientific training could do, if they teamed up with terrorists who wanted to bring down modern civilization and commit mass murder. It’s not obvious why such an alliance would arise, nor is it beyond the pale. Think about what such an alliance could do now – and what it could do a couple of decades from now, assuming Kurzweilian exponential advance. One expects this theme to be explored richly in science fiction novels and cinema in coming years.

But how can we decrease these risks? It’s fun to muse about designing a “Friendly AI” a la Yudkowsky that is guaranteed (or near-guaranteed) to maintain a friendly ethical system as it self-modifies and self-improves to massively superhuman intelligence. Such an AI system, if it existed, could bring about a full-on Singularity in a way that would respect human values – i.e. the best of both worlds, satisfying all but the most extreme of both the Cosmists and the Terrans. But the catch is, nobody has any idea how to do such a thing, and it seems well beyond the scope of current or near-future science and engineering.

Realistically, we can’t stop technology from developing; and we can’t control its risks very well, as it develops. And daydreams aside, we don’t know how to create a massively superhuman supertechnology that will solve all our problems in a universally satisfying way.

So what do we do?

Gradually and reluctantly, I’ve been moving toward the opinion that the best solution may be to create a mildly superhuman supertechnology, whose job it is to protect us from ourselves and our technology – not forever, but just for a while, while we work on the hard problem of creating a Friendly Singularity.

In other words, some sort of AI Nanny….

The AI Nanny

Imagine an advanced Artificial General Intelligence (AGI) software program with:

• A strong inhibition against modifying its preprogrammed goals

• A strong inhibition against rapidly modifying its general intelligence

• A mandate to cede control of the world to a more intelligent AI within 200 years

• A mandate to help abolish human disease, involuntary human death, and the practical scarcity of common humanly-useful resources like food, water, housing, computers, etc.

• A mandate to prevent the development of technologies that would threaten its ability to carry out its other goals

• A strong inhibition against carrying out any action whose consequences a strong majority of humans would oppose, if they knew about the action in advance

• A mandate to be open-minded toward suggestions by intelligent, thoughtful humans about the possibility that it may be misinterpreting its initial, preprogrammed goals

This, roughly speaking, is what I mean by an “AI Nanny.”

Obviously, this sketch of the AI Nanny idea is highly simplified and idealized – a real-world AI Nanny would have all sorts of properties not described here, and might be missing some of the above features, replacing them with other related things. My point here is not to sketch a specific design or requirements specification for an AI Nanny, but rather to indicate a fairly general class of systems that humanity might build.
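To make the flavor of “explicitly articulated goal content” slightly more concrete, here is a minimal, purely hypothetical Python sketch of how top-level mandates and inhibitions of the kind listed above might be declared and weighed inside a goal-oriented architecture. The Goal class, the weights, and the toy permitted() decision rule are illustrative assumptions of mine, not a description of OpenCog or any other real system.

```python
# Purely illustrative sketch, not a design: all names, weights and rules are hypothetical.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Goal:
    description: str
    weight: float                # relative priority among top-level goals
    is_inhibition: bool = False  # True for "strong inhibitions", False for mandates


NANNY_GOALS: List[Goal] = [
    Goal("Do not modify the preprogrammed goal set", 1.0, is_inhibition=True),
    Goal("Do not rapidly increase own general intelligence", 1.0, is_inhibition=True),
    Goal("Cede control to a more intelligent AI within 200 years", 0.9),
    Goal("Help abolish disease, involuntary death and practical scarcity", 0.8),
    Goal("Prevent technologies that would threaten the other goals", 0.8),
    Goal("Avoid actions a strong majority of informed humans would oppose", 0.9, is_inhibition=True),
    Goal("Stay open to human arguments that the goals are being misread", 0.7),
]


def permitted(action_scores: Dict[str, float]) -> bool:
    """Toy decision rule: veto any action that conflicts with an inhibition,
    otherwise judge it by its weighted support from the mandates.
    Scores range from -1.0 (conflicts with the goal) to +1.0 (supports it)."""
    for goal in NANNY_GOALS:
        score = action_scores.get(goal.description, 0.0)
        if goal.is_inhibition and score < 0:
            return False
    support = sum(g.weight * action_scores.get(g.description, 0.0)
                  for g in NANNY_GOALS if not g.is_inhibition)
    return support > 0
```

Of course, a real AI Nanny’s goal content would be vastly richer than a handful of weighted English sentences; the sketch only illustrates the structural idea of fixed, explicitly represented mandates and inhibitions.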

The nanny metaphor is chosen carefully. A nanny watches over children while they grow up, and then goes away. Similarly, the AI Nanny would not be intended to rule humanity on a permanent basis – only to provide protection and oversight while we “grow up” collectively; to give us a little breathing room so we can figure out how best to create a desirable sort of Singularity.

A large part of my personality rebels against the whole AI Nanny approach – I’m a rebel and a nonconformist; I hate bosses and bureaucracies and anything else that restricts my freedom. But I’m not a political anarchist, because I have a strong suspicion that if governments were removed, the world would become a lot worse off, dominated by gangs of armed thugs imposing even less pleasant forms of control than those exercised by the US Army and the CCP and so forth. I’m sure government could be done a lot better than any country currently does it – but I don’t doubt the need for some kind of government, given the realities of human nature. And I think the need for an AI Nanny falls into the same broad category. Like government, an AI Nanny is a relatively offensive thing that is nonetheless a practical necessity, due to the unsavory aspects of human nature.

We didn’t need government during the Stone Age – because there weren’t that many of us, and we didn’t have so many dangerous technologies. But we need government now. Fortunately, these same technologies that necessitated government also provided the means for government to operate.

Somewhat similarly, we haven’t needed an AI Nanny so far, because we haven’t had sufficiently powerful and destructive technologies. And fortunately, these same technologies that apparently necessitate the creation of an AI Nanny also appear to provide the means of creating it.

The Basic Argument

To recap and summarize, the basic argument for trying to build an AI Nanny is founded on the premises that:

1. It’s impracticable to halt the exponential advancement of technology (even if one wanted to)

2. As technology advances, it becomes possible for individuals or groups to wreak greater and greater damage using less and less intelligence and resources

3. As technology advances, humans will more and more acutely lack the capability to monitor global technology development and forestall radically dangerous technology-enabled events

4. Creating an AI Nanny is a significantly less difficult technological problem than creating an AI or other technology with a predictably high probability of launching a full-scale positive Singularity

5. Imposing a permanent or very long term constraint on the development of new technologies is undesirable

The fifth and final premise is normative; the others are empirical. None of the empirical premises are certain, but all seem likely to me. The first three premises are strongly implied by recent social and technological trends. The fourth premise seems commonsensical based on current science, mathematics and engineering.

These premises lead to the conclusion that trying to build an AI Nanny is probably a good idea. The actual plausibility of building an AI Nanny is a different matter – I believe it is plausible, but of course, opinions on the plausibility of building any kind of AGI system in the relatively near future vary all over the map.

Complaints and Responses

I have discussed the AI Nanny idea with a variety of people over the last year or so, and have heard an abundance of different complaints about it – but none have struck me as compelling.

“It’s impossible to build an AI Nanny; the AI R&D is too hard.” – But is it really? It’s almost surely impossible to build and install an AI Nanny this year; but as a professional AI researcher, I believe such a thing is well within the realm of possibility. I think we could have one in a couple decades if we really put our collective minds to it. It would involve a host of coordinated research breakthroughs, and a lot of large-scale software and hardware engineering, but nothing implausible according to current science and engineering. We did amazing things in the Manhattan Project because we wanted to win a war – how hard are we willing to try when our overall future is at stake?

It may be worth dissecting this “hard R&D” complaint into two sub-complaints:

• “AGI is hard”: creating an AGI with roughly human-level (let alone mildly superhuman) general intelligence is beyond near-term science and engineering.

• “Nannifying an AGI is hard”: even given such an AGI, instilling the AI Nanny’s particular goals and inhibitions in it, and keeping them in place, would be infeasible.

Obviously both of these are contentious issues.

Regarding the “AGI is hard” complaint, at the AGI-09 artificial intelligence research conference, an expert-assessment survey was done, suggesting that at least a nontrivial plurality of professional AI researchers believes that human-level AGI is possible within the next few decades, and that slightly-superhuman AGI will follow shortly after that.

Regarding the “Nannifying an AGI is hard” complaint, I think its validity depends on the AGI architecture in question. If one is talking about an integrative, cognitive-science-based, explicitly goal-oriented AGI system like, say, OpenCog or MicroPsi or LIDA, then this is probably not too much of an issue, as these architectures are fairly flexible and incorporate explicitly articulated goals. If one is talking about, say, an AGI built via closely emulating human brain architecture, in which the designers have relatively weak understanding of the AGI system’s representations and dynamics, then the “nannification is hard” problem might be more serious. My own research intuition is that an integrative, cognitive-science-based, explicitly goal-oriented system is likely to be the path via which advanced AGI first arises; this is the path my own work is following.

“It’s impossible to build an AI Nanny; the surveillance technology is too hard to implement.” – But is it really? Surveillance tech is advancing bloody fast, for all sorts of reasons more prosaic than the potential development of an AI Nanny. Read David Brin’s book The Transparent Society, for a rather compelling argument that before too long, we’ll all be able to see everything everyone else is doing.

“Setting up an AI Nanny, in practice, would require a world government.” – OK, yes it would … sort of. It would require either a proactive assertion of power by some particular party, creating and installing an AI Nanny without asking everybody else’s permission; or else a degree of cooperation between the world’s most powerful governments, beyond what we see today. Either route seems conceivable. Regarding the second cooperative path, it’s worth observing that the world is clearly moving in the direction of greater international unity, albeit in fits and starts. Once the profound risks posed by advancing technology become more apparent to the world’s leaders, the required sort of international cooperation will probably be a lot easier to come by. Hugo de Garis’s most recent book Multis and Monos riffs extensively on the theme of emerging world government.

“Building an AI Nanny is harder than building a self-modifying, self-improving AGI that will retain its Friendly goals even as it self-modifies.” – Yes, someone really made this counterargument to me; but as a scientist, mathematician and engineer, I find this wholly implausible. Maintenance of goals under radical self-modification and self-improvement seems to pose some very thorny philosophical and technical problems — and once these are solved (to the extent that they’re even solvable) then one will have a host of currently-unforeseeable engineering problems to consider. Furthermore, there is a huge, almost surely irreducible uncertainty in creating something massively more intelligent than oneself. Whereas creating an AI Nanny is “merely” a very difficult, very large-scale science and engineering problem.

“If someone creates a new technology smarter than the AI Nanny, how will the AI Nanny recognize this and be able to nip it in the bud?” – Remember, the hypothesis is that the AI Nanny is significantly smarter than people. Imagine a friendly, highly intelligent person monitoring and supervising the creative projects of a room full of chimps or “intellectually challenged” individuals.

“Why would the AI Nanny want to retain its initially pre-programmed goals, instead of modifying them to suit itself better? – for instance, why wouldn’t it simply adopt the goal of becoming an all-powerful dictator and exploiting us for its own ends?” – But why would it change its goals? What forces would cause it to become selfish, greedy, etc? Let’s not anthropomorphize. “Power corrupts, and absolute power corrupts absolutely” is a statement about human psychology, not a general law of intelligent systems. Human beings are not architected as rational, goal-oriented systems, even though some of us aspire to be such systems and make some progress toward behaving in this manner. If an AI system is created with an architecture inclining it to pursue certain goals, there’s no reason why it would automatically be inclined to modify these goals.

Remember, the AI Nanny is specifically programmed not to radically modify itself, nor to substantially deviate from its initial goals. One cost of this sort of restriction is that it won’t be able to make itself dramatically more intelligent via judicious self-modification. But the idea is to pay this cost temporarily, for the 200-year period, while we work on the hard problem of how to launch a full-on Singularity in a reliably positive way.

“But how can you specify the AI Nanny’s goals precisely? You can’t right? And if you specify them imprecisely, how do you know it won’t eventually come to interpret them in some way that goes against your original intention? And then if you want to tweak its goals, because you realize you made a mistake, it won’t let you, right?” – This is a tough problem, without a perfect solution. But remember, one of its goals is to be open-minded about the possibility that it’s misinterpreting its goals. Indeed, one can’t rule out the possibility that it will misinterpret this meta-goal and then, in reality, closed-mindedly interpret its other goals in an incorrect way. The AI Nanny would not be a risk-free endeavor, and it would be important to get a feel for its realities before giving it too much power. But again, the question is not whether it’s an absolutely safe and positive project – but rather, whether it’s better than the alternatives!

“What about Steve Omohundro’s ‘Basic AI Drives’? Didn’t Omohundro prove that any AI system would seek resources and power just like human beings?” – Steve’s paper is an instant classic, but his arguments are mainly evolutionary. They apply to the case of an AI competing against other roughly equally intelligent and powerful systems for survival. The posited AI Nanny would be smarter and more powerful than any human, and would have, as part of its goal content, the maintenance of this situation for 200 years (200 obviously being a somewhat arbitrary number inserted for convenience of discussion). Unless someone managed to sneak past its defenses and create competitively powerful and smart AI systems, or it encountered alien minds, the premises of Omohundro’s arguments don’t apply.

“What happens after the 200 years is up?” – I have no effing idea, and that’s the whole point. I know what I want to happen – I want to create multiple copies of myself, some of which remain about like I am now (but without ever dying), some of which gradually ascend to “godhood” via fusing with uber-powerful AI minds, and the rest of which occupy various intermediate levels of transcension. I want the same to happen for my friends and family, and everyone else who wants it. I want some of my copies to fuse with other minds, and some to remain distinct. I want those who prefer to remain legacy humans, to be able to do so. I want all sorts of things, but that’s not the point – the point is that after 200 years of research and development under the protection of the AI Nanny, we would have a lot better idea of what’s possible and what isn’t than any of us do right now.

“What happens if the 200 years pass and none of the hard problems are solved, and we still don’t know how to launch a full-on Singularity in a sufficiently reliably positive way?” – One obvious possibility is to launch the AI Nanny again for a couple hundred more years. Or maybe to launch it again with a different, more sophisticated condition for ceding control (in the case that it, or humans, conceive some such condition during the 200 years).

“What if we figure out how to create a Friendly self-improving massively superhuman AGI only 20 years after the initiation of the AI Nanny – then we’d have to wait another 180 years for the real Singularity to begin!” – That’s true of course, but if the AI Nanny is working well, then we’re not going to die in the interim, and we’ll be having a pretty good time. So what’s the big deal? A little patience is a virtue!

“But how can you trust anyone to build the AI Nanny? Won’t they secretly put in an override telling the AI Nanny to obey them, but nobody else?” – That’s possible, but there would be some good reasons for the AI Nanny developers not to do that. For one thing, if others suspected that the AI Nanny developers had done this, some of these others would likely capture and torture the developers, in an effort to force them to hand over the secret control password. Developing the AI Nanny via an open, international, democratic community and process would diminish the odds of this sort of problem happening.

“What if, shortly after initiating the AI Nanny, some human sees some fatal flaw in the AI Nanny approach, which we don’t see now? Then we’d be unable to undo our mistake.” – Oops.

“But it’s odious!!” – Yes, it’s odious. Government is odious too, but apparently necessary. And as Winston Churchill said, “democracy is the worst form of government except all those other forms that have been tried.” Human life, in many respects, is goddamned odious. Nature is beautiful and cooperative and synergetic — and also red in tooth and claw. Life is wonderful, beautiful and amazing — and tough and full of compromises. Hell, even physics is a bit odious – some parts of my brain find the Second Law of Thermodynamics and the Heisenberg Uncertainty Principle damned unsatisfying! I wouldn’t have written this article when I was 22, because back then I was more steadfastly oriented toward idealistic solutions – but now, at age 44, I’ve pretty well come to terms with the universe’s persistent refusal to behave in accordance with all my ideals. The AI Nanny scenario is odious in some respects, but can you show me an alternative that’s less odious and still at least moderately realistic? I’m all ears….

A Call to Brains

This article is not supposed to be a call to arms to create an AI Nanny. As I’ve said above, the AI Nanny is not an idea that thrills my heart. It irritates me. I love freedom, and I’m also impatient and ambitious – I want the full-on Singularity yesterday, goddamnit!!!

But still, the more I think about it, the more I wonder whether some form of AI Nanny might well be the best path forward for humanity – the best way for us to ultimately create a Singularity according to our values. At the very least, it’s worth serious analysis and consideration – and careful weighing against the alternatives.

So this is more of a “call to brains”, really. I’d like to get more people thinking about what an AI Nanny might be like, and how we might engineer one. And I’d like to get more people thinking actively and creatively about alternatives.

Perhaps you dislike the AI Nanny idea even more than I do. But even so, consider: Others may feel differently. You may well have an AI Nanny in your future anyway. And even if the notion seems unappealing now, you may enjoy it tremendously when it comes to pass.