IEET Fellows and Affiliate Scholars Receive AI Safety Grants
Nick Bostrom
Aug 27, 2015

IEET co-founder Nick Bostrom, IEET Fellow Wendell Wallach and Affiliate Scholar Seth Baum are Principal Investigators on projects funded by Elon Musk and the Open Philanthropy Project and administered by the Future of Life Institute.

Nick Bostrom and the Future of Humanity Institute have received $1.5 million to set up a Strategic Research Center for Artificial Intelligence Policy.

Project Summary: We propose the creation of a joint Oxford-Cambridge research center, which will develop policies to be enacted by governments, industry leaders, and others in order to minimize risks and maximize benefit from artificial intelligence (AI) development in the longer term. The center will focus explicitly on the long-term impacts of AI, the strategic implications of powerful AI systems as they come to exceed human capabilities in most domains of interest, and the policy responses that could best be used to mitigate the potential risks of this technology.

There are reasons to believe that unregulated and unconstrained development could incur significant dangers, both from “bad actors” like irresponsible governments, and from the unprecedented capability of the technology itself. For past high-impact technologies (e.g. nuclear fission), policy has often followed implementation, giving rise to catastrophic risks. It is important to avoid this with superintelligence: safety strategies, which may require decades to implement, must be developed before broadly superhuman, general-purpose AI becomes feasible. This center represents a step change in technology policy: a comprehensive initiative to formulate, analyze, and test policy and regulatory approaches for a transformative technology in advance of its creation.

IEET Fellow Wendell Wallach has received $180,000 to work on “Control and Responsible Innovation in the Development of Autonomous Machines”.

Project Summary: Driverless cars, service robots, surveillance drones, computer networks collecting data, and autonomous weapons are just a few examples of increasingly intelligent technologies scientists are developing. As they progress, researchers face a series of questions about whether these machines can be designed and engineered to take morally significant actions previously reserved for human actors. Can they ensure that artificially intelligent systems will always be demonstrably beneficial, safe, controllable, and sensitive to human values? Many individuals and groups have begun tackling the various subprojects entailed in this challenge. They are, however, often unaware of efforts in complementary fields. Thus they lose opportunities for creative collaboration, miss gaps in their own research, and reproduce work being performed by potential colleagues. The Hastings Center proposes to convene a series of three solution-directed workshops with national and international experts in the various pertinent fields. Together they will develop collaborative strategies and research projects, and forge an outline for a comprehensive plan to ensure autonomous systems will be demonstrably beneficial, and that this innovative research progresses in a responsible manner. The results of the workshops will be conveyed through a special report, a dedicated edition of a scholarly journal, and two public symposia.

Technical Abstract: The vast array of challenges entailed in designing, engineering, and implementing demonstrably beneficial, safe and controllable AI systems are slowly being addressed by scholars working on distinct research trajectories across many disciplines. They are often unaware of efforts in complementary fields, thus losing opportunities for creative synergies, missing gaps in their own research, and reproducing the work of potential colleagues. The Hastings Center proposes to convene a series of three solution-directed workshops with national and international experts in the varied fields. Together they will address trans-disciplinary questions, develop collaborative strategies and research projects, and forge an outline for a comprehensive plan encompassing the many elements of ensuring autonomous systems will be demonstrably beneficial, and that this innovative research progresses in a responsible manner. The workshops’ research and policy agenda will be published as a Special Report of the journal Hastings Center Report and in short form in a science or engineering journal. Findings will also be presented through two public symposia, one of which will be webcast and available on demand. We anticipate significant progress given the high caliber of the people who are excited by this project and have already committed to join our workshops.


IEET Affiliate Scholar Seth Baum has received $100,000 as Co-Principal Investigator with Tony Barrett to work on an “Evaluation of Safe Development Pathways for Artificial Superintelligence”.

Project Summary: Some experts believe that computers could eventually become a lot smarter than humans are. They call it artificial superintelligence, or ASI. If people build ASI, it could be either very good or very bad for humanity. However, ASI is not well understood, which makes it difficult for people to act to enable good ASI and avoid bad ASI. Our project studies the ways that people could build ASI in order to help people act in better ways. We will model the different steps that need to occur for people to build ASI. We will estimate how likely it is that these steps will occur, and when they might occur. We will also model the actions people can take, and we will calculate how much the actions will help. For example, governments may be able to require that ASI researchers build in safety measures. Our models will include both the government action and the ASI safety measures, to learn about how well it all works. This project is an important step towards making sure that humanity avoids bad ASI and, if it wishes, creates good ASI.

Technical Abstract: Artificial superintelligence (ASI) has been proposed to be a major transformative future technology, potentially resulting in either massive improvement in the human condition or existential catastrophe. However, the opportunities and risks remain poorly characterized and quantified. This reduces the effectiveness of efforts to steer ASI development towards beneficial outcomes and away from harmful outcomes. While deep uncertainty inevitably surrounds such a breakthrough future technology, significant progress can be made now using available information and methods. We propose to model the human process of developing ASI. ASI would ultimately be a human creation; modeling this process indicates the probability of various ASI outcomes and illuminates a range of ways to improve outcomes. We will characterize the development pathways that can result in beneficial or dangerous ASI outcomes. We will apply risk analysis and decision analysis methods to quantify opportunities and risks, and to evaluate opportunities to make ASI less risky and more beneficial. Specifically, we will use fault trees and influence diagrams to map out ASI development pathways and the influence that various actions have on these pathways. Our proposed project will produce the first-ever analysis of ASI development using rigorous risk and decision analysis methodology.
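
To make the fault-tree idea in the abstract above concrete, here is a minimal sketch in Python of how a top-event probability might be computed from basic events, and how a policy action could be scored by the risk it removes. It is not the project's model. The event names, the tree shape, and every probability are invented for illustration, and the calculation assumes the basic events are statistically independent, which a real analysis would need to justify.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Event:
    """A basic event with an assumed (hypothetical) probability."""
    name: str
    probability: float


@dataclass
class Gate:
    """An AND/OR gate combining child events or gates."""
    name: str
    kind: str  # "AND" or "OR"
    children: List[Union["Gate", Event]]


def probability(node: Union[Gate, Event]) -> float:
    """Probability of a node, assuming independent basic events."""
    if isinstance(node, Event):
        return node.probability
    child_probs = [probability(child) for child in node.children]
    if node.kind == "AND":
        # All children must occur: multiply the probabilities.
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one child occurs: 1 minus P(no child occurs).
        none_occur = 1.0
        for p in child_probs:
            none_occur *= 1.0 - p
        return 1.0 - none_occur
    raise ValueError(f"unknown gate kind: {node.kind!r}")


def build_tree(p_no_safety_measures: float) -> Gate:
    """Hypothetical tree; all numbers are illustrative, not estimates."""
    safety_fails = Gate("safety fails", "OR", [
        Event("no safety measures built in", p_no_safety_measures),
        Event("safety measures malfunction", 0.2),
    ])
    return Gate("harmful ASI outcome", "AND", [
        Event("ASI is built", 0.5),
        safety_fails,
    ])


# Baseline versus a hypothetical government requirement that makes
# missing safety measures less likely (0.4 falls to 0.1).
baseline_risk = probability(build_tree(0.4))
regulated_risk = probability(build_tree(0.1))
risk_reduction = baseline_risk - regulated_risk

print(f"baseline risk:  {baseline_risk:.2f}")
print(f"regulated risk: {regulated_risk:.2f}")
print(f"risk reduction: {risk_reduction:.2f}")
```

With these made-up inputs, the top-event probability falls from 0.26 to 0.14. An influence diagram would go a step further by adding explicit decision and utility nodes, which this sketch leaves out.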