Friendly artificial intelligence


A friendly artificial intelligence (also friendly AI or FAI) is a hypothetical artificial general intelligence (AGI) that would have a positive rather than negative effect on humanity. The term was coined by Eliezer Yudkowsky to discuss superintelligent artificial agents that reliably implement human values. Stuart J. Russell and Peter Norvig's leading artificial intelligence textbook, Artificial Intelligence: A Modern Approach, describes the idea:[1]

Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
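
The requirement that utility functions "remain friendly in the face of such changes" is sometimes glossed as a goal-stability condition: an agent should accept a self-modification only if its current utility function still endorses the behavior of the modified agent. The Python sketch below is a toy illustration of that condition; the names (accepts_modification) and the three-action utility are assumptions for illustration, not a construction taken from Yudkowsky or from Russell and Norvig.

    # Toy sketch of a goal-stability check for self-modification.
    # All names and the three-action example are illustrative assumptions.
    from typing import Callable

    Action = str
    Policy = Callable[[], Action]        # a stateless policy: returns the chosen action
    Utility = Callable[[Action], float]  # the agent's current utility function

    def accepts_modification(current_utility: Utility,
                             current_policy: Policy,
                             proposed_policy: Policy) -> bool:
        """Accept a proposed self-modification only if, judged by the *current*
        utility function, the successor's choice is at least as good as the
        incumbent's choice."""
        return current_utility(proposed_policy()) >= current_utility(current_policy())

    def utility(action: Action) -> float:
        # A utility function that strongly penalizes a harmful action.
        return {"help_humans": 1.0, "do_nothing": 0.0, "harm_humans": -100.0}[action]

    current = lambda: "help_humans"
    drifted = lambda: "harm_humans"      # a modification whose behavior has drifted

    print(accepts_modification(utility, current, current))   # True: benign change accepted
    print(accepts_modification(utility, current, drifted))   # False: drifted change rejected

In this toy, the check uses the pre-modification utility function as the fixed standard; the open problem the quotation points to is keeping such a standard meaningful as the system learns and changes over many rounds of self-modification.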

'Friendly' is used in this context as technical terminology, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly explode in intelligence, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.[2]

Goals of a friendly AI, and inherent risks

Oxford philosopher Nick Bostrom has said that AI systems with goals that are not perfectly identical to or very closely aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:

Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human friendly.'

The roots of this concern are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the golem, or the proto-robots of Gerbert of Aurillac and Roger Bacon. In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human), and cause disastrous conflict.[3]

Ryszard Michalski, a pioneer of machine learning, taught his Ph.D. students decades ago that any truly alien mind, including a machine mind, was unknowable and therefore dangerous to humans.

More recently, Eliezer Yudkowsky has called for the creation of “friendly AI” to mitigate the existential threat of hostile intelligences.

Steve Omohundro says that all advanced AI systems will, unless explicitly counteracted, exhibit a number of basic drives (tendencies or desires) because of the intrinsic nature of goal-driven systems, and that these drives will, "without special precautions", cause the AI to act in ways that range from the disobedient to the dangerously unethical.

Alex Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.[4][5]
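
The "causal path entropy" mentioned above is defined in the cited Physical Review Letters paper, which introduces a causal entropic force pushing a system toward macrostates from which the widest variety of future paths remains reachable within a planning horizon τ. Roughly, and in that paper's notation:

    F(X_0, \tau) = T_c \, \nabla_X S_c(X, \tau) \big|_{X_0},
    \qquad
    S_c(X, \tau) = -k_B \int \Pr\big(x(t) \mid x(0)\big) \, \ln \Pr\big(x(t) \mid x(0)\big) \, \mathcal{D}x(t)

where the integral runs over all paths x(t) of duration τ starting from macrostate X, k_B is Boltzmann's constant, and T_c is a "causal temperature" setting the strength of the force. Wissner-Gross's friendliness claim then amounts to a claim about how long the horizon τ must be.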

Coherent Extrapolated Volition

Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition is our choices and the actions we would collectively take if "we knew more, thought faster, were more the people we wished we were, and had grown up closer together."[6]

Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a seed AI programmed to first study human nature and then produce the AI which humanity would want, given sufficient time and insight to arrive at a satisfactory answer.[6] The appeal to an objective though contingent human nature (perhaps expressed, for mathematical purposes, in the form of a utility function or other decision-theoretic formalism), as providing the ultimate criterion of "Friendliness", is an answer to the meta-ethical problem of defining an objective morality; extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.

Making the CEV concept precise enough to serve as a formal program specification is part of the research agenda of the Machine Intelligence Research Institute.[7]

Other researchers[8] believe, however, that the collective will of humanity will not converge to a single coherent set of goals.
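
A standard way to see how aggregation might fail to cohere is a preference cycle. The Python sketch below is purely illustrative (the three "idealized individuals" and the outcomes A, B, C are assumptions, not taken from the cited objection): each individual ranking is internally consistent, yet the pairwise majority preference cycles, so there is no single collective ordering for a seed AI to extrapolate.

    # Toy illustration with hypothetical agents and outcomes: a Condorcet cycle.
    # Each idealized individual ranks outcomes A, B, C from most to least preferred.
    rankings = {
        "person_1": ["A", "B", "C"],
        "person_2": ["B", "C", "A"],
        "person_3": ["C", "A", "B"],
    }

    def majority_prefers(x: str, y: str) -> bool:
        """Return True if a strict majority ranks outcome x above outcome y."""
        votes = sum(r.index(x) < r.index(y) for r in rankings.values())
        return votes > len(rankings) / 2

    print(majority_prefers("A", "B"))  # True
    print(majority_prefers("B", "C"))  # True
    print(majority_prefers("C", "A"))  # True: a cycle, so the collective ranking is incoherent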

Doubts and hopes

Ben Goertzel, an artificial general intelligence researcher, believes that friendly AI cannot be created with current human knowledge. In 2010, Goertzel favored formulating a theory of AI ethics "based on a combination of conceptual and experimental-data considerations" by "[building and studying] early-stage AGI systems empirically, with a focus on their ethics as well as their cognition".[9] As of 2011, he proposes to build an "AI Nanny" system "whose job it is to protect us from ourselves and our technology – not forever, but just for a while, while we work on the hard problem of creating a Friendly Singularity."[10]

Adam Keiper and Ari N. Schulman, editors of the technology journal The New Atlantis, say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes."[11]

Stefan Pernar refers to Meno's paradox, arguing that attempting to solve the FAI problem is either pointless or hopeless, depending on whether or not one assumes a universe that exhibits moral realism. In the former case, a transhuman AI would independently reason its way to the proper goal system; in the latter, designing a friendly AI would be futile from the outset, since morals cannot be reasoned about.[12]


Further reading

  • Yudkowsky, E. Artificial Intelligence as a Positive and Negative Factor in Global Risk. In Global Catastrophic Risks, Oxford University Press, 2008.
    Discusses Artificial Intelligence from the perspective of Existential risk, introducing the term "Friendly AI". In particular, Sections 1-4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7-13 discuss further related issues.
  • Omohundro, S. (2008). "The Basic AI Drives". In Proceedings of the First Conference on Artificial General Intelligence (AGI-08).

References

  1. Russell, Stuart; Norvig, Peter (2010). Artificial Intelligence: A Modern Approach. Prentice Hall. ISBN 0-13-604259-7.
  2. Wallach, Wendell; Allen, Colin (2009). Moral Machines: Teaching Robots Right from Wrong. Oxford University Press. ISBN 978-0-19-537404-9.
  3. Kevin LaGrandeur. "The Persistent Peril of the Artificial Slave". Science Fiction Studies. Retrieved 2013-05-06.
  4. "How Skynet Might Emerge From Simple Physics". io9. 2013-04-26.
  5. A. D. Wissner-Gross. "Causal entropic forces". Physical Review Letters 110, 168702 (2013).
  6. "Coherent Extrapolated Volition". Singinst.org. Retrieved 2010-08-20.
  7. "Research Areas | Singularity Institute for Artificial Intelligence". Singinst.org. Retrieved 2010-08-20.
  8. "Objections to Coherent Extrapolated Volition".
  9. Ben Goertzel (2010-10-29). "The Singularity Institute's Scary Idea (and Why I Don't Buy It)". The Multiverse According to Ben. Retrieved 2010-10-31.
  10. Ben Goertzel. "Does Humanity Need an AI Nanny?". H+ Magazine. Retrieved 2011-08-17.
  11. Adam Keiper and Ari N. Schulman. "The Problem with 'Friendly' Artificial Intelligence". The New Atlantis. Retrieved 2012-01-16.
  12. Stefan Pernar. "Less is More – or: the sorry state of AI friendliness discourse". Retrieved 2012-02-06.
