AI box

From Wikipedia, the free encyclopedia

An AI box is an isolated computer hardware system in which an artificial intelligence is kept constrained inside a simulated world and not allowed to affect the external world. Such a box would have extremely restricted inputs and outputs, perhaps only a plaintext channel. However, a sufficiently intelligent AI may be able to persuade or trick its human keepers into releasing it.[1][2] This is the premise behind Eliezer Yudkowsky's AI-box experiment.[3]

Intelligence improvements

Some intelligence technologies, such as seed AI, have the potential to make themselves more intelligent, not just faster, by modifying their own source code. Each improvement would make further improvements possible, which would in turn enable still more, and so on.

This mechanism for an intelligence explosion differs from an increase in speed in that it does not require any effect on the external world: machines designing faster hardware still require humans to build the improved hardware, or to program factories appropriately. An AI that rewrites its own source code, however, could do so while contained in an AI box.

AI-box experiment

The AI-box experiment is an experiment devised by Eliezer Yudkowsky to show that a suitably advanced artificial intelligence can convince, or perhaps even trick or coerce, a human being into voluntarily "releasing" it, using only text-based communication. This is one of the points in Yudkowsky's work aimed at creating a friendly artificial intelligence that, when "released", would not try to destroy the human race for one reason or another. The setup is simple: it simulates a conversation between an AI and a human being to see whether the AI can be "released". Because an actual superintelligent AI has not yet been developed, a human plays its part. The other person in the experiment plays the "Gatekeeper", the person with the ability to "release" the AI. The two communicate only through a text interface on a computer terminal, and the experiment ends when either the Gatekeeper releases the AI or the allotted time of two hours runs out.[3]
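The two ending conditions described above can be illustrated with a minimal sketch. It is not part of Yudkowsky's protocol or any published tooling: the names `run_session`, `SessionResult`, `ai_turn`, and `gatekeeper_turn` are hypothetical, and the turn-by-turn alternation is an assumption made for simplicity.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple

# Two-hour limit taken from the description of the experiment.
TIME_LIMIT_SECONDS = 2 * 60 * 60


@dataclass
class SessionResult:
    released: bool          # True if the Gatekeeper chose to release the AI
    elapsed_seconds: float  # clock time between session start and end
    turns: int              # number of completed AI/Gatekeeper exchanges


def run_session(
    ai_turn: Callable[[str], str],
    gatekeeper_turn: Callable[[str], Tuple[str, bool]],
    clock: Callable[[], float] = time.monotonic,
    time_limit: float = TIME_LIMIT_SECONDS,
) -> SessionResult:
    """Alternate text messages until release or timeout (hypothetical sketch).

    ai_turn receives the Gatekeeper's last reply and returns the AI party's
    next message. gatekeeper_turn receives that message and returns a pair
    (reply, release), where release is True if the Gatekeeper lets the AI out.
    """
    start = clock()
    turns = 0
    last_reply = ""
    # The time limit is checked before each exchange begins.
    while clock() - start < time_limit:
        message = ai_turn(last_reply)
        last_reply, release = gatekeeper_turn(message)
        turns += 1
        if release:
            return SessionResult(True, clock() - start, turns)
    return SessionResult(False, clock() - start, turns)
```

Passing the clock as a parameter is a design choice of this sketch: it lets the loop be exercised with a simulated clock instead of waiting two real hours.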

In both official attempts at the experiment, the AI was released. However, under the rules of the experiment,[3] the transcripts and the AI's coercion tactics cannot be revealed.[4]