Eliezer Yudkowsky
Eliezer Yudkowsky | |
---|---|
Born | Eliezer Shlomo Yudkowsky, September 11, 1979 |
Organization | Machine Intelligence Research Institute |
Known for | Coining the term friendly artificial intelligence; research on AI safety; rationality writing; founding LessWrong |
Website | www |
Eliezer Shlomo Yudkowsky (born September 11, 1979) is an American writer on decision theory and ethics, best known for popularizing the idea of friendly artificial intelligence.[1][2] He is a co-founder[3] and research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California.[4] His work on the prospect of a runaway intelligence explosion influenced philosopher Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies.[5]
Work in artificial intelligence safety
Goal learning and incentives in software systems
Yudkowsky's views on the safety challenges posed by future generations of AI systems are discussed in Stuart Russell and Peter Norvig's undergraduate AI textbook Artificial Intelligence: A Modern Approach. Noting the difficulty of formally specifying general-purpose goals by hand, Russell and Norvig cite Yudkowsky's proposal that autonomous and adaptive systems be designed to learn correct behavior over time:
> Yudkowsky (2008)[6] goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to design a mechanism for evolving AI under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.[1]
In response to the instrumental convergence concern, where autonomous decision-making systems with poorly designed goals would have default incentives to mistreat humans, Yudkowsky and other MIRI researchers have recommended that work be done to specify software agents that converge on safe default behaviors even when their goals are misspecified.[7][2]
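The corrigibility proposal itself is formal and decision-theoretic; the sketch below is only an editorial toy illustration of the intuition, with every name and number assumed rather than taken from MIRI's papers. It shows an agent pursuing a possibly misspecified proxy objective while treating compliance with a human shutdown request as a dominant, hard-coded default.

```python
# Toy illustration only (an editorial assumption, not MIRI's formalism): an agent
# whose possibly misspecified proxy objective is overridden by a shutdown request.

from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyAgent:
    # Stand-in for a goal the designers may have specified badly.
    proxy_reward: Dict[str, float]  # action name -> estimated reward

    def act(self, shutdown_requested: bool) -> str:
        """Choose an action, treating compliance with shutdown as dominant."""
        if shutdown_requested:
            # Safe default behavior: comply regardless of the proxy objective.
            return "shut_down"
        # Otherwise follow the (possibly flawed) proxy objective greedily.
        return max(self.proxy_reward, key=self.proxy_reward.get)


agent = ToyAgent(proxy_reward={"collect_resources": 10.0, "shut_down": -5.0})
print(agent.act(shutdown_requested=False))  # collect_resources
print(agent.act(shutdown_requested=True))   # shut_down, despite its negative proxy reward
```

In the setting studied in the "Corrigibility" paper cited above, the difficulty is designing incentives so that a utility-maximizing agent neither resists nor manipulates the shutdown mechanism; the hard-coded override in this sketch simply assumes that problem away.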
Capabilities forecasting
In the intelligence explosion scenario hypothesized by I. J. Good, recursively self-improving AI systems quickly transition from subhuman general intelligence to superintelligence. Nick Bostrom's 2014 book Superintelligence: Paths, Dangers, Strategies sketches out Good's argument in detail, while citing writing by Yudkowsky on the risk that anthropomorphizing advanced AI systems will cause people to misunderstand the nature of an intelligence explosion. "AI might make an apparently sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of 'village idiot' and 'Einstein' as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general."[1][3][6][8]
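A toy numerical model can illustrate why the jump looks sharp on a human-anchored scale; all quantities below are arbitrary assumptions for illustration, not figures from Good, Bostrom, or Yudkowsky. Because each gain in capability speeds up the next gain, the trajectory spends only a handful of steps inside the narrow interval between the assumed "village idiot" and "Einstein" levels.

```python
# Toy model of recursive self-improvement; every number here is an arbitrary
# assumption chosen for illustration. Capability feeds back into the rate of
# further improvement, so growth compounds.

capability = 0.5      # starting level, assumed well below the human range
village_idiot = 1.0   # assumed lower edge of the human range
einstein = 1.2        # assumed upper edge, only slightly above the lower edge
rate = 0.05           # assumed per-step self-improvement fraction

step = 0
steps_in_human_range = 0
while capability < 1e9:                  # stop once far beyond the human range
    step += 1
    capability *= 1.0 + rate * capability
    if village_idiot <= capability <= einstein:
        steps_in_human_range += 1

print(f"total steps simulated: {step}")
print(f"steps spent between 'village idiot' and 'Einstein': {steps_in_human_range}")
```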
In Artificial Intelligence: A Modern Approach, Stuart Russell and Peter Norvig raise the objection that computational complexity theory places known limits on intelligent problem-solving; if there are strong limits on how efficiently algorithms can solve various computational tasks, an intelligence explosion may not be possible.[1]
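The arithmetic behind this objection can be made concrete with a back-of-the-envelope sketch; the machine speed below is an assumed figure, not one given by Russell and Norvig. If a task genuinely requires examining 2^n possibilities, modest increases in n overwhelm any fixed speedup, however intelligent the searcher.

```python
# Back-of-the-envelope sketch of the complexity objection; the machine speed is
# an assumed figure. Brute-force search over n binary variables costs 2**n
# checks, so feasible problem sizes grow only slowly with hardware speed.

ops_per_second = 10**18                # assumed: an extraordinarily fast machine
seconds_per_year = 365 * 24 * 3600

for n in (60, 80, 100, 120):
    seconds = 2**n / ops_per_second
    print(f"n = {n:3d}: {seconds:9.3e} seconds ≈ {seconds / seconds_per_year:9.3e} years")
```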
Rationality writing
Between 2006 and 2009, Yudkowsky and Robin Hanson were the principal contributors to Overcoming Bias, a cognitive and social science blog sponsored by the Future of Humanity Institute of Oxford University. In February 2009, Yudkowsky founded LessWrong, a "community blog devoted to refining the art of human rationality".[9][10] Overcoming Bias has since functioned as Hanson's personal blog.
Over 300 blogposts by Yudkowsky on philosophy and science (originally written on LessWrong and Overcoming Bias) were released as an ebook entitled Rationality: From AI to Zombies by the Machine Intelligence Research Institute (MIRI) in 2015.[11] MIRI has also published Inadequate Equilibria, Yudkowsky's 2017 ebook on the subject of societal inefficiencies.[12]
Yudkowsky has also written several works of fiction. His fanfiction novel, Harry Potter and the Methods of Rationality, uses plot elements from J. K. Rowling's Harry Potter series to illustrate topics in science.[9][13] The New Yorker described Harry Potter and the Methods of Rationality as a retelling of Rowling's original "in an attempt to explain Harry's wizardry through the scientific method".[14]
Personal life
Yudkowsky is an autodidact[15] and did not attend high school or college.[16] He was raised as a Modern Orthodox Jew.[17]
Academic publications
- Yudkowsky, Eliezer (2007). "Levels of Organization in General Intelligence" (PDF). Artificial General Intelligence. Berlin: Springer.
- Yudkowsky, Eliezer (2008). "Cognitive Biases Potentially Affecting Judgement of Global Risks" (PDF). In Bostrom, Nick; Ćirković, Milan (eds.). Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504.
- Yudkowsky, Eliezer (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk" (PDF). In Bostrom, Nick; Ćirković, Milan (eds.). Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504.
- Yudkowsky, Eliezer (2011). "Complex Value Systems in Friendly AI" (PDF). Artificial General Intelligence: 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3–6, 2011. Berlin: Springer.
- Yudkowsky, Eliezer (2012). "Friendly Artificial Intelligence". In Eden, Amnon; Moor, James; Søraker, John; et al. (eds.). Singularity Hypotheses: A Scientific and Philosophical Assessment. The Frontiers Collection. Berlin: Springer. pp. 181–195. doi:10.1007/978-3-642-32560-1_10. ISBN 978-3-642-32559-5.
- Bostrom, Nick; Yudkowsky, Eliezer (2014). "The Ethics of Artificial Intelligence" (PDF). In Frankish, Keith; Ramsey, William (eds.). The Cambridge Handbook of Artificial Intelligence. New York: Cambridge University Press. ISBN 978-0-521-87142-6.
- LaVictoire, Patrick; Fallenstein, Benja; Yudkowsky, Eliezer; Bárász, Mihály; Christiano, Paul; Herreshoff, Marcello (2014). "Program Equilibrium in the Prisoner's Dilemma via Löb's Theorem". Multiagent Interaction without Prior Coordination: Papers from the AAAI-14 Workshop. AAAI Publications.
- Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). "Corrigibility" (PDF). AAAI Workshops: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, January 25–26, 2015. AAAI Publications.
References
1. Russell, Stuart; Norvig, Peter (2009). Artificial Intelligence: A Modern Approach. Prentice Hall. ISBN 978-0-13-604259-4.
2. Leighton, Jonathan (2011). The Battle for Compassion: Ethics in an Apathetic Universe. Algora. ISBN 978-0-87586-870-7.
3. Dowd, Maureen (March 26, 2017). "Elon Musk's Billion-Dollar Crusade to Stop the A.I. Apocalypse". Vanity Fair. Archived from the original on July 26, 2018. Retrieved July 28, 2018.
4. Kurzweil, Ray (2005). The Singularity Is Near. New York City: Viking Penguin. ISBN 978-0-670-03384-3.
5. Ford, Paul (February 11, 2015). "Our Fear of Artificial Intelligence". MIT Technology Review. Archived from the original on March 30, 2019. Retrieved April 9, 2019.
6. Yudkowsky, Eliezer (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk" (PDF). In Bostrom, Nick; Ćirković, Milan (eds.). Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504. Archived (PDF) from the original on March 2, 2013. Retrieved October 16, 2015.
7. Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). "Corrigibility". AAAI Workshops: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, January 25–26, 2015. AAAI Publications. Archived from the original on January 15, 2016. Retrieved October 16, 2015.
8. Bostrom, Nick (2014). Superintelligence: Paths, Dangers, Strategies. Oxford University Press. ISBN 978-0199678112.
9. Miller, James (2012). Singularity Rising. BenBella Books, Inc. ISBN 978-1936661657.
10. Miller, James (July 28, 2011). "You Can Learn How To Become More Rational". Business Insider. Archived from the original on August 10, 2018. Retrieved March 25, 2014.
11. Miller, James D. "Rifts in Rationality – New Rambler Review". newramblerreview.com. Archived from the original on July 28, 2018. Retrieved July 28, 2018.
12. Machine Intelligence Research Institute. "Inadequate Equilibria: Where and How Civilizations Get Stuck". Archived from the original on September 21, 2020. Retrieved May 13, 2020.
13. Snyder, Daniel D. (July 18, 2011). "'Harry Potter' and the Key to Immortality". The Atlantic. Archived from the original on December 23, 2015. Retrieved June 13, 2022.
14. Packer, George (2011). "No Death, No Taxes: The Libertarian Futurism of a Silicon Valley Billionaire". The New Yorker. p. 54. Archived from the original on December 14, 2016. Retrieved October 12, 2015.
15. Matthews, Dylan; Pinkerton, Byrd (June 19, 2019). "He co-founded Skype. Now he's spending his fortune on stopping dangerous AI". Vox. Archived from the original on March 6, 2020. Retrieved March 22, 2020.
16. Saperstein, Gregory (August 9, 2012). "5 Minutes With a Visionary: Eliezer Yudkowsky". CNBC. Archived from the original on August 1, 2017. Retrieved September 9, 2017.
17. Yudkowsky, Eliezer (October 4, 2007). "Avoiding your belief's real weak points". LessWrong. Archived from the original on May 2, 2021. Retrieved April 30, 2021.
External links
- Official website
- Rationality: From AI to Zombies (entire book online)