
Paul Christiano (researcher)

From Wikipedia, the free encyclopedia


Paul Christiano
Alma mater: Massachusetts Institute of Technology; University of California, Berkeley
Known for: AI alignment research
Scientific career
Institutions: OpenAI; Alignment Research Center
Website: paulfchristiano.com

Paul Christiano is an American researcher in the field of artificial intelligence (AI), with a specific focus on AI alignment, which is the subfield of AI safety research that aims to steer AI systems toward human interests.[1] He formerly led the language model alignment team at OpenAI and is now the head of the non-profit Alignment Research Center, which works on theoretical AI alignment and evaluations of machine learning models.[2]

Education

Christiano won a silver medal at the 2008 International Mathematical Olympiad.[3] In 2012, he graduated from the Massachusetts Institute of Technology (MIT) with a degree in mathematics.[4] At MIT, he researched data structures, quantum cryptography, and combinatorial optimization.[5] He then completed a PhD at the University of California, Berkeley.[6]

Career

At OpenAI, Christiano co-authored the paper "Deep Reinforcement Learning from Human Preferences" (2017) and other works developing reinforcement learning from human feedback (RLHF).[7][8] Other works, such as "AI safety via debate" (2018), focus on the problem of scalable oversight – supervising AI systems in domains where humans would have difficulty judging output quality.[9][10][11]
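A minimal sketch of the RLHF preference model, following the notation of the 2017 paper: a reward model \(\hat{r}\) is trained on human comparisons between pairs of trajectory segments \(\sigma^1\) and \(\sigma^2\), with the probability that a human prefers \(\sigma^1\) modeled as

\[
\hat{P}\big[\sigma^1 \succ \sigma^2\big] \;=\; \frac{\exp \sum_t \hat{r}(o^1_t, a^1_t)}{\exp \sum_t \hat{r}(o^1_t, a^1_t) \;+\; \exp \sum_t \hat{r}(o^2_t, a^2_t)} .
\]

The reward model is fit by minimizing the cross-entropy between these predicted preference probabilities and the human labels, and the learned \(\hat{r}\) then serves as the reward signal for a standard reinforcement learning algorithm.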

Christiano left OpenAI in 2021 to work on more conceptual and theoretical issues in AI alignment and subsequently founded the Alignment Research Center to focus on this area.[1] One subject of study is the problem of eliciting latent knowledge from advanced machine learning models.[12][13]

Christiano is known for his views on the potential risks of advanced AI, stating in a 2023 interview that there is a "10–20% chance of AI takeover, [with] many [or] most humans dead." He also conjectured a "50/50 chance of doom shortly after you have AI systems that are human level."[14][1]

References

  1. ^ a b c "A.I. has a '10 or 20% chance' of conquering humanity, former OpenAI safety researcher warns". Fortune. Retrieved 2023-06-04.
  2. ^ Piper, Kelsey (2023-03-29). "How to test what an AI model can — and shouldn't — do". Vox. Retrieved 2023-08-04.
  3. ^ "IMO 2008".
  4. ^ "Paul Christiano".
  5. ^ "About the Authors: Theory of Computing: An Open Access Electronic Journal in Theoretical Computer Science".
  6. ^ "Future of Humanity Institute". The Future of Humanity Institute. Retrieved 2023-08-04.
  7. ^ Christiano, Paul F; Leike, Jan; Brown, Tom; Martic, Miljan; Legg, Shane; Amodei, Dario (2017). "Deep Reinforcement Learning from Human Preferences". Advances in Neural Information Processing Systems. 30. Curran Associates, Inc.
  8. ^ Ouyang, Long; Wu, Jeffrey; Jiang, Xu; Almeida, Diogo; Wainwright, Carroll; Mishkin, Pamela; Zhang, Chong; Agarwal, Sandhini; Slama, Katarina; Ray, Alex; Schulman, John; Hilton, Jacob; Kelton, Fraser; Miller, Luke; Simens, Maddie (2022-12-06). "Training language models to follow instructions with human feedback". Advances in Neural Information Processing Systems. 35: 27730–27744. arXiv:2203.02155.
  9. ^ Irving, G.; Christiano, P.; Amodei, Dario (2018-05-02). "AI safety via debate". arXiv:1805.00899 [stat.ML].
  10. ^ Wu, Jeff; Ouyang, Long; Ziegler, Daniel M.; Stiennon, Nissan; Lowe, Ryan; Leike, J.; Christiano, P. (2021-09-22). "Recursively Summarizing Books with Human Feedback". arXiv:2109.10862 [cs.CL].
  11. ^ Christiano, P.; Shlegeris, Buck; Amodei, Dario (2018-10-19). "Supervising strong learners by amplifying weak experts". arXiv:1810.08575 [cs.LG].
  12. ^ Burns, Collin; Ye, Haotian; Klein, Dan; Steinhardt, Jacob (2022). "Discovering Latent Knowledge in Language Models Without Supervision". arXiv:2212.03827 [cs.CL].
  13. ^ Christiano, Paul; Cotra, Ajeya; Xu, Mark (December 2021). "Eliciting Latent Knowledge: How to tell if your eyes deceive you". Google Docs. Alignment Research Center. Retrieved 2023-04-16.
  14. ^ Nolan, Beatrice. "Ex-OpenAI researcher says there's a 50% chance AI development could end in 'doom'". Business Insider. Retrieved 2023-06-04.
