Foundation model

From Wikipedia, the free encyclopedia


A foundation model is a large artificial intelligence model trained on a vast quantity of unlabeled data at scale (usually by self-supervised learning), resulting in a model that can be adapted to a wide range of downstream tasks.[1][2] Foundation models have driven a major transformation in how AI systems are built since their introduction in 2018. Early examples of foundation models were large pre-trained language models, including BERT[3] and GPT-3. Using the same ideas, domain-specific models trained on sequences of other kinds of tokens, such as medical codes, have been built as well.[4] Subsequently, several multimodal foundation models have been produced, including DALL-E, Flamingo,[5] and Florence.[6] The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) popularized the term.[1]

Definitions

The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) coined the term foundation model to refer to "any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks".[7] This is not a new technique in itself, as it is based on deep neural networks and self-supervised learning, but the scale at which it has been developed in recent years, and the potential for one model to serve many different purposes, warrant a new term, the Stanford group argues.[7]

A foundation model is a "paradigm for building AI systems" in which a model trained on a large amount of unlabeled data can be adapted to many applications.[8][9] Foundation models are "designed to be adapted (e.g., finetuned) to various downstream cognitive tasks by pre-training on broad data at scale".[10]
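The pretrain-then-adapt paradigm described above can be illustrated with a deliberately tiny, self-contained sketch. Real foundation models use large neural networks; here, bigram count statistics stand in for the learned representation, and all function names are illustrative, not drawn from any real library:

```python
# Illustrative sketch of the pretrain-then-adapt paradigm.
# Bigram statistics stand in for a learned neural representation.
from collections import Counter

def pretrain(corpus):
    """'Self-supervised' step: learn character-bigram statistics from
    unlabeled text. The prediction targets (next characters) come from
    the data itself, so no human labeling is needed."""
    return Counter(zip(corpus, corpus[1:]))

def featurize(text, counts):
    """Represent a string by how typical its bigrams are under the
    pretrained statistics - a stand-in for a learned feature vector."""
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return 0.0
    return sum(counts[p] for p in pairs) / len(pairs)

def adapt(counts, labeled_examples):
    """Downstream 'fine-tuning': fit a one-parameter threshold
    classifier on top of the frozen pretrained representation,
    using only a handful of labeled examples."""
    scored = [(featurize(text, y_label[0]), y_label[1]) if False else
              (featurize(text, counts), label)
              for text, label in labeled_examples
              for y_label in [(counts, label)]]
    pos = [s for s, y in scored if y == 1]
    neg = [s for s, y in scored if y == 0]
    threshold = (min(pos) + max(neg)) / 2
    return lambda text: 1 if featurize(text, counts) > threshold else 0
```

For example, pretraining on English-like text and then adapting with four labeled strings yields a classifier that separates English-like input from gibberish, mirroring (in miniature) how one pretrained model can serve a downstream task it was never explicitly trained for.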

Key characteristics of foundation models are emergence and homogenization.[7] Because the training data is not labelled by humans, the model's capabilities emerge from the data rather than being explicitly encoded, and properties that were not anticipated can appear. For example, a model trained on a large language dataset might learn to generate stories of its own, or to do arithmetic, without being explicitly programmed to do so.[11] Homogenization means that the same method is used across many domains, which allows for powerful advances but also creates the possibility of "single points of failure".[7]

Opportunities and risks

A 2021 arXiv report surveyed foundation models' capabilities with regard to "language, vision, robotics, reasoning, and human interaction"; technical principles, such as "model architectures, training procedures, data, systems, security, evaluation, and theory"; their applications, for example in law, healthcare, and education; and their potential impact on society, including "inequity, misuse, economic and environmental impact, legal and ethical considerations".[7]

An article about foundation models in The Economist notes that "some worry that the technology’s heedless spread will further concentrate economic and political power".[11]

References

  1. ^ a b "Introducing the Center for Research on Foundation Models (CRFM)". Stanford HAI. Retrieved 11 June 2022.
  2. ^ Goldman, Sharon (2022-09-13). "Foundation models: 2022's AI paradigm shift". VentureBeat. Retrieved 2022-10-24.
  3. ^ Rogers, Anna; Kovaleva, Olga; Rumshisky, Anna (2020). "A Primer in BERTology: What we know about how BERT works". arXiv:2002.12327 [cs.CL].
  4. ^ Steinberg, Ethan; Jung, Ken; Fries, Jason A.; Corbin, Conor K.; Pfohl, Stephen R.; Shah, Nigam H. (January 2021). "Language models are an effective representation learning technique for electronic health record data". Journal of Biomedical Informatics. 113: 103637. doi:10.1016/j.jbi.2020.103637. ISSN 1532-0480. PMC 7863633. PMID 33290879.
  5. ^ "Tackling multiple tasks with a single visual language model". 28 April 2022. Retrieved 13 June 2022.
  6. ^ Yuan, Lu; Chen, Dongdong; Chen, Yi-Ling; Codella, Noel; Dai, Xiyang; Gao, Jianfeng; Hu, Houdong; Huang, Xuedong; Li, Boxin; Li, Chunyuan; Liu, Ce; Liu, Mengchen; Liu, Zicheng; Lu, Yumao; Shi, Yu; Wang, Lijuan; Wang, Jianfeng; Xiao, Bin; Xiao, Zhen; Yang, Jianwei; Zeng, Michael; Zhou, Luowei; Zhang, Pengchuan (2022). "Florence: A New Foundation Model for Computer Vision". arXiv:2111.11432 [cs.CV].
  7. ^ a b c d e Bommasani, Rishi; Hudson, Drew A.; Adeli, Ehsan; Altman, Russ; Arora, Simran; von Arx, Sydney; Bernstein, Michael S.; Bohg, Jeannette; Bosselut, Antoine; Brunskill, Emma; Brynjolfsson, Erik; Buch, Shyamal; Card, Dallas; Castellon, Rodrigo; Chatterji, Niladri; Chen, Annie; Creel, Kathleen; Davis, Jared Quincy; Demszky, Dora; Donahue, Chris; Doumbouya, Moussa; Durmus, Esin; Ermon, Stefano; Etchemendy, John; Ethayarajh, Kawin; Fei-Fei, Li; Finn, Chelsea; Gale, Trevor; Gillespie, Lauren; Goel, Karan; Goodman, Noah; Grossman, Shelby; Guha, Neel; Hashimoto, Tatsunori; Henderson, Peter; Hewitt, John; Ho, Daniel E.; Hong, Jenny; Hsu, Kyle; Huang, Jing; Icard, Thomas; Jain, Saahil; Jurafsky, Dan; Kalluri, Pratyusha; Karamcheti, Siddharth; Keeling, Geoff; Khani, Fereshte; Khattab, Omar; Koh, Pang Wei; Krass, Mark; Krishna, Ranjay; Kuditipudi, Rohith; Kumar, Ananya; Ladhak, Faisal; Lee, Mina; Lee, Tony; Leskovec, Jure; Levent, Isabelle; Li, Xiang Lisa; Li, Xuechen; Ma, Tengyu; Malik, Ali; Manning, Christopher D.; Mirchandani, Suvir; Mitchell, Eric; Munyikwa, Zanele; Nair, Suraj; Narayan, Avanika; Narayanan, Deepak; Newman, Ben; Nie, Allen; Niebles, Juan Carlos; Nilforoshan, Hamed; Nyarko, Julian; Ogut, Giray; Orr, Laurel; Papadimitriou, Isabel; Park, Joon Sung; Piech, Chris; Portelance, Eva; Potts, Christopher; Raghunathan, Aditi; Reich, Rob; Ren, Hongyu; Rong, Frieda; Roohani, Yusuf; Ruiz, Camilo; Ryan, Jack; Ré, Christopher; Sadigh, Dorsa; Sagawa, Shiori; Santhanam, Keshav; Shih, Andy; Srinivasan, Krishnan; Tamkin, Alex; Taori, Rohan; Thomas, Armin W.; Tramèr, Florian; Wang, Rose E.; Wang, William; Wu, Bohan; Wu, Jiajun; Wu, Yuhuai; Xie, Sang Michael; Yasunaga, Michihiro; You, Jiaxuan; Zaharia, Matei; Zhang, Michael; Zhang, Tianyi; Zhang, Xikun; Zhang, Yuhui; Zheng, Lucia; Zhou, Kaitlyn; Liang, Percy (18 August 2021). On the Opportunities and Risks of Foundation Models (Report). arXiv:2108.07258. Retrieved 10 June 2022.
  8. ^ "Stanford CRFM". Retrieved 10 June 2022.
  9. ^ "What are foundation models?". IBM Research Blog. 9 February 2021. Retrieved 10 June 2022.
  10. ^ Fei, Nanyi; Lu, Zhiwu; Gao, Yizhao; Yang, Guoxing; Huo, Yuqi; Wen, Jingyuan; Lu, Haoyu; Song, Ruihua; Gao, Xin; Xiang, Tao; Sun, Hao; Wen, Ji-Rong (December 2022). "Towards artificial general intelligence via a multimodal foundation model". Nature Communications. 13 (1): 3094. doi:10.1038/s41467-022-30761-2. ISSN 2041-1723. PMC 9163040. PMID 35655064.
  11. ^ a b "Huge "foundation models" are turbo-charging AI progress". The Economist. ISSN 0013-0613. Retrieved 2022-10-24.