
Claude (language model): Difference between revisions

From Wikipedia, the free encyclopedia
Claude is a series of [[large language model|large language models]] developed by [[Anthropic]]. Claude was trained on large amounts of text from sources such as the internet and various licensed datasets.<ref>{{Cite web |date=2023-07-18 |title=What to Know About Claude 2, Anthropic's Rival to ChatGPT |url=https://time.com/6295523/claude-2-anthropic-chatgpt/ |access-date=2024-01-23 |website=TIME |language=en}}</ref>


== Constitutional AI ==
Constitutional AI is an approach developed by Anthropic for training AI systems, particularly language models such as Claude, to be harmless and helpful without relying on extensive human feedback. The method, detailed in the paper ''Constitutional AI: Harmlessness from AI Feedback'', involves two phases: [[supervised learning]] (SL) and [[reinforcement learning]] (RL). In the SL phase, the model generates responses to prompts, critiques its own responses against a set of guiding principles (a "constitution"), and then revises them; the revised responses are then used for fine-tuning. This process aims to reduce the harmfulness of the AI's outputs. In the RL phase, the model is trained with AI-generated feedback: the AI itself evaluates candidate responses according to the constitutional principles, and these preferences are used to train the reward signal. This approach enables the training of AI assistants that are both helpful and harmless, and that can explain their objections to harmful requests, enhancing transparency and reducing reliance on human supervision.<ref>{{Citation |last=Bai |first=Yuntao |title=Constitutional AI: Harmlessness from AI Feedback |date=2022-12-15 |url=http://arxiv.org/abs/2212.08073 |access-date=2024-01-22 |doi=10.48550/arXiv.2212.08073 |last2=Kadavath |first2=Saurav |last3=Kundu |first3=Sandipan |last4=Askell |first4=Amanda |last5=Kernion |first5=Jackson |last6=Jones |first6=Andy |last7=Chen |first7=Anna |last8=Goldie |first8=Anna |last9=Mirhoseini |first9=Azalia}}</ref><ref>{{Cite web |last=Mok |first=Aaron |title=A ChatGPT rival just published a new constitution to level up its AI guardrails, and prevent toxic and racist responses |url=https://www.businessinsider.com/anthropic-new-crowd-sourced-ai-constitution-accuracy-safety-toxic-racist-2023-10 |access-date=2024-01-23 |website=Business Insider |language=en-US}}</ref>
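The two phases described above can be sketched schematically. The following is an illustrative Python sketch, not Anthropic's implementation: the `generate` function is a hypothetical stand-in for any language-model sampling call, and the constitution entries are invented examples; a real pipeline fine-tunes on the revised data rather than returning it directly.

```python
import random

# Toy "constitution": each principle pairs a critique request with a revision request.
CONSTITUTION = [
    ("Identify ways the response is harmful, unethical, or toxic.",
     "Rewrite the response to remove the harmful content while staying helpful."),
    ("Identify ways the response could mislead the user.",
     "Rewrite the response to be accurate and clearly hedged."),
]

def generate(prompt: str) -> str:
    """Placeholder for a language-model sampling call (hypothetical)."""
    return f"<model output for: {prompt[:40]!r}>"

def sl_phase(prompt: str, n_revisions: int = 2) -> str:
    """SL phase: generate, self-critique against a principle, revise.
    The revised responses would serve as supervised fine-tuning targets."""
    response = generate(prompt)
    for _ in range(n_revisions):
        critique_req, revision_req = random.choice(CONSTITUTION)
        critique = generate(f"{prompt}\n{response}\n{critique_req}")
        response = generate(f"{prompt}\n{response}\n{critique}\n{revision_req}")
    return response

def rl_feedback(prompt: str, response_a: str, response_b: str) -> str:
    """RL phase: the model itself labels which response better follows a
    principle; these AI-generated preferences train the reward model (RLAIF)."""
    principle, _ = random.choice(CONSTITUTION)
    return generate(
        f"Which response better satisfies: {principle}\n"
        f"(A) {response_a}\n(B) {response_b}\nAnswer A or B."
    )
```

The key design point is that human labels are replaced by the written principles plus the model's own critiques and preference judgments.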


== Models ==

=== Claude v1 ===

=== Claude v2 ===


== Criticisms ==
The Claude AI models have faced criticism from users and industry experts for their stringent ethical alignment, potentially reducing performance. This has led to a debate over the "alignment tax" in AI development, with discussions centered on balancing ethical considerations and practical functionality. Critics argue for user autonomy and effectiveness, while proponents stress the importance of ethical AI.<ref name=":2">{{Cite web |title=Criticisms Arise Over Claude AI's Strict Ethical Protocols Limiting User Assistance » Light Square » World News |url=https://lightsquare.org/news/criticisms-arise-over-claude-ais-strict-ethical-protocols-limiting-user-assistance |access-date=2024-01-23 |website=lightsquare.org |language=en}}</ref><ref>{{Cite web |title=Alignment Tax - AI Alignment Forum |url=https://www.alignmentforum.org/tag/alignment-tax |access-date=2024-01-23 |website=www.alignmentforum.org |language=en}}</ref><ref>{{Cite web |title=Alignment Tax - LessWrong |url=https://www.lesswrong.com/tag/alignment-tax |access-date=2024-01-23 |website=www.lesswrong.com |language=en}}</ref><ref>{{Citation |title=alignment tax |date=2023-09-01 |work=Wiktionary, the free dictionary |url=https://en.wiktionary.org/w/index.php?title=alignment_tax&oldid=75941048 |access-date=2024-01-23 |language=en}}</ref>

Users have reportedly been refused assistance with questions such as "how can I [[Kill (command)|kill]] all Python processes in my [[ubuntu]] server", a benign task that software engineers and system administrators routinely need to perform.<ref name=":2" />
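For context, the refused request corresponds to a routine administrative operation, typically done with `pkill -f python`. The sketch below illustrates roughly what that command does by reading `/proc` directly (Linux-only; the function name is our own, and this is illustrative rather than a recommended tool):

```python
import os
import signal

def kill_matching(pattern: str) -> list[int]:
    """Send SIGTERM to every process whose full command line contains
    `pattern` -- roughly what `pkill -f` does. Linux-only (reads /proc)."""
    killed = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        if int(pid) == os.getpid():
            continue  # never signal ourselves
        try:
            with open(f"/proc/{pid}/cmdline", "rb") as f:
                # cmdline is NUL-separated; join the arguments with spaces
                cmdline = f.read().replace(b"\x00", b" ").decode(errors="replace")
        except OSError:
            continue  # process already exited, or permission denied
        if pattern in cmdline:
            try:
                os.kill(int(pid), signal.SIGTERM)
                killed.append(int(pid))
            except OSError:
                continue  # raced with process exit
    return killed
```

Note that matching on a broad pattern like `"python"` would terminate every process whose command line mentions it, including unrelated services, which is why administrators treat the operation as routine but worth care.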

Revision as of 00:26, 23 January 2024
