Talk:Waluigi effect

From Wikipedia, the free encyclopedia

Sourcing

The Cleo Nardo post is interesting and very good... but... it's just Some Person On The Internet? I get that the arXiv paper cited it, which does help, but it was also written a year ago, which is quite some time in the world of LLMs. I imagine that OpenAI would disagree with some of the more grandiose claims about how attempting to align is actually counterproductive. Can something more recent but similarly in-depth back up its claims as still relevant?

(Also, there was an alleged quote from it before, but I found the quote nowhere in the cited source. Did the refs get swapped around and the quote was from somewhere else, perhaps?) SnowFire (talk) 05:56, 16 February 2024 (UTC)