User talk:Headbomb/unreliable
If you're curious about why a source is highlighted, first check common cleanup and non-problematic cases and limitations, which should answer most questions. Feel free to make requests below for various tweaks or for more sources to be covered, and I'll address things as best I can. − Headbomb {t · c · p · b}
A flaw in patterns
Hi again, Headbomb. As it stands, the current patterns unintentionally catch all domain names that include the target strings, e.g. there is cbn.com on your list, but it catches the unaffiliated cbn.com.cy. Perhaps adding a slash to each pattern would solve this? Daisy Blue (talk) 12:04, 22 February 2026 (UTC)
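The over-matching described above can be reproduced with a minimal sketch. It assumes the patterns are applied as unanchored regular expressions against the full URL; the script's actual matching code is not shown in this thread, so the pattern strings and the `matches` helper below are hypothetical. The stricter variant requires the domain to be followed by a URL delimiter or the end of the string, because a bare trailing slash would miss links with no path at all.

```python
import re

# Hypothetical patterns for illustration; not taken from the script itself.
loose = re.compile(r"cbn\.com")                  # matches anywhere in the URL
strict = re.compile(r"cbn\.com(?=[/:?#]|$)")     # domain must end here

urls = [
    "https://www.cbn.com/news/story",
    "https://www.cbn.com.cy/about",   # unaffiliated domain
    "https://cbn.com",                # no trailing slash
]

def matches(pattern: re.Pattern, url: str) -> bool:
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None

loose_hits = [u for u in urls if matches(loose, u)]
strict_hits = [u for u in urls if matches(strict, u)]

print("loose: ", loose_hits)
print("strict:", strict_hits)
```

Neither variant guards the left side of the domain, so a host such as abcbn.com would still match; that would need a separate boundary check.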
Source verification
Hi @Headbomb, I'm a long-time user of this tool and also a co-author of User:Alaexis/AI Source Verification. I'm wondering if you're planning any new features? Alaexis¿question? 13:05, 6 April 2026 (UTC)
- What new features exactly? Headbomb {t · c · p · b} 14:07, 6 April 2026 (UTC)
- I was just asking if you have some kind of a roadmap :) Alaexis¿question? 21:20, 6 April 2026 (UTC)
Blogspot.com
There are versions of Blogspot links ending with ccTLDs (like .ca or .de) as well, e.g. *.blogspot.ca and *.blogspot.de exist. My proposal is to replace "blogspot\.com" with "blogspot(\.\w+)+". Alfa-ketosav (talk) 11:31, 16 April 2026 (UTC)
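As a quick check of the proposal above, here is a small sketch comparing the two patterns quoted in the comment against a few made-up hostnames. It assumes the patterns are used as unanchored regular expressions; how the script actually applies them is not stated here.

```python
import re

# Both pattern strings are quoted from the proposal above.
current = re.compile(r"blogspot\.com")
proposed = re.compile(r"blogspot(\.\w+)+")

# Hypothetical hostnames for illustration only.
hosts = [
    "example.blogspot.com",
    "example.blogspot.ca",
    "example.blogspot.de",
    "example.blogspot.co.uk",
]

current_hits = [h for h in hosts if current.search(h)]
proposed_hits = [h for h in hosts if proposed.search(h)]

print("current: ", current_hits)
print("proposed:", proposed_hits)
```

One caveat: the proposed pattern accepts any dotted suffix after "blogspot", so an unrelated host such as blogspot.example.org would match as well. Restricting the suffix to known country-code endings would be tighter, at the cost of maintaining that list.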
AI hallucinated sources
Hi Headbomb, quick question on scope: is there any plan for UPSD to help detect AI-hallucinated sources, for example made-up ISBNs or legitimate-looking source links that just go to 404s in newly expanded or destubbed articles? Is that something UPSD might eventually cover, or does it really belong in a separate tool? HerBauhaus · talk 11:35, 29 April 2026 (UTC)
- There's no way for the script to detect those. There's no difference between an AI-hallucinated source, a human-made non-existent source, and a real source that's simply hard to find. For example, if I tell you to check out Kurgzwell, Bob-Étienne (1867). The Book of Women With Horrible Ugly Moles on their Left Knee. Alan & Francis. pp. 67–85. LCCN 80-570., did I make it up? Did an AI hallucinate it? How is someone supposed to know this doesn't exist or that I've made up the LCCN, making sure it has a correct checksum? Headbomb {t · c · p · b} 21:25, 29 April 2026 (UTC)
- The made-up ISBNs, specifically, can often be detected through Category:CS1 errors: ISBN and through the error message "Check |isbn= value" produced by the citation templates for these errors. You might be careful to make up a fake id with a correct checksum, but that is a level of care often not taken by the LLMs. But because the citation templates already flag these I don't see a lot of motivation for adding them to this script. —David Eppstein (talk) 22:58, 29 April 2026 (UTC)
- Thanks, that makes sense. For the narrower problem of reliable-looking source links that are dead, broken, or point somewhere unrelated, is there already a tool for that, or is it still mostly a manual check? HerBauhaus · talk 05:46, 30 April 2026 (UTC)
- There are bots that tag deadlinks, but I don't know whether there is any automation that finds wronglinks. —David Eppstein (talk) 07:22, 30 April 2026 (UTC)
- Checked, LCCN 80-570 doesn't exist. Alfa-ketosav (talk) 18:50, 7 June 2026 (UTC)
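For reference, the ISBN checksum point raised in this thread can be illustrated with a minimal sketch of the standard ISBN-10 and ISBN-13 check-digit rules. This is not the citation templates' own code; the function name `isbn_checksum_ok` is hypothetical, and the sample numbers are textbook examples rather than identifiers from any article.

```python
DIGITS = "0123456789"

def isbn_checksum_ok(raw: str) -> bool:
    """Check the ISBN-10 or ISBN-13 check digit; hyphens and spaces are ignored."""
    chars = [c for c in raw.upper() if c not in "- "]

    if len(chars) == 13 and all(c in DIGITS for c in chars):
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; total must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0

    if (len(chars) == 10
            and all(c in DIGITS for c in chars[:9])
            and (chars[9] in DIGITS or chars[9] == "X")):
        # ISBN-10: weights run 10 down to 1; a final "X" stands for 10.
        values = [int(c) for c in chars[:9]]
        values.append(10 if chars[9] == "X" else int(chars[9]))
        total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
        return total % 11 == 0

    return False

print(isbn_checksum_ok("978-0-306-40615-7"))  # well-formed ISBN-13
print(isbn_checksum_ok("978-0-306-40615-8"))  # last digit altered
print(isbn_checksum_ok("0-306-40615-2"))      # well-formed ISBN-10
```

A passing checksum only shows that the number is well-formed. As noted in the discussion, it says nothing about whether the cited book exists.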