Wikipedia talk:Database reports

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Legoktm (talk | contribs) at 05:11, 23 February 2024 (→‎Wikipedia:Database reports/Unused templates (filtered) update related to Module:Pagetype: Reply). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Requests: Please list any requests for reports below in a new section. Be as specific as possible, including how often you would like the report run.

Halebot request

Not sure why Halebot is adding a colon to the front end of the Lalbijo2020 file's link on this page, but doing so is a syntax error that has effectively been culled from Wikipedia, and it would be nice for various gnomes to not need to fix it weekly. If the bot or that entry could be modified to keep this from occurring each update, that would be great. Thanks. Zinnober9 (talk) 18:26, 18 August 2023 (UTC)[reply]

done 0xDeadbeef→∞ (talk to me) 12:37, 11 September 2023 (UTC)[reply]

Format change request

At Wikipedia:Database reports/Linked miscapitalizations, the inclusion of the "number" column makes it very hard to see what the diffs are, since the diff tends to align numbers rather than names. It would be equivalent, I think, to be able to sort on article name, rather than the number. Is there any reason to not just do away with the number column? Dicklyon (talk) 16:39, 21 August 2023 (UTC)[reply]

Using {{static row numbers}}, like this, is an easy way to get better diffs. – Jonesey95 (talk) 20:00, 21 August 2023 (UTC)[reply]
That template should really be the default for all reports. Gonnym (talk) 21:25, 21 August 2023 (UTC)[reply]
Can someone explain that magic? Dicklyon (talk) 04:12, 22 August 2023 (UTC)[reply]
Never mind, I now see what it's doing. Looks perfect. Dicklyon (talk) 16:04, 22 August 2023 (UTC)[reply]
@Gonnym: Yeah, MZ said the same a while back. If someone wants to send a PR enabling it for this report that would be appreciated. (Or, if you're feeling courageous, flipping the default.) Otherwise I'll get to it...later. Legoktm (talk) 19:25, 22 August 2023 (UTC)[reply]

HaleBot operator, is this a change you'd be willing to make? Dicklyon (talk) 16:04, 22 August 2023 (UTC)[reply]

@HaleBot: Maybe just do away with the number column? And add a summary at the top with the number of articles and total number of links, which would be good for progress tracking? Dicklyon (talk) 05:56, 4 October 2023 (UTC)[reply]

You know what, I just deployed my change to the codebase that would make all eligible reports use static row numbers. I was holding off waiting for Legoktm to review it first, but since he's on a wikibreak... I haven't thoroughly tested it, so this might break some reports; let me know and I'll fix it. 0xDeadbeef→∞ (talk to me) 12:27, 4 October 2023 (UTC)[reply]
Well, I didn't look at his contributions when I wrote that, it looks like he's back. It should be fine though. 0xDeadbeef→∞ (talk to me) 12:29, 4 October 2023 (UTC)[reply]
 Implemented, see Special:Diff/1178592838 0xDeadbeef→∞ (talk to me) 06:31, 6 October 2023 (UTC)[reply]

Hello,

The above page stopped updating a couple of months ago, any chance we could get it going again?

Many thanks! Jdcooper (talk) 13:30, 24 August 2023 (UTC)[reply]

Jdcooper, I left a note for Cewbot's operator, Kanashimi. It may take a few days; it looks like they edit fairly regularly, but not every day. BlackcurrantTea (talk) 13:12, 25 August 2023 (UTC)[reply]
There's now an updated report. Happy editing! BlackcurrantTea (talk) 14:05, 26 August 2023 (UTC)[reply]

Linked miscapitalizations includes pages tagged with R avoided double redirect

Dicklyon recently removed the R avoided double redirect rcat from Novint falcon as it was listed in the linked miscapitalizations report. I reverted that edit, as it seems like this is the situation where {{R avoided double redirect}} is supposed to be used; if the correctly-capitalized Novint Falcon redirect were expanded into a full article, the other one would need to be changed to a link to that new article.

Special:WhatLinksHere/Novint falcon shows Novint falcon as transcluding itself, which I assume comes from {{R avoided double redirect}} (Module:R avoided double redirect verifies that the current article's redirect destination matches the specified article's redirect destination, which I guess must be listed as transcluding). (The edit also removed the parameter to {{R from miscapitalisation}} but that one was a link to the correctly capitalized form so I don't think it was the cause.)

I think that all that would be needed to fix this is to change the query on both linked miscapitalizations and linked misspellings to include a p1 != p2 check; if a redirect is linking or transcluding itself it's probably fine. However, I'm not sure how to actually make this change or if there's another aspect to this I'm not aware of, or even who's responsible for maintaining the code that updates these reports. --Pokechu22 (talk) 21:50, 9 September 2023 (UTC)[reply]

Pokechu22, looking at 'what links here' for most things will list a self-transclusion (for lack of a better term). For example, Drought tolerance in barley transcludes Drought tolerance in barley. It does look odd; I don't remember the explanation for it. You might find one in the archives of the technical Village pump. BlackcurrantTea (talk) 05:36, 10 September 2023 (UTC)[reply]
BlackcurrantTea, hmm. I assume those must come from other templates too then. I notice that Wikipedia:Example of a redirect has a self-transclusion, but 48 hours and 48 hours to life don't. Another interesting set of examples is 2. Divisjon and Talk:2. Divisjon; only the talkspace one has a self-transclusion (but the mainspace one is also transcluded by the talkspace one). So I guess that means that this is a more general issue.
Still, it seems to me that the database report shouldn't count self-transclusion, which is the main issue here. --Pokechu22 (talk) 05:46, 10 September 2023 (UTC)[reply]
The operators of HaleBot would be the best people to ask. Legoktm's user page says they're intermittently available until later this month. Perhaps 0xDeadbeef can help. BlackcurrantTea (talk) 09:57, 11 September 2023 (UTC)[reply]
done and should be deployed now. 0xDeadbeef→∞ (talk to me) 14:14, 11 September 2023 (UTC)[reply]
@0xDeadbeef: Thanks! I think the same change also needs to be made on the linked misspellings report too since it has similar logic, though I'm not 100% sure of this. --Pokechu22 (talk) 18:24, 11 September 2023 (UTC)[reply]
Done and deployed. 0xDeadbeef→∞ (talk to me) 10:17, 12 September 2023 (UTC)[reply]
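
The p1 != p2 check proposed above can be illustrated with a minimal sketch. This is not the actual HaleBot query: the links table, its source and target columns, the sample rows, and the linked_redirects helper are all hypothetical stand-ins, and SQLite is used here only so the example runs on its own.

```python
import sqlite3

# Hypothetical, simplified stand-in for the link data the report reads.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE links (source TEXT, target TEXT);
INSERT INTO links VALUES
  ('Novint falcon', 'Novint falcon'),
  ('Article A', 'Novint falcon'),
  ('Article B', 'Cityrail');
""")

def linked_redirects(conn):
    """Count incoming links per target, ignoring pages that link to themselves."""
    return conn.execute("""
        SELECT target, COUNT(*) AS n
        FROM links
        WHERE source != target      -- the p1 != p2 check
        GROUP BY target
        ORDER BY target
    """).fetchall()

print(linked_redirects(conn))
```

With this sample data the self-reference on Novint falcon is not counted, so only the link from the hypothetical Article A contributes to its total.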

Another case of unneeded listing comes about via redirect tags such as {{Redirect|Cityrail|the former New Zealand rail operator|Tranz Metro}} at CityRail. Can that be fixed to take Cityrail out of the report? Dicklyon (talk) 17:38, 12 September 2023 (UTC)[reply]

That would probably need us to remove "transclusion" type links with SQL queries, which needs some investigation on what needs to be done. I'm quite busy right now, so feel free to put up a pull request if you are able to implement this. 0xDeadbeef→∞ (talk to me) 11:53, 13 September 2023 (UTC)[reply]
No, I'd have no idea how to implement in that space. Anyone else? Dicklyon (talk) 22:10, 13 September 2023 (UTC)[reply]

Working on the middle

If you look at the list Wikipedia:Database reports/Linked miscapitalizations sorted by number of links, you typically see a whole lot with just one link, and then a bunch with 10 or more. That's because I'm focusing on the ones with 2 to 9 links. The ones with just 1 link accumulate as an indication of what's happening recently. The ones with a lot of links need someone with AWB or JWB to handle efficiently. For the ones with a few links, editing the linking articles in tabs is efficient enough. Dicklyon (talk) 03:52, 7 October 2023 (UTC)[reply]

Database reports from searches

I don't know if there's any way to do this efficiently, but there are a couple searches I have devised that reliably turn up a lot of busted formatting. They are not obtained by querying the database directly, but is there any way to get them on a page such as these? Here are a couple examples:

  • [1], which is insource:/\[1\]\[2\]/ in mainspace, i.e. the string "[1][2]" appearing in the page's source. This almost always means that someone has messed something up and copypasted a sentence from their browser into the edit window, destroying references.
  • insource:/\<sup\>\{\{.itation needed/ in mainspace. This detects when someone has used {{citation needed}} in superscript tags.
  • The big daddy of them all: insource:"citation needed" -insource:"needed|date" -insource:"needed|reason" -insource:/\{\{.itation .eeded\}\}/ -insource:"needed span" -insource:"needed lead" -insource:"needed paragraph" -insource:"needed section" -insource:/on-ne/ -insource:/ded \(Wi/ in mainspace. This gives busted {{cn}} attempts, where somebody just typed "[citation needed]" or "(citation needed)" etc into an article instead of invoking the template. I have a huge regex to fix a few dozen of the most common types of this error in my JWB settings.

Et cetera, et cetera. Usually I fix these myself from JWB but I feel like others would enjoy helping with this as well. Is there a way to set up a bot to do search reports for stuff like these? jp×g🗯️ 22:08, 29 October 2023 (UTC)[reply]

Hi @JPxG: I think we can just have a page that is a collection of these search links and maybe have a bot that updates the hit count daily (to track the approximate number of pages)? The search function gives instant results, which is probably preferable over a page updated periodically by bots. 0xDeadbeef→∞ (talk to me) 10:09, 30 October 2023 (UTC)[reply]
We can query the search index replicas via Toolforge (see wikitech:Help:CirrusSearch elasticsearch replicas). It exposes some features which are not available from the web UI search. toolforge:global-search is one of the few tools that use it, but doesn't seem to expose the extra features, and doesn't provide a way to restrict results to enwiki. Would it be useful to have a {{search report}} template analogous to {{database report}}? – SD0001 (talk) 11:09, 30 October 2023 (UTC)[reply]
Oooh, given that elasticsearch replicas exist (TIL!) it would be nice if we can make use of the extra features. Though if the web UI search is sufficient in some cases I still don't think bot reports would be necessarily beneficial? 0xDeadbeef→∞ (talk to me) 11:37, 30 October 2023 (UTC)[reply]
Rethinking this, it's actually probably quite beneficial to have a community-maintained list of search queries where a bot would come by and update periodically. It's better at tracking stuff and makes it easier for editors to navigate. 0xDeadbeef→∞ (talk to me) 08:59, 23 November 2023 (UTC)[reply]
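
For readers who want to see what the first two searches above catch, here is a rough local sketch using Python's re module. It only approximates the on-wiki behaviour: insource: regexes use the search engine's own regex syntax rather than Python's, the third pattern is a deliberately simplified stand-in for the long "big daddy" query, and the PATTERNS labels and find_problems helper are hypothetical names.

```python
import re

# Approximate Python equivalents of the searches described above.
PATTERNS = {
    # "[1][2]" pasted from a rendered page instead of real references
    "pasted reference markers": re.compile(r"\[1\]\[2\]"),
    # {{citation needed}} wrapped in superscript tags
    "citation needed inside <sup>": re.compile(r"<sup>\{\{.itation needed"),
    # Simplified assumption: plain "[citation needed]" or "(citation needed)" text
    "bare citation-needed text": re.compile(r"[\[(]citation needed[\])]", re.IGNORECASE),
}

def find_problems(wikitext):
    """Return the labels of every pattern that matches the given wikitext."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(wikitext)]

samples = [
    "The sky is blue.[1][2] It is also large.",
    "Water is wet.<sup>{{Citation needed}}</sup>",
    "Source?[citation needed]",
    "Fine text.{{citation needed|date=May 2023}}",
]
for text in samples:
    print(find_problems(text), "<-", text)
```

A properly formed template call such as the last sample matches none of the patterns, which is the point of the negative terms in the original search.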

Top new article reviewers report code needs to be updated

There was a recent change to PageTriage, where the logging of reviews is split based on whether the target is an article or a redirect. This is causing the Wikipedia:Database reports/Top new article reviewers report to give wrong results. Please change any queries in the code to replace instances of log_action = 'reviewed' with log_action in ('reviewed', 'reviewed-article', 'reviewed-redirect'). This should fix the problem. -MPGuy2824 (talk) 03:30, 9 November 2023 (UTC)[reply]

cc @MusikAnimal, who is probably the maintainer of Community Tech bot, which generates that report. –Novem Linguae (talk) 06:01, 9 November 2023 (UTC)[reply]
@MPGuy2824 @Novem Linguae Thanks for the ping! Partially fixed with 80b6552, but I think the counting of redirects is still sort of broken. My understanding (please correct me if I'm wrong): For historical data, we need to still go by page.page_is_redirect, but for where data is available, we should sum where log_action = 'reviewed-redirect'. Is that correct? MusikAnimal talk 15:49, 10 November 2023 (UTC)[reply]
Yes, that is correct. Since this report calculates data over the previous 365 days, we can remove the code that takes care of historical data only after that time. I've set a reminder for myself via W-Ping. Thanks for the quick fix, btw. -MPGuy2824 (talk) 03:11, 11 November 2023 (UTC)[reply]
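
As a minimal sketch of the change discussed in this section, the following counts reviews per reviewer using the widened log_action list and separately sums the 'reviewed-redirect' entries. It is not the Community Tech bot code: the logging table here is a simplified stand-in with a hypothetical reviewer column, the sample rows are invented, and the page.page_is_redirect fallback for historical data is left out.

```python
import sqlite3

# The three action names come from the request above.
REVIEW_ACTIONS = ("reviewed", "reviewed-article", "reviewed-redirect")

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-in for the real logging table.
conn.execute("CREATE TABLE logging (reviewer TEXT, log_action TEXT)")
conn.executemany("INSERT INTO logging VALUES (?, ?)", [
    ("Alice", "reviewed"),
    ("Alice", "reviewed-article"),
    ("Alice", "reviewed-redirect"),
    ("Bob", "reviewed-redirect"),
    ("Bob", "unreviewed"),
])

def top_reviewers(conn):
    """Return (reviewer, total reviews, redirect reviews) for each reviewer."""
    return conn.execute("""
        SELECT reviewer,
               COUNT(*) AS total,
               SUM(log_action = 'reviewed-redirect') AS redirects
        FROM logging
        WHERE log_action IN (?, ?, ?)
        GROUP BY reviewer
        ORDER BY reviewer
    """, REVIEW_ACTIONS).fetchall()

print(top_reviewers(conn))
```

A query that still filtered on log_action = 'reviewed' alone would count only one of Alice's three reviews here and none of Bob's.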

/Blocked users in user group

Good evening fellow Wikipedians. The database report above has not been updated since October 6 of last year. The bot that was updating it, BernsteinBot (talk · contribs), hasn't edited since October 12, 2022. Should we archive the report or get another bot to take over the updating? Toadette (Happy Thanksgiving!) 18:34, 22 November 2023 (UTC)[reply]

I suppose I can finally get around to looking into how {{database report}} works (no interest in running a bot ever again after the way mine was treated). The query in the configuration is severely out of date - besides the schema changes, it doesn't cull the extendedconfirmed group, currently at 5867 blocks, and I'm sure it was close to that when BernsteinBot was still running - but that's easy enough to fix. —Cryptic 19:07, 22 November 2023 (UTC)[reply]
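
A rough sketch of what culling the extendedconfirmed group could look like is given below. It assumes heavily simplified, hypothetical tables (user_groups keyed by user name and a one-column blocks table) rather than the current MediaWiki schema, and blocked_users_in_groups is an illustrative helper, not the report's configured query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables; the real schema differs.
conn.executescript("""
CREATE TABLE user_groups (ug_user TEXT, ug_group TEXT);
CREATE TABLE blocks (blocked_user TEXT);
INSERT INTO user_groups VALUES
  ('UserA', 'extendedconfirmed'),
  ('UserA', 'rollbacker'),
  ('UserB', 'extendedconfirmed'),
  ('UserC', 'patroller');
INSERT INTO blocks VALUES ('UserA'), ('UserB');
""")

def blocked_users_in_groups(conn, skip_groups=("extendedconfirmed",)):
    """List (user, group) pairs for blocked users, leaving out the skipped groups."""
    placeholders = ", ".join("?" for _ in skip_groups)
    return conn.execute(f"""
        SELECT ug.ug_user, ug.ug_group
        FROM user_groups AS ug
        JOIN blocks AS b ON b.blocked_user = ug.ug_user
        WHERE ug.ug_group NOT IN ({placeholders})
        ORDER BY ug.ug_user, ug.ug_group
    """, tuple(skip_groups)).fetchall()

print(blocked_users_in_groups(conn))
```

In this sample, a blocked user whose only group is extendedconfirmed drops out of the list entirely, which is what keeps the report from being swamped by thousands of such rows.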

If you look at the history of Wikipedia:Database reports/Unused templates (filtered)/1, you can see the size of the report jumping up and down from day to day, starting on 15 November. It should be much more steady. Pages appear on and disappear from the report for no apparent reason. Clues or fixes are welcome. – Jonesey95 (talk) 15:05, 23 November 2023 (UTC)[reply]

Anyone? Pinging Legoktm and 0xDeadbeef, the listed operators of HaleBot, the user that updates this report. – Jonesey95 (talk) 16:32, 28 November 2023 (UTC)[reply]
And now the report has stopped updating. No updates since 2 December. Help? – Jonesey95 (talk) 16:44, 5 December 2023 (UTC)[reply]
Ugh, that's wild. It might be a few days before I can look in depth. I wonder if one of the DB replicas is out of sync with the others...or maybe something changed and our query is just busted now. Legoktm (talk) 07:29, 6 December 2023 (UTC)[reply]
This still isn't working properly. Any idea what the issue is? Gonnym (talk) 09:07, 18 December 2023 (UTC)[reply]
OK so I tracked down phab:T354089, which seems to be that the replica has fallen out of sync with production, causing some weirdness, but there's more to the story, I'm still debugging. Legoktm (talk) 04:45, 29 December 2023 (UTC)[reply]
@Jonesey95, @Gonnym: I've applied a fix to the query logic, I'm not sure if this will fully address the issue but it should surface some more unused templates. Legoktm (talk) 04:53, 29 December 2023 (UTC)[reply]
There are 1,675 templates listed on the report at this writing, which is probably about the right number. We'll see if it fluctuates into the 200–300 range, as it has been doing, or if it stays relatively stable. Thanks for continuing to track down this strange problem. It's challenging to debug a problem when you are not convinced that you have found the actual cause of the problem. – Jonesey95 (talk) 14:13, 29 December 2023 (UTC)[reply]
@Jonesey95, et al: how do the updates over the past few days look - are we OK to call this resolved? Legoktm (talk) 06:32, 3 January 2024 (UTC)[reply]
So far seems good. Thanks! Gonnym (talk) 09:12, 3 January 2024 (UTC)[reply]
Yes, the updates appear to be working correctly. I check them daily. Thanks! – Jonesey95 (talk) 14:57, 3 January 2024 (UTC)[reply]

Wikipedia:Database reports/Uncategorized templates typically updates once a week on Monday. It is now 25 hours overdue. HaleBot's talk page redirects to this page. Pinging Legoktm and 0xDeadbeef, the listed operators. – Jonesey95 (talk) 17:20, 5 December 2023 (UTC)[reply]

Sorry, this is my fault. Should be fixed now and I just kicked off a run. I'll be back online in like 6 hours in case it didn't work to debug further. Legoktm (talk) 21:06, 5 December 2023 (UTC)[reply]
Legoktm, There are numerous weekly reports that haven't updated since November 26/27. --DB1729talk 01:15, 6 December 2023 (UTC)[reply]
Yes, I believe he fixed it for all reports. 0xDeadbeef→∞ (talk to me) 01:21, 6 December 2023 (UTC)[reply]
Great, and so does someone need to "kick off a run" for each of them now? DB1729talk 01:25, 6 December 2023 (UTC)[reply]
I believe when he said "kicked off a run" he meant for all reports. 0xDeadbeef→∞ (talk to me) 01:38, 6 December 2023 (UTC)[reply]
Ok thanks. I can be patient:) I only mentioned it because the one discussed above, Wikipedia:Database reports/Uncategorized templates, updated several hours ago, while the others have not yet updated. DB1729talk 01:47, 6 December 2023 (UTC)[reply]
And some daily reports have not updated since Dec. 2. Hopefully the same fix will have them back on track. Dicklyon (talk) 03:10, 6 December 2023 (UTC)[reply]
There are still some issues I'm debugging, but more reports should be updating now... Legoktm (talk) 06:27, 6 December 2023 (UTC)[reply]
OK, I think all the reports are up to date, except the article streak ones. If anything did not get an update, please let me know and I can look again when I wake up in a few hours. Legoktm (talk) 07:31, 6 December 2023 (UTC)[reply]
Thank you!:) DB1729talk 11:13, 6 December 2023 (UTC)[reply]
@Legoktm - Wikipedia:Database reports/Orphans with incoming links has stopped running, and last ran 03:00, 9 December 2023. JoeNMLC (talk) 19:53, 16 December 2023 (UTC)[reply]
JoeNMLC, it's up to date now. That report's done by DannyS712 bot, run by DannyS712. If it happens again, he's probably the best person to contact. BlackcurrantTea (talk) 08:54, 17 December 2023 (UTC)[reply]

Wikipedia:Database reports/Unused templates (filtered) update related to Module:Pagetype

A recent change to Module:Pagetype has caused some pages to register a self transclusion (but they are still unused). Can Wikipedia:Database reports/Unused templates (filtered) be modified to now check if the template's only transclusion is itself and if so keep it on the report? Gonnym (talk) 12:39, 6 February 2024 (UTC)[reply]

Looking for a template with no transclusions is much easier than just looking for one that happens to be a self-transclusion...I'm thinking of how to restructure the SQL query to accommodate this, if anyone wants to propose a better query that handles this, please do. Legoktm (talk) 04:02, 16 February 2024 (UTC)[reply]
I don't see why it would need it? Just add a clause to the templatelinks join; you already have the template page's page_id. quarry:query/80586. Also note the backslashes in the LIKEs; underscore is a metacharacter. —Cryptic 06:17, 22 February 2024 (UTC)[reply]
And quarry:query/80588 lets you get rid of the postprocessing and all those secondary queries. —Cryptic 07:01, 22 February 2024 (UTC)[reply]
@Cryptic: awesome, I'm glad you're better at SQL than me :) Would you like to submit a PR with your improved query? Otherwise I'll get to it shortly. Legoktm (talk) 05:11, 23 February 2024 (UTC)[reply]
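
The two points made in this thread, an extra clause on the templatelinks join and escaping underscores in LIKE, can be sketched as follows. This is not the content of the linked Quarry queries: the tables are simplified stand-ins (in particular, tl_target holding a title directly is an assumption made for brevity), the sample pages are invented, and SQLite needs an explicit ESCAPE clause for the backslash to act as an escape character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical stand-ins for page and templatelinks.
# Namespace 10 is Template; namespace 0 is the article namespace.
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE templatelinks (tl_from INTEGER, tl_target TEXT);
INSERT INTO page VALUES
  (1, 10, 'Used_box'),
  (2, 10, 'Self_only'),
  (3, 10, 'Never_used'),
  (4, 0,  'Some_article'),
  (5, 10, 'UsedXbox');
INSERT INTO templatelinks VALUES
  (4, 'Used_box'),    -- transcluded by an article
  (2, 'Self_only');   -- only transclusion is the template itself
""")

def unused_templates(conn):
    """Templates with no transclusions other than, possibly, themselves."""
    return [row[0] for row in conn.execute("""
        SELECT p.page_title
        FROM page AS p
        LEFT JOIN templatelinks AS tl
               ON tl.tl_target = p.page_title
              AND tl.tl_from != p.page_id   -- ignore self-transclusion
        WHERE p.page_namespace = 10
          AND tl.tl_from IS NULL
        ORDER BY p.page_title
    """)]

def titles_like(conn, pattern):
    """Match titles with LIKE, treating backslash as the escape character."""
    return [row[0] for row in conn.execute(
        "SELECT page_title FROM page WHERE page_title LIKE ? ESCAPE '\\' "
        "ORDER BY page_title", (pattern,))]

print(unused_templates(conn))
print(titles_like(conn, "Used_box"))     # underscore acts as a one-character wildcard
print(titles_like(conn, r"Used\_box"))   # escaped underscore matches literally
```

The unescaped pattern also matches the hypothetical UsedXbox title, which illustrates why the backslashes in the LIKE patterns matter.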

HaleBot has not edited for a couple of days

I'm not panicking yet, but HaleBot has not edited for a couple of days. Over 48 hours, if my math is right. It averages about 45 edits per day, so a two-day break is unusual. – Jonesey95 (talk) 05:01, 22 February 2024 (UTC)[reply]

See T358175. It's trivial to restart, but I've left it in a broken state in case it makes it easier for Toolforge admins to diagnose the underlying root cause. Legoktm (talk) 05:23, 22 February 2024 (UTC)[reply]