Wikipedia:Bots/Noticeboard

This is an old revision of this page, as edited by Scottywong (talk | contribs) at 22:26, 7 June 2022 (→‎MalnadachBot and watchlists). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    Bots noticeboard

    Here we coordinate and discuss Wikipedia issues related to bots and other programs interacting with the MediaWiki software. Bot operators are the main users of this noticeboard, but even if you are not one, your comments will be welcome. Just make sure you are aware of our bot policy and know where to post your issue.


    The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


    It was raised at WP:BN (diff) that ProcseeBot has not performed any logged actions (i.e. blocks) since November 2020 (log). Given that the bot is not high-profile I'm not really surprised that its inactivity managed to pass under the radar of probably everyone except xaosflux, since they've been removing the bot's name from the inactive admins report for a while. That being said, Slakr seems to have become somewhat inactive as of late, and pppery has suggested the bot be stripped of its rights. Since its activity is primarily a bot-related task and not an admin-related task, I'm bringing it here for review. I have left Slakr a talk page note about this discussion. Primefac (talk) 07:29, 5 May 2022 (UTC)[reply]

    I feel like for security reasons we can probably apply the usual activity requirements to just the bot (rather than including if the operator is active). If an adminbot hasn't logged an admin action for a year it probably shouldn't be flagged as such and a crat can always reflag if it ever needs to be active again. Galobtter (pingó mió) 07:43, 5 May 2022 (UTC)[reply]
    @Primefac I did contact Slakr about this a few months ago (User_talk:Slakr/Archive_26#ProcseeBot), where they indicated it may be reactivated, which is why I have been skipping it during removals (as its admin access is an extension of its operator's, who is still an admin). So policy-wise, I think we are fine. Shifting off my 'crat hat and putting on my BAG hat - yes I think we should deflag inactive adminbots; their operators can always ask at BN to reinstate so long as the bot hasn't been deauthorized. — xaosflux Talk 09:36, 5 May 2022 (UTC)[reply]
    I did figure you contacted them, and from a BAG perspective "it might be reactivated soon" is always good enough to leave things be. Shifting to my own 'crat hat, though, a temporary desysop until it's back up and running is reasonable, especially since it's been 1.5 years. Courtesy ping to ST47 who runs ST47ProxyBot. Primefac (talk) 09:47, 5 May 2022 (UTC)[reply]
    • I don't think we should make a redline rule on this; if these rare cases arise, a BOTN discussion like this is the best way to deal with things. In this case, barring a response from the operator within a week saying that this is going to be activated within the month, my position is that we should desysop the bot. — xaosflux Talk 09:53, 5 May 2022 (UTC)[reply]
      For the record I never intended this as any sort of rule-creating; we're discussing a singular bot. Primefac (talk) 10:06, 5 May 2022 (UTC)[reply]
    • I think removing advanced perms from inactive bots is a good idea, and allowing them to be returned on-request if the botop wants to reactivate the bot (as long as the approval is still valid). ProcrastinatingReader (talk) 09:56, 5 May 2022 (UTC)[reply]
    • So, does anyone intend to implement the unanimous agreement here? * Pppery * it has begun... 19:13, 11 May 2022 (UTC)[reply]
      I'll ask over at BN. — xaosflux Talk 19:18, 11 May 2022 (UTC)[reply]
      WP:BN request opened. — xaosflux Talk 19:22, 11 May 2022 (UTC)[reply]
    The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

    Bot that fixes links to nonexistent category pages?

    Just wondering: Is there a bot that currently performs edits related to nonexistent category pages, such as removing the links from articles or creating the category? (Preferably the former?) Steel1943 (talk) 18:35, 6 May 2022 (UTC)[reply]

    @Steel1943: The maintenance list Special:WantedCategories is typically very, very short these days; I think this is done by a few of our category specialists. I don't know whether they do this mostly by hand, but I wouldn't be surprised: you need to triage whether this is the result of a typo or vandal edit, a category deletion where articles or templates were not adjusted properly, or shows an actual need for the redlinked category. —Kusma (talk) 20:45, 11 May 2022 (UTC)[reply]

    Controversy About Report Being Generated by Bot

    There is a deletion discussion at MFD which is really a bot issue. A bot is generating a report that appears to be a hierarchical list of deleted categories. Another editor has requested that the list be deleted, as an evasion of deletion policy.

    User:Qwerfjkl, who operates the bot and coded the task to generate the list, says that this has been coordinated with User:Enterprisey. User:Pppery says that the list should be deleted. I haven't studied the issue to have an opinion on whether the list should continue to be generated, or whether the bot task that generates the list should be turned off. However, I don't think that MFD is an appropriate forum to decide whether the bot should be generating the list. If the list is deleted, then the bot will generate a new version of the list, and Pppery says that they will tag the new version of the list as G4. Then maybe after that is done twice, the title may be salted, and the bot may crash trying to generate it. That doesn't seem like the way to solve a design issue. The question is whether there is a need for the bot to be producing the list. If so, leave it alone. If not, stop the bot. If this isn't the right forum either, please let us know where is the right forum, because it is my opinion that MFD is not the right forum. Robert McClenon (talk) 23:37, 17 May 2022 (UTC)[reply]

    Per Wikipedia:Bot policy, if you are concerned that a bot no longer has consensus for its task, you may formally appeal or ask for re-examination of a bot's approval. The policy links to this noticeboard for initiating an appeal discussion. I see BOTN as the appropriate venue for this. 🐶 EpicPupper (he/him | talk) 23:44, 17 May 2022 (UTC)[reply]
    My first inclination was to agree, but tasks that run under the policy exemption, as this one does, seem to be outside BAG's purview. As a practical matter, I think it's better for the community to directly decide (in a non-BON area) whether the task enjoys consensus. Even in bot appeals it helps to have the result of a relevant consensus process on the task (usually RfC). Userspace tasks may not require pre-approval, but as with any editing behaviour I think consensus is still able to put a halt to it if people find it to be problematic. ProcrastinatingReader (talk) 01:04, 18 May 2022 (UTC)[reply]
    The bot appears to be operating under WP:BOTUSERSPACE. There's no approval to review. Whether a BAG member was involved in the discussion that led to the creation of the bot has no weight. I'm not sure whether MFD is the right forum (versus say reopening the VPR discussion that led to the task in the first place), but it's better than here. Anomie 01:59, 18 May 2022 (UTC)[reply]
    MFD is a silly forum in which to discuss a bot task. A Delete would mean to throw away the output from the bot, rather than to stop the bot task as such. If the editors here think that Bot noticeboard is also the wrong forum, then maybe the bot should be allowed to continue to generate the list.
    I started out not having an opinion, and now have an opinion that the MFD is misguided.
    Thank you for your comments. Robert McClenon (talk) 02:31, 19 May 2022 (UTC)[reply]
    If MFD decides that the content shouldn't exist, then WP:G4 would apply and admins would be justified in taking appropriate action to prevent the bot from recreating it. The oddness comes from whether MFD is the appropriate forum for overriding the original VPR discussion. Anomie 11:48, 19 May 2022 (UTC)[reply]
    @Anomie, I'm willing to shutdown the bot if consensus is against it. ― Qwerfjkltalk 16:05, 19 May 2022 (UTC)[reply]
    • On first pass, so long as this is low volume it doesn't seem to be in direct violation of the bot policy as it is in userspace. That doesn't mean that it is appropriate, or that it isn't disruptive. Would like to hear some feedback from the operator. — xaosflux Talk 14:09, 19 May 2022 (UTC)[reply]
      Operator notified. — xaosflux Talk 14:10, 19 May 2022 (UTC)[reply]
      I'm not sure what the best venue to deal with this is, but my initial feeling is that this is a bad idea, mostly because the bot keeps making pages that it seems no one is reading, then requesting that the same page be deleted - making needless work for admins who have to constantly clean up after it. — xaosflux Talk 14:14, 19 May 2022 (UTC)[reply]
      Can someone point to the VPR discussion that is being mentioned above? — xaosflux Talk 14:16, 19 May 2022 (UTC)[reply]
      OK, seems this is Wikipedia:Village_pump_(proposals)/Archive_187#Automatically_generate_a_record_of_the_contents_of_deleted_categories_at_the_time_of_deletion - which I don't really see as representative of any strong community consensus - seems like it just sort of died out. — xaosflux Talk 14:20, 19 May 2022 (UTC)[reply]
    • Comment - This seems to be getting more complicated. However, having a bot generate a report that needs to be deleted without being used sounds like a bad idea. Robert McClenon (talk) 19:47, 19 May 2022 (UTC)[reply]
    • Thanks for the link to the Village Pump discussion, that makes it much clearer what this is about. "Listify and delete" outcomes in CfD discussions are rare to begin with. But, if that is the outcome, the category is kept until listification has really taken place. However, it may happen that the list is initially created but deleted later e.g. because sources were not provided. That very rare problem could be solved in a different way if (in case of a "listify and delete" outcome) closers of CfD discussions would list the category content on the talk page belonging to the discussion page. So we can stop the bot without any harm. Marcocapelle (talk) 21:12, 19 May 2022 (UTC)[reply]
    • @Fayenatic london, Bibliomaniac15, and Explicit: pinging some administrators involved in CfD closing. Marcocapelle (talk) 21:15, 19 May 2022 (UTC)[reply]

    MalnadachBot and watchlists

    Hey folks, this has been brought up to the bot operator a number of times (e.g.[1]), but responses have been largely unhelpful. As such, I'd like to ask that the MalnadachBot be halted or its impacts on watchlists be removed. I understand it is trying to clean up pages for all sorts of reasons (e.g. accessibility) but it is a huge mess. I've had days where my watchlist in the last 24 hours was 50% from this bot. And there are pages where this bot is the only editor and has somehow found a dozen or more times where it needed to edit the page. See [2] for such a page.

    We don't allow cosmetic edits for a reason. And I understand these aren't purely cosmetic, but the same issues apply. Do we really need signatures to be cleaned up in old RfAs? Its impact on watchlists is a molehill, but a really annoying one. I don't want to remove bot edits from my watchlist. And I do want to watchlist the pages I have listed. And yes, there are ways I could manage a script to remove this bot from my watchlist, but really that's not a good way forward--it only solves the problem for me, not everyone else. Perhaps we could have this particular bot not impact watchlists (basically a default script for everyone?). Or we could just halt it. Something, please. Hobit (talk) 22:44, 5 June 2022 (UTC)[reply]

    @Hobit this bot appears to be asserting the bot flag, I suggest you enable "bots" under the "hide" options on watchlist, that way you won't be bothered with such things (this is what fixes it for "everyone else" - mostly already). — xaosflux Talk 23:17, 5 June 2022 (UTC)[reply]
    • Thanks Xaosflux. Above I said: "I don't want to remove bot edits from my watchlist." Yes, I could do this. But I've seen bot edits that have been problematic in the past. And I don't think ignoring bot edits is the default, so I'm not at all sure that fixes things for many, let alone most. I think you all are significantly underestimating the pain associated with this for folks. User retention is one of our primary issues and I suspect this hurts in its own small way. Hobit (talk) 23:54, 5 June 2022 (UTC)[reply]
      A "default script for everyone" is the MediaWiki feature that defines the user group "bot". Something in common.js has the same probability of being enabled as a dog has to fly. "Yes, I could do this. But I've seen bot edits that have been problematic in the past" Great! I would recommend you keep on reading MalnadachBot's edits, then; if you find one of the fixes problematic, you can raise that specific task at BOTN (here). Cheers! 🐶 EpicPupper (he/him | talk) 00:10, 6 June 2022 (UTC)[reply]
      Not sure what else to say here. That you want to see bot edits, but don't like these ones is your own preference. As far as halting, I'm not seeing anything technically wrong here - but if there is a consensus that this class of edits is now inappropriate, we could certainly stop this bot (and all others that do similar updates). WP:VPM may be a good place to have or advertise for such a discussion. — xaosflux Talk 00:14, 6 June 2022 (UTC)[reply]
      After 3.7 million edits and fixing over 10 million Lint errors, I have yet to have anyone bring up actual erroneous edits or malfunctions by the bot. That's because I am running it in the safest mode possible. "it only solves the problem for me, not everyone else" Perhaps that's because most people don't have a problem with it and use the options available. What actually hurts user retention, or prevents people from becoming users in the first place, is knowingly letting accessibility problems lie in our pages just because some people don't want to customise their watchlists. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 06:29, 6 June 2022 (UTC)[reply]
    Tbh my issue with the bot is that it makes 5-10 edits to the same page to fix a single issue like deprecated <font> tags when if it actually parsed the page instead of fixing one signature at a time using regexes it should be able to fix all of them at the same time, properly. I'd probably support stopping the bot for very low-priority issues like <font> tags in signatures (which really aren't much of an issue); we can fix them when someone writes a bot to do it properly in one edit. (I basically agree with Legoktm's comments here).
    Most of MalnadachBot's edits seem to be about <font> tags, so I think a narrowly tailored RfC on just stopping edits that fix <font> tags if they don't fix all the errors on the page, would stop a lot of the watch-list spam but allow real lint fixes to be done. Galobtter (pingó mió) 01:00, 6 June 2022 (UTC)[reply]
    One misunderstanding here is that the bot is not just fixing "a single issue like deprecated <font> tags", which makes the job at hand seem trivial. There is a useful thread about this bot task in the archive, in which I listed a selection from the wide variety of errors that this bot is attempting to fix. I put out an open call for assistance to editors who think that this bot can perform better: Wikipedia:Village pump (technical)/Archive 70 is a sample page with approximately 180 Linter errors, nearly all of them font-tag-related. I encourage you to try to create a set of false-positive-free regular expressions that can fix every Linter font tag error on that page, and other VPT archive pages, with a single edit of each page. If you can do so, or even if you can reduce the count by 90% with a single edit, you will be a hero, and you may be able to help Wikipedia get out from under the burden of millions of Linter errors much more quickly. Nobody has taken up the challenge, so it is still open. Meanwhile, the bot has edited that VPT archive page five times since that discussion, reducing the error count by about 25%. – Jonesey95 (talk) 03:48, 6 June 2022 (UTC)[reply]
    To be clear I understand the bot is fixing a lot of issues - this is precisely why I'm only talking about limiting the bot on the specific issue of the obsolete font tag, which from looking at the bot edits is what most of the edits are about.
    Re "false-positive-free regular expressions" - that's precisely what I'm talking about is the issue. Regular expressions (in a very provable, mathematical sense - see [3]) cannot handle HTML, which is why there's so much issue with false positives. But a bot using an actual wikitext parser should be able to do a much better job. Galobtter (pingó mió) 04:00, 6 June 2022 (UTC)[reply]
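    The nesting problem raised above can be illustrated with a small, hedged sketch. It is not taken from any bot's code: the sample string, the function names, and the class name are all hypothetical, and Python's standard-library html.parser stands in for a real wikitext parser. It only shows why a non-greedy regex mis-pairs nested <font> tags while a parser that tracks depth does not.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: one <font> nested inside another.
SAMPLE = '<font color="red">a<font color="blue">b</font>c</font>'

# A non-greedy regex pairs the first opening tag with the first closing
# tag, which here is the closing tag of the INNER <font>.
NAIVE = re.compile(r"<font[^>]*>(.*?)</font>", re.S)


def naive_first_match(text):
    """Return what the naive regex believes is the first tag's content."""
    match = NAIVE.search(text)
    return match.group(1) if match else None


class FontDepth(HTMLParser):
    """Count how deeply <font> tags nest by tracking open and close events."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "font" and self.depth > 0:
            self.depth -= 1


def font_nesting(text):
    """Return (maximum nesting depth, number of tags left unclosed)."""
    parser = FontDepth()
    parser.feed(text)
    parser.close()
    return parser.max_depth, parser.depth
```

    On the sample, the regex captures a fragment that still contains an unclosed inner opening tag, whereas the parser reports two nesting levels with nothing left open. A real fix would need a wikitext-aware parser, since signatures mix HTML with wiki markup.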
    The bot operator has, multiple times, offered to accept help with programming a bot that does a better job. As far as I know, nobody has been willing to put in the work to make those improvements. This bot, with apologies to Winston Churchill, is currently the worst way to fix these millions of errors except for all of the other ways that have been tried thus far. – Jonesey95 (talk) 04:28, 6 June 2022 (UTC)[reply]
    Are we certain the errors need to be fixed? And if so, is there any rush? I'm struggling to understand why we have a dozen edits made to old RfAs. Perhaps we could either not fix them or agree to not have it edit a page more than once a year and fix all those errors at once? Hobit (talk) 11:31, 6 June 2022 (UTC)[reply]
    <font>...</font> and some other tags have been deprecated, so yes, they need to be fixed, according to the Wikimedia developers, who set up this error tracking. No, there is no rush; a small group of editors has been working on these millions of errors for almost four years, and there are still more than 11 million errors left to fix. There is also no good reason to slow down; I wish the errors could be fixed more quickly, honestly, hence my plea for help in developing better code for the bot. Re fixing all the errors at once, please read what I posted and linked to above. – Jonesey95 (talk) 16:06, 6 June 2022 (UTC)[reply]
    @Jonesey95: This argument is a red herring. Yes, it would be challenging (although not impossible) to design a single regular expression that fixes all of the various linter errors in one go. But that is not necessary to address the editing behavior issues with this bot. You could have 100 separate individual regular expressions, each of which fixes a single issue, and that would be far less challenging to develop. However, that doesn't mean that you'd need to make 100 individual edits in order to fix those issues. Any minimally sophisticated bot could grab the content of a page, run it through the first regex, then take the result and pass it through the second regex, then take the result and pass it through the third regex ... then take the result and pass it through the 100th regex, and then make a single edit to the page. Instead, this bot operator insists on making upwards of 20 edits to a single page. There is no insurmountable technical hurdle that's in the way of fixing these issues with one edit per page. In other words, if you've written the code that can reliably fix all of the 180 linter errors on your example page by making 20 edits, then it would be trivial to use that code to fix all 180 linter errors in a single edit. Either this bot operator isn't sophisticated enough to do that, or doesn't want to put in the additional effort to reduce their edit count. —⁠ScottyWong⁠— 17:24, 6 June 2022 (UTC)[reply]
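    The chaining described in the comment above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not MalnadachBot's actual code: the two rules in FIXES are hypothetical placeholders for real Linter fixes, and fetch and save are hypothetical callables standing in for whatever wiki client a bot uses.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical fix rules: (compiled pattern, replacement). Real Linter
# fixes would be far more numerous and far more careful than these two.
FIXES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"<tt>(.*?)</tt>", re.S), r"<code>\1</code>"),
    (re.compile(r"<strike>(.*?)</strike>", re.S), r"<s>\1</s>"),
]


def apply_all_fixes(wikitext: str) -> str:
    """Feed the output of each fix into the next one; nothing is saved here."""
    for pattern, replacement in FIXES:
        wikitext = pattern.sub(replacement, wikitext)
    return wikitext


def process_page(
    title: str,
    fetch: Callable[[str], str],
    save: Callable[[str, str], None],
) -> bool:
    """Fetch once, apply every fix, and save at most once per page."""
    original = fetch(title)
    fixed = apply_all_fixes(original)
    if fixed == original:
        return False  # nothing changed, so no edit is made
    save(title, fixed)
    return True
```

    The only point of the sketch is the structure: the save call sits outside the loop over rules, so the number of edits per page does not grow with the number of rules applied in that run.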
    Scottywong says that it is possible but still does not offer code to improve the bot. Please see the list of signatures I provided in the previous discussion, and develop a set of regexes to fix a decent fraction of them in one edit. If someone here can provide a false-positive-free regex or set of regexes that works as described above, I expect that the bot operator will be willing to include them in their code. – Jonesey95 (talk) 18:05, 6 June 2022 (UTC)[reply]
    I'm not going to spend hours developing code just to prove you wrong. Are you saying that it's technically impossible to combine multiple fixes into a single edit? Again, if you have the code to fix 20 different issues in 20 different edits, then you have the code to fix those 20 issues in a single edit. This shouldn't be difficult to understand. I could write an entire article in 10,000 edits by adding one word per edit, or I could write the same exact article in a single edit. The problem isn't with the bot's ability to fix the linter errors; the problem is that it makes far too many edits per page to fix those errors, and there is no legitimate reason that multiple edits are required. —⁠ScottyWong⁠— 01:40, 7 June 2022 (UTC)[reply]
    If this specific bot annoys you, but you don't want to ignore bots in general, WP:HIDEBOTS has instructions on how to ignore a specific bot. Headbomb {t · c · p · b} 10:30, 6 June 2022 (UTC)[reply]
    Thanks. I did detail why I don't think that's a good way forward. Shutting down the bot, or taking a similar action to what you describe but making it the default for everyone, would address my issue. At the moment I think we are doing more harm than good. Hobit (talk) 10:51, 6 June 2022 (UTC)[reply]
    • Support halting the bot until such time that it can be shown that the bot operator is capable of fixing all linter errors on a page with a single edit. The benefits of fixing trivial linter errors do not outweigh the disruption caused by taking multiple edits to accomplish it. Ensuring that the signature of a long-retired editor displays correctly on a 15 year-old AfD is not an important enough problem to trash everyone's watchlist for the remainder of eternity. None of the proposed methods of filtering the bot's edits from your watchlist are suitable for all circumstances. —⁠ScottyWong⁠— 17:28, 6 June 2022 (UTC)[reply]
      "disruption caused by taking multiple edits to accomplish it"
      AKA little-to-no disruption, which can easily be bypassed. We had a discussion on this just last month, and there's nothing that warrants stoppage of the bot. People are welcome to suggest concrete improvements, but a loud minority that refuses to mute the bot on their own watchlist is no reason to halt the bot. Headbomb {t · c · p · b} 18:31, 6 June 2022 (UTC)[reply]
      • Lots of people seem to think it's a problem. If you've ever worked with programmers (I am one), you've had the experience where you say "this is a problem" and they say "no, it's not". And you're like "but I'm the user and yes, I understand why you think it's not a problem, but in practice, it is for me". We are having that discussion right now. Please don't assume your users are just being whiny or dumb. Hobit (talk) 19:15, 6 June 2022 (UTC)[reply]
        And lots of people have been given a solution which they refuse to use. MalnadachBot, on the whole, does more good (fixing hundreds of thousands / millions of lint errors) than harm (annoying a handful of editors). While it may not function optimally (everyone agrees it would be great if the bot could do one edit per page and fix all the errors in one go), this is not reasonably doable / very technically challenging (while perhaps not insolvable, the issue has not yet been solved), and the bot still functions correctly and productively. Those that don't like it can solve their annoyance with a single edit, as detailed on WP:HIDEBOTS. Headbomb {t · c · p · b} 21:10, 6 June 2022 (UTC)[reply]
        I think it's a lot more people who are annoyed than you are acknowledging. But that's why we have discussions. You may well be correct. I'm really unclear on why fixing these lint errors in old discussions is worthwhile. Is there a pointer to a discussion on this? Hobit (talk) 22:30, 6 June 2022 (UTC)[reply]
        You can start at Wikipedia:Linter. There are multiple links from there if you want to take a deeper dive. – Jonesey95 (talk) 00:03, 7 June 2022 (UTC)[reply]
    Let the bot continue with its excellent work. Anyone who wishes to do a better job is more than welcome to either write their own bot or contribute better code to this one. Putting the onus on ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ to make extremely complex code better, while at the same time shutting down their bot, is ridiculous. If anyone has a problem with "watchlist spam", then hide bot edits. Don't want to? Then stop complaining. Gonnym (talk) 18:31, 6 June 2022 (UTC)[reply]
    • No? This is making it harder for me to use Wikipedia. So I will complain. If it turns out it really is only a handful of us who find the cost/benefit analysis to be poor then I'll just put up with it (as I have for more than a year now). Hobit (talk) 22:33, 6 June 2022 (UTC)[reply]
      Or you know, just hide it and have Wikipedia be 'easy' to use again. Headbomb {t · c · p · b} 00:35, 7 June 2022 (UTC)[reply]
    • @Headbomb: What about the set of editors that want to continue monitoring bot edits? Or the set of editors that want to specifically monitor Malnadachbot's edits to the pages on their watchlists (without being bombarded by the bot editing the same page dozens of times)? If we're all ignoring the bot's edits, then major problems could slip through the cracks without anyone noticing. I don't understand why some of you are so adamant that there is no problem here, despite the continuous stream of complaints from various editors over the last few months. What's the rush here? Why can't we just pause for a brief moment, take a step back, and see if there is a better way to proceed? —⁠ScottyWong⁠— 01:51, 7 June 2022 (UTC)[reply]
    • Then hide MalnadachBot and leave the others unhid. Those that want to monitor MalnadachBot's edits but don't want to see MalnadachBot's edits are a minority so nonexistent we might as well discuss the current lives of dead people. And on the off chance those exist, Special:Contributions/MalnadachBot is there for that. Headbomb {t · c · p · b} 01:55, 7 June 2022 (UTC)[reply]
    • Why are you always defending this bot so aggressively? Do you really believe that it is technically difficult to combine multiple fixes into a single edit? If you make 5 edits in a row to the same article, there is no technical reason that you couldn't have combined those 5 edits into 1 edit. I truly don't understand why you (and a few others) don't see this as being even a small issue. No one should have to hide a specific bot from their watchlist because its behavior is so disruptive. Keep in mind that this bot is also filling up the revision histories of millions of pages, making them more tedious to look through for editors. After a couple years of this bot running like this, a huge portion of the page histories on WP will be infected with the Malnadachbot Virus of 2022, as evidenced by a giant block of 50+ edits in a row to every discussion page in the project that has user signatures on it. How can you not see this as a problem? —⁠ScottyWong⁠— 02:04, 7 June 2022 (UTC)[reply]
    "Do you really believe that it is technically difficult to combine multiple fixes into a single edit?" The answer to that is an unequivocal Yes. See the stackoverflow link Galobtter has shared above. As someone who has spent hundreds of hours fixing html errors, I fully agree with the pinned answer. Everybody Gangsta Until they try to build a bot that can fix all Lint errors in a single edit. You can sooner build a bot to write featured articles than a bot to fix all Lint errors of all types in a single edit. If this doesn't make sense to you, you have no idea what we are actually dealing with. With this bot run, in future people will not notice anything unusual when reading pages, which is exactly the point of these edits. Users have a greater necessity to read old discussions than to check their page history. Page histories are irrelevant to readers, whose needs come first. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 05:58, 7 June 2022 (UTC)[reply]
    @ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ: You're misunderstanding the point. I understand the difficulties in detecting and fixing a wide variety of linter errors. It's not a simple problem to solve. However, the issue that is being brought up here is not that your fixes are inaccurate or invalid. The issue is that you're making far too many edits per page to solve these problems. Regardless of the complexity of the problem, if you have written code that can accurately correct 100 different linter errors by making 100 separate edits to the same page, then it should be trivial to amend that code so that it fixes all 100 errors in a single edit. There is no reason that your code has to actually commit an edit after fixing each individual error, before moving on to the next error. Your code analyzes the wikitext of a page, modifies it, and then submits the modified wikitext as an edit. Instead, your code should submit the modified wikitext to the next block of code, which can analyze it and modify it further, and then the next block of code, etc., until all possible modifications have been made, and then submit the final wikitext as a single edit. Instead, you are fixing one error, making an edit, fixing another error on the same page, making another edit. This is the problem. No one is asking you to develop code that magically fixes every linter error known to humanity in a single edit. All we're saying is that your code should fix all of the issues that it is capable of fixing within a single edit. I don't understand why that is difficult for you. —⁠ScottyWong⁠— 15:30, 7 June 2022 (UTC)[reply]
    I don't have code to fix 100 patterns in a page at the same time. Rather, what I usually have is code to fix 100 patterns spread across 50,000 pages. Individual pages among the 50,000 will have at least 1 and usually not more than 5 of the patterns that are being checked in that batch. All patterns are checked sequentially and then saved in a single edit. In the (highly unlikely) case that a page has all 100 patterns in a batch, it matches the first pattern, the output of which will be matched with the second pattern and so on up to the 100th. Only the final output is saved in a single edit; it does not save 100 times in a hundred edits. Once the 100 patterns in this batch across 50,000 pages are cleared, hundreds more issues that were previously buried deep come out to the top of the Linter reports and the process is repeated. In this page that Cryptic has brought up below, the bot has made 21 edits over 11 months from 2 tasks. At the time of the edits from task 2, I did not have approval for the edits that were made from task 12. Basically MalnadachBot is very good at fixing Lint errors by "breadth" (35-40k per day) but is bad at fixing them by "depth". Hope this is clear. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 19:20, 7 June 2022 (UTC)[reply]
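    The "breadth versus depth" behaviour described in the comment above can be modelled with a toy sketch. Everything here is hypothetical (the page contents, the literal find-and-replace rules, and the function name); it only illustrates why a page receives one save per batch that touches it, so that running the same rules as successive batches produces more edits per page than running them together.

```python
from typing import Dict, List, Tuple

# A batch is a list of (text to find, replacement) rules; hypothetical data.
Batch = List[Tuple[str, str]]


def count_edits(pages: Dict[str, str], batches: List[Batch]) -> Dict[str, int]:
    """Return how many saves each page receives when batches run in order."""
    current = dict(pages)  # work on a copy so the caller's data is untouched
    edits = {title: 0 for title in current}
    for batch in batches:
        for title in current:
            text = current[title]
            for old, new in batch:  # rules within one batch are chained
                text = text.replace(old, new)
            if text != current[title]:  # at most one save per page per batch
                current[title] = text
                edits[title] += 1
    return edits
```

    With two hypothetical pages, one containing two problem signatures and one containing a single signature, splitting the two rules into separate batches gives the first page two edits, while merging them into one batch gives every page exactly one.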
    I understand that your code has developed over time, and perhaps 6 months ago you didn't have the code developed to fix an issue that you can fix today. And that might account for multiple edits on certain pages. Sure, that is reasonable. However, that doesn't explain why, for example on this page, you made 26 different edits that were linked to task 2 of your bot, and another 24 separate edits that were linked to task 12 of your bot. In total, you made 50 edits to the same page to carry out fixes for two approved tasks. If it was true that you simply hadn't gotten approval for task 12 at the time that you were running task 2, then I would have expected a maximum of 2 edits to this page (one for task 2, one for task 12), not 50. Much of your explanation above doesn't make any sense to me. You say that "all patterns are checked sequentially and then saved in a single edit", but then how can you explain why you made 50 edits to a single page? It doesn't add up. I'm honestly losing confidence that you have the technical expertise to carry out these tasks in a responsible manner. I'm glad you're getting some help from other editors in the conversations below, and I look forward to a day when you can make one edit per page to carry out all fixes at once. Until that day arrives, I believe that you should stop your bot from making edits. There is no rush to fix these issues, and it shouldn't be done until it can be done correctly. —⁠ScottyWong⁠— 22:25, 7 June 2022 (UTC)[reply]
    @ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ: you misunderstand what I mean by the stackoverflow link. The issue is that you are using regex to fix a problem that regex is not meant to solve. As the stackoverflow link states, using an actual parser makes doing HTML stuff easy. Galobtter (pingó mió) 21:39, 7 June 2022 (UTC)[reply]
    • Halt. If you're not competent to handle even trivial cases like combining these edits [4][5][6][7] then you're not competent to be running a bot, let alone touching its code. —Cryptic 23:39, 6 June 2022 (UTC)[reply]
      Yeah the bot seems to fix every page with a certain signature in one go rather than any form of batching which means many many edits to the same page. At the very least stop for a few months, accumulate a big list of signatures to fix and apply the fixes at the same time, rather than rushing. Galobtter (pingó mió) 01:53, 7 June 2022 (UTC)[reply]
    • Exactly. I could maybe understand if the bot was making separate edits to fix distinctly different issues. But your diffs show that the bot is making multiple edits to fix different instances of the same exact issue on a single page. Combine that with the fact that these are purely cosmetic issues, and that's over the line for me. —⁠ScottyWong⁠— 01:56, 7 June 2022 (UTC)[reply]
    @Galobtter: I combine multiple signatures and run the bot in batches. If multiple signatures in a batch are present in a page, the bot replaces them in a single edit. The problem is that there are only so many signatures you can collect from the Linter lists before you start running into the same signatures again and again. To get new signatures, I would have to run a batch and remove a few hundred thousand entries from the lists. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 05:58, 7 June 2022 (UTC)[reply]

    Code

    As requested:

    import mwparserfromhell as mwph
    
    def fix_font(wikitext: str) -> str:
        code = mwph.parse(wikitext)
        for tag in code.filter_tags():
            if tag.tag == "font":
                # Turn it into a <span>
                tag.tag = "span"
                # Turn color into style="color: ...;"
                if tag.has('color'):
                    attr = tag.get('color')
                    attr.name = "style"
                    attr.value = f"color: {attr.value};"
                # TODO: face, size
        return str(code)
    

    Using this replacement as an example:

    >>> print(fix_font("""[[User:Ks0stm|<font color="009900">'''Ks0stm'''</font>]] <sup>([[User talk:Ks0stm|T]]•[[Special:Contributions/Ks0stm|C]]•[[User:Ks0stm/Guestbook|G]]•[[User:Ks0stm/Email|E]])</sup> 15:48, 13 December 2015 (UTC)"""))
    [[User:Ks0stm|<span style="color: 009900;">'''Ks0stm'''</span>]] <sup>([[User talk:Ks0stm|T]]•[[Special:Contributions/Ks0stm|C]]•[[User:Ks0stm/Guestbook|G]]•[[User:Ks0stm/Email|E]])</sup> 15:48, 13 December 2015 (UTC)

    It's entirely possible I've missed something, but seems like the general approach should work. Legoktm (talk) 20:49, 6 June 2022 (UTC)[reply]

    • I think it is ridiculous that upstream devs won't just keep support for these tags (convert them in the parser or whatever) - but that's not the bot's fault. In general, I don't see any specific edit this bot is making is bad, that is if an editor made it it would be OK. Now, could the bot be "better", sure - but as long as we are getting threatened by the software folks that our pages will be disrupted if we don't change the wikitext I'm not too opposed to people slowly going through them. I don't like extra edits for sure, but I don't really have any sympathy for the When I don't hide bots in my watchlist, my watchlist is busy complaint. We keep putting up with unflagged editors spamming recent changes/watchlists and can't get support to get them to stop. — xaosflux Talk 23:52, 6 June 2022 (UTC)[reply]
    Legoktm, thank you! for being the first person to actually suggest code improvements. You have illustrated the complexity of constructing code that will work on the wide variety of font tags that are out there in the wild, because the code as written above does not work (a # symbol is needed before color specs in span tags, even though it wasn't needed in font tags). You also got lucky that the font tag was inside the wikilink; font tags outside of wikilinks sometimes need different processing. If you are willing to create code that works on a significant fraction of the list of signatures in the archived discussion that I linked above, I expect that the bot operator would be willing to engage with you, or perhaps another brave editor would be willing to create a new bot with your new approach. I think we can all agree that fewer edits per page, as long as the edits are error-free, is the optimal outcome, and the above code snippet, with some development, seems to promise some of that. – Jonesey95 (talk) 00:12, 7 June 2022 (UTC)[reply]
    I think that a potential solution would be using regex replaces sequentially. They could be stored in a dictionary, so the code would loop through it, do a replace, then do the next for the already-replaced text. 🐶 EpicPupper (he/him | talk) 00:29, 7 June 2022 (UTC)[reply]
    I'm pretty sure the bot does this already, applying multiple regexes to each edit, when applicable. The challenge is that in order to best avoid false positives, it is my understanding that editor signatures and other specific patterns are added one by one to the list of regexes. You can't just add a general pattern and hope to avoid false positives, because you could end up changing examples that are deprecated or invalid on purpose. You'd have to ask the bot operator to be sure. – Jonesey95 (talk) 01:36, 7 June 2022 (UTC)[reply]
    I think you're misunderstanding; the regexes can be applied sequentially, rather than all together. Right now, I believe what you mean by "applying multiple regexes" is doing, say, 5 together, all at once, but rather this can be applied one after the other (first replaces the first regex, then takes the result of that replace and triggers the second, etc). 🐶 EpicPupper (he/him | talk) 01:40, 7 June 2022 (UTC)[reply]
    I do apply changes sequentially. i.e match for one pattern in the whole page and use the output to match the next pattern one after the other for all patterns in a batch. After it has matched all of them, it will save the final output in a single edit. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 05:58, 7 June 2022 (UTC)[reply]
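The sequential approach described in this thread can be sketched roughly as follows. This is only an illustration, not the bot's actual AWB configuration: the two patterns, `apply_batch`, `process_page` and the `save` callback are all hypothetical names and rules chosen for the example.

```python
import re

# Hypothetical batch: each entry is (compiled pattern, replacement).
# These two rules are illustrative and are not taken from the bot's real lists.
PATTERNS = [
    (re.compile(r'<font color="red">(.*?)</font>'),
     r'<span style="color:red;">\1</span>'),
    (re.compile(r'<tt>(.*?)</tt>'), r'<code>\1</code>'),
]


def apply_batch(wikitext, patterns=PATTERNS):
    """Apply every pattern in order, feeding each result into the next."""
    result = wikitext
    for pattern, replacement in patterns:
        result = pattern.sub(replacement, result)
    return result


def process_page(wikitext, save):
    """Call save() at most once per page, and only if something changed."""
    new_text = apply_batch(wikitext)
    if new_text != wikitext:
        save(new_text)
        return True
    return False
```

Under this scheme the number of edits per page depends on how many separate batch runs touch the page, not on how many patterns each batch contains.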
    @ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ, so why does your bot need 100+ edits to fix Lint errors? 🐶 EpicPupper (he/him | talk) 16:07, 7 June 2022 (UTC)[reply]
    Which page has 100+ edits by MalnadachBot? If you see the example given by Hobit, it shows about 16 edits spread across 11 months and 2 separate tasks. All of those edits are from different batch runs, it does not make any edits in rapid succession. A typical edit by MalnadachBot would be something like this. Most pages have only one or two Lint errors that are fixed using direct signature replacement and it will not have to visit them again. RFA pages tend to accumulate a lot of different signatures, which is why they have 10+ edits. Now that I think of it, maybe I should just skip highly watched pages like RFAs since that is what at least 3 people have mentioned. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 16:29, 7 June 2022 (UTC)[reply]
    100+ is probably an exaggeration. Pages with 50+ are easy to find. Example. Example. Example example example. —Cryptic 17:46, 7 June 2022 (UTC)[reply]
    Wow! That is just egregious. I didn't know pages like this existed. There's no legitimate reason for a bot to behave like this. —⁠ScottyWong⁠— 22:13, 7 June 2022 (UTC)[reply]
    @Jonesey95: heh, I should've previewed before posting :p yeah, it needs to check if the value is hex, and if so, prefix it with a hash if it isn't already. I would prefer if someone else took over my proof-of-concept and ran with it, but given that I do find the current behavior slightly annoying I might if no one else picks it up. The main point I wanted to make is that using a proper wikitext/HTML parser instead of regex makes this much more straightforward, I think a comprehensive font -> span function should be about 100 lines of Python. Legoktm (talk) 04:33, 7 June 2022 (UTC)[reply]
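The hex check described here could look something like the sketch below. It assumes that only bare 3-digit or 6-digit hex values need a hash and that everything else (named colors, values already starting with `#`) is passed through unchanged; `normalize_color` is a hypothetical helper name, and real signatures may contain values this does not cover.

```python
import re

# Bare hex colors as they commonly appear in <font color="...">: 3 or 6 digits.
_HEX = re.compile(r"[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6}")


def normalize_color(value):
    """Prefix bare hex colors with '#'; leave every other value as it is."""
    value = value.strip()
    if value.startswith("#"):
        return value  # already has its hash
    if _HEX.fullmatch(value):
        return "#" + value
    return value  # named color or something unrecognized: do not guess
```

For the signature in the example above, `normalize_color("009900")` gives `#009900`, while `normalize_color("red")` is returned unchanged.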
    In the past few weeks, I have been building a bunch of general regexes using the same proof-of-concept as in Legoktm's code above. These regexes work with near 100% accuracy for very rigid sets of replacements applied sequentially. I am sure that if I use them, I can fix about 7 million Lint errors, fixing most of the common cases in a single edit. It will greatly decrease the number of revisits. The reason I have not used them is that they ignore Lint errors other than font tags. The bot will still have to revisit pages for other types of Lint errors, which is what some people have a problem with. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 05:58, 7 June 2022 (UTC)[reply]
    I have listed some safe regexes in User:MalnadachBot/Task 12/1201-1250. I am currently running the bot with it. If the wikitext parsing method mentioned above can handle nested font tags, misnested tags, multiple tags around the same text, tags both inside and outside a link etc, I am willing to give it a try. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 13:26, 7 June 2022 (UTC)[reply]
    What's the reason for using regexes instead of a wikitext/HTML parser? Another strategy I've been thinking about since yesterday is to use some form of visual diffing to verify the output is unchanged before saving. Then you could have code that doesn't handle literally every single edge case run and clean up most pages in bulk, and then go back and write in code to handle pages with edge cases in smaller passes. Legoktm (talk) 17:15, 7 June 2022 (UTC)[reply]
    As for why I use regexes, that's because this is an AWB bot and regexes are easy to use with it. I just tried mwparserfromhell and got it to work fine for font tags with a single attribute. However, I am stuck at font tags with two attributes. I tried
    if tag.tag == "font":
        tag.tag = "span"
        if tag.has('color') and tag.has('face'):
            attr1 = tag.get('color')
            attr2 = tag.get('face')
            attr1.name = "style"
            attr2.name = "style"
            attr1.value = f"color:{attr1.value};"
            attr2.value = f"font-family:{attr2.value};"
    
    For this if I pass <font color="red" face="garamond">abc</font>, it returns <span style="color:red;" style="font-family:garamond;">abc</span>. How can I get them in a single style? ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 18:22, 7 June 2022 (UTC)[reply]
    @ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ try phab:P29487. Maybe we should move to a Git repo for easier collaboration? Also we want fixers for all the other deprecated tags. Legoktm (talk) 19:00, 7 June 2022 (UTC)[reply]
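One general way to end up with a single style attribute is to collect all the CSS declarations first and join them, instead of renaming each attribute to `style` in place. The sketch below shows that idea on plain dictionaries rather than on mwparserfromhell objects, and it is not the contents of the paste linked above; `FONT_TO_CSS`, `build_style` and `font_to_span` are hypothetical names, and `size` is left out because it does not map directly onto a CSS value.

```python
# Assumed mapping from <font> attributes to CSS properties (illustrative only).
FONT_TO_CSS = {"color": "color", "face": "font-family"}


def build_style(font_attrs):
    """Return one style value built from a dict of <font> attributes."""
    declarations = []
    for name, css_property in FONT_TO_CSS.items():
        value = font_attrs.get(name)
        if value:
            declarations.append(f"{css_property}: {value};")
    return " ".join(declarations)


def font_to_span(font_attrs, inner):
    """Build the replacement <span> with at most one style attribute."""
    style = build_style(font_attrs)
    if not style:
        return f"<span>{inner}</span>"
    return f'<span style="{style}">{inner}</span>'
```

With the input from the question, `font_to_span({"color": "red", "face": "garamond"}, "abc")` produces one span with a combined style. In a parser-based version the same idea would apply: remove the old attributes and add one combined `style` attribute.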
    @Xaosflux afaik there's no indication that browser devs are actually going to drop support for the tags, isn't it just in linter because font tags are obsolete in HTML5? Or did I miss a discussion? Galobtter (pingó mió) 01:57, 7 June 2022 (UTC)[reply]
    @Xaosflux: that's a big [citation needed] from me. Please provide links to where developers (MediaWiki or browser) have made these "threats" about <font> and other deprecated HTML tags being removed - I'm not aware of any. The tag is marked as deprecated because it shouldn't be used in new code, and it would be nice to clean up old uses of it, but there's no deadline to do so. Certainly if browsers did even propose to remove it there would be huge blowback given how much legacy content on random geocities type websites that would be broken, I doubt it'll ever happen. That said, how people choose to spend their time on Wikipedia is up to them, and if people are interested in cleaning this stuff up, more power to them. Legoktm (talk) 04:10, 7 June 2022 (UTC)[reply]
    @Legoktm good point, I'd have to go find more on that - so are you saying that "font" deprecation is basically useless and we need not worry about it really - because I'd certainly rather nobody bother with any of the Special:LintErrors/obsolete-tag's if doing so is going to be useless. — xaosflux Talk 10:01, 7 June 2022 (UTC)[reply]
    I would put it one or two steps above useless. I think it is nice that people are cleaning this stuff up, but there's absolutely no urgency to do so. I'm struggling to come up with an analogy to other wiki maintenance work at the moment that clearly articulates the value of this work without overstating its importance...I'll try to get back to you on that. Legoktm (talk) 17:12, 7 June 2022 (UTC)[reply]
    • I don't see any specific edit this bot is making is bad - then you're not thinking it through. I. See. (No exaggeration.) HUNDREDS OF THOUSANDS. Of. Bad. Edits. If a bot's reason for editing a page is to eliminate font tags, then it is an unambiguous error to save that page with a font tag remaining in it. If it can't yet handle all the cases on a given page, then the blindingly obvious solution is to log it for later inspection and then either skip that page or halt entirely. And yes, if a human made fifty edits in a row to a given page fixing a single font tag per edit and continued to do so over millions of pages for years while dozens of editors begged them to stop, we'd block them in a heartbeat. —Cryptic 17:46, 7 June 2022 (UTC)[reply]
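The "log it and skip" policy suggested here could be expressed as a small gate that runs before saving. This is a sketch under simple assumptions: a leftover font tag is detected with a plain regex (which would also flag deliberate examples, for instance inside nowiki), and `decide_edit` and `skipped_log` are hypothetical names.

```python
import re

# Matches the start of any remaining <font ...> tag, case-insensitively.
FONT_TAG = re.compile(r"<\s*font\b", re.IGNORECASE)


def decide_edit(title, original, fixed, skipped_log):
    """Return the text to save, or None if the page should not be edited."""
    if fixed == original:
        return None  # nothing changed, so there is nothing to save
    if FONT_TAG.search(fixed):
        # Some font tags could not be handled yet: record the page for
        # later inspection and skip it instead of saving a partial fix.
        skipped_log.append(title)
        return None
    return fixed
```

A page is then saved only when the fix is complete for that tag type, and the skipped titles form a worklist of cases the patterns do not yet cover.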
      If I edit a page and fix something on it - the page is better than it was before. Lack of not also fixing something else doesn't make my first edit "bad". If an article had two misspelled words and you fixed one should you be cautioned for not fixing the other? My comment was in the broadest, along the lines of If a human editor made this edit should it be reverted? - and I'd say no. And yes, if someone without a bot flag was flooding recent changes we'd have issue - because of that. All that being said, following from my last comment: if these edits are useless then they shouldn't really be getting made by any bots at all - and that is something that we can measure community support for to make a decision. — xaosflux Talk 18:16, 7 June 2022 (UTC)[reply]
      Add me to any list of editors who would like a solution to this. I’m tired of seeing them. Doug Weller talk 18:47, 7 June 2022 (UTC)[reply]
      @Doug Weller: et al: I'm kind of agnostic on the need for this bot, and the level of headache it's causing, and the extent of support and opposition to the bot, and the ease/necessity of reducing the number of edits per page, and whether it makes sense to pause this until it is more efficient. But just a reminder (the bot's userpage says this, and Headbomb says it somewhere up above, but this is a long thread now) that WP:HIDEBOTS works well if you want to hide just one bot's contribs (or, I'm surprised to find out, one user. I didn't know that was possible). I'm an idiot "very non-technical", and I just now set it up by myself, and now I see no Malnawhateverbot edits in my watchlist, but all the other bots are still there. You're all of course free to keep arguing - and I'll be watching out of curiosity - but that does solve the immediate problem for those who find this to be ... an immediate problem. --Floquenbeam (talk) 19:33, 7 June 2022 (UTC)[reply]
      Correction: it seems to work fine on my computer, but apparently it doesn't work on my phone (even when in desktop mode). So not quite the perfect solution I marketed it as above. --Floquenbeam (talk) 19:39, 7 June 2022 (UTC)[reply]
      Using the mw.loader.load version of this aka {{lusc}} is likely to sort that on mobile. Izno (talk) 19:46, 7 June 2022 (UTC)[reply]
      Thanks, looks useful. I wonder if it will work on my iPad. Doug Weller talk 19:57, 7 June 2022 (UTC)[reply]

    FireflyBot is not running

    User:FireflyBot has stopped. It last ran at about 1100 GMT on 2 June, and notified editors that their drafts had been untouched for 5 months. It also isn't updating the DRN case status. I left a note on the talk page of User:Firefly, who seems to be busy blocking spammers. Robert McClenon (talk) 15:12, 6 June 2022 (UTC)[reply]

    Hi @Robert McClenon the only person who can make that bot start would be its operator. You could drop a request at WP:BOTREQ to see if someone else would like to spin up a clone and take over some of those tasks. — xaosflux Talk 15:34, 6 June 2022 (UTC)[reply]