
Wikipedia:Bot requests: Difference between revisions

Revision as of 00:25, 28 June 2010

This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).

You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.

Before making a request, please see the list of frequently denied bots: tasks that are either too complicated to program or that lack consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) be added to all pages in a particular category, please be careful to check the category tree for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).


Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).


Please add your bot requests to the bottom of this page.



Make redirects from titles with correct Romanian diacritics to the currently used diacritics

We at ro.wp are currently in the process of replacing the widely used but incorrect Romanian diacritics (Ş, Ţ, with a cedilla) with the correct but somewhat less used versions (Ș, Ț, with a comma below). You can find more information here and here. This creates serious problems for around 8% of the users, so we implemented some JS solutions for them. Unfortunately, it's not feasible to implement those at en.wp, so the incorrect diacritics should remain for a while.

However, it would be useful to have redirects created from S/T-comma titles to S/T-cedilla titles, so that people with Win Vista/7 can type the titles directly, and to keep the links from ro.wp in good order.

I have a pywikipedia robot that does the replacement for all the pages. Its code is available here. I was wondering if there is somebody with bot rights who would be willing to run it? Thanks.--Strainu (talk) 10:33, 12 June 2010 (UTC)[reply]

Well, there's no such thing as "bot rights" (at least, not here on en.wp); each individual bot and task has to get approval through a bot request. You can file a bot request for approval if you already have the code ready, since anyone else would also have to file one (even for somebody who already runs bots, this would be a new task and so would require new approval). - EdoDodo talk 11:42, 12 June 2010 (UTC)[reply]
I'm not terribly familiar with pywikipedia, but it looks like your code goes through the entire list of pages looking for the few with the problem letters. This would be extremely inefficient here on enwiki, given that only 12692 of the 7503387 page titles in the enwiki "all titles in ns0" dump from 2010-05-14 contain S/T-cedilla.
I also note that your code appears to check for a template {{titlu corect}} to detect articles with non-Romanian titles containing S/T-cedilla that don't need redirects. Obviously, this template doesn't exist here on enwiki, and probably there is nothing equivalent. How many of the 12692 don't need S/T-comma redirects? Is there any way for a bot to reliably check? Anomie 14:54, 12 June 2010 (UTC)[reply]
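
For the curious, the dump scan Anomie describes is only a few lines; this is a minimal sketch, assuming the standard gzipped "all-titles-in-ns0" file with one UTF-8 title per line (the filename below is illustrative):

```python
import gzip

# The "incorrect" Romanian forms being searched for: S/T with cedilla.
CEDILLA = set("ŞşŢţ")

def titles_with_cedilla(dump_path):
    """Yield titles from an 'all-titles-in-ns0' dump that contain S/T-cedilla.

    The dump is one UTF-8 title per line, with underscores for spaces.
    """
    with gzip.open(dump_path, "rt", encoding="utf-8") as f:
        for line in f:
            title = line.rstrip("\n")
            if any(ch in CEDILLA for ch in title):
                yield title

if __name__ == "__main__":
    # Filename is illustrative.
    for title in titles_with_cedilla("enwiki-20100514-all-titles-in-ns0.gz"):
        print(title)
```
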
Anomie, the code can be adapted to go through a list of pages from a file (like the one you made); I can do that for you. While there is no 100% sure way for a bot to determine whether an article has a Romanian or a Turkish title (S-cedilla is used in Turkish), a pretty good way is to check whether a given article is included, directly or indirectly, in Category:Turkey. I did that for a few articles and could always reach the top category. However, I have no idea whether code has already been written for this (I know there is a script that lists all the articles in a category, but then one would have to match the two lists). Furthermore, you should remove the redirects from your list, which would reduce the running time.
EdoDodo, I preferred not to run the code myself, but rather to ask for help, exactly because I'm not familiar with the rules here. Also, when saying "bot rights", I was referring to the bot flag.--Strainu (talk) 17:39, 12 June 2010 (UTC)[reply]
But if a Romanian topic "Şomething" redirects to "Something", wouldn't you want to create the redirect "Șomething" also pointing to "Something"? Anomie 19:12, 12 June 2010 (UTC)[reply]
Hmmm... I guess you are right :) So, if we sort out all the technical details, would you be willing to run such a bot, either with my code or some other code?--Strainu (talk) 08:12, 13 June 2010 (UTC)[reply]
With my own code, yes. Anomie 18:38, 13 June 2010 (UTC)[reply]
Thanks. Please let me know how I can help you.--Strainu (talk) 18:47, 13 June 2010 (UTC)[reply]
Figure out how to tell if the articles' titles are Romanian rather than Turkish, Kurdish, Zazaki, Azerbaijani, Crimean Tatar, Gagauz, Tatar, or Turkmen (based on the list at Cedilla#Use of the cedilla with the letter S). I tried your category idea on the database dump from 2010-03-12 (the one from 2010-05-14 is missing some of the needed data for this analysis), but it doesn't seem to work too well since Category:Romania is in Category:Black Sea countries which is in Category:Black Sea which is in Category:Landforms of Turkey which is in Category:Geography of Turkey which is in Category:Turkey (and the same can be done to find Category:Romania as a parent of Category:Turkey, for that matter). Anomie 22:26, 13 June 2010 (UTC)[reply]
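
To make the failure mode concrete: a naive upward walk of the category graph needs both a visited set and a depth cap, because the graph is cyclic. A sketch using the MediaWiki API (the prop=categories call is standard; the function names are illustrative):

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def parent_categories(title):
    """Return the immediate parent categories of a page or category."""
    params = {"action": "query", "prop": "categories", "titles": title,
              "cllimit": "max", "format": "json"}
    data = requests.get(API, params=params).json()
    page = next(iter(data["query"]["pages"].values()))
    return [c["title"] for c in page.get("categories", [])]

def reaches(start, target, max_depth=5):
    """Breadth-first walk up the category tree with a depth cap.

    The visited set and the cap are both essential: the category graph is
    cyclic, so an unbounded walk from Category:Romania really does reach
    Category:Turkey (and vice versa), as described above.
    """
    frontier, seen = [start], {start}
    for _ in range(max_depth):
        nxt = []
        for t in frontier:
            for parent in parent_categories(t):
                if parent == target:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    nxt.append(parent)
        frontier = nxt
    return False

# e.g. reaches("Category:Romania", "Category:Turkey", max_depth=3)
```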

How about we do this semi-manually? Based on your list, we could create the following lists:

  1. Pages directly included in a category containing "Turkey"/"Turkish" (e.g. "Cities of Turkey") - not interesting for our goal
  2. Pages whose titles contain a title from the list above but which are not included in it (if they exist; I could find none) - add these to list #1
  3. Pages directly included in a category containing "Romania" (e.g. "Rivers of Romania", but also "Romanian people stubs") - we need to create redirects for those
  4. Pages whose titles contain a title from the list above but which are not included in it (e.g. Alexandru Vlahuţă is in the list above, but Alexandru Vlahuţă, Vaslui is not) - add these to list #3
  5. The rest of the articles - depending on the number in this list, we will see if I can go through it by hand or we need further criteria

I'll try to write some code and generate the lists today EET.--Strainu (talk) 08:45, 14 June 2010 (UTC)[reply]
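
A sketch of the substring rule used in lists #2 and #4 (the function name is illustrative):

```python
def inherit_classification(unclassified, classified):
    """Apply the substring rule from lists #2 and #4: a title that contains
    an already-classified title (e.g. 'Alexandru Vlahuţă, Vaslui' contains
    'Alexandru Vlahuţă') inherits that classification."""
    inherited = set()
    for title in unclassified:
        if any(known in title for known in classified):
            inherited.add(title)
    return inherited
```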

I just had an idea: what if we look at articles tagged by WikiProject Romania? I'm also going to ask WikiProject Romania for any input they may have. Anomie 12:56, 14 June 2010 (UTC)[reply]

Here's a critical question: what about the body of articles? Unless we can get the bot to perform the changes within and throughout the articles as well, I for one advise against the change, at least for the time being. As for how to identify the articles, I find that WP:RO tags are a good guide (as are the WikiProject Moldova ones, btw!), and I could also suggest making the bot operate the changes in articles where both ş and ţ are present together (except perhaps the articles on these letters themselves, where, say, Turkish and Romanian words are likely to be mixed together). I'm also positive that a bot operating changes for ş and ţ in articles which use ş, ţ, â, î and ă together will result in almost no errors.

Whatever the outcome and decision, I urge careful planning of the move, because the risk of ending up with a huge inconsistency is serious, and the task of fixing it would be immense and annoying. As we speak, the Romanian Wikipedia itself is quite inconsistent. Dahn (talk) 15:38, 14 June 2010 (UTC)[reply]

Update: apologies, I hadn't realized this was not yet a move proposal, and I find the redirect creation very constructive as a preliminary step. But please consider my other points for the next steps. Thank you. Dahn (talk) 15:52, 14 June 2010 (UTC)[reply]
The creation of redirects was also the first step at ro.wp. As I said above, since it is not feasible to implement the same JS tricks from ro.wp at en.wp, we cannot consider going any further than this for a long time from now on.--Strainu (talk) 18:04, 14 June 2010 (UTC)[reply]
Absolutely, and I apologize again for not getting you right the first time. But please review the suggestions I made above, which work both for finding redirects and for the eventual transition: would it be safe to say that an article where ş and ţ are present alongside each other, like an article where both are present alongside ă, î and â, could use a ș or ț redirect (presuming this occurs only in Romania- and Moldova-related articles)? And is it feasible to make the bot search for these specific parameters? If so, at least part of the problem of finding the relevant articles is solved.
Incidentally, one could in theory create ș redirects for Turkish, Azerbaijani etc. ş articles, as long as they remain just redirects. There's no saying that the redirects have to be 100% correct. Provided we also keep a listing of all these redirects, we could then specify which ones should become article titles (Romanian) and which ones should not (Turkish, etc.). It's one way to approach this. Dahn (talk) 18:18, 14 June 2010 (UTC)[reply]

I was able to identify 4968 articles that need redirects and 886 that don't need redirects. Unfortunately, logging of the leftovers failed, so I need to filter Anomie's list and then try to identify pages with {{WikiProject Romania}}, {{WPRO}}, {{WikiProject Moldova}} or {{WPMoldova}}.--Strainu (talk) 22:31, 14 June 2010 (UTC)[reply]

The 6814 remaining articles are at User:Strainu/leftover. I will continue filtering tonight.--Strainu (talk) 09:21, 15 June 2010 (UTC)[reply]

I have applied all the filters described above, then went through the remaining pages by hand. The final result is here - around 9000 articles. If you see any obvious mistake, please go ahead and correct it.--Strainu (talk) 20:47, 15 June 2010 (UTC)[reply]

BRFA filed. Chances are quite good that any necessary review of the list will be completed before the BRFA. Anomie 03:14, 16 June 2010 (UTC)[reply]
Y Done 2524 redirects copied from with-cedilla titles, and 6946 created to articles at with-cedilla titles. Anomie 12:40, 22 June 2010 (UTC)[reply]
Thanks Anomie! Strainu (talk) 07:43, 23 June 2010 (UTC)[reply]

We need another User:WebCiteBOT

That one never quite got the job done, and its runner was talented, so we need someone who really knows their stuff. Probably also someone who is willing to license their bot code under a free license, in case they don't want to run it. Thanks. - Peregrine Fisher (talk) 06:21, 13 June 2010 (UTC)[reply]

It's not running, AFAIK, because WebCite couldn't handle the load. Checklink does some light preemptive archiving, but they still seem to have some trouble with that load. — Dispenser 20:05, 13 June 2010 (UTC)[reply]
I could write it easily enough, but I agree with Dispenser that the service is somewhat weak. Tim1357 talk 21:18, 13 June 2010 (UTC)[reply]
Any ideas on how we could just do it ourselves? Why couldn't they do it? I never saw any on-wiki conversation about it. What should we do? Who should I ask about it? Thanks. - Peregrine Fisher (talk) 07:35, 20 June 2010 (UTC)[reply]
I don't know. Making another Web Citation service is a huge amount of work. Not only would the operator have to worry about developing the service, but also about copyright issues, and how/where to host it. I think our best bet is to use an external, 3rd party service like WebCitation. Perhaps as they further develop, they will be more reliable, and bots like WebCiteBOT will make more sense. Tim1357 talk 01:10, 26 June 2010 (UTC)[reply]
You're probably right. Even when webcitation is working, I don't think the bot is run. Can someone take it over, or something? I'd be happy to email webcitation, if I thought we had a bot that would use them. - Peregrine Fisher (talk) 03:18, 26 June 2010 (UTC)[reply]
I emailed them a while back and have a bot ~50% complete. However, I was going to do higher-priority links first (e.g. all FAs, GAs, FLs, restricted TLDs like .edu and .gov, most-cited links), not the newest links, and then work my way down from there. The only limitation webcitation.org asked that we keep was the one request every 5 seconds, and to notify them when the project becomes operational. Though they did say "In the future we will likely remove this limitation."
As far as creating another service, that would be the ideal solution, since you don't have to trust another service not to go down. The main issue there is who is going to host it; using the toolserver is out of the question. Even if a project like that didn't break the rules (it breaks rule 4 and possibly 3), I can almost guarantee they wouldn't be able to provide the disk space you would need. It needs to be running on server(s) in a datacenter, not off of someone's laptop in their bedroom, since the latter is virtually guaranteed to be even less reliable than webcitation.org. The only real solution for something like that is either colocation or renting dedicated servers. Dedicated servers run anywhere from ~$70-300/mo per server, depending on the specs and whether you get add-ons like IDS, automated backup, or support. Colocation isn't that much cheaper, but you have the benefit of not having to pay a monthly fee for upgrades to the server, such as additional hard drive space or RAM; the disadvantage is that you actually have to buy a server and install it, or have it installed, usually for a fee. Colocation is cheaper in the long run, but most datacenters want to deal with people renting 1/4 rack or more, not some 1U server.

According to this article you're looking at about 305 KB per page. At that rate you should be able to store about 3.2 million pages on a 1 TB hard drive without a backup (a 1 TB drive gives you 931 GiB, due to the difference between powers of 1000 and 1024), though you might be able to increase that significantly with compression. For backup reasons you'd need to store everything on two different drives, so really you're looking at 1.6 million pages per 1 TB drive, uncompressed. Keep in mind those numbers are based on an article that is 3 years old; it's very likely the trend has continued since then and the average web page is significantly larger. Also note that is the average for web pages with images; if you were to archive other types of content, like PDFs or Flash objects, those would be much larger.

As far as copyright, that shouldn't be too big of a deal as long as you obey takedown requests and standard spidering conventions like robots.txt; you should be good under the DMCA's safe harbor provision if you're operating in the US (see the recent ruling in Viacom Inc. v. YouTube). --nn123645 (talk) 00:25, 28 June 2010 (UTC)[reply]
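
For what it's worth, the capacity arithmetic in that post checks out; a quick sketch using its own figures (305 KB per page, 931 GiB usable per "1 TB" drive):

```python
# Capacity math using the post's own figures.
KB = 1024
page_size = 305 * KB          # average page with images, per the cited article
usable = 931 * KB ** 3        # bytes usable on a "1 TB" (10**12 byte) drive

pages = usable // page_size   # ~3.2 million pages, no backup
pages_mirrored = pages // 2   # ~1.6 million if everything is stored twice
print(f"{pages:,} uncompressed, {pages_mirrored:,} with a mirror copy")
```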

Track deprods

Hi. Please write a bot (or add functionality to an existing bot) to track deprods, per Wikipedia_talk:Edit_filter/Archive_4#Filter_200.2C_or_should_the_EF_be_engaged_to_track_non-abusive.2C_non-.22wrong.22_edits.3F. The typical behavior of simply reverting the addition of a PROD should be trackable, and the articles it happens to should get more attention (say, via a category or a listing on a page), possibly resulting in an AfD nomination as a contested PROD. Thanks!   — Jeff G. ツ 20:10, 15 June 2010 (UTC)[reply]

Here is a prototype that is running in user space: User:PSBot/Deprods. Admittedly, it is very incomplete, but after some hours you should see some pages listed. Source code is on the talk page. PleaseStand (talk) 06:10, 20 June 2010 (UTC)[reply]
Excellent work so far. Might I suggest using {{Ln}} for each line item? Thanks!   — Jeff G. ツ 15:22, 20 June 2010 (UTC)[reply]
I have changed the script to use such a template. I hope the new version of my code works. PleaseStand (talk) 19:11, 20 June 2010 (UTC)[reply]
Just a note that I've run a similar thing with SDPatrolBot, which lists all the deprods at User:SDPatrolBot/prodResults. However, I've recently been very bad at running it, so it's not exactly up to date. If PleaseStand wants to take over, that's fine, although SDPatrolBot did also notify users if their prod had been removed, which you could consider doing as well, PleaseStand...? If you want, I can help with the coding (although I only really know .NET), or feel free to use User:SDPatrolBot/source#ProdNotify.cs. Best, - Kingpin13 (talk) 17:45, 25 June 2010 (UTC)[reply]
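
The core check any deprod tracker needs is just a diff test between consecutive revisions; a minimal sketch (the template-name regex is an assumption, not exhaustive):

```python
import re

# Illustrative, not exhaustive: matches {{prod}}, {{proposed deletion}} and
# dated variants anywhere in the wikitext.
PROD_RE = re.compile(r"\{\{\s*(?:dated )?(?:prod\b|proposed deletion)",
                     re.IGNORECASE)

def is_deprod(old_text, new_text):
    """True if the edit removed a PROD tag that the previous revision had."""
    return bool(PROD_RE.search(old_text)) and not PROD_RE.search(new_text)
```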

A more user-friendly archive bot

True, we have ClueBot III and MiszaBot, but they're not user-friendly. We should have a more user-friendly archive bot that does not overwhelm users with syntax. RussianReversal (talk) 02:34, 18 June 2010 (UTC)[reply]

What specifically is not user-friendly? What would you want changed? In your opinion, how should the syntax work? -- Cobi(t|c|b) 02:37, 18 June 2010 (UTC)[reply]
Archives should be daily and have a searchable box by default. It should have relatively few options: freq, age, format, and box. Format should be md,y; mdy; ymd; or dmy. Now that's user-friendly! RussianReversal (talk) 02:46, 18 June 2010 (UTC)[reply]
No, that's just not flexible. FinalRapture - 03:24, 18 June 2010 (UTC)[reply]
Most talk pages don't get enough traffic to be archived by day. Furthermore:
  • As for format — CB3 has this; just the values are slightly different: "F j, Y"; "F j Y"; "Y F j"; "j F Y" respectively.
  • As for age — CB3 has this and allows for fine-grained control down to the hour.
  • As for freq — I don't know what you want this to do, and how it's different from age.
  • As for box — CB3 also has this, but called archivebox.
  • As for a search box — CB3 has this, too.
I don't understand what else you want. -- Cobi(t|c|b) 04:14, 18 June 2010 (UTC)[reply]

Perhaps make a PHP script that generates the code for them. The user simply selects the options they want, and then copies the code to their talk page. --Chris 11:04, 23 June 2010 (UTC)[reply]
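
Something like Chris's generator idea, sketched in Python rather than PHP; the MiszaBot parameter names below are quoted from memory and should be verified against the bot's documentation before pasting:

```python
def archive_config(user, days=30, counter=1, max_size="150K"):
    """Turn a few friendly options into MiszaBot-style config wikitext.

    Parameter names (archive, algo, counter, maxarchivesize) are from
    memory -- check them against the MiszaBot documentation before use.
    """
    return (
        "{{User:MiszaBot/config\n"
        f"|archive = User talk:{user}/Archive %(counter)d\n"
        f"|algo = old({days}d)\n"
        f"|counter = {counter}\n"
        f"|maxarchivesize = {max_size}\n"
        "}}"
    )

print(archive_config("Example", days=14))
```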

French footballers

I have created a new WikiProject Football France task force and now a load of existing articles need tagging on the talk page. I am currently finding articles already tagged with the {{Football}} or {{WikiProject Football}} templates, and adding the parameter France=yes. Using AWB, I have tagged around 500 of these, but it's getting tedious, so I wonder whether a bot could do the same job? I am currently performing a normal regex find & replace for ({{\s*)([Ff]ootball)(\s*(?:\s*|⌊⌊⌊⌊M?\d+⌋⌋⌋⌋\s*)?(\|((?>[^\{\}]+|\{(?<DEPTH>)|\}(?<-DEPTH>))*(?(DEPTH)(?!))))?)\}\} or ({{\s*)([Ww]ikiProject Football)(\s*(?:\s*|⌊⌊⌊⌊M?\d+⌋⌋⌋⌋\s*)?(\|((?>[^\{\}]+|\{(?<DEPTH>)|\}(?<-DEPTH>))*(?(DEPTH)(?!))))?)\}\} and replacing with $1$2$3|France=yes}}. Any talk pages already tagged with the France=yes parameter need skipping. I would like the bot to do this on all articles in the categories Category:French footballers, Category:Footballers in France by club and Category:French football managers. Cheers, BigDom 07:28, 20 June 2010 (UTC)[reply]
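
A simplified stand-in for that AWB rule, as a sketch; unlike the AWB regex, it does not handle nested templates inside the banner:

```python
import re

# Find a {{Football}} or {{WikiProject Football}} banner and append
# |France=yes before its closing braces. The [^{}]* means this assumes no
# nested {{...}} inside the banner, which the full AWB regex above handles.
BANNER_RE = re.compile(
    r"(\{\{\s*(?:[Ff]ootball|[Ww]ikiProject Football)\b[^{}]*)\}\}")

def tag_france(talk_text):
    if re.search(r"\|\s*France\s*=", talk_text):
        return talk_text  # already tagged: skip, per the request
    return BANNER_RE.sub(r"\1|France=yes}}", talk_text, count=1)
```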

This is a pretty standard request, and can bypass BRFA. I suggest asking Xenobot Mk V to help you out (however, any of the WikiProject tagging bots will do). Follow the instructions on the bot's page, and hopefully Xeno will be able to run his bot to do this. Tim1357 talk 22:50, 25 June 2010 (UTC)[reply]
Thanks, I'll have a look there. BigDom 05:58, 26 June 2010 (UTC)[reply]

 Doing... on DodoBot. There's a fairly large number of pages, so probably won't be done till this evening or even tomorrow. - EdoDodo talk 09:23, 27 June 2010 (UTC)[reply]

 Done Should all be tagged. Feel free to leave me a message on my talk page if you think of any other categories that could be tagged. - EdoDodo talk 19:45, 27 June 2010 (UTC)[reply]

Tagging duplicate files from Commons with NowCommons

Can a bot tag all duplicates from Commons that do not have {{NowCommons}} with NowCommons? Also, any images that are OLDER on Commons should be tagged with {{Db-f8}}, explaining in the text of the tag that the Commons version is older. There is no realistic circumstance where an image is older on Commons and should still be here. The bot should skip images with {{NoCommons}}, {{C-uploaded}}, and {{M-cropped}}. CommonsClash - tool listing duplicates on the English language Wikipedia and Wikimedia Commons. Backlog on 9 May: about 34,000 media files. Less than half of these images are marked in any way as duplicates. ▫ JohnnyMrNinja 13:58, 22 June 2010 (UTC)[reply]

So I'm pretty sure that when nobody responds to a bot request (not even to tell you how much your request can't be done), it means the request is a seething pile of gibberish. Let's start over. There are at least 34,000 files that are duplicated here and on Commons (maybe a lot more now, actually). While many of them are marked with the tag {{NowCommons}}, these are still far fewer than half of the total. The others are only distinguishable by a feature of MediaWiki that shows a little link at the bottom of the page. These images aren't even added to a category automatically. The only templates we have now for these images are NowCommons (or the derivative NCD) and {{db-f8}}. Of these 34,000 files, probably about 33,900 should be deleted. I would simply like someone to put these images into categories so that people can start processing them easily. I do not want a bot to delete anything, and a human will eventually be able to tell if any one of these images shouldn't be deleted, and will mark the image as such. This is not creating more work; this is making work that is already being done easier. If this seems too controversial, or doesn't make sense, or doesn't seem worth it, please let me know. ▫ JohnnyMrNinja 17:37, 25 June 2010 (UTC)[reply]
There is a bot that did this, or something similar. Fact is, there are not many admins cleaning the NowCommons category, so it will probably just add to the backlog. –xenotalk 17:43, 25 June 2010 (UTC)[reply]
I think there are more admins than normal working on this right now; we just had one user who was moving tons of images per day. That was part of the reason I wanted to have this bot start working now, because there are so many eyes on these images right now. Images are now being sorted into subcategories, and there is a reviewer tag for images to be checked by non-admins so admins can delete faster (that's only a few days old). And I don't think I'd agree that it's adding to the backlog; it's just making visible a portion that was invisible, as these images should already have been tagged when they were duplicated in the first place. ▫ JohnnyMrNinja 18:10, 25 June 2010 (UTC)[reply]
So maybe contact that user and see if they have some code for this... –xenotalk 18:12, 25 June 2010 (UTC)[reply]
I'm pretty sure that his bot was going through images marked as "candidates to be moved to Commons", as that is what I usually see his bot doing, and that's why so many images at once. That is what the other editor was also doing recently, moving thousands per day. That is increasing the backlog. I will still ask them to drop by here. ▫ JohnnyMrNinja 18:29, 25 June 2010 (UTC)[reply]

Note, for whoever does this: don't rely on the hash values of the images alone; there is no guarantee that they will be there for older images, or that they will be the same (for example, if someone cropped off whitespace, colour-corrected, or the like). You should also check the height/width attributes of the images before tagging, and the template that is placed should have a bot switch so the admin who deletes can clearly see that it was a bot check and can pay more attention to it. Peachey88 (Talk Page · Contribs) 08:34, 26 June 2010 (UTC)[reply]
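
A sketch of the comparison Peachey88 describes, using the standard imageinfo API (iiprop=sha1|size); the classification labels are illustrative:

```python
import requests

def file_info(host, filename):
    """SHA-1 and pixel dimensions of a file, via the imageinfo API."""
    r = requests.get(f"https://{host}/w/api.php", params={
        "action": "query", "titles": f"File:{filename}",
        "prop": "imageinfo", "iiprop": "sha1|size", "format": "json"})
    page = next(iter(r.json()["query"]["pages"].values()))
    info = page.get("imageinfo")
    return info[0] if info else None

def classify(filename):
    local = file_info("en.wikipedia.org", filename)
    commons = file_info("commons.wikimedia.org", filename)
    if not (local and commons):
        return "not present on both wikis"
    if local["sha1"] == commons["sha1"]:
        return "byte-identical duplicate"
    if (local["width"], local["height"]) == (commons["width"], commons["height"]):
        # Same dimensions but different bytes: crop/colour-correction cases,
        # exactly what the note above says needs human attention.
        return "needs human review"
    return "different files"
```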

I like the idea. But perhaps we should try to do something about the images already tagged with NowCommons before we add more to this backlog. We need admins to delete the images, but every user can help check and fix them. Perhaps we can all try to advertise this project to get more users helping to check the images. If we get 100 users to check 100 images each, we have fixed 10,000 images :-)
Of course, that does not mean that this idea has to be killed. Let's create the bot so it is ready to roll when the backlog has been reduced. If clearing the backlog takes too long, we could create a new template to add to the images that do not match 100%, so they end up in a new category. --MGA73 (talk) 17:27, 26 June 2010 (UTC)[reply]

ArticleAlertBot revive or replace

ArticleAlertBot has been non-functioning for several months. Can it either be revived or replaced? The most useful function is to alert Projects of articles within their scope which have been PRODded or sent to AfD. The big drive to clear out the backlog of UBLPs would be helped enormously if the Projects could have this function restored asap. Any chance someone could help?--Plad2 (talk) 18:05, 22 June 2010 (UTC)[reply]

I already run daily reports for wikiprojects that want unreferenced BLPs within their project; creating new reports for what you want shouldn't be that hard. (see here for the unref BLP report lists) βcommand 18:27, 22 June 2010 (UTC)[reply]
See discussion at WP:BON#ArticleAlertBot. Anomie 19:16, 22 June 2010 (UTC)[reply]

Railway station talk pages not showing WikiProject Stations

Hi there Bot Operators; I think I may have a task for one of you. The talk page banner {{TrainsWikiProject}} has a parameter |stations=yes, which if present, marks the article as falling within WP:STA. Unfortunately, in a significant number of cases, this parameter has been mis-entered as |Stations=yes, which isn't recognised in the intended fashion.

The task: for all talk pages (associated with any namespace) bearing {{TrainsWikiProject}}, or any of its redirects, check within that banner template, and apply changes according to these rules:

  • If the parameter |Stations= is not present, do nothing and move on to the next page.
  • If |Stations= is present, but |stations= is not present, alter |Stations= to |stations=, retaining its existing value.
  • If both |Stations= and |stations= are present, check their values according to the rules of {{yesno|no=no}}:
    • if |Stations= is "no", remove it whatever the value of |stations=
    • if |stations= is "yes", remove |Stations= whatever its value
    • if |Stations= is "yes", but |stations= is "no", alter value of |stations= to "yes" and remove |Stations=

Is that OK as a spec? (I have previously asked Adambro if AdambroBot could do it, but he's busy and pointed me here). --Redrose64 (talk) 19:38, 22 June 2010 (UTC)[reply]
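
For illustration, the three rules reduce to a few lines once the banner's parameters are parsed into a dict; a sketch (parsing and serialization omitted):

```python
def fix_stations(params):
    """Apply the three rules above to a dict of the banner's parameters.

    Parsing the banner into a dict and writing it back is omitted; the spec
    only defines behaviour for yes/no values, so other values of |Stations=
    simply get dropped once |stations= exists.
    """
    big, small = params.get("Stations"), params.get("stations")
    if big is None:
        return params                          # rule 1: nothing to do
    if small is None:
        params["stations"] = big               # rule 2: rename, keep value
    elif big.strip().lower() == "yes" and small.strip().lower() == "no":
        params["stations"] = "yes"             # rule 3: "yes" wins
    del params["Stations"]                     # every remaining case drops it
    return params
```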

Much easier to just do this. –xenotalk 19:42, 22 June 2010 (UTC)[reply]
Thanks; but that doesn't cover the (hypothetical) case of both existing in the template. I say hypothetical, because I don't know of any existing cases; but I'm sure I've encountered it before. If the banner does have both, ie |Stations=yes|stations=yes, this will now be passed on as |tf 1=yesyes, which will be interpreted by {{WPBannerMeta/hooks/taskforces}} as "no". Further, the |stations= parameter is also passed through to {{WPBannerMeta/hooks/tfnested}}, which you seem to have overlooked. --Redrose64 (talk) 19:52, 22 June 2010 (UTC)[reply]
[1] For the latter. Running a query on the former. –xenotalk 19:56, 22 June 2010 (UTC)[reply]
You could instead use |tf 1={{{stations|{{{Stations|}}}}}} Then the lowercase form will have priority over the uppercase form. — Martin (MSGJ · talk) 19:58, 22 June 2010 (UTC)[reply]
I thought of that; but I want the "yes" to have priority over the "no", hence my last instruction on the original spec. --Redrose64 (talk) 20:12, 22 June 2010 (UTC)[reply]
Scan complete, there are only 3 articles with duplicated "stations" parameters: Talk:Berlin Hauptbahnhof; Talk:Chester-le-Street railway station; Talk:22nd Street (HBLR station). 4511 articles use the capital "Stations=", fwiw. Making edits just to change the case would be extremely wasteful now that the template recognizes it. –xenotalk 20:37, 22 June 2010 (UTC)[reply]
Thank you. OK, but I have still edited the three you mention: it's not a good idea to have almost-duplicated parameters. --Redrose64 (talk) 21:17, 22 June 2010 (UTC)[reply]
Definitely. I left them for you to do =) Marking this  Not donexenotalk 22:31, 22 June 2010 (UTC)[reply]

Removing unnecessary hyphens

I often see adverbs ending in -ly as part of compound modifiers which are incorrectly hyphenated, e.g. "a highly-motivated Wikipedian" should be just "a highly motivated Wikipedian". It seems like a simple regex could search for this pattern and correct it in a semi-automated fashion. Are there bots available to do this already or should I make a new one? (I am proficient in C# .NET). Thanks. -Cwenger (talk) 00:39, 23 June 2010 (UTC)[reply]
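
A sketch of the regex in question; the caveat in the comment is exactly why this has to stay semi-automated:

```python
import re

# "<adverb>ly-<word>" -> "<adverb>ly <word>". The pattern also matches words
# like "family-owned", where "family" merely ends in -ly without being an
# adverb -- hence the need for human review or an exception list.
LY_HYPHEN = re.compile(r"\b([A-Za-z]+ly)-(\w+)")

def dehyphenate(text):
    return LY_HYPHEN.sub(r"\1 \2", text)

print(dehyphenate("a highly-motivated Wikipedian"))  # a highly motivated Wikipedian
```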

Bots are not permitted to do spell-checking, but maybe this could be added to WP:AWB/T? –xenotalk 17:44, 25 June 2010 (UTC)[reply]

Italicising article names for species and genera

The scientific names of species and genera are italicised by convention. There are various ways of doing this; see a discussion about it that started at WikiProject Arthropods. In short, there are various templates and workarounds to achieve it. I started a process of adding "{{italictitle}}" to relevant pages using AWB, but it takes a while. As suggested by other project participants, here's a bot request. (I couldn't find anything suitable at the Bot status page.)

The bot would need to check whether any of the "genus", "species", or "binomial" parameters specified in the Taxobox template exactly match the title of the page. If one does, the bot should prepend the page with "{{italic title}}". There are two forms of the template: "{{italictitle}}" and "{{italic title}}"; the bot should prepend the latter. (I've just discovered this morning that "{{italic title}}" is the active form.)

Of course, the bot should ignore any article containing "{{italictitle}}" or "{{italic title}}".

Lastly, some pages achieve the italic title by using the "DISPLAYTITLE" magic word. If possible, it would be better to delete lines of text containing "DISPLAYTITLE" and replace them with "{{italic title}}"; otherwise, ignore pages containing "DISPLAYTITLE".

Although this started as a conversation in WikiProject Arthropods, the issue exists through all WikiProjects under the WikiProject Tree of Life. Thus, there is no need to limit the bot to particular categories or other article classifications.

I don't know how often the bot should run, but suggest that something like weekly would be enough (if it even works that way).

Happy to assist with any clarification, testing and refinement required. Heds (talk) 00:35, 26 June 2010 (UTC)[reply]
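
A sketch of the core check described above (a line-based regex stands in for real template parsing, which is messier):

```python
import re

TAXOBOX_PARAM = re.compile(r"\|\s*(?:genus|species|binomial)\s*=\s*(.+)")
ALREADY_DONE = re.compile(r"\{\{\s*italic ?title\s*[|}]", re.IGNORECASE)

def needs_italic_title(title, wikitext):
    """True if a genus/species/binomial value exactly matches the page title
    and neither {{italic title}} nor DISPLAYTITLE is already present."""
    if ALREADY_DONE.search(wikitext) or "DISPLAYTITLE" in wikitext:
        return False
    for m in TAXOBOX_PARAM.finditer(wikitext):
        value = re.sub(r"''+", "", m.group(1)).strip()  # drop wiki italics/bold
        if value == title:
            return True
    return False
```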

I oppose this as currently framed. With the right parameters, the {{Taxobox}} already automatically italicizes the title; what Arthropods has been doing is adding some useless code to make it more explicit that the title is being italicized. That causes edits like this, which do not change the display of the page in any way. If the Arthropods project wants to do this, fine, but there is no consensus to do this across Tree of Life. Ucucha 06:18, 26 June 2010 (UTC)[reply]
According to Template talk:Italic title, an RFC some time last year did show consensus for italicizing titles in Tree of Life. I also found a small poll from around the same time in WT:WikiProject Tree of life/Archive26 that showed no consensus for preferring either {{taxobox}} or {{italic title}} for doing so. Maybe later discussions reached a different consensus, please link them if so.
Note that a bot can detect the display title for a page using a query like this. Also note that the edit you complain about doesn't actually do what is requested here, as it (uselessly?) adds the "name=" parameter to {{taxobox}} which then makes the article require {{italictitle}} for proper italicization.
That said, before having AnomieBOT work on this I would want to see a clear specification of which articles specifically should be touched and a new poll at WT:WikiProject Tree of Life to verify consensus. Anomie 15:24, 26 June 2010 (UTC)[reply]
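
The displaytitle query Anomie mentions is presumably along these lines; prop=info with inprop=displaytitle reports the rendered title, italics included:

```python
import requests

r = requests.get("https://en.wikipedia.org/w/api.php", params={
    "action": "query", "prop": "info", "inprop": "displaytitle",
    "titles": "Tyrannosaurus", "format": "json"})
page = next(iter(r.json()["query"]["pages"].values()))
print(page["displaytitle"])  # "<i>Tyrannosaurus</i>" when the title is italicized
```
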
That the title should be italicized is not in question, but what Heds, and the Arthropods project, appear to want is adding {{italictitle}} and the redundant |name= parameter, instead of using the built-in italictitle functionality of the taxobox; they believe this makes the syntax more intuitive. Ucucha 15:37, 26 June 2010 (UTC)[reply]

Dusty Articles (WP:DUSTY) needs to be updated

Administrator User:Fuhghettaboutit referred me here after I posted at Wikipedia:Village pump (miscellaneous) about the fact that the Wikipedia:Dusty articles page hasn't been updated since 15 Feb 2010. He informed me that User:DustyBot went offline on that date, and that its operator, User:Wronkiew, has not edited any pages since 4 July 2009.

I would like to know if it is possible to get another bot to update Wikipedia:Dusty articles, as it was a useful tool to help keep articles clean and updated. --Eastlaw talk ⁄ contribs 07:53, 27 June 2010 (UTC)[reply]

I am going to contact Wronkiew by email to see if he would be willing to get the bot working again or give the code to somebody else so they can run it. - EdoDodo talk 07:58, 27 June 2010 (UTC)[reply]

It seems that Svick has opened a BRFA for a bot to do this task. - EdoDodo talk 16:12, 27 June 2010 (UTC)[reply]