Wikipedia:Bot requests/Archive 57


MySQL expert needed

The Wikipedia 1.0 project needs someone with experience working with large (en.wikipedia-size) MySQL databases. If interested, contact me or the WP 1.0 Editorial Team.— Wolfgang42 (talk) 02:19, 20 October 2013 (UTC)

Um, what specifically do you need help with? You'll probably get more people who can help if they know specifically what you need help with. Legoktm (talk) 02:24, 20 October 2013 (UTC)
Writing various queries to pull information out of the database. The problem is that the en.wikipedia database is very large, and I don't know enough about MySQL to be able to work with a dataset where queries can take weeks to complete. — Wolfgang42 (talk) 18:02, 20 October 2013 (UTC)
If queries take that long something is wrong with the database configuration. More often than not you need to add and/or use indexes. If you provide the table structure and index list queries shouldn't take more than a few hours at most. Werieth (talk) 18:08, 20 October 2013 (UTC)
Can you paste the queries? If you're running on labs, there are different databases like revision_userindex which would be faster if you need an index upon user. Legoktm (talk) 21:10, 20 October 2013 (UTC)
The query (being run on the Labs database) is:
SELECT page_title,
    IF ( rd_from = page_id,
        rd_title,
    /*ELSE*/IF (pl_from = page_id,
        pl_title,
    /*ELSE*/
        NULL -- Can't happen, due to WHERE clause below
    ))
FROM page, redirect, pagelinks
WHERE (rd_from = page_id OR pl_from = page_id)
    AND page_is_redirect = 1
    AND page_namespace = 0 /* main */
ORDER BY page_id ASC;

Wolfgang42 (talk) 22:53, 23 October 2013 (UTC)

The results from the second column seem odd, mixing varbinary & int data, and the OR in the WHERE clause doesn't help with the performance. What exactly are you wanting to get from the database? -- WOSlinker (talk) 23:39, 23 October 2013 (UTC)
You're right—I pasted an older version of the code; I've fixed it to be the title both times. (My mistake for not checking that I had the latest copy in version control.) This query is a direct translation of an agglomeration of perl, bash, and C code which was parsing the SQL dumps directly. What it's trying to do is find redirect targets by looking in the redirect table, and falling back to the pagelinks table if that fails.
I would suspect that the 3-way join isn't helping performance any either, but unfortunately it seems to be needed. If there's a better way to do this, I'd love to see it. — Wolfgang42 (talk) 02:30, 24 October 2013 (UTC)

Try this and see if it works any better. -- WOSlinker (talk) 06:00, 24 October 2013 (UTC)

SELECT page_title, COALESCE(rd_title, pl_title)
FROM page
LEFT JOIN redirect ON page_id = rd_from
LEFT JOIN pagelinks ON page_id = pl_from
WHERE page_is_redirect = 1
    AND page_namespace = 0 /* main */
ORDER BY page_id ASC;
Putting the EXPLAIN keyword in front of the query will return the execution plan, indexes used, etc. --Bamyers99 (talk) 19:40, 24 October 2013 (UTC)

Request for a bot for WikiProject Military history article reviews per quarter

G'day, WPMILHIST has a quarterly awards system for editors that complete reviews (positive or negative) of articles that fall within the WikiProject. So far we (the project coordinators) have done this tallying manually, which is pretty labour-intensive. We have recently included GA reviews, and are having difficulty identifying negative GA reviews using the standard tools. We were wondering if someone could build a bot that could tally all FA, FL, A-Class, Peer and GA reviews of articles that fall within WikiProject Military history? In terms of frequency, we usually tally the points and hand out awards in the first week after the end of each quarter (first weeks in January, April, July and October), but it would be useful functionality to be able to run the bot as needed if that is possible. Regards, Peacemaker67 (send... over) 23:36, 13 October 2013 (UTC)

If this comes up, lemme know, 'kay? Maybe some of the other projects might be able to use it as well. :) John Carter (talk) 23:43, 13 October 2013 (UTC)
Hi, could someone clarify if I am in the wrong place (ie is this not a bot thing)? Thanks, Peacemaker67 (send... over) 03:09, 18 October 2013 (UTC)
G'day all, is this something a bot could do? Peacemaker67 (send... over) 19:59, 25 October 2013 (UTC)

New REFBot - feedback on user talkpages

20:11, 17 October 2013 > message A reference problem

05:19, 18 October 2013 > message A reference problem

11:19, 22 October 2013 > message A reference problem Two replies

  • 11:28, 22 October 2013 Not me Squire!
  • 11:28, 22 October 2013 OOPS, was me -fixed! (This editor is a Senior Editor II and is entitled to display this Rhodium Editor Star.)

Bot to download and reupload images to resolve AU legal concerns

The discussion is at Wikipedia talk:WikiProject Australian Roads/Shields, but in summary, there are sets of images transferred from Commons to here as {{PD-ineligible-USonly}}. The user who moved the files (downloaded them from Commons then uploaded them here) wants to remove his involvement due to potential legal issues in Australia. Under existing policy, revdel, oversight, and office actions are not appropriate. It was suggested that a bot could upload the same files under a different name and nominate the old ones for deletion per WP:CSD#F1. - Evad37 (talk) 06:42, 26 October 2013 (UTC)

Marking talk pages of Vital articles

Can someone make a bot to mark all talk pages of Vital articles (all levels) with {{VA}}, and fill out its parameters (level, class, topic) if possible? It should also remove such templates from non-VAs.

Ideally this should run on a regular basis, but even a one-off run would be very helpful. -- Ypnypn (talk) 18:48, 28 October 2013 (UTC)

Bot to tag "PROD Survivors" and "Recreated Articles"

In this first paragraph, I will summarize my request: It would be good if someone could please create a bot which tags articles which were PRODded but survived (I shall call these "Survivors"). And/or which tags articles which were PROD-deleted then recreated (I shall call these "Recreated Articles"). You may tag them with {{old prod full}}. You may leave all the template's parameters blank, or you may fill some in.

Rationale: Such tags warn us not to re-add another PROD tag. They also make it more obvious to us that perhaps we should consider nominating the page for WP:AfD.

Here are some things you could do, but which I don't recommend: You could download a database dump with history, parse it, and look for Survivors. But such a dump is over ten terabytes of XML once uncompressed.[1] You could download the dump of all logs, download the dump of all page titles, parse the two, and look for Recreated Articles. User:Tim1357 tried parsing a dump,[2] but he didn't succeed: the matter is still on the latest revision of his to-do list. I suspect it may not be worth attempting either of these difficult tasks.

Here is what I do recommend: It would be worthwhile to create a bot to watch Category:Proposed deletion and tag future Survivors. And to watch for new pages and tag Recreated Articles. User:Abductive suggests some optional refinements.[3]

It would be good if someone could please start writing a bot to do either or both of these tasks. It would be even better if they could provide us with a link to their code-in-progress. User:Kingpin13 and User:ThaddeusB have expressed interest,[4] but nobody seems to have actually written any code to do these tasks on the live Wikipedia.

User:Rockfang started tagging Survivors in 2008 using AWB (the wrong tool for the job) but later stopped. S/he wrote that s/he "got distracted".

AnomieBOT already does one related task. If an article is AfDed, then recreated, AnomieBOT's NewArticleAFDTagger task puts {{old AfD multi}} on that article's talk page. The task's open-source[5] code is here. Maybe you could build on it, and maybe you could even ask Anomie to run it for you. Dear User:Anomie: Do you know if you or any bot ever tagged the pages which were recreated in the years before you wrote your bot?

Cheers, —Unforgettableid (talk) 04:32, 16 October 2013 (UTC)

For the record, I'm a "he". :) Rockfang (talk) 05:17, 16 October 2013 (UTC)
I do not know of anyone who went back and tagged all articles that had ever been deleted through AfD.
I considered the recreated-after-prod tagging at one point. But the task would require keeping a list of every article that was ever prodded and then deleted without the prod tag being removed, which I didn't think was worthwhile. The AfD tagging is easier, since the bot can just look for Wikipedia:Articles for deletion/{{PAGENAME}}. Anomie 11:45, 16 October 2013 (UTC)
I have investigated and found that probably somewhere between 95% and 100% of PROD-deleted articles have the all-caps string "PROD" somewhere in their deletion logs. So, detecting Recreated Articles would be easier than you think. :) Cheers, —Unforgettableid (talk) 19:22, 16 October 2013 (UTC)
"somewhere between 95% and 100%"? Which is it? Anomie 21:04, 16 October 2013 (UTC)
Out of the couple dozen PROD-deleted articles I checked, each and every one had the string somewhere in their deletion logs. But my sample size was so small that I cannot claim with certainty that 100% of PROD-deleted articles have it in their logs. —Unforgettableid (talk) 00:40, 18 October 2013 (UTC)
Whether the number is 95%, 100%, or somewhere in between, searching for the string is quite easy and quite effective. ISTM it's the best way to identify Recreated Articles. Dear all: what do you think? —Unforgettableid (talk) 06:43, 4 November 2013 (UTC)
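For illustration, the string heuristic described above is small enough to sketch in Python. The helper name is hypothetical, and fetching an article's deletion-log comments (e.g. via the MediaWiki API's list=logevents) is left out:

```python
def looks_prod_deleted(log_comments):
    """Heuristic from the thread: treat an article as PROD-deleted when
    the all-caps string "PROD" appears in any of its deletion-log
    comments. Per the discussion this catches somewhere between 95% and
    100% of cases, so a bot should treat hits as candidates, not proof."""
    return any("PROD" in comment for comment in log_comments)

# Typical deletion-log summaries:
looks_prod_deleted(["Expired PROD, concern was: no sources"])   # True
looks_prod_deleted(["G7: One author who has requested deletion"])  # False
```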
I think the best way to handle it is get the All articles proposed for deletion category periodically. If an article was in one iteration and not the next, it was either deleted or the tag was removed. --ThaddeusB (talk) 17:55, 4 November 2013 (UTC)
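ThaddeusB's snapshot-diff idea reduces to a set difference. A minimal sketch, assuming the two category snapshots (lists of titles from Category:All articles proposed for deletion) have already been fetched, e.g. via list=categorymembers; the function name is hypothetical:

```python
def pages_leaving_category(previous_members, current_members):
    """Titles present in the previous snapshot but absent from the
    current one. Each such page was either deleted (the PROD ran its
    course) or had its tag removed (a candidate "Survivor"); a second
    check against the live wiki distinguishes the two cases."""
    return sorted(set(previous_members) - set(current_members))
```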
I have long intended to make a bot to tag PROD survivors... is anyone else planning on programming this? If not, I can try to get started on it next week. --ThaddeusB (talk) 19:25, 18 October 2013 (UTC)
Dear ThaddeusB: If you do end up writing such a bot, please do let us know. :) Cheers, —Unforgettableid (talk) 06:33, 4 November 2013 (UTC)
If so, please name it CattlePROD Bot! Headbomb {talk / contribs / physics / books} 17:34, 4 November 2013 (UTC)
Since no one else seems interested, I will try to get to this by the end of the week. --ThaddeusB (talk) 17:55, 4 November 2013 (UTC)

Redirects in templates after page moves

Per WP:BRINT, redirects are undesirable in templates. Currently after a page move, bots (bless their hearts) sweep up all of the broken or double redirects etc., but the links in templates are left untouched. For instance, a page was moved from here to here in January but the accompanying template was not updated until today. Is it possible for a bot to fix redirects on templates that are on a page that is moved? Rgrds. --64.85.216.235 (talk) 05:51, 4 November 2013 (UTC)
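The core edit such a bot would make to each affected template can be sketched as follows. This is a hedged prototype: the function name is hypothetical, and it handles only the plain [[Old]] and [[Old|Label]] link forms, not piped tricks, section links, or template parameters:

```python
import re

def retarget_links(template_wikitext, old_title, new_title):
    """After old_title is moved to new_title, point direct wikilinks in
    a template at the new title. [[Old]] keeps its displayed text by
    becoming [[New|Old]]; [[Old|Label]] keeps its label."""
    pattern = re.compile(r"\[\[" + re.escape(old_title) + r"(\|[^\]]*)?\]\]")

    def fix(match):
        label = match.group(1)  # "|Label" or None
        if label:
            return "[[" + new_title + label + "]]"
        return "[[" + new_title + "|" + old_title + "]]"

    return pattern.sub(fix, template_wikitext)
```

The bot would run this only over templates transcluded on the moved page itself, per the priority discussed below the request.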

Possible order of priority here. It may not be needed to move the redirect if the template is not actually on the page being moved. For example, if the Professional Fraternity Association changed its name to something else, it would cause a redirect on Template:Kappa Kappa Psi since that has a link to the PFA, but wouldn't need to be fixed as badly since the PFA page doesn't include the Kappa Kappa Psi template. — Preceding unsigned comment added by Naraht (talkcontribs) 17:47, 4 November 2013 (UTC)‎
Yes, to reiterate, a bot that can fix redirects on the templates that are currently on the page that is moved is the priority. Templates that link to the article but are not on the moved page are not priority. Rgrds. (Dynamic IP, will change when I log off.) --64.85.216.79 (talk) 20:55, 4 November 2013 (UTC)

Star Wars Bot needed?

Bot for Star Wars articles needed, maybe? Might help monitor changes. 20-13-rila (talk) 11:19, 5 November 2013 (UTC)

This is not a proper bot task request that can be implemented, especially without detail. What changes is it supposed to monitor? —  HELLKNOWZ  ▎TALK 11:17, 5 November 2013 (UTC)
I was thinking that it might help with the Star Wars WikiProject, which I am a member of. I am not sure if it is needed, which is why I would like to discuss it. 20-13-rila (talk) 11:35, 5 November 2013 (UTC)
May I suggest you discuss this with the project first and come up with a concrete proposal of what task(s) can be done and how. Without further detail, I doubt you will find many interested parties on this page (which is for requesting specific tasks). —  HELLKNOWZ  ▎TALK 11:37, 5 November 2013 (UTC)
Thank you, Rila 20-13-rila (talk) 09:32, 6 November 2013 (UTC)

Mash together two FA-related lists?

We have WP:FANMP (a list of FAs yet to appear on the main page) and WP:WBFAN (a list of FAs and former FAs by nominator). Can someone think of a way to produce a hybrid for me, i.e. a list of FAs yet to appear on the main page by nominator? BencherliteTalk 20:30, 5 November 2013 (UTC)

Shutdown of blogs.amd.com

It seems that the articles have been moved to http://community.amd.com and http://developer.amd.com. I think all links to http://blogs.amd.com should be marked with {{dead link}} at least. Please fix them semi-automatically if you can. --4th-otaku (talk) 12:35, 4 November 2013 (UTC)
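A minimal sketch of the requested tagging, assuming the links appear as bracketed external links in wikitext. The regex skips links already followed by a {{dead link}} tag, and the date parameter shown is illustrative:

```python
import re

# Bracketed external link to blogs.amd.com, not already tagged as dead.
AMD_LINK = re.compile(
    r"(\[https?://blogs\.amd\.com[^\]]*\])(?!\s*\{\{[Dd]ead)"
)

def tag_dead_amd_links(wikitext):
    """Append {{dead link}} after each blogs.amd.com external link, as
    the request asks; already-tagged links are left alone."""
    return AMD_LINK.sub(r"\1{{dead link|date=November 2013}}", wikitext)
```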

That's fine, I'll do them all quickly enough tomorrow. Rcsprinter (orate) @ 00:09, 6 November 2013 (UTC)
I can't seem to find any articles which link to blogs.amd.com. Rcsprinter (message) @
[6] There's not a whole lot. —  HELLKNOWZ  ▎TALK 17:12, 8 November 2013 (UTC)
Good find; it'll have to be tomorrow I run the thing. Rcsprinter (post) @ 22:00, 8 November 2013 (UTC)

Help needed tracking recent changes to medical content

User:Femto Bot used to populate Wikipedia:WikiProject Medicine/Recent changes which in turn updated Special:RecentChangesLinked/Wikipedia:WikiProject Medicine/Recent changes. I think that's how it worked. It reported all changes to pages with {{WPMED}} on their talk page. Anyway, it was an awesome tool for patrolling some of Wikipedia's most sensitive content. But since Rich Farmborough was banned from bot work it's stopped working - it only reports recent changes to pages beginning with "A".

This tool aims to do the same thing but it's slow and often times out, and when it works it's running a couple of days behind.

There was also Tim1357's tool, but his account has expired from the Toolserver.

I was wondering if somebody here would be able to provide WP:MED with something to replace these? With something like this a handful of experienced medical editors can effectively patrol all of Wikipedia's medical content. Without it, there's no telling what's happening. --Anthonyhcole (talk · contribs · email) 17:58, 4 November 2013 (UTC)

It appears that the source code for the bot is not available. I see you have attempted to contact him; if you are successful in getting the code, I can take it over and run it as needed. However, I fear he will not be able to run it himself, due to ArbCom. --Mdann52talk to me! 13:23, 5 November 2013 (UTC)
Should be fairly trivial to write something like this up. Werieth (talk) 13:40, 5 November 2013 (UTC)
1. See VPT for current development.
2. I have asked for a module solution in Lua (the answer was negative for full automation)
3. I am putting a fresh page up manually right now.
4. I moved the RELC page to Wikipedia:WikiProject Medicine/List of pages/Articles for future automation and expansion (and because the old page name was incorrect).
Will be back later on. -DePiep (talk) 14:29, 5 November 2013 (UTC)
Yes. (Just curious: my page is 795k (28,391 articles); yours, without the bullets, is 871k. Does it use another source category? I started AWB for this, checking four categories deep.)
Now, the MED people are served and I have little time today & tomorrow. So I'll pick it up later. In short, this is my concept around the bot action:
  • A project editor can put a notice (template) on the Project page. The template is called like "{{RELC list: please bot make some RELC lists for this project}}". Parameters are set for: |RELC list1 namespace1=Article [space] + Talk [space], |RELC list2 namespace2=Template + template talk, |other parameters like 1x/month=. The template is invisible. Just like what User:MiszaBot/config does on talkpages to archive.
  • The bot sees the request and writes the list on a dedicated "RELC list" page (in its own section, say ==Pages==; the bot is not the only one that writes on that page).
  • Systematic page names are built like:
Wikipedia:WikiProject Medicine/List of pages our top page
Wikipedia:WikiProject Medicine/List of pages/Articles
Wikipedia:WikiProject Medicine/List of pages/Articles + Talks 0-9 A-M
Wikipedia:WikiProject Medicine/List of pages/Articles + Talks N-Z
Wikipedia:WikiProject Medicine/List of pages/Articles + Talks
Wikipedia:WikiProject Medicine/List of pages/Templates
Wikipedia:WikiProject Medicine/List of pages/Templates + Template talks
Wikipedia:WikiProject Medicine/List of pages/Non-articles
Wikipedia:WikiProject Medicine/List of pages/Non-articles + non-articles talks
The naming suggestion: first, use namespace names in the plural; readers see this at the top of the RELC special page, so a natural page name is valuable. We also need codes for the "all non-articles" and "A-M" requests.
  • A template for project page, now {{RELC list}}, will use these page definitions too (so we must agree on the names and other protocols), and produces the special links on a project page (as {{RELC list}} does for WP:MED now).
  • There are also other templates like {{RELC list/Listpage header}}
  • FYI, I built such a set of list pages, maintained manually, for WP:ELEMENTS at {{WikiProject Elements page lists}}.
  • Trick: the page should contain its own name, so the RELC reader sees: "page was updated on ...".
  • Trick: necessary off-topic pages, like the header template, would appear in the RELC view after edits (disturbing the view because they are themselves not on topic). I created and used a redirect, which does not change and so does not appear in the special view.
  • Will go writing on the WT:MED page now.
See you Thursday. -DePiep (talk) 16:50, 5 November 2013 (UTC)
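The "0-9 A-M" / "N-Z" split in the page-name list above is easy to prototype. A sketch with a hypothetical function name; non-alphabetic initials (digits, characters like ǂ in click-language titles) are lumped into the first list:

```python
def split_titles(titles):
    """Partition page titles into the two report pages named above:
    "Articles + Talks 0-9 A-M" and "Articles + Talks N-Z"."""
    first, second = [], []
    for title in sorted(titles):
        initial = title[:1].upper()
        (second if "N" <= initial <= "Z" else first).append(title)
    return first, second
```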
I'm pretty sure that I can generate a list based on any criteria you need. User:Werieth/sandbox didn't use a category; rather, it used a list of all pages that had {{WikiProject Medicine}}. Defining how you want the lists generated should be doable; we would just need to define a template setup similar to User:MiszaBot/config. The important factor for getting this going is to clearly and simply define things. Break it down to the very basics of what you are looking for; don't factor in how something is done, just what you want done, and leave the how for me. Werieth (talk) 17:52, 5 November 2013 (UTC)
If you're asking what functionality WP:MED needs, I was very happy with Special:RecentChangesLinked/Wikipedia:WikiProject Medicine/Recent changes in terms of speed and features. --Anthonyhcole (talk · contribs · email) 18:44, 5 November 2013 (UTC)
re Anthonyhcole@. The page you mention had its last update in 2012. That could be solved, of course, by updating it today. But there is also this: it was 35k in size, which means it listed only a small part of all the WP:MED pages. See this old version of that page. How small? Today the updated page (named Wikipedia:WikiProject Medicine/List of pages/Articles; big page) has 28,391 MED articles listed and is 700k. That means your page listed only 35/700 = 5%, about 1,400 pages. It did not serve its purpose: one never saw that B-cell chronic lymphocytic leukemia (the first "B" page) was edited. A different check: today I tested the RELC workings with only the MED articles starting with "A" (2,500 pages, a 70k page). So the old page did not even have the "A" entries complete. It was missing 95% of its target. How was that a good feature?
About speed: opening the Special page that shows the edits (WP:MED Articles - Related changes), the special page we want, has acceptable speed; it is not slow (for me). Anyway, we should not "improve" it by leaving MED articles out at random, should we? It is opening the big list page itself that is slow (700k). That is why I advise readers to leave that page alone; only the Special RELC page reads it (fast) to produce the desired overview.
If I am missing something, or mistaking your point, please tell me. User experiences (good and bad) are best reported at WT:MED. -DePiep (talk) 19:23, 5 November 2013 (UTC)
  • re Werieth@. OK, what you say is what I meant to say (in a hurry). And yes, I'll define things more crisply and clearly. We have a start. I suggest we develop this bot+template project over at Template talk:RELC list from now on. See you there? This thread can be closed, I guess, since you picked it up. -DePiep (talk) 19:33, 5 November 2013 (UTC)
  • It looks like AnomieBOT's WatchlistUpdater does what is needed here. Anomie, would there be any problems with setting up a watchlist page for WP:MED? — Mr. Stradivarius ♪ talk ♪ 02:12, 6 November 2013 (UTC)
    • I'd have to seek approval for that task to edit outside the bot's userspace, and I don't recall offhand if the code is set up to handle 28000+ articles. It sounds like Werieth is working on a bot for this, so I'll leave it to them. Anomie 02:21, 6 November 2013 (UTC)
      • Fair enough - thank you for your quick response, and thank you Werieth for taking this on! — Mr. Stradivarius ♪ talk ♪ 03:09, 6 November 2013 (UTC)
  • FWIW, I have a clone of Tim's tool at [7]. (Yes, I'll fix the mixed content issues eventually). Legoktm (talk) 02:25, 6 November 2013 (UTC)
FYI, the tool has been renamed from "RELC list" to Page reports. -DePiep (talk) 21:51, 8 November 2013 (UTC)

Bot for adding links to OLAC resources about languages

The "OLAC" (Open Language Archives Community) website has consistently helpful pages about resources for the languages of the world, especially the endangered and lesser-taught languages. The OLAC pages use a URL which ends with a three-letter code from the ISO 639-3 language code list, which is found in our language articles infobox. Each OLAC page has a nice descriptive title at the top, such as OLAC resources in and about the Aguaruna language.

Rather than adding several thousand OLAC page links to the External links sections of language articles by hand, couldn't we just write a bot to do this?

I know some languages have multiple language codes in their Wikipedia infobox, due to multiple dialects or language variants. Even if the bot didn't add links for languages with multiple codes, it would still be a big time-saver!

What do you think? Djembayz (talk)
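A sketch of the wikitext such a bot might generate per article. The URL pattern (site root plus /language/ plus the ISO 639-3 code) is an assumption based on the description above and should be verified against a live OLAC page before any bot run; the function name is hypothetical:

```python
def olac_external_link(iso_639_3, language_name):
    """External-links entry for one OLAC page, keyed on the article's
    ISO 639-3 code from the language infobox. The link text mirrors the
    descriptive title shown at the top of OLAC pages."""
    return ("* [http://www.language-archives.org/language/{code} "
            "OLAC resources in and about the {name} language]"
            .format(code=iso_639_3, name=language_name))
```

Per the request, articles whose infoboxes carry multiple codes would simply be skipped in a first pass.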

If you look at ǂKx'ao-ǁ'ae you'll see that it already has a language infobox, and that includes a link off to an external site. Why not modify the infobox to add the language-archives.org link? Josh Parris 00:10, 9 November 2013 (UTC)

Switch Internet Archive links to HTTPS

Without getting too deep into tin foil territory, encrypting is one of many essential steps to ensure readers' privacy. Since October 24, 2013, the Internet Archive now uses HTTP Secure (https://) by default [8]. Just this week they updated their server software so it can handle TLS 1.2, the latest version. It is safe to say they encourage their visitors to access their site using an encrypted connection.

In my opinion, Wikipedia should support this effort and switch all outgoing links to the Internet Archive to HTTPS. According to Alexa, Wikipedia currently ranks fourth among upstream sites to archive.org [9]. {{Wayback}} was already updated in that regard, but most of the links to the Wayback Machine are implemented in one of the many citation templates as encouraged at WP:WBM. I started to fix a lot of those links manually, before realizing it would be a perfect job for a bot.

The Wayback Machine links have a common scheme, e.g. https://web.archive.org/web/20020930123525/http://www.wikipedia.org/. So the task is this: find http://web.archive.org/web/ throughout the article namespace and replace with https://web.archive.org/web/. That's it. --bender235 (talk) 20:51, 8 November 2013 (UTC)
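The replacement described above is a single anchored substitution; a sketch:

```python
import re

WAYBACK_HTTP = re.compile(r"http://web\.archive\.org/web/")

def wayback_to_https(wikitext):
    """Switch plain-HTTP Wayback Machine links to HTTPS. Only the scheme
    of the web.archive.org prefix changes; the timestamp and the archived
    target URL (which may itself be http://) are left untouched."""
    return WAYBACK_HTTP.sub("https://web.archive.org/web/", wikitext)
```

The substitution is idempotent: already-secure links contain no literal "http://web.archive.org" and are not touched on a second pass.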

See WP:NOTBROKEN and WP:COSMETICBOT Werieth (talk) 21:08, 8 November 2013 (UTC)
This is not a cosmetic change. It's not like switching http://archive.org/ to http://web.archive.org/, which would indeed change nothing. But switching to HTTPS changes the transport mechanism from unencrypted to encrypted. Even though it looks simple, it has significant consequences. --bender235 (talk) 21:17, 8 November 2013 (UTC)
In this case, changing the transport protocol doesn't make much of a difference. No data other than the page contents (which can easily be retrieved via both secure and non-secure methods) is being transmitted, so the possible intercepted-data risk is nil. All it would do is create a false sense of security. If you really think it should be done, you might look into a Lua replacement module that can be plugged into the citation templates. Werieth (talk) 21:27, 8 November 2013 (UTC)
Lua is a most clueful suggestion. Josh Parris 23:51, 8 November 2013 (UTC)
I agree with Werieth, this is a huge number of edits (over 160,000 in mainspace) for something that's not broken. If we really want to do this, it would be better as something like a low-priority (only done in combination with more significant changes) change in another tool like AWB. Mr.Z-man 21:22, 8 November 2013 (UTC)
Okay, I'll do that. --bender235 (talk) 23:30, 8 November 2013 (UTC)
I think you've misunderstood. I said it should be done only in combination with more substantial changes. I certainly wasn't saying to go and make 160,000 edits with AWB. Mr.Z-man 00:34, 9 November 2013 (UTC)
I won't do that. I just added it to the regular typofixing scan I do regularly anyway. --bender235 (talk) 00:36, 9 November 2013 (UTC)
Note to everyone: I started a discussion on this over at Village Pump. --bender235 (talk) 10:40, 9 November 2013 (UTC)

Adding ISOC (international), SOC (US) and NOC (Canada) job codes to professions infoboxes

Would it be possible to import these three standardized codes into the professions infoboxes? --Teolemon

Importing CNP Codes (Quebec and Canada)
Importing International Standard Classification of Occupations Codes (International)
  • International Standard Classification of Occupations (en)
  • Standard International Classification code for jobs. "ISCO is a tool for organizing jobs into a clearly defined set of groups according to the tasks and duties undertaken in the job."
  • XLS Structure of those codes: http://www.ilo.org/public/english/bureau/stat/isco/index.htm
  • Value for Librarian is: 2622 (Librarians and related information professionals)
Importing SOC Codes (US)

The individual occupation items don't yet have any SOC codes associated with them, but they are in broad occupation categories on enwiki, which should make them easier to match:

Here's the list of SOC codes for matching with the existing items.

— Preceding unsigned comment added by 2A01:E35:2EA8:950:5BF:1AF3:3374:F3D0 (talkcontribs)

New REFBot

copied from WP:VPT --Frze > talk 07:17, 18 October 2013 (UTC)

Thanks - Vielen Dank!

DPL bot and BracketBot are the best inventions on Wikipedia. It's time for a new bot. We need the REFBot.

If a user contributes a broken reference name, incorrect ref formatting (or a missing reflist), please inform the user who caused the error. It is outrageously hard work to correct all these errors afterwards for someone who lacks the factual knowledge. For example: it took me a week to work through the backlog of Category:Pages with broken reference names - more than 1500 items, some disregarded for more than two years. Searching with WikiBlame for the first entry of the ref, making the changes, informing the users... annoying. Thank you very much. --Frze > talk 12:25, 17 October 2013 (UTC)

I ask for a little consideration, please. Each fix takes several minutes of work, often because of a single missing character, ten times and more per day. For example, see Cite error: The named reference Media2 was invoked but never defined. What's wrong? > Compare selected versions > Fix broken reference name. Why doesn't a bot send a message to the editor who caused the error? Why must other users clean up the mess? With BracketBot and DPLBot it is so easy. --Frze > talk 06:44, 18 October 2013 (UTC)
Example:

== A reference problem ==

Hi [[User:SpidErxD|SpidErxD]]! Some users have been working hard on [[:Category:Pages with broken reference names]].

[https://en.wikipedia.org/w/index.php?title=Nuclear_program_of_Iran&diff=577623223&oldid=577620891 Here] you added new references '''ref name=OPBW''' and '''ref name="status"''' but didn't define them. This has been showing as an error at the bottom of the article. <small>'''''Cite error: The named reference was invoked but never defined.'''''</small> Can you take a look and work out what you were trying to do? Thanks --User:REFBot

Let's try... See User talk:SpidErxD#A reference problem Thanks --Frze > talk 08:00, 18 October 2013 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

A bot-issued message would have to be much vaguer than that. If revision X has no error, and revision X+1 has an error "The named reference Foo was invoked but never defined", then many different things could have gone wrong. The edit could have added the named reference, or deleted the last call to it, or accidentally disabled the last call to it by damaging the syntax of some earlier reference or some template. -- John of Reading (talk) 08:21, 18 October 2013 (UTC)
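Whatever wording the bot uses, detecting the error class itself is straightforward. A simplified sketch (function and pattern names hypothetical; it ignores group=, list-defined references, and refs emitted by templates, which is part of why the message must stay vague). Running it on revision X and X+1 and diffing the results shows which edit introduced the error:

```python
import re

# Definition: <ref name=X> ... </ref>   Invocation: <ref name=X />
REF_DEF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*>', re.I)
REF_USE = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*/>', re.I)

def undefined_named_refs(wikitext):
    """Named references invoked but never defined in this revision --
    the "was invoked but never defined" cite error discussed above."""
    defined = {name.strip() for name in REF_DEF.findall(wikitext)}
    invoked = {name.strip() for name in REF_USE.findall(wikitext)}
    return sorted(invoked - defined)
```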
I've been working on broken reference name problems for quite a while. A bot along the lines of BracketBot would be very helpful. Many of the "ref name" problems happen when experienced users are working on pages, copying material between articles, or doing cleanup rapidly. (AWB users like to change hyphens or spaces in reference names for example.) Experienced users often don't slow down to preview pages and look for errors at the bottom. These are the editors who would immediately fix problems if they were informed. AnomieBOT 's orphan reference fixer does a great job catching many things, but those put in accidentally by experienced users are often more subtle and harder to track down.
Unlike BracketBot, it would be hard to give a message that pinpoints the problem. But also unlike bracket problems, reference errors are easy to see in the article. It would be enough just to put a copy of the generated error message on the user's talk page. StarryGrandma (talk) 18:19, 18 October 2013 (UTC)
Regardless of the details of how it would work out, I see this proposal as a wonderful idea. If we want a vague thing, we could use something like "$EDIT1 you made to $PAGE2 resulted in a reference coding error. Please return to the page to fix the problem. If you would like assistance with fixing this problem, please visit the Help Desk, where other editors will be happy to assist you." It would be sufficient for the experienced editors, and inexperienced editors would either understand what to do or they'd know where to go to get help. Nyttend (talk) 01:52, 19 October 2013 (UTC)
I like the idea a lot!
As for the wording, pooling from Nyttend, 64.40.54.174 and User:BracketBot/inform; how about something like: "Information.svg Hello, I'm REFBot. I have automatically detected that [your edit] to [page] may have caused a reference coding error. Please take a look at the page and edit it to fix the problem if you can. If you would like assistance with fixing this problem, please visit the Help Desk, where other editors will be happy to assist you. If I misunderstood what happened, or if you have any questions, you can leave a message on [my operator's talk page]. Thanks ~" benzband (talk) 10:37, 22 October 2013 (UTC)
  • Agree a bot like this would be a good idea. It might be hard to eliminate false positives though. I would suggest the bot leave a message saying something like "an error was found, please take a look" rather than "you caused an error". Best. 64.40.54.174 (talk) 05:06, 19 October 2013 (UTC)
    • That's my opinion too. Just send a reminder to the editor who caused the problem a few minutes ago. Then they could fix the error in one minute, instead of us, who sometimes need 10 minutes or more (at least these steps: see category pages with citation errors > open page > read > edit page in another window > view history in another window > compare selected versions > ponder what's wrong > fix > preview > save > close all pages...). Just now there are 250 pages in all three categories; that means ten hours of work even when you need just 2½ minutes per page. The message could be just:

"Take a look at the page XYZ. There is a citation error. It could be in the text:

  1. A <ref> tag is missing the closing </ref>.

- or take a look at the bottom of the page:

  1. There are <ref> tags on this page, but the references will not show without a {{reflist}} template.
  2. A named reference was invoked but never defined.
  3. A reference was defined but isn't used in the text.

Thanks, RefBot talk 10:05, 21 October 2013 (UTC)"

Pages with citation errors get traffic of about 25 views per day. (See [10][11][12].) That could mean that about 10 or so users have been working hard on these pages with citation errors. If there were a REFBot, this work could be reduced to 25–35%. --Frze > talk 10:05, 21 October 2013 (UTC)
  • This is a great idea. Categories such as Pages with missing references list and other error cats tend to fill up rather quickly, even with our attempts to maintain them. Other bots try to fix referencing errors, but there are so many different types of errors, all of which can be caused by multiple syntactical errors (here's a small list). It would be hard to program a bot to fix all of them. With this bot, all that is needed is for it to check whether the error is present, and in which edit it first appeared (to notify the correct editor). It does not need to check the specific syntax that caused the error. The editor will be able to see what they did and fix it easily. It's a much simpler solution than programming a bot to fix them itself.
    • "It might be hard to eliminate false positives though." There might be a small problem with checking whether templates are involved. I've seen a fair share of error messages in articles that resulted from an edit to a template and not an edit to the article itself. The error message still shows up in the article, e.g. if a user adds a citation to the template and there isn't a {{Reflist}} template in the article. The bot would have to check for that, I assume. — JJJ (say hello) 15:15, 22 October 2013 (UTC)
  • I support this request for creation of a RefBot. Pointing editors towards problems in their edits not only reduces the need for other editors to fix these problems later but provides an opportunity for the original editors to become aware of problems they are creating and how to avoid/fix them. - - MrBill3 (talk) 15:40, 22 October 2013 (UTC)
Hey all, sorry it took me so long to comment on this idea. It seems a great idea, and there looks to be enough support on the issue. I will try and have a look into making this, though the main issue is that I don't have much time with all my university work. (Though stay tuned, because my year project is for Wikipedia.) If anyone could help me by finding out how to detect the errors in real time, I'll look into modifying BracketBot's code to make ReferenceBot. (Or whatever name you want to vote for.) 930913(Congratulate) 18:31, 27 October 2013 (UTC)
@Frze: Why do you distract me? D: Anyway, I have made a script to collect the previous day's mistakes.
I need these (a random sample?) checked to ensure that a notice is appropriate, and for each of the categories I'll need a template, or a single template with an insertion for relative phrases. 930913(Congratulate) 01:13, 28 October 2013 (UTC)
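The collection script mentioned above isn't shown; the following is a minimal sketch (not 930913's actual code) of how such a collector could work, using the MediaWiki API's `categorymembers` list sorted by the time each page entered the category. The category name and cutoff date are examples only.

```python
# Hypothetical sketch: collect pages that entered a citation-error tracking
# category after a cutoff time, via the MediaWiki API's categorymembers list.
import datetime
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def new_members_params(category, since):
    """Build API parameters for category members added after `since`."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmsort": "timestamp",   # order by when the page entered the category
        "cmdir": "newer",
        "cmstart": since.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "cmlimit": "500",
        "format": "json",
    }

def fetch_new_members(category, since):
    """Query the live API (network access required)."""
    url = API + "?" + urllib.parse.urlencode(new_members_params(category, since))
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return [m["title"] for m in data["query"]["categorymembers"]]

# Example: parameters for "yesterday's mistakes" in one category.
yesterday = datetime.datetime(2013, 10, 27)
params = new_members_params("Pages with broken reference names", yesterday)
```

The same parameter set, with `cmtitle` varied, covers each of the four categories discussed below.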
  • 930913, I've got code that checks ISBN problems. It finds these problems. Yell if you want it.
  • The main problem with BracketBot that needs to be fixed ASAP is that people go to 930913's talk page with questions. Huon answers most of the questions, with GoingBatty helping out. Questions for BracketBot and RefBot need to be directed to where more people can help out. Bgwhite (talk) 04:46, 28 October 2013 (UTC)
930913: Thanks for your efforts. Here are the categories we are interested in: Category:Pages with citation errors
Category:Pages with broken reference names
Category:Pages with incorrect ref formatting
Category:Pages with missing references list --Frze > talk 04:00, 29 October 2013 (UTC)
@Frze: Added
Again, please check for reasons why any of the above shouldn't be included, and come up with wording for each category. Thanks, 930913(Congratulate) 17:52, 29 October 2013 (UTC)
I am the editor heading the "Yesterday's mistakes" list, with:
Arjayay edited Transcendental Meditation technique causing Category:Pages with ISBN errors, Category:Pages with citations using unsupported parameters, Category:Pages using citations with old-style implicit et al., Category:Pages using citations with accessdate and no URL
You clearly need to look at these edits more closely, before rushing in with a bot. My "edit" was to reinstate a page which had been blanked, and replaced by an inappropriate redirect. All of the mistakes I am accused of "creating" were, therefore, already there. If editors are going to be chased by a bot every time they reinstate a blanked page, or a blanked section, you are going to be inundated with complaints, and editors will either ignore your notifications, or turn the bot off.
You need to be able to identify when an editor has actually introduced the problem, and when they have merely reverted some vandalism, which includes problem(s) potentially made by numerous editors over a long period of time. Wikipedia has an unfortunate history of launching half-tested software, e.g. BracketBot, which doesn't understand basic things such as the use of greater-than or less-than symbols and points editors to the wrong place, so I fear the worst - Arjayay (talk) 18:25, 29 October 2013 (UTC)
@Arjayay: More closely? At all. I don't have much time, so I'm relying on people like you to raise these issues, so I can properly code the bot. Obviously flagging reverts is undesirable, and will need to be removed for the approved implementation. Thank you for your participation, 930913(Congratulate) 22:32, 29 October 2013 (UTC)
930913: Thanks for your efforts again. We are only interested in: Category:Pages with citation errors, nothing else!
Category:Pages with broken reference names
Category:Pages with incorrect ref formatting
Category:Pages with missing references list
because if there are more than 50 items in a category, it shows as a backlog.
The categories you mentioned are impossible to work on; please try them later. There is a backlog today of about 60,000 pages:
Category:Pages with ISBN errors - 7,598 pages
Category:Pages with citations using unsupported parameters - 10,216 pages
Category:Pages using citations with old-style implicit et al. - 4,408 pages
Category:Pages using citations with accessdate and no URL - 42,060 pages
We have to clean up the Category:Pages with citation errors first, so as not to leave any big red errors on the pages. Thank you for your attention --Frze > talk 21:26, 29 October 2013 (UTC)
@Frze: Irrelevant; the script would work by notifying anyone who puts a page in those categories (i.e. not those already there). This bot will not clear the backlog; it will slow the backlog's growth, so as to aid your attempts to clear it. 930913(Congratulate) 22:32, 29 October 2013 (UTC)
A930913: I know that the bot does not clear the backlog. I am not meshuga. But first of all we must not let the backlog in the Category:Pages with citation errors grow again to more than 1,500 pages! It took us more than a week of hard work to clean up. Do what you want with the other categories; the main thing is to start the bot for Category:Pages with citation errors. In most exquisite gratitude --Frze > talk 23:09, 29 October 2013 (UTC)
@Frze: The point is, for very little work, we cover a whole load more categories. 930913(Congratulate) 23:21, 29 October 2013 (UTC)
A930913: The other point is the message to the user, as simple as possible. See benzband's contribution of 10:37, 22 October 2013. Please program the two different bots. --Frze > talk 23:44, 29 October 2013 (UTC)
"Please program the two different bots." The whole point of this bot is to notify editors. There will only be one bot. — JJJ (say hello) 00:38, 30 October 2013 (UTC)
@Frze: You're missing the point, each category can have its own message. 930913(Congratulate) 00:37, 30 October 2013 (UTC)
930 and JJJ, methinks what Frze was talking about is that there is a prioritized-need for the various use-cases of the bot, with citation-errors being the most critical (since they are very difficult to correct without help from the editor who originally created the trouble). The least critical is the 42k pages that have "cite-w/-accessDate-but-no-URL" which basically means, somebody cited a page-number from a printed book, and then went ahead and specified that their info was 'retrieved on' November 1st of 2013. Which is pointless, since as long as they specify edition/format/isbn of the book, the date they looked up the fact in that book does not matter, the retrieved-on param is only for URL-based cites, since the contents of the URL often suffer from bitrot, whereas deadtree books do not so suffer.
   Anyways, while I understand that there need only be a single bot (or filter, see my suggestion below), I strongly suggest that RefBot should not be turned loose on all the possible error categories, simultaneously. We do not want to put template-spam on the talkpage of a user, which says "you made a reference-coding error" when all they did was harmlessly put the retrieved-on param into a cite from a deadtree book. Why waste their time? But we especially don't want to *rollback* such changes, of course. Point being, although there will only be one bot, I agree with Arjayay about the importance of doing serious careful testing, so that we are dead-sure RefBot handles all the odd corner-cases properly, before we unleash it on millions of unsuspecting editors. We do not want to *discourage* people from adding references! That's hard enough to get them to do in the first place.
   Therefore, at the end of the day I agree with Frze, but for a different reason: strongly suggest that RefBot be implemented so that it can be up-front configured to ignore all categories which are not explicitly specified. That way, we can do a staged rollout, testing RefBot against a small subset of all possible cite-errors, before we expand the category-count. Eventually, of course, RefBot may be so bullet-proof-user-friendly that we can permit it to ignore nothing, and at that point the category-inclusion-exclusion-code can be taken out. But in the first months of RefBot testing, methinks it will prove very valuable to just focus on one sort of cite-error at a time, beginning with Category:Pages with citation errors where we know there are some beta-testers full of wikithusiasm for the wondrous powers of RefBot.  :-)   Appreciate your time; thanks for improving wikipedia. 74.192.84.101 (talk) 11:33, 4 November 2013 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── A930913 TheJJJunk I'm looking forward with happy anticipation to the implementation of my idea. Thanks to you all. --Frze > talk 04:28, 30 October 2013 (UTC)

 Question: Is the bot going to have an opt-out option like BracketBot does? Because with the brackets, the edit could have been very minor, not causing large-scale damage to the article. But with this, big red error messages appear because of the mistakes. This is something to think about: do users get to decide if they get notified about their error, or do they have to be? — JJJ (say hello) 17:13, 30 October 2013 (UTC)
Comment. There are other possibilities... with xLinkBot, if the editor submits an external link to facebook (or some other greylisted website), xLinkBot will perform a rollback, then notify the editor on their talkpage. This is often problematic, because even if the edit was ten kilobytes, xLinkBot will remove everything -- parsing out just the greylisted hyperlink is too server-intensive and error-prone. Another problem, is that rollback can wipe out a long series of edits, only the last of which added the greylisted link. Given these existing possibilities, and the BracketBot behavior JJJ mentions:
  1. Forcible-Prevention-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, refusing to let them save it in a broken state (they must fix it first)
  2. Loud-Warning-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, but permit the user to override and save anyways (in the broken state)... then nothing
  3. Silent-Warning-Filter. Same as #2. Additionally, RefBot allows the editor to opt-out of receiving RefBot filter-warnings.
  4. Loud-Fix-It-Later-Filter. RefBot should be implemented as an edit-filter, and immediately warn the user when they preview or save a busted ref, but permit the user to override and save anyways (in the broken state)... however, after their edit is saved, RefBot should rollback that one edit (not rollback the last N edits by the editor in question), and then RefBot should automagically post to the article-talkpage with a diff-link to the attempted-ref-edit that it just reverted
  5. Silent-Fix-It-Later-Filter. Same as #4. Additionally, RefBot allows the editor to opt-out of receiving RefBot filter-warnings.
  6. Loud-Warning-Bot. RefBot should be implemented as a bot, and eventually warn the editor on their talkpage, but should leave the article alone (no opt-out capability)
  7. Silent-Warning-Bot. Same as #6. Additionally, RefBot allows the editor to opt-out of receiving RefBot talkpage-messages.
  8. Loud-Fix-It-Later-Bot. RefBot should be implemented as a bot, and eventually warn the editor on their talkpage, plus RefBot should rollback that one edit (not rollback the last N edits by the editor in question), and then RefBot should automagically post to the article-talkpage with a diff-link to the attempted-ref-edit that it just reverted. Plus, ideally, RefBot's user-talkpage-message should have a one-click-to-put-my-broken-edit-back hyperlink, which also redirects the editor to the article (this prevents them from needing to manually visit the article, enter the edit-history, manually undo RefBot, and then go back to editing the article). Since the editor might not utilize the one-click-magic 'soon' by standards of how quickly the article in question is changing, prolly the one-click-magic should only work if the sub-section of the article in question has *not* been changed by any editors, since RefBot reverted this editor's work; otherwise, the one-click-magic might do more harm than good.
  9. Silent-Fix-It-Later-Bot. Same as #8. Additionally, RefBot allows the editor to opt-out of receiving RefBot talkpage-messages.
Obviously, there are additional variations that are possible, such as #7-less-the-article-talkpage-feature, or whatever. But I think these options cover the main *types* of behaviors that we might want. 74.192.84.101 (talk) 11:04, 4 November 2013 (UTC)

(Arbitrary break for ease of editing)

I support the idea of this bot existing with functionality similar to that of BracketBot. I have specific ideas for a different bot that would actually fix CS1 citation errors, but I will describe that functionality in a separate request.

As stated above by others, I do not think it would be productive to apply this new bot's activity to all of the subcategories of Category:Articles with incorrect citation syntax. That would generate a LOT of error messages on people's Talk pages, and some error messages are not even displayed on the article pages by default, so it will be hard for people to figure out where they made an error or if they have fixed it. I recommend starting with the following categories, each of which has been emptied through diligent work by wikignomes:

  • Pages with citations having wikilinks embedded in URL titles
  • Pages with citations using conflicting page specifications
  • Pages with citations using unnamed parameters
  • Pages with DOI errors‎
  • Pages with empty citations
  • Pages with OL errors
  • Pages with URL errors

Also as requested above, the bot should operate on articles in:

  • Pages with broken reference names
  • Pages with incorrect ref formatting
  • Pages with missing references list

I estimate that a total of 20 to 50 articles are added to all of the categories above (combined) each day; someone here might be able to scrub the logs and get a better count.

The bot should post a message similar to Bracketbot's message on the Talk page of the editor who makes the change. Since these categories are already empty, the situation described above in which a revert reintroduces an error should be a rare case.

Also, the bot should have a built-in waiting period (Bracketbot waits five minutes) to allow editors to fix errors themselves if they notice them. Please contact me if you need help writing the error notification text for each category. – Jonesey95 (talk) 16:23, 6 November 2013 (UTC)

@Jonesey95, Frze, and TheJJJunk: Started on the three categories and have come up with this so far.
"Category:Pages with broken reference names" --> "a [[:Category:Pages_with_broken_reference_names|broken reference name]] <small>([[Help:Cite_errors/Cite_error_references_no_text|help]])</small>"
"Category:Pages with incorrect ref formatting" --> "a [[:Category:Pages_with_incorrect_ref_formatting|cite error]] <small>([[Help:Cite errors|help]])</small>"
"Category:Pages with missing references list" --> "a [[:Category:Pages with missing references list|missing references list]] <small>([[Help:Cite_errors/Cite_error_refs_without_references|help]] {{!}} [[Help:Cite_errors/Cite_error_group_refs_without_references|help with group references]])</small>"
Examples of what the bot would generate from that

On User:Tesfazgi Teklezgi:
Hello, I'm ReferenceBot. I have automatically detected that some edits performed by you may have introduced errors in referencing. They are as follows:

Please check these pages and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)

On User:14.139.160.4:
Hello, I'm ReferenceBot. I have automatically detected that an edit performed by you may have introduced errors in referencing. It is as follows:

Please check this page and fix the errors highlighted. If you think this is a false positive, you can report it to my operator. Thanks, 930913(Congratulate) 16:26, 7 November 2013 (UTC)

The same message was posted for User:98.230.108.226, User:Soetermans, User:Chrisd915, User:71.173.129.226, and User:128.8.228.120.

These would be daily reports (BracketBot now uses a ten-minute delay, though the page hasn't been updated), more like DPL bot (signed with ReferenceBot, not my sig, and put on the talk page, not the userpage).
The current templates used are {{User:ReferenceBot/inform/top}}, {{User:ReferenceBot/inform/middle}} and {{User:ReferenceBot/inform/bottom}}. See also User:ReferenceBot.
I'll apply for approval soon. 930913(Congratulate) 16:26, 7 November 2013 (UTC)
Nice work. I clicked through the links to the help pages and improved some of the help text. We want to be sure that people are able to fix problems once they are alerted to them. – Jonesey95 (talk) 19:21, 7 November 2013 (UTC)
@A930913: Perhaps once/if the bot gets approved, it should have its own talk page instead of being redirected to your own? Just an idea. This could help isolate problems of this bot, knowing that BracketBot's talk also redirects to the same place. — TheJJJunk (say hello) 20:45, 7 November 2013 (UTC)
Good idea. I like Citation bot's Talk page; it gives an easy way to report problems or make suggestions.
A daily report sounds fine to me, if you can figure out how to identify which edit introduced the problem and whether the problem still exists. You'll still want some sort of delay to allow people to fix their own edits if they see them. (e.g. if the report runs daily at 23:59, you should ignore edits made between 23:49 and 23:59 today but include edits made from 23:49 to 23:59 on the previous day.) – Jonesey95 (talk) 22:51, 7 November 2013 (UTC)
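The waiting-period logic described above amounts to a sliding window; here is a minimal sketch, assuming a ten-minute grace period (borrowed from BracketBot's delay) and a daily run:

```python
# Sketch of the report window: a run at time `now` covers edits from
# (previous run - grace) up to (now - grace), so editors always get the
# grace period to notice and fix their own mistakes before being notified.
from datetime import datetime, timedelta

GRACE = timedelta(minutes=10)   # assumed self-fix window
PERIOD = timedelta(days=1)      # daily report

def report_window(now):
    """Return the (start, end) interval of edits this run should cover."""
    end = now - GRACE
    start = end - PERIOD
    return start, end

start, end = report_window(datetime(2013, 11, 8, 23, 59))
```

Consecutive runs tile the timeline with no gaps and no overlaps, which is the property Jonesey95's example is getting at.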

Move 30 Seconds to Mars links to Thirty Seconds to Mars

After a requested move and a move review, the page 30 Seconds to Mars was moved to Thirty Seconds to Mars, which is the official name of the band. After long discussions we reached a conclusion: all links to 30 Seconds to Mars should be replaced with Thirty Seconds to Mars. Please fix them if you can.--95.245.58.53 (talk) 21:16, 11 November 2013 (UTC)

I'd like to point out that at the requested move 30 Seconds to Mars was moved to Thirty Seconds to Mars and its move review was closed with an endorse. The last discussion is here, where it was definitely decided that the name Thirty Seconds to Mars is right.--95.245.58.53 (talk) 14:43, 12 November 2013 (UTC)
All links to 30 Seconds to Mars will redirect to Thirty Seconds to Mars. See WP:NOTBROKEN. — TheJJJunk (say hello) 15:22, 12 November 2013 (UTC)
That is not the point. Thirty Seconds to Mars is the official name (see discussions), that's why 30 Seconds to Mars should be replaced.--95.245.58.53 (talk) 20:42, 12 November 2013 (UTC)
 Done. Pages that link to "30 Seconds to Mars" → Pages that link to "Thirty Seconds to Mars". — TheJJJunk (say hello) 00:16, 13 November 2013 (UTC)

Thanks for your work. The same thing should be done for MTV Unplugged: 30 Seconds to Mars, Attack (30 Seconds to Mars song), Kings and Queens (30 Seconds to Mars song), Hurricane (30 Seconds to Mars song), Night of the Hunter (30 Seconds to Mars song), Search and Destroy (30 Seconds to Mars song), City of Angels (30 Seconds to Mars song), Do or Die (30 Seconds to Mars song).--Earthh (talk) 20:13, 13 November 2013 (UTC)

Half done. All of the links have been replaced, and I hit some of the major templates too. The only ones that should remain are the ones that are linked through the templates, and once a user makes an edit to the page, it will be removed from the list. If any are still linked, you'll need to check whether any templates still have the old links. — TheJJJunk (say hello) 23:12, 13 November 2013 (UTC)
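For illustration, the kind of link retargeting done in this request can be sketched as follows. This is a hypothetical helper, not the tool actually used; it keeps the display text of piped links so the visible prose is unchanged:

```python
# Sketch: retarget plain and piped wikilinks from an old title to a new one.
import re

def retarget_links(text, old, new):
    """Rewrite [[old]] and [[old|label]] links to point at `new`."""
    pattern = re.compile(r"\[\[" + re.escape(old) + r"(\|[^\]]*)?\]\]")

    def repl(m):
        label = m.group(1)
        if label:
            # piped link: keep the existing label
            return "[[" + new + label + "]]"
        # plain link: pipe it so the displayed text stays the same
        return "[[" + new + "|" + old + "]]"

    return pattern.sub(repl, text)
```

A real run would also need to handle case-insensitive first letters and links with section anchors, which this sketch ignores.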

Dead archiveurl detection, repair

The Wayback Machine respects robots.txt across time. If a website has a robots.txt that permits archiving at one point, an editor could archive that page; a subsequent change to robots.txt on that site could lead to an inaccessible archive. For example:

For example, South Park, Ukraine, and Chuck Norris each contain an archive.org link that has become inaccessible this way.

WebCite doesn't cause us problems in this way.

I believe a bot is required to repair these archive links. They can be detected by running a report against the database for all external links to archive.org and, for each link, checking that the link still works (will a HEAD command be sufficient?). Dead archiveurl links would need to be archived at WebCite, or if the original link is unavailable then that needs flagging with {{dead}}. Josh Parris 02:22, 16 November 2013 (UTC)
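A minimal sketch of the detection step proposed above, assuming a HEAD request really is sufficient. The `is_wayback_url` pre-filter is a hypothetical helper; the network call is kept inside a function so nothing is fetched at import time:

```python
# Sketch: flag archive.org links whose HEAD request fails or errors out.
import urllib.request

def is_wayback_url(url):
    """Cheap pre-filter for links worth checking (hypothetical heuristic)."""
    return "web.archive.org/web/" in url

def archive_link_is_dead(url, timeout=10):
    """Return True if a HEAD request to the archive URL fails.

    Assumption: robots.txt-blocked snapshots answer with a non-2xx status,
    so a HEAD request (no body transfer) is enough to detect them.
    """
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return not (200 <= resp.status < 300)
    except Exception:
        return True
```

Links flagged dead would then go through the repair path Josh describes (re-archive at WebCite, or tag with {{dead}}).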

Mass diacritics correction

I have a task for bots: the diacritics Ş ş Ţ ţ need to be replaced with Ș ș Ț ț in articles from categories about Moldova and Romania. In the Romanian language the second variant is correct, but Windows XP has a bug and these wrong diacritics appear in their place. I have corrected some of the articles about football, but there are more. A bot that can also rename pages is needed, because sometimes the wrong diacritics are in the title. Examples:

I repeat that Ş ş Ţ ţ do not exist in the Romanian language. Those are Turkish diacritics, so you can freely run the bot in the categories ″Moldova″ and ″Romania″ + all subcategories. Thanks. XXN (talk) 14:49, 17 November 2013 (UTC)

See Wikipedia:Bots/Requests for approval/VoxelBot 2. Also see Wikipedia:Bot requests/Archive 39#Cedilla to Comma below bot for articles under Romanian place names and people and Wikipedia:Bot requests/Archive 52#Romanian orthography for past discussion of this sort of thing. In short, someone doing this must be very careful that the words they are altering are in fact Romanian rather than Turkish, Kurdish, Azerbaijani, Crimean Tatar, Gagauz, Tatar, or Turkmen (based on the list at Cedilla#S). Anomie 17:22, 17 November 2013 (UTC)
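The character substitution itself is straightforward; the sketch below shows only the mapping, and deliberately says nothing about deciding whether a word is actually Romanian, which per the caveat above is the genuinely hard part:

```python
# Cedilla-to-comma mapping for Romanian text. Running this blindly over
# Turkish, Gagauz, etc. words would be wrong -- language detection must
# happen before this step.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # Ş (S-cedilla)  -> Ș (S-comma)
    "\u015f": "\u0219",  # ş (s-cedilla)  -> ș (s-comma)
    "\u0162": "\u021a",  # Ţ (T-cedilla)  -> Ț (T-comma)
    "\u0163": "\u021b",  # ţ (t-cedilla)  -> ț (t-comma)
})

def fix_romanian_diacritics(text):
    """Replace cedilla letters with Romanian comma-below letters."""
    return text.translate(CEDILLA_TO_COMMA)
```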

Looking for technical mentors and tasks for Google Code-in

Hi, I'm one of the Wikimedia org admins at mw:Google Code-in. We are looking for technical tasks that can be completed by students e.g. create/update a bot, improve its documentation... We also need mentors for these tasks. You can start simple with one mentor proposing one task, or you can use this program to organize a taskforce of mentors with the objective of getting dozens of small technical tasks completed. You can check the current Wikimedia tasks here. The program started on Monday, but there is still time to jump in. Give Google Code-in students a chance!--Qgil (talk) 16:11, 21 November 2013 (UTC)

bots

Hello, how do I go about getting a bot for my chatroom? — Preceding unsigned comment added by Hannsg8000 (talkcontribs) 19:06, 21 November 2013 (UTC)

Generally, we can only help with bots related to Wikipedia. However, if it is an IRC related chatroom, meta:wm-bot may be useful to you. --Mdann52talk to me! 13:29, 22 November 2013 (UTC)

Requesting script modification

Hi all, I was recently granted a trial with my bot (see Mdann52 bot BRFA). However, it turned out that the script I was trying to use (mw:Manual:Pywikibot/weblinkchecker.py) did not check links in between ref tags, so it was not as useful for the task as I first thought. As my Python skills are not very good at the minute, can someone rewrite the script (or produce a version of it) so that it only checks links in between ref tags (and possibly ignores any tagged with {{dead link}})? Thanks --Mdann52talk to me! 13:39, 22 November 2013 (UTC)
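A rough sketch of the filtering being requested, under the assumption that a regex over the wikitext is acceptable. Real ref parsing has more corner cases (self-closing named refs, nested templates) which this pattern simply skips:

```python
# Sketch: extract external links that sit inside <ref>...</ref> pairs,
# skipping refs already tagged with {{dead link}}.
import re

# Matches <ref> and <ref name="..."> but not self-closing <ref ... />.
REF_RE = re.compile(r"<ref[^>/]*>(.*?)</ref>", re.DOTALL | re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s\]<|}]+")

def links_in_refs(wikitext):
    """Return external links found inside ref tags, minus dead-tagged refs."""
    links = []
    for body in REF_RE.findall(wikitext):
        if "{{dead link" in body.lower().replace("_", " "):
            continue
        links.extend(URL_RE.findall(body))
    return links
```

The resulting list could then be fed to the existing link-checking machinery instead of every external link on the page.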

Updating a table with some stuff

There's currently a table of women physicists at User:Headbomb/sandbox2. If someone could code a bot to fetch the articles, and fill in the other columns, that would be nice and much appreciated.

  • DOB (Date of Birth): Use YYYY-MM-DD format, or YYYY-MM, or YYYY if information is incomplete; use "—" if information is missing.
  • DOD (Date of Death): Use YYYY-MM-DD format, or YYYY-MM, or YYYY if information is incomplete; use "—" if information is missing.
  • Tagged?: List projects that tag the article, alphabetized. That is, if the talk page is tagged by {{WikiProject Physics}} and {{WP Biography}}, list {{WikiProject Biography}}, then {{WikiProject Physics}}, with linebreaks between them.
  • Class: Max rating found in banners, i.e. if you find Start and Stub, list Start.
  • Link count: How many times the article and its redirects are linked to (mainspace count only, exclude redirects).

For clarity, I've filled in the first line of the table. The request is for a one-time run for now, but a weekly/monthly run could be set up at some point in the future when the table gets hosted at its permanent location. Feel free to do tests directly on my sandbox2. Headbomb {talk / contribs / physics / books} 18:01, 26 November 2013 (UTC)
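The "Class" column logic above (report the maximum rating found in the banners) can be sketched like this; the assessment ladder below is an assumption about which ratings may appear:

```python
# Sketch: pick the highest assessment class found among project banners.
# The ladder order is assumed, lowest to highest.
CLASS_ORDER = ["Stub", "Start", "C", "B", "GA", "A", "FL", "FA"]

def max_class(ratings):
    """Return the best rating present, or "" if none were found."""
    ranked = [r for r in ratings if r in CLASS_ORDER]
    if not ranked:
        return ""
    return max(ranked, key=CLASS_ORDER.index)
```

So a talk page whose banners carry Start and Stub reports Start, matching the example in the request.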

Project banners for WikiProject Women artists

The newly formed WikiProject Women artists could use a bot to add project banners to the talk pages of articles within certain categories. Gobōnobō + c 03:55, 28 November 2013 (UTC)

You might want to ask Anome or User:Magioladitis. They're about the only ones left with a bot that might be willing to do that. 108.45.104.69 (talk) 04:00, 28 November 2013 (UTC)
Great. Thank you 108. Gobōnobō + c
I can do the tagging only if I get specific instructions. -- Magioladitis (talk) 09:39, 28 November 2013 (UTC)

Ban violation bot

A bot that finds ban violations (e.g. editing someone's userpage when there is an interaction ban between the two editors, editing during a site ban, etc.) and reports and possibly reverts them. 2AwwsomeTell me where I screwed up.See where I screwed up. 20:02, 26 November 2013 (UTC)

I've honestly been brainstorming a bot like this for months now. But there are a lot of potential issues that would need resolving.—cyberpower ChatOnline 13:03, 27 November 2013 (UTC)
Topic bans might run off the category system, but then there's the "broadly construed" aspect of many bans. Interaction bans aren't normally just userpages are they? Josh Parris 10:49, 28 November 2013 (UTC)
No, but some would be difficult. And the userpage one was just an example. WikiProjects might be better than categories, unless the topic doesn't have a WikiProject. And there might be something to check whether edits were probably reverts of vandalism. 2AwwsomeTell me where I screwed up.See where I screwed up. 16:40, 28 November 2013 (UTC)
I've been experimenting with a few setups of potentially running a bot like this with encouraging results. I might just take this task up.—cyberpower OnlineHappy Thanksgiving 16:44, 28 November 2013 (UTC)

Bot for repetitive tasks

I want a bot, automated or semi-automated, for making repetitive edits that would be extremely tedious to do manually; for example, adding the same category or template to 1,000 articles. --DIYAR DESIGN (talk) 18:08, 27 November 2013 (UTC)

In general that's what bots are supposed to do. Any chance for more details? Hasteur (talk) 19:36, 27 November 2013 (UTC)
I also want a bot, to do that kind of stuff for me. I want to be able to tell it what to do, and it does it, no questions asked. That's why I chose to become a programmer and a bot-op. Now I have 2 active bots that I can boss around. :p—cyberpower ChatOnline 20:58, 27 November 2013 (UTC)

So you want a bot. What sort of bot? What do you want it to do? Idea is not well explained 2AwwsomeTell me where I screwed up.See where I screwed up. 17:06, 28 November 2013 (UTC)

Auto click/add to cart bot.

My mistake and apologies. — Preceding unsigned comment added by 71.222.78.246 (talk) 22:38, 1 December 2013 (UTC)

This page is for requesting bots to perform edits on the English Wikipedia. It is not the place to ask people to write bots for other sites such as Twitter. Anomie 02:13, 2 December 2013 (UTC)

URL updates for The Canadian Encyclopedia

A couple days ago, The Canadian Encyclopedia completely overhauled its website and, unfortunately, completely changed its URL format. This has broken over 5,500 links, but I think many of them could be repaired by a bot. The old url format was like http://www.thecanadianencyclopedia.com/index.cfm?PgNm=TCE&Params=A1ARTA0005015, while the new uses the article's title: http://www.thecanadianencyclopedia.ca/en/article/sir-andrew-macphail/. Would it be possible to have a bot check the title associated with a citation or external links entry and update with the proper URL where it can? I would imagine there would still be plenty of bare references and the like that we would still have to manually fix, but if a bot can take care of most of these, it would make the job manageable. Thanks! Resolute 22:05, 29 November 2013 (UTC)

It seems like this is a good opportunity to upgrade these references to use {{Cite encyclopedia}}. Josh Parris 08:40, 4 December 2013 (UTC)
St. Michael's College, Toronto is a perfect example. Josh Parris 22:55, 4 December 2013 (UTC)
I've found quite a few instances of just the encyclopaedia being referenced. I'm of the opinion these cases ought to be fixed by a human. For examples see: Bonavista Bay, B. C. Binning, Bishop's University, Mount Allison University.
I've been guessing as to what the new url is, but it appears the easiest way to fix these urls is via the Wayback Machine; in 2012 these urls were redirecting. Having found it on the Wayback Machine one can then infer the new url. Josh Parris 22:55, 4 December 2013 (UTC)
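Josh's title-guessing approach can be sketched as a simple slug heuristic (a sketch only; `slugify` and `guess_new_url` are hypothetical helpers, and every guessed URL would still need to be verified against the live site or the Wayback Machine before being written into an article):

```python
import re

def slugify(title):
    """Turn a cited article title into the new-style URL slug,
    e.g. 'Sir Andrew Macphail' -> 'sir-andrew-macphail'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

def guess_new_url(title):
    """Guess the new-format URL from the article title found in the
    citation next to the old link.  The guess must be checked
    (HTTP 200) before it replaces anything."""
    return "http://www.thecanadianencyclopedia.ca/en/article/%s/" % slugify(title)
```

Titles with punctuation (St. Michael's College, Toronto, say) won't always map cleanly, which is where the Wayback Machine fallback mentioned above comes in.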
BRFA filed: Wikipedia:Bots/Requests for approval/WildBot 8. Josh Parris 23:48, 6 December 2013 (UTC)

Bot to upload main page images

Some of you will know that main page images that are hosted at Commons are - or should be - protected at Commons by an adminbot there by adding them to a cascade-protected page at Commons. This prevents alterations or fresh versions being uploaded there, while our local cascading protection of files in today's and tomorrow's main pages prevents local upload of images. However, there was this thread at Talk:Main Page recently:

user:KrinkleBot hasn't edited since 9 November, meaning there is no autocascade protection on Commons. Promoting admins, please do check image protection status and upload a local protected copy if you can't protect on Commons - recent TFA and TFP images have not been protected. Materialscientist (talk) 02:59, 14 November 2013 (UTC)
This is exactly why I've argued against relying upon KrinkleBot as a first-line file protection measure. It's a useful fallback (its intended purpose), but this isn't the first outage that's occurred (and it probably won't be the last). —David Levy 04:00, 14 November 2013 (UTC)

So it occurs to me that a useful adminbot task would be to check WP:Main Page/Tomorrow and Wikipedia:Main Page queue (perhaps even Template:Did you know/Queue) and usefully upload local copies of images found there (including the source information and licence tag), adding {{Uploaded from Commons}}. Adminbot powers would be useful but not essential (a non-adminbot wouldn't be able to upload local copies of tomorrow's images since cascading protection would have kicked in, but it would catch TFL/TFA/OTD images scheduled more than a day in advance). Thoughts / volunteers? BencherliteTalk 23:55, 18 November 2013 (UTC)
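The non-admin parts of the proposed task are straightforward; here is a minimal sketch of extracting image names from the watched pages and composing the local description text. The page list and the exact description format are assumptions, and the actual upload and protection steps would need the API and admin rights:

```python
import re

# Pages the bot would watch, per the proposal above.
WATCHED = [
    "Wikipedia:Main Page/Tomorrow",
    "Wikipedia:Main Page queue",
    "Template:Did you know/Queue",
]

FILE_RE = re.compile(r"\[\[(?:File|Image):([^|\]]+)", re.IGNORECASE)

def files_on_page(wikitext):
    """Return the image names embedded in a page's wikitext."""
    return [name.strip() for name in FILE_RE.findall(wikitext)]

def local_description(commons_text):
    """Wrap the Commons description for the local protected copy,
    tagging it {{Uploaded from Commons}} as proposed above."""
    return "{{Uploaded from Commons}}\n" + commons_text
```

Gallery syntax and images supplied through template parameters would need extra handling; this only catches plain [[File:...]] links.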

I support the idea. This would essentially duplicate the functionality of KrinkleBot at the English Wikipedia, which could only be helpful. (I doubt that a second safety net would generate any over-reliance beyond that stemming from the first safety net's existence.)
Note that adminbot powers would be essential; non-admin accounts can't be used to upload new local files with names already in use at Commons.
In addition to the pages mentioned, the bot should monitor Wikipedia:Picture of the day/Tomorrow, thereby protecting tomorrow's featured picture before the corresponding main page template (transcluded at Wikipedia:Main Page/Tomorrow) is generated. (Per this discussion, I created that page for KrinkleBot to monitor, but Krinkle dislikes the related infrastructure and apparently decided not to assist in its maintenance.) —David Levy 07:57, 22 November 2013 (UTC)
That wouldn't really work for Featured pictures, as they have local description pages. They would need a completely different approach than other files. Armbrust The Homunculus 11:45, 27 November 2013 (UTC)
Why would a completely different approach be needed? Obviously, it would be appropriate for the bot to combine the local templates and Commons description (and revert to the former afterward). Otherwise, what deviations would the task require? —David Levy 11:11, 30 November 2013 (UTC)
Sorry for the late answer. If a local page exists, then uploading the image will not change the description of the file, because the software treats it as if a new version of the file were being uploaded. Therefore adding the description needs to be done in a separate step. Armbrust The Homunculus 17:32, 6 December 2013 (UTC)
Yes, that's correct. I don't regard this as "a completely different approach", but no matter. —David Levy 00:45, 9 December 2013 (UTC)
BRFA filed: Wikipedia:Bots/Requests for approval/TFA Protector Bot 2. Legoktm (talk) 19:17, 4 December 2013 (UTC)

Images lacking a US status indication

Tag all entries in http://tools.wmflabs.org/betacommand-dev/reports/Media_lacking_US_status_indication.txt

for inclusion in Category:All media files lacking a US status indication, which is yet to be created.

This can either be done by a bot, or by tweaking templates. I prefer a mass tag run by a bot. Sfan00 IMG (talk) 15:41, 7 December 2013 (UTC)

Why wouldn't this category be populated by templates? What tag are you wanting? What makes these 6330 pages candidates for this category? Josh Parris 23:53, 7 December 2013 (UTC)
The reason it's not populated by templates yet is because those templates need either to be tweaked accordingly or a new template created. A file should be in the new category if it's not in
I've also been advised it would be better to do this tracking with a bot, because of the overlaps created by various license templates. Sfan00 IMG (talk) 09:42, 9 December 2013 (UTC)
Those categories are all directly below Category:Wikipedia copyright.
Now that templates are scriptable, it might be doable just with template code - the upside of which is that there's no bot to fail or debug and it's constantly up-to-date. I suspect that there would be a timing/ordering problem. Can a Lua expert chime in here please? Josh Parris 05:47, 12 December 2013 (UTC)

Small scale domain updates

Copying from myself at Wikipedia:VPT#Bulk_change_of_domain_in_external_URLs:

I probably have added 50-100+ external URLs to policy and discussion spaces that link back to a personal domain, where I host my academic writings and datasets relevant to Wikipedia and collaborative security. This domain has now changed, and while there is an HTTP redirect in place, administrative policies dictate that it will not survive forever. The file paths are constant. This is a touch painstaking manually. Is there a way to automate this? If so, is that solution limited to en.wp or is this something that can be done for all WMF properties (I know I have links on Wikimania wikis and Metawiki, at minimum)?

I am looking to change everything of the form http://www.cis.upenn.edu/~westand to http://www.andrew-g-west.com. Based on the request history here, it seems like some functionality is in place to take care of this? Is it worth your troubles? Thanks, West.andrew.g (talk) 21:39, 11 December 2013 (UTC)
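Since the file paths are constant, the core operation is a plain prefix swap that an AWB or pywikibot run could apply across each wiki; a minimal sketch:

```python
OLD = "http://www.cis.upenn.edu/~westand"
NEW = "http://www.andrew-g-west.com"

def update_links(wikitext):
    """Swap the old domain prefix for the new one, leaving the
    (unchanged) file paths intact."""
    return wikitext.replace(OLD, NEW)
```

The page list to feed it would come from an external-links search (Special:LinkSearch) for the old domain on each WMF wiki.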

I'd suggest Wikipedia:AutoWikiBrowser/Tasks as somewhere where you'll find willing helpers. Josh Parris 04:40, 12 December 2013 (UTC)

Prod tagging compilation for statistical analysis

There is a sentiment among some users that PROD (not sticky prod) is useless because anyone, including the creator, can simply remove the PROD tag. I've seen this expressed a few times recently in various fora. It can be pointed out that every week we successfully delete a few hundred pages through prod, so we know it works and they're not always removed, but it would be nice to see what the real statistics are – what percentage of taggings are successful, and other data about the process. To this end, I thought it would be a simple task to have a bot compile a list of prod taggings over some length of time, say one month. No human being could do this because they would miss all or many of the prod taggings that were placed and then removed within a short time, whereas a bot can simply, inhumanly, keep refreshing today's prod category, compare against a list it's been compiling, and add any new entry. That's the germ of the idea. A human at the end of the data-gathering period can easily calculate a gross percentage of success from the number of red-linked and blue-linked entries, and delve further to make sure deletions weren't by other methods but actually a result of the prodding (if the bot couldn't do this as well). And there's lots of other data that could be gathered, which could be done through the bot if someone would be willing to set it up, or by a human willing to spend the time: how long after creation the prodding occurred, who removed the tag, whether they were the creator, how long between tagging and removal, whether the creator was warned or not, and I'm sure there are other interesting areas of inquiry I haven't thought of. Is this feasible? Feasible and easy? Feasible but too difficult to bother with? Anyone willing?--Fuhghettaboutit (talk) 00:21, 12 December 2013 (UTC)

On a technical level, I see three approaches as being viable.
  1. Keep querying the system for current population of the category, as you've suggested. Note additions, and removals. Do an analysis of the revision history of noted pages to determine what happened. It'd be helpful if this was done by an adminbot, to look at the text of deleted pages.
  2. Pick a period of time, and pull up all the revisions, searching for the addition or removal of the category. That could be slow. On the other hand, it doesn't have to be done against live data, so it doesn't have to process data in realtime. Again, for deleted revisions you'd need an adminbot.
  3. Camp out on recent changes for a while and drink from that firehose. You could just watch for the addition of the category, watchlist pages that have the category added and watch for subsequent removal. If you looked for deletions and removals on recentchanges, you wouldn't need an adminbot.
Each of these has its upsides and downsides. Josh Parris 05:03, 12 December 2013 (UTC)
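The bookkeeping in approach 1 reduces to diffing successive snapshots of the category's membership; a sketch of that core step (the snapshots themselves would come from repeated `list=categorymembers` API queries):

```python
def diff_snapshots(old, new):
    """Compare two snapshots of Category:Proposed deletion membership.
    Returns (added, removed): pages newly tagged since the last poll,
    and pages whose tag disappeared (tag removed, or page deleted)."""
    old, new = set(old), set(new)
    return sorted(new - old), sorted(old - new)
```

Each page landing in "removed" would then get its revision history (or deletion log) examined to decide which outcome occurred, which is where the adminbot rights come in.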

en.wikipedia.org

This is actually two requests. In hundreds, perhaps thousands, of articles, en.wikipedia.org is used instead of Wikilinking. In others it is used as a reference.

Could a bot be programmed to replace en.wikipedia.org in the text body with the link that was intended, per WP:WIKILINK?

Separately, could a bot be programmed so that when there are any en.wikipedia.org links within <ref></ref>, the whole lot is replaced with {{cn}}, per WP:CIRCULAR? Simply south...... eating lexicological sandwiches for just 7 years 19:45, 15 December 2013 (UTC)
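For the first request, the mechanical part of the conversion might look like the sketch below; deciding whether a given external link really should become a wikilink (rather than remain, say, a legitimate reference in an article about Wikipedia) would still need human review:

```python
import re
from urllib.parse import unquote

# Bracketed external links to en.wikipedia.org, with optional label.
LINK_RE = re.compile(
    r"\[https?://en\.wikipedia\.org/wiki/([^\s\]]+)(?:\s+([^\]]+))?\]"
)

def to_wikilink(match):
    # Recover the page title: URL-decode and turn underscores into spaces.
    title = unquote(match.group(1)).replace("_", " ")
    label = match.group(2)
    if label and label != title:
        return "[[%s|%s]]" % (title, label)
    return "[[%s]]" % title

def fix_external_self_links(wikitext):
    """Rewrite bracketed external links to en.wikipedia.org as
    wikilinks, per WP:WIKILINK."""
    return LINK_RE.sub(to_wikilink, wikitext)
```

Bare (unbracketed) URLs and links with #section fragments would need additional patterns.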

Your second request is a bad idea as a general rule. It is certainly possible that articles that mention Wikipedia would have citations to en.wikipedia.org. And in those cases, the link should be an external link so that mirrors will be linking to the right place. Anomie 20:09, 15 December 2013 (UTC)

Disease box update bot

Hi All, I am Dr. Noa Rappaport, scientific leader of the MalaCards database of human diseases. Following a suggestion by Andrew Su (https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Molecular_and_Cellular_Biology/Proposals#MalaCards_-_www.malacards.org) we were asked to write a bot that updates the disease box external references within disease entries in Wikipedia: https://en.wikipedia.org/wiki/User:ProteinBoxBot/Phase_3#Disease. We found it to be a non-trivial task. Does anyone know of any such bot that exists, or can anyone help us write it? Thanks. — Preceding unsigned comment added by Noa.rappaport (talkcontribs) 10:22, 28 November 2013 (UTC)

How does the MalaCards disease name differ from the {{Infobox_disease/sandbox2}} disease name? Josh Parris 10:45, 28 November 2013 (UTC)
Hi Josh, we can supply a mapping given a list of diseases, we also have cross references to most of the existing lists. Noa.rappaport (talk) 08:44, 1 December 2013 (UTC)
Noa.rappaport, given this is a pretty major upgrade of the infobox, I think it would be a good idea to fill in the other fields at the same time. Are there lists of mappings available for the other fields? Josh Parris 08:45, 4 December 2013 (UTC)
A brief investigation suggests this may be a ripe task for WP:wikidata Josh Parris 23:42, 4 December 2013 (UTC)
In fact, the phase 2 data for Cystic fibrosis is at https://www.wikidata.org/wiki/Q178194 - adding a record type for MalaCards and importing the appropriate data shouldn't be a problem. Where is your mapping data? For purposes of populating Wikidata, the en.wikipedia name and the MalaCards number should be sufficient. Josh Parris 10:15, 5 December 2013 (UTC)
Josh Parris - Great! Do you have an email address or ftp site I can send it to?
Is there a reason not to post it to User:Noa.rappaport/Malacard mappings? That way any interested party can do the translation. Josh Parris 21:12, 9 December 2013 (UTC)
Josh Parris, I sent you the file but I can't seem to upload it because my account is not approved yet. Will appreciate your help with that. Noa.rappaport (talk) 15:19, 15 December 2013 (UTC)
File is in place at User:Noa.rappaport/Malacard mappings. Noa.rappaport (talk) 08:11, 17 December 2013 (UTC)

Bad links on wiki due to SEFAdvance converting our site URLs from underscores to dashes

I am the webmaster for SeacoastNH.com

The site is built in Joomla and we use the extension SEFAdvance, which used to use underscores (_) in links, but the new version doesn't allow underscores and instead uses dashes (-) in links.

Consequently many of the links and references on wiki that use the old underscored links now present 404 errors.

Could the bot go and find all links on wiki for seacoastnh.com that use underscores and convert them to dashes? — Preceding unsigned comment added by Adcetera692 (talkcontribs) 15:29, 12 December 2013 (UTC)

Someone could do this manually easily enough, there are only 29 pages in article space that link to that site and include underscores in the URL. Anomie 16:58, 12 December 2013 (UTC)
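For the record, a scripted fix would only have needed a targeted substitution that replaces underscores inside seacoastnh.com URLs and touches nothing else; a sketch:

```python
import re

# Match a seacoastnh.com URL up to the next whitespace.
URL_RE = re.compile(r"https?://(?:www\.)?seacoastnh\.com/\S+")

def fix_url(match):
    # Replace underscores with dashes, but only inside this URL.
    return match.group(0).replace("_", "-")

def fix_links(wikitext):
    return URL_RE.sub(fix_url, wikitext)
```

The `\S+` is deliberately crude and would swallow trailing punctuation such as a closing bracket; a production run would want a stricter URL pattern.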
 Done by the requester on December 12. GoingBatty (talk) 03:55, 19 December 2013 (UTC)

Age of Wushu

Hello Wiki, I'd like to request for a tutorial guide on "creating and configure a bot" for Age of Wushu. Eg; Harvest, mining or Kidnapping bot, and etc. — Preceding unsigned comment added by 175.139.223.168 (talk) 05:57, 20 December 2013 (UTC)

This page is mainly seen by people who operate bots at Wikipedia itself. For advice on writing a bot for a different site, you could try posting at the Computing reference desk. -- John of Reading (talk) 07:58, 20 December 2013 (UTC)

Bot to move orphan tags to the talk namespace

Following this RfC, orphan tags should be in the talk namespace now. Where in the talk namespace wasn't addressed, but I believe that placing them below all the existing templates, but before the first section, should be OK. I rewrote the documentation that way. A bot should do the articles currently tagged, and possibly articles tagged in the future by editors unaware of the change. Ramaksoud2000 (Talk to me) 02:08, 21 December 2013 (UTC)

Could the bot also simply remove the template if the article has incoming links? Thanks! GoingBatty (talk) 06:11, 21 December 2013 (UTC)
Yobot already does the latter. I'll file a request. -- Magioladitis (talk) 07:33, 21 December 2013 (UTC)
Wikipedia:Bots/Requests for approval/Yobot 23. -- Magioladitis (talk) 07:46, 21 December 2013 (UTC)
Thanks! Ramaksoud2000 (Talk to me) 09:54, 21 December 2013 (UTC)

Redirect a section of one article to another article

I've noticed that moving a section from one article to another can break all incoming links to that section. So far, I haven't found any way to automatically redirect a section of one article to a section of another article. (A comprehensive list of all broken section links can be found here - they are quite numerous, and there is not yet any automated solution for fixing them, as far as I know.)

@GoingBatty: For example, a template {{anchor|Code readability|redirect=Computer programming#Code readability}} could be used to specify a section of an article that a section anchor would redirect to, and all incoming links to that anchor would be re-targeted by a bot. If this feature were implemented, it would make it much easier to re-target sections from one article to another. Jarble (talk) 17:07, 21 December 2013 (UTC)

@Jarble: Fixing the links would be easy - finding the links to Readability#Computer programming is the harder part. Suggestions anyone? GoingBatty (talk) 17:50, 21 December 2013 (UTC)
The search here shows that fewer than 100 other articles link to the page Readability. Some of these search hits are looking for the 'Computer programming' section of the Readability page. Possibly 20-30, judging from the topics. The list could be gone through manually. 20-30 is a small enough number that you might not require a bot. EdJohnston (talk) 20:12, 21 December 2013 (UTC)
@Jarble: - I used AWB to look through all of the articles that link to Readability, and didn't find any that link to Readability#Computer programming. What am I missing? (In the future, requests with a small number of articles would probably be better at WP:AWB/Tasks.) Thanks! GoingBatty (talk) 21:47, 21 December 2013 (UTC)
@GoingBatty: - A section-redirector type of bot would be useful for all broken section links across every part of Wikipedia: I'm sure that there are dozens (if not hundreds) of other broken section links besides Readability#Computer programming. We need some way to automatically fix all broken section links of this kind, since it's very tedious to fix them manually. Jarble (talk) 23:34, 21 December 2013 (UTC)
@Jarble: - The first link on Wikipedia:Database reports/Broken section anchors is Satisfaction_(Residents_cover)#The_Residents_version, which redirects to (I Can't Get No) Satisfaction#The Residents and there's no section called "The Residents" on (I Can't Get No) Satisfaction. To resolve this issue, I changed the link on Template:The Residents from Satisfaction (Residents cover) to (I Can't Get No) Satisfaction#Other covers. If there are any other links to Satisfaction (Residents cover), they could be changed as well, and then Satisfaction (Residents cover) could be sent to RfD.
Maybe I chose a bad example, but a great philosopher once told me that "you can't always get what you want". (Oh wait, wrong song.) Could you please help me understand the logic you think would work for such an automated solution? Thanks! GoingBatty (talk) 23:53, 21 December 2013 (UTC)
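One possible shape for the detection logic being asked about: a #fragment link into a page is valid only if it matches a heading or an {{anchor}} name on that page, so checking reduces to extracting both and comparing. A sketch (it ignores anchors produced by other templates and raw HTML ids):

```python
import re

HEADING_RE = re.compile(r"^=+\s*(.*?)\s*=+\s*$", re.MULTILINE)
ANCHOR_RE = re.compile(r"\{\{\s*[Aa]nchor\s*\|([^}]+)\}\}")

def anchors_in(wikitext):
    """Collect the section anchors a page actually offers:
    its headings plus any {{anchor|...}} names."""
    names = set(m.strip() for m in HEADING_RE.findall(wikitext))
    for params in ANCHOR_RE.findall(wikitext):
        names.update(p.strip() for p in params.split("|"))
    return names

def is_broken(fragment, target_wikitext):
    """True if a #fragment link into the target page matches nothing."""
    return fragment.replace("_", " ") not in anchors_in(target_wikitext)
```

Repairing a broken link is then the easy half; deciding which surviving section (if any) is the right new target, as in the Residents example above, still needs a human.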

Tagging pages for a new Taskforce

Following this WP:VPT talk, and tipped by Anomie (I guess Anomie picks up here).
The new taskforce is in WP:MEDICINE: Society_and_medicine. From my talkpage [13]:

If I could magically use bots, I'd use a bot to tag every article with the taskforce:

That should net the majority of the articles we wish to catch. --LT910001 (talk) 15:38, 12 December 2013 (UTC)

The bot edit, I suggest:

{{WikiProject Medicine|...|society=yes|society-imp=<TBD>}}
or {{WPMED|...|society=yes|society-imp=<TBD>}}
with <TBD> = to be decided: "???" or "mid", ask taskforce members.

Please do not contact me in this, I am just a middle man for the taskforce @LT910001, Bluerasberry, and Jinkinson:. User:DePiep 14:18, 13 December 2013 (UTC)

Requestors (LT910001, Bluerasberry, Jinkinson) So I understand the request,
If a page that is tagged by WikiProject Medicine and
Is tagged with WP BIO or WP Companies
Alter the WikiProject Medicine tagging so that it has the society parameter
Add the society importance parameter and set it to '???' to indicate that the society taskforce needs to make a determination about the importance
If this is the case, I think I can have something written up fairly quickly, to do this. Hasteur (talk) 14:57, 13 December 2013 (UTC)
Thanks and thanks so much to DePiep for navigating the hallowed recesses of Wiki to direct me here! I'll respond to any queries. For the first run, please set the importance to ???, as we may wish to triage the articles once they are added to the taskforce, and ??? importance will provide a way of doing this. Thanks! --LT910001 (talk) 15:15, 13 December 2013 (UTC)
LT910001 That's a good point. If the page is already tagged for the society task force, the bot will move on to another page. I think that the initial tagging in to society should always be ??? so that the task force members make the determination. How often are you thinking that the bot should run? I could see a case of a once a month run (to pick up any old ones) and probably drop a notice on the WP Medicine talk page to announce that a new round of tagging has been completed and that the society task force should go through evaluating. This sound good? Hasteur (talk) 15:22, 13 December 2013 (UTC)
That would be wonderful, and additionally would reduce the burden for article assessors. --LT910001 (talk) 15:36, 13 December 2013 (UTC)
I just happened to come by, was nearby anyway. The once/month thing is a useful idea. Now could it be that the humans have thrown a page X out of the taskforce for good reason, and then the bot comes by and re-adds page X with ???? (yes, four) Can you throw a page out by setting |society=no and the bot understands? -DePiep (talk) 17:11, 13 December 2013 (UTC)
That's an excellent point. That would probably be handled in the early portion of the check (if the page qualifies) If the society parameter is already set, don't muck with it. This is why I like to hash out the task and get 90% of the use cases handled before the BRFA gets filed. Hasteur (talk) 17:28, 13 December 2013 (UTC)
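The banner edit outlined above, including the "don't muck with it" guard, might look like this sketch. It operates on the talk-page wikitext only (finding candidate pages is a separate step), and the regex is deliberately naive about templates nested inside the banner:

```python
import re

BANNER_RE = re.compile(r"\{\{\s*(?:WikiProject Medicine|WPMED)[^}]*\}\}")

def tag_society(talk_wikitext):
    """Add |society=yes|society-imp=??? to the WPMED banner, unless a
    society parameter is already present (set by hand, perhaps to
    society=no), in which case the banner is left untouched."""
    def add_params(match):
        banner = match.group(0)
        if "society" in banner:
            return banner  # already handled by a human; don't muck with it
        return banner[:-2] + "|society=yes|society-imp=???}}"
    return BANNER_RE.sub(add_params, talk_wikitext)
```

A real run would use a proper wikitext parser (mwparserfromhell, say) rather than a regex, since WPMED banners often carry nested templates in their parameters.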

There seems to be some interest from the other task forces in this bot, however I feel that it may be better to first get a functioning bot, and then add additional usage cases for the additional 10+ task-forces after it is functioning. If at a later date this could be expanded to multiple taskforces it would be extremely valuable for WPMED and I am sure many users would be very grateful. If I may add two additional cases, to a total of four:--LT910001 (talk) 01:54, 14 December 2013 (UTC)

Question: is it possible to tag articles that have certain categories? I worry that categories' cascading structure may make this difficult to implement --LT910001 (talk) 01:54, 14 December 2013 (UTC)

LT, this is a different approach. The first question could be handled this way: the bot looks at all pages that have {{WPMED}} transcluded (the list Special:WhatLinksHere/Template:WikiProject Medicine). And so forth.
What you ask now is that the bot goes to all the pages in a category (and its subcategories). That is different for the bot (maybe not to you), so I guess the bot operator would like clarity.
As for the subcategories: if you ask it explicitly (in your second question here then), a bot might well be able to drill deeper. That way Category:Medicine_articles_by_quality "+ recursion" will list the pages you might expect. It may be called "depth" or "recursion/recursive". So this is a phrasing of the question to check.
What the bot actually can & will do, is up to the operator, not me. -DePiep (talk) 20:52, 15 December 2013 (UTC)
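The "depth/recursion" idea can be sketched over an in-memory category tree; in practice the membership lists would come from `list=categorymembers` API queries, and the visited-set guards against the cycles that category graphs can contain:

```python
def pages_under(category, subcats, pages, seen=None):
    """Collect pages in a category and, recursively, in its
    subcategories.  `subcats` and `pages` map a category name to its
    subcategory list and page list respectively."""
    if seen is None:
        seen = set()
    if category in seen:  # category graphs can contain cycles
        return set()
    seen.add(category)
    found = set(pages.get(category, []))
    for sub in subcats.get(category, []):
        found |= pages_under(sub, subcats, pages, seen)
    return found
```

As DePiep says, whether the bot recurses, and how deep, is a choice for the operator; unbounded recursion in a broad tree can sweep in far more pages than a task force expects.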

Hi Hasteur, how is the bot coding going? I understand in many countries the festive season has arrived, so I'll be happy to wait if you're busy; however, this bot would be very useful, so I'm enthusiastic about seeing it actuated. --LT910001 (talk) 03:39, 22 December 2013 (UTC)

I'll start working on it soon. I need to clean up my last BRFA project (one that's been sitting since October Wikipedia:Bots/Requests for approval/HasteurBot 6). In a holding pattern on the BasketballBallStats Bot (pending ESPN granting access to their delicious API). So this task is next. Hasteur (talk) 00:31, 23 December 2013 (UTC)
Brilliant! --LT910001 (talk) 01:03, 23 December 2013 (UTC)

Usage cases

Tagged under the 'society and medicine' task force:

  • any article simultaneously under WP:BIO and WP:MED
  • any article simultaneously under WP:COMPANIES and WP:MED
  • any article simultaneously under WP:Organizations and WP:MED
  • any article with the word 'charity' in the title. (unlike WP:BIO, articles about charities are not reliably tagged with other WPs)

Ampelography Ampelographers

These entries need to be linked to en.wikipedia.org/wiki/Ampelography in an automated way.
Xb2u7Zjzc32 (talk) 04:18, 23 December 2013 (UTC)

@Xb2u7Zjzc32: There aren't a lot of pages that are missing the link - have you tried the Find link tool: http://edwardbetts.com/find_link/ GoingBatty (talk) 05:15, 23 December 2013 (UTC)

Help with reviewing 19000 blocks

We may need some sort of bot or script at WT:OP#Proposal_to_unblock_indeffed_IPs_en_masse. In particular we would like to know which IPs are rangeblocked or globally blocked. Your input would be appreciated. Thanks. -- zzuuzz (talk) 10:57, 23 December 2013 (UTC)
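The rangeblock check is easy to script with the standard `ipaddress` module once the list of active rangeblocks has been fetched (e.g. via the API's `list=blocks`); a sketch of the membership test:

```python
import ipaddress

def covering_ranges(ip, rangeblocks):
    """Return the blocked CIDR ranges that cover a given IP address.
    `rangeblocks` is a list of CIDR strings such as '192.0.2.0/24'."""
    addr = ipaddress.ip_address(ip)
    return [r for r in rangeblocks
            if addr in ipaddress.ip_network(r)]
```

Running the 19,000 indeffed IPs through this against the local and global block lists would split them into "already covered by a rangeblock" and "individually blocked only".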

Global search and replace to fix links to archived discussions.

Here's a bot that would be super-useful:

Search for links like the one at https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Inline_Templates#Created. Replace broken links to since-archived discussions with links to the archived discussion: replace

https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Inline_Templates#Fact_template_discusison_needs_comments with

https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Inline_Templates/Archive_2#Fact_template_discusison_needs_comments !

Presumably, we'll need pilot runs, big runs, and ongoing maintenance runs. Anyone up for it?--Elvey (talk) 01:19, 12 December 2013 (UTC)

ClueBot already does this. Legoktm (talk) 01:34, 12 December 2013 (UTC)
Legoktm (talk · contribs), what are you talking about, exactly? ClueBot is inactive (on wikibreak). And no bot has fixed the link I cited above. Can I get ClueBot NG to fix such links? How? --Elvey (talk) 03:54, 24 December 2013 (UTC)
ClueBot only fixes the links when it archives pages. See the source (ctrl+f "Fixing links"). Even then, its implementation could be improved. Σσς(Sigma) 04:27, 24 December 2013 (UTC)
Aha. Thanks. Misza13 (talk · contribs)'s MiszaBot II is responsible for the broken link and archive I cited above. I guess that's a reason to switch from MiszaBot II archiving to ClueBot III archiving?--Elvey (talk) 07:50, 24 December 2013 (UTC)