Wikipedia:Bots/Requests for approval/KSFT bot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.
Operator: KSFT
Time filed: 19:22, Thursday, June 23, 2016 (UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python
Source code available: https://github.com/KSFTmh/football-bot
Function overview: This bot should add football player articles to team categories.
Links to relevant discussions (where appropriate): WP:BOTREQ#Missing category identification
Edit period(s): One-time run
Estimated number of pages affected: I plan to try to run it on as many football player biography articles as I can. Based on Category:Association football players by nationality, which I could use to find all of them, there appear to be on the order of tens of thousands.
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): No
Function details: This bot should go through articles about football players, check the infobox parameters "clubs1" through "clubs40" and "currentclub", then add the article to categories of the format "Category:[team] players" for each team the person has played on. This is the first time I've written a Wikipedia bot, so I expect that it will take a while for me to get it working correctly.
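As a rough illustration of the step described in the function details, the following Python sketch pulls club names out of infobox wikitext and builds the corresponding "Category:[team] players" titles. It is not taken from the linked repository. The function names `extract_clubs` and `target_categories`, the regular expressions, and the assumption that each infobox parameter sits on its own line are all illustrative.

```python
import re

# Matches "| clubsN = ..." (one or two digits) and "| currentclub = ...".
# Assumes one parameter per line, which is common but not guaranteed.
CLUB_PARAM = re.compile(
    r"^\s*\|\s*(clubs\d{1,2}|currentclub)\s*=[ \t]*(.*)$", re.MULTILINE
)
# Matches [[Target]] or [[Target|label]] and captures the link target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_clubs(wikitext):
    """Return the distinct club names found in the club parameters, in order."""
    clubs = []
    for _param, value in CLUB_PARAM.findall(wikitext):
        value = value.strip()
        if not value:
            continue  # parameter present but left empty
        link = WIKILINK.search(value)
        # Prefer the link target; fall back to the raw text if unlinked.
        name = link.group(1).strip() if link else value
        # Unlinked loan spells may look like "→ Club (loan)"; trim both ends.
        name = name.lstrip("→ ").strip()
        name = re.sub(r"\s*\(loan\)\s*$", "", name)
        if name and name not in clubs:
            clubs.append(name)
    return clubs


def target_categories(wikitext):
    """Build the category titles the bot would try to add."""
    return ["Category:%s players" % club for club in extract_clubs(wikitext)]
```

Using the link target rather than the displayed label means a piped link such as `[[Wrexham A.F.C.|Wrexham]]` yields the full article title, which is the form the team categories use.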
Discussion
Is this something that there has been a discussion on elsewhere? (Are you the sole "requester" of this task?) — xaosflux Talk 19:34, 23 June 2016 (UTC)
- It was requested on WP:BOTREQ. Is this something that needs consensus? It seems like uncontroversial maintenance to me, but, again, I have no other experience with bots on Wikipedia. I wanted to learn more about writing bots, and this request looked easyish, so I took it. KSFTC 19:41, 23 June 2016 (UTC)
- I just noticed you left the links to discussion part blank. — xaosflux Talk 19:43, 23 June 2016 (UTC)
- I didn't think that was a "discussion", but I've added it now. KSFTC 19:57, 23 June 2016 (UTC)
- For adding to [[:Category:[team] players]] - will you only be adding if the category exists? — xaosflux Talk 19:46, 23 June 2016 (UTC)
- I'm...not sure. Should I? KSFTC 19:57, 23 June 2016 (UTC)
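One way the existence question could be answered is with the MediaWiki `action=query` API, which marks nonexistent pages as missing. The sketch below only builds the request parameters and interprets a response; it sends nothing over the network. The helper names `build_exists_query` and `split_existing` are hypothetical, and the response shape assumed is the `formatversion=2` layout, where `query.pages` is a list and missing pages carry `"missing": true`.

```python
def build_exists_query(titles):
    """Build parameters for an action=query request on a batch of titles.

    Ordinary accounts may pass up to 50 titles per request.
    """
    return {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "titles": "|".join(titles),
    }


def split_existing(response):
    """Split the pages in a formatversion=2 response into (existing, missing)."""
    existing, missing = [], []
    for page in response.get("query", {}).get("pages", []):
        if page.get("missing"):
            missing.append(page["title"])
        else:
            existing.append(page["title"])
    return existing, missing
```

The bot could then add only the categories in the first list and report the second list for manual creation.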
As the person who raised this at the Bot requests page, I didn't want the bot to add the categories, I wanted it to identify missing categories and produce a list. There are going to be lots of complications with club identification which will need further sorting (see list below), so I thought it would be good for the bot to produce an output list which we can work on. The specific issues are:
- Missing categories (which has already been mentioned). I would like to create many of these, so the bot missing these off won't help.
- Renamed clubs – many clubs have changed names over the years (e.g. Newton Heath became Manchester United), so whilst the categories are at the current names, players who played for the clubs at the time of their former names would have that in the infobox, and so a matching category to add players to will not exist.
- Misnamed clubs – many articles do not have exactly the correct name of the club in the infobox (for instance linking to Manchester United rather than Manchester United F.C. or Wrexham F.C. rather than Wrexham A.F.C.), so the bot adding categories would in theory miss these out as there is no category for the incorrect link.
Happy to answer any more questions. Cheers, Number 57 12:05, 24 June 2016 (UTC)
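The renamed and misnamed cases listed above could be handled, for the names someone has already identified, with a hand-maintained lookup table, while everything else is flagged for human review. The following is a minimal sketch under that assumption. `CLUB_ALIASES` and `resolve_club` are hypothetical names, and the table holds only the three examples given in this comment; a real table would have to be compiled by editors.

```python
# Hypothetical, hand-maintained table:
# name as written in an infobox -> current club article title.
CLUB_ALIASES = {
    "Newton Heath": "Manchester United F.C.",      # renamed club
    "Manchester United": "Manchester United F.C.",  # incomplete link target
    "Wrexham F.C.": "Wrexham A.F.C.",               # incorrect suffix
}


def resolve_club(name, known_clubs, aliases=CLUB_ALIASES):
    """Return (club_name, needs_review) for a name taken from an infobox.

    known_clubs is the set of club titles that already have a players category.
    """
    name = name.strip()
    if name in known_clubs:
        return name, False
    if name in aliases:
        return aliases[name], False
    # Unknown name: leave it unchanged and flag it for a human to check.
    return name, True
```

Names that come back flagged would form the output list requested here rather than being edited automatically.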
- Oh, I misunderstood the request. Why do you want a list when a bot can fix it automatically instead? KSFTC 12:50, 24 June 2016 (UTC)
- Because I don't think it's possible to programme it to spot all the potential issues mentioned in points 2 and 3 above (there will be thousands of exceptions). Plus you've said it won't be creating new categories, so there needs to be a list of the missing ones to be created anyway. Number 57 15:39, 24 June 2016 (UTC)
- It currently doesn't create categories only because I thought that wasn't what you wanted; it would be easy to change that. If you just want a list of articles with discrepancies between the infobox and the categories, we wouldn't need a bot that edits pages, so we don't need this BRFA. I can generate a list like that, but I'm not sure how a bot would be able to tell whether a team had changed its name. KSFTC 19:19, 24 June 2016 (UTC)
- If you can generate a list, that would be great – I can use Excel or something to sort it. How do you generate such a list out of interest? Number 57 16:23, 26 June 2016 (UTC)
- @Number 57: I can use the API to get all the articles in a category, like Category:English footballers, and then check the "currentclub" and "clubs1" through "clubs40" parameters of the infobox, then check category links and compare them. I can check (imperfectly) for misnamed clubs by comparing the first words of the names; if they're the same, but the whole text is different, there's probably a mistake. There appear to be thousands to tens of thousands of articles just in Category:English footballers with discrepancies. If you're sure you don't just want them fixed automatically, I can withdraw this BRFA and give you a list. KSFTC 16:29, 26 June 2016 (UTC)
- I'd rather have the list if possible – I am worried there will be too many exceptions to make a bot workable. Cheers, Number 57 16:33, 26 June 2016 (UTC)
- @Number 57: I'm now running it on that category. I'll put the list somewhere in my userspace and update here when it's done, which should be in a few minutes. Once we've confirmed that it's mostly working, I'll run it on more articles. Let's continue this discussion on my talk page so I can withdraw this request and it can be closed. KSFTC 16:44, 26 June 2016 (UTC)
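The comparison described in this thread (infobox clubs against the article's categories, with a first-word check for probable misnamings) might look roughly like the sketch below. It is an illustration only, not the operator's actual code. `first_word` and `find_discrepancies` are hypothetical names, and the inputs are assumed to have been gathered already, for example via the API's `list=categorymembers` and `prop=categories` modules.

```python
PREFIX = "Category:"
SUFFIX = " players"


def first_word(name):
    """Lower-cased first word of a name, or an empty string if there is none."""
    words = name.split()
    return words[0].lower() if words else ""


def find_discrepancies(infobox_clubs, page_categories):
    """List clubs from the infobox that have no exactly matching category.

    Returns (club, status, category) tuples, where status is either
    "possible misname" (an existing "... players" category shares the club's
    first word) or "missing" (nothing similar is on the page).
    """
    player_cats = [
        c for c in page_categories
        if c.startswith(PREFIX) and c.endswith(SUFFIX)
    ]
    results = []
    for club in infobox_clubs:
        expected = PREFIX + club + SUFFIX
        if expected in page_categories:
            continue  # infobox and categories agree
        near = [
            c for c in player_cats
            if first_word(c[len(PREFIX):]) == first_word(club)
        ]
        if near:
            results.append((club, "possible misname", near[0]))
        else:
            results.append((club, "missing", expected))
    return results
```

As noted in the thread, the first-word check is imperfect: two distinct clubs from the same city share a first word, and a renamed club shares none with its current name, so the output is a list for manual review rather than a set of edits.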
Per the above discussion, this bot is Withdrawn by operator. KSFTC 16:46, 26 June 2016 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.