Wikipedia:Bots/Requests for approval/SvickBOT 3
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Svick (talk · contribs)
Automatic or Manually assisted: Automatic
Programming language(s): C#
Source code available: [1]
Function overview: Update Wikipedia:Dusty articles
Links to relevant discussions (where appropriate):
Edit period(s): Probably once a week.
Estimated number of pages affected: 2: Wikipedia:Dusty articles/List and Wikipedia:Dusty articles/Updated
Exclusion compliant (Y/N): No
Already has a bot flag (Y/N): No
Function details:
The list at Wikipedia:Dusty articles was updated by User:DustyBot until February, when it stopped working. This bot will perform the same function: it creates a list of the articles that have gone the longest without an edit (excluding disambiguation pages), based on a database dump and current data from the API, and publishes 100 of them. The original bot updated the page every day. I think that's unnecessary and once a week is enough.
I have already tested the bot: [2]. Svick (talk) 15:06, 27 June 2010 (UTC)
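The selection step described in the function details could look roughly like the sketch below. It is illustrative Python, not the bot's actual C# code, and it ignores the dump loading and API refresh entirely. The `Page` record, its `is_disambiguation` field and the `select_dusty` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from datetime import datetime
import heapq


@dataclass(frozen=True)
class Page:
    title: str
    last_edit: datetime       # timestamp of the latest known revision
    is_disambiguation: bool   # hypothetical flag, e.g. derived from categories


def select_dusty(pages, limit=100):
    """Return up to `limit` non-disambiguation pages, oldest last edit first."""
    candidates = (p for p in pages if not p.is_disambiguation)
    # nsmallest avoids sorting the whole candidate set when only `limit` are needed
    return heapq.nsmallest(limit, candidates, key=lambda p: p.last_edit)
```

Using a bounded selection rather than a full sort is only one possible design choice; with a database table ordered by last edit, a plain ordered query would do the same job.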
Discussion
Bots may not be tested outside of the bot's or operator's own userspace until after a trial has been approved. Anomie⚔ 15:26, 27 June 2010 (UTC)
- I thought it was okay because it was only those two edits, on pages that are in the Wikipedia namespace and where it can't cause any harm. It won't happen again. Svick (talk) 18:55, 27 June 2010 (UTC)
- Ok. I would like to review the code, and why did you use dashes to separate the time components? Anomie⚔ 19:08, 27 June 2010 (UTC)
- I made the output exactly the same as DustyBot's. The hyphens between time components are a (now fixed) bug. The code is available at [3]. It uses my library WpApiLib (source included in the archive), but most of it is not used in this app. Svick (talk) 21:18, 27 June 2010 (UTC)
- Here are the comments I have on your code:
- One thing I find quite useful for testing is the ability to set a flag that makes my library's edit function just write the proposed edit to a file. Or you could have it show you a diff of each proposed edit, the point being to be able to see what the bot would do without risking disrupting Wikipedia. Doing this is entirely optional, of course, but makes testing very easy.
- Something like this seems like a very good idea when editing articles, but I think it's not very useful for this task: the bot regenerates the page every time from scratch, using straightforward code and edits only pages where disruption would be minimal.
- The editing function in your library should specify the starttimestamp option (which is copied from the return property of the same name in your prop=info&intoken=edit call). Otherwise you may run into T17647, although that's not likely to happen in this task. Once you add that in, basetimestamp should be defaulted to that rather than left out if the latest revision timestamp is unavailable.
- Done.
- I see you use assert=user. Good!
- updateNeeded() should check the time since the bot last edited, rather than the time since anyone last edited. As written, someone could prevent the bot from ever editing by making sure to edit the page themself every 6 days.
- Yeah, fixed.
- It looks like the bot doesn't actually check if the database load fails. I guess that wouldn't matter except on the initial run, as it seems it keeps the database between runs.
- I'm not sure what you mean by this. If you mean the variable loadFailed in Update(), then that's meant only for the case when the bot reaches the newest edit from the dump, to keep it from entering an infinite loop.
- It also looks like loadFromDump() will never indicate success, as you set Settings.Default.StartDate to max.LastEdit and then return success only if max.LastEdit > Settings.Default.StartDate.
- Fixed.
- Will the dump loader correctly handle the situation when you feed it a new file, where some of the articles in the database from the old dump have newer revision dates and some of the articles have been deleted? Either case is somewhat likely, as the point of this task is to give editors a list of articles that haven't been touched in a long time. Or do you intend to empty the database before loading a new dump?
- After the bot reads the dump for the first time, it will read it again only when the SQL database table is almost empty. If it tries to add an article that's already there, it will keep the old version. I also don't do anything about deleted files when reading the dump. So, in both cases, if new info is available from the dump, the bot ignores it, but querying the API will solve both issues, so the result should be correct.
- Consider reading the list of namespace prefixes from the <namespaces> node in the dump, instead of hard-coding them into the bot.
- Good idea, done.
- For each page that ends up in the list and each non-dab page you reject, you make 2 API queries. For each dab page you reject, you make one API query. And even if the first 101 pages give you your 100, you still do this for the next 99. You could cut that down tremendously by making a query for prop=categories|revisions&rvprop=timestamp&clcategories=Category:All+disambiguation+pages&cllimit=max and packing up to 500 pages into the titles parameter (and then it wouldn't matter much if you query a few hundred extra pages).
- I hadn't thought of that. Done. Also, the limit for SvickBOT is 50 pages. I suppose that's because it doesn't have the bot flag.
- HTH Anomie⚔ 02:09, 28 June 2010 (UTC)
- Thanks for the review. Updated code is at the same address. Svick (talk) 02:36, 29 June 2010 (UTC)
- Sorry, I must have seen this on my watchlist when I didn't have time to reply and then forgot to come back to it. In the future, feel free to post {{BAGAssistanceNeeded}} after about a week of no replies. Changes look good, and I think I understand just what the bot is doing with the database now. Let's give it a try. Approved for trial (510 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Anomie⚔ 16:45, 15 July 2010 (UTC)
- Thanks, I have started the trial. Svick (talk) 10:49, 16 July 2010 (UTC)
- The trial has been finished. Can you review the edits? Thanks. Svick (talk) 17:46, 19 August 2010 (UTC)
Approved. No issues with the trial. Anomie⚔ 19:55, 19 August 2010 (UTC)
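For reference, the batched query suggested in the code review (prop=categories|revisions with several titles per request) could be sketched along the following lines. This is an illustrative Python sketch rather than the bot's C# code: the helper names `chunked` and `build_batched_queries` are hypothetical, the default batch size of 50 follows the limit reported above for an account without the bot flag, and the sketch only builds the request parameters without sending anything.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batched_queries(titles, batch_size=50):
    """Build one parameter dict per batch of titles instead of per-page requests."""
    return [
        {
            "action": "query",
            "format": "json",
            "prop": "categories|revisions",
            "rvprop": "timestamp",
            # restrict returned categories to the one that marks dab pages
            "clcategories": "Category:All disambiguation pages",
            "cllimit": "max",
            "titles": "|".join(batch),
        }
        for batch in chunked(list(titles), batch_size)
    ]
```

With 120 candidate titles and the default batch size, this yields three parameter sets where the per-page approach would have needed well over a hundred requests.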
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.