User talk:NKohli (WMF)

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
The Signpost
1 October 2018

Contents

Users eligible for autopatrolled[edit]

Hello,

During a conversation with @WereSpielChequers: and later @IJBall: the subject of identifying users who might be viable candidates for being granted the autopatrolled privilege came up. There is an existing database report which is intended to provide this information, but the report as it stands today isn't very useful. The main issues (as far as I am aware) are that it is not filtered to show editors who have created articles recently (say, in the last 30 days) and is does not exclude bots and a few other minor things.

I wrote a quick Perl script which uses the MediaWiki API to create a report which does take into account these factors. I have run it a few times, and have been manually pasting the output into my userspace page here.

This isn't a long-term solution, because apart from anything else I am both busy and forgetful :)

WereSpielChequers would ideally like the original database report page to be updated with my version of the report, instead of Community Tech Bot's version, but obviously at present the bot and I would be in conflict with each other, and it would be terribly untidy. So there are a few options I can think of for how we might move forward, and I wonder if you might have any suggestions or input.

1. Perhaps the tidiest thing would be if the existing report could be modified to use the same criteria that I use in mine (which are described at the top of the page), and then I would just stop running my report entirely. I don't know how likely that is to happen on a short timescale, but perhaps you would have some insight on this?

2. Alternatively, perhaps Community Tech Bot could be told to stop updating that page, and I could update it manually for now (and probably seek bot approval to have it run automatically each week).

3. I could update some new page elsewhere in the Wikipedia: namespace, and the existing database report page could maybe redirect to it.

4. Some other option? Community Tech Bot running my code maybe?

I'm not very familiar with the database reports service, or Community Tech Bot, so there may be some background I'm missing here which would make everything clearer; but for now we have two things. We have a page where everyone is looking for this information, and we have a report which is regularly being run to provide the information they are looking for. Unfortunately the two things are not connected in any way :)

Would appreciate your thoughts,

Thparkth (talk) 16:40, 29 December 2015 (UTC)

Hi! Firstly my apologies for this late reply. I've been on vacation. Secondly, that's a fantastic idea, thank you. I'd be very happy to modify the bot to run your script. The only thing that I need is the SQL query you used to generate it. I'll update the bot when I get it. Thanks a lot for helping maintain the reports. :) NKohli (WMF) (talk) 17:28, 5 January 2016 (UTC)
Thanks for getting back to me, hope you enjoyed your vacation :)
The current code is not SQL-based but rather uses Perl and MediaWiki::API. Hopefully this week or next I will have time to look at a pure SQL version. I'll get back to you on this :)
Thparkth (talk) 02:47, 6 January 2016 (UTC)
  • OK, so I have a pure-SQL version of this now which I have tested and appears to work successfully. Note that it is an expensive and long-running query - it will probably take five to six hours to run. This is similar to the performance of the Perl version. The basic issue is that calculating how many articles a user created (with all the various conditions in place) is quite an expensive operation. It might be possible to optimize it further than I already have, but I have literally spend at least twelve hours refactoring this in various radical ways, and this is the fastest option I could find. Anyway, this code works, and I believe should be ready to go!
/*
        autopatrolled_eligible.sql

        Identify users who meet the criteria for being granted "autopatrolled"
        on the English Wikipedia but who don't already have it.

        Author: Andrew Crawford (thparkth) <acrawford@laetabilis.com>

        License: Public Domain

*/

SELECT
        /* "editor" consisting of user_name, wrapped in HTML tags linking to the sigma "created" tool */
        CONCAT(
                '<a href="https://tools.wmflabs.org/sigma/created.py?name=',
                user_name,
                '&server=enwiki&max=100&startdate=&ns=,,&redirects=none&deleted=undeleted">',
                user_name,
                '</a>'
              ) AS editor,
        /* derived column "created count" returned by this subquery */
        (
                SELECT
                        count(*)
                FROM
                        revision_userindex
                JOIN
                        page
                ON
                        page_id=rev_page
                WHERE
                                page_namespace=0
                        AND
                                rev_parent_id=0
                        AND
                                rev_user_text=user_name
                        AND
                                rev_deleted=0
                        AND
                                page_is_redirect=0
        ) AS created_count
FROM
        (
        /* This query returns users who have created pages in the last 30 days
           and who are not already members of autoreviewed */

        SELECT DISTINCT
                user_name
        FROM
                recentchanges
        JOIN
                user
        ON
                rc_user=user_id
        JOIN
                page
        ON
                rc_cur_id=page_id
        WHERE
                        /* User created a page within the last thirty days */
                        rc_timestamp>date_format(date_sub(NOW(),INTERVAL 30 DAY),'%Y%m%d%H%i%S')
                AND
                        /* It was an article */
                        rc_namespace=0
                AND
                        /* The user was human */
                        rc_bot=0
                AND
                        /* It was a new page */
                        rc_new=1
                AND
                        /* It's not a redirect */
                        page_is_redirect=0
                AND
                        /* User doesn't already have autoreviewer */
                        NOT EXISTS
                        (
                                SELECT 1 FROM user_groups WHERE ug_user=user_id AND (ug_group='autoreviewer' OR ug_group='sysop')
                        )
        ) as InnerQuery
HAVING
        created_count > 24
ORDER BY
        created_count DESC
LIMIT
        500
;

This can also be found as /data/project/thparkthsql/autopatrolled_eligible.sql on Tool Labs.

By the way, I enjoy this kind of work, and I am gaining some familiarity with the schema and the Tool Labs environment so please do let me know if there is a backlog I could be helping with.

Cheers,

Thparkth (talk) 14:11, 7 January 2016 (UTC)

Hi @Thparkth: I'm extremely sorry it took me this long to get back to you. I just updated the report. Let's hope this works. I'm a little worried since it's an expensive query and often queries running more than a couple of hours are killed abruptly. Aside, would you like to be a co-maintainer for the database-reports labs tool? There are several other reports lying dead which could use some love. :) It's a simple python-based bot, whose code can be seen at https://github.com/Niharika29/database-reports Thank you so much! -- NKohli (WMF) (talk) 09:47, 15 April 2016 (UTC)
Thanks NKohli, that's great. Thanks for tolerating my eccentric SQL indentation style ;). Virtually all of the processing time is spent in working out the created count for each user. An index on rev.rev_parent_id would probably help a great deal with that. I will probably go back to this now and try to find away to avoid using that column entirely; finding the lowest rev.rev_id for the page should be logically the same thing and it's a primary key!
Yes, I would love to be a co-maintainer for database-reports, and thank-you for the opportunity.
Thparkth (talk) 10:44, 15 April 2016 (UTC)
I added you as a maintainer for the tool. Are you on Github? Then I can add you as a collaborator on the repo and you can push code easily. :) Also, are you on IRC? -- NKohli (WMF) (talk) 15:34, 15 April 2016 (UTC)
I just created a github user (thparkth). I am very familiar with git but less so with github. I am "thparkth" on IRC also. Thparkth (talk) 16:37, 15 April 2016 (UTC)

Welcome to The Wikipedia Adventure![edit]

TWA guide left bottom.png
Hi NKohli (WMF)! We're so happy you wanted to play to learn, as a friendly and fun way to get into our community and mission. I think these links might be helpful to you as you get started.

-- 15:58, Wednesday, January 13, 2016 (UTC)

Get Help
About The Wikipedia Adventure | Hang out in the Interstellar Lounge

API change will break your bot[edit]

Hi Niharika,

Did you see the announcement that http:// access to the API is going away in a few weeks? It looks like the Community Tech bot is going to break as a result of this change.

BTW, I'm encouraging people who need help updating their code to ask at w:en:Wikipedia:Bot owners' noticeboard or on the mailing list. I don't know whether there will be any requests, but if you can help, it'd probably make someone's day much happier. :-) Whatamidoing (WMF) (talk) 23:50, 19 May 2016 (UTC)

Thank you for pointing this out to me, Whatamidoing (WMF)! I did indeed miss the announcement. I'll fix up the bot over the weekend and try to keep an eye on the mailing list and the noticeboard as well. Really happy to help. :) Thank you so much. NKohli (WMF) (talk) 03:29, 20 May 2016 (UTC)
Hey Whatamidoing (WMF) Can you point me to where I should be making the https fix? I couldn't see where I'm using http in my code instead of https (in a very cursory glance through). Thanks! -- NKohli (WMF) (talk) 06:43, 21 May 2016 (UTC)
I don't really know anything except that your bot was on the list. :-(
Brandon Black in Ops should be able to provide you with more information. Whatamidoing (WMF) (talk) 05:11, 22 May 2016 (UTC)
Okay! Thank you. :) -- NKohli (WMF) (talk) 06:07, 22 May 2016 (UTC)

Database Report[edit]

Hi Niharika. Wikipedia:Database reports/Unused templates has been broken for over two years, and I haven't had much luck finding someone from the community interested in getting it back up and running. Is this something you or others on your team might be able to help with? If it's a technically trivial extension, low transclusion templates (1 or 2 transclusions) would be of interest as well. ~ Rob13Talk 05:27, 27 July 2016 (UTC)

Hi Rob, thanks for bringing this to my attention. I'd try and look into it soon. -- NKohli (WMF) (talk) 16:24, 28 July 2016 (UTC)

Community Tech bot[edit]

I noticed that Community Tech bot has stopped updating as of the end of August. Can you give the bot a push and get it working again? Or if this is a known problem, do you have an ETA on when it might be back up? Thanks. Pinging Thparkth on this, too. -- Gogo Dodo (talk) 01:58, 7 September 2016 (UTC)

Hey @Gogo Dodo:, thanks for the ping. I'll look into the problem today. -- NKohli (WMF) (talk) 08:30, 7 September 2016 (UTC)
Okay, this should be fixed now. The bot will resume updating pages as usual. Feel free to report back if you find any more problems with this. Thanks! -- NKohli (WMF) (talk) 11:51, 7 September 2016 (UTC)
Thank you for fixing it. It is updating now. While I'm here, I noticed that a few talk pages listed are false positives and have been appearing for awhile now. Can you check it out? The ones that keep reappearing are:
It is a rather bizarre mix and I don't see any obvious connection between them. Thanks again. -- Gogo Dodo (talk) 05:19, 11 September 2016 (UTC)
Hi @Gogo Dodo: which report is this for? -- NKohli (WMF) (talk) 11:15, 12 September 2016 (UTC)
It's Wikipedia:Database reports/Orphaned talk pages. All the above pages except Talk:DDWG were created between 22 April and 3 May 2016. DDWG was moved to that name 6 May 2016. Maybe a database got some bad entries in that period. I recall an incident where some pages were designated as mainspace pages somewhere even though their name specified another namespace, but I think that was longer ago. PrimeHunter (talk) 11:42, 12 September 2016 (UTC)
The latest database reports by the bot have omitted <onlyinclude>...</onlyinclude> around the timestamp. Example: [1]. It's needed to only transclude the timestamp and not the whole report at Wikipedia:Database reports. I have fixed two reports so right now there is no problem at Wikipedia:Database reports. PrimeHunter (talk) 01:00, 10 September 2016 (UTC)
Hi @PrimeHunter: I don't see any difference between [2] and [3]. What's different? -- NKohli (WMF) (talk) 15:24, 10 September 2016 (UTC)
The page itself looks the same. onlyinclude only affects how it looks when it's transcluded on another page like Wikipedia:Database reports where only the part inside <onlyinclude>...</onlyinclude> will be shown. When the bot version was current the entire page was shown in a single cell in the "Data as of" column at the "PRODed articles with deletion logs" row. My edit [4] means only "06:30, 10 September 2016 (UTC)" is currently shown in that cell. For testing, try previewing this section without nowiki around the below code. PrimeHunter (talk) 15:49, 10 September 2016 (UTC)
Understood. I'll fix it by Sunday. Thanks! -- NKohli (WMF) (talk) 18:01, 10 September 2016 (UTC)
I have added a parameter to omit the transclusion for now so there is no rush.[5] PrimeHunter (talk) 11:02, 12 September 2016 (UTC)
Thanks for that. This should be fixed now (in future report updates, that is). I pushed those reports today so they should update soon. I'll try to keep an eye on it and make sure it behaves. -- NKohli (WMF) (talk) 11:12, 12 September 2016 (UTC)
Thanks for adding onlyinclude taqs. The timestamp should be inside the tags like the old revision in [6] in order to display on transclusion at Wikipedia:Database reports. The current revision only displays "..." on transclusion. PrimeHunter (talk) 11:10, 13 September 2016 (UTC)
Oops, sorry. I changed it to <onlyinclude>~~~</onlyinclude>. Is that okay? Thanks. -NKohli (WMF) (talk) 15:43, 15 September 2016 (UTC)
You need five tildes for a timestamp alone. Three tildes make a username without timestamp so this should work: ~~~ <onlyinclude>~~~~~</onlyinclude>. PrimeHunter (talk) 18:47, 15 September 2016 (UTC)
It's working perfectly now. I manually moved the timestamp in some weekly and monthly reports which I guess will automatically do it in future updates. Regarding the above post by Gogo Dodo, Wikipedia talk:Database reports#Zombies in the database shows a report by another bot is also unable to detect that Category:216 BC births and Category:235 BC births exist, so I guess it's a database issue and not a problem in the bot code. PrimeHunter (talk) 11:36, 17 September 2016 (UTC)

[[Wikipedia:Database reports/PRODed articles with deletion logs]]: {{Wikipedia:Database reports/PRODed articles with deletion logs}} [[Wikipedia:Database reports/Articles by size]] (doesn't use onlyinclude but isn't currently transcluded at [[Wikipedia:Database reports]]: {{Wikipedia:Database reports/Articles by size}}

User:NKohli (WMF)/megawatch.js[edit]

To get around the 50 page restriction, you can split the result array into multiple arrays each with 50 pages...

//Split into lists of 50 (max number for api)
result_split = new Array( Math.ceil(result.length/50) );
for (var j=0; j<result.length; j+=50) {
	result_split[j/50] = result.slice(j, j+49);
}

and then make multiple api calls with a loop...

var count=0;
for (var ii = 0; ii < result_split.length; ii++) {
	new mw.Api().get( {
	// ... (your code) ...
	titles: result_split[ii].join( '|' )
	// ... (your code) ...
	.done( function () { 
		count++;
		if (count === result_split.length) {
			$( '#megawatch' ).addClass( 'watched' );
		}
	} );
}

Though you might still want to limit the number of watched pages to a reasonable number, as there are some pretty large categories out there. - Evad37 [talk] 23:48, 27 January 2017 (UTC)

This is useful. Thanks! I'll update the script soon. :) -- NKohli (WMF) (talk) 11:06, 2 February 2017 (UTC)

Your BRFA[edit]

Your BRFA, Wikipedia:Bots/Requests for approval/Community Tech bot 3 has been approved. — xaosflux Talk 22:16, 22 February 2017 (UTC)

Thanks! -- NKohli (WMF) (talk) 03:46, 23 February 2017 (UTC)

Your feedback matters: Final reminder to take the global Wikimedia survey[edit]

Popular pages report[edit]

What is the popular pages report and should WP:CHICAGO have one?--TonyTheTiger (T / C / WP:FOUR / WP:CHICAGO / WP:WAWARD) 20:02, 24 March 2017 (UTC)

Hi! The popular pages report shows the top-viewed 500/1000 projects in a Wikiproject in the past month. See Wikipedia:WikiProject Alaska/Popular pages for example. There was a bot, Mr. Z bot which used to generate these reports previously but it died a few months ago and the reports were broken. So we made a new bot to do the same job. If you do not want a report for WikiProject Chicago I can remove it from the bot config. The current report is at Wikipedia:WikiProject Chicago/Popular pages. Thanks. -- NKohli (WMF) (talk) 20:16, 24 March 2017 (UTC)

User:Community Tech bot/Popular pages config.json[edit]

How do I set up User:Community Tech bot/Popular pages config.json for a project? Looking to do it for Wikipedia:WikiProject Disney and Wikipedia:WikiProject Amusement Parks. Elisfkc (talk) 03:59, 27 April 2017 (UTC)

Hi Elisfkc, I see that both WikiProject Disney and Amusement Parks are listed in that config. So that means that the bot is already working on generating their reports. The bot is running a little behind schedule since this is its second run and the code is getting changed frequently but both of those projects should be updated soon. You can find the reports here: Wikipedia:WikiProject Amusement Parks/Popular pages and Wikipedia:WikiProject Disney/Popular pages. -- NKohli (WMF) (talk) 04:51, 27 April 2017 (UTC)
Oh, I just didn't see them. Thanks. Elisfkc (talk) 13:47, 27 April 2017 (UTC)

Wikipedia:WikiProject New York City Public Transportation/Popular pages[edit]

Could you possibly help set up this page? Thanks.--Kew Gardens 613 (talk) 12:17, 6 August 2017 (UTC)

Done! Sorry for the delay, I was on vacation. Let me know if anything looks out of place. Thank you! -- NKohli (WMF) (talk) 22:41, 9 August 2017 (UTC)

Tech News: 2017-40[edit]

23:25, 2 October 2017 (UTC)

Tech News: 2017-41[edit]

14:21, 9 October 2017 (UTC)

Tech News: 2017-42[edit]

15:31, 16 October 2017 (UTC)

Tech News: 2017-43[edit]

18:18, 23 October 2017 (UTC)

Tech News: 2017-44[edit]

00:20, 31 October 2017 (UTC)

Tech News: 2017-45[edit]

18:45, 6 November 2017 (UTC)

Tech News: 2017-46[edit]

19:19, 13 November 2017 (UTC)

Tech News: 2017-47[edit]

19:19, 20 November 2017 (UTC)

Tech News: 2017-48[edit]

20:30, 27 November 2017 (UTC)

Tech News: 2017-49[edit]

17:51, 4 December 2017 (UTC)

Tech News: 2017-50[edit]

17:58, 11 December 2017 (UTC)

Tech News: 2017-51[edit]

15:27, 18 December 2017 (UTC)

Tech News: 2018-02[edit]

16:19, 8 January 2018 (UTC)

Tech News: 2018-3[edit]

18:46, 15 January 2018 (UTC)

Tech News: 2018-04[edit]

23:56, 22 January 2018 (UTC)

Tech News: 2018-05[edit]

17:07, 29 January 2018 (UTC)

Tech News: 2018-06[edit]

20:51, 5 February 2018 (UTC)

Tech News: 2018-07[edit]

21:59, 12 February 2018 (UTC)

Tech News: 2018-08[edit]

22:55, 19 February 2018 (UTC)

Tech News: 2018-09[edit]

19:52, 26 February 2018 (UTC)

Tech News: 2018-10[edit]

17:12, 5 March 2018 (UTC)

Tech News: 2018-11[edit]

19:44, 12 March 2018 (UTC)

Tech News: 2018-12[edit]

15:03, 19 March 2018 (UTC)

Tech News: 2018-13[edit]

20:04, 26 March 2018 (UTC)

Tech News: 2018-14[edit]

19:28, 2 April 2018 (UTC)

Tech News: 2018-15[edit]

18:08, 9 April 2018 (UTC)

Tech News: 2018-16[edit]

15:21, 16 April 2018 (UTC)

Tech News: 2018-17[edit]

18:16, 23 April 2018 (UTC)

Tech News: 2018-18[edit]

16:18, 30 April 2018 (UTC)

Tech News: 2018-19[edit]

16:28, 7 May 2018 (UTC)

Tech News: 2018-20[edit]

22:23, 14 May 2018 (UTC)

Tech News: 2018-21[edit]

17:34, 21 May 2018 (UTC)

Tech News: 2018-22[edit]

12:40, 29 May 2018 (UTC)

Tech News: 2018-23[edit]

21:54, 4 June 2018 (UTC)

Tech News: 2018-24[edit]

21:55, 11 June 2018 (UTC)

Tech News: 2018-25[edit]

21:47, 18 June 2018 (UTC)

Tech News: 2018-26[edit]

23:10, 25 June 2018 (UTC)

Tech News: 2018-27[edit]

00:46, 3 July 2018 (UTC)

Tech News: 2018-28[edit]

23:09, 9 July 2018 (UTC)

Tech News: 2018-29[edit]

16:01, 16 July 2018 (UTC)

Tech News: 2018-30[edit]

09:44, 24 July 2018 (UTC)

Tech News: 2018-31[edit]

14:05, 30 July 2018 (UTC)

Tech News: 2018-32[edit]

19:39, 6 August 2018 (UTC)

Tech News: 2018-33[edit]

17:53, 13 August 2018 (UTC)

Tech News: 2018-34[edit]

16:46, 20 August 2018 (UTC)

Tech News: 2018-35[edit]

16:16, 27 August 2018 (UTC)

Tech News: 2018-36[edit]

16:47, 3 September 2018 (UTC)

Tech News: 2018-37[edit]

22:35, 10 September 2018 (UTC)

Tech News: 2018-38[edit]

21:58, 17 September 2018 (UTC)

Tech News: 2018-39[edit]

15:22, 24 September 2018 (UTC)

Tech News: 2018-40[edit]

17:34, 1 October 2018 (UTC)

Tech News: 2018-41[edit]

23:38, 8 October 2018 (UTC)

Tech News: 2018-42[edit]

22:40, 15 October 2018 (UTC)

Tech News: 2018-43[edit]

23:11, 22 October 2018 (UTC)