Wikipedia talk:List of Wikipedians by number of edits

When two people have an equal number of edits ...

When two people have an equal number of edits, why can't they be listed as equal in rank? I am quite sure the computer program (or whatever it is) can handle this simple math calculation. No? Joseph A. Spadaro (talk) 04:10, 11 May 2016 (UTC)

Anyone? Joseph A. Spadaro (talk) 04:32, 7 June 2016 (UTC)
Pinging MZMcBride, the operator of the bot that updates the lists. SiBr4 (talk) 11:32, 7 June 2016 (UTC)
Thanks. Joseph A. Spadaro (talk) 16:15, 7 June 2016 (UTC)
Hi Joseph A. Spadaro. The numbering is simple enumeration, not a ranking. ;-) --MZMcBride (talk) 20:57, 8 June 2016 (UTC)
@MZMcBride: Hi. Thanks. Of course, it's a ranking. If it is indeed simply an enumeration, what order is the list in? Is it in alphabetical order? If it were simply an enumeration, the list would have all 5,000 people in alphabetical order with an indication of how many edits they have. Then, the number would be an enumeration of the people in the list. This list is ranked from Person #1 down to Person #5,000. And the highest person is ranked (not enumerated) as #1 and the 5000th person is ranked (not enumerated) as #5000. Or are you saying that it's a huge coincidence that the listing of how many edits a user has is listed in decreasing order? That's just a simple coincidence, that they are enumerated that way? And if the list is called "Top 5,000 Editors", we know that the enumeration is 5,000. Why would we need it enumerated at all, in that case? All in all, your reply does not make any sense at all. And you are using semantics. We all know it's a rank, not an enumeration. If indeed it were an enumeration, there would be no need for the number (since we know the list is a list of 5,000 people). What part am I missing? Joseph A. Spadaro (talk) 00:25, 9 June 2016 (UTC)
Hi Joseph A. Spadaro. I suppose the part you're missing is the emoticon (;-)). My reply was only half-serious.
This list contains 10,000 users, not 5,000. It's possible (and perhaps even common!) to have both ordering/sorting and simple enumeration. Have you used spreadsheet software? Imagine you took a list of numbers in a spreadsheet and sorted them by value. You'd still have simple enumeration of the cells, even with an ordered/sorted list. I can provide a screenshot if you'd like. --MZMcBride (talk) 22:33, 22 November 2016 (UTC)
@MZMcBride: Thanks. But I have no idea what you are talking about. First, when I look at the article, my computer screen shows 5,000 editor names. Not 10,000. I don't know why you are saying 10,000? Second, yes, I missed the emoticon. Third, yes, I am very familiar with Excel spreadsheets and the sort function. Fourth, I don't want to engage in semantics of "enumerating" versus "ranking". It is clear that this list ranks the editors from highest count to lowest count. The list is not simply "enumerating" or counting them. We already know that there are 5,000. So, there is no point in enumerating (counting) them. Either way, we are getting off topic. My question is about listing "equal" ranks when the edit counts are the same. Thanks. Joseph A. Spadaro (talk) 00:56, 23 November 2016 (UTC)
Hi Joseph A. Spadaro. When you visit Wikipedia:List of Wikipedians by number of edits, there are not 5,000 editor names. Some of them (about 50) have been replaced by the text "[Placeholder]", so it's a bit under 5,000. The list itself is made up of ten subpages, each with 1,000 entries. "Page 2" of the list is available at Wikipedia:List of Wikipedians by number of edits/5001–10000. I feel like we used to make this clearer. It's mentioned a couple of times at Wikipedia:List of Wikipedians by number of edits, but this continuation of the report should probably have its own subsection so that it's less likely to be overlooked.
While it may be clear to you that this list is a ranking, it's less clear to me. In the same way that Excel would refer to duplicate or triplicate values in a sorted list as having different cell enumerations, this list similarly does not mark entries where the alleged edit count value is the same for more than one user. It wouldn't be particularly difficult to change the simple enumeration to be more like a ranking, but I'm not sure doing so would be a good idea. --MZMcBride (talk) 03:54, 23 November 2016 (UTC)
Ok. Thanks. Joseph A. Spadaro (talk) 04:39, 23 November 2016 (UTC)

(unindent) The "5001–10000" section heading was removed in this edit from February 2016.

As mentioned at the top of this talk page, the script used to generate this report is available at Wikipedia:List of Wikipedians by number of edits/Configuration. In thinking about this discussion a bit more, perhaps removing the numbering altogether would be best. --MZMcBride (talk) 06:16, 23 November 2016 (UTC)

IIRC the page did list 7,000 or perhaps 8,000 at one time (it never showed the full 10,000) - it maxed out somewhere around the 8,000 level, and since it was taking far too long to render, some years ago we agreed to cut it back to 5,000. --Redrose64 (talk) 10:19, 23 November 2016 (UTC)
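
The difference between simple enumeration and a tie-aware ranking, as debated in this thread, can be sketched in a few lines of Python. This is only an illustration and not the code the bot actually runs; the function names and the sample user names and counts are made up. It assumes "standard competition" ranking (1, 2, 2, 4), which is one of several possible conventions for ties.

```python
def enumerate_rows(counts):
    """Simple enumeration: position in the sorted list, ties ignored."""
    # Sort by edit count (descending), then by name for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(i, name, n) for i, (name, n) in enumerate(ordered, start=1)]


def rank_rows(counts):
    """Competition ranking: users with equal counts share a rank."""
    rows = []
    prev_count, prev_rank = None, 0
    for position, name, n in enumerate_rows(counts):
        # Reuse the previous rank on a tie; otherwise use the list position.
        rank = prev_rank if n == prev_count else position
        rows.append((rank, name, n))
        prev_count, prev_rank = n, rank
    return rows


# Hypothetical sample data, not real edit counts.
sample = {"Alice": 500, "Bob": 300, "Carol": 300, "Dave": 100}

for row in rank_rows(sample):
    print(row)
```

With enumeration, Bob and Carol get 2 and 3; with ranking, both get 2 and Dave gets 4.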

Question

I don't know how easy or how complicated it is to gather the data used for the list on this page. I assume it's all done by computer and, as such, it is not particularly difficult for any one specific editor. So I have a question and/or suggestion. Is it possible to add another column (or two) that indicates an editor's rank/standing now; and the previous rank/standing; and perhaps the difference? Is that feasible or no? For example, the list would say, in various columns: User Name = John Smith; Number of Edits = 43,987; Rank = 732; Previous Rank = 740; Change = +8 (or similar). Thanks. Joseph A. Spadaro (talk) 16:26, 29 June 2016 (UTC)

I think that the standard Wikipedia answer, as it always is, is: no problem. Especially when it is "not particularly difficult," DO IT. Carptrash (talk) 18:14, 29 June 2016 (UTC)
Huh? I didn't understand a word you said. Joseph A. Spadaro (talk) 19:48, 29 June 2016 (UTC)
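
For what it's worth, the extra columns suggested above (rank, previous rank, change) could be computed roughly as follows, assuming two snapshots of edit counts are available. This is a hedged sketch only: the function names and sample data are hypothetical, the existing report script is not known to keep a previous snapshot, and ties are broken by user name here.

```python
def ranks(counts):
    """Map each user name to its 1-based position by descending edit count."""
    ordered = sorted(counts, key=lambda name: (-counts[name], name))
    return {name: i for i, name in enumerate(ordered, start=1)}


def rank_changes(previous, current):
    """Return rows of (name, edits, rank, previous_rank, change).

    change is positive when the user moved up the list; both previous_rank
    and change are None for users absent from the previous snapshot.
    """
    prev_rank = ranks(previous)
    cur_rank = ranks(current)
    rows = []
    for name in sorted(cur_rank, key=cur_rank.get):
        old = prev_rank.get(name)
        change = None if old is None else old - cur_rank[name]
        rows.append((name, current[name], cur_rank[name], old, change))
    return rows


# Hypothetical snapshots, not real data.
previous = {"A": 100, "B": 90, "C": 80}
current = {"A": 100, "B": 90, "C": 95, "D": 10}

for row in rank_changes(previous, current):
    print(row)
```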

Generating this report for other Wikimedia wikis

I got asked about generating this report for/on another Wikimedia wiki, specifically the Bengali Wikipedia. I ended up writing a reply with pretty specific instructions that others may find useful or interesting. A slightly cleaned up version of this reply is pasted below. --MZMcBride (talk) 04:49, 20 July 2016 (UTC)

The code lives at <https://github.com/mzmcbride/database-reports>, sort of.

For that particular report, the code actually lives here: <https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_by_number_of_edits/Configuration>.

Do you have a shell user account on Wikimedia Tool Labs? You can create a text file on Wikimedia Tool Labs and then run "python your_file_name.py" to generate the report. If you encounter errors, just paste the error text in this e-mail thread or in a pastebin and I can look and we can figure out what went wrong.

It looks like the version of wikitools globally installed on Wikimedia Tool Labs doesn't work well with bn.wikipedia.org/w/api.php, so you'll need to locally install wikitools. And because you'll then be using a virtual environment version of Python, you'll also need to install MySQLdb locally.

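Try this. Save the report script from the Configuration page as editcount.py, then create a file named settings.py in the same directory with the following contents: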

username = 'bot name goes here'
password = 'password goes here'
editsumm = 'Bot: Updated page.'
dbname = 'bnwiki_p'
host = 'bnwiki.labsdb'
rootpage = 'Wikipedia:Database reports/'
apiurl = 'https://bn.wikipedia.org/w/api.php'
  • Then run this command: chmod 600 settings.py
  • Then run this command: git clone "https://github.com/alexz-enwp/wikitools.git" wikitools
  • Then run this command: virtualenv venv
  • Then run this command: source venv/bin/activate
  • Then run this command: cd wikitools
  • Then run this command: python setup.py install
  • Then run this command: pip install "MySQL-python"
  • Then run this command: cd ..

Now you should be in a directory with settings.py, editcount.py, a wikitools directory, and a venv directory; if you run the "ls" command you should now see:

(venv)tools.mzmcbride@tools-bastion-03:~/scripts/bnwiki$ ls
editcount.py  settings.py  venv  wikitools

You can verify that you have installed the correct versions of wikitools and MySQLdb by running these commands:

(venv)tools.mzmcbride@tools-bastion-03:~/scripts/bnwiki$ python -c 'import MySQLdb; print(MySQLdb.__version__)'
1.2.5

and...

(venv)tools.mzmcbride@tools-bastion-03:~/scripts/bnwiki$ python -c 'import wikitools; print(wikitools.VERSION)'
1.4
  • Finally, run this command: python editcount.py

That should do it. I followed these steps and began creating a report on bn.wikipedia.org: <https://bn.wikipedia.org/wiki/%E0%A6%AC%E0%A6%BF%E0%A6%B6%E0%A7%87%E0%A6%B7:Contributions/BernsteinBot>.

I hit two issues. I don't have a bot account on bn.wikipedia.org, so I hit edit rate limits. And there's no "Template:aug" on bn.wikipedia.org, so the user group abbreviations don't work properly.

List of Wikipedians by number of non-automated edits

It recently came to my attention that there is a possibility to count non-automated edits separately:

Is it relatively easy to generate the NoAuto edit list? IMO it would be a better indicator of content creation activity, as compared to mopping up. Not that I disrespect the mop; I am doing my share of mopping, but I kinda feel uneasy about this list being dominated by "maintenance editors". Can someone do an experimental NoAuto list, to see what it looks like? I am asking this because I cannot check it myself online for the top 100 (see 'Koavf' above). Staszek Lem (talk) 19:32, 13 October 2016 (UTC)

Hi Staszek Lem. I guess it depends on your definition of relatively easy. :-) You'd need to go through the contributions of every user, count the automated edits, count the non-automated edits, and then generate a report based on that. It's a somewhat straightforward task, but you're talking about sorting and counting about 860 million edits, according to Special:Statistics currently. That's a lot of data! Plus, and perhaps most importantly, you'd need to define what an automated edit is. The tool you're using has done this in some way, but it's almost certainly based on inspecting edit summaries and guessing based on the presence of certain strings. This is suboptimal, but at least with clear criteria, your request becomes even more straightforward to implement/generate. It's just a lot of scanning and processing. --MZMcBride (talk) 03:28, 23 November 2016 (UTC)
But you are already counting every user and every edit. Or not? Staszek Lem (talk) 03:30, 23 November 2016 (UTC)
Well I'm certainly not! The bot uses a stored/precomputed edit count value that's available in the user table. --MZMcBride (talk) 03:56, 23 November 2016 (UTC)
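
To illustrate the edit-summary approach described above, here is a rough Python sketch that classifies edits by looking for marker strings in their summaries. The marker strings are purely hypothetical placeholders, not the criteria any real tool uses, and the function names are made up. A real run would also have to scan every edit rather than read the precomputed edit count from the user table, which is the expensive part.

```python
# Hypothetical marker strings; a real definition of "automated" would need
# to be agreed on first.
AUTOMATED_MARKERS = ("bot:", "[tool-assisted]")


def is_automated(summary, markers=AUTOMATED_MARKERS):
    """Guess whether an edit was automated from its edit summary."""
    text = summary.lower()
    return any(marker in text for marker in markers)


def count_edits(summaries):
    """Split a list of edit summaries into automated and non-automated counts."""
    automated = sum(1 for s in summaries if is_automated(s))
    return {"automated": automated, "non_automated": len(summaries) - automated}


# Hypothetical edit summaries for one user.
example = ["Bot: Updated page.", "fix typo", "cleanup [tool-assisted]", ""]
print(count_edits(example))
```

Because the guess is based only on strings in the summary, edits with blank or unusual summaries are counted as non-automated, which is one reason this approach is suboptimal.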

Bot to update did not run for 22 November 2016 yet

Right now, the page indicates "This is a list of Wikipedians sorted by edit count as of 04:03, 21 November 2016 (UTC)." It is now after 16:00, 22 November 2016 (UTC). The bot that creates this list seems to be running intermittently rather than daily. Who needs to address this? Peaceray (talk) 16:11, 22 November 2016 (UTC)

@Peaceray: First check how the page is built - it's transcluded from Wikipedia:List of Wikipedians by number of edits/1–1000, Wikipedia:List of Wikipedians by number of edits/1001–2000 etc. Then look at the revision history for any of those (e.g. this one), you'll see that it's updated by BernsteinBot (talk · contribs). Click one of the talk page links, and you'll see that there is already a thread on this matter. If you check the user page, you'll see that it's a bot operated by MZMcBride (talk · contribs). --Redrose64 (talk) 20:53, 22 November 2016 (UTC)
Hi Peaceray. Is there some reason you're particularly interested in daily updates of this report? If the bot misses a day or two, does it matter to you? --MZMcBride (talk) 22:23, 22 November 2016 (UTC)
This was consistently running as a daily report, so I am wondering why this changed. I work in IT, & am always curious about any change that affects the running of a batch job that I use to monitor things. Oh, & you may call it vanity, but I do follow it daily. Peaceray (talk) 22:28, 22 November 2016 (UTC)
If you work in IT, I would think that you would know how finicky computers can be! Tool Labs isn't the most stable platform. I looked at the logs and don't see a reason that the report failed to update on November 22, 2016. Perhaps the report will update again in a few hours. We'll see. --MZMcBride (talk) 22:37, 22 November 2016 (UTC)
Somewhat bizarrely, it appears the report is updating, but the Age page sometimes isn't. A look at the bot's recent edits confirms this. Weird. --MZMcBride (talk) 03:38, 23 November 2016 (UTC)
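
For anyone who wants to monitor this themselves, a small Python sketch like the one below could flag a stale report by parsing the "as of" line quoted at the top of this section. The function names and the 24-hour threshold are assumptions for illustration, not part of the bot. It also assumes an English locale so that the month name parses.

```python
import re
from datetime import datetime, timedelta


def parse_as_of(text):
    """Extract the 'as of HH:MM, D Month YYYY (UTC)' timestamp, or None."""
    match = re.search(r"as of (\d{2}:\d{2}, \d{1,2} \w+ \d{4}) \(UTC\)", text)
    if match is None:
        return None
    # Naive datetime, interpreted as UTC by convention.
    return datetime.strptime(match.group(1), "%H:%M, %d %B %Y")


def is_stale(text, now, max_age=timedelta(hours=24)):
    """Return True if the timestamp is missing or older than max_age."""
    stamp = parse_as_of(text)
    return stamp is None or now - stamp > max_age


line = ("This is a list of Wikipedians sorted by edit count "
        "as of 04:03, 21 November 2016 (UTC).")
now = datetime(2016, 11, 22, 16, 11)  # time of the original post, in UTC
print(parse_as_of(line), is_stale(line, now))
```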