Wikipedia talk:India Education Program/Archive 2


Community work load

NPP backlog 4 Nov 2011

I think this graph clearly demonstrates the massive action by volunteers to clean up someone else's mess.

Now that the word has got round that the IEP has been (partly) closed, the backlog is again on the increase.

Today, there is only 1 (one) patroller on duty...

Heartfelt thanks to everyone who rolled their sleeves up.--Kudpung กุดผึ้ง (talk) 04:20, 5 November 2011 (UTC)

I think I checked more pages than a copyright attorney this past month, though some of it was also existing articles. I got one hell of a crash course in picking out close paraphrasing and copies from multiple websites. The Blade of the Northern Lights (話して下さい) 05:26, 5 November 2011 (UTC)
Ha! me too! Kudpung กุดผึ้ง (talk) 07:00, 5 November 2011 (UTC)

Pilot program?

I am having difficulty understanding how a supposed pilot program was able to grow so large and out of control. The whole idea of a pilot is that it is small, contained, and doesn't risk resources. Continuing to refer to this giant fiasco as a pilot strikes me as minimizing rhetoric rather than the thorough review of a deeply flawed process that we need. Jojalozzo 02:39, 6 November 2011 (UTC)

Roughly 20,000 registered accounts made more than 10 article space edits in September (source). This program was intended to add 1,000 new editors at roughly that level of activity, an increase of 5%. Make of that what you will. Danger High voltage! 03:06, 6 November 2011 (UTC)
- a frightening proposition indeed, and serious enough to firmly insist on a delay before the next phase starts. What's going to happen ultimately, is that with all the other issues with the WMF this year, the regular community is going to be extremely skeptical of wanting to offer any more help. I had a silly comment from another admin yesterday that wrecked my day so I sat next to my pool in the tropical sun for the rest of the afternoon and mused over whether we are shouting into the void, or if what we do here is worth it at all - it's anyway only a drop in the (Indian) ocean, and nobody is listening to the drips (pun). --Kudpung กุดผึ้ง (talk) 04:57, 6 November 2011 (UTC)
That "5%" looks soothingly manageable, until you consider that, unlike most of the 20,000, the thousand new editors (a) are quite new to Wikipedia, (b) in many cases, it seems, are not fluent in English, (c) do not seem to have been given even a rudimentary briefing on things like wiki markup, sandboxes, and copyright, (d) are "supervised" by people with equally little WP experience, and (e) are given deadlines to contribute. JohnCD (talk) 13:15, 6 November 2011 (UTC)
...and, another important difference: the 20,000 are volunteers, here because they want to edit. The 1,000 are pressed men, they have to edit or they fail their course. JohnCD (talk) 22:17, 6 November 2011 (UTC)
Plus, they have been assigned (well, have to choose one, that's not much better) a random topic from a field they're still trying to pass. They are not experts at all. Regular Wikipedia contributors usually write because they know a lot about a subject, not because they randomly picked this article as a "least effort" topic to write about. If you know a lot it's usually much easier to write yourself; if they don't really know enough, people tend to copy&paste. --Chire (talk) 08:50, 7 November 2011 (UTC)

Common format

Ok, let's get this underway. We need to figure out a common format for all the tables at WP:IEPS so that we can keep the machine-readable lists updated.
I suggest these columns, in this order: Rollno, name, username, articles, sandboxes, mentor, approval/sign, instructor, OA, OA comments, other columns. Most of the tables just need to be rearranged, but some of the tables have double columns (see this for an example). I'm all for removing the extraneous columns like approval/sign and instructor (they all have the same content), but I don't want it to inconvenience the profs. Course pages should have their tables replaced with links to the sections of WP:IEPS.
We should also decide places to keep master machine-readable lists (For the moment, these are User:Manishearth/Ambassador/IEPstudents/rcl and User:Manishearth/Ambassador/IEParticles/rcl). Once this is done, I can easily keep them in sync, as well as run redirect checks, etc. ManishEarthTalkStalk 16:45, 6 November 2011 (UTC)

In regard to the table rows with multiple students, it would be possible to create separate rows for each student and still combine their shared data into multi-row table fields for stuff like articles they worked on, etc. This way, we could bring the tables into a uniform format without losing the info that some users formed groups to work on articles. The down-side is the more complicated table syntax. (If we want to avoid the more complicated syntax, we could, in a first pass, add separate rows for each student and just fill up the shared columns with something like "see above". This way, the table format could be kept simple and straightforward to edit (and parse on source code level).) Comments?
An alternative might be to create a special {{Template:User IEP|accountname=|realname=|rollno=}}. However, the real name and roll number are mostly a "don't care" for our cleanup purposes. Ideally, this kind of information should be part of the user account info or stored in a template on the corresponding user pages and extracted from there by another template, so that it can be centrally maintained as part of the data object "user", not "course". So, while it might be a good idea to have something like this in future programs, it would only increase our work now for no immediate benefit, it seems. Comments?
Should we add a special comment column for our cleanup efforts, or just use a generic comment column as we already do now?
Do we need to have special columns for cleanup status info, such as "user page has been tagged with IEP template", "all discussion pages of articles edited by user have been tagged with IEP assignment template", "all articles have been evaluated, reverted and/or cleaned", always with date stamp, or is this kind of info no longer necessary and will be maintained elsewhere as part of a formal CCI investigation process?
Additional notes:
In order to avoid ambiguity I would like to suggest that we all stick to the well-established ISO 8601 standard international date format in the comments column, that is "yyyy-mm-dd", example: 2011-11-06 for November 6th, 2011. No abbreviated 2-digit year forms, no national date orders, no non-standard separators, only hyphens (as per the standard). Using the ISO format, we no longer have to wonder if something like "11/10/11" now means November 10th, 2011 or October 11th, 2011 (and if we didn't already know we are talking about 2011, there would be even more ways to interpret this).
If we don't create a special user template (see above), I still think we should frame all accountnames with {{User-c|accountname}}. The added "(t c)" can be easily filtered during table export, but it makes it much easier to check for contributions, if we add it. Existing usage of the {{User|accountname}} template could be changed to {{User-c|accountname}} in a minute.
If we add multiple entries (account names, articles, sandboxes) to a table field, I suggest using a semicolon (;) to separate the entries, not a comma (which might be used as part of an article or user name as well) or no separator at all (no separator makes it difficult to export the data using the HTML rendering, and we would have to parse the data on source code level instead). I don't know if it is necessary, but if we find multiple multi-word entries where separating them by semicolon would prove to be difficult, we could put them in "quotes". Easy to parse and strip off in the resulting exported data.
"Course pages should have their tables replaced with links to the sections of WP:IEPS". Yes, but only after once more proofing and sync'ing the data into the master table.
--Matthiaspaul (talk) 00:27, 7 November 2011 (UTC)
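As an illustration of the two conventions suggested above (strict "yyyy-mm-dd" dates and semicolon-separated entries with optional quotes), here is a minimal Python sketch. The function names `parse_comment_date` and `split_entries` are hypothetical and are not part of any existing export script; they only show how such cells could be consumed, assuming the cell text has already been extracted from the table.

```python
import csv
from datetime import date


def parse_comment_date(text):
    """Parse a strict yyyy-mm-dd date; raise ValueError for anything else."""
    # Insist on the full 10-character hyphenated form, so that short or
    # national formats such as "11/10/11" are rejected instead of guessed at.
    if len(text) != 10 or text[4] != "-" or text[7] != "-":
        raise ValueError("not a yyyy-mm-dd date: %r" % text)
    return date.fromisoformat(text)


def split_entries(cell):
    """Split a table cell on semicolons, honouring "quoted" entries."""
    if not cell.strip():
        return []
    # The csv module already knows how to handle a delimiter inside quotes.
    reader = csv.reader([cell], delimiter=";", quotechar='"',
                        skipinitialspace=True)
    return [entry.strip() for entry in next(reader) if entry.strip()]
```

A comma inside an entry survives untouched, and a semicolon inside an entry only needs the quotes: `split_entries('Foo, Inc.; "Bar; Baz"; Qux')` yields three entries.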
Something like this:
{| class="wikitable sortable IEPtable"
! ID
! Roll number
! Real name
! Account(s)
! Article(s)
! Sandbox(es)
! Mentor
! Approval/Sign
! Instructor
! Last change link
! Wikiproject review
! Online ambassador
! OA comments
! Cleanup comments
! Cleanup status
|-
| zzz <!-- Group ID, where applicable -->
| zzz <!-- Student roll number -->
| zzz <!-- Student real name -->
| {{User-c|zzz}} <!-- Student account name. No real name. Repeat with ; for more than one account name. -->
| [[zzz]] <!-- Article name without pipes. Repeat with ; for more than one article. -->
| [[User:zzz/sandbox]] <!-- Student sandbox. Repeat with ; for more than one sandbox. -->
| {{User-c|zzz}} <!-- Mentor account name or real name -->
| zzz <!-- Approval/sign -->
| {{User-c|zzz}} <!-- Instructor account name or real name -->
| [[zzz]] <!-- Link of last change made to article, where applicable -->
| zzz <!-- WikiProject Computing/Computer science review, where applicable -->
| {{User-c|zzz}} <!-- Online ambassador account name or real name -->
|
*yyyy-mm-dd: zzz <!-- OA comment. Repeat in new line for more than one comment. -->
|
*yyyy-mm-dd: zzz <!-- Cleanup comment. Repeat in new line for more than one comment. (add new comments on top) -->
|
*yyyy-mm-dd: zzz <!-- Reserved for cleanup status. Repeat in new line for more than one comment (add new statuses on top). -->
|}


{| class="wikitable sortable IEPtable"
! ID !! Roll number !! Real name !! Account(s) !! Article(s) !! Sandbox(es) !! Mentor !! Approval/Sign !! Instructor !! Last change link !! Wikiproject review !! Online ambassador !! OA comments !! Cleanup comments !! Cleanup status
|-
| zzz || zzz || zzz || zzz (t c) || zzz || User:zzz/sandbox || zzz (t c) || zzz || zzz (t c) || zzz || zzz || zzz (t c)
|
* yyyy-mm-dd: zzz
|
* yyyy-mm-dd: zzz
|
* yyyy-mm-dd: zzz
|}
--Matthiaspaul (talk) 01:53, 7 November 2011 (UTC)
I like it!!! No, we should not use the multi-row fields etc, that makes it confusing for scripts. For extracting data, the usernames/whatever need to be in a specific column number. Separation with semicolons (for multiple usernames/rollnos/articles/etc) is the best ("See above" is OK, too, except that then the comments and all become confusing). Let's not use the Template:User IEP thing. You're absolutely right, it will be useful next time, but right now, it will just be an unnecessary headache.
I think we should have the separate comments column, so that the original OA comments don't get overwritten. The status column is an excellent idea. We can fill it with {{yes}}/{{no}}/{{partial}} templates, with the content being somewhat like "Checked:Copyvio/Blanked","Checked:Copyvio/Not Blanked","Checked:OK","Checking","Not sure", and "Unchecked" (or empty), along with the username and datestamp. Making new comments on another line is perfect for our needs.
There's no need to do the "User page has been tagged with an IEP template", as it will be done via a bot sooner or later (I'm waiting for the BRFA to get approved). The "All articles have been reverted/cleaned" etc will come into the "cleanup status" column.
Yep, we should use ISO, though a timestamp might also be necessary. Why not just use the ~~~~~ timestamper? There's no ambiguity in that (Example: 13:03, 7 November 2011 (UTC))... We probably won't need to machine read it, and even if we do, JS/Java/etc have libraries that can interpret all types of dates.
User-c is the way to go. It's pretty easy to filter out talk/contrib links, though it's not so easy to do so for sandbox links (because some OAs encourage "playground" pages instead of sandboxes, etc.). Which is why they'll be fine sitting in a separate column.
By the way, there's this tool that makes editing wikitables easy. It's here. Unfortunately, it doesn't work with the new toolbar. I'll try to modify it so that it does. ManishEarthTalkStalk 13:03, 7 November 2011 (UTC)
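To make the "specific column number" point concrete, here is a rough Python sketch of reading one table row by position. It assumes the one-cell-per-line layout of the proposal above; the `COLUMNS` list, the regular expressions and the function names are illustrative assumptions, not the script actually used for the master lists.

```python
import re

# Leading columns of the proposed layout (assumed order, for illustration).
COLUMNS = ["ID", "Roll number", "Real name",
           "Account(s)", "Article(s)", "Sandbox(es)"]

COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)        # HTML comments
USER_C_RE = re.compile(r"\{\{User-c\|([^}|]+)\}\}")      # {{User-c|name}}
LINK_RE = re.compile(r"\[\[([^\]|]+)\]\]")               # [[Page]] without pipe


def clean_cell(text):
    """Strip comments and unwrap User-c templates and plain wikilinks."""
    text = COMMENT_RE.sub("", text)
    text = USER_C_RE.sub(r"\1", text)
    text = LINK_RE.sub(r"\1", text)
    return text.strip()


def parse_row(lines):
    """Map the cells of one row (one '| cell' per line) to column names."""
    cells = [clean_cell(line[1:]) for line in lines
             if line.startswith("|")
             and not line.startswith(("|-", "|}", "|+"))]
    # zip() pairs cells with names purely by position, which is why every
    # table has to use the same column order.
    return dict(zip(COLUMNS, cells))
```

Because the mapping is purely positional, a table with a swapped or missing column would silently put data under the wrong name, which is the reason for agreeing on one layout first.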
We could use {{subst:ISO8601}} to implant the current date and time in ISO format, example: "2011-11-07T14:25Z ". (EDIT: Removed distracting extra linefeed and UTC link from template output. Hope nobody else needs it...) --Matthiaspaul (talk) 14:25, 7 November 2011 (UTC)
Well, it makes it less human-readable than the ~~~~~, though it adds a bit of machine readability. I'd say we keep the ~~~~~, as the comments aren't going to be required in the machine-parsing. Actually, I don't see the point of timestamping the comments when the commenter's going to sign them anyways...
On a related note, I've started writing an editnotice to put on the page after we do the reformat. The working copy is here. Feel free to edit it. ManishEarthTalkStalk 15:39, 7 November 2011 (UTC)
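For comparison, both timestamp styles discussed here can be read by a script. The following Python sketch parses the ~~~~~ style ("13:03, 7 November 2011 (UTC)") and the ISO style ("2011-11-07T14:25Z") using the standard `datetime.strptime`. The function names are hypothetical, and the `%B` directive assumes English month names (the default C locale).

```python
from datetime import datetime, timezone


def parse_signature_timestamp(text):
    """Parse the five-tilde style, e.g. '13:03, 7 November 2011 (UTC)'."""
    parsed = datetime.strptime(text, "%H:%M, %d %B %Y (UTC)")
    return parsed.replace(tzinfo=timezone.utc)


def parse_iso_timestamp(text):
    """Parse the ISO style with minute precision, e.g. '2011-11-07T14:25Z'."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%MZ")
    return parsed.replace(tzinfo=timezone.utc)
```

Either form is machine-readable once the format is known; the practical difference is that the ISO form needs no knowledge of month names and sorts correctly as plain text.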
The table proposed above has too many columns - it's much wider than the screen of my small laptop, and it's going to be much wider when populated. I'd suggest the following changes:
  • ID - remove
  • Roll number / Real name - remove one if not both
  • Mentor - is it necessary ? abbreviate to initials (with a key below table)
  • Approval/sign - remove
  • Instructor - is it necessary ? abbreviate to initials (with a key below table)
  • Last change link - needs clarification, what's the purpose of this ?
  • Wikiproject review - need to clarify which wikiproject referring to
  • Online ambassador - possibly abbreviate to initials (with a key below table)
  • OA comments / Cleanup comments - combine ?
  • Cleanup status - OK
If student edits are continuing then the cleanup status column should be sortable by date. DexDor (talk) 07:53, 8 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I'm completely fine with it if we remove the extraneous columns, just that I didn't want to tamper with the info already there. It's of no use to us, but it is of use to the profs. Anyways, seeing that all edits are stopped till the cleanup is finished, I guess we can remove the columns (If anyone wants the data we can redirect them to an older revision). You make a good point about sorting the cleanup statuses. So we should use the ISO timestamp along with the signature in that cell (the same goes for the cleanup comments). The WikiProject review column will contain whatever is already there in the current table. We can combine the OA comments and cleanup comments, but it will become a headache to reformat the OA comments.
I have also added class="wikitable sortable IEPtable" to the table above (it won't interfere with anything, but it will ensure that other tables won't interfere when I extract data) ManishEarthTalkStalk 10:09, 8 November 2011 (UTC)
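The reason an ISO timestamp helps with sorting is that such strings order chronologically even when compared as plain text. A small Python sketch, assuming status entries of the form "timestamp: text" as in the proposed cells (the helper names `stamp_of` and `newest_first` are made up for this example):

```python
def stamp_of(entry):
    """Return the leading timestamp of an entry like '2011-11-08T16:55Z: text'."""
    # Split on the first colon-plus-space; the colon inside "16:55" is not
    # followed by a space, so the timestamp stays in one piece.
    return entry.split(": ", 1)[0]


def newest_first(entries):
    """Sort status entries so that the most recent one comes first."""
    # Plain string comparison is enough because the ISO fields run from the
    # most significant (year) to the least significant (minute).
    return sorted(entries, key=stamp_of, reverse=True)
```

The same trick does not work for the signature style: as text, "09:00, 8 November 2011" sorts before "13:03, 7 November 2011" although it is the later moment.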

IMHO, we cannot simply delete existing columns, if, at the same time, we remove the lists on the course sub pages (as I have already started in some cases after once more checking bit by bit, that this information is reflected in the master list as well). We do this to concentrate the data into one place so that we no longer need to sync with other places. Pointing the locals to an older version of the data in the page history is just the same as forking the data again.
"ID", "Roll number": Some of the courses actively use these group IDs. Removing this information may interfere with their work. "Real name": In some cases, this may help us to identify multiple accounts and it may also be used by the local people in Pune. "Mentor - is it necessary? abbreviate to initials (with a key below table)", "Approval/sign - remove", "Instructor - is it necessary? abbreviate to initials (with a key below table)", "Last change link - needs clarification, what's the purpose of this?", "Wikiproject review - need to clarify which wikiproject referring", "Online ambassador - possibly abbreviate to initials (with a key below table)": Surely, we don't need any of this, but this is information present in some of the course lists, and since we want them to actively use the online lists as their one-and-only data base (instead of using shadow lists in paper form or whatever, which would once again cause synchronisation problems), we cannot just delete their info, because we don't need it. I'm personally not a fan of footnotes at all, however, if this way we can bring the table width down enough, this would be a solution. "OA comments / Cleanup comments - combine?" Possible, I thought about this as well, but decided against it in my proposal because the OA comments sometimes have nothing to do with any cleanup efforts and even if they do, I thought it would be better to have an extra column for the more systematic and organized cleanup efforts which still have to take place. However, I'm not against it, if it really helps.
I'm open to removing columns we don't need, of course, but I think we would need approval from the course instructors and corresponding CAs for this first.
I don't use a wide screen myself and therefore cannot see the whole table either, but I don't see much problem in scrolling or temporarily reducing the browser font size ([CTRL]+[-]) if I need to have a broad view... --Matthiaspaul (talk) 11:27, 8 November 2011 (UTC)
Are there horizontally collapsible tables, so that those with narrow screens could untick the columns they don't want to see? --Matthiaspaul (talk) 11:35, 8 November 2011 (UTC)
No. Reducing the font size to 85% could help, though. —Ruud 11:47, 8 November 2011 (UTC)
I'm not saying we delete their info. Remember, it will all be in the history.
How 'bout this? We put all the columns which are useless for our current purposes at the end. That way, whatever we need is up front, and the rest is off to the side. There isn't any way to make columns collapsible, unfortunately. (Though I can write a script that hides the last few columns if anyone's interested). It won't look too logical, but it serves our purposes. ManishEarthTalkStalk 13:02, 8 November 2011 (UTC)
Rearranging the order of columns and reducing the width of the table would give something like this:
{| class="wikitable"
|+ Table. Test
! ID !! Roll number !! Real name !! Account(s) !! Article(s) !! Sandbox(es) !! Cleanup status !! Cleanup comments !! OA comments !! Online ambassador !! Mentor !! Instructor !! Approval/ Sign !! Last change link !! Wikiproject review
|-
| zzz || zzz || zzz || zzz (t c) || zzz || User:zzz/sandbox
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
| zzz (t c) || zzz (t c) || zzz (t c) || zzz || zzz || zzz
|}
The downside would be that we'd have to swap the order of entries for many tables (instead of just inserting stuff in between). This will make it more complicated (at least for me, as I don't have a good editor at hand right now), but if it helps the others, I would be okay with it.
Question: Adding "sortable" to the wikitable class, the table will again be blown up to full width. Any workaround? --Matthiaspaul (talk) 16:55, 8 November 2011 (UTC)
I think we can remove the Instructor column altogether as it never changes within a course and the instructor is defined in the course tables as well, so no information gets lost here.
I'm not sure about the role of the Mentors, but perhaps we can combine this column with the OAs. If this would be important, we could add prefixes such as MT:{{User-c|accountname}} or OA:{{User-c|accountname}} and still save a column. Since the "link of last change" and "Approval/Sign" columns are never used at the same time, perhaps we can combine them as well. And finally, if we combine the "Wikiproject" and the "OA comments" columns (which have been mixed up anyway in many tables) we get another column. Gives:
{| class="wikitable"
|+ Table. Test
! ID !! Roll number !! Real name !! Account(s) !! Article(s) !! Sandbox(es) !! Cleanup status !! Cleanup comments !! OA comments / Wikiproject review !! Online ambassador / Mentor !! Approval / Sign / Last change link
|-
| zzz || zzz || zzz || zzz (t c) || zzz || User:zzz/sandbox
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
| OA:zzz (t c); MT:zzz (t c) || zzz; zzz
|}
Alternative proposal with a column order more in line with the existing table layout (and therefore easier to convert semi-manually):
{| class="wikitable"
|+ Table. Test
! ID !! Roll number !! Real name !! Account(s) !! Article(s) !! Sandbox(es) !! Approval / Sign / Last change link !! Online ambassador / Mentor !! OA comments / Wikiproject review !! Cleanup comments !! Cleanup status
|-
| zzz || zzz || zzz || zzz (t c) || zzz || User:zzz/sandbox || zzz; zzz || OA:zzz (t c); MT:zzz (t c)
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
|
* 2011-11-08T16:55Z: zzz
|}
Would the second proposal be narrow enough to fit on the screens? It doesn't for me, but neither do the existing course tables, but I don't care. If we could get width:50% to work with sortable tables, we won't have any problems at all... --Matthiaspaul (talk) 20:28, 8 November 2011 (UTC)
For test purposes, I have changed the first of the course tables in the master list to this proposed format. It's still wide, but at least not wider than some of the existing tables. (I found it very time-consuming to change the order of columns, that's why I used the second of the proposals. Swapping the column order is an extra step anyway, so we don't waste time if we do it in two passes, should it still be necessary to change the order at a later stage.) Regarding width:70% not working with sortable tables, is this a bug in the implementation or is this "by design"? Should we remove the "sortable" to make width work so that all can see the whole table without scrolling, and re-add it at a later stage, when it becomes more useful? --Matthiaspaul (talk) 04:21, 9 November 2011 (UTC)
In regard to those changes - I've been trying to finish off that course, and the table format has been changing, so I've moved to another course instead. However, now I'm lost as to what I'm supposed to do with the new format. I was never an OA, as far as I am aware - just someone assigned to do CCI. It may be that I am an OA, but now my comments are listed as OA comments, and there are two new fields regarding data I've already added, but which are blank. Do I fill in those fields with the same data I was adding to the OA field? Or is a third person, other than me, to check those cleanup fields? And does this mean that they will be repeating work I've completed? Or perhaps I shouldn't be adding data to the comments field at all, only to the cleanup fields? And, of course, if that's the case, what data is to go in the "cleanup status" column? At this stage I'm going on the assumption that I can't work with that course, as I might be compounding problems with the new formats. - Bilby (talk) 05:04, 9 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Matthiaspaul: There's no need to make the table fit in the window (So keep it sortable and remove the 70% width). Only the important stuff should fit right now, and the remaining columns (name/rollno/group no) should be shunted to the end. There's no need to test out the format on WP:IEPS right now (copy the first table into your sandbox and try it out if you want). I'm working out the bugs from the wiki table editor, and this will let you shift columns in a jiffy.
@Bilby: Continue your cleanup efforts on that page. You may use the OA comments section for now, right now we are only testing formats... ManishEarthTalkStalk 05:27, 9 November 2011 (UTC)

Okay. I was not so much "testing", but more trying to seek consensus while moving forward at the same time. ;-) If WTE allows swapping columns at a later time, I think I will continue in the current order and you swap the columns afterwards. I am temporarily without my favourite editors (TSE Pro, Ultraedit), that's why I am a bit "handicapped" when it comes to REGEX and scriptable tasks such as table reordering, and doing this manually is a time-sink. That's why I prefer to keep the columns in basically the same order as they are already, and someone else can change it with proper tools at hand. BTW. I think it would be a good idea to leave at least the ID column as the first one (using the existing group IDs where present, and just counting up elsewhere). It doesn't take much room, but serves as a memorizable index into a row and makes it easy to restore the original list order.
Putting the column order aside for a moment, do we have consensus on the type and combination of columns in general? Should we combine some more? (I like your idea with the yes/no/partial template for the cleanup status, but this is something that can be added later, or should we set it to No now?) --Matthiaspaul (talk) 11:36, 9 November 2011 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────(edit conflict)I finally got the table editor to work. It's still quite buggy, but it does the trick.

To install, type this code in your skin js page.


Once you install, two new table icons should appear in the editing toolbar. Select part of a table, and click the first icon. It should let you do all sorts of stuff with tables, most importantly adding and moving columns. The save button updates the table in the editbox (it does NOT save the page) Note that the script is still quite buggy (Because it was probably meant for simple wikitables, without templates etc.). It has the following bugs:

  • Sometimes the full wikitext of a row gets crammed into a single cell (Just scroll down the table while the Table Editor is open and check for anomalies) These anomalies have to be fixed by hand, unfortunately.
  • Templates with a pipe character mess up the editor. Basically, "OA:{{User-c|Manishearth}}" will show up as "Manishearth}}". It does not affect the actual wikitext of the cell, it only looks like it does. You may move around such half-eaten columns normally, the eaten text will move around with them (Even though it won't appear to do so).
  • Also, the editor can't work with the header row (actually it does, but this setting breaks the rest of the script). Header rows can't be seen in the editor (They aren't removed, though, they just don't show up). Moving columns around does not affect the header row. For this reason, I suggest you convert the header row to a normal row (convert the !s to |s, etc), use the table editor to shift things around, press save, and then convert the top row back to a header row. I'm going to try and fix this bug.

ManishEarthTalkStalk 11:52, 9 November 2011 (UTC)
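The pipe bug in the list above is what one would expect from splitting cell text on every "|" without looking at the surrounding braces. The Python sketch below contrasts that naive split with a brace-aware one. It is only an illustration of the failure mode; it is not the table editor's code (which is JavaScript), and `split_outside_templates` is a made-up name.

```python
def split_outside_templates(text, sep="|"):
    """Split on sep, ignoring separators inside {{...}} or [[...]]."""
    parts, current, depth, i = [], [], 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1                      # entering a template or link
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]") and depth > 0:
            depth -= 1                      # leaving a template or link
            current.append(pair)
            i += 2
        elif text[i] == sep and depth == 0:
            parts.append("".join(current))  # a real cell boundary
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts
```

With the naive approach, `"OA:{{User-c|Manishearth}}".split("|")` ends in `"Manishearth}}"`, which matches the half-eaten cell described above, while the brace-aware version keeps the template in one piece.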

In reply to the stuff you wrote above, I'd say that the yes/no/partial templates should only be used once an article has been/is being checked. If an article hasn't been glanced at, then we just leave the cell blank. Instructions on formatting will be in the editnotice (or in comments in the cells), which will let them know that they have to use the yesno template. As for agreement on the columns, I'm fine with them, but of course you need input from others. ManishEarthTalkStalk 11:57, 9 November 2011 (UTC)
Alrighty. I've not fixed the header row bug per se, but I found a good enough workaround. All you have to do is make sure that the header row has a "|-" before it. Eg:
Instead of:
{|style=blah etc etc
|+ Caption
(data goes here)
use:
{|style=blah etc etc
|+ Caption
|-
(data goes here)

Hope that helps! ManishEarthTalkStalk 12:09, 9 November 2011 (UTC)
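If many tables need this workaround, the "|-" line could be added by a script rather than by hand. A minimal Python sketch, assuming the table source is available as a list of lines and that header cells start with "!" (the function name `ensure_header_separator` is hypothetical):

```python
def ensure_header_separator(lines):
    """Insert a '|-' line before the first header row if it is missing."""
    out = []
    seen_header = False
    for line in lines:
        if line.startswith("!") and not seen_header:
            seen_header = True
            # Only add the separator when the previous line is not one already,
            # so running the function twice changes nothing.
            if not out or out[-1].strip() != "|-":
                out.append("|-")
        out.append(line)
    return out
```

Only the first header row is touched, so the data rows and the caption line stay exactly as they were.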

By the way, the merging of columns required by your later proposals won't be possible via the editor. Let's just go with the original format and shunt all columns irrelevant to cleanup off to the side. ManishEarthTalkStalk 12:12, 9 November 2011 (UTC)

Multiple inheritance

Since I added a {{copyvio}} to Multiple inheritance some days ago, an editor has very politely asked me to "unblock" the article because they need it to complete an assignment. The problems with that article are possibly not only due to the IEP - I suspect it was in bad shape beforehand - but I am quite reluctant to remove the copyvio tag. I still have serious doubts about it. I have no objections if someone else would do so, as long as the sourcing is improved. -84user (talk) 12:05, 7 November 2011 (UTC)

Currently, all IEP student edits are supposed to be stopped. See WT:IEP#Relief. ManishEarthTalkStalk 12:44, 7 November 2011 (UTC)
Apparently this student's deadline is the 10th. There's a temporary workspace available at Talk:Multiple inheritance/Temp which can be used as the foundation of a copyvio free article if the student competently rewrites the content in his own words. This is mentioned in the copyvio template. MER-C 13:00, 7 November 2011 (UTC)
I think that they aren't allowed to even edit sandboxes (and the temporary workspace should fall underneath that category). See the Relief section above. ManishEarthTalkStalk 15:15, 7 November 2011 (UTC)
Yes, IEP students from CoEP have been asked not to edit on Wikipedia (main space or sandbox) anymore. If any student from CoEP edits, he/she will show up on this list: and the Campus Ambassadors will get in touch with them and ask them personally not to edit for the time being. I'm keeping a close watch on this list and as you can see no student from CoEP has edited in the past 24 hours. Nitika.t (talk) 09:10, 8 November 2011 (UTC)
Nitika, you seem to trust the database way more than we do... I mean, as discussed here on this page, we are still in the process of putting together a complete, accurate and up-to-date master database (Wikipedia:India_Education_Program/Students) from the snippets of incomplete, outdated and faulty info distributed all over the place. In the past days, several missing students and articles have been added to this list, and if you check the students' list of contributions, the list of articles actually edited is still much larger than currently reflected in the master list. We are doing this because we cannot start cleaning up systematically without the data. If you already have more complete data then please, by all means, provide it (merge it into the master table, not any other place), because any serious investigation and cleanup effort must fail if the underlying data is incomplete.
I just had a quick look at the student-o-meter (displaying events back to 1:40, 7 November 2011 right now). It missed for example RAJATPASARI (t c), one of the editors of "Multiple inheritance", who has edited WP on 2011-11-07T08:11:42 (although harmless, please don't penalize him for that).
Also, there is still significant IP activity (which can be traced back to Pune) in various IEP articles. Obviously some students continue to edit articles as IPs now that they have been disallowed to edit under their normal accounts. Does the student-o-meter also trigger on IP addresses?
(On a different note, I'm not sure it was really a good idea to disallow any edits under their student accounts. They should at least have been allowed to continue to edit on talk pages without being penalized, because now they cannot even answer questions raised by the community without risking being penalized. One more thing that IMHO should better have been discussed with the community first.) --Matthiaspaul (talk) 10:15, 8 November 2011 (UTC)
Curiously some of those copyvios date back to 2007. I've reverted back to pre-IEP material and replaced the 2007 copyvio material with an earlier description. See the article talk page for details.--Salix (talk): 09:29, 8 November 2011 (UTC)

Signpost report

Hello all. I am currently writing a report on the Signpost for publication in eight hours' time, and would welcome any comments and statements from interested parties. The working draft of the story is here; please email if you would like to contribute. Thanks in advance, Skomorokh 14:15, 7 November 2011 (UTC)

I'd like to point out that though Hisham was right about the cultural differences having no impact, (in my opinion) he was wrong about the attitudes not having any bearing on this. I have studied in both countries, so I have observed firsthand the stark differences in the attitudes towards plagiarism. In the US, plagiarism is denounced at an early stage (3rd grade or so), and teachers do penalize offending students. I've even seen teachers paste random chunks of a report in Google to check for plagiarism. On the other hand, in India, I haven't heard a single word about plagiarism yet. Teachers don't bat an eyelid even when the majority of the class submits almost the same content (copied from enwp or the first Google result). Bibliographies usually have "" in them (Which we all know is meaningless), and usually no other entries (Sometimes you see also). When I first came here and saw this happening, I asked a few students why they plagiarised, and didn't they know that it was illegal? They were genuinely mystified by this.
So I feel that the difference of attitudes has a very big part to play in this whole mess, though we can't blame anyone as such differences aren't obvious unless you have experienced both sides of the coin.
Note: All of the above are my observations and in no way do I mean to generalize for the country as a whole. ManishEarthTalkStalk 15:03, 7 November 2011 (UTC)
I've observed the same thing here in the US; in my area, we have a fairly large Indian, Thai, and Chinese population. None of the aforementioned groups, when they first come over here, have any concept of copyright (although in the case of the Chinese, it's more of a willful, malicious disregard for it; my uncle married a Chinese woman who taught English in China, and she can attest to the problems there). Again, it's anecdotal, but my experience has more or less been the same. Not that it necessarily stops plagiarism from American kids (especially at community colleges, where I tutor now), but it's definitely not so endemic. The Blade of the Northern Lights (話して下さい) 01:37, 8 November 2011 (UTC)
As a teacher and teacher trainer, I have been observing exactly the same phenomena here in Asia for the past 13 years. In universities, plagiarism is often tolerated from the dean down - graduate schools are no exception. Kudpung กุดผึ้ง (talk) 06:24, 8 November 2011 (UTC)

Common format: We need consensus[edit]

To be able to proceed in updating WP:IEPS to a common format (for machine parsing etc), we need consensus on the format. The current proposal is these columns (not in this order): ID, Roll number, Real name, Account(s), Article(s), Sandbox(es), Mentor, Approval/Sign, Last change link, Wikiproject review, Online ambassador, OA comments, Cleanup comments, Cleanup status.
The columns irrelevant to the cleanup should be shunted to the end of the table (It's better not to remove them as the columns have some significance to the profs/CAs/etc). For a discussion on what the columns do, please see WT:IEP#Common_format.
A proposed editnotice is under construction at User:Manishearth/Ambassador/IEPnotice. ManishEarthTalkStalk 12:28, 9 November 2011 (UTC)
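Since the stated goal of a common format is machine parsing, a header check along the following lines could flag tables that have not been converted yet. This is only a minimal sketch in Python: the column names are the ones listed in the proposal above, while the function names and the idea of comparing a table header against that list are assumptions for illustration, not part of any existing script.

```python
# Column names taken from the proposal above; their final order is undecided.
PROPOSED_COLUMNS = [
    "ID", "Roll number", "Real name", "Account(s)", "Article(s)",
    "Sandbox(es)", "Mentor", "Approval/Sign", "Last change link",
    "Wikiproject review", "Online ambassador", "OA comments",
    "Cleanup comments", "Cleanup status",
]


def normalize(name):
    """Lower-case a column name and collapse whitespace for comparison."""
    return " ".join(name.strip().lower().split())


def check_header(header):
    """Return (missing, unexpected) column names for one table header.

    `header` is a list of column titles as they appear in a table.
    Order is ignored, because the proposal does not fix an order yet.
    """
    expected = {normalize(c): c for c in PROPOSED_COLUMNS}
    seen = {normalize(c): c for c in header}
    missing = [expected[k] for k in expected if k not in seen]
    unexpected = [seen[k] for k in seen if k not in expected]
    return missing, unexpected
```

A table whose header yields two empty lists already matches the proposal. Anything reported as missing or unexpected would need a manual look before a parser could rely on that table.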

Hi everyone, I see that there has been a discussion about going through CCI procedures to clean up the articles from the Pune pilot. At the same time, as most of you are probably aware, staff members have already asked various Online Ambassadors - both some India-based ones and some U.S.-based ones - to take on cleaning up these articles (you can see which articles have been assigned to be cleaned by which Ambassadors on this list of students). While not all the Ambassadors have started cleaning up the articles they've been assigned to, many have, and I believe that the cleanup work they've already done has been very valuable in removing poor content from the articles. Yesterday I contacted almost all Ambassadors involved in the cleanup effort to inquire whether they're still available to help if they haven't already started doing cleanup, and whether some of them can take on cleaning up more articles if they've already done some cleanup. For the purposes of cleanup, I would also like to replace the Ambassadors who have little Wikipedia-editing experience with Ambassadors with more editing experience.

Seeing that both a CCI discussion and an Ambassador deployment effort are going on at the same time for the purposes of cleaning up articles, I want to work with you all on coordination to make sure no one is doing repeat work. As I mentioned, some Ambassadors have already put in a lot of time and effort in removing poor content from articles, and I wouldn't want any other community member to have to duplicate this work. The whole reason we assigned Ambassadors to the cleanup effort is to take workload off the larger community's shoulders - the community has had to shoulder too much workload related to the Pune pilot already (I fully acknowledge that and thank you all sincerely for the work you've done), so I would like the cleanup work to be largely shouldered by our Ambassadors rather than by the larger Wikipedia community, to save the community at large this burden. Of course, I think that a cleanup force made up of both Ambassadors and the community at large might be best, since a lot of cleanup work still remains to be done and our Ambassador resources are limited. What I really want to avoid, though, is getting any of us into a situation where we're just duplicating the work that someone else has already done.

So here's my proposal, and I'd like to get everyone's feedback so that we can move forward quickly depending on what we decide together. I say that we use this list of students as the master list that we'll all work off of, and we'll assign all the student articles on that page either to Ambassadors or to community members at large. Whoever is assigned to a student article would be listed in a column next to the article. Many of the student articles on that page have already been assigned to people (in fact many have already been cleaned up by Ambassadors), so we don't have to worry about those. But some articles haven't yet been assigned, and some are currently assigned to people with little Wikipedia-editing experience, so I'd recommend that we divide up these articles among the (experienced) Ambassadors and at-large community members willing to be part of this cleanup effort. This way, someone is "responsible" for cleaning up each article, and it'll be clear who is responsible for which article. Of course, I realize that the list may not be totally complete yet, and may lack some information that is important to the cleanup effort. So what I would recommend there is for us all to work together to update the list to make it more complete, and also change the format of the list (in alignment with some of the earlier discussions on this talk page) to make it more usable. I'd advise against creating completely separate lists/pages because that could lead to confusion and duplicate work.

What does everyone think about this? My basic point here is that I think everyone interested in the cleanup effort - staff, Ambassadors, super nice Wikipedia community members at large - needs to work together and stay in constant communication with each other during the cleanup process; to not do so would be to risk wasting people's time doing repeat work (staff is certainly guilty of this as well, but I want to change that and make the Pune-pilot cleanup effort going forward truly collaborative). Thanks all. Annie Lin (Wikimedia Foundation) (talk) 18:31, 9 November 2011 (UTC)
My experience is that most Ambassadors - even if experienced Wikipedians - are not always knowledgeable about the subject matter of the article they are supposed to review. So I'd really like to see this work being done in duplo: once by an Ambassador, who can take care of trivial matters like copyright violations, and once by an editor who is knowledgeable about the subject and can also take care of any quality issues (i.e. assessing whether the article only needs a small amount of copy editing, or is better off being reverted to its pre-IEP state). Clearly the latter class of editors is in much shorter supply and less willing to be named as "responsible" for a particular set of articles. —Ruud 19:08, 9 November 2011 (UTC)
  • The IEP CCI hasn't gone live yet. The processes, as far as I can see, are complementary to a certain extent. The full CCI will generate very easy links to all contributions, which will have to be checked. They'll look like this for each of the 800 IEP students. In my view, a final check will still have to be done via the CCI after the dust completely settles regardless of what the Online Ambassadors do. There are several reasons for this:
  1. Students are still editing, even today (see below), even in courses where they have been explicitly told not to.
  2. Students are still adding their user names
  3. Students have gone back and edited after the original copyvio was cleared out and comments made in the status columns. Some of those comments are quite old and/or undated.
  4. Some of the OA/volunteer editor checks are only being made to the listed articles in the tables, which in many cases are inaccurately listed and don't represent all the places the students have actually edited.
When the CCI finally gets underway (and no one has started there to my knowledge), the work the OAs have done and are doing will still have been a huge help. The CCI cleaner-uppers will be able to see from the article's history and cross checking with Wikipedia:India Education Program/Students whether it has already been checked, by whom, and when. They'll see if any copyvio has been removed, whether or not the students have edited it since then, and if the original checker was sufficiently experienced to trust their check. If all the conditions are met, the article can be quickly signed off. If not, then it will need a re-check. It's the only way to do it systematically and thoroughly. At this point there's no use in rushing. It took 3 months to add the copyvio and it may well take another 3 months to get rid of it. But that's OK. Better to do it properly. Those are my thoughts, anyhow. Hopefully MER-C and Moonriddengirl can give us some expert input on this. Voceditenore (talk) 19:56, 9 November 2011 (UTC)
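The sign-off conditions described in the comment above can be written down as a small decision rule. What follows is only a sketch in Python under stated assumptions: the record type, its field names, and the function name are hypothetical, and whether a checker counts as "sufficiently experienced" is a human judgement that is simply passed in as a flag.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CheckRecord:
    """One article's cleanup status (hypothetical fields, for illustration)."""
    checked_at: Optional[datetime]         # None if unchecked or the comment is undated
    checker_experienced: bool              # judged by a human, not computed here
    last_student_edit: Optional[datetime]  # None if no student edit is known


def can_sign_off(record):
    """True if the article can be signed off without a re-check."""
    if record.checked_at is None:
        return False  # never checked, or we cannot tell when
    if not record.checker_experienced:
        return False  # the earlier check is not trusted on its own
    if (record.last_student_edit is not None
            and record.last_student_edit > record.checked_at):
        return False  # a student edited after the check
    return True
```

An undated status comment is treated the same as no check at all, since without a date it cannot be compared against later student edits.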
I fully agree with what Ruud and Voceditenore have said already. Any copyvio the OAs and the community can remove now will not be wasted time, but will make the more formal and fully recursive CCI approach much easier. And the less time copyvios remain in the articles the better, not only for legal reasons, but also because any other edit on top of copyvio'ed content is at risk of having to be deleted as well - this would be a waste of the precious contributors' energy. Still, I see that a systematic CCI is necessary later on to ensure that we really catch all edits. The master list is still in flux, as are the edits, and even with the help of the student-o-meter and huge watchlists (based on days-old data, though) we will miss at least some activity, and we haven't even started to add all the other articles/sandboxes, which have been edited by the students (and I guess we will see even more once we come around to add IP addresses to the lists as well). --Matthiaspaul (talk) 20:32, 9 November 2011 (UTC)
I may write more later, but just quickly - as far as I am aware, a CCI has been ongoing for close to two weeks. The US Online Ambassadors who were willing to help were asked to conduct a CCI, and speaking for myself that's the way I've approached this. Given that there are a large number of students, but few contributions for each student, checking all of their edits where I have been assigned a student has not been difficult, and I assume that this is the same process that the other "EOAs" have been doing. I'm a bit concerned that there is a lot of duplication of work planned, which is less than ideal. My impression has been that there was a process in place to handle the copyright concerns, but there has been a parallel process being developed here as well, when I'm not yet sure that the parallel process was necessary. - Bilby (talk) 21:07, 9 November 2011 (UTC)
At the risk of reiterating myself - I'm not sure if I was entirely clear in my first reply - we seem to have at least two problems with the IEP: a copyvio problem and a quality problem. Anything the Ambassadors can do to remedy those quickly is fine, but I think both require the articles to be reviewed more carefully than the Ambassadors can do on their own. —Ruud 21:13, 9 November 2011 (UTC)
@Bilby. This is the CCI. It's an official process. Is that what you've been asked to participate in? It appears that what you've been asked to do is centered on the Student Lists. I'm not sure how clear the WMF is about what an official CCI actually is and may have confused it in their emails with simply "cleaning up copyvio". The CCI isn't due to go live for a while. So any help you give on the Student List page won't be duplicated (per my comments above), unless the students edit their articles after you've checked them. Voceditenore (talk) 22:01, 9 November 2011 (UTC)
I've been involved in quite a few CCIs in the past, and the request from Annie Lin was specifically "look through those students' edits and remove all copyright violations or bad edits in general". There was a lot more detail, but we were being asked to remove copyvio, copyedit if possible, and remove badly written material to the talk page in regard to all of the student's edits on any articles - pretty much the same expectations as a normal CCI, with the addition of the quality requirements. My concern is simply duplication: I get the impression that there is currently a process in place which is doing the job that the CCI is intended to do. But while the people asked over from the OA program have been following that process, a parallel and effectively identical process has also been built. The result is going to be a lot of people doing the same jobs: where copyvio or problems were detected by the OA, it won't be a major issue for someone else to look, see it is fixed, and move on; but where none was detected the second person will have to go through the same process of confirming that it isn't a problem. I suspect, if nothing else, some of that duplication can be reduced if the OA people are asked to replicate what they've done on the CCI process, but doing so will slow things down.
The bit which confuses me is that, in reading the above, it seems that it has been stated that there is a formal process that seems to be being used to fix the problem, but that nevertheless a second formal process has also been developed. - Bilby (talk) 22:56, 9 November 2011 (UTC)
I didn't set up the CCI, but my impression is that it will serve as a final mopping up and check, after the whole thing has well and truly finished, not a duplication of what the OAs are currently doing. I would have thought that CCI investigators will cross check with the comments on the student lists. If the OA says they've checked for copyvio and haven't found any and the student hasn't edited the article since their last check, then the article(s) would be signed off without further checks. Voceditenore (talk) 23:06, 9 November 2011 (UTC)
That sounds like a good idea to me. I have absolute confidence, for instance, in Bilby's work in this area. :) Many of the OAs working on this are familiar to me. I don't know that much about bots and what they can do, but I know for instance that MER-C does. :D Would it be possible for us to cross-check the CCI against the OA cleanup chart? If not, I can do that by hand, since there's not a whiff of content management there. It's clerical stuff, and I stand ready to serve. --Moonriddengirl (talk) 11:22, 10 November 2011 (UTC)
I echo this comment made by Voceditenore. The CCI is absolutely necessary because the OA cleanup shares the same gross organizational incompetence (not the OAs' fault) as the whole project. The WMF did not even attempt to coordinate before throwing the OAs at the student list despite the CCI being brought up in the office hours. The CCI may also serve other purposes when the need arises. MER-C 09:01, 10 November 2011 (UTC)
Bilby, what we are preparing here right now is a solid database on which cleanup work of any kind can be based. Unfortunately, little thought has been put into the data aggregation and representation by those who ran the programme. Perhaps nobody could predict the lack of discipline of the students even to add their valid account and other data to the online course lists, or the lack of interest of the local people to verify, enforce and correct the online data earlier on in the timeline, or it simply was not seen that an accurate database would become important for any maintenance or controlling tasks alongside the IEP. I don't know, and will leave that to a post-mortem analysis. However, the organization of the data was (and still is) fundamentally flawed and that's how we ended up with dozens of half-maintained lists with similar but not identical contents, which need to be carefully merged and synchronized again. Our current efforts don't focus on correcting the structural mistakes already made, because it would require more work. Our focus now is on the cleanup, not the development of a proper technical infrastructure to base future programmes on - nevertheless, some of the stuff reshaped now may also be a good starting point in the future.
Anyway, even in the past few days we still stumbled upon several participating students and lots of articles not listed previously. We found them by reverse lookup from article space to user space and recursive backtracing of "suspicious" edits in the page history of articles, talk, and user pages. So, any cleanup efforts so far would have missed them. That's why we'll need the CCI, which will systematically and fully recursively scan and (re-)check any of the articles touched by any of the students. If they are found to be carefully cleaned up already, no further work is required. If not, the CCI will have to deal with the left-overs of any prior cleanup efforts. So, (almost) no work will be doubled, and if the WMF now wants the OAs to clean up before the formal CCI will come around, this is perfectly fine as well, for as long as it is documented what has been done. It does not interfere if we work on the same online data base (the master list). --Matthiaspaul (talk) 12:14, 10 November 2011 (UTC)
Sorry, I may be giving the wrong impression. I'm not against a CCI - I want this cleaned up too - but I'd like to avoid duplicating effort, mostly because I know of the backlog that is at CCI (through lack of volunteers, more than anything - CCI work is labour intensive, difficult and unrewarding, but essential, hence my immense respect for people like MER-C and Moonriddengirl, along with all involved). So if there's a way of combining efforts it would work out better. The benefit I see from the current Emergency OA model is that it is proactively seeking people to work on the problem, which is a good move in my eyes from those involved in the IEP. Capitalising on that would be both a way of combining efforts and, when this occurs in the future, having the beginnings of a possible response model. (I say that simply because, as an educator who works with hundreds of tertiary students a year, I know the likelihood of copyright problems in any program in any country, even though I happily accept that the rewards outweigh risks). At any rate, that sounds like the plan, so all is good. - Bilby (talk) 12:47, 10 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Bilby: The CCI won't be a duplication of effort. Our cleanup will involve checking the IEP edits to a particular page, determining if it is a copyvio (and fixing bad grammar), notifying the student, and recording our actions in the master table. The CCI will look at the resultant article (after our cleanup), do a check for remaining copyvios, and they will also check our actions (Which are recorded with diffs, so it will be quite quick). This level of redundancy (We'd probably go for more redundancy if we weren't short of volunteers) is required to make sure that every single copyvio is removed. The reason that we aren't just doing a CCI (which is rigorous enough to not need any more redundancy), is that (like you mentioned), CCI has a lack of volunteers (And will thus take a very long time to go through ~1000 articles). We need to get rid of these copyvios ASAP (Remember, they're live.. We can't have that at all..). The best thing to do is attempt to clean it up as much as possible in a short time so that the copyvios are no longer live, and the CCI has a much easier job to do. The CCI will basically 'clean up the cleanup', by picking up any stray stuff we've missed (or wrongly blanked).
@Voceditenore: Not all of the OAs are experienced enough for the community to be sure that "if an OA says it's OK, it's OK" (At least, that's what it looks like from older discussions on this page). The CCI investigators are experienced in the field of identifying and addressing copyvios, so it would be better if they checked the OA or other cleanup volunteer actions also. We should provide diffs of all cleanup edits in the master list (comments section) to make life easy for the CCI.
@Moonriddengirl: It shouldn't be too hard to cross-check. In fact, MER-C has written a bot that imports a list of students/articles and generates a CCI. Unfortunately, we don't have a complete list of students/articles yet. Aside from that, it's not machine readable (Because of the various formats). Once that is done, MER-C can bot-create the CCI, and since the CCI will be bot-created, it, too will have a regular format which can be broken down for comparison with the OA cleanup (I doubt we will have to do this, though, if the CCI is started after the OA cleanup is finished). ManishEarthTalkStalk 15:25, 10 November 2011 (UTC)

Oh, by the way, Voceditenore, this isn't exactly the CCI. It's just a page for testing MER-C's CCI creator. (See WT:IEP#Contributor_surveyor_finished) ManishEarthTalkStalk 15:28, 10 November 2011 (UTC)
Yes I know not all of the OAs are experienced enough for the community to be sure that "if an OA says it's OK, it's OK". I think I mentioned even further up that we needed to cross-check how experienced the OA cleaners were, and re-check their work if in doubt. This was especially true of almost all of the "out of process" OAs appointed by the IEP which I had pointed out at Meta a week ago. (I believe the bulk of them were finally removed yesterday.) But even some of the ones from the "normal" US OA program are inexperienced with doing this work, and frankly, the work itself is so mind-numbing that it's almost impossible not to "glaze over" after a while. Voceditenore (talk) 16:13, 10 November 2011 (UTC)
@ManishEarth: I'm getting two stories here, which is confusing. Redundancy is good in general, but I don't see it as a plus here: a CCI is going to have a hard time going through all of the relevant articles, and the work involved to recheck something already checked is going to be considerable. Redundancy is great when you can afford it, but if you can't afford to lose time doing things twice, it is normally better to work out a means of focusing on the relevant problems so that duplication is removed. Especially with some big CCIs floating around that will also need to be addressed elsewhere. - Bilby (talk) 21:15, 10 November 2011 (UTC)
Let me make an attempt at summarizing the points made by various people so far on this topic: the CCI process is going to be a separate cleanup process from the Online Ambassador (OA) efforts. The OA efforts have already been underway and are continuing to operate, and are focused on not only removing copyright violations, but also removing unsourced content, very bad English grammar, and other hard-to-fix poor content in general (these are the instructions that the OAs participating in the cleanup have received). The CCI process has not yet started and it is uncertain when exactly it will start; when it does start, it will involve community members with CCI experience in general, and it will focus mostly on getting rid of copyright violations. User:Bilby is expressing concern that the CCI process will duplicate a lot of the work that the OA efforts already did; a few other people on this page are saying that the duplication will be minimal, and others are saying that the duplication might be necessary for adequately removing poor content coming out of the Pune pilot. There appears to be consensus on the point that even though the CCI process will take place sometime down the line, the current OA efforts are still needed and valuable because the copyvios and poor content need to be removed as much as possible immediately.
Is this an accurate summary? Please correct me if I am mistaken.
Building off of this, I have a few points and questions:
  • I want to echo User:Matthiaspaul's point above that it is crucial for everyone involved in cleanup - whether via the CCI process or the OA efforts - to be working off of the same database, namely this master list of students. This will minimize the amount of duplicate work, because OAs are using that student list and leaving comments there after they've cleaned up articles. I understand that the master student list might still be incomplete - some professors in the Pune pilot have still not yet provided their students' usernames and article information, so more information continues to be added to that list. Furthermore, the list could perhaps use some format improvements (like additional columns or beautification in general). It is therefore important that all of us continue to update and improve the student list, but we should all work off this same list instead of creating any separate databases/lists for the duration of this OA and CCI cleanup.
  • I'd like to call on the community at large to join the OA efforts. The OA efforts - despite their (informal) name - are not restricted to OAs (and our OA resources are limited anyway!). I basically see the "OA efforts" as the currently ongoing efforts to remove poor content from the Pune pilot student articles before the CCI process does a second cleanup later on. So if any community member has some time to help out with cleanup this week or next week, I would really, really appreciate it if you could go to the master student list (linked above), put your username in the "Online Ambassador" column next to some student articles that currently have "IN NEED" next to them, and help clean up those articles (remove copyvios, unsourced content, bad grammar, poor content in general). We can use all the help we can get right now!
  • I understand that some folks are concerned about how much the OAs can be trusted with doing cleanup. Following some discussions on this topic on the relevant Meta talk page, I've removed almost all OAs who have little prior Wikipedia-editing experience from the cleanup effort. But there still seem to be concerns about how much to trust the OAs. I would really like for all of us to come together behind the OAs and support their cleanup work rather than cast doubt on it. Like what many at-large community members have (very generously) been doing, the OAs have already been doing highly valuable work in cleaning up those student articles, and I believe all these people deserve applause rather than suspicion for that work. I think it is unhealthy to cast doubt on people who are completely good-faith members of the cleanup efforts - it is unhealthy not only because I think we should build our relationships on trust (and "assume good faith") rather than mutual skepticism, but also because to say that "we can't trust the OAs and therefore in the CCI we'll need to re-check everything the OAs did" would just lead to a lot of unnecessary duplicate work. So my suggestion is for us all to get behind the OAs (and to get behind the CCI folks when they start the CCI process down the line), and for everyone in the cleanup effort - OAs, CCI people, etc. - to operate on mutual trust. Now, if there are particular OAs who you think should not be part of the cleanup effort because you think those OAs for some reason don't have the right qualifications, I encourage you to please indicate (soon) who you think those OAs are who should be removed from the cleanup effort (please also provide good reasons for why you'd like to see those particular OAs taken off the cleanup effort), and then we can talk about it and take off those OAs who actually do not meet the "right qualifications" for doing cleanup.
I certainly think that possibly some OAs aren't experienced enough with copyright issues on Wikipedia to help with cleanup, and in that case we probably should take them off this effort. But for all OAs who remain in the cleanup effort, I'd highly recommend that we assume good faith and assume competency, just as we'd do the same for other Wikipedia community members at-large. Annie Lin (Wikimedia Foundation) (talk) 23:31, 10 November 2011 (UTC)
  • Annie, this has nothing to do with not assuming good faith, and I would appreciate it if you didn't put it in those terms. Furthermore, it has nothing to do with the editors' "trustworthiness" as people and no one has implied that. It has to do with the level of experience in a very specific task. Any conscientious editor would welcome a back-up check when they're operating in a wholly new area. I know I would.
Two days ago I left some tips for a very experienced editor and Online Ambassador for the US program (13,000 edits, 43 articles created) after seeing his message on another editor's talk page:
"I have no previous experience with systematically looking for copyright violations. I am wondering if you could give me a few tips about how I could be helpful in cleaning up the messes."
I'm sure he'll do fine now, but I'm wondering if the email messages sent to the OAs gave them tips and advice on how to do an effective copyvio search? If not, it might help to send one around rather than waiting for them to ask. There may be others in a similar position which is why they've been slow to get started.
On another note, if you're looking for more help in the current process you've set up, I suggest you reach out to the subject-specialised WikiProjects. If you leave a note on the project talk pages, you may be able to recruit not only experienced editors, but ones with specialist knowledge of the subject area and access to offline sources. Specialist knowledge is a big help in "fixing" articles. I've cleaned quite a few from the CoEP, but on several occasions the English was so garbled and my subject area knowledge so poor, that I had no idea what the students were trying to say. Thus, I couldn't adequately repair the article apart from removing any traced copyvio. I suggest you reach out to the following if you haven't already done so:
  • WikiProject Business
  • WikiProject Economics
  • WikiProject Engineering
  • WikiProject Technology
  • WikiProject Computer science
  • WikiProject Computing
Having said that, make sure these editors have a place to make their comments. The following either have no table at all, or tables without places for the OA/reviewer's name and comments: History of Economic Thought Year 2 Group A (section), Agribusiness and Marketing Year 2 Group A (section), History of Economic Thought Year 2 Group B (section), Agribusiness and Marketing Year 2 Group B (section), Research Methodology Year 3 Group B (section).
Best, Voceditenore (talk) 10:05, 11 November 2011 (UTC)
I feel the same way as Voceditenore. It's not a question of 'trustworthiness', it's a question of 'experience'. It's perfectly fine if the OAs do the cleanup (along with the rest of the community), but a CCI would be necessary as a final, reassuring check.
We should also start to update the tables to the format given above. (I'll do some now if I get the time) ManishEarthTalkStalk 12:34, 11 November 2011 (UTC)
I've updated the first table on the list to the new format (More cleanup-centric). Please use the same column order when updating others. ManishEarthTalkStalk 14:02, 11 November 2011 (UTC)
I've done a few more. Please revert if you feel that we need to rethink the new format. ManishEarthTalkStalk 14:29, 11 November 2011 (UTC)
Great! The course sub-pages of the Symbiosis School of Economics and the SNDT Womens University have now all been merged into the master list. I have gone through the change log of the corresponding sub-pages back to 2011-11-02 (and in one case back to September) to check for user name and article changes. The tables on the sub-pages have been deleted afterwards in order to force students to use the master list. So, we'll now have to continue with the more difficult task of converting the COEP tables into the new format as well... I would like to discuss one possible change to the current table order, though. The current order is:
Cleanup comments
Cleanup status
OA comments / Wikiproject review
Online ambassador / Mentor / Campus ambassador
CA comments
Roll number
Real name
Approval / Sign / Last change
And my proposed change would be:
Cleanup status
Cleanup comments
OA comments / Wikiproject review
CA Comments
Online ambassador / Mentor / Campus ambassador
Roll number
Real name
Approval / Sign / Last change
That is, the ID would be in the first column to help identify a particular row. Optionally, the three comments columns would be grouped together, with the cleanup columns coming first so that they still fit on the screen. The SNDT Womens University table is currently in this order. What do you think? --Matthiaspaul (talk) 20:56, 14 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Good work with the course page cleanup. We should modify the links on the master lists to go from "Courses/Fall 2011/Course#Students" to just "Courses/Fall 2011/Course" (Otherwise we get recursive links). But we can do this after the reformat is over.
@ID column: Hmm, the ID column takes up space, and we don't need it to identify a student. To identify a student, a username is sufficient (with a course name for the students in two courses). The ID column will not be actively used and glanced at. The username/articlename should be visible for quick clicking, and the comments also take precedence.
@CA Comments: I'm rather indifferent about that column (doesn't really matter where it goes), because it's empty in almost all the tables. So I'd rather keep it the way it is and spend time in reformatting the newer tables (the ones from COEP are giving a bit of trouble as they have missing cells, etc., which trips up the script, and half the time I'm adding workarounds to it). ManishEarthTalkStalk 09:51, 15 November 2011 (UTC)

Perhaps it's a language thing as I don't speak any Indian language, but I find it quite difficult to remember many of the students' account names, so between browsing the rendered page and editing a particular entry, I often use the ID (if available) as a handle to find the corresponding location more easily. --Matthiaspaul (talk) 10:34, 15 November 2011 (UTC)
Hmm... It's then much easier to just look for the article name (or a snippet of it). Or just scroll to the side and see the IDs there. But OK, I'll shift the ID column once I'm done with the rest. ManishEarthTalkStalk 11:19, 15 November 2011 (UTC)

Students are editing[edit]

A large number of editors from the Data Structures course are editing again, in article space no less [1]. Why? According to the course page there is some kind of deadline November 10th. —Ruud 18:27, 9 November 2011 (UTC)

But the course page also gives an article edit deadline of 2011-10-25. Really strange. (We also discussed this further up under #Multiple inheritance.) --Matthiaspaul (talk) 19:59, 9 November 2011 (UTC)
The professor in charge of this class is a prior (and existing) Wikipedia editor. He's determined that he wants to carry on the project. He had previously given an extended timeline to his students of November 10 (even prior to our asking all the faculty members to stop the assignment.) He's extended it to November 12 now. He's also told them that anyone doing any kind of copyvio (text or image) will lose all 20 marks for the assignment. He has also told his students that anyone whose work has been redirected to wikibooks should continue on wikibooks. Lastly, he's told his students that anyone who has page redirects or other such issues can submit assignments to him outside of wiki. We had spoken to the Director and all the professors (including the one for this class.) We will speak to him again tomorrow morning India time and reiterate the reasons for the suspension of the project and the seriousness of the issues around copyvio and other quality concerns. Hisham (talk) 20:21, 9 November 2011 (UTC)
Half a dozen students from Macroeconomics are still editing today, 11 Nov, with at least one still adding copyvio material in the mainspace. Their course page says their deadline is 14 Nov. JohnCD (talk) 22:32, 11 November 2011 (UTC)
Sarangvk (talk · contribs) is still editing heavily. I haven't had a chance to check for copyvios but given the proficiency of the English being used compared to the user's previous text, it needs to be checked. I'll be very busy today and tomorrow and probably won't have time to check but since I thought students weren't supposed to be editing at this point and this student has added mass amounts of content, I thought I should bring it up somewhere. OlYeller21Talktome 15:38, 14 November 2011 (UTC)
At least one article, AK model, is mostly lifted from here as is the caption of an uploaded image, File:Ak model.png. I tagged the article and will check other contributions. Jojalozzo 21:59, 14 November 2011 (UTC)
The edits are continuing: [2] Wikipedia talk:Articles for creation/Interest-free economy, [3]... MER-C 03:41, 16 November 2011 (UTC)

USEP discussion[edit]

There's a discussion going on here about the US education programme. It has not had the same problems as the IEP, but it does impact the general community of editors. Those involved here, both as ambassadors and as part of the copyright cleanup effort, may wish to participate in that discussion too. Mike Christie (talk - contribs - library) 01:31, 11 November 2011 (UTC)

Two questions about student contributions[edit]

I am working through some student contributions and ran into a couple of things I'd like advice on.

  • Student Mallika.sharma created Wikipedia_talk:Articles_for_creation/Non-banking_Financial_Company; the creation was declined. It was weakly/incompletely sourced. The English is poor enough that I doubt it is a direct copyvio. Is it OK just to leave this alone?
  • Student Abhinav619 created Challenges of inflationary policy in India, and is the author of almost all the text in it. (It should probably be moved to Inflation in India, but that's another issue.) Sources are given but there are no citations, so per Nitika's instructions, if this were not a student-created article I would simply move all the student's text to the talk page with a note that it would have to be cited appropriately. I can't do that in this case; what should be done instead? I'm inclined to move it to Inflation in India and replace all the text with a redirect to Inflation until someone else gets around to writing this article. Any comments on that approach?

Thanks for any help. Mike Christie (talk - contribs - library) 05:47, 12 November 2011 (UTC)

  • Hi Mike. Re Non-banking Financial Company, I've seen a lot worse articles than this passed through AfC. It's a bit of a lottery depending who's reviewing. Having said that, I would just leave this in the editor's user space for them to further improve (if they want) and note on the Clean-up/student list that it has no copyvio.

    Re Challenges of inflationary policy in India, again this is not that bad an article, certainly not so bad as to redirect. I'd leave it in place, move to Inflation in India, add any appropriate maintenance tags, and above all add {{WikiProject Economics}} and {{WikiProject India}} to the talk page as well as {{IEP assignment}}. (There are some specialised banners for individual IEP classes, but this will do in a pinch.) That way we can keep track of the IEP articles after the thing finishes. More importantly, the WikiProject banners give the subject-specialised projects a greater chance of finding it and possibly dealing with it in the perspective of the other articles within their scope. The maintenance tags are useful because most projects prioritise their work via their cleanup lists.

    In general, I think we have to be careful about shooting these IEP articles at dawn, unless they are copyvio (although after a whole day of dealing with them, I get sorely tempted). The only time I remove material (apart from copyvio), redirect, or propose for deletion is if the material is so garbled as to be incomprehensible and requiring a complete rewrite, if it essentially duplicates an existing article without adding anything new or helpful, or if it is dumped into an existing article where it is clearly not an improvement, and in fact a detriment.

    Frankly, I don't agree with Nitika's instructions about putting removed chunks of material on the talk page. It clutters up the talk page and serves no useful purpose. Instead simply note on the talk page that some material was removed and why and then link to the diff, e.g. [4]. Anyone who wants to work on the removed material has access to the history. Best, Voceditenore (talk) 10:04, 12 November 2011 (UTC)

Thanks for that. I left the student a note about it. I see from the student's talk page that their instructor approved the title/subject. Unfortunately this has happened a lot in the IEP. The instructors are approving topics without actually checking themselves that it isn't already covered in an existing article. I wonder if they were told to do this during the training sessions (were the instructors trained at all?) and more importantly, how to search. This has been a particular problem with the IEP because of a shaky grasp of English capitalisation conventions and a seeming lack of acquaintance with WP:TITLEFORMAT and WP:MOS generally. Thus they see Non-banking Financial Company as a red link and simply assume there is no article about it. Voceditenore (talk) 11:04, 12 November 2011 (UTC)
Thanks for the advice above. I have been deleting unsourced material from the students since that was what Nitika's instructions said to do, given the high likelihood of copyvio. If we feel that the upcoming CCI will address those issues I'll leave unsourced material in place from now on -- I don't like to delete material without definitely knowing it's a copyvio, but I could also see why that approach was suggested. If I do keep deleting unsourced material, I agree that a diff link on the talk page is a better approach; I'll do that instead. For Wikipedia_talk:Articles_for_creation/Non-banking_Financial_Company my concern was mostly that it's not in article space -- are these requests typically left sitting in project talk space forever if unsuccessful, or should it be deleted? If it's not to be deleted I will leave a link to it on the talk page of the current article MER-C linked to. Mike Christie (talk - contribs - library) 14:57, 12 November 2011 (UTC)

Note on duplicate students[edit]

I've just realized that some students appear in multiple sections of the student page, presumably since they took multiple classes. When I've been investigating a student's contributions I've been going through everything they did and noting all results in the comments column. I'm about to go back through my own updates and copy them to other locations on the page where the student appears. I think it would be sensible for anyone who is checking on the students' work to search for their name on this page to see if it's already been checked by another editor. Mike Christie (talk - contribs - library) 17:23, 13 November 2011 (UTC)

I'm going through and adding my notes in other locations for those students but I'm not filling in the OA comments column if there is an OA name and the cleanup column (not the OA comments column) already has cleanup notes. I assume this is the simplest approach but if I should fill in the OA comments too, let me know. Mike Christie (talk - contribs - library) 17:57, 13 November 2011 (UTC)
If you find duplicate students, just cross-reference them in either comments section (I'm going to later run a check on this and cross-reference all dupes). Currently, we are in the process of bringing that page to a common format. The cleanup comments column is part of the new format, and thus has some rules attached to it. The OA comments column, after the reformat, will be kept for reference, and not edited at all. When the reformat is over, everyone involved in the cleanup will be notified on how to use it. For now, you may just use the OA comments column (add your comments in a bulleted list if there already is something there). If you want to use the cleanup comments column, just check out the proposed guidelines here (not all of it is relevant). ManishEarthTalkStalk 18:47, 13 November 2011 (UTC)
Thanks. I think I'll stick with the OA comments column as I am not experienced at copyvio detection. Mike Christie (talk - contribs - library) 18:48, 13 November 2011 (UTC)
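
The duplicate check mentioned above could be sketched roughly as follows. This is only an illustration in Python, not the script anyone in this thread actually ran; the function name, the `(course, username)` input shape and the sample usernames are all assumptions. It relies on the fact that MediaWiki treats underscores and spaces in usernames as equivalent and ignores the case of the first letter.

```python
from collections import defaultdict

def find_duplicate_students(rows):
    """Return {username: [courses]} for every student listed under more than one course.

    `rows` is an iterable of (course, username) pairs; this input shape is an
    assumption for the sketch, not the real layout of the student page.
    """
    courses_by_user = defaultdict(list)
    for course, username in rows:
        # Normalise the way MediaWiki does: underscores equal spaces,
        # and the first letter of a username is case-insensitive.
        key = username.strip().replace("_", " ")
        key = key[:1].upper() + key[1:]
        if course not in courses_by_user[key]:
            courses_by_user[key].append(course)
    return {user: courses for user, courses in courses_by_user.items() if len(courses) > 1}

# Hypothetical sample data, not taken from the real student list.
sample_rows = [
    ("Course A", "exampleStudent"),
    ("Course B", "ExampleStudent"),
    ("Course A", "Other_user"),
]
duplicates = find_duplicate_students(sample_rows)
```

Each username in the result can then be cross-referenced in the comments column of every course where it appears.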

Commons image help[edit]

A student uploaded File:Money111.jpg which I am not sure about the copyright status for -- are the images on banknotes copyrighted? Mike Christie (talk - contribs - library) 02:54, 15 November 2011 (UTC)

My understanding is that it depends on the country, but that it is not ok in India [5]. The image itself seems to come from the RBI [6], which claims copyright of their site, so I'd assume the copyright would hold for that image without evidence otherwise. - Bilby (talk) 03:21, 15 November 2011 (UTC)
The Copyright Act in India gives a 60 year copyright to government images/designs. etc. in general. Would need careful reading to ascertain the exact rule position though. AshLin (talk) 03:50, 15 November 2011 (UTC)
I put it up for a deletion discussion at Commons [7]. Even if the banknotes themselves were out of copyright (which they don't seem to be), this is a collage with artistic input in the arrangement and colouring, not a straight-on image of a banknote. Anyhow, the experts there will know what to do about it. Voceditenore (talk) 06:36, 15 November 2011 (UTC)
But wait, there's more! File:Southasia111.jpg and File:SOUTHASIADEV.jpg are suspect imagevios: the gray strip at the top of the latter indicates it's likely to be a poorly cropped screenshot from some internet accessible database. I'm not sure if the data itself may be copyrightable (it depends on how it is compiled) but the presentation probably is.
By the way, doesn't anyone teach these students table markup or how to use an image editor?! This reminds me of a certain type of horrible Youtube video. MER-C 09:13, 15 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Well, to the newbie, table markup, whether taught or not, is pretty confusing. And I doubt the IEP would put more effort into teaching them table formatting as most of them won't need it anyway. Regarding the images, we'll probably have to look at all the IEP uploads, too, and fix them. Anyway, I replaced the ugly image with a slightly less ugly comp-generated one (I used MS Word; rather unorthodox, but it's rather useful for quick stuff like this). Ideally, it should be made svg or png, but I'm not going to start doing that. ManishEarthTalkStalk 11:28, 15 November 2011 (UTC)

It's not hard to point out and understand Help:Table or one of those leaflet thingies the WMF are so fond of. The images just keep getting better: File:Bowen's diagram 2.jpg and File:MU and TU of taxation.jpg. Do these students even look at their uploads before (or after) inserting them into the article?! (I won't fix these diagrams as I'm not familiar with the underlying economic models and hence can't tell whether they are correct. The camcorder diagram is just the market for financial capital where D = demand = investment, S = supply = saving and r = the real interest rate. This isn't obvious at all.) MER-C 11:51, 15 November 2011 (UTC)
OK, those are terrible. And not fixable. And shouldn't be on WP in the first place. OK, they are fixable if you know enough about the particular subtopic (I'm not). I really see no reason why someone would draw a diagram (I hate doing that), when their computer can draw the straight lines for them. Regarding the WP:Table, I seriously doubt that the students would read that if it was given in a pile of stuff along with WP:V/WP:NPOV/etc (I don't think the students thoroughly read these, either). ManishEarthTalkStalk 12:42, 15 November 2011 (UTC)

Mailing list discussion[edit]

See here. Starts in the message "Death and Post-mortem of Indian Education Program pilot -- #DelayedMail" by Srikanth Lakshmanan. It includes this scathing criticism of the WMF, which was forwarded to foundation-l. Enjoy. MER-C 05:14, 16 November 2011 (UTC)

The worrying thing is that some are thinking of repeating the process [8]. If it is to go forward I'd recommend using Wikipedia:Articles for creation as a way to minimise the damage. That's the on-wiki process for new articles by new editors.--Salix (talk): 11:52, 16 November 2011 (UTC)
The poster of that message is one of two people who are responsible for running this program. I prefer an external sandbox because the IEP project may crowd out other newbies at AFC via backlogs. (AFC is currently backlogged). MER-C 12:00, 16 November 2011 (UTC)
I would have liked to see the IEP next semester planning discussion started on en-wiki at the same time, given the impact it has had. I've left a note for Annie Lin to that effect. Mike Christie (talk - contribs - library) 12:47, 16 November 2011 (UTC)
Just chiming in, I think that it's fine to repeat the IEP, as long as they do it at a very small scale, and keep the community in the loop (and well involved in all the planning processes). There should be at least one community member on-site, who has lots of experience (WikimediaIndia should have quite a few such contributors). ManishEarthTalkStalk 13:21, 16 November 2011 (UTC)

The thing that keeps making me facepalm, and it's been said both in that mailing list thread and in various other places where IEP was discussed, is this concept of "well, we don't have enough qualified mentors to assist the number of students we want to include. I know, let's bring in mentors who have even less experience than the students!" In what universe is that the choice to make, rather than "Ok, we only have enough competent mentors for X students. I guess we only have room for X students this time around. Maybe we'll gain more mentors as time passes"? Why would you supplement the "workforce" with people who patently don't know how to do their jobs, rather than just cutting down the number of students, especially in a pilot program where the goal is to test how things work?

I get that a whole lot of things went wrong, from a whole lot of different causes, in this program. But the desire to slap inexperienced people, many of whom had never even edited Wikipedia, into advisory and leadership roles for this program strikes me as one of the worst, especially when they had, according to those emails, eager local community members available who were being shut out in favor of "ambassadors" who didn't understand Wikipedia. A fluffernutter is a sandwich! (talk) 15:49, 16 November 2011 (UTC)

Hi everyone, I posted this on my user talk page too because Mike left me a message about this topic, but I'll repost what I said here as well:
The local India staff team members (Hisham, Nitika) and some of the San Francisco -based Global Education staff team members (myself included) had a long, in-depth meeting yesterday to talk about the future of the education program in India. One thing we talked about at length is whether to continue any in-class activities next semester (spring 2012), or instead to focus in the spring on doing post-mortem analysis and wait until after spring to start working with any classes. As you said, there are big risks to running in-class activities before we have adequate time to make a thorough analysis of what exactly needs to be changed to make the program in India more successful, and we discussed these risks at length during our meeting yesterday, with many people arguing for waiting until at least June before working with any more classes. We decided that Hisham and Nitika will make the call (soon) on whether or not any in-class activities will take place next semester based on these discussions, since they are the people who run the program in India. So we'll have more exact updates on that afterward, but rest assured that we share your concerns 100%!
Everyone was also in agreement that the post-mortem analysis and the planning process absolutely need to be a dialogue between the Wikimedia Foundation and the community. One of the mistakes we made this past semester was that we did not involve the community sufficiently in the planning, and we definitely want to change that. Various community members have been involved in the Pune pilot (thank you all for your help) and have a lot of knowledge at this point about what the outcomes and challenges of the pilot were and how those affected the larger English Wikipedia, so I think any analysis and planning process in the coming months will be inadequate if these community members are not an active part of the conversation. We'd like to use a variety of communication channels for the analysis/planning since each channel has its pros and cons. So, be on the lookout for that soon as well! Annie Lin (Wikimedia Foundation) (talk) 20:15, 16 November 2011 (UTC)
"One of the mistakes we made this past semester was that we did not involve the community sufficiently in the planning, and we definitely want to change that."
I just wish I had more faith in WMF having recognised the truth of the first part, and more hope that they'd take a useful approach to the second. So far though, nothing about this whole mess gives me any confidence. Andy Dingley (talk) 20:33, 16 November 2011 (UTC)
Hear hear. MER-C 09:12, 18 November 2011 (UTC)

I was one of those participating in that thread and other forks about the IEP. The one thing that consistently stands out is that they carefully avoid acknowledging a) that they need to scale down b) that the campus ambassadors must have more editing experience. After a while the discussion bogged down to the number of edits a campus ambassador has and became ugly. The horde of CAs that descended into the discussion repeatedly keep claiming that the pilot was a success (a view shared by Sue Gardner). The CAs and the WMF IEP team still refuse to acknowledge that they need more article editing experience in en wiki to handle such a program.

Even after all the heat, I am not surprised to hear that "Hisham and Nitika will make the call (soon) on whether or not any in-class activities will take place next semester based on these discussions". The very possibility that "in-class activities" may continue next semester with exactly the same setup despite all that has happened shows how wrong WMF's attitude toward the program is. I am reproducing Kudpung's table on CA edits from meta:

CA edits
User 1st edit Total edits Mainspace
en:User:Gsinghglakes 18 September 323 21
en:User:Ramshankaryadav May 696 41
w:en:User:Seva.panda 3 June 14 1
w:en:User:Arnavchaudhary June 117 23
en:User:Wasimmogal2007 12 Sept 2010 316 6
w:en:User:Pallaviagarwal90 28 August 109 10
w:en:User:Mihir.khatwani June 240 20
w:en:User:Tambeparag July 171 86
w:en:User:U.raghavendra June 39 6
w:en:User:AbhiSuryawanshi May 343 59
w:en:User:Rangilo_Gujarati February 1,210 206
w:en:User:ALX999 May 79 8
w:en:User:Mihir_Kelkar 31 August 9 3
w:en:User:Pratiklahoti8004 July 532 51
w:en:User:Gunit31 August 137 28
w:en:User:Devanshi_tripathi August 571 278
w:en:User:Anurag_acj 25 July 128 22
w:en:User:Vaibhavchandak 28 July 172 43
w:en:User:User:Debastein1 24 July 838 133 (user 244)
w:en:User:Vedantgupta7890 29 July 117 62
w:en:User:Minakshinajardhane 27 July 128 28
w:en:User:Nikita.agarwal 21 August 137 51
w:en:User:Shefalinaik 28 July 73 19
w:en:User:Roshnisaigal 30 July 188 96
w:en:User:Ishu.aghav 3 September 28 6
w:en:User:Arjunmangol 30 July 302 45
w:en:User:Tb0412 8 July 93 39
w:en:User:Kumarvikramsingh 6 September 29 3
w:en:User:RDebashruti 21 August 29 2

These are the sort of people the WMF offers as a solution to handle an ever expanding education program.

Despite all the issues we have raised and the large number of regular editors who have spoken out, the WMF a) refuses to acknowledge outright that the IEP pilot was a disaster b) is thinking about continuing the same program for the next semester c) still thinks throwing more inexperienced CAs into the program will take care of things.

Unless something drastic happens from the en wiki community side - like an RFAR MER-C mentions or a blanket ban on student articles, I have no faith in the WMF's ability to self correct. --Sodabottle (talk) 10:11, 18 November 2011 (UTC)

  • The problem wasn't with the CAs themselves. The problem was that the organizers expected them to do what CAs are not meant to do, regardless of how much experience they have. I'm not sure what the thinking was behind that, especially given how inexperienced they were. The real problem was that the IEP organizers (somewhat late in the day) recruited "out of process" Online Ambassadors, the majority of whom were completely non-participative, and worse, even less experienced than the CAs! It was claimed that these IEP OAs had been trained. I'd be curious to know by whom and on what. Three of them had added copyvio to WP, for one thing. Another three of them are listed on the IEP course pages with no link whatsoever to their user account, if they even had one. Then after three months of chaos, when it was plain that the vast majority of these OAs weren't editing on Wikipedia at all, let alone mentoring students, they were assigned to clean up the copyvio. Note that I'm not referring here to the US CAs who were brought in later to help with the copyvio clean-up. Voceditenore (talk) 11:11, 18 November 2011 (UTC)
We (IEP OAs) were given a roughly one hour long lecture focused primarily on the goals/structure of the program and another hour long IRC session in which there was some role playing of assisting students. I'd be happy to forward the lecture notes to anyone interested. I'd also be happy to pass along the emails we received. I think they would be instructive to anyone wishing to analyze how and why this program failed. Danger High voltage! 15:35, 19 November 2011 (UTC)

WMF EP quantitative analysis conference 22 November[edit]

There's an online conference presenting quantitative analysis of the IEP (mostly) occurring at 16:00 UTC, 22 November (see here for your timezone). More information is available at outreach:Global Education Program Metrics and Activities Meeting. The general community is welcome to attend; we need a few people there to ask the hard questions and disrupt the inevitable WMF circlejerk. (I won't be attending myself, time = midnight for me).

I also note with disdain this official WMF blog post, which is extolling the virtues of expanding the EP without regard to community health.

P.S. Why aren't the Foundation staff posting this? (Cross posted to WT:USEP.) MER-C 05:50, 18 November 2011 (UTC)

I notice they only have capacity for 25 participants so get there early. Jojalozzo 15:54, 18 November 2011 (UTC)
I've replied to MER-C's post at WT:USEP, but please note that this meeting is not "mostly" about the India Education Program -- it is a high-level overview of activity happening in education programs around the world on several different language versions of Wikipedia. India is one of five countries we will discuss during the meeting. -- LiAnna Davis (WMF) (talk) 16:53, 18 November 2011 (UTC)

Reformat finished[edit]

The reformatting of WP:IEPS is finished (a few trivial things are left, which can be done later).
For all of you involved in the cleanup, here are the rules for editing the tables:

  • Please check all edits made by a student, not only the IEP-related ones. Also, check if a student has already been commented on/cleaned up on a different course (many students have enrolled for multiple IEP courses) before proceeding.
  • Explain what you have changed in the edit summary. If you have changed some data (not if you have added a comment), please sign the "Last local table update: signature/date" hatnote with ~~~~ (you may have to remove an older sign).
  • All comments now go in the "cleanup comments" section. They should be placed in bulleted lists (using *), with the most recent comment at the top. Precede the comment with a timestamp, using {{subst:ISO8601}}. Please sign the comments.
  • When you are done checking, update the "Cleanup status" column with the status (using a timestamped bulleted list like the "Cleanup comments" column). Use the following statuses: "Checked:OK", "Checked:Copyvios/Blanked", "Checked:Copyvios/Not blanked", "Not sure", and "Unchecked". If you want to add more detailed info, use the "cleanup comments" column. Wrap the cell with {{yes}}, {{no}}, {{partial}} templates according to the latest check. (Yes for "Checked:OK", No for "copyvio", Partial for "not sure"). Note that these templates only work if their opening braces touch the "|" of the table cell. E.g.:
(The 2nd column is the comments column, and the 3rd is the status column.)
|*{{subst:ISO8601}}: Have checked thoroughly, does not seem to be a copyvio. ~~~
*2011-10-19T01:14Z: I'm not too sure about the first section. [[User:Example|Example]] ([[User_talk:Example|talk]])
*{{subst:ISO8601}}: Checked:OK
*2011-10-19T01:14Z: Not sure

Some more rules have been kept here (This page has been transcluded to the editnotice of WP:IEPS and is visible to whoever tries to edit the page).
If you have any queries, please post them below.
Thanks, ManishEarthTalkStalk 09:42, 18 November 2011 (UTC)
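
For anyone who prefers to prepare entries offline, the bookkeeping in the rules above can be sketched in Python. This is a hedged illustration only: the function names are made up for the sketch, the timestamp format is inferred from the example value 2011-10-19T01:14Z, and "Unchecked" is mapped to no template because the rules do not name one for it. How exactly the template is wired into the table cell is deliberately left out.

```python
from datetime import datetime, timezone

# Status strings as quoted in the rules, mapped to the template named for them.
# "Unchecked" has no template in the rules, so None is an assumption of this sketch.
STATUS_TEMPLATE = {
    "Checked:OK": "yes",
    "Checked:Copyvios/Blanked": "no",
    "Checked:Copyvios/Not blanked": "no",
    "Not sure": "partial",
    "Unchecked": None,
}

def iso_stamp(when):
    """Format an aware datetime like the example timestamp, e.g. 2011-10-19T01:14Z."""
    return when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")

def add_entry(entries, text, when):
    """Return a new bullet list with the new timestamped entry on top (most recent first)."""
    return [f"*{iso_stamp(when)}: {text}"] + list(entries)

def template_for(status_entries):
    """Name of the template matching the latest (topmost) status entry."""
    latest_status = status_entries[0].split(": ", 1)[1]
    return STATUS_TEMPLATE[latest_status]

# Usage: an older "Not sure" check followed by a later "Checked:OK".
first_check = datetime(2011, 10, 19, 1, 14, tzinfo=timezone.utc)
second_check = datetime(2011, 11, 18, 9, 42, tzinfo=timezone.utc)
status = add_entry([], "Not sure", first_check)
status = add_entry(status, "Checked:OK", second_check)
```

Keeping the newest entry first means the template always follows the first bullet, which matches the "according to the latest check" rule.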

I have a couple of questions.
  • I've been posting in the OA column, as I have no copyright cleanup experience; do you want me to not do that any longer? Do you need me to move my past comments to the cleanup column?
  • It looks like you're considering this as an article by article cleanup, so that the comments apply to the article listed in that row. I don't think that's a good idea -- the students often didn't work on the article listed, but worked on another article, or multiple articles, instead. In addition, I've found it's much more efficient to check all a student's contributions when I get to that student, and make an annotation that refers to all their work. If you look at my comments you'll see that they are identical for each instance of the student. Search for "vastu1706" and you'll see what I mean. The result is that the comments you give as an example would not be meaningful, because you'd have to qualify which article you're talking about.
  • Finally, I'm curious as to why so much effort is being put into reformatting this page. So much of what the students have done is being completely removed that in many cases there's nothing for a returning cleanup pass to do. See this section, for example, and read the first half dozen comments; I don't think there's much left to examine. Add to that the fact that most students appear in the list two or three times at least, and that the cleanup work done so far has probably dealt with perhaps a third of the page already, and I'm not sure that this is such a big project after all -- at the current rate, about another two weeks will probably see it completed. I have no CCI background so pardon me if I'm missing something here.
Thanks -- Mike Christie (talk - contribs - library) 12:24, 18 November 2011 (UTC)
  • No, the past comments can stay where they are (You can move them if you want, just set the timestamp to 'unknown'). You do not need copyright cleanup experience to use the cleanup columns, just common sense. The CCI people will do a check later for anything we've missed.
  • Actually, it's a student by student cleanup. Sorry, I forgot to mention that. You must check the contribs of all the students in a row, clean up whatever is necessary, and report the findings. I don't see how that would make the comments not meaningful. You can qualify the article, and add another comment for the next article.
I've added a note above on how to handle multiple students.
  • Because the whole thing was rather haphazard before. One of the problems with different formats is that it's hard to see what's going on fully. The other problem was that exporting the student/article data to a machine-readable list was a big headache (And this is needed if you want to keep an eye on the students or do some repetitive task). Actually, it wasn't much 'effort' to reformat the tables (I had a script), just a bit boring. Regarding your estimate, I (and I think most of the others) would disagree that it will take a few weeks to complete this (Read some of the discussions above, for example the "Common format-consensus" one). ManishEarthTalkStalk 13:18, 18 November 2011 (UTC)
Mike, this was discussed in other threads further up. Even if most of the students' contributions will have to be deleted in the end (a pity, but not our fault), you need a solid database to work on. So far, we haven't had anything like this. The various lists in other places have been incomplete, outdated, faulty, contradictory and inconsistent. Tons of articles were missing. We still find "new" IEP students by accident, which have not been listed so far. And some students have created more than one account or have been editing under IP addresses. Basing your work on this mess, you will be able to investigate some edits, but you will also miss many, even if you carefully check a student's other edits. The full article list to be checked could (and will) be automatically derived from a full students list, but we maintained the articles in the master table as well, since in several cases we found "new" IEP students by checking edits of originally listed or related articles, which have been in the scope of the programme. So, in order to get a complete list of students, we will have to reverse-lookup and check any editor who edited an article in the past months, if they might have some connection with IEP. Some of this can be automated, but only after the tables on the master list were brought into a uniform table format, so that it can be parsed by scripts easily. Another reason for the table reformatting is the fact that, in order to force everyone to work on a single database (the only way to avoid synchronisation problems), we deleted the various distributed lists in other places (after carefully merging the info). Most of the info is not relevant for any cleanup efforts, but if it hadn't been relevant for the local community (instructors etc.) they would not have added it to the tables. After all, they still have to evaluate the students' work to determine if they have passed their courses or not.
If we'd just delete their stuff without consideration, our efforts could hardly be seen as an attempt of a cooperation, and a complete lack of communication and cooperation (from their side) is what has caused this chaos in the first place. --Matthiaspaul (talk) 13:47, 18 November 2011 (UTC)
Thanks for the responses; I didn't mean to complain, just check on a couple of things. I'll start adding my comments to the cleanup column and stop adding them to the OA column. One remaining issue: what I meant by "not meaningful" is that "I'm not too sure about the first section" (the given example comment) doesn't mean anything if the row refers to multiple articles. Anyway, if the comments I've already been making are acceptable I will continue to use that format (in the new column) since I have worked out a fairly efficient process for doing them that way. Mike Christie (talk - contribs - library) 14:09, 18 November 2011 (UTC)
Make sure that you still timestamp and bullet the comments, whatever format you use. Regarding the example comment, it wasn't exactly an example comment, it was just to display how one should use {{yes}} and the timestamps. ManishEarthTalkStalk 15:28, 18 November 2011 (UTC)

One more question: if I find copyright issues currently I am just deleting the material, and leaving an appropriate note, both here and in the edit summary. Technically this material should be revdeleted, I gather. Should I be flagging these for revdel? If so, what's the best way to do that? Mike Christie (talk - contribs - library) 15:30, 19 November 2011 (UTC)

And one more; if an article listed was not in fact worked on by a student, should I delete that article from that row of the table? Mike Christie (talk - contribs - library) 15:47, 19 November 2011 (UTC)
When I search for copyvios I only check online sources. If I find nothing I will add "could not detect copyvio" to the comments on that article, and will add "Checked: OK" as the status, even though I've not checked offline sources. Let me know if that's not OK; and also please take a look at the first few edits I make here and let me know if there are other changes you need me to make. Mike Christie (talk - contribs - library) 16:06, 19 November 2011 (UTC)
Yes, it should be revdeled, that will be done during the CCI. You can flag them if you wish (that will reduce the CCI workload), but it is not necessary to do so. Don't remove articles from the list, the list isn't only for the cleanup (The profs need it, too).
Yes, it is understood that we can't easily check offline sources, so an online check is enough. Fortunately, if something is copied from a textbook, it is usually obvious (example), at least for most of the IEP edits I've seen. These can be safely blanked. If you're not sure, use the "not sure" option. Also, if you want to check out textbooks, you can do a Google Book search of the text in question.
By the way, don't forget to add the yesno templates to the cleanup status. This will primarily help the CCI (and us) understand what is needed at a glance (yes means it's clean, no means that it needs revdelling, partial means the cleanup-er wasn't sure). I have added them to one of the tables. ManishEarthTalkStalk 02:11, 20 November 2011 (UTC)
Thanks for the example edit; that was helpful. I'll put the yes/no/partials in. Please keep an eye on my updates and let me know if there's anything else I need to change about the way I'm doing them. Mike Christie (talk - contribs - library) 02:40, 20 November 2011 (UTC)

Watchlist dumps updated.

Since the reformat is over, I was able to update our watchlist dumps. The "Watchlist dump" pages contain text to copy-paste into your raw watchlist. The "Watchlist" pages are watchlist-like pages for the list of articles/etc in question (If you don't want to bloat your watchlist).
Here they are:

Enjoy! ManishEarthTalkStalk 14:07, 18 November 2011 (UTC)

Great, thanks! --Matthiaspaul (talk) 14:43, 18 November 2011 (UTC)

If anyone wants it, here's the script I use for generating the lists. It needs to be executed on WP:IEPS, using a Javascript Console (Chrome has one, I think IE and Greasemonkey have them, too). Execute the chunks separated by // separately, in order (The first chunk prepares the ground, the second chunk displays the usernames, third articles, fourth sandboxes). The last three chunks will temporarily hang your browser (or part of it.. on Chrome all Wikipedia pages are readable, but you can't type anything on them). The code is a bit badly written (I wrote it in a rush), but it works:


// Chunk 1: prepare the page and collect the links.
$.fn.removeCol = function (col) {
    // Make sure col has a value
    if (!col) { col = 1; }
    $('tr td:nth-child(' + col + '), tr th:nth-child(' + col + ')', this).remove();
    return this;
};

// Turn a link URL into a page title (underscores become spaces, #fragment dropped).
function getLink(str) {
    var ret;
    if (str.indexOf("index.php") != -1) {
        ret = str.split("title=")[1].split("&")[0];
    } else {
        ret = str.split("/wiki/")[1];
    }
    return unescape(ret).replace(/_/ig, " ").split("#")[0];
}

// Sort helper: order two <a> elements by their page titles.
function compareLinks(l1, l2) {
    var s1 = getLink(l1.href);
    var s2 = getLink(l2.href);
    if (s1 > s2) {
        return 1;
    } else if (s1 < s2) {
        return -1;
    }
    return 0;
}

// Copy the tables into an array first: getElementsByClassName returns a live
// list that shrinks as tables are moved out of the page, so looping over it
// directly would skip every other table.
var tables = Array.prototype.slice.call(document.getElementsByClassName("IEPtable"));
var doc = document.createElement("body");
for (var i = 0; i < tables.length; i++) {
    doc.appendChild(tables[i]);
}
document.body = doc;
// Drop columns 5 to 12 and column 1, leaving only the student and article links.
for (i = 5; i < 13; i++) {
    $(".IEPtable").removeCol(5);
}
$(".IEPtable").removeCol(1);
var users = [], articles = [], sandboxes = [];
var a = document.getElementsByTagName("a");
for (i = 0; i < a.length; i++) {
    var h = getLink(a[i].href);
    if (h.indexOf("User:") != -1) {
        if (h.indexOf("andbox") != -1 || h.indexOf("/") != -1) {
            sandboxes.push(a[i]);
        } else {
            users.push(a[i]);
        }
    }
    if (h.indexOf("User:") == -1 && h.indexOf("User talk:") == -1 && h.indexOf("Special:Contributions") == -1 && a[i].innerHTML.indexOf("YOUR ARTICLE") == -1) {
        articles.push(a[i]);
    }
}

//
// Chunk 2: display the usernames.
document.body.innerHTML = "";
users = users.sort(compareLinks);
for (var j = 0; j < users.length; j++) {
    document.body.innerHTML += "[[" + getLink(users[j].href) + "]]<BR>";
}

//
// Chunk 3: display the articles.
document.body.innerHTML = "";
articles = articles.sort(compareLinks);
for (j = 0; j < articles.length; j++) {
    document.body.innerHTML += "[[" + getLink(articles[j].href) + "]]<BR>";
}

//
// Chunk 4: display the sandboxes.
document.body.innerHTML = "";
sandboxes = sandboxes.sort(compareLinks);
for (j = 0; j < sandboxes.length; j++) {
    document.body.innerHTML += "[[" + getLink(sandboxes[j].href) + "]]<BR>";
}

ManishEarthTalkStalk 15:47, 18 November 2011 (UTC)
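For anyone who wants to check the sorting rules without running the script on WP:IEPS, here is a minimal sketch of the same classification as pure functions. The names titleFromUrl and classify are hypothetical (they are not part of the script above), the example URLs are made up, and decodeURIComponent is swapped in for unescape as an assumption about how titles should be decoded.

```javascript
// Hypothetical helper: extract a page title from a wiki link URL.
// Returns "" when the URL does not look like a wiki page link.
function titleFromUrl(url) {
    var raw;
    if (url.indexOf("index.php") !== -1) {
        raw = (url.split("title=")[1] || "").split("&")[0];
    } else {
        raw = url.split("/wiki/")[1] || "";
    }
    return decodeURIComponent(raw).replace(/_/g, " ").split("#")[0];
}

// Hypothetical helper: apply the same rules the script uses to sort links
// into users, sandboxes and articles.
function classify(title) {
    if (title === "") {
        return "other";
    }
    if (title.indexOf("User:") !== -1) {
        // Any user subpage, or anything with "andbox" in it, counts as a sandbox.
        if (title.indexOf("andbox") !== -1 || title.indexOf("/") !== -1) {
            return "sandbox";
        }
        return "user";
    }
    if (title.indexOf("User talk:") === -1 &&
        title.indexOf("Special:Contributions") === -1) {
        return "article";
    }
    return "other";
}

// Usage with made-up URLs:
var examples = [
    "https://en.wikipedia.org/wiki/User:Example_student",
    "https://en.wikipedia.org/wiki/User:Example_student/Sandbox#Draft",
    "https://en.wikipedia.org/w/index.php?title=Machine_drawing&action=edit"
];
examples.forEach(function (url) {
    var title = titleFromUrl(url);
    console.log(classify(title) + ": [[" + title + "]]");
});
```

Note that, as in the script, every user subpage is treated as a sandbox, so a subpage that is not a draft would still land in the sandbox list.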

Cool! It appears the students have stopped making major edits but I'll wait a week or so before running the CCI program just in case. MER-C 03:37, 19 November 2011 (UTC)
Remember that not all student usernames have been added to the IEPS table. We may need to manually backcheck from the articles (if the WMF doesn't supply them; they filled in most, but a few are still left). ManishEarthTalkStalk 10:26, 19 November 2011 (UTC)
The CCI program can be run incrementally -- it doesn't require the list to be complete (but we do). The sooner we start systematically cleaning up this mess (after my exams, 3 days now), the better. MER-C 11:57, 19 November 2011 (UTC)
Once you/Moonriddengirl/etc start the CCI, could you ensure that you log all cleanups to the master list? I don't want the OAs to have to duplicate the effort. You could either write "Cleaned by CCI" in the "Cleanup comments" using a timestamped bulleted list (see above), or I could add a separate column for the CCI.
Note that if a student is marked as 'checked' by an OA, it still will need a CCI, but your job will be easier. The OAs will have done internet copyvio checks and commonsense "looks like it was copied" checks, and blanked any copyvios. We will still need someone to revdel the stuff, and a final confirming check by the CCI (which knows much more about copyright than the OAs do). ManishEarthTalkStalk 07:40, 20 November 2011 (UTC)
I think shutting down the OA cleanup on the master list and redirecting the OAs to the CCI would be a better solution. MER-C 09:47, 20 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── The issue with that is that not all OAs have copyright experience (Not all are generally wiki-aware, either). This point has been brought up a few times before (on this same page), so it would be better for the OAs to stay out of the CCI. They can do the bulk of the cleanup while the CCI can doublecheck and tie up the loose ends. That way, the CCI isn't overburdened, and the cleanup is still done systematically. ManishEarthTalkStalk 12:10, 20 November 2011 (UTC)

There's nothing wrong with repurposing a CCI as a general cleanup listing, it is, after all much more thorough than what the WMF made up on the spot. Instead of removing the diff listing -- something I was going to recommend against anyway with a messagebox and instruction modification, we should use multiple comments with OA comments being explicitly marked as such (example follows):
  • OA comment: seems OK User:Example OA 09:25, 21 November 2011 (UTC)
  • Green tickY reverted MER-C 09:25, 21 November 2011 (UTC)
I don't want the people at CCI doing any more work or form-filling than they have to, especially if it is due to WMF incompetence. We are stretched far enough as it is. MER-C 09:25, 21 November 2011 (UTC)
I think I'm going to stop the cleanup efforts I've been doing; I think the experts who do CCI work are going to do a better job than I've been able to do and I don't want to duplicate effort. There aren't many editors doing OA cleanup here so it might be OK just to shut it down here and move to CCI. Mike Christie (talk - contribs - library) 15:18, 21 November 2011 (UTC)
I agree, though I have nothing against the OAs helping out at the CCI (the community may oppose it, though). Anyways, you seem to be just about the only active cleanup OA. If the CCI people are fine with it, you could help them out in the way MER-C suggested. ManishEarthTalkStalk 12:20, 22 November 2011 (UTC)
We're fine with CCI taking over now if that's what you all would like. We certainly feel like it's our responsibility to facilitate the cleanup efforts as much as we can, but our U.S. OAs are in the crunch time for supporting their students right now, so honestly I doubt much more cleanup will happen from their end for another few weeks. We're certainly happy to ask for more help again after the term wraps up in the United States, because we're committed to ensuring cleanup happens, but if CCI is ready to take on the project now, we welcome the help. We'll stop asking our OAs for help at this time to let the CCI process happen, but do let us know if you'd ever like us to resume it. -- LiAnna Davis (WMF) (talk) 00:11, 23 November 2011 (UTC)
I have no qualm with OAs continuing cleanup at the CCI as long as the format above is followed and their comments are labelled as OA comments.
The best way you can facilitate the cleanup right now is to supply a full list of student usernames. My request for such a list made in the IRC office hour last month was neglected. We cobbled up one ourselves based on the information we have, but I strongly suspect it is incomplete given that it was rather poorly maintained throughout the semester. MER-C 08:01, 23 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Yep, the list I've generated is based on WP:IEPS, which is still incomplete. On a whim, I just checked out the database that Frank Schulenburg uses to run the student-o-meter on the toolserver (anyone with a ts account can access it). I've kept the student data I gleaned here. Use this diff to see the differences between those pages. Mostly, we have student usernames that the WMF does not, though they do have some that we don't. If you find any (most of the red lines are due to bad diff alignment), try to add it to this table (if you can find where the username fits in the other tables, that's fine, too), and then to this list if you feel like it. I checked the students till E, there are still a lot more to copy. I'm going to do this with a script after a while (I didn't expect there to be so many extras so I started by hand).
Remember, we still don't have all usernames, this is just a bunch of 100-odd extras.
ManishEarthTalkStalk 13:43, 23 November 2011 (UTC)
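The comparison of the two username lists described above can also be done without reading a page diff. The following is only a sketch: compareLists and normalizeName are hypothetical names, the usernames are placeholders, and the normalization assumes the usual MediaWiki behaviour that underscores equal spaces and the first letter of a username is capitalized.

```javascript
// Hypothetical helper: bring a username into a canonical form so that
// "example_student" and "Example student" compare as equal.
function normalizeName(name) {
    var n = name.replace(/_/g, " ").trim();
    return n.charAt(0).toUpperCase() + n.slice(1);
}

// Hypothetical helper: report which usernames appear in only one of two lists.
function compareLists(ours, theirs) {
    var ourSet = new Set(ours.map(normalizeName));
    var theirSet = new Set(theirs.map(normalizeName));
    var onlyOurs = [];
    var onlyTheirs = [];
    ourSet.forEach(function (n) { if (!theirSet.has(n)) { onlyOurs.push(n); } });
    theirSet.forEach(function (n) { if (!ourSet.has(n)) { onlyTheirs.push(n); } });
    return { onlyOurs: onlyOurs.sort(), onlyTheirs: onlyTheirs.sort() };
}

// Usage with placeholder usernames:
var result = compareLists(
    ["example_student_A", "Example student B"],
    ["Example student A", "Example student C"]
);
console.log("Only in our list: " + result.onlyOurs.join(", "));
console.log("Only in the WMF list: " + result.onlyTheirs.join(", "));
```

Because the comparison works on sets rather than on line positions, it does not suffer from the bad diff alignment mentioned above.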

I have added more usernames. The list (almost) looks complete besides 10 usernames from the class "Machine Drawings & Computer Graphics". I have mailed the professor for this class asking for usernames. I'll add the remaining 10 usernames as soon as I get the list. Nitika.t (talk) 06:12, 24 November 2011 (UTC)
Added more. Only 1 or 2 missing now. Nitika.t (talk) 10:30, 25 November 2011 (UTC)

Wikipedia:Contributor copyright investigations/Indian Education Program is now live

I have run my CCI program on the union of the lists I posted here. The output is Wikipedia:Contributor copyright investigations/Indian Education Program and 4 other subpages.

36 of 770 users on the above lists do not correspond to actual accounts:

The following users have contributions prior to the IEP:

The final list comes out to be 736 users, less than the 1000 or so in the relevant boasts. As such, I still believe the list is incomplete. Please read the instructions and check for additional students when auditing, which you should inform me of. (As mentioned above, the CCI program is incremental so what's here doesn't need to be updated.) Have phun! MER-C 10:38, 26 November 2011 (UTC)

Other than trying to figure this out from the main students page, is there a list which covers whether or not these have already been checked? Or do we need to compare each one against the existing student's page? - Bilby (talk) 10:47, 26 November 2011 (UTC)
Forget it. The first one I looked at had extensive copyvio in the first two lines that was missed when it was checked before, and that was checked by someone with extensive experience. :( I'll muddle through. - Bilby (talk) 11:04, 26 November 2011 (UTC)
I'll start going through the stuff soon. Just a note, even though it does not fall within the scope of the CCI, could you also tag camcorder'd pics? Use {{Cleanup-image}} or {{SVG}}. It doesn't take much extra time and it saves useful diagrams from deletion. ManishEarthTalkStalk 16:15, 26 November 2011 (UTC)

The CCI has been updated to include a further 19 students. This brings the total to 755. MER-C 11:32, 6 December 2011 (UTC)

Another batch of students

From Wikipedia:India Education Program/Courses/Fall 2011/Computer Organization and Advanced Microprocessing.

The following users do not exist:

The following users have contributions prior to the IEP:

This takes the CCI up to 823 students. MER-C 03:42, 17 December 2011 (UTC)

Bot approved--any more tasks?

The BRFA for User:Manishbot has been approved. I am letting the bot tag IEP talkpages with {{IEP assignment}}, and userpages with {{User WikiProject India Education Program}}.
Now, in this discussion, a sort of "beware of copyvio" message was proposed (to be posted on talk pages of articles). Should we go through with it? I'm copying the original proposal here:

Look out for possible copyright violations in this article
This article has been found to be edited by students of the Wikipedia:India Education Program project as part of their course-work.
Unfortunately, many of the edits in this program so far have been identified as plain copy-jobs from books and online resources and therefore had to be reverted. See the India Education Program talk page for details.
In order to maintain the WP standards and policies, let's all have a careful eye on this and other related articles to ensure that no material violating copyrights remains in here. <insert sig>

Now, we may need to change the last line to "..policies, please check the article for any copyvios...", and add something about the grammatical errors, but otherwise it looks OK (Though I'm not sure if we need to go through this at all).

Also, please let me know if there are any other automated tasks that the cleanup needs (I may need another BRFA for them, though). ManishEarthTalkStalk 12:51, 27 November 2011 (UTC)

Learnings, another mailing list discussion and planning for IEP v2.0

From [9]:

I'm writing to share an update on the way forward as far as making sure that we adequately capture all the learnings possible from the India Education Program. We also want to make sure that these learnings are robust and are incorporated into the core program design going forward. We would also like these learnings to be at an adequately granular level so that we can identify trends of what doesn't work and what might work better - such as year of students, nature of faculty involvement, subject area, etc.

We are planning multiple channels to capture, analyse and incorporate these learnings.

  • WMF has commissioned a researcher, Tory Read, to conduct an evaluative study of the IEP and provide recommendations for improvement. Over the course of the next few days, she will be interviewing several teachers, students, CAs, Directors, admins and other Wikipedia editors from the global community to take their input and views about IEP. She has already spoken to staff in SF, spent the day with us in Delhi and will be in Pune over the next 4 days conducting face-to-face interviews. She will also be reaching out to community members outside of Pune. She will then write an evaluative story about what worked, what did not and learnings from the pilot.

  • A series of video interviews were done by a Campus Ambassador in Pune - Abhishek Suryawanshi who conducted interviews with students, professors and Campus Ambassadors in Pune. We promised all interviewees that any personal identifying information would be removed from this analysis to encourage them to speak freely.

  • In addition, India Programs consultants (Hisham & I) and Wikimedia Foundation staff (Frank Schulenburg and LiAnna Davis) are interviewing experienced Wikipedia editors from India and across the world. Please do mail me (at if any one of us can contact you to ask your views (if we already have not.)

  • We want to collate data points to be able to analyze and draw out trends. Here is an example of data that we're trying to dig out (and this is just a sub-set of a preliminary list)
  • What's the amount of data that students have added to Wikipedia? What's the amount of data that got reverted? What's the net amount of information that the students have added on Wikipedia?
  • How many students edited articles outside of their in-class assignments?
  • How many students got warnings on their talk pages? How many students corrected their errors after these warnings?
  • How many students got blocked? / How many students got blocked more than once?

  • At the WikiConference India we also had an IEP round table where we invited a couple of students, professors and campus ambassadors to share their personal views and experience working with the IEP pilot. I'm going to upload the video if it's possible and share this with you. It was a very useful session where we discussed many of the points that will inform our future work - such as determining whether it should be voluntary or not, what students were thinking while they engaged in copy-pasting into their articles, improvements to Campus Ambassador training, ideas on what kind of professors can work best on a project of this nature, etc.

  • The Learnings Page is my *very* preliminary draft at collating learnings from the pilot and exploring how we can incorporate these in our way forward.

We will shortly share summaries of all these findings as and when they are ready - and welcome an open discussion on them. Do please share your thoughts and suggestions on the above. Please also do let me know if I've missed out anything. Nitika.t 11:31, 28 November 2011 (UTC)

Yes you have. You did not post it on this page.

This was also posted on the wikimediaindia-l, starts here, thread name is "IEP Pilot - Preliminary Analysis". These discussions reference an early draft document containing the plan for the second iteration of the IEP. MER-C 05:58, 1 December 2011 (UTC)

Rather amusingly someone on the mailing list suggested that a few questions along the lines of "how much effort was spent cleaning up the mess" should be added, only to be told they were being too negative. [10] Hut 8.5 09:36, 1 December 2011 (UTC)
Ram Shankar Yadav is a campus ambassador for the program. MER-C 12:26, 1 December 2011 (UTC)
So, how much donor money has been spent physically flying consultants to and from SF and Pune throughout this program? Danger High voltage! 20:59, 1 December 2011 (UTC)
First or cattle class? (actually you don't wanna know, really) Kudpung กุดผึ้ง (talk) 09:32, 3 December 2011 (UTC)

The discussion continues here, same thread title. The list of people being interviewed is here, see also User talk:Toryread. MER-C 03:17, 3 December 2011 (UTC)

Thanks for posting a link to the interview list, I was just about to put it on my talk page. One modification: I haven't yet posted questions to Wikipedians on the list who prefer email or talk pages as the interview method, so those people will have through the end of this week to provide input on the questions. Toryread (talk) 23:29, 4 December 2011 (UTC)

Here is a list of work I am doing on the Pune Pilot Review:

20 hours in US reviewing talk pages and email list communication regarding IEP.

Interviews (in person, unless otherwise noted):
  • Barry Newstead, WMF SF
  • Frank Schulenburg, WMF SF
  • Annie Lin, WMF SF
  • LiAnna Davis, WMF SF
  • Hisham Mundol, WMF India
  • Nitika Tandon, WMF India
  • Shiju Alex, WMF India
  • Ram Shankar Yadav, CA Pune
  • Ishita Ghosh, professor, SSE, Pune
  • informal conversation with 2 SSE students and 2 CAs over lunch in Pune
  • Rashmi Barua, SSE student
  • Devanchi Tripathi, SSE student and CA
  • 3 members of Pune WP community (one doesn't want his name here, so I'm not naming any of them), group interview over dinner
  • Abhilasha Sharma, SSE student
  • Anushikha Benazur, SSE student
  • Dr. Jyoti Chandiramani, SSE Director
  • 3 more SSE students, informal conversation over lunch
  • Debanjan Bandyopadhyay, CA SSE
  • Radha Misra, professor, SNDT Women's College, Pune
  • Shweta Shinde, student, COEP
  • Gautam Akiwate, student, COEP
  • informal conversation with 3 CAs and 1 student, COEP
  • Dr. Anil V. Sahasrabudhe, Director, COEP
  • Dr. Pradeep Waychal, professor, COEP
  • Kudpung, WP Admin (skype video)
  • Srikanth Lakshmanan, WP editor, OA for IEP (skype audio)
  • Hisham Mundol, WMF India (telephone)
  • Wasim, CA COEP
  • Pratik Lohati, CA COEP
  • Arjun M. K., CA COEP
  • Vaibhav Chandak, CA COEP
  • Prof. Abhijit Sir, professor, COEP
  • Bala Jeyaraman (telephone)
  • Moonriddengirl (telephone)
  • Risker (skype)
  • Ruud (email)
  • Andy Dingley (email)
  • Voceditenore (email)
  • Fluffernutter (email)
  • Danger (skype)
  • MER-C (talk page)
  • Matthiaspaul (email)
  • Cindamuse (skype)
  • Ayush Khanna, Data Analyst, Global Development Program, WMF SF (telephone)

Data gathering closes Monday, 5 December, except I will take the statistical numbers from WMF whenever I get them, and participants via email/talk page will have through end of the week to answer because I'm just getting questions to them later today. My contract includes 20 interviews, and I am already doing many more, because I've determined that the job requires it. This list was developed in consultation with WMF staff, India WP community members, and global WP community members. Toryread (talk) 00:20, 5 December 2011 (UTC)

Sincere apologies. I missed posting it on the en.wikipedia page. I'll make sure that I post updates on both the pages going forward. Nitika.t (talk) 04:05, 5 December 2011 (UTC)

FYI, I'm going completely dark from now through 12 December, 23:00 UTC. No phone, no internet, no computer. I'm looking forward to checking talk pages and beginning my synthesis when I return. Toryread (talk) 18:47, 5 December 2011 (UTC)

Pilot Analysis Plan

(Cross-posting to several pages) I've just created the page Wikipedia:India_Education_Program/Analysis to document our planned analysis of the Pune pilot. We've been collecting ideas in many different places, but we wanted to have one central page where we'll be analyzing the learnings from the Pune pilot over the next few months. We will be using the results of this analysis to plan our next pilot in India, which will be kicking off in mid-2012. We will not be running the India Education Program in the first term of 2012. We are committed to using the next few months to get all the learnings we can out of the analysis, so we can launch a new pilot in six months or so that addresses all of the concerns raised from the Pune pilot.

We do have one major outstanding question in terms of how to analyze the pilot, which is how do we measure the impact of the pilot on the community? I really encourage anyone who has good ideas of how to do data collection around this to contribute to the discussion on talk page. -- LiAnna Davis (WMF) (talk) 22:45, 1 December 2011 (UTC)

I also tried to leave talk page messages for everyone who had posted more than 3 times on this talk page; my apologies if I missed you! -- LiAnna Davis (WMF) (talk) 23:16, 1 December 2011 (UTC)

Good news

Good news is that CorenBot is now back. This will be an enormous help for COPYVIO detection on new pages, but like all bots, it can produce false positives, so anything CorenBot tags must be manually verified using the duplication detector, particularly to check that it is not reporting a site that mirrors Wikipedia content. Many thanks to User:Coren for the negotiations and development to get the bot up and running again.--Kudpung กุดผึ้ง (talk) 03:21, 9 December 2011 (UTC)

Given the rather primitive nature of most of the copyvios these students have produced, it seems simple enough to just run CorenSearchBot across all of the articles and try to pick off the obvious ones. That would probably make things a lot easier. The Blade of the Northern Lights (話して下さい) 19:27, 15 December 2011 (UTC)
I think that Coren has said in the past that it isn't possible to run the bot on existing pages because of the number of false positives from mirrors. Hut 8.5 21:20, 15 December 2011 (UTC)
Couldn't he just code something that searches the text added in the diff against the internet? That wouldn't get false positives from mirrors... Calliopejen1 (talk) 00:34, 16 December 2011 (UTC)
I think that would require a hugely powerful server dedicated to the use of CorenBot. A suggestion would be to ensure that Recent Changes patrollers have the IEP articles on their watchlists and apply the process detailed at WP:NPP. --Kudpung กุดผึ้ง (talk) 03:41, 16 December 2011 (UTC)
I wonder if it would still require too much processing power if it were only to scan large diffs from users we're worried about (maybe future IEP participants and people who have gone through CCIs)... Calliopejen1 (talk) 05:11, 16 December 2011 (UTC)
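To make the "search only the text added in the diff" idea concrete, here is a deliberately naive sketch. It is not how CorenBot or CorenSearchBot works; addedLines is a hypothetical name, the line-by-line comparison is an assumption (a real diff is smarter about moved and edited text), and the step of searching the result against the internet is left out entirely.

```javascript
// Hypothetical helper: return the lines of the new revision that do not
// occur anywhere in the old revision and are long enough to be worth
// searching for. Text that was already in the article (and may therefore
// have been copied by mirrors) is never returned.
function addedLines(oldText, newText, minLength) {
    var seen = new Set(oldText.split("\n").map(function (line) {
        return line.trim();
    }));
    return newText.split("\n")
        .map(function (line) { return line.trim(); })
        .filter(function (line) {
            return line.length >= minLength && !seen.has(line);
        });
}

// Usage with made-up revision text:
var before = "Intro line.\nOld paragraph stays here.";
var after = "Intro line.\nOld paragraph stays here.\nA newly pasted paragraph from a textbook.";
addedLines(before, after, 20).forEach(function (line) {
    console.log("Candidate for a copyvio search: " + line);
});
```

Restricting the check to large additions from a known list of accounts, as suggested above, would amount to raising minLength and only calling the function for those users' edits.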

New MediaWiki extension

See WT:USEP#New MediaWiki extension. MER-C 04:23, 11 December 2011 (UTC)

December Wikipedia Education Program Metrics and Activities Meeting

If you're interested in learning more about the Wikipedia Education Program in action around the globe, join us for the next Metrics and Activities Meeting on Tuesday, December 20 at 16:00 UTC. Please visit outreachwiki:Wikipedia Education Program Metrics and Activities Meeting for instructions on joining and time zone conversions. -- LiAnna Davis (WMF) (talk) 22:28, 12 December 2011 (UTC)

Still editing

We'd better keep an eye on the above page to ensure that none of the newer edits are copyvios (as the CCI may have already marked some students as 'cleaned', any more edits from the students won't fall under scrutiny). ManishEarthTalkStalk 14:41, 26 December 2011 (UTC)

Wikipedia Pune Club

See Grants:AbhiSuryawanshi/Wikipedia Club Pune and Grants talk:AbhiSuryawanshi/Wikipedia Club Pune--Sodabottle (talk) 07:11, 29 December 2011 (UTC)

Wikipedia Club Pune!

Invitation to Join Wikipedia Club Pune!
Dear IEP Followers,

Your contribution(s) and passion towards Wikipedia is amazing!

We made mistakes during India Education Program,Number was high,no experienced editors and many others... India Education Program Team is designing newer version of Program.

Inbetween,Before IEP 2.0, We are planning to launch "Small" Wikipedia Club Pune. With limited 25 Number of students,and with experienced editors/mentors.

I want Wikipedia to be a better place, I want to increase quality content on Wikipedia.But problem is (according to some people) I dont have enough experience, I am ready to accept the fact! I am newbie! But Today's Newbie one day will be experienced editor! And As newbie if I stop doing work, and stop editing Wikipedia, If I stop everything - How I am going to get Experience??

Please allow us to learn from our own mistakes,We want to correct our mistakes. And Wikipedia Club Pune is small effort with 25 students and 5+ mentors.

We are having so much knowledge in India, but thats not available on Wikipedia. If we are not going to update it through our efforts, then who else is going to update that! There are two kind of people, who make fun of failed things, and who suggests improvements to failed things to avoid mistakes, I am glad to see all of yours positive behavior and suggestions to make Wikipedia a better place. All of us are working towards common goal,ways might be different,lets collaborate to have Better Wikipedia.Country with 1.2Billion people deserves better Wikipedia.

As you are experienced editors and veteran Wikipedians, I would love to invite all of you to Wikipedia Club Pune.

We would be grateful to have your guidance and support to make Wikipedia a better place.

Wikipedia Club Pune will be a fun place to edit and collaborate with like-minded editors and contributors. The club will also have exclusive, interesting activities for members.

For more details, please visit

P.S.: I was expecting someone else, someone more experienced, to start efforts to improve Wikipedia, but later I realized everyone is busy, and I assume that if I start, I might get support from experienced editors. It is better to start things than to wait for someone else to start; that's what I believe. As you are experienced and also believe in improving Wikipedia, let's work together. If you are doing similar activities, please do let us know; we would be more than happy to join and help. And if you believe in this and assume good faith, please register your name on Wikipedia Club Pune's discussion page!

Keep Editing, Keep Inspiring! AbhiSuryawanshi (talk) 11:34, 29 December 2011 (UTC)

I've just tagged the Club Pune logo for deletion from Commons. See Commons:Deletion_requests/File:Wikipedia_Club_Pune_Golden_Globe.jpg Andy Dingley (talk) 12:11, 3 January 2012 (UTC)

Indian image plagiarism

It's happening at all levels: Andy Dingley (talk) 12:05, 3 January 2012 (UTC)

Quantitative Analysis now available

Spreading the word: Ayush Khanna, a data analyst for the Foundation, has completed his quantitative analysis of the Pune Pilot. His numbers and conclusions are available at Wikipedia:India_Education_Program/Analysis/Quantitative_Analysis. -- LiAnna Davis (WMF) (talk) 18:57, 19 January 2012 (UTC)

Tory's report now available

Hi all, just wanted to alert you that Tory Read has published her analysis of the Pune Pilot. We want to thank Tory for her generous time -- she went above and beyond what we had asked her to do and interviewed many more people than we'd originally planned so she could get a fuller picture of what happened during the Pune Pilot. Thanks to everyone who took the time to answer Tory's questions as well. We'll be using her report to plan the next phase of the pilot program. -- LiAnna Davis (WMF) (talk) 23:43, 20 January 2012 (UTC)