Wikipedia talk:Flagged revisions/Trial/Archive 1

From Wikipedia, the free encyclopedia

Backlog

No. Absolutely not. Do you have any idea how much of a backlog there is just for dealing with flagged newpages? We don't have anywhere near the resources to deal with something like this for every single revision. DS (talk) 02:36, 14 November 2008 (UTC)

I agree with DragonflySixtyseven. I hear de has over 100,000 revisions awaiting review. To me this seems like a massive extension of WP:AFC, which as everyone says is a giant mess. Also reminiscent of AFC is the "just a trial" idea. Also the granting of surveyor rights is bound to be a mess and to introduce bureaucracy. delldot ∇. 02:47, 14 November 2008 (UTC)
What if revisions were automatically sighted within 12 or 24 hours? See here. (Note also that AFC has improved considerably in recent months; Category:Pending Afc requests contains only 22 pages at the time of writing, all recent.) We used to say the same of rollback, that it would be a mess to assign and would introduce bureaucracy, but in the end it did not. Cenarium Talk 03:47, 14 November 2008 (UTC)
Seems to me like if they're automatically sighted after time you lose the benefit without losing the cost (that people won't get to see their changes right away, it's disempowering or discouraging for new users, new content doesn't show up as fast, etc.). delldot ∇. 04:19, 14 November 2008 (UTC)
Vandalism is reverted very quickly nowadays, so we won't lose the benefit. And new users can still see their edits on the draft pages, and they know they'll be visible very soon. Cenarium Talk 04:34, 14 November 2008 (UTC)
Sorry for not being clear, I meant the benefit of each revision being reviewed. I would think the success of current vandalism fighting would indicate less of a need for flagged revisions. Seems like clear vandalism and edits that clearly meet all WP's criteria will get dealt with fast--it's the well-intentioned but slightly problematic edits that'll languish, like we see at AFC and with newpage patrol. These are of course the vast majority of newbie edits. I'm not sure seeing edits on the draft page will be as encouraging as the current setup for new contributors. delldot ∇. 04:58, 14 November 2008 (UTC)
As you know it, it's unrealistic to think we could review all edits by non-surveyors, we have a total of 1,215,029,157 edits on en, it's about 5 times the number of edits on de. If there's already 100000 unreviewed edits on de, then most of them are probably not uncontroversially inappropriate. Editors of the English Wikipedia will never accept such a situation on en, me included, so it's exactly because edits that are uncontroversially inappropriate are in most cases handled within a few hours, like clearly good ones, but others require a longer time, that a delayed automatic sighting will be extremely helpful, and not hinder the efficiency of SR. While it may be a benefit in reviewing each edit, it's unrealizable and, I think, not the point of SR, but rather preventing readers from seeing vandalism and libel. Other edits must be sighted if not overwritten or reverted for other reasons. Users can still review edits on their watchlist as before. There are also measures we could take to improve the system. For example, the abuse filter automatically filters all actions, we could create a filter that prevents certain edits from being automatically sighted (including surveyor's edits, and in this case, prevent the surveyor who made the edit from sighting it), as an additional protection, and possibly require sysop rights to sight certain edits identified as 'very bad'. For newbies, I think there are ways to reassure them about that, for example with the message displayed when the edit is done. The message displayed at test.wikipedia is "Edits will be incorporated into the stable version once an authorised user reviews them. The draft is shown below. 
1 change awaits review", it needs to be improved and made more friendly (I suppose it's editable in a mediawiki page), telling them that edits will be visible on the article very soon, before a few minutes or hours at most (things we couldn't say without delayed automatic sighting), and that other users may edit in the meantime. We could also make that when an IP has recently edited an article, it is redirected to the draft page for this article (temporarily, as recorded in the computer's cookies). Cenarium Talk 18:17, 14 November 2008 (UTC)
Just to establish a sense of perspective, you are aware that de.wiki has fifty five million edits in total? However, this discussion is misplaced: all discussion over whether or not to enable FlaggedRevisions at all should take place on Wikipedia talk:Flagged revisions. This talk page should be used only for discussing the technical details of the proposed FlaggedRevs trial. Happymelon 18:02, 14 November 2008 (UTC)
Maybe we should transclude this page in Wikipedia talk:Flagged revisions. Cenarium Talk 20:58, 14 November 2008 (UTC)

Don't automatically sight for the trial

I would recommend against automatically sighting after a set length of time for the trial period. If sighted revisions causes a backlog, we would want to be able to gather data on how long revisions remain unsighted, and how big the backlog grows. Perhaps the idea can be revisited a month into the trial, or as a second trial, if this one isn't satisfactory. We would clearly not want to assess too early, as it will take time for editors to be granted privileges and get used to it.

Additional thought: It is possible we might want to implement the feature temporarily (say for the first week) as a stop-gap while editors are requesting and getting used to the sighting privileges. Sχeptomaniacχαιρετε 20:21, 14 November 2008 (UTC)

Since the trial is only on FAs, we won't be able to draw conclusions from it with respect to the size of the backlog. The German experience can tell us however, they have a backlog of about 100000 edits, and we have a volume of edits much more significant. Cenarium Talk 21:11, 14 November 2008 (UTC)
I don't find the number of edits in the backlog to be useful information (in fact, it smells more of FUD than data). It would actually be useful to know the average length of time a revision sits in the backlog, and the upper range. If the average is an hour or two, that's probably not serious, but if it's a few days, then it's worthy of concern. In addition, how many of the German Wikipedians have privileges for sighting, versus the number of active, registered users?
As far as data on backlogs here, I believe data can be acquired, as long as it's reasonably used. A lack of a backlog won't tell us very much, but having any significant one will. There's a lot to be gained from testing this out and seeing what exactly happens, rather than the hysteria I've seen from both sides (one side saying this will destroy WP, the other thinking it will magically fix things). Sχeptomaniacχαιρετε 22:53, 14 November 2008 (UTC)
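The two figures asked for above (the average time a revision waits and the upper range) could be computed along these lines. This is only a minimal sketch: the function name, the input format and the sample timestamps are all hypothetical, and nothing here reads real data from de.wiki or en.wiki. It also measures only revisions that are still pending, so it understates the eventual waiting time.

```python
from datetime import datetime, timedelta
from statistics import mean


def backlog_wait_stats(pending, now):
    """Return (average, longest) wait for revisions still awaiting sighting.

    `pending` is a list of datetimes at which each unsighted revision was
    saved; `now` is the moment the backlog is inspected.
    """
    if not pending:
        return timedelta(0), timedelta(0)
    waits = [now - saved for saved in pending]
    average = timedelta(seconds=mean(w.total_seconds() for w in waits))
    return average, max(waits)


# Hypothetical sample: three unsighted revisions, inspected at noon.
now = datetime(2008, 11, 15, 12, 0)
pending = [
    datetime(2008, 11, 15, 11, 0),  # waiting 1 hour
    datetime(2008, 11, 15, 9, 0),   # waiting 3 hours
    datetime(2008, 11, 13, 12, 0),  # waiting 2 days
]
average, longest = backlog_wait_stats(pending, now)
print(average, longest)
```

A single long-waiting revision pulls the average up sharply, which is why the upper range is worth reporting alongside it.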
That would be useful of course, I suppose we should ask German surveyors. I don't have the source for the 100000 edits in the backlog, but if this is indeed true, no matter how terrifying it looks, it should be taken into consideration for a full-scale implementation (and it means some edits haven't been reviewed for days or weeks). As you say, there are two extreme sides, delaying sighting would be a fair middle point. And as I said, it would also be far more adaptable, allow specific behaviors of filters w.r.t. SR, etc. Having randomized on de, I found edits needing review from Nov. 4 and more recent, for example de:Der Ball ist rund, de:MRT (Taipei), de:Leewellen and de:Stefan Jedele. I think the threshold for surveyor rights on de is too high (see de:special:listgrouprights), for example surveyors have rollback rights, while many users with otherwise good edits had it removed on en for misuse, edit warring, etc. Rollback requires surveyor rights for compatibility, but the inverse is not true, so it's an unneeded additional constraint. Cenarium Talk 23:39, 14 November 2008 (UTC)
Well, it's a question of quality. A de.WP surveyor leaves a version unsighted if he isn't sure. We have discussed the backlog problem and created one, two, three different flagged-revisions projects to limit the particular problems of the different phases. A sighted article with two-week-old unsighted versions is possible. Automatic sighting will diminish the quality effect of this feature. Best regards --Jan eissfeldt (talk) 02:44, 15 November 2008 (UTC)

If 99% of simple vandalism is removed within an hour or less (I'm making those numbers up, but for this example that's OK, and I think they are not far off the mark, maybe even conservative?) then if a revision gets automatically sighted after 2 hours, then over 99% of simple vandalism will never be seen by the general unlogged in public. That improves the apparent quality of pages, and disincents simple vandalism. Those are both good things. So I support automatic sighting. That doesn't mean sighted edits can't be reviewed and un-sighted later. It does not make things perfect, and it does not defend against POV pushing, or subtle vandalism, or other things, but it vastly reduces the probability of seeing YOU SUCK! for those who are not editors here. Which is our target market after all, not us. ++Lar: t/c 16:57, 15 November 2008 (UTC)
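The arithmetic behind this argument can be sketched as follows. It is an illustration under stated assumptions, not a description of anything the FlaggedRevs extension does: the function name and the delay figures are made up, and the model assumes a vandal edit reaches logged-out readers only if it is still unreverted when the automatic-sighting delay expires.

```python
def fraction_exposed(revert_delays_minutes, auto_sight_after_minutes):
    """Share of vandal edits that would reach logged-out readers.

    An edit counts as exposed only if it took longer to revert than the
    automatic-sighting delay.
    """
    if not revert_delays_minutes:
        return 0.0
    exposed = sum(
        1 for delay in revert_delays_minutes if delay > auto_sight_after_minutes
    )
    return exposed / len(revert_delays_minutes)


# Invented delays: minutes until each of ten vandal edits was reverted.
delays = [1, 2, 2, 5, 8, 15, 30, 45, 90, 300]
print(fraction_exposed(delays, 120))
```

With a two-hour delay only the one edit that took 300 minutes to revert would be shown, so the exposed share depends entirely on how long the slowest reverts take.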

Sorry, but how will the automatic sighting after a period of time be realized technically? I studied FlaggedRevs.php, but failed to find any parameter related to it. Of course, I may be missing something. Ruslik (talk) 17:38, 15 November 2008 (UTC)
Now that's a good question. I had assumed that the code already allowed this but if it does not, then a new bugzilla bug would be needed. ++Lar: t/c 01:15, 16 November 2008 (UTC)
To my knowledge it doesn't exist, I made this up. It seems to be realizable, but there are a few technical suitabilities to resolve. Cenarium Talk 17:30, 16 November 2008 (UTC)
Well, I created Wikipedia:Flagged_revisions/Trial/php with detailed technical proposal. Only features that are currently available in FlaggedRevs.php were used. I think it is better to focus the discussion on which variable should be set 'true' and which 'false' instead of abstract concepts. Ruslik (talk) 18:36, 16 November 2008 (UTC)
OK, yes, in that case I agree. Let's stick to which switches and dials we actually have available. ++Lar: t/c 18:42, 16 November 2008 (UTC)
I think we only have a consensus for sighted revisions (others have been vehemently opposed). There should only be two settings: unreviewed and sighted, and only one additional usergroup, surveyor. Bureaucrats only should be able to configure a page. About granting surveyor rights, I'm still undecided: admins or bureaucrats only ? No autopromote to surveyor settings, as that has been too little discussed and too much opposed, and it could be used for another user group, larger than surveyor. No need to make things more complicated at this time. We still need to talk about the full implementation in parallel and how to resolve a number of issues, but, I agree, not here. Cenarium Talk 18:50, 16 November 2008 (UTC)
I want to avoid overburdening bureaucrats. I am proposing to have a special user group responsible for the trials ('reviewers', the name being not so important). Ruslik (talk) 19:01, 16 November 2008 (UTC)
Someone above mentioned the time required for a "Sichtung" (if that's the correct term) on the German Wikipedia. It CAN take quite a while (if it ever gets done!). I edited Joseph Sheridan Le Fanu on 5 September. It's still waiting for a Sichtung. The article was created years ago and it has "Keine Version Gesichtet" at the top. I created Cappella Sansevero on 17 September and have made a number of edits since then; it is still waiting for its first Sichtung. My edits of Henry Williamson and County Dublin from 12 November are still waiting (and County Dublin, which has been around for several years, also has the sad news: Keine Version Gesichtet).
That's not the only problem with the German Wikipedia, but it's a nuisance.
BTW, I've also contributed to the Italian and Spanish Wikipedias and never had a problem. Hohenloh + 21:49, 28 November 2008 (UTC)

Minimalist proposal

I updated Wikipedia:Flagged_revisions/Trial/php. Now it is as minimalist as it can be in principle. I think now it can be !voted up or down. Ruslik (talk) 09:33, 17 November 2008 (UTC)

It's not too bad, but I disagree with the "one binary scale called accuracy". My justification for this is the lack of support flagged revisions has so far received as an accuracy-maintaining system; consensus, whether rightly or wrongly, seems to support flagged revisions only as a method of countering vandalism. Therefore, I'd propose a single binary scale named Sighted or similar, which can be set to "yes" or, by default, "no". – Thomas H. Larsen 01:27, 19 November 2008 (UTC)
I renamed two levels as sighted and unsighted. As to accuracy, I simply can not find a better word. Sighted is not a noun, and can not be used as the name of scale, in my opinion. The name of the scale is not so important. To avoid confusion I removed the word accuracy from the description. Ruslik (talk) 09:08, 19 November 2008 (UTC)

Who is going to do this?

We currently have 2,306 featured articles. Who is going to go around with the reviewer right and make every one of these pages flaggable and then undo all of this in two months? Zginder 2008-11-19T17:12Z (UTC)

Short answer, a bot will. That is a trivially-sized task. Happymelon 18:18, 19 November 2008 (UTC)

Metrics

I am very much in favour of sighted/flagged revisions, but one aspect of this trial concerns me. How can we have a trial without having some metric for whether or not it has been successful? Otherwise all trials will dissolve into a qualitative argument about the efficacy. What I'd propose is that when a trial is agreed on, the articles selected are first reviewed for a fixed period of time (or the history is assessed) so that we can say what the last X weeks have brought in the form of vandalism (which is currently automatically visible), and compare that to the amount of vandalism made visible (through human error, presumably) during the trial. Then there is a reasonable metric of performance. Beware also bias in the sample: if, for instance, we chose FAs as the class for testing, I imagine that the TFAs might be skewed, which means that the metric should be measured over all FAs, since it would encompass the TFA effect. Fritzpoll (talk) 11:18, 28 November 2008 (UTC)

While I agree with the principle of setting out what we should be monitoring over the trial period, anyone who believes that there is some magic number by which its "success" or "failure" can be judged is unduly optimistic. While the very thought of there being a way to quantitatively measure the efficacy of FlaggedRevs (or indeed any other process on-wiki) is itself laughable, even if there were a magic formula to objectify the process, how on earth are we supposed to decide what value represents "success"? Like everything else on-wiki, FlaggedRevs is a consensus-building process; our main problem from the start has been an inclination to view it erroneously as a matrix of binary choices. There is no simple straight line on one side of which lies "success" and on the other "failure"; there is a continuum whereby each contributor to the future discussion will evaluate the trial results and draw their own conclusions. Some will change their opinions based on the new evidence, some will not, and of course change can and will occur in both directions. The important thing about a trial is that it gives people hard evidence on which to refine and reevaluate their opinions and arguments. So while I fully anticipate reams of statistics and synopses being extracted from the raw trial data, I think it is incorrect to assume that we can or should all interpret those data in the same way. Happymelon 17:06, 28 November 2008 (UTC)
My point was more that the data should be available for analysis, otherwise we'll end up no better off after the trial than before it. Fritzpoll (talk) 10:40, 30 November 2008 (UTC)
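One way to make the before/after comparison proposed in this thread concrete is to normalise visible vandalism by the number of articles and the length of the observation window. The sketch below is hypothetical throughout: the `Period` record, its field names and every number are invented for illustration, and it produces data for discussion rather than a verdict of "success" or "failure".

```python
from dataclasses import dataclass


@dataclass
class Period:
    label: str
    articles: int              # number of articles observed
    weeks: float               # length of the observation window
    visible_vandal_edits: int  # vandal edits logged-out readers could see


def rate_per_article_week(period):
    """Visible vandal edits per article per week for one period."""
    return period.visible_vandal_edits / (period.articles * period.weeks)


# Invented figures: the same 100 articles, four weeks before and during a trial.
baseline = Period("before trial", articles=100, weeks=4, visible_vandal_edits=200)
trial = Period("during trial", articles=100, weeks=4, visible_vandal_edits=20)

for period in (baseline, trial):
    print(period.label, rate_per_article_week(period))
```

Measuring both periods over the same set of articles (for example all FAs rather than only TFAs) keeps the sample bias mentioned above out of the comparison.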

Name of usergroup

'Editor' doesn't make sense, all users are editors, and it doesn't describe the role at all (it also implies that others are not editors, huh). 'Sighter' is more explicit but it is too connoted to warfare. Surveyor has been used quite regularly, and if no better name is found, it sounds like an acceptable choice. Cenarium Talk 15:50, 29 November 2008 (UTC)

I agree that the implicit declaration of other users as non-editors is at best unhelpful. How about using 'reviewer' where we currently use 'editor' and 'surveyor' where we currently use 'reviewer'?? That seems more in keeping with the latter's role of oversight and moderation, while the former is responsible for actually reviewing pages. Happymelon 16:00, 29 November 2008 (UTC)
Sounds acceptable, for the trial at least. Cenarium Talk 23:14, 29 November 2008 (UTC)

Some comments

  1. Link to most recent version: In this trial, will there be prominent links to allow logged-out users to easily view the most-recent version if they wish, and to allow logged-in users to easily view the most-recent sighted version (or a diff of the two)? The wording of this trial page makes it sound as if maybe logged-out users won't be allowed to view the most-recent version, etc. I was going to edit in the phrase "by default", but that could be taken to mean something else, so I couldn't think of a good wording.
  2. Symbols in Recent Changes and on watchlists: if people can easily see which edits are unsighted, that will help a lot, I think.
  3. Wide level of trust: if people are worried about a big backlog, then give reviewer privileges to almost everybody. It will still catch vandalism by new users, i.e. most vandalism, I think.
  4. Nomenclature: we could label the link to the most recent sighted version "current", and to the newest (potentially unsighted) version "newest". This avoids implying that the "current" version is officially approved by highly trusted reviewers; in an optimally balanced system, some vandalism will still slip through. It also gives a more encouraging name than "draft" to the version someone has just edited. Um, by the way, if automatic flagging occurs, "sighted" wouldn't be accurate.
  5. Automatic flagging: if the software allows it, we could have three categories: 0, new edits; 1, automatically flagged after a period of time (5 minutes? 24 hours?); and 2, sighted. The most recent version with a non-zero flag would be displayed by default to non-logged-in users, but the difference in flags would still (if the software allows it) be evident in RC and watchlists to alert people that the edit may still need to be checked for vandalism. Coppertwig(talk) 21:08, 29 November 2008 (UTC)
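The three-category idea in point 5 can be sketched as below. This is a hedged illustration only: the `Flag` and `Revision` types, the function names and the 60-minute delay are assumptions made for the example, and the discussion on this page indicates the extension did not offer automatic flagging at the time.

```python
from dataclasses import dataclass
from enum import IntEnum


class Flag(IntEnum):
    NEW = 0           # not yet checked by anyone
    AUTO_FLAGGED = 1  # promoted automatically after a waiting period
    SIGHTED = 2       # checked by a user with sighting rights


@dataclass
class Revision:
    rev_id: int
    age_minutes: int
    flag: Flag = Flag.NEW


def apply_auto_flag(history, delay_minutes):
    """Promote NEW revisions that have waited at least `delay_minutes`."""
    for rev in history:
        if rev.flag == Flag.NEW and rev.age_minutes >= delay_minutes:
            rev.flag = Flag.AUTO_FLAGGED


def default_for_logged_out(history):
    """Most recent revision with a non-zero flag, or None if there is none.

    `history` is ordered oldest first.
    """
    for rev in reversed(history):
        if rev.flag != Flag.NEW:
            return rev
    return None


# Hypothetical page history, oldest first.
history = [
    Revision(1, age_minutes=3000, flag=Flag.SIGHTED),
    Revision(2, age_minutes=90),
    Revision(3, age_minutes=2),
]
apply_auto_flag(history, delay_minutes=60)
shown = default_for_logged_out(history)
print(shown.rev_id, shown.flag.name)
```

Keeping the automatic flag distinct from a manual sighting means recent changes and watchlists could still mark revision 2 as needing a human check even though it is already displayed.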
I think 1,2,3 and 4 are good ideas. Especially 2! That would help with up keep, making it obvious which edits to double check and sight. --Falcorian (talk) 00:26, 30 November 2008 (UTC)
Some editors have proposed that we don't use 'delayed sighting' or similar systems, like expired revisions, for the trial, to see how large the backlog will become. In any case, it is likely that the extension won't have been updated to include this kind of system by the time of the trial. I also think we need to work out ways to make this appear less like an approval process; it is not one, but rather a manual and possibly partially automatic verification process. As I said in the #backlog section, the post-edit message needs to be completely revamped. Unsighted edits have an exclamation mark in the default extension in recentchanges, watchlists... For the 'classification' of edits, we may use 'new', 'expired' and 'sighted'. On granting reviewer/surveyor rights, we should be careful in the trial since we won't know how the usergroup will differ from the trial version. Cenarium Talk 02:01, 30 November 2008 (UTC)
The right that's currently called 'reviewer' is totally different to the 'editor' right. In fact reviewers can't actually sight revisions! They have to be editors as well. Reviewers are solely responsible for organising and implementing trials, as such, they must be a very restricted group. I think you've misunderstood the permissions (or, equally confusingly, you've started using the new terminology we suggested above :S). For the other points, I suggest that you investigate the setup at http://en.labs.wikimedia.org where the full FlaggedRevs demo is running. Most pages display the current version by default; I've got sysop rights there so I've set Tablature to display the stable version. You can see the answers to your points 1 and 2 there. I'm not sure I follow your argument against the use of "draft"; how is it inaccurate? Since we're not using automatic sighting for this trial at least, I don't think your points 4b and 5 are valid. I agree that automatic sighting is a very useful concept that we might well consider using, but we seem to be getting a little caught up in a feature that's not actually available yet! Happymelon 09:55, 30 November 2008 (UTC)


RfC on implementation

The proposal on the attached project page represents the culmination of a very lengthy discussion and development process for implementing FlaggedRevisions on en.wiki. A clear consensus is required to present to the developers to have the extension installed. Ultimately that consensus will need to take the form of a straw poll, however we are not yet at that stage. This RfC initially seeks external input on the proposal, the opinions, comments and suggestions of editors not already involved with the development process. As such, please (for now) avoid simple expressions of "support" and "oppose" and instead present arguments and comments specifically connected to this proposal. Happymelon 17:56, 14 December 2008 (UTC)


I think this kind of trial is a good idea, since without it, we'll never have the evidence for an informed discussion of the implementation of the extension and its effect on Wikipedia. That said, I think that the current proposal does not say enough about how limited the trial will be. I may well have read it incorrectly, but it seems as if surveyors will be able to mark pages flagged indiscriminately during the trial. Is this the intention, or is there meant to be some limitation in scope? Fritzpoll (talk) 11:36, 15 December 2008 (UTC)
Technically they can mark any page in the main space or portal space. However their actions, of course, will be limited by the relevant policies by the mandate that the community will give them. One of the proposed trials is to enable Flagged Revisions over featured articles and portals (see here). This proposal is just a technical framework: specific trials should be discussed separately. Ruslik (talk) 12:20, 15 December 2008 (UTC)
Sounds fine then - I can't see anything wrong with the proposal as it stands. Fritzpoll (talk) 12:39, 15 December 2008 (UTC)
Fritzpoll, you haven't done any newpage patrol at all. You have no idea what sort of backlog this sort of thing creates. Even if we just limit it to cursory skims of newpages, we wind up with, literally, weeks of backlog, and almost no one helping. DS (talk) 04:18, 20 December 2008 (UTC)
I can assure you that I have done newpage patrol. Just not in the past 2-3000 or so entries in my logs thanks to a couple of lengthy runs at Huggle and spells of blocking/unblocking. Unless you viewed the entire log you wouldn't know this, which is fair - of course, it means there was no basis to your assertion either, which is not as fair :) I am fully aware of the never ending backlogs in this place, but there are ways around it, the most obvious of which is not to implement it over every single article, just over some of our more vulnerable ones. Best wishes, Fritzpoll (talk) 08:32, 20 December 2008 (UTC)
Ah, my apologies - I misread your log. My overall point stands, though: in the year since the implementation of the newpage patrol flag, which is analogous to the revision patrol flag, thousands and thousands of articles have gone unpatrolled. DS (talk) 14:44, 20 December 2008 (UTC)
An expiration system, automatically showing revisions that are old enough to IPs, at least for non-blps, would eliminate the harm caused by backlogs and wouldn't decrease significantly the efficiency of flaggedrevs. Cenarium (Talk) 15:37, 20 December 2008 (UTC)

I'm looking at the broader scope here and a "trial" will likely lead to a full implementation that I don't believe we would be able to handle. Some of the smaller wikis have huge backlogs of revisions waiting to be sighted. I'd very much like for this to be able to work but I don't feel it will, thus I oppose any implementation of Flagged revisions at this time, whether it be limited implementation as this proposal suggests or otherwise. - Rjd0060 (talk) 15:45, 15 December 2008 (UTC)

That is indeed a perennial objection to the concept of FlaggedRevisions, but it is based on assumptions: while those assumptions are sensible, and may apply to en.wiki, you simply cannot say so with certainty until we have tried. The "don't give an inch or they'll take a mile" stance is not really applicable here because taking the second inch (having the implementation extended across the whole wiki) is no easier than taking the first (having FlaggedRevs installed at all). Both require developer intervention and so cannot occur without consensus. So I don't think you're being fair to say that "a trial will likely lead to a full implementation" as if that were an unequivocally bad thing. The trial will lead to an implementation only if we conclude that that is the right thing to do. If, as you believe, we are unable to maintain FlaggedRevs on en.wiki, then it will be impossible for us to attain consensus for its deployment. Having had a trial period only puts us in a better position to make that choice; it does not encourage one or other outcome. Happymelon 15:51, 15 December 2008 (UTC)
In my opinion, it would be a bad thing actually, and for the reasons I already specified. I've seen enough evidence to reasonably conclude that we would have unmanageable backlogs and don't like the idea of gambling with it. Of course this is only an assumption, just as your optimism is an assumption (and everybody who has an opinion one way or another is assuming). Have you researched/been following the other wikis that have implemented it as I have? - Rjd0060 (talk) 15:55, 15 December 2008 (UTC)
I certainly have, and I agree that it is a serious concern, such that I would probably have opposed most of the wider implementations that have been proposed previously without evidence to the contrary. I maintain that we cannot be certain either way without such evidence. But the point at which we need to make the ultimate decision "do we think we can handle a full FlaggedRevisions deployment?" is not being made with this proposal; with this system it will be delayed until we have gathered enough evidence to say whether it is viable. When that day comes, we'll be in a much better position than we are now to answer that question correctly. Without this trial implementation, however, we will continue to be entirely in the dark as to the true answer, and so any other implementation would be a complete leap of faith. If, as you believe, any full implementation will be a failure, we will be able to see that from the trials. So if you are correct, those trials will strengthen your position, not weaken it. There is no "gamble" as you claim, because this implementation is not at 'risk' of turning into a full deployment. You are right about one thing: everybody who has an opinion is assuming things. Why don't we take the opportunity to replace those assumptions with facts? Happymelon 16:08, 15 December 2008 (UTC)
It is perfectly reasonable to look at other wikis who have implemented this to see what issues they have faced. I've done so and made my opinion here and nothing at this point is really going to change my opinion unless they say that the 10,000++ revision backlog was a software glitch. I'm comfortable with the amount of "research" I've done on this and comfortable with the conclusions I've drawn. But, there are other people here :-). - Rjd0060 (talk) 16:13, 15 December 2008 (UTC)
That's certainly an opinion you're entitled to, and I agree that there is some evidence to support it. If it is true, then these trials will only provide more such evidence, and convince more people that FlaggedRevisions is unworkable. So why do you oppose them? :D Happymelon 16:25, 15 December 2008 (UTC)
In order to limit or avoid backlogs, I've proposed to automatically 'sight' (more technically, expire) edits after a certain period (for example, 6 hours), with various ways to adapt the system, so that backlogs shouldn't be a problem.
Cenarium (Talk) 19:03, 15 December 2008 (UTC)
To me the backlog thing is not a big deal. I'd like to see flagged revs for the simple reason that it allows you to keep track of what the latest stable version of the page was. An example of where this was an issue is the recent vandalism to the Spike Video Game Awards: yesterday there were over 200 vandal edits to the page, and it became very easy to lose track of what the good version was. Even if we used flagged revs for nothing but that, it'd be useful. We can't really be recognized as a credible source until there is an easy way for users to go back to a reviewed version of a page, ideally a reviewed version that has been thoroughly checked above what would be considered sighted. The only practical way to do this is with something like flagged revs. --Nn123645 (talk) 05:15, 16 December 2008 (UTC)

FlaggedRevs is fairly technical, but is the proposal to basically digg individual edits? I take it FlaggedRevs is already being tested on en.labs.wikimedia.org. So only a bureaucrat can make/unmake someone a 'surveyor'? And a 'surveyor' can turn on FlaggedRevisions for a page? All admins and rollbackers automatically become 'reviewers'? Admins can make/unmake people 'reviewers'? And a 'reviewer' looks at a page and labels it 'sighted' or 'unsighted'? A logged out user will see only the most recent 'sighted' version? A logged in user will see the most recent version? What button do I push to send an edit down the memory hole? Joking aside, is Special:OldReviewedpages going to get extremely large? I suppose that's one thing the trial is meant to find out. Are there ideas floating around about the potential length of the trial and how many articles on en.wiki would be involved? Would the first trial just be about 'sighted'/'unsighted' revisions? Or are there ideas for a range of ratings? --Pixelface (talk) 18:22, 15 December 2008 (UTC)

The proposed trial will be only about 'sighted'/'unsighted' revisions. Any expansion of the rating scale will require developer intervention. However I am open to consider a more complicated scale if the community deems it necessary. The potential length of the trial? I can only express my personal opinion here: 6-12 months. The first trials will involve only a few thousands articles, so Special:OldReviewedpages is unlikely to grow extremely large. Ruslik (talk) 20:26, 15 December 2008 (UTC)
You have the right general idea, certainly. As you point out, many of your questions are best answered by a few well-planned trials! Happymelon 21:03, 15 December 2008 (UTC)
If an editor makes an edit to an article, and the editor is a 'reviewer', can they mark their own edit 'sighted' or does it have to be a different editor who marks it 'sighted'? --Pixelface (talk) 01:46, 16 December 2008 (UTC)
In the proposed implementation the edits of reviewers are marked sighted automatically. Of course, this function can be disabled if there are strong objections against it. Ruslik (talk) 06:16, 16 December 2008 (UTC)

Since immediately implementing flagged revisions on every single article is most likely going to lead to backlogs, I'd support a trial on featured articles (which are the articles most in need of retaining quality) and heavily vandalised articles (perhaps as an alternative to semi-protection). The growth of the English Wikipedia has already caused cleanup and referencing backlogs, so sighting backlogs are very likely to occur if we don't work to prevent them. - Mgm|(talk) 18:47, 15 December 2008 (UTC)

New Trial idea: A trial for featured articles will not show the full extent of the backlog. We have a few thousand featured articles at the moment. If Flagged Revisions goes live, there will be a great interest in it for perhaps the first few months, which will likely be the duration of the trial. But the problem is, as we increase Flagged Revisions to encompass greater and greater numbers of articles (perhaps first to A-class, then to GA), interest in going through the backlog will likely die off.

What I would support would be a removal of the semi-protection system and an addition of this to all currently semi'd articles. That would allow IPs to make constructive edits, something that requires the use of an obscure template, {{edit-semiprotected}}, right now. If we do this instead, we will have increased our net beneficial edits and we will still have the trial you all seem to want. - NuclearWarfare contact meMy work 20:40, 15 December 2008 (UTC)

  • Argh where did the bold words come from? However, you are aware that the proposal you're actually discussing is just the technical configuration, not any particular trial? Implementing your proposal would require exactly the same implementation as is proposed here. We removed the specific details of the trials for precisely that reason: no matter how we trial, we still need the technical ability to trial. Why don't you move this suggestion to Wikipedia talk:Flagged revisions/Trial/Proposed_trials? Happymelon 20:48, 15 December 2008 (UTC)
    • Well, it seemed like that's what all of you were discussing. My apologies for not reading fully into each comment. And yes, I would support a decision to trial, as per my last post.
    • I'm confused now. The objective of this discussion is to figure out whether or not to start a trial of some form? - NuclearWarfare contact meMy work 01:51, 16 December 2008 (UTC)
      You are right: the objective is whether to start a trial of some form. However, for this to happen we need to agree on a configuration of Flagged revisions that is necessary for trials to start. This configuration should be of limited nature, but simultaneously flexible enough to enable fine tuning during the trial period. Ruslik (talk) 07:58, 16 December 2008 (UTC)
  • As of right now, Category:Featured articles has 2,336 articles and Category:Semi-protected has 2,058 pages (1,873 articles I think). There are 5,601 GAs according to GimmeBot (although I count 5,610 currently) and 1,152 featured lists (although I count 1,148). FAs/FLs/GAs is 9,089, and plus semi-protected articles (ignoring any overlap) is 10,962. I think 123 FAs are semi-protected, I think 5 FLs are semi-protected, and I think 155 GAs are semi-protected. So I think there are about 10,679 articles currently on en.wiki that are FA or GA or FL or semi-protected. --Pixelface (talk) 03:15, 16 December 2008 (UTC)
  • I don't have the energy to dig through all the archives on this, but I believe that with the sighted revisions model a "backlog" isn't really a concern -- if there are no sighted versions, the gentle reader simply sees the latest version, as it is now. So nothing changes, except that we could ensure quality on articles that need it, perhaps focusing on particularly needy articles: FAs, BLPs, semi-protected articles. Given that, I support this trial plan (but then, I support turning flaggedrevs on for good). (Which is to say: I support this proposal. I got lost with all the different trial proposals.) -- phoebe / (talk to me) 07:39, 16 December 2008 (UTC)
    • Yeah and if this is just about what unlogged-in readers see, I suppose a 'reviewer' would not need to 'sight' each new edit, they could just 'sight' the newest revision they find without a ton of vandalism on it. --Pixelface (talk) 19:44, 16 December 2008 (UTC)
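
The tally in Pixelface's comment above can be checked with a few lines of Python. This is only a sketch: the figures are the ones quoted in that comment, the variable names are illustrative, and overlap among FAs, FLs and GAs themselves is ignored, as it is in the comment.

```python
# Counts quoted in the discussion (16 December 2008).
featured_articles = 2336
good_articles = 5601            # GimmeBot's figure
featured_lists = 1152
semi_protected_articles = 1873  # "1,873 articles I think"

# Semi-protected pages that are also an FA, FL or GA (the comment's estimates).
overlap = {"FA": 123, "FL": 5, "GA": 155}

rated = featured_articles + good_articles + featured_lists
with_semi = rated + semi_protected_articles      # ignoring any overlap
distinct = with_semi - sum(overlap.values())     # overlap counted once

print(rated, with_semi, distinct)  # 9089 10962 10679
```

The three printed totals match the 9,089, 10,962 and "about 10,679" given in the comment.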

I'm just brainstorming here. On Wikipedia there are currently edit wars (reverts to articles), and "wheel wars" (admins undoing each other's actions, sometimes on articles) and I think with FlaggedRevs there's a possibility of "sight wars": one 'reviewer' marking their preferred version of an article as 'sighted' and another 'reviewer' marking it 'unsighted'.

  • So what do we do when a "sight war" happens?
  • Can each revision only be 'sighted' or 'unsighted' one time?
  • Can the value be changed many many times?
  • Does the newest 'sighted' revision always take precedence over an older 'sighted' revision?
  • Or could the value be cumulative, and the number of 'reviewers' that mark a revision as 'sighted' would matter?
  • If admins give people the ability to be a 'reviewer', when should that ability be taken away?
  • Also, if a 'surveyor' enables Flagged Revisions on an article, and an IP edits the article, yet does not see their change immediately on the page, are the talk pages going to be full of IPs saying "Why doesn't the article show the change I made?"
  • Does there need to be a note on 'surveyed' articles saying "Only the most recent sighted version is visible."?

And I don't know if the words 'sighted' and 'unsighted' are the best words; they might confuse people.

  • How does one decide whether to mark a revision 'sighted' or 'unsighted'?
  • And I guess reader ratings (shown with Special:RatingHistory) will not be in the first trial?
  • Can Special:OldReviewedpages show the articles with the most edits to them since the last 'sighted' version?

Just some things to think about. --Pixelface (talk) 20:53, 16 December 2008 (UTC)

Thanks for your comments. Some of your questions have easy answers, others are more thought-provoking. An edit can only be sighted once: once marked as sighted, that specific edit cannot be 'unsighted'. The only way to reverse the effect of the edit is, as usual, to revert the edit (and mark the reversion as sighted). So I don't think we have to worry about "sight wars" as you call them, as they will just be normal edit wars. There is an issue, however, in that edit wars can now become unbalanced if one editor is a reviewer and the other isn't. I would think that edit warring in this situation would be dealt with severely, possibly resulting in loss of reviewer rights. Newest sighted revisions always take priority, just like more recent edits at the moment. Only one reviewer can sight an edit, it cannot be 'multiply sighted'. When to give and revoke reviewer status is a key question to answer in the trial period, as is when to sight a revision. Check out http://en.labs.wikimedia.org/wiki/Tablature for an example of a page that is currently displaying behavior similar to what the trial pages will show; you can check out what warnings and messages will be displayed for yourself. Happymelon 21:10, 16 December 2008 (UTC)
Thanks a lot for your answers. I looked at the Tablature page, and it was quite helpful. Looking at the history, I see a "Visual comparison" button and a "Wikitext comparison" button. Does that come with the FlaggedRevs extension? I also see "sighted" and "validated" next to edit summaries. What's the difference? Is 'validated' used to mark a "quality" version and 'sighted' to mark a "stable" version? --Pixelface (talk) 22:48, 16 December 2008 (UTC)
Well, this is not quite right. A sighted page can be unsighted (depreciated), see an example here. However, any revision that does not contain vandalism, libel or a copyright violation must be sighted. If one reviewer overlooked something, the revision may be unsighted by another. Flagged revisions should not, though, be used to resolve content disputes. Ruslik (talk) 04:57, 17 December 2008 (UTC)
The VisualDiff system is an experimental parser stage, not part of FlaggedRevs; although hopefully it will be coming soon to live wikis. You're quite right about 'sighted' vs 'validated'; the en.labs implementation uses three 'grades' of sighting, not two - sighted overrides unsighted, and validated overrides sighted, if the page is so configured. This initial implementation here does not use "quality" revisions; it's something to think about at a later stage. Thanks for the info on 'unsighting', Ruslik, I wasn't aware that that was the case. However, I maintain that 'unsighting' an edit should not be condoned; if there is a legitimate reason why the edit should not be visible, then the correct response is to remove that problem and sight the clean version. I hope it's not a problem we're going to encounter; of course, you never know till you try. Happymelon 13:12, 17 December 2008 (UTC)
What happens if a vandal becomes a 'reviewer' and all their edits are automatically marked 'sighted'? What should happen to the person who made the vandal a 'reviewer'? What happens if a well-meaning 'reviewer' marks several revisions 'sighted' but did not catch the vandalism on the page and their 'reviewer' abilities are questioned? I really think a plan (when to sight, when to unsight, when and who gives reviewer status, when and who revokes reviewer status) needs to be in place before the trial starts. --Pixelface (talk) 21:10, 18 December 2008 (UTC)
I couldn't agree more; these are things that the 'crats will want to see clear guidelines for before allowing a trial to proceed. However, they are entirely distinct from the technical issues considered by this proposal and so, I hope, the fact that we haven't yet decided on such things as these should not prevent this stage of the process from going forward. Most importantly, these are things that might well change as we become more familiar with FlaggedRevs and what we can and can't expect from it, so it would be highly inappropriate to prescribe them rigidly at this time. In response to your general question, I'd say the situation is very much analogous to incorrect assignment of rollback, as the two permissions are on a similar level of sensitivity. Happymelon 21:55, 18 December 2008 (UTC)
  • The answer seems to be that we should give this a go; in a temporary, monitored, and clean fashion. If we never try it we won't know if it doesn't work here. Never trying seems to be keeping us in the dark for a function that could end up being useful. §hep¡Talk to me! 21:29, 17 December 2008 (UTC)
  • We have so many backlogs that people don't deal with as it is. Look at Newpage patrol, I know few users that do that. This looks like it would do more harm than good in general to the project. Wizardman 04:17, 20 December 2008 (UTC)
    However, using an expiration system to show old enough revisions to IPs, at least for non-blps, would solve this problem, and still prevent the quasi-totality of vandalism and other disruption from being seen. Cenarium (Talk) 15:37, 20 December 2008 (UTC)
  • I strongly oppose any possible implementation of this feature in any way. This takes away much of the purpose of the site, which is to allow anyone to contribute. I can already imagine the huge backlogs. If implemented, I could look up an article about a building that caught on fire five days ago and burnt down. I would see nothing about the fire, and the article would say the building is still standing. It could take quite a while for someone to get around to "flagging" that revision. Plus, more and more tasks to do. More and more backlogs. I do not like it. DavidWS (contribs) 03:02, 22 December 2008 (UTC)
    Your opinion on FlaggedRevs generally is, of course, one you're entitled to, and it is not that uncommon amongst en.wiki users. However it is currently based on nothing more than, as you yourself put it, your imagination. Can you prove that five-day backlogs would be commonplace? We can, with this trial. If you are in fact correct and FlaggedRevs is unworkable on en.wiki, these trials will point that out very clearly. Then you will have actual evidence to support your position. In that situation, having had a trial will make it harder, not easier, for FlaggedRevs to be deployed more widely on en.wiki. Happymelon 11:50, 22 December 2008 (UTC)
  • Can you prove that there are any benefits of flagged revisions? No trial before people have been allowed to state their opinions please. (i.e. not after this RFC) --Apoc2400 (talk) 15:33, 27 December 2008 (UTC)
  • Can the benefits of FlaggedRevs be proven? Yes, by conducting one or more small, controlled trials and seeing tangible benefits. Can you prove that the "huge backlogs" and "[loss] of the purpose of the site" will come to pass? Yes, by conducting the same trials and seeing the tangible downsides. Opinions are great and the whole point of this discussion (and the megabytes that have preceded it) is to obtain as many of them as possible. Evidence for those opinions is even more valuable. I couldn't agree with you more: "no trial before people have been allowed to state their opinions". If you take even the quickest skim through the archives linked to at the top of this page, you will see that we have just about every opinion on the spectrum. My contention is that we should have "no deployment without trials", for exactly the same reasons. Once again, if the consequences of a full deployment of FlaggedRevs are as dire as you claim, this proposal will strengthen your position. Why then are you opposed to it? Happymelon 15:48, 27 December 2008 (UTC)
  • I have not claimed very much at all. At the top of this page you ask people to not register simple supports or opposes. Therefore you should not use this page to say that you have consensus for anything, even a trial. To get back to the general question, I doubt a small trial of a few articles and many people watching will give us any information about the backlogs if flagged revs is applied to millions of articles. --Apoc2400 (talk) 16:12, 27 December 2008 (UTC)
  • If you have not claimed very much, I'm confident that I have claimed even less. Where do I or anyone else claim that there is "consensus for anything, even a trial"? This is a discussion phase, where everyone is encouraged to put forward their comments and arguments that may influence the development of the proposal. You are quite right to say that this process does not produce a clear consensus, that will require a straw poll, which we are now planning. At this time, you have the opportunity to present your opinions, which will be welcomed, and the reasoning behind them, which will be contested; that is the nature of constructive discussion. You and everyone else who contributes to the eventual poll will be encouraged to read through this discussion and use it to form and inform their own opinions. I simply do not agree that your concerns, valid though they are, are actually in opposition to this proposal. This is not a discussion on "should we enable FlaggedRevs on "millions of articles""; it is not even a discussion on "should we enable FlaggedRevs on any articles". It is proposed merely to give ourselves the technical ability to enable FlaggedRevisions on a small set of articles in a controlled fashion and for a limited time without further developer intervention. Everything else, including the situations that concern you, are expressly excluded from the proposal. Happymelon 16:25, 27 December 2008 (UTC)
  • Do the experiment, then we will have some data that will show what the costs and benefits of this feature are. Tim Vickers (talk) 22:09, 23 December 2008 (UTC)
    • I think a trial run of this feature would be nice. It looks like a good benefit to the project. – Alex43223 T | C | E 08:37, 30 December 2008 (UTC)
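
Cenarium's suggestion above of an expiration system (showing "old enough" revisions to IPs) can be sketched as a small selection rule. This is a minimal illustration of one naive reading of that idea, not FlaggedRevs code: the `Revision` class, the function name and the 24-hour default are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Revision:
    """Hypothetical stand-in for a page revision."""
    rev_id: int
    timestamp: datetime
    sighted: bool = False


def revision_for_anonymous_reader(
    history: List[Revision],
    now: datetime,
    expiry: timedelta = timedelta(hours=24),  # assumed threshold
) -> Optional[Revision]:
    """Return the newest revision that is sighted or at least `expiry` old.

    If no revision qualifies, fall back to the latest revision, so a page
    with no sighted or old-enough history behaves as it does today.
    """
    newest_first = sorted(history, key=lambda r: r.timestamp, reverse=True)
    for rev in newest_first:
        if rev.sighted or now - rev.timestamp >= expiry:
            return rev
    return newest_first[0] if newest_first else None


now = datetime(2008, 12, 20, 12, 0)
history = [
    Revision(1, now - timedelta(days=3), sighted=True),
    Revision(2, now - timedelta(hours=30)),  # unsighted, but old enough
    Revision(3, now - timedelta(hours=1)),   # unsighted and too recent
]
print(revision_for_anonymous_reader(history, now).rev_id)  # 2
```

Under this reading a backlog can never hide an edit for longer than the expiry period, which is the point of the suggestion; whether age alone is the right test is exactly the kind of thing a trial would have to settle.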

New userrights group

A trial run limited on certain articles sounds interesting. I hope some people have some more-or-less scientific ideas how to run such a trial.

I don't quite understand why we need a new userright group for this. Essentially, if we want to have a trial run, we need to figure out a way to select a good sample of pages where flagged revisions should be tried, and keep that sample for a period of time. The actual switching of the "flagged revision" flag on a page does not seem to need "surveyor" oversight, and would be more efficiently done by a bot.

In other words, I welcome having people who think about pages that should be included and monitor those (and scientifically evaluate the results later), but don't see why they should add or remove the flagging flag while the trial runs. Not having to add an extra usergroup has the advantage that we won't have to argue who gets that userright and why. Kusma (talk) 15:48, 15 December 2008 (UTC)

You misunderstand the purpose of the flag. Yes, the setting and unsetting of pages could be most efficiently done by a bot, and indeed it probably will. In order for the bot to be able to make those changes, it will need the 'surveyor' user right! Surveyors are different only in that they have the technical ability to make the changes; which pages they change and for how long will be governed by the community and relevant policy. The situation is entirely analogous to bureaucrats: crats have the technical ability to change any user into an administrator, but the users who receive that treatment are selected by the community at RfA. Bureaucrats are chosen not for their ability to judge who should be made admins (because they don't do that), but because they are trusted to use that technical ability on behalf of the community. In the same way surveyors wield the technical ability to implement FlaggedRevisions on behalf of the community, they would not themselves be responsible for selecting the pages. Happymelon 15:56, 15 December 2008 (UTC)
I have no objection if the group contains only specialized bots (or if the flagging is done by a MediaWiki pseudobot, which I guess would be more of a hack, but a possible implementation). I just want that the oversight of the process is by the community, not by a group of people wearing a certain hat. Kusma (talk) 16:08, 15 December 2008 (UTC)
That's absolutely the case; the community will decide what to do to which pages, and the surveyors will execute those wishes. Having the flagging done by a maintenance script (the 'pseudobot' you describe) would introduce a very long delay between decisions being reached and the actions actually being implemented, which includes correcting errors and omissions. Having a trusted on-wiki user to make these changes is definitely the simplest solution. Happymelon 16:23, 15 December 2008 (UTC)
I object to any new ability being automatically given to, or exercised by, administrators. Part of the project page foresees administrators performing this function almost exclusively and having an untoward ability to block any reviewer they choose, simply by revoking that reviewer's privilege. I see wheel-wars in this. We already have edit-warring; now we'll be able to allow admins to go to war over which editor should be a reviewer. This is not a good thing. Surveyor and reviewer should be settable only by bureaucrats or higher to prevent this. They typically do not war between themselves, and understand much more clearly than admins how to be fair to the community-at-large, not imposing a view by force. Wjhonson (talk) 04:45, 16 December 2008 (UTC)
For the full implementation of flagged revisions we need at least about 10,000 reviewers. Do you think that ~ 10 active bureaucrats can handle this? In addition, sysops can give and remove rollback now. I am not aware of sysops mass removing rollback to prevent rollbacking their edits. Such behavior would be an abuse of power. Ruslik (talk) 06:24, 16 December 2008 (UTC)
I agree with Ruslik. The distribution of 'editor' flags will be conducted in exactly the same format, and probably in the same place, as rollback, IPBE, and the various other rights that admins already have the ability and authority to assign. There is no evidence of the systematic abuse of this existing process that you seem to be afraid of, so I can't see any reason not to extend it to this process as well. Happymelon 13:55, 16 December 2008 (UTC)
(outdent) I think 'surveyor' is unnecessary, technical control of pages is already in the purview of +sysops, and if there is a page issue a sysop should be able to fix it without using extraordinary means (that may be possible using existing tools like moves/deletes). — xaosflux Talk 12:03, 18 December 2008 (UTC)

Sorry, but I do not understand what you mean. Do you want to abolish surveyor usergroup and give all FR tools to sysops? However this will mean full implementation of Flagged Revisions, not just a trial, because once you give the tools to all 1600+ sysops there will be no simple way to stop the experiment. Ruslik (talk) 13:48, 18 December 2008 (UTC)
Yes there would be, remove the Flagged revisions extension.... — xaosflux Talk 12:04, 19 December 2008 (UTC)
Removing an extension is not simple; it will require consensus before filing a bugzilla request. Since consensus is difficult to achieve, the FR extension will stay, and the trial will become a full implementation without any consensus for this. Consensus should be required to continue trials, not to stop them. Ruslik (talk) 12:37, 19 December 2008 (UTC)
As for reviewers, this is a simple enough permission that letting sysops +/- it for others should be a non-issue; we have ways to deal with wheel warriors. — xaosflux Talk 12:03, 18 December 2008 (UTC)
As for using a bot, I'm confused; is the trial scope so large it can't be managed? As for full implementation wouldn't this be name-space wide? — xaosflux Talk 12:03, 18 December 2008 (UTC)
Related discussion continued below. Thanks, Mailer Diablo 19:41, 13 January 2009 (UTC)

Visibility of the test

How visible will this test be? Specifically, will a sighted page contain any notice of the change? Will we be using any sort of site notice to announce the test? While I can accept the terms of the test (as frustrating as it is that sighted versions will be visible by default, which I personally think is a bad idea), I do think that we need to make it obvious to readers and people who aren't "regulars" what's going on. I don't particularly want to be getting confused messages about "why aren't my edits showing up?" without good cause. {{Nihiltres|talk|log}} 19:16, 15 December 2008 (UTC)

You can see how FlaggedRevisions looks at http://en.labs.wikimedia.org - I've set the page Tablature to display in much the same way as trial pages here would. You can see when you edit it that there is a warning above the edit box that edits will not be visible immediately; we can make this as large and obvious as we feel is necessary. As usual, most of the interface can be customised through modifying system messages to be as overt or subtle as desired. Happymelon 21:00, 15 December 2008 (UTC)

Deciding on individual trials

So, the current plan is to ask first on whether a trial (or multiple trials) at all should take place, and then to get consensus on the specifics of the trials? I'm not really a fan of abstract discussions on Wikipedia, but I can certainly live with it. However, then the question becomes how do we decide on the specifics of the trials? It says 'consensus', which is rather vague. Wouldn't it be better to be more explicit about this, especially given how explicit the rest of the proposal is? -- Jitse Niesen (talk) 22:18, 15 December 2008 (UTC)

I prefer to think of it as breaking the problem down into manageable chunks rather than being abstract, but I take your point. It is in fact being deliberately vague in an attempt to separate the specifics of the trials from the implementation required to conduct them. I think that, given that we're handing the power to initiate trials (by creating surveyors) to bureaucrats, who are after all appointed based on their ability to gauge consensus, we don't need to worry about a trial starting without consensus. Do you have any suggestions for a more explicit phrasing that is not overly bureaucratic? Happymelon 22:38, 15 December 2008 (UTC)
I was thinking about the level of consensus required. There is a difference between the rough consensus used at WP:AfD and the stronger consensus required at WP:RfA. But I'm happy to leave it to the discretion of the bureaucrats.
I think it is important that it's very difficult for the proposed implementation to evolve sneakily from a trial to a full deployment. Should this be made more explicit, for instance by modifying the second sentence so that it reads "The proposed configuration does not scale well to a full deployment, ensuring that it is only used for limited trials." Or is this perhaps patronizing by stating the obvious?
Incidentally, I went through the Q&A of the candidates for the recent Arbitration Committee elections. By my count, and there are a couple of borderline cases, there are two against flagged revisions (Lifebaka and Wizardman), ten with no or unclear opinion (Carcharoth, Hemlock Martinis, Lankiveil, Risker, and six who didn't answer the questions), and sixteen in favour. Of the top seven candidates, six are in favour and one doesn't know. -- Jitse Niesen (talk) 16:20, 16 December 2008 (UTC)
It's not "difficult" for this to extend to a full deployment so much as technically impossible :D. I too am willing to trust the bureaucrats to evaluate a 'sufficiently strong' consensus. Interesting point on the ArbCom candidates. Happymelon 17:49, 16 December 2008 (UTC)
Each area of the proposal should get a separate consensus, i.e. get one for FAs, or A class, or GAs, or BLPs, or whatever, then address the next if it seems worthwhile. Don't say "Flagged Revs at all?" then "Okay, on what?" - that's too confusing. WilyD 17:38, 16 December 2008 (UTC)
Why is that confusing? Happymelon 17:49, 16 December 2008 (UTC)
You're likely to get a mandate to implement flagged revisions, but not in any particular class of articles, or not representing consensus. In the nominal case where GAs, FAs and BLPs are each supported by 1/3 of people for first implementation, you might think they're all equal. If the 2nd choice of the GA and BLP supporters is FA in both cases, for instance, then they're not equal. In practice the outcome is likely to be substantially messier. "Should we have flagged revs on FAs?" "Should we have flagged revs on GAs?" and so forth yield much clearer answers. WilyD 19:15, 16 December 2008 (UTC)
I disagree. You seem to think that, if this proposal is accepted, then we are duty-bound to implement FlaggedRevs somewhere and in exactly one place, and hence we will be confused over where exactly. That's not the case; if this proposal goes through and then we can't get a consensus to trial anywhere in particular, then so be it, we don't have any trials and hence no FlaggedRevs. If this proposal goes through and then we see clear consensus for trials on FAs and BLPs, then we trial in both areas. This proposal is not "should we have FlaggedRevs?" but "should we give ourselves the technical ability to have FlaggedRevs"; it's just a stepping-stone. But it's a stepping stone that will be needed by any implementation of FlaggedRevs, so (IMO) it makes intuitive sense to agree on it separately. Happymelon 13:17, 17 December 2008 (UTC)
I'm not suggesting that. Zero, or two, or whatever still could result in the fashion of "Where?" I'm talking about. But maybe you're right, that "Ability to implement when wanted?" is the first step. WilyD 15:24, 17 December 2008 (UTC)
While I agree we should have a separate !vote for each type of article (FA, GA, BLP, Protected, etc...) I don't see any reason not to figure out whether people want any sort of trial first, and then hammer out the details. --Falcorian (talk) 18:38, 16 December 2008 (UTC)

It seems to me that if you're going to go ahead with this, that you should at least be planning on randomized trials, where you pick a class of articles and divide it into (trialed) and (not-trialed) so that there is some hope that we can compare like-for-like in seeing how much better (or worse) Flagged articles fare over non-flagged articles. Lot 49atalk 05:45, 4 January 2009 (UTC)
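
The randomized split Lot 49a describes above could be done along these lines. This is a minimal sketch, assuming the chosen class of articles is available as a list of titles; the function name and the fixed seed are illustrative choices, not part of any proposal on this page.

```python
import random
from typing import List, Sequence, Tuple


def split_for_trial(articles: Sequence[str], seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly divide a class of articles into (trialed, not_trialed).

    A fixed seed makes the assignment reproducible, so anyone can
    re-derive which articles were placed in which group.
    """
    rng = random.Random(seed)
    shuffled = list(articles)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Placeholder titles standing in for, say, the list of featured articles.
trialed, not_trialed = split_for_trial(["A", "B", "C", "D", "E"], seed=1)
print(len(trialed), len(not_trialed))  # 2 3
```

Because assignment is random rather than hand-picked, differences later observed between the two groups are less likely to reflect how the pages were chosen.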

FYI, based on a conversation on Jimmy Wales's talk page:

Your feedback is appreciated. rootology (C)(T) 22:40, 19 December 2008 (UTC)

Report from the German Wikipedia

I'm mildly in favor of implementing flagged revisions here on the English Wikipedia, but before we do so, I'd love to hear a report from the German Wikipedia. As you all know, they've had flagged revisions for many months now, and we can definitely benefit from their experience. While our wiki is culturally different from theirs, a good amount of their experiences with the system should transfer over. Anyone know of a good way to get in contact with the German Wikipedia and hopefully get a report on their progress with flagged revisions — in English — to help us make our decision on whether we should implement it? --Cyde Weys 01:35, 28 December 2008 (UTC)

Just wait for a German Wikipedian to come by :)
P. Birken (who has been deeply involved in this project all along) published a report on wikide-l two weeks ago - in German, though. I'll try to translate it this afternoon, and I'm alerting him of this discussion. --dapete 11:36, 28 December 2008 (UTC)
Translation at User:Dapete/Report on Flagged Revisions, December 14, 2008. Enjoy (or not). --dapete 12:50, 28 December 2008 (UTC)

Thanks for the report, and for the quick translation! Flagged revisions looks promising for use here on the English Wikipedia. As far as I can tell, they didn't hit any major issues. The main minor issue seems to be a lag time between when an edit is made and when it is approved. Luckily, they've already come up with several mechanisms to address that that we could leverage here. Also, I don't doubt the tenacity of our counter-vandalism users one bit. They already provide 24/7 coverage through tools such as Huggle. Flagging non-vandalized edits on top of that wouldn't be a very big deal, I would guess. What do the rest of you guys take from this report? --Cyde Weys 19:32, 28 December 2008 (UTC)

I noticed the report says "Now there is a campaign to ensure that the [median] waiting time doesn't exceed 21 days, so this has been stable since November 19..." Who wants to wait three weeks for their edits to become visible? It also says there are only 3,000 to 4,000 manual sightings per day, which doesn't sound like a lot when there are 851,000 articles on German Wikipedia. This will be a disaster. Richard75 (talk) 15:54, 3 January 2009 (UTC)

maximum waiting time, not median waiting time. --Chin tin tin (talk) 23:08, 5 January 2009 (UTC)
Actually, P. Birken's report is lying when saying "there is currently a discussion to more clearly specify the criteria for sighted revisions" "a regular Wikipedia author has looked at it and the revision is free from obvious vandalism."
The discussion is not about "free from obvious vandalism" but about lots of additional criteria that should be met when flagging an article. Some of them (from de:Wikipedia Diskussion:Gesichtete Versionen#Endgültige Fassung der Sichtungskriterien, translation welcome):
The article must be sourced, it must fulfill German Wikipedia's rules for links, for notability criteria, for biographies of living persons and some more rules, copyvios should be checked, weblinks and picture links should be checked, in case of new articles additionally the rules from the Wikipedia manual of style should be fulfilled, article should have links and categories – then it is allowed to flag the revision.
And if it's not flagged, the older version is still displayed. Actually, I don't like this German verboten approach.
--Cyfal (talk) 23:14, 3 January 2009 (UTC)

Information from another German further down the page. Lot 49a talk 16:17, 7 January 2009 (UTC)

There seems to be some misinformation about the German trial. They had trials for almost a year and a vote last fall on whether to implement it for good or not. Though not all German authors liked the feature, there was a clear yes-vote. But note that the German system has a 2-step revision process (and only the first step was tested and voted on). Basically you have the flags "reviewed" (gesichtet) and "confirmed" (überprüft). Reviewed only means lightweight quality control (i.e. some other "trustworthy" or "established" author has read the article and has found no obvious errors or falsehoods), which blocks vandalism, obvious falsehoods/POV/propaganda and the like. Confirmed means a thorough review of the article by an expert in the field. Personally I like the general idea of flagged revisions (it's a bit like in software development): aside from blocking spam/vandalism/extreme edits, they also allow a better organisation of quality control (for instance, you can avoid people accidentally proofreading the same article under the assumption that nobody has checked it yet). Please note that flagged revisions in this form do not violate the anyone-can-edit principle and all edits remain visible to readers. It only means that the reader will receive the additional information on whether the content has been reviewed or not. Overall I would recommend a test.--Kmhkmh (talk) 16:24, 9 January 2009 (UTC)

I think it should be: Please note that flagged revisions in this form do violate the anyone-can-edit principle, as all edits remain invisible to visitors of the site until sighted; you have to log out to see it because of the SUL [1]. Mion (talk) 01:59, 11 January 2009 (UTC), (Note, the edit is now Sighted)
Well, your claim regarding the German Wikipedia is simply not true, but there seems to be a great deal of confusion, and maybe your impression is due to a lack of German language skills. So again: all edits remain immediately visible in the German Wikipedia (sighted or not), but if a sighted version exists then the latest sighted version will be the default display. However, the reader can always choose to see the latest version by clicking on the corresponding notice at the top right of the article (click on "zur aktuellen Version"). The sighted status only decides the default display, and the reader is also informed how to access the latest version, meaning the sighted status does not block readers from accessing the latest (unsighted) edits.--Kmhkmh (talk) 03:57, 14 January 2009 (UTC)
Following the statistics 9 out of 10 articles have a first sighting, at the moment 93 %[2] of the articles are not showing the latest version, and only 1 in every 100 visitors is looking for the edit button, so 1 in every .... will look for this show me the latest version button? Mion (talk) 04:09, 14 January 2009 (UTC)
Actually I'm not sure how you got this from the page. The information I get there is that 93% are showing the latest version (literally "reviewed in their newest version"). The other claim, that only 1 in 100 will use the latest version/"zur aktuellen Version" button, I cannot find at all right now. However, to state that again: all readers can access the latest version. If they choose not to, then that's their choice (and possibly an indicator that they actually do prefer the sighted version to the latest one, which actually makes an argument for flagged revisions). However, the current wording for the suggested implementation of the English version of flagged revisions differs from the German one in exactly that crucial issue (the accessibility for all readers) and I'm not sure whether that's intended or an oversight/mistake. I've asked melon for clarification on that.--Kmhkmh (talk) 04:52, 14 January 2009 (UTC)
Let me rephrase it correctly: at the moment 93 %[3] of the articles are not showing the latest version but the older sighted version. Mion (talk) 05:20, 14 January 2009 (UTC)
Are you sure you are reading the link correctly? The way I read it is that "11935 modules have an out-of-date review" and "806474 modules are reviewed in their newest version" - "That is 93.17 percent" that ARE showing the newest version. Dbiel (Talk) 05:49, 14 January 2009 (UTC)
No, I'm not sure; the current numbers are not representative at all anyway. Mion (talk) 06:20, 14 January 2009 (UTC)
And after 1 year of trials and 8 months of live implementation there is still no statistical evidence from the German Wikipedia of any improvement... Maybe you could provide the statistics? Mion (talk) 02:16, 11 January 2009 (UTC)
A statistic to prove what exactly?--Kmhkmh (talk) 03:57, 14 January 2009 (UTC)
As I have outlined some hundred miles below, there is no way to measure success in this field. You can measure the lag of flagging and all that stuff but there is no way to measure success of the whole thing. --X-Weinzar (talk) 12:35, 14 January 2009 (UTC)