Wikipedia talk:Edit filter



Do we have an accurate definition of what specials this removes? All the best: Rich Farmbrough 23:01, 18 February 2015 (UTC).

In PHP, the executed code is preg_replace( '/[^\p{L}\p{N}]/u', '', $s ); where $s is the initial string. I believe that translates to "remove everything that isn't either a letter or a number", as evaluated by PHP's Unicode-compliant definition of what letters and numbers are. Dragons flight (talk) 23:12, 18 February 2015 (UTC)
Thanks, DF. I imagine \p is a POSIX class or some such. It's unfortunate for some purposes that it removes spaces, which means word boundaries are lost. All the best: Rich Farmbrough 01:18, 22 February 2015 (UTC).
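
For readers who want to experiment with the behaviour described above, here is a rough Python sketch of the same idea. It is not the code the extension runs: the function name strip_specials is made up for illustration, and Python's unicodedata categories are assumed to line up with PCRE's \p{L} and \p{N} classes (they may differ slightly between Unicode versions).

```python
import unicodedata


def strip_specials(s: str) -> str:
    """Keep only characters whose Unicode general category is a
    letter (L*) or a number (N*); drop everything else.

    Hypothetical helper approximating
    preg_replace('/[^\\p{L}\\p{N}]/u', '', $s) from the thread.
    """
    return "".join(
        ch for ch in s if unicodedata.category(ch)[0] in ("L", "N")
    )


# Punctuation and spaces are removed, letters and digits are kept.
print(strip_specials("Hello, World! 123"))

# Because spaces are removed too, word boundaries are lost:
# both inputs collapse to the same string.
print(strip_specials("to get her") == strip_specials("together"))
```

The second call illustrates the word-boundary concern raised above: once spaces are stripped, a phrase and a single word can become indistinguishable.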

Filter addition

I feel these edits [1] (admin viewable only) could be used to add to Filter 58; I'm a bit surprised they weren't caught already. --Jac16888 Talk 16:26, 20 February 2015 (UTC)

Is this a long term issue or was this one-off vandalism? Sam Walton (talk) 16:58, 20 February 2015 (UTC)
Right now this looks to be the only occurrence I've seen (other than on my userpage in the last half hour). However, if this is who it is acting like and not just a wannabe, well, we all know how much advantage they take of a gap when they find one. --Jac16888 Talk 17:03, 20 February 2015 (UTC)

! (auto)confirmed in user groups vs article namespace == 0

Those two checks are probably the two most used by the filters. Which should come first? In the running filters I've checked, the autoconfirmed check seems to come first a bit more often than the mainspace check, but there's no clear winner. If we've got enough information and experience to make a performance determination, then the conditions should be in that same order for all filters. I've also noticed that filters sometimes check for "confirmed", and sometimes for "autoconfirmed". Since we only get very few edits by users in the confirmed usergroup, it should come down to whether checking for "autoconfirmed" (an exact match) is faster than checking for "confirmed". Cenarium (talk) 06:51, 25 February 2015 (UTC)

I'm pretty sure that !autoconfirmed is more selective than article_namespace == 0. You can check using batch testing on either expression. So there should be a small advantage to having !autoconfirmed in front. As for "confirmed" vs. "autoconfirmed", I doubt there is a meaningful difference in performance by choosing one over the other. Dragons flight (talk) 08:49, 25 February 2015 (UTC)
Last time I checked about 40% of edits were to article space, so that rule should kick out 60% of edits. I would suspect that more than 80% of edits are by confirmed users. Therefore checking autoconfirmed first would rule out more cases.
However, article namespace is a numerical comparison, taking (in theory) a clock cycle. Pattern matching is much, much slower. So doing the article check first is actually likely to be more efficient.
Static pattern matching should use the Boyer–Moore–Horspool algorithm, which means that the longer string matches faster.
All the best: Rich Farmbrough 00:46, 3 March 2015 (UTC).
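
The trade-off discussed in this thread (selectivity versus cost per check) can be put into rough numbers. The sketch below assumes the filter engine short-circuits a conjunction, so the second condition is only evaluated for edits that pass the first. The two fractions are taken from the comments above (about 40% of edits in article space, and at most about 20% of edits by non-confirmed users). The unit costs are invented purely for illustration and are not measurements of the real extension.

```python
def expected_cost(cost_first: float, pass_first: float, cost_second: float) -> float:
    """Average cost per edit of `first & second` under short-circuiting:
    the first check always runs, the second only runs for the fraction
    of edits that pass the first."""
    return cost_first + pass_first * cost_second


# Fractions quoted in the thread.
P_NOT_AUTOCONFIRMED = 0.20  # "more than 80% of edits are by confirmed users"
P_ARTICLE_SPACE = 0.40      # "about 40% of edits were to article space"

# Hypothetical unit costs (assumptions, not benchmarks).
COST_GROUP_CHECK = 5.0      # string search in the user's groups
COST_NAMESPACE_CHECK = 1.0  # integer comparison

# Case 1: the group check is five times as expensive as the namespace check.
groups_first = expected_cost(COST_GROUP_CHECK, P_NOT_AUTOCONFIRMED, COST_NAMESPACE_CHECK)
namespace_first = expected_cost(COST_NAMESPACE_CHECK, P_ARTICLE_SPACE, COST_GROUP_CHECK)
print("group check 5x cost:", groups_first, "vs", namespace_first)

# Case 2: both checks cost the same, so only selectivity matters.
groups_first_equal = expected_cost(1.0, P_NOT_AUTOCONFIRMED, 1.0)
namespace_first_equal = expected_cost(1.0, P_ARTICLE_SPACE, 1.0)
print("equal cost:", groups_first_equal, "vs", namespace_first_equal)
```

Under these assumed numbers, putting the group check first wins when both checks cost the same (1.2 versus 1.4 units per edit), while putting the namespace check first wins when the group check is five times as expensive (3.0 versus 5.2). Setting the two orderings equal gives a break-even point where the group check costs 4/3 as much as the namespace check, so the answer depends on the real cost ratio, which batch testing would have to establish.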

Request for permission for PhantomTech

I started working with these filters about two weeks ago in false positives, since then I've been pretty active in that area and doing a bit in the filter requests section. Right now I can't view some of the filters which delays (admittedly not by too long) some false positive reports that I otherwise wouldn't have a problem dealing with. Additionally I've been working on User:ThePhantomBot, a bot that detects LTA among other things, and going through the filters to see which can be offloaded onto my bot would help cut down the total number of edit filters. To be clear, I don't plan on disabling any filters that I set my bot to detect until my bot has been approved and the filter in question has been tested with my bot. I'll admit that two weeks isn't a long time for working on filters, but hopefully I've shown that I can be trusted to not do something reckless. I have quite a bit of experience with regex coming from having to make a regex based chat filter for something off-wiki. PhantomTech (talk) 21:52, 25 March 2015 (UTC)

  • Oppose. This is one of the most dangerous user rights, and your on-wiki editing experience is fairly limited (edits appear to be primarily through automated tools for patrolling articles), which is not enough to demonstrate the high level of trust this right requires. On a side note, your bot request hasn't even entered the trial phase yet. Will this stall it, or will you be able to proceed with publicly visible filters? — xaosflux Talk 05:46, 26 March 2015 (UTC)
@Xaosflux: I don't think this permission request being denied will affect how long it takes my bot to get into a trial phase; there is probably enough public information in LTA and SPI to get a good number of additional filters set up. Even if there isn't, your opposition seems to be about me being able to mess with the filters, not see them, so if I really needed them for some reason I don't think it would be a problem to have a few of the private LTA ones sent to me. PhantomTech (talk) 06:41, 26 March 2015 (UTC)
Copies would certainly be possible. It seems like you really would only need abusefilter-view-private access for most of the work you want to do (as opposed to the dangerous abusefilter-modify permission); unfortunately, the only usergroup with that permission is Administrators. A new user group "Edit filter reviewers" or the like could be created to allow private-read access, if community consensus could be demonstrated (perhaps in a new section below). It would have a lower barrier to entry, and you are not the first person who has asked for this type of access (e.g. sysops of other projects that want to copy our filters). — xaosflux Talk 10:18, 26 March 2015 (UTC)
Proposed new group in thread below, please comment as desired. If it gains traction, will cross post to WT:PERM and WT:AN. — xaosflux Talk 17:56, 26 March 2015 (UTC)
  • Oppose due to the lack of data that indicates the required experience in making changes. Feel free to copy publicly visible filters to testwiki: and modify them or request private ones copied there for you to make changes or see how they work for your bot as needed, but I don't think it is wise to allow access on enwp without some indication of experience and ability to safely work on these. If you require permission on testwiki, please let me know and I'll happily help you there. — {{U|Technical 13}} (etc) 11:07, 26 March 2015 (UTC)
@Technical 13: If your main concern is about my competence with edit filters I don't mind being quizzed. If you want to make up a few scenarios for filter requests or something, I wouldn't mind telling you how I would deal with them, including any filters I would set up. PhantomTech (talk) 15:20, 26 March 2015 (UTC)
  • I've not found "quizzes" to be common practice on Wikipedia, and that may be because they are not the best way to gauge competence. Practical demonstration seems to be the most effective way, and I understand if you are wondering how you are supposed to get practical experience without the ability to actually do it (hence you might think it is a catch-22 that you can't have the bit to make changes because you haven't demonstrated competence of making changes with a bit you don't have). That is the reason I have offered to give you the bit on testwiki: where you can create and copy and change filters and get practice and see how it actually works. I'm sure that after 3-6 months of you actively making changes and not blowing everything up, your chances of getting the bit here will have greatly improved (although I won't guarantee you the bit at that point, as I do not have the power to grant you the bit and I wouldn't make such a promise anyway, on the grounds that I'm not the only one who has opposed at this time and a consensus of some scale would need to be achieved first). — {{U|Technical 13}} (etc) 15:54, 26 March 2015 (UTC)
@Technical 13: I'm not sure exactly what the difference between a demonstration using current filters and describing what I would do related to filters given a certain situation is in terms of a show of competence. A simple test (what regex matches this string) would hardly do anything in terms of demonstrating competence, but I'm suggesting more complicated and realistic tests. Asking me to make a filter for an LTA case that currently isn't active enough to have a filter and assuming for the sake of the test that it is active enough is the level of tests I was expecting. I know there aren't quizzes used for anything on Wikipedia but I think they'd do a much better job here as a display of competence than most other places, especially because of how important competence is here. Maybe it's the time that you're concerned about; working over a few months would likely involve a lot more "tests," and therefore be a better test overall, than you might be able to make up for me in a few days. PhantomTech (talk) 16:23, 26 March 2015 (UTC)
  • The difference is, in order for you to be tested adequately, you would have to have an admin or someone else sufficiently capable of giving you the bit upon completion that would be willing to spend months with you. I can't foresee that happening and the best thing is to just practice in a safe environment (like testwiki) and actually refine existing filters and make your LTA filter and whatever else over a few months. If you are interested, please email me and I'll give you the admin bit for this on testwiki and you can get started, otherwise, I think this is a discussion that is blowing in the breeze and not accomplishing much. — {{U|Technical 13}} (etc) 19:25, 29 March 2015 (UTC)
There is no abusefilter group at testwiki (only admins have access). Since this comes up on occasion, I think I'll add one along with phab:T93798. (see phab:T94214) Cenarium (talk) 18:22, 27 March 2015 (UTC)
  • I'd be happy to give anyone that requests something of this nature the admin bit on testwiki within reason of course. I'm not convinced there is a need for there to be a specific EF group there, that said, I won't object if you want to make one. — {{U|Technical 13}} (etc) 19:25, 29 March 2015 (UTC)
  • Comment Just to be clear, are you saying you won't edit the filters at all? Or just not delete them? Soap 15:29, 26 March 2015 (UTC)
@Soap: When moving filters to my bot, prior to it being approved, I won't edit them at all, this means I'll only be able to work with filters that are in log only mode or that are currently overly specific as a result of the limitations to what filters can check and to keep their false positives low. If my bot is approved, when moving filters over they'll be switched to log only if they aren't already and then disabled once my bot is effectively detecting and dealing with them. If you're considering approving this permission on the condition that I never edit filters I'd recommend not doing so, for my request and for any others. If the concern is not being able to trust my intentions with the filters then it wouldn't make sense to trust that I won't edit them, sure you can remove the permission if I do edit them but you'd do that anyway if I made it clear I meant to cause harm. If the concern is about competence then, like I said to Technical, I don't mind being quizzed with however many scenarios you want to throw at me. As a final note, realizing that filters can have big effects on Wikipedia, I will, as anyone should, ask someone else about a change if I'm not sure about it, including any filters I'm not sure if should be moved to my bot. PhantomTech (talk) 15:49, 26 March 2015 (UTC)
  • Based on this comment, I'm not sure you understand how the wiki works sufficiently. This comment gives me pause and indicates to me that concern about you having this right, or having a bot, isn't entirely invalid. I recommend you read over our bot policies and reconsider your comment here, perhaps you would like to make some clarifications? — {{U|Technical 13}} (etc) 19:25, 29 March 2015 (UTC)
@Technical 13: I've read the bot policy and I'm not aware of any problems with what I'm currently doing or plan to be doing in relation to it. Based on your reply above about making LTA filters on testwiki and your reference to the bot policy here I'm assuming there's been a miscommunication about what my bot is/will be doing; if your concern was about something else let me know. My bot is currently operating with a low edit rate in its own user space and will continue to do so unless it gets approved. I have no intention to replace all abuse filters with my bot. In relation to LTA cases the goal of my bot is to be used to detect any that either cannot have an abuse filter set up for them (either because of technical limitations or because they would hardly ever be used) or that do not need to be done by abuse filters. An example of a filter that doesn't need to be an abuse filter is Special:AbuseFilter/663, which could be dealt with by my bot to reduce the total number of abuse filters; moving a single filter to my bot would be pointless but it's not like there's a shortage of LTA cases or related filter requests. Right now all my bot does when it detects something is report it; that's also almost definitely the only thing it will be approved to do, if it is approved, the only difference being where it reports to and how often it is allowed to do it. If I get the abusefilter permission while my bot is unapproved it will be used primarily for reading filters for false positives (Cenarium has pointed out that there is a global right that can be used for this) and adding any filters that are set to log only to my bot, leaving the filters themselves untouched. The few times I will actually change filters prior to my bot's approval will be to fix bugs that cause false positives; no changes will be related to my bot.
If my bot is approved I will, for appropriate filters, add the filter to my bot (as log only) and then change the abuse filter to log only; once my bot has been tested with the filter I'll fully enable it on my bot and disable it from abuse filters. I do plan on helping some with filter requests but don't plan on doing this any more than I do now until after I'm done with most of the work for my bot. Again, if I didn't clarify the part you were concerned about or something is still unclear please let me know. PhantomTech (talk) 20:43, 29 March 2015 (UTC)
  • "The few times I will actually change filters prior to my bot's approval..." should be "none". I'm not an admin, so I cannot give you the bit, but I've been around here long enough to know that this request is very likely to be ultimately declined. Your lack of interest or willingness to spend time reading and modifying filters on testwiki first by requesting them be copied there and making edits there and requesting changes be made here based on your changes there is an indication that you probably should not have the bit to even read the filters here. Based on this, I'm no longer willing to give you the bit you would need on testwiki to make this happen. I'm no longer going to watch this discussion, and I wish you luck if you wish to pursue this request further. — {{U|Technical 13}} (etc) 21:35, 29 March 2015 (UTC)
Again, my changes to the filter prior to my bot's approval will not be related to the bot. I don't see a problem with editing filters to fix false positives, I think I have enough experience with regex and programming to do it without messing things up, if someone disagrees or thinks I haven't proven it that's fine and I can't really do anything but offer to prove I'm capable. Having filters moved to testwiki for small changes to fix false positives then back here seems like an unnecessary prolonging of the process, I currently post recommended changes on the related page (false positives or requests) and think it's much more efficient, sure for big changes it would be best to do some testing first but I don't plan on making any of those any time soon. Even if I was completely incompetent when it came to making changes to filters, I don't see what that would have to do with having the read permission. PhantomTech (talk) 21:56, 29 March 2015 (UTC)
  • If a filter ends up no longer needed, an edit filter manager can disable it, so there's no actual need to edit filters. If you need to see private filters, a global usergroup exists for this, see m:Abuse filter helpers. I do not believe we should grant view-only access since this group exists, I suggest you request it at meta. Cenarium (talk) 20:30, 26 March 2015 (UTC)
@Cenarium: I realize if my bot was the only reason I'd be doing anything with filters that it would probably make more sense to just have someone else make the changes but I do plan on working with filters unrelated to my bot, mostly fixing false positives that come up and, once I'm not so busy with my bot, I might start working through some of the filter requests. PhantomTech (talk) 04:23, 27 March 2015 (UTC)

Edit filter reviewers

I propose to create a new user group "Edit filter reviewers" that has the abusefilter-view-private permission applied to it. I also propose to give the existing administrators group the ability to add users to and remove users from this new group. This will allow us to grant view access to users that only require the access to view private filters, but do not meet the thresholds to be able to modify them. Potential candidates would be non-enwiki admins that operate certain bots, guest admins and EFMs from other wikis that want to re-use our settings, and users that may respond to false positive reports.

  • Support, as proposer. — xaosflux Talk 17:55, 26 March 2015 (UTC)
  • Oppose as unnecessary bureaucracy. We've regularly granted EFM to people who need to see private filters on the stipulation that they don't modify them. So far, I don't think that any of them have abused their rights. Reaper Eternal (talk) 18:15, 26 March 2015 (UTC)
    I don't envision the request process to be any more onerous than it currently is, but it would give the ability to grant only what is needed. I respect your opinion that this could be considered process creep; I'm really just looking for a way to say "yes" to more editors. — xaosflux Talk 18:54, 26 March 2015 (UTC)
It may make more sense to have this as a global group (handled at meta). There's already an m:Abuse filter editors group, and we might have a view-only equivalent (they could view private filters on all projects). Cenarium (talk) 20:18, 26 March 2015 (UTC)
  • Actually, there's already one, m:Abuse filter helpers, so I suggest we refer users to this group if they only want view access. Cenarium (talk) 20:22, 26 March 2015 (UTC)
  • That group may have a higher barrier, as it is for ALL projects, we could just as easily have a local group for enwiki needs. — xaosflux Talk 01:15, 27 March 2015 (UTC)
  • The target audience you give as examples is cross-wiki in nature, so it's not so much about enwiki's needs; a local group would have too small a use case IMO, and it makes sense to offload the bureaucracy aspect to meta where it's centralized and thus more efficient. Even for users who only check false positives, it makes sense to give them the global group so they can see filters at testwiki and other wikis where the same problems may arise. I don't see why the standard should be higher, since enwiki is the most looked-after project for people interested in exporting filters in the first place. Cenarium (talk) 18:19, 27 March 2015 (UTC)
  • Support No need to give people permission to change filters if they don't need it and there are probably a decent number of people that do false positives that many editors wouldn't be willing to trust with editing filters. Like xaosflux said, the global group probably does have a higher standard. While some users might benefit from the global permission I don't think many would need it, we don't use global rollback because most people who use rollback rights don't need it globally. PhantomTech (talk) 04:14, 27 March 2015 (UTC)
Retracting support, not opposing but Cenarium makes a good point in their latest reply. PhantomTech (talk) 18:37, 27 March 2015 (UTC)
  • Oppose I really think edit filter managers can work with this. A lot of the current EFMs do very little editing, or none at all. Only someone really trustworthy can get EFM, and if they say they don't want to actually modify the filters then I think they will stick to their word. I realize that's not the central argument here, but if you're saying there should be a separate right because it's a lower barrier and therefore is easier to get, I really really think we should try to work against that mentality and give editors the full EFM access. Soap 04:20, 28 March 2015 (UTC)
    Yes, my proposal was to have a local edit filter helpers group that could read the filters and logs but not change them, as about 50% of our filters are "secret". — xaosflux Talk 04:56, 28 March 2015 (UTC)
  • Interesting proposal. I've been previously declined EFM despite being trusted enough to be a TE, as well as everything else on here under admin, despite the fact that I only wanted the right to view filters so I could copy them to testwiki to work on them where I'm a crat and then propose changes and improvements here, and despite the fact that I'm a bot operator and labs tools maintainer. Due to that, I'd like to support this proposal. On the other hand, I feel extremely resistant to that on the grounds that it's bcreep and many of the reasons cited above. So, I'm afraid I'll have to remain neutral at this time. — {{U|Technical 13}} (etc) 05:30, 28 March 2015 (UTC)