Wikipedia:Edit filter/Traps and pitfalls
Revision as of 08:53, 5 August 2022
This page[note 1] covers some common mistakes made by edit filter managers. For the full documentation, see Wikipedia:Edit filter/Documentation and mw:Extension:AbuseFilter.
Throttling
Throttling by user alone throttles by user id, not username. All logged out editors share one user id, 0. This may cause false positives if multiple unrelated anonymous users match the filter conditions. If this is a problem, throttle by user and ip instead.[note 2]
Throttling by ip alone throttles logged in editors by their underlying IP address. Do not do this, unless the filter only targets logged out users. Instead, throttle by user and ip, as above.
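The effect of the two throttle groupings can be modelled outside AbuseFilter. The Python sketch below is not how the extension is implemented. It only counts hypothetical matching actions (made-up user ids and documentation-range IP addresses) under each grouping, to show why unrelated logged out editors collide when grouping by user alone.

```python
from collections import Counter

# Hypothetical matching actions as (user_id, ip) pairs.
# User id 0 stands for every logged out editor.
actions = [
    (0, "198.51.100.7"),
    (0, "203.0.113.9"),
    (0, "198.51.100.7"),
    (42, "192.0.2.1"),
]

# Grouping by user alone: all logged out editors share one counter.
by_user = Counter(user_id for user_id, _ in actions)

# Grouping by user and ip together: each (user_id, ip) pair has its own counter.
by_user_and_ip = Counter(actions)

print(by_user[0])                           # 3
print(by_user_and_ip[(0, "198.51.100.7")])  # 2
print(by_user_and_ip[(0, "203.0.113.9")])   # 1
```

With a limit of, say, two actions, the first grouping would trip on the second anonymous editor's only action, while the combined grouping would not.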
user_rights
The user_rights variable contains only the rights in effect for the user's current session. If the user has logged in using a bot password, or is editing with an OAuth application, user_rights may be limited. For example, it looks like we could exclude extended confirmed users, bots, and administrators with[note 3]
!("extendedconfirmed" in user_rights) /* WRONG! */
but this will not work as expected if the user did not grant editprotected when setting up a bot password. Instead, just specify the groups explicitly:
!contains_any(user_groups, "extendedconfirmed", "sysop", "bot")
Test/examine interface and recent changes
Some variables at Special:AbuseFilter/test and Special:AbuseFilter/examine[note 4] will have different values from what they would have been, had the filter actually tripped at the time of the change.[note 5]
Suppose that Alice has a one-hour-old account, and adds the string "Hello, world! ~~~~" to a page that has only ever been edited by Bob.
One hour later, we look at her edit[note 6] with Special:AbuseFilter/examine. Some results may be surprising:
Variable | At save | At /examine or /test |
---|---|---|
added_lines | Hello, world! ~~~~ | Hello, world! [[User:Alice|Alice]] ([[User talk:Alice|talk]]) 21:07, 14 November 2019 (UTC)[note 7] |
page_recent_contributors | Bob | Alice Bob |
Order of operations
rlike and other keywords have a higher precedence than +, so the expression below is parsed as (added_lines rlike "foo") + "|bar". It does not check if added_lines contains "foo" or "bar":
added_lines rlike "foo" + "|bar" /* WRONG! */
Instead use:
added_lines rlike ("foo" + "|bar")
norm() and repeating characters
The norm() function normalizes confusable (spoofed) characters, then strips repeating characters, special characters, and whitespace. It applies these steps in exactly that order, so repeated characters are collapsed before the special characters and whitespace between them are removed. This can lead to unexpected results:
string := "A AB,BCC";
norm(string) == "ABC" /* FALSE */
norm(string) == "AABBC" /* TRUE */
When in doubt, always use the debugging tool.
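The ordering effect can be reproduced with a rough Python model. This is a sketch under stated assumptions, not the extension's code: the helper names mirror AbuseFilter's rmdoubles, rmspecials and rmwhitespace functions, but their bodies are simplified ASCII-only approximations, and the confusable-character step is left out because the sample string contains none.

```python
import re

def rmdoubles(text: str) -> str:
    # Collapse each run of one repeated character into a single character.
    return re.sub(r"(.)\1+", r"\1", text)

def rmspecials(text: str) -> str:
    # Approximation: drop everything except ASCII letters, digits and whitespace.
    return re.sub(r"[^0-9A-Za-z\s]", "", text)

def rmwhitespace(text: str) -> str:
    # Remove all whitespace.
    return re.sub(r"\s+", "", text)

def norm_model(text: str) -> str:
    # Order described above: repeats first, then specials, then whitespace.
    return rmwhitespace(rmspecials(rmdoubles(text)))

def collapse_last(text: str) -> str:
    # Hypothetical alternative order, shown only for contrast.
    return rmdoubles(rmwhitespace(rmspecials(text)))

print(norm_model("A AB,BCC"))     # AABBC
print(collapse_last("A AB,BCC"))  # ABC
```

In the sample string only "CC" is a run of adjacent repeats when the repeat step executes. The two As and the two Bs become adjacent only after the space and comma are removed, which is too late for them to be collapsed.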
Creating a tag
Tags are created automatically when a filter is saved. Do not use the interface at the top of Special:Tags, unless you also want to activate the tag for manual use. Mistakenly activated tags may be deactivated from Special:Tags.
Be careful with arrays
The only operation that really works with arrays is length. Other operations will implicitly cast an array to a string first, which can give an unintuitive result. For example, page_namespace in [12, 34] is in fact equivalent to string(page_namespace) in "12\n34\n", which is a substring test. Therefore, when page_namespace is 1, 2, 3, or 4, the expression will evaluate to true as well. In the above case, use equals_to_any(page_namespace, 12, 34) as a workaround instead.
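The substring behavior is easy to check with a small Python model. The functions below are illustrative stand-ins, not AbuseFilter code: array_to_string assumes the cast described above (each element followed by a newline), and equals_to_any_model compares plain Python values.

```python
def array_to_string(items) -> str:
    # Assumed cast, per the text above: each element followed by a newline.
    return "".join(f"{item}\n" for item in items)

def in_model(needle, haystack_items) -> bool:
    # "in" becomes a substring test against the cast string.
    return str(needle) in array_to_string(haystack_items)

def equals_to_any_model(value, *candidates) -> bool:
    # True only if value equals one of the candidates exactly.
    return any(value == candidate for candidate in candidates)

namespaces = range(0, 40)
print([ns for ns in namespaces if in_model(ns, [12, 34])])
# [1, 2, 3, 4, 12, 34]
print([ns for ns in namespaces if equals_to_any_model(ns, 12, 34)])
# [12, 34]
```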
Be careful with division
One might expect that page_namespace / 2 === 0 will check if page_namespace is either 0 or 1. However, the division operation does not discard the remainder: if the numerator is not divisible by the denominator, the result is a float, so 1 / 2 is 0.5 rather than 0. In the above case, use equals_to_any(page_namespace, 0, 1) instead.
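A short Python model makes the difference visible. It assumes only what the text above states (an integer result when the division is exact, a float otherwise) and models === as a comparison of both type and value. The names af_divide and strictly_equal are hypothetical.

```python
def af_divide(numerator: int, denominator: int):
    # Keep the remainder: return a float unless the division is exact.
    quotient = numerator / denominator
    return int(quotient) if quotient.is_integer() else quotient

def strictly_equal(left, right) -> bool:
    # Model of ===: same type and same value.
    return type(left) is type(right) and left == right

# The tempting check only matches namespace 0, because 1 / 2 is 0.5.
print([ns for ns in range(4) if strictly_equal(af_divide(ns, 2), 0)])  # [0]

# Listing the intended values matches both 0 and 1.
print([ns for ns in range(4) if ns in (0, 1)])  # [0, 1]
```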
Numeric comparisons with null
As in PHP, null is smaller than any number, i.e. null < -1234567 is true. This is especially problematic when using edit_delta: if the action being filtered is not an edit, edit_delta is null and edit_delta < -5000 will evaluate to true. Remember to check that action === "edit" when using edit_delta like that.
Disappearing filter logs
Filter logs can disappear under these circumstances:
1) If an edit is saved and then rev-deleted or oversighted, the filter log disappears from view, even for sysops.
2) Oversighters can remove the logs of either saved or unsaved edits.
Edit filter hit counters always increment, so a filter may have fewer visible logs than its number of hits.
See also
Notes
- ^ The title was shamelessly stolen from C Traps and Pitfalls.
- ^ Not by user or ip
- ^ All these groups have extendedconfirmed rights, according to Special:UserGroupRights
- ^ When examining recent changes. Examining old filter hits will show the correct values.
- ^ See also T102944
- ^ Not a filter log entry, if any exists
- ^ This is actually the value of added_lines_pst