Wikipedia:Bots/Requests for approval/NovemBot 3
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Approved.
Operator: Novem Linguae (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 20:49, Friday, April 15, 2022 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): AutoWikiBrowser
Source code available:
Function overview: Remove the following text from pages: Please select the <code><span style="color:#0645AD;">New section</span></code> tab above to post your comments below.
Links to relevant discussions (where appropriate): Wikipedia:Village pump (miscellaneous)/Archive 70#Talk page "Please select the New section tab ... " message, Wikipedia:Bot requests#Talk page "Please select the New section tab ... " message removal
Edit period(s): One time run
Estimated number of pages affected: 7,000
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: Will use a search for insource:"tab above to post your comments below.", namespace=all to generate the list. Will probably search for the following within the page and replace it with blank. The match will include a newline at the beginning that also gets deleted, so as not to leave a blank line:
Please select the <code><span style="color:#0645AD;">New section</span></code> tab above to post your comments below.
<!-- please add new posts to the bottom -->
The VPM thread is pretty new as of the time of this filing, but is leaning toward consensus to remove. By the time this BRFA gets reviewed, there will hopefully be enough discussion there for a clear consensus.
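The replacement described in the function details can be sketched in Python as follows. This is only an illustration under assumptions: the actual run uses AutoWikiBrowser's find-and-replace, the names `MESSAGE`, `PATTERN` and `strip_message` are hypothetical, and the sketch covers only the message line, not the trailing HTML comment.

```python
import re

# The message text as quoted in the function overview. re.escape() keeps the
# dots and other punctuation literal.
MESSAGE = (
    'Please select the <code><span style="color:#0645AD;">New section'
    '</span></code> tab above to post your comments below.'
)

# Match the newline *before* the message as well, so that removing the
# message does not leave an empty line behind.
PATTERN = re.compile(r"\n" + re.escape(MESSAGE))


def strip_message(wikitext: str) -> str:
    """Return wikitext with every occurrence of the message removed."""
    return PATTERN.sub("", wikitext)


# Hypothetical talk page content for illustration.
sample = "{{WikiProject Example}}\n" + MESSAGE + "\n\n== First topic ==\nHello."
cleaned = strip_message(sample)
print(cleaned)
```

Including the leading newline in the match is what keeps the page from gaining a stray blank line where the message used to be.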
Discussion
Approved for trial (30 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. ProcrastinatingReader (talk) 21:27, 15 April 2022 (UTC)
- ProcrastinatingReader. Trial complete. 30 edits. Some things I noticed.
- I decided to mark the edits as minor.
- The core regex is:
\nPlease select the <code><span style="color:#0645AD;">New section<\/span><\/code> tab above to post your comments below.
- After immersing myself in some test edits, I added a couple other things to the regex that aren't explicitly mentioned in the consensus discussions but that I think should also be deleted:
(?:\n==Untitled==)?\nPlease select the <code><span style="color:#0645AD;">New section<\/span><\/code> tab above to post your comments below.(?:\n<!-- please add new posts to the bottom -->)?
As I immerse myself more in this project, I may add more. If I shouldn't do this let me know. These extra things must be touching the "Please select the ..." text, are often found adjacent to it, and I think it's an improvement to delete them.
- The bot added some pages where I've discussed the search phrase to its list. For example, WP:VPM, WP:BOTREQ, and this BRFA. I think this BRFA will be the only one that results in a delete. When the bot incorrectly deletes the phrase from here, I will just revert it if that's OK.
- –Novem Linguae (talk) 03:14, 18 April 2022 (UTC)
- @Novem Linguae: Few notes:
- You mention the regex will be tweaked, and I know you tweaked it to add "==Untitled==" for example, but I assume you noticed this after running some edits, so some articles have leftover issues (e.g. [1]), i.e. cases where a replace was done before you noticed the extra text to add. I feel these artefacts shouldn't be left over on article talk pages and should also be cleaned up, though I note those headers were added as part of genfixes in another bot task. If editors manually saw this text on talk pages, they might have added a custom heading that wouldn't match. To that end, you may want to regex search for section titles associated with the text (an optional heading followed by any amount of whitespace, then the message). Maybe something like
(?:\n+==+.*==+)?\nPlease select the <code><span style="color:#0645AD;">New section<\/span><\/code> tab above to post your comments below.(?:\n<!-- please add new posts to the bottom -->)?
(nb: untested)
- "When the bot incorrectly deletes the phrase from here, I will just revert it if that's OK." I'd suggest limiting it to talk namespaces (ns % 2 == 1), which seems to be the bulk of the additions, and seeing where that leaves you.
- ProcrastinatingReader (talk) 15:49, 21 April 2022 (UTC)
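The talk-namespace test suggested above (ns % 2 == 1) could be sketched like this. It assumes the standard MediaWiki numbering, where talk namespaces have odd positive IDs. The page list is made up for illustration, and `is_talk_namespace` is a hypothetical helper, not part of AutoWikiBrowser.

```python
def is_talk_namespace(ns: int) -> bool:
    """Talk namespaces have odd, positive IDs (1 = Talk, 3 = User talk, ...).

    The ns > 0 guard matters in Python, where -1 % 2 == 1, so the Special
    namespace (ID -1) would otherwise pass the odd-number test.
    """
    return ns > 0 and ns % 2 == 1


# Hypothetical (namespace ID, title) pairs standing in for search results.
pages = [
    (0, "Example"),
    (1, "Talk:Example"),
    (3, "User talk:Example"),
    (4, "Wikipedia:Village pump (miscellaneous)"),
    (5, "Wikipedia talk:Bots"),
]

talk_pages = [title for ns, title in pages if is_talk_namespace(ns)]
print(talk_pages)
```

Filtering this way would also keep project pages such as WP:VPM and this BRFA, which merely discuss the phrase, out of the run.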
- Sure, will do. Let me know if you want another trial or can move forward with the entire run. –Novem Linguae (talk) 16:09, 21 April 2022 (UTC)
- Approved. Entire run is fine. ProcrastinatingReader (talk) 21:39, 22 April 2022 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.
Updated RegEx, for the record: (?:\n+==[^=]+==+)?\nPlease select the <code><span style="color:#0645AD;">New section<\/span><\/code> tab above to post your comments below.(?:\n<!-- please add new posts to the bottom -->)?
–Novem Linguae (talk) 00:05, 25 April 2022 (UTC)
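For comparison, here is a rough Python sketch of how the updated regex differs from the untested suggestion made earlier in the discussion. It assumes Python's `re` module treats these patterns the same way AutoWikiBrowser's regex engine does, which seems likely for these constructs but has not been checked against the real run. The sample pages and variable names are hypothetical.

```python
import re

# Shared tail of both patterns, copied verbatim from the discussion above.
TAIL = (
    r'\nPlease select the <code><span style="color:#0645AD;">New section'
    r'<\/span><\/code> tab above to post your comments below.'
    r'(?:\n<!-- please add new posts to the bottom -->)?'
)

UPDATED = re.compile(r'(?:\n+==[^=]+==+)?' + TAIL)    # final regex
SUGGESTED = re.compile(r'(?:\n+==+.*==+)?' + TAIL)    # reviewer's suggestion

MESSAGE = (
    'Please select the <code><span style="color:#0645AD;">New section'
    '</span></code> tab above to post your comments below.'
)
COMMENT = "<!-- please add new posts to the bottom -->"

# Level-2 heading directly above the message, comment directly below it.
level2 = ("{{Banner}}\n\n==Untitled==\n" + MESSAGE + "\n" + COMMENT
          + "\n\n== Topic ==\nHi.")
# Level-3 heading directly above the message.
level3 = "{{Banner}}\n\n===Untitled===\n" + MESSAGE

updated_level2 = UPDATED.sub("", level2)
updated_level3 = UPDATED.sub("", level3)
suggested_level3 = SUGGESTED.sub("", level3)

print(repr(updated_level2))
print(repr(updated_level3))
print(repr(suggested_level3))
```

In this sketch the updated pattern removes a level-2 heading sitting directly above the message but leaves a level-3 heading in place, because `[^=]+` cannot start on a third `=`. The earlier `==+.*==+` form would remove a heading of any level.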