Template talk:Page reports

Note: All occurrences of the name "RELC list" have been changed to the new name "Page reports".
Wikipedia:Page reports

Page reports is a mini-project in which a bot collaborates with user templates. The bot can produce Reports (lists of pages) to a user's specification. For example: "List all pages in WikiProject Medicine that are not Low article quality". Templates are available to the user both for requesting a report (before) and for accessing the report (after).

Useful links
WP page: Wikipedia:Page reports
Central talkpage: Template talk:Page reports (here)
Templates
Modules
Categories
Related pages, examples

The Request template

{{RCLinked
<!-- Only one of these sources for identifying what pages to include -->
| template = WikiProject Medicine 
| category = 
| prefix   =
<!--namespaces to include in report, default value is below -->
| namespaces = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,100,101,108,109,710,711,446,447,828,829
<!--Include or exclude redirects defaults to true -->
| redirects = True
<!--Should talk pages of listed pages also be included? -->
| talkpages = False
}}
What else do you want to set as report rules? Werieth (talk) 20:13, 5 November 2013 (UTC)
Wow, this is fast. I am just setting up the taskforce structure, and don't have much time.
BTW, please never use "RC" for this (except inside your bot, invisible ;-) ), because Recent changes is something else, confusingly close.

-DePiep (talk) 20:21, 5 November 2013 (UTC)

This template would be for creating Recent Changes Linked lists, thus RCLinked would be a logical template name. Werieth (talk) 20:24, 5 November 2013 (UTC)
No, Special:RecentChanges is a completely different special page (tool, WP:RC). We work for Related changes (tool, WP:RELC), so RELC is appropriate. Your mixing them up shows how confusing it is ;-)
There is no such thing as related changes in MediaWiki; the term and special page that is used is Special:RecentChangesLinked. It may often be called related changes, but in the software it's called Recent Changes Linked (aka RCLinked). It shows all changes to pages that are linked to from the given page. Werieth (talk) 00:25, 6 November 2013 (UTC)
(Update below: I see your point). I don't mind what is or is not in mediawiki, or how it is called under the hood (and do you mean that Special pages are not in wikimedia code?). The user requests a Related changes overview (exactly that is the word in the toolbox in my wikiscreen). Even more so, there is another menu option, Recent changes (Interaction). Functionally they produce different results, so they are not the same. Also, for the user (say, a project editor) it is not about the technicality of using links to produce the list. That user wants & expects to see pages listed that are in the project. So "links" is also an under-the-hood word in this. -DePiep (talk) 06:46, 7 November 2013 (UTC)
Update: I see your point, and we can use it to serve the user better! I will work on this. -DePiep (talk) 09:50, 7 November 2013 (UTC)
No, it did not work. (I checked if a regular RC --as in: Recent changes from the menu-- would work on our page list. It does not, so any RC suggestion is off the table again.)
Also, it is not about the links, but about the list (and opening it - the opening too shows a list overview). -DePiep (talk) 20:43, 5 November 2013 (UTC)
(edit conflict) Of course, for simplicity, the templates should be named similarly. May I suggest: {{Page reports request}}, or {{Page reports/Request}} (rationally preferred, but less nice to Editor Smith). DePiep
  • A Request written in Article space (mainspace) shall not be processed. -DePiep (talk) 09:11, 9 November 2013 (UTC)

Parameters

About parameter names: let's use the verbal form like |add talkpages=. I prefer parameter names to be clear and descriptive, for average editors. That is why I suggest using names like |add talkpages= instead of |talkpages=. The problem with the single word (noun) is that it can mean other things too, so the user must think twice about its meaning. (Only those who work with it intensively, like we programmers, will learn to understand it by intuition.) For example, a user can enter |talkpages=4 and expect talkpages for ns=4 to be added (that is Wikipedia talk). Another example: a user can understand |sandbox= to mean "what is the sandbox", and enter a filename: |sandbox=Wikipedia:WikiProject/List of pages/Sandbox. Just a preventable confusion; it saves another visit to the documentation page. Most switch parameters can use a verb. - If the bot cannot take spaces in parameter names (is that so?), I suggest using underscores (rather than CamelCase or another programmer's form): |use_sandbox=. -DePiep (talk) 09:08, 7 November 2013 (UTC)

Would suggest using _ as it would simplify template parsing. Werieth (talk) 13:45, 7 November 2013 (UTC)
OK. -DePiep (talk) 17:00, 7 November 2013 (UTC)
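
A minimal sketch of how the underscore convention agreed above might be applied when reading parameter names. It assumes (this is not stated in the thread) that names are treated case-insensitively and that a user-typed space is accepted as an underscore; `normalize_param_name` is an illustrative name, not part of the bot.

```python
import re


def normalize_param_name(raw: str) -> str:
    """Turn a user-typed parameter name into the underscore form.

    Assumption: names are case-insensitive, and any run of spaces or
    underscores collapses to a single underscore.
    """
    name = raw.strip().lower()
    return re.sub(r"[\s_]+", "_", name)
```

With this, both "add talkpages" and "add_talkpages" would end up under the same key, so an editor does not need to remember which spelling the bot expects.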

Project and identification parameters

project
 Done
  • |project=
Default: none
Should be used to identify and group related reports together. Will enable meta reports/lists for any project.
Example, WPMED could define |project=WikiProject Medicine
Propose name change to |subtopic= ("project" already has two meanings in this environment). The parent topic is the top page of the Request (say "Wikipedia:WikiProject Medicine"). For stable identification. See below.

We should not limit this to WikiProjects. The bot and the setup should be able to cover Requests from userpages, templates, and whatever. So a better identification would be: the top page name (Wikipedia:WikiProject Medicine). That is the identifying name for all its reports (all its requests), and can be found easily from every page. A shortened name, and a shortcut link, can be used in writings all right. Same for a Request in template space: Identification would be "Template:Convert". -DePiep (talk) 22:07, 7 November 2013 (UTC)

  • Proposal: The definition for a group of reports is the top-page name (not talkpage) of any Request page: "Wikipedia:WikiProject Medicine", or "Template:Convert". So not free to set. All requests in such a topic are grouped. -DePiep (talk) 07:35, 8 November 2013 (UTC)
Still unclear to me. Werieth, will the bot work on every wikipage alike, or are you limiting it to Wikipedia space (or even smaller, WikiProjects)? -DePiep (talk) 10:12, 8 November 2013 (UTC)
My initial thought was to keep the request process as open as possible. The project parameter is the primary method for grouping reports. My thought was to create, whether it's in project space or user space, a Reports/<project> page where all reports with a given project name are listed. Werieth (talk) 11:04, 8 November 2013 (UTC)
I am using this idea as follows: the topic is, by default, the Top page (no subpage, and in subjectspace i.e. not talkspace). That is a stable identification for all Requests. Then, a user can add a "subtopic" in the Request, which will group those Reports in a separate subgroup. Say, the user Requests three Reports to research Reference issues. All Requests are given |subtopic=Ref Research, and our Access-the-pages overview template will list them nicely together (reading from the meta-reports). One more happy user, and no confusion (we will have good identifications). I will make examples.
I use values "topic" and "subtopic" for this ("project" has more meanings). -DePiep (talk) 10:48, 9 November 2013 (UTC)
request_id
 Done
|request_id=87654321
Generated and set by the bot once (written by bot in the Request template)
Essential: unique over all requests.
Format: for the botmaster to choose. Must be unique, so somehow the bot must remember which ids are used (but not where, or for what). Even deleted id's (from deleted requests) are not to be reused. For simplicity, we use a counting number as an example. The bot remembers which number was assigned last.

Workings:

1. The bot assigns this id when it discovers a new Request (Request without request_id), and writes it, like |request_id=987654, into that template.
2. The meta-report (job report) notes the id, and its fullpagename. This combination will be used to validate a request in the future (negative validation occurs when: a job is moved to another page by a user; a tampered number can be detected as something wrong). For now the bot does not need to correct these errors. Note that the pagename is not part of the id. It is a flag used to spot changes. Also, when the regular Request parameters are changed by an editor, the id shall not change.
3. In every next job run, the bot replaces the meta-report(s) with this request_id. Other meta-reports are untouched.

The point is that we want to identify each job request over time. The settings may change by an editor, but the id stays the same. This also allows identification for multiple requests on a single page. The id-to-pagename check is useful to note changes (to start with, it could give an error message; solving it could be done separately).

First major usage will be the job report. A meta-report will be identified by this same id.

Wherever the report is written, any update will have to overwrite the right report, even when there are two on a page. The id numbering is also present in the list of meta-reports (probably Lua data pages). Maybe errors can be detected from that list too (Lua code). -DePiep (talk) 09:19, 8 November 2013 (UTC)

This will take some thinking on how to implement. However, my thought for the reports would be a 1 to 1 relationship. Each request template has its own page. Usage of multiple templates or on other pages (where the bot cannot simply overwrite the page contents) is asking for a headache and problems. Werieth (talk) 11:46, 8 November 2013 (UTC)
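
The id rules discussed above (a counting number, never reused, checked against the page name) could be sketched roughly as follows. This is only an illustration under stated assumptions: `RequestRegistry` and its methods are hypothetical names, and the in-memory dict stands in for whatever storage the botmaster chooses.

```python
from dataclasses import dataclass, field


@dataclass
class RequestRegistry:
    """Hypothetical in-memory stand-in for the bot's id bookkeeping."""
    last_id: int = 0
    pages: dict = field(default_factory=dict)  # request_id -> full page name

    def assign(self, page_name: str) -> int:
        # Ids only ever count upward, so a deleted request's id is never reused.
        self.last_id += 1
        self.pages[self.last_id] = page_name
        return self.last_id

    def validate(self, request_id: int, page_name: str) -> bool:
        # False signals a moved request or a tampered/unknown id.
        return self.pages.get(request_id) == page_name

    def forget(self, request_id: int) -> None:
        # Dropping the record does not roll back last_id.
        self.pages.pop(request_id, None)
```

The page name is stored next to the id but is not part of it, which matches the point that the name is only a flag for spotting changes.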

Data sources

  • Only one of these options can be used at a time
  • By default all listed talk pages are converted to their associated non-talk pages.
This enables one to list a WikiProject banner and get all articles within the scope of the project.
template
 Done
  • |template=
Default: none
Interacts with: #category, #prefix, #template
category
 Done
  • |category =
Interacts with: #category, #prefix, #template
prefix
 Done
  • |prefix =
Interacts with: #category, #prefix, #template

What is prefix? Talk pages already have an option; what are you referring to by subpages? And how are you defining update frequency? Werieth (talk) 20:42, 5 November 2013 (UTC)

e.g. |prefix=Template:Convert: The default full list is generated by |template =WikiProject Medicine. That list (let me call it the "first list", the one before any operations) will be reduced by ns filters, and maybe expanded by its Talkpages. All fine.
This parameter gives the option to generate the "first list" from somewhere else: the list of prefix pages. Say in Project space, we can ask for the Page reports for all subpages of the project. That is, all pages with prefix "Wikipedia:WikiProject Medicine" (these pages may not all have the project template, so that default route would miss out pages).
Another example is from Template space. A template developer/follower might want to see all changes to a certain template -- including all its subpages. The prefix list (special page) |prefix=Template:Convert generates that "first list".
I agree this is non-default behaviour, so could be added later on. It's just we should not make it impossible. -DePiep (talk) 21:31, 5 November 2013 (UTC)
Quick suggestion for code simplicity: These source-criteria (default: project template transclusions, or my alternatives) are exclusive for simplicity. So if a |prefix= is set, the ns filter is not used.-DePiep (talk) 07:37, 6 November 2013 (UTC)
By default all namespaces are included, it would be kinda stupid for someone to create a report based off a prefix and then exclude that namespace. Werieth (talk) 18:20, 7 November 2013 (UTC)
??? If the prefix has an ns, how could "all namespaces" be included? Logic says only that one is included. And adding no ns in the prefix would mean "mainspace" right? Not "all namespaces to be checked". That is regular PrefixIndex workings. So: if prefix is used, any limitation in namespaces is not used. -DePiep (talk) 22:13, 7 November 2013 (UTC)
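
A rough sketch of the "only one source at a time" rule, together with DePiep's suggestion that a |prefix= source makes the namespace filter moot. That second point was not settled in this thread, so treat it as one possible reading. The function names and the dict-of-strings input are assumptions made for illustration.

```python
SOURCE_PARAMS = ("template", "category", "prefix")


def pick_source(params: dict) -> tuple:
    """Return (source_name, value); exactly one source must be non-empty."""
    used = [(k, params[k].strip()) for k in SOURCE_PARAMS
            if params.get(k, "").strip()]
    if len(used) != 1:
        raise ValueError("set exactly one of: " + ", ".join(SOURCE_PARAMS))
    return used[0]


def effective_namespaces(params: dict, default_ns: list):
    """Namespace filter to apply, or None when a prefix makes it moot."""
    source, _ = pick_source(params)
    if source == "prefix":
        # Assumed reading: the prefix already fixes the namespace.
        return None
    raw = params.get("namespaces", "").strip()
    if not raw:
        return list(default_ns)
    return [int(n) for n in raw.split(",") if n.strip()]
```

A Request that sets both |template= and |prefix= would raise an error here instead of silently picking one of them.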

Report options

namespaces
 Done
  • |namespaces= 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,100,101,108,109,710,711,446,447,828,829
Options: list of ns numbers
Defaults to all namespaces
Namespaces to include in report
Note that by default talk pages are converted to their associated namespace.
This note is irrelevant here.
This means that entering just "4" would yield the same as just "5"? (and adding talkpages only through parameter add_talkpages?). OK with me, but we would lose the option to include many content pages and only a few talkpages (say 3 and 7) -- and not setting the add_talkpages at all.
I suggest 1. using the ns's as themselves be it talkpage or not, and 2. have the add_talkpage parameter, when set, overrule these ns talkpage numbers. (True=add all, False=remove all, no value=by ns list). -DePiep (talk) 22:20, 7 November 2013 (UTC)
  • Werieth, you still have not described how the parameters interact: |namespaces= and |add_talkpages=. The single question is: how are the talk-namespaces (the odd ones) added to or removed from the resulting list? -DePiep (talk) 07:49, 8 November 2013 (UTC)
    My initial thought was to filter the raw source data via these namespace values. Let's say we are working off a template or category which is used in multiple namespaces; it would ignore data coming from namespaces not identified in this parameter. Werieth (talk) 11:42, 8 November 2013 (UTC)
    All |add_talkpages= does is add a link to the corresponding talk page in the report. See an updated example at User:Werieth/sandbox Werieth (talk) 11:42, 8 November 2013 (UTC)
Below in #add_talkpages I'll expand on the other suggestion I have. -DePiep (talk)
redirects
 Done
  • |redirects=true
Options: true, false
Defaults to true
Whether to include or exclude redirects
Werieth, this reminds me of this. In a WLH list, I have this option ("Show/Hide redirects"), and also "Show/Hide links" (the third option being "Show/Hide transclusions", which is obvious to Show for us here). The question arises: if you can filter on Redirects, can you also filter on "Links"? If yes, that could be a similar switch here. -DePiep (talk) 10:40, 9 November 2013 (UTC)
I have no clue what you mean by show/hide links. Werieth (talk) 12:54, 9 November 2013 (UTC)
Here is a What Links Here page (WLH special page) for a regular template: [1]. Listed pages have added "(transclusion)", "(redirect page)", or "" -blank- for a regular link.
The page has a box "Filter" that allows us to choose:
Hide transclusions | Hide links | Hide redirects . Clicking a "Hide" option will filter out those from the list. Then the word "Hide " changes (toggles) into "Show".
Since you use the option "Redirects" (which seems to work alike), there might also be an option to include/exclude "links" too. The third option to exclude "transclusions" to me seems not needed, but maybe it is. -DePiep (talk) 15:29, 11 November 2013 (UTC)
add_talkpages
 Done
  • |add_talkpages=false
Options: true, false
Defaults to false
Should an additional link for the page's talk be included?
  • Comments:

"True" adds all talkpages of the listed non-talkpages. Reduces the need to specify them in the ns parameter. And it works especially nicely for the |ns= selection. (Adding ns=11 to get the Template Talk pages listed would only list those with the template, right?) Setting it to false is also a simple switch, easier than checking the ns numbers. -DePiep (talk) 21:31, 5 November 2013 (UTC)

Toggling talkpages above to True by default isn't that big of a deal. -Werieth (talk) 11:06, 6 November 2013 (UTC)
When set (either T or F), will this overrule any odd ns number that is set? -DePiep (talk) 17:02, 7 November 2013 (UTC)
See the second bullet on #Data sources, by default talk pages that are part of the list are converted to the non-talk version. Werieth (talk) 17:06, 7 November 2013 (UTC)
Another proposal below. -DePiep (talk) 12:30, 11 November 2013 (UTC)
  • Another logic. Rule: when set, this |add_talkpages= overrules the talkpages listed in the |namespaces= list (including the default ns=all).
  • Example 1:
|namespaces=0,2,4,5,6,7, |add_talkpages= (empty or missing)
→ ns's reported: 0,2,4,5,6,7 (same as your proposal)
  • Example 2:
|namespaces=0,2,4,5,6,7, |add_talkpages=true
→ ns's reported: 0,1,2,3,4,5,6,7 (all talks are added; same as your proposal). btw, this looks like it should be in the "generating source list" params. It really is expanding the source list!
  • Example 3:
|namespaces=0,2,4,5,6,7, |add_talkpages=false
→ ns's reported: 0,2,4,6 (all talks are deleted)
Motivation: this gives the user a simple switch to control the output wrt talkpages. It is way easier than having to check the ns numbers. I think the logic is easy to grasp. -DePiep (talk) 12:30, 11 November 2013 (UTC)
I would rather keep things uncomplicated. Keeping all the primary ns filtering in one parameter is the most sane thing. Werieth (talk) 13:42, 11 November 2013 (UTC)
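
For comparison, DePiep's proposed override could be sketched as below. This illustrates the proposal only; it is not what the bot does, and Werieth prefers keeping all filtering in |namespaces=. It relies on the MediaWiki convention that subject namespaces are even and each talk namespace is the next odd number, and it reads "all talks are deleted" as keeping only the even namespaces. The function name is hypothetical.

```python
def resolve_namespaces(namespaces, add_talkpages=None):
    """Apply the proposed add_talkpages override to a namespace list.

    add_talkpages is None (unset), True, or False.
    Subject namespaces are even; their talk namespaces are n + 1.
    """
    listed = set(namespaces)
    subjects = {n for n in listed if n % 2 == 0}
    if add_talkpages is None:
        result = listed                                 # use the list as given
    elif add_talkpages:
        result = listed | {n + 1 for n in subjects}     # add every talk ns
    else:
        result = subjects                               # drop every talk ns
    return sorted(result)
```

Calling it with the list 0,2,4,5,6,7 and the three settings of add_talkpages mirrors the three examples given in the proposal.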
max_report_age
 Done
  • |max_report_age=7
How often the report should be updated. Daily, every 7 days, once a month, every other month or whatever.
Options: integer number
Set to 0 for an update on the next run; the bot will then reset the value to its default, which is 7 days


Interacts with: some hardcoded setting to prevent bot abuse/overuse?
Name change proposal: |update after=30 or |update interval=30? (the correct wording for it). The frequency is "1/30 days". -DePiep (talk) 13:00, 7 November 2013 (UTC)

And how are you defining update frequency? Werieth (talk) 20:42, 5 November 2013 (UTC)

A project page list does not alter that much. Maybe this parameter could take |update interval=0 for an update asap, and then change the 0 into "30" or so - switch itself off. -DePiep (talk) 21:31, 5 November 2013 (UTC)
What context are we working with, milliseconds, minutes, hours, years, decades? Werieth (talk) 21:39, 5 November 2013 (UTC)
Dunno. Any suggestion? -DePiep (talk) 22:27, 5 November 2013 (UTC)
My thought would be either hours or days with a command option for forcing a faster update. Werieth (talk) 00:10, 6 November 2013 (UTC)
Hi. If you're talking about how often to update the primary list of pages being watched, hours would be better than days (once an hour?) - watchers would then be effectively patrolling new medical pages (once they've been tagged with {{WPMED}}) as well as recent changes. --Anthonyhcole (talk · contribs · email) 06:28, 6 November 2013 (UTC)
No, this is about how often the list is updated (renewed). I'd say once/14 or once/30 days would do. That is only to catch (add) pages that were recently added to the project. Plus an option to request "update asap, for once" after big changes (e.g., when many project templates added to pages). Your point, how often people click on the RELC link (open the RELC special page) we do not use in the bot setup directly. User experience will be discussed in another section. -DePiep (talk) 07:23, 6 November 2013 (UTC)
  • Recap: |update frequency=14 is the interval in days between two bot runs for a single Request. So the data page is updated on January 25, and February 8 (after 14 days).
Note 1: "frequency" is the wrong word (that would ask for "frequency=1/14 days" to be correct). So better change it to |update after=14 or |update interval=14.
Note 2: About the option |update after=0: I meant this. That "0" could be set to ask the bot: "update asap". It is useful when many changes have happened in the tracking template (the number of project pages has changed seriously), and waiting weeks is not desired. Then, to prevent bot abuse (updating almost continuously), after an "asap" update the bot sets the zero back to "30" or so (editing the Request).
Note 3: Maybe it is more systematic to write 30d (like the archive bot does). That makes it easier to add different measures by a code ("4w", "asap"?). Botmaster's choice?
Note 4: An updating interval set in hours may be too much detail. Remember, we only list the pages in the project, tracked by a project template. That number of pages does not change too much within hours, I assume. Even "1 day" updates would be very hot. (Can the bot be triggered when the number of project pages has changed by, say, over 10?)
Note 5: the bot can have a hardcoded default. 30d? -DePiep (talk) 13:00, 7 November 2013 (UTC)
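
The interval ideas from this thread (a plain number of days, the "30d"/"4w"/"asap" notation floated in Note 3, and a 0 that switches itself off after one run) could look roughly like this. The accepted suffixes and all function names are assumptions; the 7-day default is the one named in the parameter description.

```python
from datetime import datetime, timedelta

DEFAULT_INTERVAL_DAYS = 7  # default named in the parameter description


def parse_interval(value: str) -> int:
    """Days between runs; accepts '14', '30d', '4w', or 'asap' (treated as 0)."""
    v = value.strip().lower()
    if v == "asap":
        return 0
    if v.endswith("w"):
        return int(v[:-1]) * 7
    if v.endswith("d"):
        return int(v[:-1])
    return int(v)


def is_due(last_run: datetime, interval_days: int, now: datetime) -> bool:
    # 0 means "update on the next run", regardless of the last run.
    return interval_days == 0 or now - last_run >= timedelta(days=interval_days)


def value_after_run(interval_days: int) -> int:
    # An "asap" request switches itself off by reverting to the default.
    return DEFAULT_INTERVAL_DAYS if interval_days == 0 else interval_days
```

After a run triggered by 0, the bot would write the result of `value_after_run` back into the Request, which is the anti-abuse step described in Note 2.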


in_category
 Done
|in_category=
Cross references an existing source list with a category
Useful for finding all unsourced BLPs within a project's scope
May be defined several times to further filter the results
Good feature. Repeated parameter I guess (not a comma separated list)?
Repeated, not a comma-separated list, as category names may contain commas. Werieth (talk) 14:00, 9 November 2013 (UTC)
not_in_category
 Done
|not_in_category=
Removes pages that are in that category
Usage example: list all WP:MED pages, except those in the Low Importance category
May be defined several times to further filter the results
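
The two category filters amount to intersecting and subtracting sets, applied once per repeated parameter. A minimal sketch, assuming category membership is already available as a dict of sets (a hypothetical stand-in for whatever lookup the bot really uses):

```python
def filter_by_categories(pages, category_members,
                         in_category=(), not_in_category=()):
    """Keep pages in every in_category and in none of the not_in_category.

    category_members maps a category name to a set of page titles.
    """
    result = list(pages)
    for cat in in_category:
        members = category_members.get(cat, set())
        result = [p for p in result if p in members]
    for cat in not_in_category:
        members = category_members.get(cat, set())
        result = [p for p in result if p not in members]
    return result
```

Because each repeated |in_category= narrows the list further, listing two categories returns only pages that are in both.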
disabled
 Done
|disabled=False
Options: true, false (default =false)
|disabled=true will tell the bot to skip updating this request.

Non-bot parameters

These are parameters that are not used by the bot, but help the user describe the Page reports. The bot simply passes them through to the meta-report (so that templates can use them).

summary
 Done
  • |summary =
Brief summary of what the report's purpose is. Is not used by the bot.

Parameters (proposed)

exclude namespaces

  • |exclude_namespaces=1,3,10,11
Options: list of ns numbers
Interacts with: #namespaces
Would cut out ns's from the |namespaces= list.

Adding proposal |exclude ns=, that would cut out ns's from the |namespaces= list. btw, the default is |namespaces=all I suggest. -DePiep (talk) 07:37, 6 November 2013 (UTC)

Talking about the ns list: Wrong place, and a bad idea, so I struck it. -DePiep (talk) 19:04, 10 November 2013 (UTC)
Some sp edits. -DePiep (talk) 13:22, 11 November 2013 (UTC)

max_size

|max_size=
Maximum size of pages to be included in the list

min_size

|min_size=
Minimum size of pages to include in the list

list_images

|list_images=
Instead of listing the resulting pages that were found, create a basic list of images used on those pages.

list_templates

|list_templates=
Instead of listing the resulting pages that were found, create a basic list of templates used on those pages.

title_before

|title_before=
Used for alphabetical sorting; this is the cutoff point at which the list should stop reporting

title_after

|title_after=
Used for alphabetical sorting; this is the cutoff point at which the list should start reporting

age_of_last_edit

|age_of_last_edit=
Age in days of the last edit; only returns pages where the last edit was at least X days ago

has_template

|has_template=
Filter the primary source list to only those pages that include a particular template

does_not_have_template

|does_not_have_template=
Filter the primary source list to only those pages that do not have a particular template

talkpage_has_template

|talkpage_has_template=
Filter the primary source list to only those pages whose talk page includes a particular template

talkpage_does_not_have_template

|talkpage_does_not_have_template=
Filter the primary source list to only those pages whose talk page does not have a particular template

talkpage_exists

|talkpage_exists=
Filter the report to only include those pages that either do or do not have a talk page
Overly complicated. -DePiep (talk) 11:56, 9 November 2013 (UTC)
Might be complex but it could be useful (e.g. find all images with talk pages, or find all templates without talk pages). Werieth (talk) 12:58, 9 November 2013 (UTC)

Non-bot parameters (proposed)

These parameters are not used by the bot. They are used to set up the user-side templates. The bot does pass them through to the meta-report.

show_request

|show_request=true
Options: true, false (default: false) true
This parameter is not used by the bot; just written back in the meta-report.
Set to true, the Request template is visible as a template, presenting the settings.
Set to false, the Request is not visible.

Maybe there could be a second level: |show_all=true to show all technical aspects (target page names, Request id, ...) -DePiep (talk) 23:06, 7 November 2013 (UTC)

shortcut

|shortcut=[[WP:MED]]
Options: A shortcut to the WikiProject (like WP:MED), including the brackets. Optional
Not used by the bot (but reported back in meta-report).
When set, the various templates can use this link to link to the project page (or other page). If absent, the templates will link to the top page (e.g. WikiProject Medicine, labeled without ns). -DePiep (talk) 07:43, 8 November 2013 (UTC)

add_page, add_topic, add_subtopic

|add_page=
Lets a user add a page manually to the Request. Not numbered, can be multiple
The Bot does not use this, but passes it through to the meta-report. Our access-templates add this page to the list of Reports.
Example: |add_page=Braille will make Braille available, in the access template, along with regular Reports.
|add_topic=, |add_subtopic=
Adds topics to our access templates for Reports. Can be added repeatedly (same parameter reused)
Not used by the bot, pass-through to the meta-report.
Allows user to put two Report topics (e.g., two from Wikiprojects) together. Subtopic can be a specification within that topic.
Would suggest not using numbered parameters in the request template. Would just make it overly complex to parse. Just repeat the add_subtopic as many times as needed. Werieth (talk) 13:01, 9 November 2013 (UTC)
These are not used by the bot, so nothing to parse there. They only need to be passed through to the meta-report output 1:1 (like "|add_page1=Braille"). With a number, they are differently named params just like "template=" and "category=" are different (parsing in module/lua can be done).
There is another drawback with just repeating them, not numbering. The show-the-request template code in {{Page reports/Request}} uses old-style wikicode. That can read a parameter name only once, so it would show only the last input of that series. Not a big issue, but not needed. Pass them through will do. -DePiep (talk) 13:57, 9 November 2013 (UTC)
Honestly I think keeping it sane by just repeating the param name would be best, if someone wants to see all of that information they can look at the meta report. Werieth (talk) 13:55, 11 November 2013 (UTC)
As you propose, indeed it can work. Same parameter can be reused, meta-report can manage. -DePiep (talk) 14:23, 11 November 2013 (UTC)
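
Since the outcome here is that the same parameter name may simply be repeated, a reader of the Request has to keep every value instead of only the last one. A simplified sketch of that idea, assuming one parameter per line and no nested templates (a real wikitext parser would need more care); `parse_request` is an illustrative name:

```python
def parse_request(wikitext: str) -> dict:
    """Collect '|name=value' lines of a Request into {name: [values...]}.

    Assumptions: one parameter per line, no nested templates. Lines that
    do not start with '|' (such as HTML comments) are skipped. Repeated
    names keep every value, in order of appearance.
    """
    params = {}
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue
        name, value = line[1:].split("=", 1)
        params.setdefault(name.strip(), []).append(value.strip())
    return params
```

Because values are never split on commas, a repeated value that itself contains a comma survives intact.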

Hardcoded bot settings

  • maximum page size by number of links (forces a split into new page)?
  • Page reports page name(s) (list pages to be updated)?
    Each report will be confined to the page where the template was posted. Werieth (talk) 17:54, 7 November 2013 (UTC)
I don't find this a good idea. Page names are proposed and discussed in #The "Page reports" pages. -DePiep (talk) 21:20, 7 November 2013 (UTC)
Given that what I am envisioning would have the potential to go far far beyond a simple RELC page, it is. The two parameters that I am thinking about in #Project Parameters would enable the creation of a summary page which lists and provides links to the individual reports. Werieth (talk) 21:24, 7 November 2013 (UTC)
I am not saying the pages should be named "RELC list" or so (that is just a preliminary project name; and it is not in my proposed page names). I am saying a. writing dedicated pages everywhere is bad practice and b. you are expected to propose and discuss. I have envisioned things too. -DePiep (talk) 21:41, 7 November 2013 (UTC)
Given the amount of data and the number of ways it can be manipulated, dedicated pages are just about the only way it will work. The site already has issues saving just a full bare list of pages for WPMED. Defining a few static generic reports like "All pages in project" and its variants is OK; however, with the type of bot I'm planning we can have a LOT more power and I don't think that limiting it to just a few report types is a good idea. If you are not interested in the expanded project I am more than happy to create the basic tool you want and provide that to you while I develop the more comprehensive and feature-rich tool for others. Werieth (talk) 21:54, 7 November 2013 (UTC)
This thread is about page names. I don't see how the scope of the bot potential could mean that we shouldn't use systematic page naming. -DePiep (talk) 18:51, 10 November 2013 (UTC)

Deferred or disapproved parameters

These parameter proposals are deferred (to think about for later versions) or disapproved completely (for being logically unsound or not needed).

add_subpages

Deferred or Not done
  • |add_subpages=true
Options: true, false (default: false)
Adds all subpages of a given page

Adds all subpages of a given page to the "First List", without the need for the check template. Only mainspace has no subpages. A functionally clear and simple switch. Yes there is overlap with other switches, but easy to grasp.

Example: Template:NavPeriodicTable (wp:elements). The option adds the subpages.
This one too is not the default way of generating the list, but could be possible later on. Should not be made impossible. (Or, in general, I suggest the bot should not be hardcoded into a single way of list generating.) -DePiep (talk) 21:31, 5 November 2013 (UTC)
Wouldn't this and the prefix generator be redundant? Werieth (talk) 21:39, 5 November 2013 (UTC)
re Redundancy. Is not the intention, but there is an overlap. |prefix= lists all sub pages, say in a wikiproject (|prefix=Wikipedia:WikiProject Medicine). But |add subpages=true would work as a secondary with the basic list (as sourced by project template). When a page X is listed because of that template, this switch then would add all subpages of page X. That is: page X/doc, page X/data (irrespective whether they have that project template on their talkpage).
In other words: prefix generates a basic list, include_subpages only expands a basic list. (Logic says: when prefix is used to generate the list, then include_subpages has no effect because the basic list already has them.)
Notes: I think this useful for exploring project, template and module space. They work with subpages easily.
Not essential, no need to add this to the first version. We might look at this later. -DePiep (talk) 13:18, 7 November 2013 (UTC)
  • Once you get into multiple pages with subpages, this needs to be done carefully, as you can quickly get an out-of-control list. Werieth (talk) 14:21, 7 November 2013 (UTC)
The bot can cut off if the Report page is over a size (or # of pages). Say at 1MB Report size. Then add a botmessage (error/warning). What else can go out of control? -DePiep (talk) 18:46, 10 November 2013 (UTC)

split

Deferred to possibly a future version
  • |split=
Options: number of pages ?; a code to split by ns, or "Class" (FA--stub article quality)?
Interacts with: hardcoded setting for maximum page size
Says how to split the full Report list over multiple pages

A code to split a (large) page over multiple pages. E.g., the WP:MED page could be split in 4 pages with 28.000/4 pages each (~ 0–9, A–H, I–O, P–Z). Or a code |split=class could say to split by quality class (FA-A-GA- ... Stub (=7) and page "other"). -DePiep (talk) 22:39, 5 November 2013 (UTC)

Configuring the split format isn't as easy. Offhand, my first thought for pure ease of development would be to split on every X pages, i.e. every page contains at most say 1,000 pages, which would be sorted alphabetically. In order to keep things sane, whatever splitting methods we develop must work with all page source methods (template, category, prefix etc.). Werieth (talk) 00:10, 6 November 2013 (UTC)
Very quick, to be fleshed out later: 1. default splitting is by max_page_size. The bot has an absolute max (say 30000 links/page; nice testing with current WP:MED; note that if the request is |add talk=true, we'll have 64000 links in the First List). The bot definitively starts a new page with 30001. Such straight page breaks are numbered 1-2-3 ... (a user request like "|max page size=25000" (links) could be added later). -DePiep (talk) 07:37, 6 November 2013 (UTC)
Can't we sort & split by namespace (ns), by default even? After Articles+Talk, there are not many projects that have 30.000 non-article links. They could be in one page most of the time.
Gotta go now. Will be back tomorrow to flesh this out. Then I'll have more time to explain. Don't break your mind on these points, for now. See you. -DePiep (talk) 07:37, 6 November 2013 (UTC)
Toggling talkpages above to True by default isn't that big of a deal. Instead of coming up with dozens of ways to split a report, let's take a different angle. Instead of the basic /1 /2 /3 splitting needed for size issues, we could just have a project invoke multiple report requests. If they want something beyond the basic default all-pages listing, have them configure and use a second/third etc. report template. That way we can define a fairly simple template without having to go crazy. If they want a report with just articles or article talk pages, they can use the namespace parameter to filter the report to just the relevant pages. Please remember to keep things simple: instead of creating a dozen different methods to split a report, we can just use a few existing parameters to redefine what is output by a report. Werieth (talk) 11:06, 6 November 2013 (UTC)
Minor point: better not name pages "/1, /2", because that would create subpages. Space + brackets like /Article (1), /Article (2) would be nice. As you say, this is splitting of last resort; normally splitting is not needed. The user should split the Request into two. -DePiep (talk) 17:10, 7 November 2013 (UTC)
OK for multiple Requests per project. How would we keep them apart? For example, a job report (meta-report) should only overwrite its own report. -DePiep (talk) 17:07, 7 November 2013 (UTC)
Smart bot is smart :P . Each report request should be on its own page, i.e. a report that is for just articles is placed on <project>/Articles, and for images it's <project>/Files. The list of reports (meta report) can be configured however you want. (My initial thought was just a large array in a Lua template.) Werieth (talk) 13:52, 11 November 2013 (UTC)
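The "split on every X pages" fallback discussed in this section can be sketched as follows. This is only an illustration in Python: `split_report` is a hypothetical helper, the 1,000 default is the "say 1,000" figure from the thread, and the "Name (1)" page naming follows the space-plus-brackets suggestion rather than any settled convention.

```python
from typing import Dict, Iterable, List


def split_report(
    base_name: str, titles: Iterable[str], max_links: int = 1000
) -> Dict[str, List[str]]:
    """Sort titles alphabetically and cut them into pages of at most max_links.

    Works on any list of titles, so it is independent of the source
    (template, category or prefix) that produced the list.
    """
    ordered = sorted(titles)
    if len(ordered) <= max_links:
        # No split needed: keep the plain page name.
        return {base_name: ordered}
    pages: Dict[str, List[str]] = {}
    for start in range(0, len(ordered), max_links):
        number = start // max_links + 1
        # "Articles (1)", "Articles (2)", ... avoids creating subpages.
        pages[f"{base_name} ({number})"] = ordered[start:start + max_links]
    return pages
```

With a small limit, `split_report("Articles", ["c", "a", "b"], max_links=2)` gives two pages, "Articles (1)" and "Articles (2)".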

use_sandbox[edit]

Not done
  • |use_sandbox=true
Proposal withdrawn, not needed separately. See below. -DePiep (talk) 10:04, 9 November 2013 (UTC)
Options: true, false. (default: false)
To have the bot write to /sandbox subpages (testing purpose)
Interacts with: target (output) filenames (#The "Page reports" pages)
When set to "true" in the Request template, the bot writes output to /sandbox subpages like Wikipedia:WikiProject Medicine/List of pages/Articles/sandbox.
Useful for bot testing and for user request testing (the user can request a new report type, but see it tested first by setting this parameter). Might be useful from today to prevent our existing WP:MED pages from being disturbed by test results. -DePiep (talk) 08:54, 7 November 2013 (UTC)
  • The bot would normally write the results of a report on the page where the template was added. This would enable projects to define and create whatever reports they want. I see this tool having quite a bit of functionality and perhaps going beyond your limited RELC idea. My thought would be to create a template, perhaps {{WikiProject report}}, where parameters can be set to get a list of all files included in their articles, all articles that are between B and G alphabetically, all articles, all templates. Basically a method for generating reports that wiki projects want. Werieth (talk) 17:28, 7 November 2013 (UTC)
Where the page should be written: more later (I have different thoughts). And yes, a template for the user that shows all available Reports (e.g., in a wikiproject) is my target #1!!! That template is easiest when it can use the bot meta-report, via Lua. We are heading that direction. WT:MED already has such a template (a very primitive one). -DePiep (talk) 10:04, 9 November 2013 (UTC)
Withdrawing a separate "use_sandbox" switch: not needed. The user can specify a "subtopic" in the Request, and so create a different set of pages to track (a bit along the lines of the suggested #Project parameter). Such a "subtopic" can easily be named "sandbox" or "Research references" or "What a nice Sunday", so playing in the sandbox does not need a separate switch. -DePiep (talk) 10:04, 9 November 2013 (UTC)

Questions[edit]

Werieth and DePiep, I'm having trouble understanding the above. What pages are you including in this tool's scope - pages with {{WPMED}} on the talk page? Also, above I asked if the list of pages in scope could be updated (with pages that were recently added to the project) frequently - because it would be good for new medical articles to be included in this tool's reports as soon as possible after they are tagged. Yet you seem to have settled on updating the list every fourteen days. If I've understood that correctly, can you explain why you've opted for such a long delay between updating the list with newly-tagged pages, please? --Anthonyhcole (talk · contribs · email) 17:29, 7 November 2013 (UTC)

I haven't set any report timeframes, except for perhaps a minimum of 24 hours per update. For the most part the update frequency can be set via the template. Werieth (talk) 17:35, 7 November 2013 (UTC)
I'm not sure what that means. Sorry. And what are you using to define your source list, is it {{WPMED}}? --Anthonyhcole (talk · contribs · email) 17:43, 7 November 2013 (UTC)
Right now I am only planning on having the bot run once per day. This would mean that the most often that a report could be updated would be once per day. However, each report can pick how often it should be updated. Each report will be able to define its own source see #Data sources above. Werieth (talk) 17:48, 7 November 2013 (UTC)
(edit conflict) 14 days was just an example. 24h seems reasonable, if the bot can handle that. Be aware that it would be an overwrite every time, because the bot cannot detect beforehand if there are changes at all (no "pages added" trigger). Or is there a counter, Werieth?
The bot picks up all pages that have the {{WPMED}} template. These are talkpages, so the pagename is changed into its twin page name (like article page name). That is a basic list. Which namespaces are kept in the list is an option for the project editor (you). Also whether their talkpages should be added to follow. This is done in the Request template, put somewhere on a Project page. (Currently for WP:MED: only article namespace is in there (28.000), and not their talkpages). Once the bot is alive, you can tweak these settings in your project's Request, to get a useful Changes overview. -DePiep (talk) 17:54, 7 November 2013 (UTC)
What DePiep said, and once done you can have the bot creating as many reports customized in just about any way you want. All you would need to do is create a subpage in your project, fill out the report request template, and wait. Werieth (talk) 18:04, 7 November 2013 (UTC)
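The steps described above (take the tagged talk pages, switch each to its twin subject page, keep only the requested namespaces, optionally add the talk pages back) can be sketched roughly like this. It is an illustration in Python, not the bot's code: pages are represented as hypothetical `(namespace number, page name)` pairs, and the only MediaWiki fact relied on is that a talk namespace number is its subject namespace number plus one.

```python
from typing import Iterable, List, Tuple

Page = Tuple[int, str]  # (namespace number, page name without namespace prefix)


def build_basic_list(
    tagged_talk_pages: Iterable[Page],
    namespaces: Iterable[int],
    add_talkpages: bool = False,
) -> List[Page]:
    """Turn pages carrying the project template into the basic list."""
    wanted = set(namespaces)
    result: List[Page] = []
    for ns, name in tagged_talk_pages:
        # Talk namespaces are odd; the twin subject namespace is one lower.
        subject_ns = ns - 1 if ns % 2 == 1 else ns
        if subject_ns not in wanted:
            continue
        result.append((subject_ns, name))
        if add_talkpages:
            result.append((subject_ns + 1, name))
    return result
```

For example, with sample input `[(1, "Aspirin"), (11, "Infobox drug")]` and `namespaces=[0]`, only `(0, "Aspirin")` is kept, which matches the current articles-only setup for WP:MED.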

So Special:RecentChangesLinked/Wikipedia:WikiProject Medicine/List of pages/Articles reports changes to articles whose talk pages have {{WPMED}} (because of |template =WikiProject Medicine). And if |category = or |prefix = were filled, it would also report changes to pages in that category or with that prefix. And we can set the frequency for list updates to whatever we think meets our needs. Perfect. --Anthonyhcole (talk · contribs · email) 18:09, 7 November 2013 (UTC)

One minor tweak I will make to your comment (it's 99% correct): the three parameters defined in #Data sources cannot be combined. You can pick one of the three as a source. Werieth (talk) 18:18, 7 November 2013 (UTC)
Gotcha. Thanks for the clarification - and for everything you guys are doing here. --Anthonyhcole (talk · contribs · email) 21:02, 7 November 2013 (UTC)
And, as I understand it, you can add a second request for WP:MED that would list that category in a separate page (a separate list to follow). That should give two links to click. -DePiep (talk) 21:16, 7 November 2013 (UTC)

The "Page reports" pages[edit]

The lists are to be written to well-defined pages. A first suggestion for page naming (the rule numbering allows for changes and additions):

  • (1) The requesting top page.
Rule 1.10: the top page of the RELC request, irrespective of the place of the "Page reports request" template (which can be a subpage).
Example: suppose for WP:MED the "Page reports request" template is at Wikipedia:WikiProject Medicine/Project management. That is legal. The "Requesting top page" then is defined as being Wikipedia:WikiProject Medicine.
Wikipedia:WikiProject Medicine
  • (2) The RELC grouping page.
Rule 2.10: This page is a subpage, always named "/List of pages"
Wikipedia:WikiProject Medicine/List of pages
  • (3) The Page reports page. Contains the actual list to be used.
Names by ns:
Rule 3.10: If all namespaces are in, it is: "All".
Rule 3.20: Name of the non-talk namespace, turned plural. Mainspace (blank name) is named "Article", so the page is named "Articles".
Rule 3.30: Multiple names are separated by ", " (comma space), list separator:
"Articles, Wikipedias, Templates". (Wikipedias?? better use "Projects"?)
Rule 3.40: If talkpages are added, each ns is paired with "+Talks". So "Articles+Talks, Templates+Talks".
Rule 3.50: If only talkpages are requested (ns is 1,3,5,...), the name is "Talks, User Talks, Wikipedia Talks".
Not ns names:
  • (4) Rule 4.10: Todo: if the list is generated through a non-default principle (not the project template), the name is ...
  • (5) Rule 5.10: Todo: If the list is split (filtered other than by ns), the page has a bracketed disambiguation term:
"Articles (A–M)" or "Articles+Talks (Class-A pages)" (?)

-DePiep (talk) 22:14, 5 November 2013 (UTC)
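As a rough illustration of rules 1.10, 2.10 and 3.10 to 3.50 above, here is a sketch in Python. It follows this first suggestion only ("/List of pages", subject space), which is still under discussion. The function names and the plural tables are assumptions; only a few namespaces are filled in, and "Wikipedias" is used as written in rule 3.30 even though "Projects" was floated as an alternative.

```python
from typing import Iterable

# Assumed plural names; only a handful of namespaces are listed here.
SUBJECT_PLURALS = {0: "Articles", 2: "Users", 4: "Wikipedias",
                   10: "Templates", 14: "Categories"}
TALK_PLURALS = {1: "Talks", 3: "User Talks", 5: "Wikipedia Talks",
                11: "Template Talks", 15: "Category Talks"}


def report_page_name(namespaces: Iterable[int], add_talkpages: bool = False,
                     all_namespaces: bool = False) -> str:
    """Name the Page reports page from the requested namespaces."""
    if all_namespaces:
        return "All"                                   # rule 3.10
    parts = []
    for ns in sorted(namespaces):
        if ns % 2 == 1:
            parts.append(TALK_PLURALS[ns])             # rule 3.50
        elif add_talkpages:
            parts.append(SUBJECT_PLURALS[ns] + "+Talks")  # rule 3.40
        else:
            parts.append(SUBJECT_PLURALS[ns])          # rule 3.20
    return ", ".join(parts)                            # rule 3.30


def report_page_title(request_page: str, namespaces: Iterable[int],
                      add_talkpages: bool = False,
                      all_namespaces: bool = False) -> str:
    """Full title: top page (1.10) + grouping page (2.10) + report name (3.x)."""
    top_page = request_page.split("/", 1)[0]
    name = report_page_name(namespaces, add_talkpages, all_namespaces)
    return f"{top_page}/List of pages/{name}"
```

A Request placed at Wikipedia:WikiProject Medicine/Project management with `namespaces=[0]` would then map to Wikipedia:WikiProject Medicine/List of pages/Articles.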

Commenting myself:
re (1) I think we should switch to the talkpage of the top page, always. So it would be Wikipedia talk:WikiProject Medicine. Then it will also work in Article/talk space (for whatever reason we'd want to work there) and Module/talk space.
re (2) I think the grouping page should be plural, always: /Lists of pages. So it would be Wikipedia talk:WikiProject Medicine/Lists of pages.
If we agree, we better change this early in our taskforce. Well, we need to define our page names early anyway. Especially the default ones (rules 1-2-3, and 5). -DePiep (talk) 22:21, 5 November 2013 (UTC)
  • Suggestion: add option "/sandbox" to pagename(s):
The bot can work in /sandbox mode (|). In this case, the grouping page / Page reports page(s) / All (?, tbd) have the suffix "/sandbox":
Wikipedia:WikiProject Medicine/List of pages/sandbox
and/or
Wikipedia:WikiProject Medicine/List of pages/Articles/sandbox -DePiep (talk) 08:42, 7 November 2013 (UTC)
I see where you are heading. All proposed parameters are to be postponed for version 2.0 (as you said: first build a stable basically functioning core).
More important is the bot feedback (=meta-report, or job report). The job is not finished when the data page is written. To handle these data pages, in-wiki templates must be able to know the page names, content, and other facts (see #The meta-report). This is crucial for the first version. -DePiep (talk) 21:34, 7 November 2013 (UTC)
See #Project Parameters; given those two parameters, creating a meta report is trivial. Werieth (talk) 21:36, 7 November 2013 (UTC)
It is essential, because we must have a wikiside handle to get these reports (page name, content description, etc.) in a link template and more. Whether it is trivial as in "easy to code", that is fine. If trivial means "not that relevant", you are wrong. One cannot just write a page and have the user find out how to find it and use it. To be clear: it is a bot's job, because from within the wiki we cannot read the Request template. -DePiep (talk) 21:59, 7 November 2013 (UTC)
  • Simple base: 1. All pages are written in a talkspace, not a content space. 2. Page names are systematic, for the taskforce's overview (not random); they may be dynamic though. 3. Pages are dedicated by the bot/project (that is, owned for writing; manual edits can be overwritten). 4. Pages are used through a template, not directly by the user.
Note: a decision that we only write in talkpage spaces (1.) is to be made asap. It is used all over the project. -DePiep (talk) 22:44, 7 November 2013 (UTC)
(edit conflict) I was using the term trivial in a manner referring to the fact that each report A: is tied to a project, and B: has a description. Given those two factors, combining them into a single source (template, wikiproject page, or Lua module) is easy. Werieth (talk) 22:53, 7 November 2013 (UTC)
These reports should be in the Wikipedia namespace, i.e. ns 4, which is neither a talk nor a content namespace. The bot really doesn't care what the report page titles are; it would be able to work with whatever. If a project wants to use a predefined structure, they can, or if a project wants to expand/shrink the reporting to fit their needs the bot would be able to do that too. Werieth (talk) 22:53, 7 November 2013 (UTC)
No, the bot should not write pages in Subject space ("subject space" is the opposite of talkspace; see {{SUBJECTSPACE}} vs {{TALKSPACE}} magic words -- "Subject" is the right word for this, I just discovered it). I do not want to (first) discuss with you the essence of a subject page, (second) discuss all exceptions that oppose writing there, (third) write documentations or warnings for these exceptions, (fourth) spend the rest of my stay at wiki explaining four times a month to a user of our Page reports why his Request went wrong. Plain simple: the bot only writes Reports in Talkspace. -DePiep (talk) 15:53, 11 November 2013 (UTC)
This is where I am going to disagree and set my foot down. Reports can be generated just about anywhere. If a user wants to get a report (just for them), they can throw a request template up in their userspace. Project space vs project talk space is also an issue. If a report is created and the members of the project want to discuss changes to the report, they should be able to use the talk page of the report to discuss it (that's why we have them in the first place), or if the project needs to coordinate work on the results of a list, the talk page is the best place to do that. A user shouldn't have to find a wikiproject and pigeonhole their request as part of the project when in fact it may not be relevant to the project. Werieth (talk) 15:59, 11 November 2013 (UTC)

Expanding too fast[edit]

I think we need to step back and re-focus this discussion/development. Right now I see at least three or four different ideas being developed in parallel. My thought would be to create a single template that can be added to a page in order for a single report to be generated. (See my template suggestion above). Our primary focus should be designing that template to create the different types of reports needed. Once we have the types of reports and formats for the reports developed, we can go one step further and develop a meta template for that, but let's take things one step at a time and start with a good foundation. Once we have an established foundation, we can then develop a structure for listing and managing the different reports. Werieth (talk) 00:21, 6 November 2013 (UTC)

You are right, multiple ideas run parallel and appear chaotic now: they are not yet fleshed out. There is some chaff, some wheat. That happens when a brainstorm is triggered. Partly this is because I am also taking ideas from the "user experience". What would a user expect (or need), and therefrom what options are needed? These are just project editors or interested editors and if we want to serve them, we'll have to prepare for their different mindsets and so for flexibility.
That said, I understand you want to build a solid working base first. Also, in the development it is up to you as the bot builder to make a sequence in the elements you build (e.g., write to a fixed pagename first, in the development period). Still, to make it a useful bot-utility for all wikiprojects (even all of wiki), we must be able to discuss options and grand setup (possibly to be filled later, but the scheme should be there).
Some questions: Can you describe what you mean by "type of reports"? Is that type defined by the requested ns list only, or more? And how do you mean to create "a structure for listing and managing reports" later? You don't need a pagename to write to? -DePiep (talk) 06:03, 7 November 2013 (UTC)
I agree with the reduction in topics you made (as seen in your Request template parameters). I separated the other ideas to "(proposed)", to be more out of sight. -DePiep (talk) 12:29, 7 November 2013 (UTC)
  • Werieth, there are some major decisions to be made first, to make the development of this taskforce go ahead. Since they can determine the bot internals, they should be made as soon as possible (otherwise we are losing freedom to choose). These topics are:

-DePiep (talk) 08:13, 8 November 2013 (UTC)

The Meta-Report[edit]

A meta-report is a report written by the bot about a bot run. A simple example is a note like this:

"This page was created on 4 November 2013, 12:17 (UTC) by User:ExampleBot".
  • I propose a meta-report in this form:
  1. The bot writes a meta-report on the grouping page (see update below), that is the parent page of all report subpages (the pages with the links, datapages). The reason is that the data page is not intended for readers to open. It even may be large and slow. (When used as a Special page, this page size does not matter). Also, a meta-report on that grouping page can describe multiple subpages (e.g. when page splits were done). It is a one-page overview for the reader.
  2. The bot fills in a well defined, dedicated template, and writes it on that page.
  3. Using unnamed parameters could be easier to program for a bot. Any preference, botmaster?
  4. The template can have other useful content, such as links to helppages. That can be added in template space.
  5. To get a taste, I created Template:Page reports/Meta-report1. (the "1" is added because we may need other meta-report templates). -DePiep (talk) 10:36, 7 November 2013 (UTC)
6. Update: The bot better not write on the grouping page. Others may want to customise that one. It could use a dedicated subpage where the bot is in control and can overwrite brutally everything. Suggest to write bot meta-reports on a subpage, like:
Wikipedia:WikiProject Medicine/List of pages/Page reports meta-report -DePiep (talk) 12:22, 7 November 2013 (UTC)
Werieth, about listing the page(s) in a template (for the user ready to click). Sure the bot can write a meta-report (in a template). But that would be cumbersome and inflexible; we might want to use that same info in another form or function (that would require another template to write for the bot). This is even worse when we use dynamic pagenames (e.g., content-describing as "Articles" or "Pages with prefix x", as a pagename). Also a template cannot read the Request template.
Would it be possible that the bot writes one meta-report to a Lua data page? Lua data is very structured, so it would need some precise formatting, but the reward is flexibility. (Don't know if you are familiar with Lua. So at the risk of underestimating you: various Lua routines can use the data as needed, to present any regular template. Data example: Module:RailGauge/data. Code that uses the data: Module:RailGauge. The data page could be called like Module:Page reports/data/Wikipedia:WikiProject Medicine. A template can call the code module to get the data. Very versatile).
If you think this is a good idea, we could ask a Lua programmer to design that data page. -DePiep (talk) 17:29, 7 November 2013 (UTC)
So, do you want to write meta-reports in Lua data code or template fillings? Not in plain text right? -DePiep (talk) 22:29, 7 November 2013 (UTC)
I can pretty much do it in any of those. Templates would probably be the easiest. Werieth (talk) 22:57, 7 November 2013 (UTC)
Then let's state that it is Lua data, because that is very flexible in usage. Also, it would only need a single meta-report per bot job (the templates can choose their data as useful). -DePiep (talk) 08:16, 8 November 2013 (UTC)
I've asked Stradivarius to come over for the Lua data structure. -DePiep (talk) 09:21, 8 November 2013 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Hi. :) First off, let's see if I'm understanding you correctly. From the above, it seems like you are interested in using Lua for the template that generates the meta-report, not for generating the actual lists of links. Have I got that right? — Mr. Stradivarius ♪ talk ♪ 10:15, 8 November 2013 (UTC)

Yes. The bot writes wikipages with links, and writes job reports (meta-reports).
Without the job report, we can not use these pagenames in templates at all -- unless they are hardcoded :-( . Also other information to be presented to a user (such as report page content, last update, source used), is needed.
Writing this old-style template would need to hardcode those template parameters in the bot, which is inflexible. Also we want to be able to create multiple templates. -DePiep (talk) 10:28, 8 November 2013 (UTC)
In that case, it would probably be easiest to write the meta-report as a template. We can then grab the template parameters from Lua and output the report dynamically. There's no need for the bot to go to the extra effort of generating a Lua table for this, for two reasons. The first is that Lua coders on Wikipedia deal with template parameters like this all the time - it's the number one use-case scenario. The second is that if you generate a Lua table you will have to put it in the module namespace, whereas with a template you can put it anywhere. There's one thing I'm missing, though. How will the bot know what pages to generate? Will it read the meta-report template, and then generate the different subpages from that? Or will the data to put on the subpages be defined somewhere else, and the bot reads those subpages and generates a meta-report template invocation by itself? — Mr. Stradivarius ♪ talk ♪ 10:42, 8 November 2013 (UTC)
It would populate the list via the individual request templates. These would be grouped via the project parameter. Werieth (talk) 10:51, 8 November 2013 (UTC)
re S: A user can put #The Request template on almost any wikipage, say the WT:MED. Parameters are set (a bit like the Archive bot template). The bot sees this template, and runs the job, writes the list pages, and writes the meta-report. Templates connect to these pages and give users smart links (such as: open special page Related changes for a page). The meta-report is written afterwards, and is not the report definition (input). -DePiep (talk) 11:19, 8 November 2013 (UTC)
Strad, I don't get it. You mean if there is a template in wikicode (not lua code) a module can read that page, keeping the structure? And don't you actually mean: the bot should write a template call (from a transclusion, with the parameters values set by the bot)? Can lua read data (parameter values) from a template call like User:DePiep/sandbox? -DePiep (talk) 11:31, 8 November 2013 (UTC)

Lua module[edit]

(edit conflict) The meta-report is to be written into a Lua data page. The structure and workings are discussed here. Since we have no project name yet, I assume "Page reports" for now.

The module can contain various functions to produce output. Main feature is that all project metadata will be made available. The bot should not need to write metadata elsewhere (unless that place is not served by the module, say metadata on a Report page itself).

  • Bot job request_id: Above in #request_id I have proposed to generate a unique id for every Request (=the Request template put on a page). It is unchanged over time. For now, let's assume that id looks like id0000123.
  • Bot job topic. The Request template is placed on any (talk) page. Proposed, the topic is the FULLPAGENAME of the top page, non-talk. So all Requests in say a WP project share the topic.
These are the two identifiers I need for now.
  • The actual job data is (all pagenames are FULLPAGENAME, unless stated otherwise):

Once per job:

1. Request id.
2. Topic (see above)
3. Request page (page on which Request is sitting)
4. Return all Request parameter settings by parameter name
5. Relevant bot hardcoded settings (version, bot name, ...)
Any messages (error or other)

Zero, one or multiple per job:

6. All affected (created, edited, ...) Report page names (the lists)
7. All other affected Page reports pages (overview pages, ...)
6, 7: per page: size, update time, number of pages listed?, content description, ...
Examples: Wikipedia:WikiProject Medicine, Template:Convert (the bot Request should work in all ns alike, barring prohibited places like mainspace and module space).

These are our elements. We can set up the data pages like this:

  • Parent data page : Module:Page reports/data/.

We use subpage /data/ to keep job reports separate from other data pages the module might need.

  • Data page per topic:
Module:Page reports/data/Wikipedia:WikiProject Medicine.
  • Meta-reports:
--identifier
request_id=0000123
--data block:
--validator
    request page=
    is new request=
    error message about id=
--meta-report (single request job run) 
    bot name=
    botversion=
    time=
    message=
    ...
--per page
        report page1=
            --data block this page
        report page2=
            --data block this page
        overview page type A=
            --data block this page

--end of job report request_id=0000123

--second job report in this topic:
--identifier
request_id=00876543
--data block:

--end of job report request_id=00876543

So this way, all topic jobs are together. When a Request is rerun (the regular update), the specific block (per request_id) should be replaced. (Easy enough for the bot?). Maybe the bot would like underscores for this output too.
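On the "easy enough for the bot?" question: replacing one block per request_id is straightforward if the block markers are written exactly as in the layout above. A minimal sketch in Python, assuming each job report starts with the two lines `--identifier` and `request_id=<id>` and ends with the line `--end of job report request_id=<id>`; the helper name `replace_job_block` is hypothetical.

```python
import re


def replace_job_block(page_text: str, request_id: str, new_block: str) -> str:
    """Replace the job report for request_id, or append it if not present.

    Other job reports on the same topic data page are left untouched.
    """
    rid = re.escape(request_id)
    pattern = re.compile(
        rf"^--identifier\nrequest_id={rid}\n"                 # start of this block
        rf".*?^--end of job report request_id={rid}[ \t]*$\n?",  # its end marker
        re.DOTALL | re.MULTILINE,
    )
    block = new_block if new_block.endswith("\n") else new_block + "\n"
    if pattern.search(page_text):
        # A function as replacement avoids backslash interpretation in the block.
        return pattern.sub(lambda _match: block, page_text, count=1)
    separator = "" if page_text == "" or page_text.endswith("\n") else "\n"
    return page_text + separator + block
```

Because the start marker requires the full id followed by a line break, a request with id 0000123 does not accidentally match a block for 00001234.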

Another solution could be to name the datapage by the id.

Module:Page reports/data/id0000123.
Would make it hard to group jobs within a topic (or even on the same page).
  • errors by id: the bot can mark errors in identification (id-Request_page do not match any more, request_id is unknown or malformed). The bot does not have to solve that per se. Once in the meta-report, possibly the module can check for and report mal-identified pages. -DePiep (talk) 11:11, 8 November 2013 (UTC)
    Let's take a more concrete example: write up a mock Lua meta report for User:Werieth/sandbox, assuming it's the only result for that project. Werieth (talk) 11:56, 8 November 2013 (UTC)
Good idea. You think of introducing a request_id? -DePiep (talk) 12:35, 8 November 2013 (UTC)
It's going to be done, I just need to think of a good way of doing it that is sane, scalable, and would work well. Werieth (talk) 13:29, 8 November 2013 (UTC)
Werieth, I don't know if you are waiting for this, but I prepared Template:Page reports/Meta-report1 for the parameters. Filling a template (not Lua) would then mean writing these parameters and their values in some page. Should it be needed in Lua code, the values are there (the format will differ; Lua uses all the brackets in that case).

Example[edit]

Manually, I'd think your sandbox example would yield this:

Page reports: job report
bot
botname User:Werieth
botversion
bot job result
update 11:42, 8 November 2013
botmessage
the request (in)
request_page User:Werieth
topic User:Werieth
subtopic [[]]
template Template:WikiProject Medicine
in_category Category:All articles lacking sources
prefix
namespaces all
redirects
add_talkpages true
report pages (out)
page1 User:Werieth/sandbox
page1_size 63409
page1_links 345
 
{{Page reports/Meta-report1
<!-- bot -->
| botname =User:Werieth
| botversion =
<!-- result -->
| update = 11:42, 8 November 2013
| botmessage =
<!-- request (echo) -->
| request_page =User:Werieth/sandbox
| topic =User:Werieth
| template = Template:WikiProject Medicine
| in_category = Category:All articles lacking sources
| prefix =
| namespaces =all
| redirects = 
| add_talkpages =true
<!-- output pages -->
| page1 =User:Werieth/sandbox
| page1_size = 63409
| page1_links = 345
}}
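For the template route, the bot side could be as small as the following sketch (Python). The function name `render_meta_report` is hypothetical; the parameter names are simply whatever is passed in, and the default template name is the Meta-report1 page mentioned above.

```python
from typing import Mapping


def render_meta_report(
    params: Mapping[str, str],
    template: str = "Page reports/Meta-report1",
) -> str:
    """Write a template call with one '| name = value' line per parameter.

    Empty values are kept, so every parameter is echoed.
    Note: values are not escaped; a value containing '|' or '}}' would
    break the call and needs handling in a real bot.
    """
    lines = ["{{" + template]
    for name, value in params.items():
        # rstrip() drops the trailing space left by an empty value.
        lines.append(f"| {name} = {value}".rstrip())
    lines.append("}}")
    return "\n".join(lines)
```

Calling it with `{"botname": "User:Werieth", "botversion": ""}` yields a three-parameter-line call in the same shape as the manual example.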

A few points: |page1_links=345, not sure where that value comes from. There are a total of 1932 links on that page and 966 different articles listed. Unused params should be hidden and only displayed when they have a non-default value. Werieth (talk) 15:17, 8 November 2013 (UTC)

Yes, the number was random, just to show something. More questions: since we have the pagename, we can get size from {{PAGESIZE:pagename}} already. Up to you if you want to have it written by the bot. OTOH, I'm not sure if you have the # of pages (links). If it's easy, please add them (says more to a user).
Another thing: maybe in request input you won't require ns for a "template=", logically sound. Still, adding it in this output would be useful.
Don't worry about the appearance, this is not the table for the users; just technical background info. Bot's task is only to write it down. The show table is just a showcase for us, for the eyes. If I understand Stradivarius, the Lua module can read that data for further processing.
I think you might want to improve the parameter list, completing it with all params you use. And it better be written on a different page, not the data page (dedicated page). -DePiep (talk) 17:16, 8 November 2013 (UTC)
I repeat that a wikicode template probably will not work for Lua (so we still cannot read the values to use in user templates). That would mean you need to produce a Lua datapage output instead of this template one. Now that you know this, you can choose as you like (two-steps via wikicode, or directly in lua code). -DePiep (talk) 18:44, 8 November 2013 (UTC)
Like I have said, I haven't started any code yet, but it shouldn't take more than an hour or so once we have the format defined for both the request templates and output formats. Whatever the formats are going to be, we should finalize those. I've already thrown my thoughts into the mix, but the sky is the limit for where we can take this. If you want to set up a preliminary format, or you want to wait, it doesn't make too much of a difference for me. Werieth (talk) 23:08, 8 November 2013 (UTC)
There is only one Request template ({{Page reports/Request}} now, proposed), and the content list of parameters is up to you to formalise. If I understand the process (as Archive template works), the template code can be empty, because the bot reads directly from the page where the template is used (where params are filled). Then once the params are given, we can make the Request template to show these to the reader on the page (as a feedback; the bot won't be bothered). The bot does not need to know about that formatting.
All output templates will be based upon the single meta-report. Once we can read that, we can create & improve templates for the users within wiki. These are not formatted or built by the bot; that would be way too inflexible. Even more: the user will reach all Report pages through a template we provide; no user will have to click or browse through page structures manually.
To finalize the parameter sets, I want to ask you to document (list) all Request input params you use now; that would be in Template:Page reports/Request/doc. Also I'd ask you to make the output params list complete as you think good for the meta-template (above, in the example). -DePiep (talk) 00:41, 9 November 2013 (UTC)
I'll try to provide you with the Lua format asap. Please list the params. -DePiep (talk) 00:51, 9 November 2013 (UTC)
I've gone ahead and listed the parameters that I think are no-brainers for inclusion, and tagged as not done those that probably shouldn't be done. If you want to add more params feel free to do that. Werieth (talk) 13:04, 9 November 2013 (UTC)
Thanks, will be into it tomorrow. When the bot accepts repeated params (like |in_category=), can you output them with a numbering? Adding like "_1" at the end would be good (regex). -DePiep (talk) 14:03, 9 November 2013 (UTC)
If this is being done in Lua, Lua already supports list objects, which is what this would be defined as. Werieth (talk) 14:06, 9 November 2013 (UTC)
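The two options discussed here (a list per repeated parameter, or a "_1"-style numbered suffix) can be compared in a minimal Python sketch. It assumes the bot reads the raw wikitext of the Request template; the function names, the `MULTI_VALUE` set, the "last value wins" rule for unique parameters and the category names in the usage note are all hypothetical and not taken from the bot.

```python
import re
from collections import defaultdict

# Assumption: only these parameters may be repeated in a Request.
MULTI_VALUE = {"in_category", "not_in_category"}


def parse_request_params(wikitext: str) -> dict:
    """Collect '|name=value' pairs from the raw body of a Request template."""
    found = defaultdict(list)
    for name, value in re.findall(r"\|\s*([\w ]+?)\s*=[ \t]*([^|}\n]*)", wikitext):
        value = value.strip()
        if value:  # blank parameters are treated as "no input"
            found[name].append(value)
    params = {}
    for name, values in found.items():
        # Repeatable parameters keep every value as a list;
        # for all others the last value wins (an assumption).
        params[name] = values if name in MULTI_VALUE else values[-1]
    return params


def with_numbered_suffix(params: dict) -> dict:
    """Flatten list values into name_1, name_2, ... keys."""
    flat = {}
    for name, value in params.items():
        if isinstance(value, list):
            for index, item in enumerate(value, start=1):
                flat[f"{name}_{index}"] = item
        else:
            flat[name] = value
    return flat
```

For a Request containing `| in_category = Stub articles` and `| in_category = Living people`, the first function would give a two-item list under `in_category`, and the second would turn that into `in_category_1` and `in_category_2`.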
You are pushing me into a pro Lua programmer :-). Challenge taken, will come in useful for all these multiples. Now this may take more than one day, I cannot promise speedy delivery. Another point: when the bot has to update an existing meta-report (after a new run), that will happen in a larger page. I understand you can take out the old meta-report (page part) and replace it, right? -DePiep (talk) 16:51, 9 November 2013 (UTC)
So |in_category=, |not_in_category= can be repeated. Which else? Source generators |category=, |prefix= maybe? And the logic with repeated param names is or, right? Always? Note: Another situation of possibly large Report page. There better be an absolute size limit in the bot, say 1MB (or a number-of-pages listed, say 50k). The user can then refine the request. -DePiep (talk) 12:08, 11 November 2013 (UTC)
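The size guard suggested here could look roughly like the following Python sketch. The two limits are the example figures from the comment above (1 MB, 50k pages); the function name and the wording of the messages are hypothetical.

```python
from typing import Optional

# Example limits from the discussion; not settled values.
MAX_REPORT_BYTES = 1_000_000
MAX_REPORT_PAGES = 50_000


def check_report_limits(titles: list, report_text: str) -> Optional[str]:
    """Return a reason to refuse the report, or None if it fits the limits."""
    if len(titles) > MAX_REPORT_PAGES:
        return f"too many pages listed ({len(titles)} > {MAX_REPORT_PAGES})"
    size = len(report_text.encode("utf-8"))  # page size is counted in bytes
    if size > MAX_REPORT_BYTES:
        return f"report too large ({size} bytes > {MAX_REPORT_BYTES})"
    return None
```

A non-None result would be the point where the bot reports back that the Request needs to be refined instead of saving the page.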
For the start let's keep the source params unique. Right now there are three params that fall into the multi-value category: in_category, not_in_category, and topic. I'll post a preliminary example of the structured data for the meta template shortly. Werieth (talk) 14:01, 11 November 2013 (UTC)
No, "topic" is unique. I hope you mean "add_topic" (the user can add Reports from another WikiProject, and more of these). (Btw, |add_topic= is a pass-through and will only be used by the user-access template.) The three source-params unique: OK. -DePiep (talk) 16:00, 11 November 2013 (UTC)
Oops, these three all can be multiple: |add_topic=, |add_subtopic=, |add_page=. All non-bot params. Should I write more about this "access template"? -DePiep (talk) 16:03, 11 November 2013 (UTC)
Will take a look later on. As I said, I am working on this already in Lua data (and working on keeping this page clean & organised). An issue we need to discuss is identification. That also influences parameter names & meanings. -DePiep (talk) 16:27, 11 November 2013 (UTC)
Indeed looks like an easy and stable production for you, this will work. Thanks for the demo. Some first impression points (kept json punctuation for now):
1. The module should not have to interpret "defaults"; they are defined in the bot so should be reported by the bot (don't want to start defining default values in two places). So I need like "title_before": "", not "title_before": <its default>.
2. Lua doesn't do "null"; that is blank "". See next point.
3. Params with a blank value (empty, null or omitted) can be left out altogether. So we state: parameter "title_before" omitted equals "title_before": "" equals "title_before": null. I'd like it very much if you could leave those out. Are there situations where "blank" is a meaningful option? (Yes, |namespaces=; better put all three list sources in then always? Or report exactly the one that is used?) Still this blank/omitted parameter does not mean some default value; it means no input.
4. All pagenames are FULLPAGENAME, so "template": "WikiProject Medicine" should be "template": "template:WikiProject Medicine". (It is a bot interpretation that it is a template:, but others are not supposed to know or assume that assumption. Self-documenting).
5. Missing: Request pagename, Report pagename (parameter proposals below, from the PRP identifiers, might add some more params). -DePiep (talk) 19:54, 11 November 2013 (UTC)
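Points 2 to 4 above can be illustrated with a small Python sketch that cleans a JSON meta-report before it is turned into Lua data. It is only an illustration under assumptions: the function name is hypothetical, the key names follow the examples in the list, and the check for an existing namespace prefix is deliberately naive. It also treats every blank value alike, so the open question about a meaningful blank |namespaces= is not addressed.

```python
import json


def normalize_meta_report(raw_json: str) -> dict:
    """Drop blank values and make the template name a full page name."""
    data = json.loads(raw_json)
    cleaned = {}
    for key, value in data.items():
        # Points 2 and 3: null, "" and omitted all mean "no input",
        # so blank entries are simply left out.
        if value is None or value == "" or value == []:
            continue
        cleaned[key] = value
    # Point 4: report a FULLPAGENAME rather than a bare template name.
    # Naive check: any colon is taken to mean a namespace is present.
    template = cleaned.get("template")
    if template and ":" not in template:
        cleaned["template"] = "Template:" + template
    return cleaned
```

With `"title_before": null` and `"title_after": ""` in the input, neither key appears in the result, and `"template": "WikiProject Medicine"` comes out as `"Template:WikiProject Medicine"`.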

The Access template[edit]

The Access template is a template for the user that gives access (links) and overviews to the various PRP pages that exist in the topic (say, in WP:MED). It gives links to: the Request, the Report (page), an RC (RELC) s.p. link, and its status. More so, it will list all such pages within the topic.

Page reports (demo Access template)
Topic: Wikiproject Medicine
Topic comment: First trials we do in Medicine
Subtopic: All project pages
Subtopic comment: Just to get a feel
Request | Report | RC | Updated
Wikipedia Talk:WikiProject Medicine | Wikipedia:WikiProject Medicine/List of pages/Articles (800 kB!) | RELC | 5 Nov 2013 12:44 (UTC)
Wikipedia Talk:WikiProject Medicine/Another request | Wikipedia:WikiProject Medicine/List of pages/Non-articles | RELC | 7 Nov 2013 10:22 (UTC)

Simplified. All data will be pulled from the Lua module (who reads the meta-reports). No labels used. Mock up data. -DePiep (talk) 16:21, 11 November 2013 (UTC)

The PRP structure (identifiers and definitions)[edit]

These are the main structures of this Page report project:

All page names are FULLPAGENAME unless stated otherwise.
  • Wikipedia:Page reports, WP:PRP: this mini-project, a collaboration of a bot and templates. In the future: Central page. Also for users (documentation, help, ...). It is not a "WikiProject".
  • Request: {{Page reports/Request}} template put on a page by a user and params filled.
Request page: the page where the filled Request template sits.
  • Request_id: id within PRP for each individual Request. Not yet available.
For now, the Request page name is used as the Request_id. That means that there can be only one Request on a page.
  • Report (Report page): The payload product of PRP. A page created and written by the bot, consisting of a list of (linked) pages. One Request produces one single page. The page is updated (overwritten) by the bot every X days.
Report page name: tbd.
  • Access template: a {{Page reports}} (todo) template that gives the user access to & overview of all PRP pages & issues within a topic (say wikiproject). Will be provided (added to a page) by PRP.
  • Meta-report (or metareport?): Data written by the bot about the processing of a single Request (time, result, ...). Will be written to a Lua module data page. Also contains feedback of the original Request.
  • Topic: the top page name of the Request page, in Subject space (=non-talkspace). Within PRP, the topic naturally groups all Requests within a topic (say all within Wikipedia:WikiProject Medicine). Access template then can show these Requests+Reports together in one place.
  • Subtopic: A subtopic that the user can define in the Request. All Requests with the same Subtopic name are grouped separately. This allows the user to make a niche research topic within the main topic. For example, a user can Request two Reports to research subtopic "Reference issues".
  • Report header (or footer?) template: {{Page reports/Listpage header/Redirect}}:(todo; better name). Every Report page should get a template for reference (categorisation, explain what it is, useful link(s)). Could be light and simple?
This template is a redirect to prevent it showing up in the RC (RELC) list whenever the template is edited. The Redirect is more stable.
  • Ending and Deletion of Reports. tbd. Somehow abandoned requests and Reports should be deleted.
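The Topic definition in the list above (the top page name of the Request page, in subject space) can be sketched in Python as follows. The function name is hypothetical, and the namespace handling is a simplification: it only strips a trailing " talk" from the prefix and would misread a main-space title that itself contains a colon.

```python
def topic_of(request_page: str) -> str:
    """Return the top-level subject-space page for a Request page name."""
    namespace, sep, title = request_page.partition(":")
    if not sep:  # no prefix at all: a main-space page
        namespace, title = "", request_page
    root = title.split("/", 1)[0]  # drop any subpage part

    prefix = namespace.strip()
    if prefix.lower() == "talk":
        subject = ""  # "Talk:" belongs to main space
    elif prefix.lower().endswith(" talk"):
        subject = prefix[: -len(" talk")]
    else:
        subject = prefix
    return f"{subject}:{root}" if subject else root
```

Under these assumptions both "Wikipedia Talk:WikiProject Medicine" and "Wikipedia Talk:WikiProject Medicine/Another request" map to the topic "Wikipedia:WikiProject Medicine", which is how the Access template could group them.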


Werieth, I propose this set of structure definers. Some parameters are defined differently from your current usage (if you agree, they'd have to be changed a bit). This setup (or some setup) is needed for a good Lua data structure. For the structure, the other parameters are not identifiers.

One issue unresolved is the |request_id=. As long as we do not have that one, we can not handle two-Requests-on-one-page (maybe the bot can, but we cannot go back at them in the Access template for example). That would be very user-unfriendly. It is natural for a user to write multiple Requests on a single page, of course. We want this to be available on a regular Project talkpage, in several sections ("I'll ask for a PRP Report" they'll write, Yessss!). Can you do some research into this?

As you know, I am not yet happy with the names of the Report pages the bot creates. I'd want them in Talkspace always. And I think they better not be on the Request page itself (Requires the user to start a new page for a request. And is very unfriendly if the page is big, as the MED page is today. One does not want to edit a Request there). More on this after I have set up the Lua data format. -DePiep (talk) 17:43, 11 November 2013 (UTC)

In order to make things simpler, Report page and Request page should be the same. See example at User:Werieth/Request. I've been trying to think of a good way to manage request IDs and just haven't yet had any great ideas. Right now I'm leaning towards using page_id as the report id. Werieth (talk) 18:40, 11 November 2013 (UTC)
That is as simple as in "throw it all together into one bag". I do not support that (For example, it does not allow multiple Requests on a regular talkpage -- too weird). It is about basic page name composition, we should be able to handle that. And since the user can approach the pages through a link (we provide), we can give the page virtually any name. First thoughts:
<topic talkspace>/Page reports/<some-{{SUBPAGENAME}}-from-the-request>. Bonus: the page .../Page reports can contain the Access overview template. -DePiep (talk) 19:21, 11 November 2013 (UTC)
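The naming pattern floated above, `<topic talkspace>/Page reports/<SUBPAGENAME of the request>`, could be generated along these lines. This is a sketch of that proposal only, not of anything the bot does; both function names are hypothetical, and the namespace handling assumes the topic is given as a subject-space page name.

```python
def talk_page_of(subject_page: str) -> str:
    """Map a subject-space page name to its talk page name (simplified)."""
    namespace, sep, title = subject_page.partition(":")
    if not sep:  # main space has no prefix; its talk pages use "Talk:"
        return f"Talk:{subject_page}"
    return f"{namespace} talk:{title}"


def report_page_name(topic: str, request_page: str) -> str:
    """Compose '<topic talkspace>/Page reports/<SUBPAGENAME>'."""
    # SUBPAGENAME: the part after the last slash, without the namespace.
    title = request_page.partition(":")[2] or request_page
    subpage = title.rsplit("/", 1)[-1]
    return f"{talk_page_of(topic)}/Page reports/{subpage}"
```

For the topic "Wikipedia:WikiProject Medicine" and a Request on "Wikipedia talk:WikiProject Medicine/Another request", this would yield "Wikipedia talk:WikiProject Medicine/Page reports/Another request". Two Requests with the same subpage name would collide, which is the duplicate-name issue raised later in the thread.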
I think I need to make a few things clear: the report will go on whatever the page is using the request template (otherwise it gets real ugly real fast; I can think of over a dozen possible serious issues that it could cause). And it will only support 1 request per page. (The bot will retrieve the template and then overwrite the page with the template + results.) It's the most logical and sane way of doing it. Otherwise you are going to end up with a complete mess. Also I said I do NOT support the primary listing in the talk namespace. The report should be placed in the subject namespace. It will allow discussion and collaboration on the report if the talk page can be used for that. Werieth (talk) 19:30, 11 November 2013 (UTC)
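The "retrieve the template, then overwrite the page with template + results" step described above can be pictured with the following Python sketch. It is not the bot's code: the template name `Page reports/Request` is the proposed one from this page, the function names are hypothetical, and the brace matching is naive (it does not handle triple-brace parameters or nowiki sections).

```python
from typing import Optional


def extract_template(page_text: str, name: str = "Page reports/Request") -> Optional[str]:
    """Return the wikitext of the first '{{name ...}}' call, or None."""
    start = page_text.find("{{" + name)
    if start == -1:
        return None
    depth = 0
    i = start
    while i < len(page_text) - 1:
        pair = page_text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:  # matching close of the Request template
                return page_text[start:i]
        else:
            i += 1
    return None  # unbalanced braces


def rebuild_page(page_text: str, titles: list) -> str:
    """Keep only the Request template and replace everything else with the list."""
    template = extract_template(page_text)
    if template is None:
        raise ValueError("no Request template found on the page")
    listing = "\n".join(f"* [[:{title}]]" for title in titles)
    return f"{template}\n{listing}\n"
```

Because everything outside the template is discarded on each run, this approach only works with one Request per page, which is the constraint stated above.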
Well, that can not be done in main space, module space, and lots of serious template spaces; that is three exceptions I can think of already. You have not addressed the 800k page size issue (losing users), nor the example for a regular talkpage (winning users). And even in Wikipedia space, a subject space is not a talkspace. If you could mention the top issues with multiple-requests-per-page (apart from the id issue), maybe there a solution can be found. I can think of subpagename generation (can not be double indeed), but that is not a blocking obstruction.
And to be clear about basics: topics that are in play outside of the bot too, like pagenames, you are supposed to discuss not to dictate. You can raise it as a question. -DePiep (talk) 20:12, 11 November 2013 (UTC)
1) Why would there ever be a need for this in the talk/mainspace? A similar rule applies for module. I really only see this template being used correctly in the Wikipedia and user namespaces. Multiple requests per page become a nightmare to process/maintain/update. I am flat out telling you, zero negotiations, on that fact: 1 report per page. I am not going to spend a dozen hours of my time figuring out all the different ways users will screw up the multi-request per page process, and ensuring that it doesn't break something. Also when you get more than 1 report per page you start getting into size issues. If size issues do become a factor for single reports, those reports can be split into sub reports (A-M, and N-Z for example using the |title_after=, |title_before= functionality). Take a look at Wikipedia:Database reports. Each report has its own page and associated talk page for issues revolving around it. A similar process would be ideal here. Werieth (talk) 20:26, 11 November 2013 (UTC)
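The A-M / N-Z split mentioned above can be illustrated in Python. The exact semantics of |title_after= and |title_before= are not spelled out on this page, so the sketch assumes an inclusive lower bound, an exclusive upper bound and case-insensitive comparison; the function name is hypothetical.

```python
def filter_titles(titles: list, title_after: str = "", title_before: str = "") -> list:
    """Keep titles t with title_after <= t < title_before (blank = no bound)."""
    selected = []
    for title in titles:
        key = title.casefold()
        if title_after and key < title_after.casefold():
            continue  # sorts before the lower bound
        if title_before and key >= title_before.casefold():
            continue  # sorts at or after the upper bound
        selected.append(title)
    return selected
```

Under these assumptions, one Request with `|title_before=N` and a second with `|title_after=N` would between them cover every title exactly once.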
"Why would there ever be a need for this in ..." -- when was the universal scheme reduced this way? (Not long ago someone told me I was too limited by thinking "RELC" only; the reports are much wider you know!). It is simple: the need for a report can arise in every space, so can be asked in every space (not mainspace; excluded of course). Repeatedly on this page I have mentioned other spaces where sensible requests can arise. I see no reason why the bot should be limited to specific spaces. "only being used correctly in ..." -- that can only arise when extra limitations are added to the manual.
Yes, keeping the requests apart when on a single page, is the issue. It surprises me that while you were thinking about an request_id, you already had decided to kill any need for it.
"more than 1 report per page ... size issues" -- Of course not more reports on one page. I never said that (it could happen if you stick to write-report-on-request-page, but that should be abandoned as I described before). And early on we decided not to split pages in version 1; so I am surprised that an A-M and N-Z split pops up again. Why then did I work loyally with the "one page per report" conclusion? I cannot follow this. Anyway, this size issue only pops up when multiple reports per page are suggested. I won't.
The size issue I described is this: if you put an 800k page together with the Request on the same page, that page becomes way too slow to do edits (tweaks) in the request. That is when users go away.
You mention Wikipedia:Database reports as an example. First: these are all created by human intervention (verbose requests on a talkpage); then the programmer thinks up a page title. The PRP does not want human handling of a request, nor shall necessary pagenames be created by human thinking (for me, that also includes preventing the need for a user to create a new page). So it is just a matter of automated, systematic page name creation. That is where the parallel breaks: these PRP reports are to be much quicker to set up, they should not be for the techies only.
And by the way, in the DB reports you mention: a. all pages are children of a single parent page, and b. the "configuration" is a separate page! Both are what I proposed here earlier. Why appreciated differently?
Are there any other remarks you can make after reading my postings on this page? I'd rather not having to discover them later. -DePiep (talk) 23:16, 11 November 2013 (UTC)
Yes, a report may arise in any namespace, but realistically we shouldn't have random subpages in the article talk namespace, or category namespaces really. For the most part either the Wikipedia namespace or the user namespace should address 99% of the cases. Like I have said, the request template and the report will be on the same page; I am not going to screw around trying to ensure that users cannot screw the bot up with generating reports from random pages and having them saved to random locations. It's not worth the 6+ hours it would take to ensure a sane, logical, and safe way of doing it. You misinterpret what I mean by splitting a report. If for some reason a particular report becomes too cumbersome for a project, the existing report should be disabled and an appropriate number of new reports should be created (whether that's an A-M split, by first letter, by page size or any of the other filtering options). Thus any single report isn't actually being split by the bot, but by users who have created new reports. Creating a report via this process is as simple as throwing a template on a page where you want a report created. The bot really doesn't care what the title is; it looks at the config template, and uses those parameters to create whatever report is wanted. If you take a look, the "configuration pages" are not config at all, but are uploaded Python scripts that create those reports. Nothing in the reports can be configured on wiki; it just provides technical information. PRPs are trivial to create/modify if you want to play around with the template parameters that I have at User:Werieth/Request. If you cannot accept the one request per page and that the request is on the page where the report is saved, I'll just stop development. There are only a few issues that I am making a hard rule about; that is one of them. I am not going to spend hours upon hours idiot-proofing my code.
I am a software engineer by profession and I know/can foresee a large number of problems if we don't do it that way. It's also going to make maintenance of such a bot a fairly large time sink. Werieth (talk) 23:44, 11 November 2013 (UTC)
After six days, and only after I probed the issues for the n>>1'th time, you are here to declare four preconditions in concrete, molded and hardened somewhere else. That makes me wonder if there are other such claims sleeping somewhere. So I ask, before going ahead: is there anything on this page or in the bot-wiki-user interfaces you would like to have clarified or fleshed out? And apart from these four preconditions, is there any other issue that must be concluded, or will cause another precondition, before going ahead? Thank you. -DePiep (talk) 09:39, 12 November 2013 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── I'm only seeing two definitive points: 1 page per report which the bot can completely overwrite, and that the request template is on the page where the report is created. Those conditions are not that excessive. Werieth (talk) 10:38, 13 November 2013 (UTC)
You wrote above:
1. the report will go on whatever the page is using the request template
2. it will only support 1 request per page
3. I do NOT support the primary listing in the talk namespace. (triple stress in original)
4. I really only see this template being used correctly in the Wikipedia and user namespaces (introducing a new "correctly").
That is four, not two. I will not discuss them now, except for the point "must be on the same page": a bot unable to handle the very concept of a page link may be too limited to work with Wikipedia, or the internet in general.
You do not mention any points you'd like clarified or solved early on (my question). That gives me the expectation (by experience in and out of this mini-project; maybe I am a pro too in this) that they can and will pop up later on, causing the need for late redesigns, an underperforming project, or a project dustbin.
  • My beef is this. I saw this mini-project to have the bot-editor interaction one step above the known botrequest process. A core bot run itself can be very stable, and could cause no unsolvable problems IMO. The higher level process should allow automated requests (not human talkpage requests), user-friendly as in: the editor does not have to be tech-savvy to make a request, and it should serve a larger audience than the requesting editor. The bot request-report should be wrapped in a user-oriented presentation (template, help, options, control, overview). These bot requests just don't need to be that difficult or frustrating for an average editor.
From there both bot-wiki interfaces (request-side, result-side) should have the editor's experience designed in too. I worked to design & demo this in designs, templates and a data meta-report. All this requires protocol agreements with the bot/botmaster: the interfaces. This talkpage is filled with such points of contact. And apart from the bot interfaces, I was working on these user-side templates (which are on-wiki all, so have no direct effect on bot workings).
The problem I experience is, that I did not get that much feedback from you. Not on the bot-interface talking points, and by consequence not on the wiki-side/user-side designs. Only when I pushed for a decision here, there appear four preconditions (and I find it telling that you only count two). After that, no response on my open question for the urgent topics.
Now that you have not identified any early-decisions or urgent clarifications I asked for in my previous post, I must expect more preconditions and similar difficulties later on in the process. In other words: first you do not interact with my side of the designs, and later conclusions are imposed. That does not contribute to my goal of the user-oriented design (that is absent this way), and lets me work for the bin. It is not how this software development works. What will result is another bot incommunicado, that does take-it-or-leave-it, and the editor can research for themselves how the black bot box works.
Concluding, since you did not get involved in discussing the overall PRP design, and do not take ideas from the user-side argument, the development & cooperation as I intended it is frustrated. I can expect a repetition of this. Sure the bot may work, but the user-experience is not incorporated and so I expect it to be another ask-the-bot-per-issue talkpage. That is a pity, but I cannot change that apparently. I abandon this mini-project. -DePiep (talk) 11:10, 14 November 2013 (UTC)
@DePiep: Of the 4 bulleted points just 1&2 are actually concrete; 3 is a personal opinion and preference which has zero impact on the bot, so take it more as a strong suggestion, but not definitive. Point 4 is a prediction/observation and not much more than that, as I was going to configure the bot to just drop the report wherever the request template is located. As for user-side configurations (i.e. wiki templates and similar), I can pretty much write the bot to work with just about anything you want. I was thinking that I just dump the relevant information into a Lua module and you were going to output it however you want it. If you want explanations on why I am putting my foot down on points 1&2 I can explain it, but simply put, having multiple requests per page and having the request template on a page different than the output introduces about a dozen problems that I can easily foresee and probably dozens more that I can't see. One easy example I can think of is what happened to MiszaBot when it first started archiving talk pages: vandals often used the bot to flood talkpages with stuff from other talk pages. Misza had to introduce a |key= into the template which must be requested from the bot op. I really would rather not spend hours ensuring that the bot cannot be abused to cause that type of disruption. Keeping the request template and results on the same page makes it both simpler and more secure. I am making the one request per page rule because it's a pain in the ass trying to remove/add/manipulate wikitext in a manner that would enable the bot to save multiple reports per page. Defining what the bot should delete, what it shouldn't delete, and where it can and can't modify parts of a page would become another 6+ hours of coding and review to ensure that it doesn't break stuff. Werieth (talk) 11:48, 14 November 2013 (UTC)
re the development process, I'm not addressing the content issues.
These strongpoints in themselves are not really my concerns. I could say OK with these, let's go ahead. The bad experience is that I had to pull to get something out, they came late, they came inflexible, and I did not find any co-thinking/feedback/support at all for the user-side of it. (This explained: what we called "user-friendly" for lack of a better word. My ambition was to make the UI as simple as possible for the average editor - not the tech editor. E.g., the core templates both in content (the Lua data feed!) and like on what page to add them. That would definitely be a step up from the handwritten technical botrequests). That grand plan was curtailed. Writing a "not" in capslock, italics and bold is not an invitation to discuss a suggestion. I read no intention to find other routes for difficult issues, no "can't we ..." question. Also, when I asked about "urgent issues" to decide or to explain, there was no response. This experience - a frustration as you can read - made me expect more of the same: another roadblock ahead, late, underarticulated, battle argumenting, and requiring a redesign of work already done. That is not what I want to spend time on. -DePiep (talk) 09:51, 15 November 2013 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────Werieth@, Maybe this question can bring clarification.
How would you describe a projected user-side of this operation? What would a user see, what would that user do and have to do, how could he/she get a positive experience from it (a useful response), and make him or her come back for more? I assume that user is an average WikiProject user (quite into editing, but not in a software-technical way), but maybe you have a different target user in mind. Working examples of certain aspects can be illustrating too. -DePiep (talk) 07:54, 18 November 2013 (UTC)