Help talk:Archiving a talk page

Wikipedia Help Project (Rated Mid-importance)
This page is within the scope of the Wikipedia Help Project, a collaborative effort to improve Wikipedia's help documentation for readers and contributors. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. To browse help-related resources, see the help menu or help directory. Or ask for help on your talk page and a volunteer will visit you there.
This page does not require a rating on the project's quality scale.
This page has been rated as Mid-importance on the project's importance scale.

Orphaned archive?

Resolved

I was searching Talk:Islamic views on slavery and found what appears to be an orphaned archive. I don't know how to make it show in the archive list, so I'm posting here in case someone else does (I figured it'd be better here than disrupting the topical discussion there). --72.227.105.84 (talk) 20:36, 10 May 2014 (UTC)

I'm looking into it. Posting so as not to duplicate work by multiple people. — Makyen (talk) 21:35, 10 May 2014 (UTC)
The page was created due to an error in the MiszaBot configuration. I have:
  • Corrected the archive name, which became invalid due to a page move on April 12, 2011.
  • Corrected the counter to 4 instead of 112. A new page will be created so as not to disturb any of the current archive pages.
  • Changed the minimum number of threads required to be archived at one time from 10 to 1.
  • Changed the time that threads remain on the page from 10 days to 90 days. At this point, the amount of traffic on the page does not justify 10 days. I also feel it is inappropriate to suddenly spring a 10-day archiving period on people after no auto-archiving for an extended time.
  • Changed the archive box to automatically detect archives so that maintenance requirements are reduced.
  • The archive box has changed color to the standard colors for talk pages. If you desire to change it to what it was to match the TOC, I left commented out text in the {{archives}} template which will do that and a description there as to what to uncomment.
  • Added archive indexing (Although that bot is more or less defunct).
With those changes, the archive page in question is now available in the links from the talk page. You should also now have a functional archiving configuration. I will watch the page through the first pass of lowercase sigmabot III (lcSB3) to verify that everything is working. lcSB3 has not done any archiving in the last couple of days, so there may be a temporary problem with it and archiving may not start for a day or two. — Makyen (talk) 22:19, 10 May 2014 (UTC)

Rationale?

Personally, I completely agree with the idea of having archives, and "get" their purpose, but I don't see it properly documented anywhere. Is there any kind of "mission statement" for this archiving task? (Preferably focused on its usefulness for noticeboards.) -- Jokes_Free4Me (talk) 10:18, 18 May 2014 (UTC)

Why two archive bots?

Two different, yet similar, archive bots. Why?? One uses hours, one days; one gives max size as xxK, one spells out max as xx,000; one has a counter parameter, one does not; one says "algo", one says "age". Really, is there any sense to all this? For the non-bot educated editor who sees problems, what is the best way to solve the problems? I raise this because in looking at templates on the article talk pages I see conflicting info as to what bot is in use. Come on, bot-wizards, fix this and give regular users a unified, user-friendly archive template. Thanks. – S. Rich (talk) 05:56, 8 June 2014 (UTC)

It is this way because it is how it developed over time. There are not good reasons for it. As with many things on Wikipedia, it is the result of volunteers seeing a need and responding to that need.
Yes, in an ideal world it would be nice if the parameters for the bots were identically formatted. However, the current use is not going to be changed to that for a variety of reasons. The bots are run by different volunteers who would be unlikely to be interested in modifying their functional and approved bots to handle parameters in some other manner. In addition, both bots have a large base of pages which already have the parameters formatted in the manner each bot is expecting.
It is, unfortunately, not possible to create an over-template which uses a unified parameter syntax and use that. This is because each bot must have its exact template on the page in order to function.
In my experience, the most common source of conflicting information as to which bot is in use is which bot is reported by either an archiving notice and/or an archive box template vs. which bot actually has a configuration template on the page (or if there is a configuration template on the page). The bot that is actually in use is the one that has a configuration template on the page. The bot in the archiving notice or archive box template should be changed to reflect that fact. The one caveat there is that all MiszaBot/Config templates actually are handled by lowercase sigmabot III. Any page which reports that it is being handled by one of the MiszaBots should be changed to lowercase sigmabot III. — Makyen (talk) 06:26, 9 June 2014 (UTC)

Size of Archives

For the summary of the discussion see the Closure statement section. Armbrust The Homunculus 20:30, 19 August 2014 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

A point of debate came up between myself and Technical 13 regarding the optimal size of a talk page archive. While the page does list the ideal size for the talk page, it does not enumerate the ideal size of a talk page archive. A quick tour of several pages revealed configuration for archive sizes from 100k to 700k. So that we can have a codified answer, what do people think is a reasonable size for the archives of a talk page before whatever archiving process is being used spills into the next archive. Hasteur (talk) 18:45, 24 June 2014 (UTC)

I am going K? Spills into the next archive? We use GIGAbytes these days for single files! WTF? Next time I see the opportunity, I will manually archive a talk page with a pointer to the full version. 75.152.119.10 (talk) 13:17, 13 July 2014 (UTC)
My personal viewpoint is for somewhere between 120k to 150k bytes. Obviously some pages are much more deserving of larger archive sizes (else certain whopper threads will dominate an entire archive by themselves) but the amount of casual access of the talk page archives is low enough that I feel a doubling of the main talk page size is reasonable. Hasteur (talk) 18:49, 24 June 2014 (UTC)
  • I've changed the 700KB ones you mentioned because pages crash and fail to load between 120-250KB, and anything over that is asking for trouble. My personal preference is in the 120-150KB range but I need something documented somewhere to be able to link to that from the template in question. — {{U|Technical 13}} (etc) 18:51, 24 June 2014 (UTC)
    • If our ANI archive pages were max 150k we would have archives in the 4000's range. No. –xenotalk 18:55, 24 June 2014 (UTC)
    • T13, I think you better undo that change and pray that nobody saw it as AN and ANI operate on a very specific archiving routine and individual threads can go onward for ~150k without breaking a sweat. Hasteur (talk) 18:57, 24 June 2014 (UTC)
      • Xeno already reverted it, and I've posted on WT:AN about it because those archives are useless if people can't open them because all they see is WikiMedia Error:... then that means the page is too big to load and there needs to be an adjustment... — {{U|Technical 13}} (etc) 19:11, 24 June 2014 (UTC)
There is an unresolved, intermittent issue with archiving failing when pages get beyond 512 KiB. An example of this issue is at Template talk:Automatic taxobox/Archive 8. I would suggest that any size be kept well below 512 KiB so as not to hit that.
WT:AN and other very high volume talk pages are significant exceptions which are in a class by themselves. They should not be considered when adopting suggestions for general use. It can easily be explicitly stated that they are the exceptions which they, in point of fact, are.
What size to use for archive pages is a trade-off which needs to balance a desire to have relatively few – 10s to 100s (high volume), not 1000's – archive pages created over a significant time (years) with the need to keep archive pages small in order to allow those with less capable machines to still be able to view the archives. Obviously, the MediaWiki software is also a constraint due to a variety of limits. Which limit is hit will depend on the content of the page. There are some talk pages which already run into some of those limits.
In my experience, for the vast majority of pages, archives in the 100kB range are more than sufficient for the amount of traffic which they experience. Obviously, for higher volume pages that should be adjusted upwards. Keep in mind that most pages experience a relatively low volume of traffic on their talk pages. In addition, when they do experience a high volume it is usually something that is only in a burst of traffic (perhaps over a couple/few months), not a high volume of ongoing traffic over years. It is quite reasonable for the size of current and new archives to be adjusted if the level of volume changes significantly over an extended period of time. — Makyen (talk) 21:40, 24 June 2014 (UTC)
With WP:FLOW on the horizon isn't this discussion somewhat academic? WaggersTALK 09:21, 25 June 2014 (UTC)
FLOW has been "on the horizon" for months if not years (see WP:LQT), and considering the WP:VE rollout (and Vector before it, and monobook before that, etc. going all the way back to WikiAntiquity), it's almost certainly going to be massively unpopular and take months or years to fully roll out. So yes, this does matter. --NYKevin 03:13, 30 June 2014 (UTC)
  • I see no reason whatsoever to impose a limit on the size of archive pages. The limits we set for wikipedia articles stem from our mission to provide access to readers and editors on older machines and slower connections. For archives of talk and project pages, the rough goal is still the same but the imperative is much less strong. If someone on a much older machine can't easily read User_talk:Protonk/Archive_5 we haven't failed in our mission to readers. Moreover, when I manually archive my talk page I have no interest in looking up a policy or help page to see if I'm violating some rule on what is likely to be one of the least trafficked pages on the wiki. I'm especially uninterested in being told after the fact that I have to move 50k here or there (or worse, have someone else edit the archive) in order to meet some incredibly minor use cases. As for the considerations below, the first is unimportant because the second limit will be breached well before it. The third is a concern, but we all need to keep some perspective on how likely it is that any user, let alone a user with a machine that can't load 1-2 Mb of content per page, will ever read a talk page archive. I'm not saying it doesn't happen. It obviously does. But it's just not that prevalent, so let's not overstate the upside. Protonk (talk) 15:21, 15 July 2014 (UTC)
  • Considerations:
1) Technical: It needs to be small enough so an edit-and-save of the entire page won't cause difficulties with the server. This applies to all pages, not just talk page archives.
2) Technical: It needs to be small enough that an editor can edit-and-save the entire page without having difficulties at his end, assuming he's using a browser less than a few years old and assuming he's not using a browser or device that is simply unsuited to the task of editing pages (e.g. some cell phone web browsers just aren't good for page-editing). This applies to all pages, not just talk page archives.
3) User experience: It needs to be small enough that an editor can scroll from top to bottom in a reasonable period of time OR it needs to have a table of contents section at the top. As most talk page archives have such an index this likely won't be an issue.
Bottom line: If it's got a table of contents, just use the same maximum size as for any other Wikipedia page. davidwr/(talk)/(contribs) 17:51, 1 July 2014 (UTC)

Three archive maximums

  • 1. 70,000 for User talk pages
Under page User:MiszaBot to setup user talk page archiving: {{subst:User:MiszaBot/usertalksetup}}
That setup has: | maxarchivesize = 70K
Under page User:MiszaBot/Archive_HowTo, Example 2 has: | maxarchivesize = 70K
  • 2. 100,000 for article talk pages
Under page User:ClueBot_III, Example: Changing from MiszaBot to ClueBot III
MiszaBot                  ClueBot III
|maxarchivesize = 100K    |maxarchsize=100000
Note that this talk page's article has a similar comparison between User:lowercase sigmabot III archives max 100K, and User:ClueBot III archives max 100000.
  • 3. 150,000 for high-traffic Wikipedia project pages
Under page User:ClueBot_III, an example for Numbered archives has: maxarchsize=150000
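As an illustration of the two notations compared above (xxK for MiszaBot-style configs, a plain byte count for ClueBot III), here is a minimal Python sketch that converts one into the other. The function names are made up for this example. The multiplier of 1000 is an assumption taken only from the comparison above, which pairs 100K with 100000; the bots themselves may count a "K" differently.

```python
import re


def misza_size_to_bytes(value: str, k_multiplier: int = 1000) -> int:
    """Convert a MiszaBot-style size such as '100K' to a plain number.

    k_multiplier is an assumption: the comparison above pairs 100K with
    100000, which implies 1000, but the bot may interpret K differently.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([Kk]?)\s*", value)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number = int(match.group(1))
    # A trailing K scales the number; a bare number is returned unchanged.
    return number * k_multiplier if match.group(2) else number


def to_cluebot_param(value: str) -> str:
    """Render a ClueBot III-style parameter for a MiszaBot-style size."""
    return f"|maxarchsize={misza_size_to_bytes(value)}"


if __name__ == "__main__":
    for size in ("70K", "100K", "150000"):
        print(size, "->", to_cluebot_param(size))
```

This only rewrites the size value; the other differences between the two templates (hours versus days, counter, algo versus age) are not handled.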

Maximum is actually minimum

Description: The target maximum size of the archive in bytes before %%i (see format) is incremented. If 0, this is disabled. In general, this parameter is used for numbered archives, but not for archives organized by date. This is not a hard limit. Resulting archive page sizes will almost always exceed this number, perhaps by a great amount. Each time ClueBot III runs on a page it archives all threads that are old enough to qualify for archiving into a single file. If you have maxarchsize=100000 with a current archive file size of 90k and it ends up that there are 60 threads to archive with a total size of 250k, then the current archive will be extended to 340k despite of [sic] the 100k limit.
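The overflow example in the description above can be worked through in a few lines of Python. This is only a sketch of the behaviour as described (everything that qualifies goes into the current archive in one run), not ClueBot III's actual code; the function names and the individual thread sizes are invented for illustration, and checking the target with ">=" after the run is an assumption.

```python
from typing import Sequence


def archive_size_after_run(current_size: int, thread_sizes: Sequence[int]) -> int:
    """All qualifying threads are appended to the same archive in one run."""
    return current_size + sum(thread_sizes)


def counter_should_increment(archive_size: int, maxarchsize: int) -> bool:
    """The counter moves on once the target is reached; 0 disables the check.

    Using >= here is an assumption; the description only says 'before'.
    """
    return maxarchsize > 0 and archive_size >= maxarchsize


if __name__ == "__main__":
    # 60 hypothetical threads totalling 250,000 bytes (50 x 4,000 + 10 x 5,000).
    threads = [4_000] * 50 + [5_000] * 10
    new_size = archive_size_after_run(90_000, threads)
    print(new_size)  # 340000, as in the quoted example
    print(counter_should_increment(new_size, 100_000))
```

The point of the sketch is that the check happens per run, not per thread, so the target can be overshot by the full size of one run.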

According to WP:TALKCOND: "Large talk pages become difficult to read and strain the limits of older browsers. Also loading time becomes an issue for slow internet connections. It is recommended to archive or refactor a page either when it exceeds 75 KB, or has more than 10 main sections." That's everything in my notes on this topic. FWIW: My browser does not like archives greater than 500K.
Cheers. —Telpardec  TALK  20:57, 4 July 2014 (UTC)

Telpardec: Main pages are recommended to not go over the 75 KB; that's why I floated the suggestion of an archive being about 150 KB, so that there's space for high frequency archives, but at the same time not have certain templates start nagging visitors when the talk page archive goes above 75 KB. Hasteur (talk) 22:42, 4 July 2014 (UTC)
@Hasteur: Thanks for the clarification. My additions above were in response to a request for a documentation page to link a certain template to. There is more than one source of information regarding the maxarchivesize parameter. Perhaps recommendations based on the above could be incorporated into this talk page's article. High frequency pages like AN/I do not need greater than 150K maxarchivesize, since the bot only stops using a particular archive number after the max size is reached or exceeded. (Hmmm... Why would anyone want to be nagged by a template? :) Cheers. —Telpardec  TALK  23:20, 4 July 2014 (UTC)
Re: the template nagging, I think Hasteur is talking about the MediaWiki message MediaWiki:Longpagewarning (see its source code), which has been deprecated for over three years. Graham87 03:37, 5 July 2014 (UTC)
Oh, never mind. It's about the changes discussed above in this thread. Graham87 04:39, 5 July 2014 (UTC)
Graham87/Telpardec: I'm referring to this diff, which inserted a nag into pages that used the Archive basics template and kicked off this entire discussion. Hasteur (talk) 16:32, 5 July 2014 (UTC)
  • I think it is a fine idea to limit the size of archives for heavily trafficked pages such as ANI, the main page, etc. I don't think we need more than general guidance for article talk pages, and I don't see any need at all to enforce limits on user talk archives. I don't think that is a problem, so it doesn't require a solution. I am also wondering if this is really a topic of broad interest and wide impact, as the listing at CENT would imply. (I note the last comment before this one was four days ago.)
Since this seems to be a discussion between some more technically minded people I have to ask: I often see people commenting on how many KB a page is. To us non-technically minded people this is a bit of a mystery. How do you even find that number? Beeblebrox (talk) 19:56, 9 July 2014 (UTC)
  • Beeblebrox, there are multiple ways you can find the raw page size, including, but not limited to: the page history (each entry shows a size such as "(42,256 bytes)"), the page information page (append &action=info to any page URL), or the {{PAGESIZE}} magic word (which is expensive only when used to find the size of a page you are not on; see mw:Help:Magic words#PAGESIZE). You can find the post-template-inclusion size in the preprocessor report, which is hidden in a comment in the page source, or in the "Parser profiling data" section of any page in edit mode. — {{U|Technical 13}} (etc) 20:11, 9 July 2014 (UTC)
Ok, I guess I did know that first one, but not the others. thanks. Beeblebrox (talk) 21:01, 9 July 2014 (UTC)
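For anyone who prefers to look the number up programmatically, here is a minimal Python sketch. It does not use any of the routes listed above; it uses the MediaWiki Action API's action=query with prop=info, which reports the page length in bytes. The function names are invented for this example, and the sample response is made up for illustration (it reuses the 42,256 figure from the comment above); no request is actually sent here.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def info_query_url(title: str) -> str:
    """Build an Action API URL asking for basic page info, including length."""
    params = {"action": "query", "prop": "info", "titles": title, "format": "json"}
    return f"{API_ENDPOINT}?{urlencode(params)}"


def page_length(response: dict) -> int:
    """Extract the wikitext length in bytes from a decoded JSON response."""
    pages = response["query"]["pages"]
    # In the default JSON format, pages are keyed by page ID; take the only one.
    page = next(iter(pages.values()))
    return page["length"]


if __name__ == "__main__":
    print(info_query_url("Help talk:Archiving a talk page"))
    # Made-up response for illustration only.
    sample = {"query": {"pages": {"12345": {"title": "Example", "length": 42256}}}}
    print(page_length(sample))
```

The length reported this way is the raw wikitext size, the same figure shown in the page history, not the size of the rendered page.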
  • (edit conflict) Beeblebrox: The reason why this discussion is advertised at CENT is that the change that kicked off this discussion imposed a "You're not doing it right" nag that would show up on pages that used the {{archive basics}} template. While there is no explicit guide as to how long the talk page archives should be, the user leading the cause for this argued that the archives should be no longer than the main talk page because of legacy browser support. It has been my experience that unless there is an ironclad consensus or a written-in-policy statement, it is better to wrestle in the mud with a pig than discuss with the user when they feel that they are right. As this is off in the 4th "Over the Horizon" from core talk pages, I felt that a notice in CENT would be helpful given that the best practice will affect many pages throughout the entirety of Wikipedia. Hasteur (talk) 21:33, 9 July 2014 (UTC)
I've never used that template, but I think we can all agree that one editor of one template cannot create new policies by fiat. Beeblebrox (talk) 22:15, 9 July 2014 (UTC)

Followup on archive overflow

After noting the archiving behaviour of ClueBot III above, I came across a talk page where archiving was recently activated and 22 sections with 141,601 bytes were archived at once by lowercase sigmabot III, which spread the sections across 3 filenames, instead of 1 big one like ClueBot. FYI. —Telpardec  TALK  01:55, 11 July 2014 (UTC)
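To illustrate the difference noted above, here is a minimal Python sketch of a bot that moves on to the next archive once the current one has reached its target, so one large batch is spread over several files. This is a guess at the general approach, not lowercase sigmabot III's actual algorithm; the function name, the target size and the thread sizes are all hypothetical and are not the figures from the page mentioned above.

```python
from typing import List, Sequence


def spread_threads(
    thread_sizes: Sequence[int], max_archive_size: int, current_size: int = 0
) -> List[int]:
    """Return the resulting size of each archive file after one batch.

    Assumption: a new archive is started only when the current one has
    already reached the target, so each file may overshoot by one thread.
    """
    archives = [current_size]
    for size in thread_sizes:
        if archives[-1] >= max_archive_size:
            archives.append(0)  # start the next numbered archive
        archives[-1] += size
    return archives


if __name__ == "__main__":
    # 22 hypothetical threads of 6,000 bytes each, target of 50,000 bytes.
    print(spread_threads([6_000] * 22, 50_000))
```

With these made-up numbers the batch lands in three files, and each full file overshoots the target by less than one thread, rather than one file absorbing the whole batch.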

  • Support any standard I like giving people the option to choose anything, but I also like having a single recommended value for people who have no idea what is appropriate. Setting archive sizes is a problem which I personally have faced. Anywhere between 70-150k seems reasonable to me. 70k looks best when there are many 1-sentence messages. 150k is best for longer wiki-style discussions. I would support any consensus in that range which proposed a blanket recommendation of one single value for all cases and in all circumstances. Blue Rasberry (talk) 19:31, 15 July 2014 (UTC)

Closure statement

Official closing was requested at WP:AN, and since this was listed at WP:CENT, it ought "formally" to be closed...but as that's not normal for this kind of page, I figured a statement here ought to work, without the formal "this is an archive; do not modify" warning atop the big box. This is quite clearly a no-consensus situation: after rereading everything, I can't see anything that would attract substantial agreement, aside from Beeblebrox's obvious point that one editor ought not to be making major changes alone. Nyttend (talk) 12:01, 19 August 2014 (UTC)


The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Editing archive for rectifying error

Resolved

I just did this twice.[1][2] I am asking because the archive header says not to edit the content, and there are many pages like these that have errors. So are there no issues with this? Thanks. OccultZone (Talk • Contributions • Log) 14:37, 6 July 2014 (UTC)

Those edits are fine. Any changes to archives that don't alter the actual text are completely unproblematic. In other words, correction of spelling or grammatical errors in archives would not be a good idea. Graham87 03:51, 7 July 2014 (UTC)
I am marking this section as resolved. This post can be referred to if there are any concerns. Thanks Graham! OccultZone (Talk • Contributions • Log) 04:09, 7 July 2014 (UTC)

Size of Archives 2

The discussion above closed without consensus a couple of months ago, but I think we as a community can do better. Andy mentioned above that we are in no danger of running out of subpages. This is of course true, but it is easier to search fewer, larger archives than a multitude of smaller ones. That is the primary argument in favor of maximizing archive size - archives function better as archives when they are less fragmented.

Talk page archives are not routinely accessed the way talk pages are, so a larger loading delay is acceptable. We still need the archives to be visible to all editors, of course - but since they are intended to be static, their editability is not a priority.

I suggest we confine our discussion to "vanilla" talk pages; there will be exceptions such as ANI archives that need not be considered in this edit page. Additionally, this is a discussion about updating the recommended settings on our Help page - the settings will not be binding if local talk page consensuses prefer other settings.

For reference, I created example subpages of (roughly) 127kB, 255kB, 512kB, and 1024kB. They all load in less than a second for me. I have seen various claims that pages above a certain size "cause trouble" for some browsers or connections; is there any actual evidence that this has been studied? The server sometimes hiccups, so simply experiencing an isolated occurrence of load failure is not a particularly strong argument.

Based on the above, I think the "default" automatic archival threshold at which to stop adding new threads and move to the next subpage should be 400kB. By contrast, Google tells me that the average website is over 1600kB. VQuakr (talk) 07:35, 30 October 2014 (UTC)

Other things to consider include: what is the maximum size that the archiving bots can create - for example, the MiszaBots (now replaced by lowercase sigmabot III) apparently froze when the archive reached 2000K; and what is the maximum size that the search feature can handle - the current default search (LuceneSearch) can manage 512K but not 1024K, although I don't know where the boundary is. CirrusSearch (Preferences → Beta features → New search) may move the boundary, but again, I don't know what to, and a bigger issue is that it's not enabled for all users, and has occasional downtime so we need to take both search methods into account.
For those pages where we may run out of subpage numbers (ANI is now up to IncidentArchive860 and at the present rate may reach IncidentArchive999 in about November 2017) the bots should continue (e.g. with IncidentArchive1000) without a hiccup. --Redrose64 (talk) 08:04, 30 October 2014 (UTC)
About the 2000k limit - that is the default page size limit of the MediaWiki software, and I think the value that is currently used on Wikipedia, which is presumably why the bots couldn't cope with it. (See mw:Manual:$wgMaxArticleSize.) — Mr. Stradivarius ♪ talk ♪ 09:10, 30 October 2014 (UTC)
The MediaWiki default is 2048K to be precise. Wikimedia wikis set wgMaxArticleSize to 2000K in http://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php. PrimeHunter (talk) 13:46, 30 October 2014 (UTC)
  • The key factor that you are overlooking, VQuakr, is that browsers do not load wikitext, which is what your demonstration page sizes are based on. The following is a table of your page sizes and what the browser actually has to download.
Wikitext size                            Webpage size       Difference
Average page size according to Google    1600kB (1.56MB)
127kB                                    2100kB (2.05MB)    1973kB (1.93MB)
255kB                                    2228kB (2.17MB)    1973kB (1.93MB)
512kB                                    2488kB (2.42MB)    1976kB (1.93MB)
1024kB                                   3028kB (2.95MB)    2004kB (1.96MB)
So, our page size without any text for some users is already well over the 1600kB average page size. For me, on my mobile device, I start having trouble with archives that are about 170-200kB of wikitext (which is about 2.12MB of actual data to download for the page). This is why I suggested above that the 120-150kB range is reasonable. As far as being able to find stuff in the archives, it's not hard to add a search box to the talk page to find a specific topic or phrase in the archives for that page, so "having to look through multiple smaller archives to find something" isn't a valid reason for opposing this limit in archive size. — {{U|Technical 13}} (etc) 14:12, 30 October 2014 (UTC)
@Technical 13: the search box works fine if you have a keyword or phrase to search that is relatively unusual. Less so if you are browsing, or attempting to follow the historical flow of discussions. Reducing the fragmentation of archives is indeed a "valid" reason, even if it does not apply to your personal workflow. My takeaway from your summary above is that we could reduce the archive fragmentation by a factor of four by increasing the default size from 100k to 400k, while only increasing the web page size by about 15% (2073 to 2373 kB) (and of course greatly reducing the total download size if someone is browsing all the archives). Given the strong positives and apocryphal negatives, this seems like an obvious decision. VQuakr (talk) 17:19, 1 November 2014 (UTC)
  • VQuakr, what you are failing to observe is the fact that webpages on mobile devices start failing to load between 2MB and 2.1MB, so page sizes between 100kB and 300kB do not load well on mobile devices. Based on this, we should try to keep our page sizes as close to the 100kB (2MB) side as we can. 120kB to 150kB seems fairly reasonable for this. — {{U|Technical 13}} (etc) 21:41, 1 November 2014 (UTC)
What is your source for this? VQuakr (talk) 22:50, 1 November 2014 (UTC)
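For reference, the arithmetic behind the figures traded in this thread can be checked with a short Python sketch. It uses only the four measurements from the table above. Treating the overhead as a fixed 1973 kB is an assumption based on the first two rows (the table shows it creeping up to 2004 kB for the largest page), and the function names are invented for this example.

```python
# (wikitext kB, downloaded kB) pairs taken from the table in this thread.
MEASUREMENTS = [(127, 2100), (255, 2228), (512, 2488), (1024, 3028)]


def overhead_kb(wikitext_kb: int, webpage_kb: int) -> int:
    """Data downloaded beyond the wikitext itself."""
    return webpage_kb - wikitext_kb


def estimated_download_kb(wikitext_kb: int, fixed_overhead_kb: int = 1973) -> int:
    """Estimate the download size, assuming a constant overhead (an assumption)."""
    return wikitext_kb + fixed_overhead_kb


def relative_increase(old: float, new: float) -> float:
    """Fractional growth from old to new."""
    return (new - old) / old


if __name__ == "__main__":
    print([overhead_kb(w, p) for w, p in MEASUREMENTS])
    small = estimated_download_kb(100)
    large = estimated_download_kb(400)
    print(small, large, f"{relative_increase(small, large):.1%}")
    # Four times the archive size means a quarter as many archive pages.
    print(400 / 100)
```

Under that assumption, going from 100 kB to 400 kB of wikitext raises the estimated download from 2073 kB to 2373 kB, roughly 14.5%, while the number of archive pages drops to a quarter. Whether pages of that size actually fail to load on some devices is a separate question that this arithmetic does not answer.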

Auto-archiving of WP:ANRFC

Could anyone advise how this page can be autoarchived? This would not be a time-based archive, but should be activated by the addition of the {{done}} template within a section (as things should remain listed until they have been closed). Cheers, Number 57 21:36, 17 November 2014 (UTC)

How to archive

For beginners, on the Help page go to "Cut and paste procedure" near the beginning and follow the instructions underneath in the box headed "Simplified procedure for archiving". This is the easiest way. All the other instructions are very confusing and hard to understand, IMO. ~ P123ct1 (talk) 08:54, 6 December 2014 (UTC)

Unrelated comment

Hi kikichugirl.

Actually, I tried to create a page for a company. I'm completely new to the wiki, but I have followed some of the instructions. I'm a little confused about my page getting deleted. Can you please roll back my page? Is there any issue with me creating the same page again? — Preceding unsigned comment added by Vthink developer (talk • contribs) 09:48, 13 January 2015 (UTC)

@Vthink developer: This is the talk page for discussing improvements to the help page Help:Archiving a talk page. You probably intended to post to the talk page of the user kikichugirl (talk · contribs). --Redrose64 (talk) 14:02, 13 January 2015 (UTC)

Legobot

Shouldn't the information about Legobot archive indexing be removed from the page? According to the User:Legobot page, the indexing function is inactive ("Replacement for User:HBC Archive Indexerbot -Inactive"). Vanjagenije (talk) 12:16, 23 February 2015 (UTC)

Yes, It should be removed I think. It's currently misleading to keep giving the indexing instruction using that bot. Tvx1 00:27, 11 March 2015 (UTC)
  • I remember talking to Legoktm about this not too long ago, Legobot is still indexing. That said, I'm guessing you are reading it as legobot inactive when I read it as it's doing the task the inactive HBC used to do. — {{U|Technical 13}} (etc) 03:17, 11 March 2015 (UTC)
    • Just to make it clear, based on this report, I updated the table on Legobot to say the tool is active. — {{U|Technical 13}} (etc) 03:34, 11 March 2015 (UTC)

Archive Index

How can I display an archive index in my talk header like the one seen on this talk page? I have been trying some stuff but I can't figure it out. Jahn1234567890 (talk) 16:33, 23 March 2015 (UTC)