Template talk:Cite journal/Archive 2009 October

Should Citation bot always add ISSNs?

In response to my complaint that the Citation bot was adding ISSNs to all the journal-article citations in the Mango article, the bot's maintainer replied "I think I recall discussing this issue at length at Template talk:Cite journal a number of months ago, and the consensus was that ISSNs were valuable and should be added wherever possible." I just now read Template talk:Cite journal/Archive 2009 May #ISSNs are useful, independent of DOIs and see no consensus at all for a bot adding ISSNs to all uses of {{cite journal}}. On the contrary, the summary of that discussion focuses on whether the bot should remove ISSNs, deciding (correctly in my opinion) that the bot should leave them alone. I see no consensus for adding ISSNs everywhere.

I shudder to think of what an article like Autism would look like if ISSNs were added for each citation. For one thing, many of the ISSNs would be duplicated, as the article cites journals like the Journal of Autism and Developmental Disorders. For another, the ISSNs would often be inaccurate, as medical journals often have multiple ISSNs (one print, the other electronic) and it's not clear which one should be added automatically. Adding ISSNs may be helpful in some fields, but in medicine it is definitely counterproductive, as the ISSNs take up valuable screen real estate and contribute almost zero useful info that is not already easily available, resulting in a net negative for the encyclopedia. They should not be added automatically by a bot. Eubulides (talk) 17:35, 7 September 2009 (UTC)

I just noticed this and hell no. These are useless links which serve no purpose whatsoever other than clutter citations. Headbomb {ταλκκοντριβς – WP Physics} 22:00, 7 September 2009 (UTC)
I never understood why we should even use ISSN in this template. Well, I read the April 2009 discussion, and it seems that the ISSN is useful for inter-library loans and offline versions. However, the bot shouldn't be adding them en masse, especially when the journal name field already points to the article on the journal, or for journals like Nature. Let editors add them on a case-by-case basis, as there are many cases where they are just clutter (many citations from the same journal, medical articles which already have PMID numbers, etc). --Enric Naval (talk) 00:04, 8 September 2009 (UTC)
What guidelines should editors follow when deciding whether to add an ISSN? Martin (Smith609 – Talk) 03:00, 8 September 2009 (UTC)
  • An ISSN should be added only if it's genuinely helpful. One possible (albeit extreme and hypothetical) example would be an obscure journal whose name is a duplicate of that of a better-known journal, in a citation that does not have a DOI or PMID or any other systematic identification. At the other extreme, well-known journals like The New England Journal of Medicine don't need ISSNs.
  • Whatever rule is used, I don't see how it could be automated. Certainly different editors will have different opinions on which journals need ISSNs, possibly depending on the field or even the particular case. A style should not be forced by the bot.
Eubulides (talk) 05:14, 8 September 2009 (UTC)
It's a multi-tiered question, really. Is an ISSN helpful to editors? Yes, absent other, more specific linkages to the article or journal. Is it helpful to readers wishing to dive deeper into the sources? Again, absent a specific link to the article. I'd suggest we default to adding them into the parameters but only display them when there is no DOI, PMID, or even journal wikilink, e.g. New England Journal of Medicine. For historic content, predating the use of ISSN numbers, an OCLC is a reasonable substitute, though neither as reliable nor as widely accessible. By all means give article editors an option to suppress display, but build the web. Even if the ISSN defaults to hidden in all cases, it should still be captured for use in the emitted metadata. Let's not conflate the display functionality of {{cite journal}} with the important linkbuilding functionality of Citation bot. LeadSongDog come howl 13:49, 8 September 2009 (UTC)
Right now, {{cite journal}} conflates display with metadata, by always displaying the ISSN. If the template didn't conflate the issue perhaps matters might be different. But in the meantime the bot should not be adding the ISSN and messing up the article's display. Eubulides (talk) 08:02, 9 September 2009 (UTC)
Reiterating the previous discussion, an ISSN is useful even when there is a DOI or PMID. These identifiers do not always lead to open access articles & many inter-library loan forms do not have fields to include that information. While many articles on journals in WP have ISSNs, some do not. I don't see why you'd exclude information that assists people in obtaining copies of the article & including the ISSN lowers the bar of having to refer to the WP article on the journal or otherwise search for the ISSN. ISSNs, if given, should be visible. I am ambivalent about adding them via bot. The only valid argument against doing this that I have read in any discussion is one of aesthetics (I can see why some think they are clutter, but I do not) & my !vote is to have function over form. --Karnesky (talk) 21:19, 8 September 2009 (UTC)
  • It's not primarily an argument about aesthetics. It's primarily an argument about utility. Screen real estate is valuable, and cluttering it up with ISSNs wastes a valuable resource. If we're going to use up space, it'd be much more valuable to spell out the journal title if abbreviated, or to list the name of the journal's publisher, or to spell out the authors' full names, etc., etc., but these are also (rightly) not done by the bot because they also chew up the user's screen for what many editors consider to be so small a benefit that the overall effect is harmful. It is not for the bot to second-guess the editors' judgment here.
  • ISSNs are not valuable for medical articles, even under the above scenario. Any library that can get you a copy of Journal of Autism and Developmental Disorders via interlibrary loan has a web browser that can resolve DOIs for you. Medical libraries may be a bit behind the times, but they are not in the dark ages! Perhaps matters are different in other fields, but in medicine the ISSNs should almost always be omitted.
Eubulides (talk) 08:02, 9 September 2009 (UTC)
The screen space argument may have some bearing on what {{cite journal}} should render on screen, but no bearing on what should go into the COINS metadata. The NLM maps journal names and abbreviations to NLMID, ISSN, eISSN, and OCLC numbers for a simple reason: many journals have ambiguous names. The ISSN is not ambiguous. LeadSongDog come howl 14:21, 9 September 2009 (UTC)

I think Martin should be free to continue adding this as useful metadata. The template should be altered so that it does not display the ISSN if a DOI is available (or some other display policy to be thrashed out by consensus); the point is that what information we choose to display should be decoupled from what information we record. Hesperian 04:05, 10 September 2009 (UTC)

This is backwards. The bot is breaking articles now. I am disabling it for articles where it is gratuitously inserting ISSNs (and now month names too! what's up with that?). The bot should not be inserting unwanted gorp into today's articles on the offchance that the template will be fixed at some future date. The bot is supposed to be helping editors produce better Wikipedia articles; it should not be getting in the way. Eubulides (talk) 06:09, 10 September 2009 (UTC)
To bluesky for a moment, another option could be to generate by default a wikilink to the journal's WP article. Many of these would of course be redlinks at first, but that might motivate more editors to get on with stubbing out those journal articles. Or would that suggestion be too Pointy? That approach minimizes the screen clutter yet provides the additional metadata at the cost of a clickthrough. Yet another option would be to craft some html code to display the extended metadata on a mouse hover. I'd like to see opinions on that approach from WP:WikiProject Accessibility members: it seems to me it could be friendlier to screen readers. The Web Content Accessibility Guidelines suggest hovertext. LeadSongDog come howl 05:28, 10 September 2009 (UTC)
Wikilinking to journal titles is typically WP:OVERLINKING and a mistake. References already have too much blue text. These extra wikilinks are more likely to cause trouble (by people clicking on them by mistake, when they wanted info about the source) than they are to help. Also, these relatively-useless wikilinks make it harder for people using screen readers to navigate through the citations. Eubulides (talk) 06:09, 10 September 2009 (UTC)
No, it's not. When I click "ref 45" I don't want to have to backtrack to ref 21 to get my wikilink to Acta Physica Polonica B. The reference section is not prose to be read in bulk, each one should stand on its own. ISSNs add nothing at all for anyone. You can google "ISSN 0028-4793" just as well as "N Engl J Med". Headbomb {ταλκκοντριβς – WP Physics} 08:26, 10 September 2009 (UTC)
It's quite possible that you prefer wikilinking every journal title. But this preference is certainly not universal, as it causes real problems for the user interface. If anything, I'd say the consensus (if there is one) is more against wikilinking journal titles than for it, on WP:OVERLINKING grounds. This is independent of whether just one instance of the journal title is wikilinked, or all of the instances. In any case, the citation bot should not be adding these blue links against editorial preference. Eubulides (talk) 00:49, 11 September 2009 (UTC)
Wow, Eubulides and I agree that "References already have too much blue text. These extra wikilinks are more likely to cause trouble (by people clicking on them by mistake, when they wanted info about the source) than they are to help" :-)
IMO by far the highest-priority link is one that links to the actual source: for articles, DOI or URL (or both, if DOI leads to a subscription page but URL leads to a fair-use copy on the author's web site); for books, URL to author's copy or to Google Books if that presents the extract.
The best of both worlds would be a gadget that shows or hides wikilinks to other items, e.g. author, journal, publisher. But I suspect that would be of interest only to reviewers. --Philcha (talk) 07:24, 10 September 2009 (UTC)
My priorities for which of the links to display by default are: ref's doi and/or fair use url direct to the source, wayback or other archivelink, ref's pmid, arXiv or best similar index id, authorlink, then one of the journal's links (wikilink, issn, oclc, or other). I'd be happy to put them all out of sight behind a hover or clickthrough, but less happy with using Pop-ups, as they are not so consistently available and accessible to all users. (Think mobile.) While I haven't seen research on it, I expect that logged in editors are likely to want more links than are IP readers. As an editor I value the pmid index link nearly as much as the article doi link because it helps put the source in context, showing other sources which cited it, without being captive to one publisher. As a reader of a well developed WP article, not so much. But we should keep in mind that the average reader is much less nimble at navigating the way through various indices and search engines than the average wikipedian citegnome is. Give them links for the What and Who. The Where, When, Why and How don't need links displayed to IP users by default. LeadSongDog come howl 13:34, 10 September 2009 (UTC)
I tend to agree with the above comment, except that I'd rather not see any journal wikilink because that blue text harms me more than it helps. The point is that the Citation bot should not be imposing this wikilink on the articles that I help edit; there is certainly no consensus that they should be added everywhere. Eubulides (talk) 00:49, 11 September 2009 (UTC)

Having data that is not displayed & having role-specific displays both seem bizarre to me. Are there other examples on WP where we hide data and/or show anonymous users something different from what logged in users see? --Karnesky (talk) 15:45, 10 September 2009 (UTC)

Basically, anything that you can set via Special:Preferences represents user interface elements that can be different for logged-in versus IP users. This includes several quite-different ways to display equations, for example. Eubulides (talk) 00:49, 11 September 2009 (UTC)
That differs from what it seemed LSD proposed, though: the defaults for logged in users are the same as the defaults for anonymous users. MediaWiki allows usercss & javascript too, so you can certainly fix these to your personal taste. No reason to force those changes onto other users, though. --Karnesky (talk) 02:40, 12 September 2009 (UTC)
Again, this is backwards. The Citation bot should not be imposing an editorial style on editors. The forced addition of ISSN and month is obviously controversial. The ISSN adds to the forest of blue links, and many editors find that to be WP:OVERLINKING. The addition of the month, to journals, is irrelevant information. Both additions use up valuable screen space. Perhaps some editors prefer this, but many, perhaps even most, do not. The Citation bot should not be imposing controversial styles like this. Eubulides (talk) 03:26, 12 September 2009 (UTC)
That remark has nothing to do with the comment you were replying to. LSD had posed the idea that logged-in editors should have a different default view from IP users & you and I traded a few comments regarding this specific point. Do you agree that we should NOT do this? It really has nothing to do with what the bot should be doing (which I admitted to being ambivalent about, above). --13:55, 12 September 2009 (UTC)
Ah, sorry, I completely misunderstood your comment, and I apologize. I agree that we should minimize the number of differences between what logged-in editors see, and what IP users see. In the places where that sort of thing is done (date-formatting, image sizing) it introduces a lot of complexities, and I doubt whether such a heavy hammer is needed for this rather small nail. Eubulides (talk) 14:09, 12 September 2009 (UTC)
Evidently my phrasing lacked sufficient precision, my apologies. My intent was that the stylistic decisions of which links to have {{cite journal}} display should be explicit choices, not implicit. I suggested my own preferences and reasons for which explicit choices to have the Citation bot emit by default. Of course the article editors should be able to choose alternative settings: that is what "default" means. But my intent was that explicit hiding of journal (issn, oclc, etc) links is a way to reduce the sea of blue without losing the underlying metadata linkage. The bot has the additional potential usage of ensuring that journal links are by default set to hidelinks=yes (or whatever syntax is chosen) for subsequent instances of a linked journal within a single article after the initial hidelinks=no. LeadSongDog come howl 08:24, 14 September 2009 (UTC)
  • Just to chime in with a thought, letting the 'editors of an article' decide whether to use ISSNs in that article seems difficult to reconcile with the prohibition of 'owning' articles. If Fred comes along to an article that Jonny created and Sara has spent ages maintaining, should one editor's opinion be 'rated' more highly than another? This isn't what happens in my experience. If a group of article editors decide that they love or despise ISSNs, is there anything to stop them from going round any article they please, monopolising the consensus and having their way? Martin (Smith609 – Talk) 14:26, 17 September 2009 (UTC)
I'm afraid the question of whether to allow editors to make stylistic choices in citation (as in other matters) was taken long ago and is unlikely to change anytime soon. If the templates attempted to limit stylistic choices, they would simply be used less often, to the overall detriment of the project. That's why we have so many flavours of {{cite}}, {{harv}}, etc. Just consider all the nausea we've been through on something as basic as date formats or WP:ENGVAR. OTOH, by picking defaults the onus of effort moves to the editor overriding the default. Inertia then works in favour of a consistent format, yet without stopping editors from making topic-appropriate choices to overrule the defaults. LeadSongDog come howl 17:35, 17 September 2009 (UTC)
In my experience by far the most common case is that at most one editor of an article cares about citation format, and in that case the ordinary consensus rule applies for that article. In a few better-traveled articles there may be brief discussions about format but I've never seen it turn into a major argument much less an edit war. Almost nobody cares about citation format that much, and when there is a disagreement the general rule is to stick with the style the article already has. Eubulides (talk) 18:50, 17 September 2009 (UTC)

Use of pp

I noticed the template uses pp when outputting page numbers even if there is a single page. My understanding is pp refers to multiple pages. I was wondering if it is possible for this template to check if there is a single page or a page range and use p or pp respectively. --Kmsiever (talk) 15:09, 9 September 2009 (UTC)

{{cite journal}} doesn't generally use p or pp, it just puts the page numbers without the decoration. But for several of the citation templates, if you use page= instead of pages= you will get p. instead of pp. —David Eppstein (talk) 15:38, 9 September 2009 (UTC)
Perfect. Thanks. --Kmsiever (talk) 22:00, 9 September 2009 (UTC)
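
The check Kmsiever asks about can be sketched as follows. This is only an illustration in Python, not the template's actual wikitext logic; the function names `page_prefix` and `format_pages` are hypothetical. It treats any value containing a hyphen, en dash, comma, or semicolon as more than one page, so a single hyphenated page label such as "A-12" would be misclassified.

```python
import re


def page_prefix(pages):
    """Return 'p.' for a single page and 'pp.' for a range or list of pages."""
    text = pages.strip()
    # A hyphen, en dash, comma, or semicolon is taken to mean several pages.
    if re.search(r"[-\u2013,;]", text):
        return "pp."
    return "p."


def format_pages(pages):
    """Prepend the matching abbreviation to the page value."""
    return page_prefix(pages) + " " + pages.strip()


if __name__ == "__main__":
    print(format_pages("45"))
    print(format_pages("45\u201347"))
```

In practice the same effect is obtained by hand, as described above: use page= for one page and pages= for several in the templates that support both.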

SmackBot and others appear to be removing the pp. from the cite journal references. [1]

The cite journal documentation says to manually add the p. or pp.

  • page or pages: 45–47: first page, and optional last page (separated by an en dash –). Manually prepend with p. or pp. if necessary. If you need to refer to a specific page within a cited source, use Template:Rp.

If an article has references from books, newspapers and journals, why shouldn't the references be consistently formatted? -- SWTPC6800 (talk) 01:43, 23 September 2009 (UTC)

Because the 'pages' parameter in a book citation states the total number of printed pages in a book. In a journal, it refers to the range of pages that the article occupies within the book. Most citation styles therefore format the two things differently. Martin (Smith609 – Talk) 06:39, 6 October 2009 (UTC)
No, as the documentation for {{cite book}} says, its |pages= parameter refers to the page range of the cited material (often a chapter); it does not refer to the number of pages in the book. Eubulides (talk) 06:58, 6 October 2009 (UTC)

ISSN proposition

Previous discussions regarding ISSNs have petered out before any obvious consensus was reached. Would there be any objections to including the ISSN in metadata, but hiding it by default? That way, most users will not be troubled by clutter, but users who do find it helpful will be able to turn it on if they wish to see it.

Martin (Smith609 – Talk) 06:46, 6 October 2009 (UTC)

Yes, I object. We shouldn't hide any data by default. --Karnesky (talk) 15:32, 6 October 2009 (UTC)
So you are saying that it is better not to have the data at all, than to have it visible only to those who wish to use it? Could you explain your rationale? Martin (Smith609 – Talk) 22:20, 6 October 2009 (UTC)
No. I am saying the status quo, of displaying ISSNs that are given, is better than your proposed change to conceal them. The only real argument provided against showing them was one of page space/aesthetics, but this is just not compelling to me compared to the benefits of having more data. Having better metadata and providing article editors with additional information weren't the only explanations given for why we want ISSNs at all. It has been stated by myself and others that they can be very useful for allowing readers to request articles & that they can be useful to have on a printout. I'd have no objection for providing a "hide issn" parameter. If a citation bot were to add issns to articles, I'm ambivalent as to how such a parameter would be set. Perhaps have it set:
  • to the same setting as other references on that page
  • default to hiding it
  • do not change the setting as to whether or not to hide it; only use it when adding ISSNs.
--Karnesky (talk) 22:32, 6 October 2009 (UTC)
It is not simply aesthetics; it is a matter of whether the citation is useful to readers. Displaying the ISSN benefits an exceedingly small number of readers (by saving them a few minutes in a library) while at the same time hurting the user interface of all users by cluttering valuable screen space with almost-always-useless links. However, if there's consensus to put in an ISSN for a particularly-obscure citation, I don't see why the ISSN shouldn't be displayed and printed out. Eubulides (talk) 23:47, 6 October 2009 (UTC)
Karnesky, do you personally use ISSNs? Is there anybody that can attest to finding them useful themselves, or are we possibly catering to an imagined need? Without the input of somebody who actually uses them in Wikipedia, it is difficult to assess how useful they actually are - we can only speculate. Martin (Smith609 – Talk) 04:26, 7 October 2009 (UTC)
Yes, I do. And there were other people in the archives who claimed that they did. Did you read all of the past discussion? --Karnesky (talk) 05:53, 7 October 2009 (UTC)
  • I only recalled one person mentioning that they used them, and don't remember them putting much energy into defending their presence.

Karnesky's suggestion

So, your previous suggestion implies that ISSNs will be useful in some articles, and not in others. What articles should ISSNs be visible on by default, and which should they be hidden on by default? Martin (Smith609 – Talk) 06:50, 7 October 2009 (UTC)

If we are going to make this article specific, we should use the same guideline as we do for any citation style questions (whether or not to use templates, which templates to use, etc.): consensus or the initial editor's style wins out. I don't see why ISSNs are particularly more important than any other aspect of citation styling that they would need their own guideline. --Karnesky (talk) 13:48, 7 October 2009 (UTC)

Leadsongdog's suggestion

I would want the ISSN (or equivalently the OCLC) visible by default until the full journal name is verified or (better) a specific article ID is checked, whether DOI, PMID or whatever. After that, it's less important, though it still has a place in the COiNS metadata. LeadSongDog 07:20, 7 October 2009 (UTC)

How would this work in practise? Are you suggesting that the initial editor manually adds the ISSN, and that it is removed from display at a later date? Martin (Smith609 – Talk) 07:37, 7 October 2009 (UTC)
Right. So that if an ISSN is provided without an article ID the ISSN or OCLC should display. An input of terse=yes (or some such sugar) could override that default display. Once an editor or Citation bot finds a doi, pmid, arXiv, or other unique ID, and adds it to the citation, then the defined ID could set terse=yes too. LeadSongDog come howl 13:45, 7 October 2009 (UTC)
Automatically marking a record to be terse based on other identifiers does not address ISSN's utility in requesting articles through interlibrary loan if an article-specific identifier does not lead to an open access copy of the article. This is particularly problematic for journals with overlapping names. --Karnesky (talk) 13:51, 7 October 2009 (UTC)
I'm not sure I follow. Can you provide an example problematic article-specific ID? Certainly a pmid or doi will unambiguously lead to an issn with one or two clicks. LeadSongDog come howl 20:50, 7 October 2009 (UTC)
I don't see why you changed your mind since last April, when you stated:

Given that electronic access via the DOI tends to be prohibitively expensive for many editors and that it introduces recentist bias, I'd say there are actually some grounds for preferring the ISSN to the DOI. Definitely not the reverse. Given WP:NOTPAPER, what's the dilemma?

What has led you to change your opinion? --Karnesky (talk) 21:00, 7 October 2009 (UTC)
DOI access to proprietary article content remains expensive to those without institutional support, but in most (all?) cases it still lets you find out what the correct journal name is. Citation bot has proven itself quite effective at populating the journal name, as has {{cite doi}} (though I still have other reservations about the wisdom of using cite doi and its brethren in articlespace). I'd forgotten about the recentism issue, but that still stands. Balanced against it is the whole issue of article size and associated load time. Eubulides has been pointing to WP:FA examples such as Virus and Autism where the load can take a long time even at modest DSL speeds. For dialup they would be painful indeed. LeadSongDog come howl 21:38, 7 October 2009 (UTC)

(unindent). Full CoinS (which essentially replicate the reference) for the few articles with over a hundred references will add only ~3 seconds to page load times on 56k dialup. How is an 8-digit ISSN going to hurt page load in some way that we should actually worry about it? --Karnesky (talk) 22:24, 7 October 2009 (UTC)

You're right that it's a pretty trivial portion, but it all adds up. I'm not overly concerned one way or the other once we've done what we can to provide a link to the article and correct any errors in the basic information. LeadSongDog come howl 23:47, 7 October 2009 (UTC)
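
The display rule LeadSongDog outlines above (show the ISSN or OCLC while no article-level ID is present, and let an explicit terse setting override that) could be sketched as below. This is a hypothetical Python illustration, not template code: the thread did not settle on the rule, the `terse` parameter name is only the "some such sugar" floated above, and the dictionary keys are assumptions.

```python
# Article-level identifiers named in the thread; key names are assumed.
ARTICLE_IDS = ("doi", "pmid", "arxiv")


def show_journal_id(citation):
    """Decide whether the ISSN/OCLC should be displayed for one citation.

    `citation` is a dict of template parameters (hypothetical names).
    """
    if not (citation.get("issn") or citation.get("oclc")):
        return False  # nothing to display
    terse = citation.get("terse")
    if terse is not None:
        # An explicit editor choice overrides the default.
        return terse != "yes"
    # Default: display only while no article-level ID has been added.
    return not any(citation.get(key) for key in ARTICLE_IDS)


if __name__ == "__main__":
    print(show_journal_id({"issn": "0028-4793"}))
    print(show_journal_id({"issn": "0028-4793", "doi": "10.1000/example"}))
```

Under this sketch an editor who shares Karnesky's concern about interlibrary loan requests could keep the ISSN visible on a citation that has a DOI by setting terse=no.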

Eubulides' suggestion

I'm not sure what "by default" means, but my suggestion is to display the ISSN if it's specified by an editor in the template, and to not have the bot add an ISSN. That is, ISSNs can be added by hand if need be; for most citations, they're not needed. Eubulides (talk) 07:25, 7 October 2009 (UTC)

The question that this proposal raises is what guidelines an editor should use when deciding whether or not to add an ISSN by hand. What would you propose? Martin (Smith609 – Talk) 07:37, 7 October 2009 (UTC)
Don't bother to include an ISSN if you have a working DOI or PMID or a stable URL. Otherwise, include an ISSN for journals whose names would otherwise be confusing (e.g., Science, ISSN 0193-4511; no, it's not that journal, it's some other journal named Science) or perhaps if the journal is really obscure (e.g., Journal of Clinical and Experimental Psychopathology & Quarterly Review of Psychiatry and Neurology, ISSN 0447-9122). Eubulides (talk) 08:11, 7 October 2009 (UTC)
That sounds workable. Just one thing; it seems something of a personal judgement call as to what journals are obscure and which are not. Perhaps Karnesky could weigh in with his experience of which journals ISSNs are useful for? Martin (Smith609 – Talk) 08:27, 7 October 2009 (UTC)

COiNS bloat

I decided to measure why Autism's HTML was so large, and discovered somewhat to my dismay that COiNS is a large part of it. By my measurements, the 100 kB of COiNS data in the HTML version of {{cite journal}} in Autism has grown the article's HTML from about 434 kB to about 533 kB, a 23% increase. This works out to roughly 600 bytes per citation. Why is the COiNS metadata so large? Is there some way that we can shrink it dramatically? Or can we generate it conditionally, only for the few users who need it? The default, surely, should be to not generate the COiNS data for {{cite journal}}, since most users don't need it. Eubulides (talk) 23:47, 6 October 2009 (UTC)
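
To make the per-citation overhead concrete, here is a rough Python sketch of how a COinS span is assembled: the citation fields are repeated as a URL-encoded OpenURL string inside the title attribute of an empty span. The `rft.*` keys are standard OpenURL journal keys, but the exact set of fields that {{cite journal}} emits is an assumption here, and the sample citation is invented for illustration.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(fields):
    """Build a COinS <span> for a journal article from a dict of rft.* fields."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    pairs += [("rft." + key, value) for key, value in fields.items()]
    # The query string is URL-encoded, then HTML-escaped for the attribute.
    title = escape(urlencode(pairs), quote=True)
    return '<span class="Z3988" title="' + title + '"></span>'


if __name__ == "__main__":
    # Hypothetical citation, not taken from any article.
    citation = {
        "atitle": "An example article title",
        "jtitle": "Journal of Examples",
        "aulast": "Doe",
        "aufirst": "Jane",
        "date": "2009",
        "volume": "12",
        "issue": "3",
        "pages": "45-47",
    }
    without_issn = coins_span(citation)
    with_issn = coins_span(dict(citation, issn="0028-4793"))
    print(len(without_issn.encode("utf-8")), "bytes without ISSN")
    print(len(with_issn.encode("utf-8")) - len(without_issn.encode("utf-8")),
          "extra bytes for the ISSN")
```

Because every field is stored a second time with its key and percent-encoding, long titles and author lists account for most of the span; the ISSN itself adds only a couple of dozen bytes.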

Wow! Almost half the length of that article on screen is its References section. It's something of an exceptional case, isn't it? That suggests other possibilities: a subpage for large bibliographies, so they only load when readers click for them. There's a discussion currently going on that relates, at Talk:Phage monographs#Bibliography articles. LeadSongDog come howl 00:59, 7 October 2009 (UTC)
  • It's not an exceptional case. For example, the HTML text of Virus, the currently featured article, is 26% longer because of the COiNS data.
  • The discussion in Talk:Phage monographs#Bibliography articles is about bibliographies, which is a different issue. The issue here is the References section, which is automatically generated from "<ref> ... </ref>" entries, and which cannot be delegated to a different page like a bibliography might.
  • I suspect that this problem has not been mentioned before because nobody bothered to mention it. If you're on a broadband connection I suppose it's not that big a deal. But about half of the Internet population (mostly the poorer half) is on slower speed lines. This half is seriously impeded by the COiNS bloat.
Eubulides (talk) 04:10, 7 October 2009 (UTC)
I wasn't clear, sorry. The problem is that these articles have far too many references cited to be readily read on a slow connection, even with no COiNS data. Really, 195 reference entries is just plain nuts. We can't avoid having them all, but we can still look for ways to allow readers to read the article without waiting for all the citation details to load. One way could be to have a bot move each citation details to a linked bibliography page or subpage, replacing it with a link to an anchor for each citation on that page. There's no fundamental reason all the details need to be on the same html page as the article, if we can find a workable implementation. LeadSongDog come howl 06:23, 7 October 2009 (UTC)
195 references really isn't too many. We cover a lot of complex issues here on WP, some of which are based on substantial bodies of scholarly research. Accurately representing this research can easily take 195 references. But perhaps there is a technical solution. KellenT 09:22, 7 October 2009 (UTC)
I understand how it gets to be this way, as an artifact of how we generate content by massively collaborative editing, but how many other encyclopediae have articles that are anywhere near half-full of citations? As editors we need them as a quality control measure, but how much value does the average reader get from them? Do we have research that says what fraction of readers click through to the citation and then to the references themselves? My sense is that it's a very small percentage. LeadSongDog come howl 13:58, 7 October 2009 (UTC)
Large numbers of citations are necessary in Wikipedia to give the article authority. There is no reason why the reader should believe a single word written in Wikipedia unless there is a citation to support it. The authority of Wikipedia derives from the authority of its sources. No sources, no authority. ---- CharlesGillingham (talk) 02:29, 12 October 2009 (UTC)
I get about 450 bytes per citation and roughly a 27% increase in page size due to COinS on another heavily cited article. KellenT 01:49, 7 October 2009 (UTC)
We serve compressed pages, yet you report only the uncompressed size. COinS compresses very, very well. I doubt removing it would be of significant benefit. The images on that article still account for the majority of the page size. Zotero and other COinS ingestors are becoming reasonably popular. I think the COinS should stay. If you're serious about removing them, we should call for comments on the talk pages for the other cite*/citation templates that use them & the microdata project here. --Karnesky (talk) 06:09, 7 October 2009 (UTC)
For Autism the COiNS data grows the compressed size of the HTML text from 63,987 to 79,382 bytes, a 24% increase. So for this example the COiNS data actually compresses a tiny bit worse than the rest of the HTML. Low-bandwidth users regularly suppress images, and the COiNS data measurably impedes their use of Wikipedia. Has the COiNS project ever measured the overhead imposed by COiNS? It's not just network overhead: it's also CPU overhead (on client and server), which helps to drain batteries on laptops and so forth. Does anybody have figures for the number of people who regularly use Zotero etc. on Wikipedia? Would these users be OK if COiNS data were available only on request (e.g., as a user preference) and was suppressed by default? It may be appropriate to ask for further comments but first I'd like to understand the problem. Eubulides (talk) 08:32, 7 October 2009 (UTC)
That is somewhat surprising to me: because the COinS duplicates in-page content, I would have thought that gzip would do a much better job. Still: that is 16 seconds to load over 56K, vs. 13 seconds without COinS. I think that extra three seconds is really not that bad & we should remember that a penalty of even this magnitude will be on a minority of pages. A majority of our articles do not have as many citations (although we could all point to a few articles that do have hundreds of citations).
Who do you mean by "the COinS project?" The ad hoc community who built it (with members of the OCLC and academic librarians)? Those who implemented them in citation templates? In any case, I doubt it: the utility of having a working citation microdata-like format (though, admittedly, flawed) outweighs the minuscule network and CPU hits it is responsible for in the vast majority of uses.
Usage statistics for Zotero are probably not possible to get: there is no information flow to a central site except for downloads of the extension, unless a user chooses to enable syncing. There have been almost a million downloads of the extension from the Mozilla site, but this does not count downloads from Zotero's site (which are very strongly encouraged over using the Mozilla site), and I can't immediately find details from their site. Over 600 libraries have gone to the trouble of making a custom, public LibX extension. Statistics for individual versions are available to the librarians who made the extension, but I don't see statistics for the project as a whole.
Do you have statistics for the percentage of low bandwidth users to wikipedia? Our alexa stats demographics show a very high percentage of our users are in countries with high broadband deployment & they also show that our users are highly educated (and thus, perhaps, more likely to have broadband).
I'm with LSD: we should be more careful in how we are referencing certain articles, rather than worry about the metadata. --Karnesky (talk) 17:34, 7 October 2009 (UTC)
I think Eubulides means Wikipedia:WikiProject Microformats/COinS. Hmmm. Suppose we should raise this there, non? LeadSongDog come howl 18:11, 8 October 2009 (UTC)
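
The size and load-time figures quoted in this section can be checked with a few lines of Python. The byte counts are the ones reported above for Autism; the effective dialup throughput of 40 kbit/s is an assumption (the thread does not state what rate was used), chosen because it reproduces the 16-second and 13-second figures.

```python
def percent_increase(before, after):
    """Relative growth from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


def load_seconds(size_bytes, bits_per_second):
    """Transfer time for a payload, ignoring latency and protocol overhead."""
    return size_bytes * 8 / bits_per_second


# Figures reported in the thread for the Autism article.
UNCOMPRESSED_KB = (434, 533)          # without / with COinS
COMPRESSED_BYTES = (63987, 79382)     # gzip-compressed, without / with COinS

# Assumption: effective throughput of a 56k modem, not stated in the thread.
ASSUMED_DIALUP_BPS = 40_000

if __name__ == "__main__":
    print(round(percent_increase(*UNCOMPRESSED_KB), 1), "% uncompressed growth")
    print(round(percent_increase(*COMPRESSED_BYTES), 1), "% compressed growth")
    for size in COMPRESSED_BYTES:
        print(round(load_seconds(size, ASSUMED_DIALUP_BPS), 1), "seconds")
```

At the nominal 56 kbit/s line rate the same payloads would take roughly 9 and 11 seconds, so the absolute times depend on the throughput assumed, while the difference stays at about two to three seconds either way.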