Wikipedia talk:No original research

The project page associated with this talk page is an official policy on Wikipedia. Policies have wide acceptance among editors and are considered a standard for all users to follow. Please review policy editing recommendations before making any substantive change to this page. Always remember to keep cool when editing. Changes to this page do not immediately change policy anyway, so don't panic.
Frequently Asked Questions (FAQ)
I disagree with the definition of secondary source.
Wikipedia mostly follows the definition in use by historians, which requires more than simply repeating information from some other source or rearranging information from the author's notes. The earliest definition of a secondary source in this policy, from February 2004, was "one that analyzes, assimilates, evaluates, interprets, and/or synthesizes primary sources".
WikiProject Spoken Wikipedia
This page is within the scope of WikiProject Spoken Wikipedia, a collaborative effort to improve the coverage of articles that are spoken on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
 

Primary and secondary sources

If there is a factual discrepancy between a primary source and the secondary sources citing it, should it be pointed out in the article? --Artman40 (talk) 11:54, 27 November 2014 (UTC)

If a secondary source appears to cite a primary source incorrectly, then ideally try to find another secondary source which cites it correctly. If not, then I would say that it's right to report the discrepancy as neutrally as possible in accordance with WP:NPOV. Peter coxhead (talk) 15:36, 27 November 2014 (UTC)

Plotting graph based on published data

If a Wikipedian creates a graph image based on data (numbers in a table or some other form) that has been published by a reliable source, would that be considered an "original image" rather than "original research"? If plotting a graph from published data is allowed in articles, I think we should add a short paragraph to the Original images section to serve as an example. Z22 (talk) 13:10, 9 December 2014 (UTC)

Yes and no... If the graph accurately depicts what is said in the source, then no... we don't consider creating the graph to be Original Research. If it goes beyond what is said in the source, then yes, it would be. Ask this... if you took the information presented in the graph and wrote it out in text form (citing the same source), would it be problematic?
That said... ideally, graphs and other images should not be used to present information in an article... they should be used to illustrate information that is summarized in the body of the article text. As long as that summary is not OR, then the graph illustrating it would not be OR either. Blueboar (talk) 13:46, 9 December 2014 (UTC)
Thanks for the explanation. Yes, it just depicts what is in the source. The same source is also used in the text, but sometimes it is just difficult to give an overall picture in the text, so we may need to create a graph to accompany the text. My question really is: should we include some form of what you just said above as an example in the "Original images" section, so that other Wikipedians are clearer on the policy? Z22 (talk) 14:04, 9 December 2014 (UTC)

Secondary and tertiary sources

Since being berated for citing Encyclopaedia Britannica (EB) in an article, I have been involved in a discussion about Wikipedia:Identifying reliable sources. This has led me to this article, and got me questioning the distinction made there and here between secondary and tertiary sources. It seems to me that the way these sources are written is much messier than the guidelines and policy suggest.

  • Surely, many accepted secondary sources are, in good measure, making use of other secondary sources as well as primary ones. In the guideline's example, the book on WWII is very likely to cite other books on the war, and use those to support the author's view of the war. So "secondary" sources do not just rely on primary ones. I think that a secondary source that relies only on primary sources is a very rare beast indeed.
  • Similarly, many "tertiary" sources - as in many EB articles - are not simply "compendia that summarize primary and secondary sources." It seems to me that the authors of many EB articles thought that they were doing much more than summarizing other people's work. They probably thought that they were writing what these guidelines would define as a secondary source article. Of course they would use secondary (and maybe primary) sources, but so do other secondary sources. Their language may be more accessible to the intelligent reader than it would be if they were writing in a scholarly article, but that shouldn't be a reason to treat their work in the tertiary source differently. Furthermore, I have read many scholarly articles, in journals and books, whose whole purpose is to summarize the field. If such an article were published in a book or journal, these guidelines would seem to define it as secondary; yet, if the same words were published in EB, the guidelines would see them as tertiary. (And I would be told off for using them!)
  • Just to complete the set. I can imagine that many primary sources quote from secondary and tertiary sources.

Myrvin (talk) 14:50, 29 December 2014 (UTC)

A secondary source can be 100% based on only primary sources, as long as the specific source introduces transformative thought to the other sources, such as analysis, criticism, or evaluation. For example, a movie review, a secondary source, is typically only going to be based on the primary work, the movie itself. There is no requirement that a secondary source include other secondary sources, just transformative thought. --MASEM (t) 15:18, 29 December 2014 (UTC)
@Myrvin: The distinction made in the article is one derived from history and the social sciences. In these subjects, a primary source is essentially the evidence, the raw data. Unfortunately, this useful distinction is then misapplied in other fields; for example treating all scientific journal articles as if they were primary sources in science in the same sense as, say, a letter written by Abraham Lincoln is a primary source in history or biography. Although this is manifestly not the case, it seems to have become the orthodox interpretation here. Peter coxhead (talk) 19:43, 29 December 2014 (UTC)
That may be part of the problem, Peter coxhead, but surely secondary history sources use other secondary sources too. A history book using only primary sources (such as letters) must still be rare, although, I guess, less rare than in other fields. In science (including social ones), most (all?) papers cite other papers and books. Myrvin (talk) 20:18, 29 December 2014 (UTC)
One thing to keep in mind is that a source can be primary for one topic and secondary for another. A "hard" science peer reviewed paper may be secondary on past research or related topics, and primary on the data presented directly in the paper. --MASEM (t) 22:12, 29 December 2014 (UTC)
@Masem: I think you need to clarify the very last part of your comment above. Yes, the data presented in the paper is clearly in the same category as original documents are in historical research. Yes, the review section of a scientific paper is necessarily secondary. The issue is the status of the analysis and conclusions drawn by the author(s) of the paper based on the data they present. These are not "primary" in the same way as the original documents in historical research. They are equivalent to the analysis of historical documents presented in a history paper, monograph or textbook, which would in that field be called secondary. Peter coxhead (talk) 22:26, 29 December 2014 (UTC)
It really depends how it is presented. I've seen some research papers with very little conclusion on their data after a rigorous review of past sources, or just saying "Well, here's our data, they corroborate the prior work." and little else. But a good paper will present the data and a conclusion from that, and that's secondary for that purpose. The point, though, is just that a source is not 100% always primary, secondary, or tertiary - one has to evaluate that against the topic in question. --MASEM (t) 22:29, 29 December 2014 (UTC)
@Myrvin: I completely agree with you that the distinction between secondary and tertiary sources is fuzzy, regardless of the field to which the distinctions are being applied. My point was merely that there is a clear distinction between primary and other sources, but that this distinction does not correspond to the one regularly used in the English Wikipedia. Why were you "told off" for using EB as a source? Tertiary sources are fine according to WP:PSTS. Peter coxhead (talk) 22:26, 29 December 2014 (UTC)
Long story Peter coxhead. See Wikipedia talk:Identifying reliable sources#Encyclopaedia Britannica, and Talk:David Hume/GA4. Myrvin (talk) 07:52, 30 December 2014 (UTC)
Actually, good historians tend not to cite other published history books... and rely heavily on primary sources. Of course those historians are actually trying to conduct original research on their topic, and the best way to do that is to go right to the original documents and see if they can gain new insights on them. We, on the other hand, are writing a tertiary source (an encyclopedia)... and as such, our job isn't to present new insights... but to report on the insights of others. We try to avoid conducting original research. So, we have to be very careful about our use of primary sources. We can (with care) use them for blunt (attributed) statements of what they contain, but we cannot analyze them or draw conclusions from them. Blueboar (talk) 23:00, 29 December 2014 (UTC)
I think you mean "good historians who are writing for a scholarly audience". A good historian who is writing a basic textbook on Ruritanian history is probably not going to look for primary sources to support every fact, or even most of them. WhatamIdoing (talk) 23:27, 29 December 2014 (UTC)
@Blueboar: the original issue was the use of the terms "primary", "secondary" and "tertiary". As you rightly say the goal is to avoid original research. I remain unconvinced that applying distinctions that at best work in limited fields of study makes much of a contribution to this goal, and it sometimes erects unhelpful barriers. Peter coxhead (talk) 01:41, 30 December 2014 (UTC)
WAID... yes, history textbooks do rely on secondary sources, and avoid primary sources... a textbook is a tertiary source, and so it is appropriate that they do so. That is also true of encyclopedias...
Really, the key to avoiding OR is to remember that Wikipedia is a tertiary source... a compendium that summarizes what is said elsewhere. If we want to say something in an article... we need to find a source that directly says it. If you apply that concept (that anything we say - be it a fact, an analysis of fact, a conclusion based on analysis, etc - needs to have been said elsewhere before we can say it here)... you will successfully avoid OR. Do this, and the fine distinctions between primary and secondary really become irrelevant. Blueboar (talk) 13:01, 30 December 2014 (UTC)
Perhaps Blueboar could offer some examples of pure secondary sources in, say, history and science. If I understand correctly, they should not cite any other secondary sources, but only use primary ones. Myrvin (talk) 13:32, 30 December 2014 (UTC)
This book [1] says this book [2] is a secondary source. Yet the Denying book, although it uses primary source letters etc., in order to attack the deniers must (and does, see p. 183) cite a lot of other secondary sources. Does this make the first book wrong and the second book a tertiary source? Myrvin (talk)
This book [3] says that encyclopedias (e.g. EB) are secondary sources. It's never heard of tertiary ones. Myrvin (talk) 13:51, 30 December 2014 (UTC)
This one [4] agrees that encyclopedias are tertiary sources, but its examples of secondary sources seem to include textbooks. It also refers to "scholarly secondary sources". Myrvin (talk) 14:05, 30 December 2014 (UTC)
This one, in science,[5] says that textbooks are tertiary sources, but that secondary sources are review articles and primary sources are journal articles. Myrvin (talk) 14:55, 30 December 2014 (UTC)
This book on public health informatics [6] says that secondary sources are indexes and bibliographies of the primary literature, whereas review articles are tertiary sources - along with just about everything else. Myrvin (talk) 15:18, 30 December 2014 (UTC)
This work [7] seems to question Blueboar's ideas of what historians do. Myrvin (talk) 15:59, 30 December 2014 (UTC)
  • The distinctions are less bright-line than our WP:NOR seems to indicate. For example, sources that compile data and report it, often without commentary, are which? I would think it pedantic to call the census bureau's report that New York City has x population primary (but aren't the little forms being filled out by the populace primary and the census bureau secondary? but I digress), while repetition of the same facts in an atlas is secondary, and when regurgitated as solemn fact in an encyclopedia it's suddenly tertiary. Most of the basic substance of an encyclopedia may be from any sources - need we await secondary sources for the AirAsia crash before writing the article - and if news reports are primary, as suggested, we'd have no references at all. When anything is said beyond the plain facts: that's when we should expect secondary sources that state whatever interpretation of the facts is being offered. Carlossuarez46 (talk) 05:48, 6 January 2015 (UTC)

difference between this and WP:COATRACK

I sometimes wonder why anyone cites WP:COATRACK anymore to call for deletion when this policy has the potential to support more deletion than the COATRACK essay does and has the additional benefit of being policy. Everything that is complained about in COATRACK could be objected to by noting that the sources cited by the "coatrack" are not, to quote from OR, "directly related to the topic of the article." Which I think is unfortunate with respect to the breadth of this policy, because I am of the view that this policy should not be used to delete material when A) everyone agrees it is not a coatrack and B) everything within the editor's edits is fully supported by the sources cited (i.e. the individual claims or "parts" in the edit are supported AND the "whole" of what the editor's edits say). Let's take the "The United Nations' stated objective is to maintain international peace and security, but since its creation there have been 160 wars throughout the world" example. If an editor added that sentence, it's obviously OR. But what if the first clause was already in the article, and an editor just added the second clause? I'd think it would be better to object to that sort of editing - where the OR conclusion is external to the editor's contribution - as a coatrack violation than as OR. That's not to say that it isn't OR, but rather that WP:COATRACK is more useful in terms of resolving a dispute over whether to include in these situations where there is no OR in the editor's additions, no OR in the article apart from the editor's edits, and you only get OR by combination. I'd grant that it's not a more useful approach in this United Nations case but I believe it is in cases where there isn't a coatrack issue.

Let me give you a hypothetical example of the issue: in an RS primarily about Indonesia it's noted that Obama lived in Indonesia. I add this to the Obama article. Somebody decides that this detracts from how red, white, and blue this politician is and deletes my addition. WP:OR is a deletionist's standby, and they justify the deletion by quoting "you must be able to cite reliable, published sources that are directly related to the topic of the article" and the topic of the article here is Obama, not Indonesia, never mind that Indonesia and Obama are indisputably connected. I protest, asking the party alleging OR to point out what is being said that is above and beyond what is being cited. "That Obama is less than 100% American," is the response, with this "un-American" contention being "conclusion C" in "If one reliable source says A, and another reliable source says B, do not join A and B together to imply a conclusion C that is not mentioned by either of the sources." I then retort that the conclusion that Obama is "less than 100% American" is not necessarily reached by my edit; the RS material simply implies what it implies without steering readers in any particular direction not warranted by the material. Readers may be invited to consider the possibility of relatively less "American" - and that may be my particular take on the information - but readers are not being led to that conclusion by neutrally reporting the facts. To which I get the response that it doesn't matter if that's the conclusion that is in fact "reach"ed, it only matters if it is "implied", and in any case the case is closed because OR policy indicates two hurdles both be cleared, drawing the fact about Obama from an RS being merely the first and the RS spending more time talking about Obama than Indonesia (or rendering explicit possible implications like being less than all-American) being the oh-so-critical second.

You see the problem here? Almost anything can be "implied" if one is looking to read in the implication one wants to read in. I think we could agree that pointing out that Obama lived in Indonesia is not a coatrack, never mind what else the cited RS talks about. But OR policy as it is written right now is so broad in terms of how much OR it identifies that the material gets deleted. "To demonstrate that you are not adding OR, you must be able to cite reliable, published sources that... directly support the material being presented" would correctly place the burden of proof. But what we have instead is an additional hurdle for inclusion, namely, that the sources cited be "directly related to the topic of the article" which I think fails to resolve the issue, instead generating an argument over whether it is good enough for X% of the source to be "directly related" and over how to define the "topic" or "directly related" (e.g. whether A being directly related to B which is in turn directly related to C shows a sufficient relationship between A and C or not). In my view, the burden should be placed on the party alleging OR so that this policy is more consistent with the instruction in WP:UNDUE to "Remove material only where you have a good reason to believe it misinforms or misleads readers in ways that cannot be addressed by rewriting the passage..." Does the material misinform or mislead readers? No? Then presumptively not OR in my view. What OR should provide, and only provide, is an explanation for why it misinforms or misleads (the reason justifying exclusion).--Brian Dell (talk) 00:14, 12 January 2015 (UTC)

Did you have a specific edit in mind, or were you just venting? Blueboar (talk) 11:41, 12 January 2015 (UTC)
I reckon I got pretty specific in that hypothetical I detailed above. I run into variants of it most of the time I get accused of OR, and I think it's a problem, not a single, specific, one-off problem, but a general and recurring one. Here's the executive summary: almost any exercise of editorial judgment, something that's basic to our job as content builders, can be potentially (mis)construed as OR. Pretty much everything you do to an article, for example, is going to "imply" some change to some "conclusion" somewhere to at least some degree, even if just at the overall article impression level. The United Nations example shows how OR can misinform or mislead. What's needed is a discussion of material that neither misinforms nor misleads, since that's where reasonable people can and do disagree. WP:COATRACK is an example of something that's more concerned with making the point clear than with overbroad abstractions.--Brian Dell (talk) 04:50, 15 January 2015 (UTC)