
Wikipedia talk:No original research: Difference between revisions

From Wikipedia, the free encyclopedia


There is a discussion at [[Talk:Donald Trump#Airliner shot down]] that may benefit from editors familiar with this policy of No original research. [[User:Bob K31416|Bob K31416]] ([[User talk:Bob K31416|talk]]) 10:05, 6 November 2022 (UTC)

== "Secondary sources" extremely questionable! ==

The penetrating and annoying calls for "secondary sources" overlook the fact that these - if not illegally copied from some encyclopedia - are usually written by non-specialist journalists, and all too often with very little understanding and misleading interpretation of the facts. Perhaps you should take a look at the book "Factfulness".
Hans J.J.G.Holm [[Special:Contributions/2A02:8108:9640:1A68:31D1:F57B:3DCE:D718|2A02:8108:9640:1A68:31D1:F57B:3DCE:D718]] ([[User talk:2A02:8108:9640:1A68:31D1:F57B:3DCE:D718|talk]]) 09:18, 22 November 2022 (UTC)

Revision as of 09:18, 22 November 2022

– 15:34, 10 May 2021‎ (UTC)

Simple synthesis is not original research

I have noticed recently some editors claiming that simply compiling information from multiple sources and grouping it together by theme or obvious relationship could be SYNTH or OR. WP:NOTOBVIOUSSYNTH, Wikipedia:What SYNTH is not#SYNTH is not just any synthesis, WP:MNA, Wikipedia:What SYNTH is not#SYNTH is not a rigid rule, Wikipedia:What SYNTH is not#SYNTH is not obvious II, Wikipedia:These are not original research#Compiling facts and information, all say that this is not SYNTH. Is there a way we could clarify the policy to make it clearer that "simple synthesis" is not necessarily SYNTH/OR, and that there are acceptable bounds to table stakes assumptions? For example, one source says that Elon Musk had a secret relationship with Sergey Brin's wife, and another source says that Sergey Brin sought a divorce from his wife. We could reasonably list those 2 facts together. Even though perhaps this implies that the divorce was due to the secret relationship, we aren't stating that conclusion, we are simply grouping related information. Andrevan@ 22:32, 26 July 2022 (UTC)[reply]

When it comes to BLP and accusations, blatantly connecting those two facts without any other source making the connection is SYNTH and inappropriate on a BLP page. On the other hand, if those two "facts" just happened to be documented under a personal life section without any direct implication that one begat the other, that would be okay, but we should be playing very carefully with inclusion of accusations in the first place. We are not celebrity gossip, which this stuff borders on, and it would simply be better to have more factual clarity before including it.
--Masem (t) 22:49, 26 July 2022 (UTC)[reply]
I haven't specifically edited the Elon/Sergey example, but let's assume for the sake of argument that it did have enough RS that it was graduating out of the gossip column. I disagree with the interpretation that BLP would prevent putting that information if attested in RS. Andrevan@ 22:51, 26 July 2022 (UTC)[reply]
We should not be including claims and accusations even against PUBLICFIGURES until we have two or more reputable sources independently confirming or supporting the claims. But even if both example statements have that support, we would have to be super careful to make sure our wording does not directly imply one resulted from the other, unless that inference is also reported by at least one reputable source. We have too many problems with what may seem like low level synth that makes it really easy to create slander in wikivoice from unconnected facts. Masem (t) 22:56, 26 July 2022 (UTC)[reply]
So that's what I'm asking in this thread, given we have enough sources, is it SYNTH to report related facts together? We are not making an inference, and we are not explicitly concluding anything, but since both belong in the "personal life" section one after the other, there is no insinuation, it is simply grouping of related facts and information per all the policy links I cited. Given that we clearly disagree on this, I wonder if others agree with your interpretation, mine, or something else. Andrevan@ 22:59, 26 July 2022 (UTC)[reply]
Yes, if you are implying something a singular source doesnt support, and you can imply by placement, that is still OR. You are the one deciding those facts are related. A source has to do so. We dont just wink wink our way out of SYNTH. nableezy - 23:06, 26 July 2022 (UTC)[reply]
So you believe it would be original research to write in the article two sentences, one after another, about Sergey Brin's personal life, because some people might interpret that it is implying a causal effect when none was explicitly noted. Even though these statements are obviously correlated, though we are not claiming or seeking a causal connection. How do you square that with the policy statement that "collecting related information under a common heading", "Let the readers draw their own conclusions after seeing related facts in juxtaposition", is not original research? Andrevan@ 23:12, 26 July 2022 (UTC)[reply]
Well in this specific case theres a Wall Street Journal report linking the two. But I would still say that Musk supposedly having an affair with Brin's wife doesnt belong in Brin's BLP regardless. We dont need to document salacious details about a couple's marriage unless it becomes unavoidable by dint of its widespread coverage. And that has not happened yet. nableezy - 23:28, 26 July 2022 (UTC)[reply]
That's fair but I was trying to make a hypothetical example. Let's just say we didn't have the WSJ article linking the two explicitly, but it still had achieved widespread coverage. I'm asking about SYNTH not WEIGHT. You've made your view of it clear and that aligns exactly with Masem. I'm not urgently trying to add this incident to the article so let's maybe let some others opine. I'm curious if I am alone in my interpretation or if there is a difference of opinion within the community on what is an acceptable baseline assumption or if there even is such a thing. Andrevan@ 23:33, 26 July 2022 (UTC)[reply]
Id prefer a less salacious and BLP involved example tbh, but to take an example in a world I often edit in. I can easily source that the acquisition of territory by force is a war crime. I can likewise easily source that Israel has claimed territories it has acquired by force (the Golan and East Jerusalem). I cant, in the article Israel, have these lines next to each other without one source connecting the two: The acquisition of territory by force is a war crime. Israel has acquired East Jerusalem and the Golan Heights through force. nableezy - 00:04, 27 July 2022 (UTC)[reply]
That isn't a great example because the first statement reads like a non-sequitur, since it doesn't relate to Israel and is just a general statement. In my example, both statements were related to the subject of the article assuming Sergey Brin's personal life was the section. However, I think your example would be valid if we had the statements (taken from List_of_United_Nations_resolutions_concerning_Israel): As of 2013, the State of Israel had been condemned in 45 resolutions by the United Nations Human Rights Council (UNHRC). The United Nations General Assembly (UNGA) has adopted a number of resolutions stating that Israel's strategic relationship with the United States, a superpower and permanent member of the Security Council with veto power, encourages the former to pursue aggressive and expansionist policies and practices in the Israeli–Palestinian conflict. The United States responded to the frequent criticism from United Nations organs by adopting the Negroponte doctrine of opposing any UNSC resolutions criticizing Israel that did not also denounce Palestinian militant activity. Those two statements are separate, but related. Andrevan@ 00:10, 27 July 2022 (UTC)[reply]
Given that this type of information is on the order of gossip mongering, simply wait until sources have corroborated the details, including any correlation. We are far too eager to push these gossipy items when it is policy from BLP to wait and see. Masem (t) 23:16, 26 July 2022 (UTC)[reply]
I'm specifically picking on the SYNTH issue about stating related facts together that some editors think implies negative or slanderous information. Let's assume that there are sufficient sources covering it. Can we cover them together by grouping only? If not, how do you square that with the policy? Andrevan@ 23:18, 26 July 2022 (UTC)[reply]
It doesnt have to be negative or slanderous to be an OR issue, thats a BLP issue. But let me ask you this. Do you actually think that X had an affair with Y's wife in January. Y filed for divorce in February does not imply that the divorce is due to the affair? Do you think that if no source directly makes that connection that we should be making it explicitly? Implicitly? If not explicitly, why is it acceptable to make it implicitly? nableezy - 00:16, 27 July 2022 (UTC)[reply]
I agree with you that a reasonable person might see two statements like that and assume that one caused the other, but I do not believe that it is OR or SYNTH to write those two statements together in the same section. Because it is the person making the conclusion and we aren't doing any crazy POV twisting, we are simply grouping related statements. We are simply reporting the facts per policy, juxtaposition without a conclusion is not OR/SYNTH. Sure, probably, they are related somehow, but not necessarily causal. Andrevan@ 00:22, 27 July 2022 (UTC)[reply]
Ok, but you didnt answer any of my questions besides possibly are we implying anything (I think you said yes?). nableezy - 00:26, 27 July 2022 (UTC)[reply]
I do not agree we are making the connection implicitly or explicitly, but a reasonable reader might assume that there is a causal link, and might reasonably infer they are correlated or connected. But we are not doing anything other than grouping related facts that relate to the same subject in the same way. So I don't think we are implying that there is a causal link, but it is reasonable to assume that they are both related to Sergey Brin's personal life and can therefore be grouped. Andrevan@ 00:40, 27 July 2022 (UTC)[reply]
I dont get how you can say "a reasonable person might see two statements like that and assume that one caused the other" and then follow that up with "I do not agree we are making the connection implicitly". One follows the other. nableezy - 01:09, 27 July 2022 (UTC)[reply]
Correlation is not causation - the two facts happen contemporaneously, were reported together, we don't know if A caused B, B caused A, or some third event C caused A and B. So we aren't implying causation. We do believe they are correlated - they relate to the same thing and to each other. Andrevan@ 01:16, 27 July 2022 (UTC)[reply]
That isnt the point though. If you feel that a reasonable person might see two statements like that and assume that one caused the other, then you are saying that it is implied. Implication meaning the conclusion that can be drawn from something although it is not explicitly stated. nableezy - 01:24, 27 July 2022 (UTC)[reply]
No, that's inference. An inference is a reasonably drawn conclusion that a reader may make. An implication implies intentionality on our part. Andrevan@ 01:32, 27 July 2022 (UTC)[reply]
define: implication - the conclusion that can be drawn from something although it is not explicitly stated. nableezy - 01:34, 27 July 2022 (UTC)[reply]
Reasonable people can assume that Sergey divorced his wife because of Elon, it's also possible that they got divorced first and then the affair, it's also possible that he cheated on her first and then they got divorced and then Elon got involved. We aren't implying why the facts are related, but a reasonable person might infer that one caused the other. They could also be wrong about that. Stranger things have happened. It's not strictly implied that one caused the other - that's what the reader is inferring. Andrevan@ 01:35, 27 July 2022 (UTC)[reply]
I really think we should stop using a real world example involving three living people. But from my reading of this, you seem to want to be able to hint at something that you cannot directly say, and in my view that is still inappropriate. nableezy - 02:10, 27 July 2022 (UTC)[reply]
We can come up with a different example. My point is that it's not "hinting" to juxtapose related facts. It's not contravened by policy that a reasonable person might be able to make a conclusion, so long as we don't make it. Within reason of course, I can think of ways that it wouldn't be valid. Andrevan@ 02:14, 27 July 2022 (UTC)[reply]
Disagree totally, especially with your example. Avoiding SYNTH is easy. If you can find reliable sources which connect two facts, its not SYNTH. If you can't, then it is SYNTH and should be avoided. If you want to include such "simple synthesis" without corroborating reliable sources, then you are in the wrong. -- Netoholic @ 23:33, 26 July 2022 (UTC)[reply]
How do you square that with the policy statement that "collecting related information under a common heading", "Let the readers draw their own conclusions after seeing related facts in juxtaposition", is not original research? Maybe the policy is unclear on the OR of implication because I can't find this. If that is the consensus of editors then so be it, but can you elaborate on the policy justification? Andrevan@ 23:36, 26 July 2022 (UTC)[reply]
BLP overrides NPOV as it has legal implications. If you cannot see how even simply putting these two statements next to each other causes a BLP problem, you really need to step back to understand the importance of BLP management on WP. Masem (t) 23:42, 26 July 2022 (UTC)[reply]
Please explain how, if we could separately source the statements, Sergey Brin's wife had an inappropriate relationship with Elon Musk, and Sergey Brin is seeking divorce from his wife, are a violation of BLP if taken together. Andrevan@ 23:45, 26 July 2022 (UTC)[reply]
You are creating the implication that the divorce was a result of the inappropriate relationship. While that may be Occam's razor for why it is happening, we cannot make even such apparent leaps of logic when dealing with BLP. Maybe the affair was the last straw but there were other more pressing reasons why Brin sought a divorce. This is why it is far better to wait to have corroboration of accusations and of personal life claims like divorces. Now, obviously, it does look like the WSJ has made the connection here based on Elon Musk's page, so this is a null and void example since the situation appears resolved, but without the WSJ article, we would likely be best to avoid including yet-to-be-verified claims until we have that evidence even if they seem widely reported. Masem (t) 00:22, 27 July 2022 (UTC)[reply]
I agree we cannot make that leap, but that's why we aren't. You made the leap. The reader might reasonably assume that, but we aren't. Andrevan@ 00:23, 27 July 2022 (UTC)[reply]
I think thats too cute by a smidge. Almost Trumpian, a lot of people are saying level. Im not saying, but a lot of people are talking about it. nableezy - 00:27, 27 July 2022 (UTC)[reply]
Trumpianism would be making an unsourced statement, and then claiming it was a rumor that he heard, when he really made it up himself. I'm talking about adding 2 completely true, sourced statements to an article, and whether it would be OR/SYNTH to list them in the same section near each other because they both relate to a related topic, when that might create some implication in readers' minds that the events are connected even though we did not say they were. Andrevan@ 00:33, 27 July 2022 (UTC)[reply]
I said almost Trumpian, the part thats similar is the hinting at, suggesting at, but being able to say hey I didnt say that. Like I said, too cute by a smidge. nableezy - 01:37, 27 July 2022 (UTC)[reply]
"Suggesting" and "hinting" also go with the stronger usage of "implying" but that all connotes some expressive intent on our part, when we are simply juxtaposing related info and letting the reader decide what they think or how they want to conclude/interpret, per the policy links I quoted. I do think "almost Trumpian" is kind of a personal attack. It's implying we are trying to smear someone. Who exactly are we smearing by reporting that Sergey got divorced? His wife? Elon? Andrevan@ 01:50, 27 July 2022 (UTC)[reply]
Youre reading something into that phrase that isnt there. I said up above It doesnt have to be negative or slanderous. Its the hinting at something but still trying to claim I didnt say it that Im describing like that, but sure if you take that as an attack, then lets just call it inappropriately trying to run around the OR policy by making implications while maintaining a veneer of plausible deniability. nableezy - 02:14, 27 July 2022 (UTC)[reply]
We are supposed to play very much safe and middle ground with BLP. I agree that there may be a handful of readers that will not make that implication, but there are definitely some that will, and that's why we should avoid it.
Also, you are wrong that these are "true" statements. They are verified to the RSes making the claim, but (at least at this time) the only thing that is true is the divorce happened. We still do not have "truth" related to the affair. So it is not the case of simply just putting two true statements next to each other. If the affair was 100% confirmed, and the divorce 100% happened, it would be hard to argue that those two statements can't be placed next to each other even absent any suggested connection. It would be like a case of "Country X's economy sank 2000% in 2020. In late 2020, President of X was voted out of office." - it would be implied that the economy was the cause for X losing, but as these are both 100% statements with truth behind them, we can connect them without the concerns of the above. It's when we're dealing with accusations and other yet-to-be-proven statements that it can be a problem. Masem (t) 00:46, 27 July 2022 (UTC)[reply]
Mmm, ok. So you believe it would be OK if the two statements were completely verifiable and fully sourced for truth in wiki voice, and the case where it is not OK is when one is simply an allegation that hasn't been proven. Your BLP argument was that putting a true statement next to an allegation may lend credence to the allegation. Is that what you mean? So if they are both unambiguous facts that are related, we can list them together, but not if some of the statements are controversial or are attributed to opponents/critics. Is that fair? Andrevan@ 01:01, 27 July 2022 (UTC)[reply]
Masem, I would call your example of “the economy falling and President X losing the election” a classic SYNTH. Sure, a bad economy can often lead to an election loss… but not always. There could be a dozen other factors that are not mentioned. We cannot imply that X led to Y unless we have a source that directly connects X and Y. One way to tell if two factual statements, placed near each other, are forming an original conclusion - just switch the order in which they are presented. If doing this changes the conclusions the reader will form from the facts being in proximity, you have engaged in SYNTH. Blueboar (talk) 01:24, 27 July 2022 (UTC)[reply]
Depending on the context, I think those two statements are interchangeable. For example in a background article about the election, we might want to write some background details about the economy or the pandemic to set the scene, and then say what happened during the election, especially if those election issues were relevant during the campaign. We can't say the bad economy caused the bad election but we can talk about the economy being something that came up. We could also state the results first, and then talk about the conditions and the background. It's not SYNTH to talk about the state of the economy and the economic issues from the campaign, we know that they are related to the election, even though we don't know a proximate cause of the results. But a reasonable reader might assume that the economy was a main reason, and we don't need to try to avoid that if it's going to happen due to the proximity of factual sourced info. Andrevan@ 01:41, 27 July 2022 (UTC)[reply]
"Background" article sections are one of the easiest places to slip into SYNTH. If particular facts are appropriate background, then it should be easy to find attributable reliable sources that offer up those facts as background. The danger is in editors adding in facts that they believe are background, creating an OR/SYNTH situation. -- Netoholic @ 02:25, 27 July 2022 (UTC)[reply]
So if it is the position of editors commenting that there's no such thing as "simple synth," how do we interpret the policies I posted at the top of the thread? Can we give an example of what is "allowable not-OR synthesis"? Or such a thing doesn't exist? Andrevan@ 02:45, 27 July 2022 (UTC)[reply]
One example, I think, of "allowable" synthesis is in list articles. For example, List of unusual deaths has a defined inclusion guideline, but there is no reliable source that includes all of the list's entries (some external lists might include some of them), so that collection is a Wikipedia collection, more exhaustive than any one external source is. Some of the policies you posted are really just essays, so are one interpretation of the main WP:NOR policy. Some editors wish to see softened guidance, some prefer stricter. I stand on the side of preferring external reliable sources do the synthesis, and us just citing those sources and not going beyond them. -- Netoholic @ 03:08, 27 July 2022 (UTC)[reply]
True, that's actually right, my bad. Masem (t) 02:16, 27 July 2022 (UTC)[reply]
On the topic of synthesis (which is in the WP:NOR policy), you also have to connect it to WP:NPOV. "Undue weight can be given in several ways, including but not limited to the depth of detail, the quantity of text, prominence of placement, the juxtaposition of statements, and the use of imagery". So it's absolutely possible to imply something that isn't verifiably true or neutral, just by juxtaposing statements.
Let's say we were talking about a political candidate who dropped out. A reliable secondary source is summarized as: "The campaign found that they were slowly running out of money. On August 5th, they announced they were suspending their campaign."[1] A wikipedia editor comes in and inserts a statement. "The campaign found that they were slowly running out of money.[1] Journalists also criticized the candidate's debate performance.[2] On August 5th, they announced they were suspending their campaign."[1] All those statements are verifiable, but we've substantially changed the meaning to something that isn't really stated by the secondary sources. (You could imagine how malicious an editor could be with selecting what statement to insert, even a verifiable one.)
The policies are meant to be taken in their collective spirit. Most WP:SYNTH issues will at least raise a potential issue with WP:VERIFIABILITY, and even WP:NPOV in some cases. The point of WP:NOR is that editors are not supposed to be creatively assembling facts to present novel ideas, comparing or grouping things that no reliable source has found. Wikipedia builds articles based on a WP:SUMMARYSTYLE. It's not an easy line to define, but it's something we have to watch for. Shooterwalker (talk) 15:53, 27 July 2022 (UTC)[reply]
But let's look at this statement: "The campaign found that they were slowly running out of money. On August 5th, they announced they were suspending their campaign."[1] I think that statement is not OR/SYNTH because it's a reasonable assumption that the campaign's finances are related to how long the campaign can stay around. However, according to the interpretation of some editors, this would be OR/SYNTH unless we explicitly found a source saying that the campaign shut down due to their financial situation. Andrevan@ 15:56, 27 July 2022 (UTC)[reply]
I know it's hard to talk about hypotheticals. But in this hypothetical, the source is making the connection. (Though I think there's always a valid argument to be had about whether we are summarizing the source properly.) It becomes a much bigger issue when you insert the middle statement, and source it using a [1] [2] [1] scheme. Like I said, it's a hard line to define, but the policy exists to promote discussion about cases exactly like this. Editors should always ask if we are implying something that isn't verifiable in the sources. A much stricter standard applies if it involves WP:BLP and/or controversy. Shooterwalker (talk) 16:01, 27 July 2022 (UTC)[reply]
I agree the [1][2][1] example in this case feels problematic, but what about a situation where source [1] discusses the campaign's finances only, and source [2] discusses the campaign being suspended only? I contend it is not OR/SYNTH to group the two statements sequentially with 2 sources. Andrevan@ 16:49, 27 July 2022 (UTC)[reply]
No… that is precisely SYNTH. You are taking two statements and linking them to imply a conclusion that neither source states. A+B= C(implied). Blueboar (talk) 17:05, 27 July 2022 (UTC)[reply]
This is what is a major issue with our writings nowadays...editors want to craft articles around narratives they think exist on topics, but really are narratives of their own creation. Thus we get SYNTH like this, where two disparately sourced statements are written together in a manner that implies one follows or connects to the other, fitting their personal narrative but one not necessarily supported by sources. Same type of argument on the NPOV page about including votes from lawmakers. It is why we really need to get out of writing on immediate current events and instead wait until they are later covered by RSes. Masem (t) 17:19, 27 July 2022 (UTC)[reply]
I really don't agree. It's common for campaigns to run out of money, and it's common for them to be suspended. I do not think we are making an untoward, original research, novel synthetic implication by listing them together, even though a reasonable reader might assume that the campaign suspended due to running out of money. It's just a given in politics that "the campaign ran out of money" and "the campaign was suspended" are related statements that go in sequence. We aren't making a conclusion or a connection beyond the obvious one. Andrevan@ 18:33, 27 July 2022 (UTC)[reply]
I want to echo the other editors here. It's hard to discuss hypotheticals, but the [1][2][1] example is very likely to be WP:SYNTH. The [1][1][1] is less likely to be a problem, but it's always possible that someone cherrypicks statements from a source, and juxtaposes statements to imply something that isn't there in the source. That's why the WP:SYNTH policy exists, to remind editors that we're not supposed to draw comparisons / sequences that aren't there. Each case will need to be discussed by a consensus of editors until you have something that is plain from the sources, without any POV pushing or new ideas. Shooterwalker (talk) 18:36, 27 July 2022 (UTC)[reply]
I absolutely agree. It's case-by-case and a consensus of editors will decide, POV pushing or novel ideas are never good. I think Shooterwalker you are pointing out that there DO exist situations of "simple synthesis" as per the policy, which is distinct from the position that some editors are taking that "any synth is OR." Andrevan@ 18:39, 27 July 2022 (UTC)[reply]
The hypotheticals may be getting in the way of the discussion. A lot of the time, I will personally go through a reliable source and pull out what I think are the most important points, and arrange them into a sequence that is logical for readers. There is no intention of synthesizing a new idea. But someone may come along and say "actually, you make it sound like these two things are connected, and the source doesn't explicitly say that". Assuming they're right, I have two choices. One is that we rephrase until we find something that is closer to the original source. The other is that I find another reliable source to support the point that I think is being made. The point is that it's something we have to be sensitive to, but there will always be some grey area between summarizing and synthesizing. Shooterwalker (talk) 18:43, 27 July 2022 (UTC)[reply]
I absolutely agree. "There will always be some grey area between summarizing and synthesizing." I wish the policy would say that if that is the consensus of editors that such gray area does exist and is nonzero, of course we must be careful. Andrevan@ 18:45, 27 July 2022 (UTC)[reply]

It ain't in the policy

User:Andrevan has asked several times about how people square the actual OR policy with a statement that ...isn't in the policy. It isn't in any policy or guideline.

The words in question are a partial, out-of-order quotation from the "explanatory essay" Wikipedia:These are not original research. In full, they say:

  • Comparing and contrasting conflicting facts and opinion is not original research, as long as any characterization of the conflict is sourced to reliable sources. If reliable references cannot be found to explain the apparent discrepancy, editors should resist the temptation to add their own explanation. Present the material within the context contained in reliable sources, but avoid presenting the information in a way that "begs the question". An unpublished synthesis or analysis should not be presented for the readers' "benefit". Let the readers draw their own conclusions after seeing related facts in juxtaposition.
  • Identifying synonymous terms, and collecting related information under a common heading is also part of writing an encyclopedia. Reliable sources do not always use consistent terminology, and it is sometimes necessary to determine when two sources are calling the same thing by different names. This does not require a third source to state this explicitly, as long as the conclusion is obvious from the context of the sources. Articles should follow the naming conventions in selecting the heading under which the combined material is presented.

The first snippet that Andrevan quotes was added in this edit, and the second was added in this edit. Neither of these editors is especially active these days, and there have been few detailed discussions about them. However, I feel like Andrevan is quoting them out of context. Where are the "conflicting facts" that readers should draw their own conclusions about? I don't see any in these discussions. Where are the "synonymous terms" that need to be lumped together in these discussions? I don't see any. This is not especially relevant. WhatamIdoing (talk) 22:32, 27 July 2022 (UTC)[reply]

Fair, it's true that what I quoted/linked to there were essays and not policy, and I apologize for not distinguishing clearly between policy and essays. The essay notes I have been citing have stood for many years and I think they still should be thought through, though if consensus has changed, then it has changed. I believe those essays do reflect the views of 2007 accurately. The essay "What SYNTH is not" is also prominently linked from the WP:NOR main page. It's also true that in the discussion above, everyone except me has argued that an implication that could be reasonably inferred, is still WP:SYNTH. So noted. But there are policies that apply: Wikipedia:Neutral point of view#Making necessary assumptions, for example. I do think that regardless of what you think of my argument, there is still some gray area where "simple synthesis" is not original research. However, perhaps I failed to make a compelling argument or illustrate with reasonable examples. Andrevan@ 22:42, 27 July 2022 (UTC)[reply]
I agree in principle that some sorts of limited, "simple synthesis" statements will not violate WP:OR. One could, for example, combine a source about the population of Canada with a separate source about the population of the US, and a third source that has the population of Mexico, and produce an estimated number of humans living in North America. This is simple enough that it is not OR. But the example given above is not so simple, and it could easily be abused, especially for writing about politicians ("He has been condemned by <long list of organizations>. He voted for the Freedom and Liberty Act" – hmm, makes it sound like these organizations opposed that bill, doesn't it?).
I think that a more productive line of inquiry for right now might be "what should I do?" rather than "does it technically violate a rule?" To give you a slightly less fraught example, imagine a substub that says "Alice was an HIV activist known for promoting equitable access to healthcare. She died in 1996."
This might improperly imply that Alice died of AIDS. But what to do? Well, one thing is to find information to expand the article in ways that clear up the potential misconception: "Alice (1903–1996) was a nun who founded an HIV hospice in 1984 and became an HIV activist. She died of breast cancer at the age of 93." Another is to separate it, so that each of the two sentences is in a different paragraph or a different section. This little gap can sometimes help, at least a little bit. (I know that Miss Snodgrass told you that a paragraph can't have just one sentence, but that's not the English Wikipedia's rule.) A third option is to leave out the less important information (Alice's death isn't relevant to her notability). There might be more options. WhatamIdoing (talk) 00:56, 28 July 2022 (UTC)[reply]
Makes sense, thanks for the thoughts. [01:00, 28 July 2022 (UTC)] So WhatamIdoing would you agree that the following statement is not problematic synth: "Joe Colonialperson was a moderate abolitionist on the issue of slavery according to his writing[good secondary source citation]. Joe Colonialperson owned 5 slaves[cited to a github account from Washington post which has a table of slaveowners]." These events are connected but we aren't making any implication as to how the first impacts the second or vice versa, but they are both related to the issue of slavery. We're "compiling facts" "grouping under a related heading." Or am I synthing OR now? If so, what is the OR conclusion other than a general sense of the "yet"ness of the 2nd statement. Am I making sense or totally off the reservation here? Andrevan@ 02:28, 28 July 2022 (UTC)[reply]
You are creating the impression that Colonialperson was hypocritical (wanted to end slavery but owned slaves) even though that's not said. That's a problem. Masem (t) 02:54, 28 July 2022 (UTC)[reply]
So keeping in mind WhatamIdoing's earlier comment, "what to do" now? Do we have to delete the 2nd statement about Colonialperson? Andrevan@ 02:55, 28 July 2022 (UTC)[reply]
Understanding the nature of slavery and abolition at the time, either you find secondary sources that speak to the compounded statement, or in this case, the fact that he owned slaves seems extremely trivial when considering that being an abolitionist was likely more important. Masem (t) 03:09, 28 July 2022 (UTC)[reply]
So by your logic, even if there is a source that is reliable showing that Colonialperson owned 5 slaves, you believe it is inappropriate SYNTH to add this fact to the article at all, because it would suggest the conclusion that Colonialperson was a hypocrite, since he was also stated to be an abolitionist. I really can't agree with that, I think we have a responsibility to report this information if it's properly referenced and accurate information. If that is the consensus of editors obviously I will abide by it, but it seems like a wrong and bad policy to me. We shouldn't whitewash facts and history for fear of implying something negative about a person, especially one that died hundreds of years ago (hypothetically). Andrevan@ 03:14, 28 July 2022 (UTC)[reply]
No we don't have that responsibility. We are here to summarize sources, not repeat every detail they give. I would have a real hard time to believe that if there are a fair number of sources reporting on both aspects here, that none of them cover the intersection of those ideas (that is, to explain why he owned slaves if he was an abolitionist). But I can see the situation where there's plenty of coverage of the abolitionist aspect but where the slave owning was only in 1 or 2 sources. In such a case, we'd just not include that trivial information.
Just because the person is long dead, it still is a SYNTH issue which applies to any topic. We have to be fully aware of this throughout any writing, but moreso when it is a BLP. Masem (t) 03:26, 28 July 2022 (UTC)[reply]
Consensus may be that you are correct, and I am mistaken, but permit me to argue for the lesser interpretation of SYNTH. If consensus is that I am wrong, I will abide by that.
Given the scenario which we agree is possible, that a few sources cover that Colonialperson owned slaves, but the vast majority of more "classic" sources like mainstream textbooks, generally just talk about Colonialperson as an "abolitionist." That means we should reduce the WEIGHT of the Colonialperson-slavery info. I don't agree with the argument to leave it out altogether or even that it is a form of original research to include it. Original research is about pushing novel ideas and new conclusions. There are and should be reasonable examples of "allowable synthesis," namely applying simple logic, organization, such as basic calculations, grouping, or substituting synonyms etc. I would argue this also extends to juxtaposing related facts. It makes sense that we should take care to avoid suggesting new ideas by implication, but you are actually arguing that any inclusion of this fact constitutes original research due to unavoidable implications of the coexistence of two facts.
I do think we have a responsibility to report verifiable, reliable information, about living or dead persons, and especially if they are public figures. I think we have a responsibility to NPOV not to try to preserve the good image of people just because something unpopular might be true about them. Similarly, we need to, for NPOV, appropriately characterize the opponents' views of a person (writing for the opponent) or qualify/attribute relevant minority views. I don't agree that the original research policy was designed to prevent editors from doing any kind of logical organization or providing additional verifiable information that would be educational and informative, and is encyclopedic.
The slavery example is somewhat contrived but also realistic. Summarizing sources does not mean being selective about including relevant, pertinent, sourced information, because it only appears in a small fraction of the sources. That is an argument to reduce WEIGHT, not to eliminate the information. And the idea that it is original research because the reader might judge the subject of the article harshly for their actions, just by knowing 2 facts, goes beyond simply the idea that we are consciously creating an implication by placement. Andrevan@ 03:38, 28 July 2022 (UTC)[reply]
WP:NOT and WP:V - there is no assurance that every bit of information that has been reliably published needs to be captured in an article. And if the tradeoff of not including a minor point about owning slaves is to avoid implied SYNTH, we're going to prefer the latter. SYNTH starts, immediately, Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source. and that's exactly what the situation is here. It's pretty clear we absolutely take steps to avoid potential implied conclusions. And if that means we need to drop less-covered information, so be it. We are a summary source and absolutely do not need to include every detail that is published.
Yes, organization of information is not necessarily original research, but the wrong approach to organization in a manner that is not similar to how the topic is already covered can create synthesis and original research, as well as create NPOV problems. (eg this is why we try to have editors avoid "controversy" sections on topics and instead work that commentary into the overall article, as such controversy sections often bring in lots of this synthesis-like OR by how statements are grouped.) Masem (t) 04:47, 28 July 2022 (UTC)[reply]
I agree that not all information should be included necessarily, but suppose there was a consensus of editors that the WEIGHT/NPOV was due, I am concerned about the SYNTH issue in that case. Andrevan@ 19:51, 30 July 2022 (UTC)[reply]
The slave owner example is interesting but we would have to be very careful with such information if we can't find a RS that explains the apparent discrepancy. Consider a few possible explanations: 1. They are cynically hypocritical (do as I say, not as I do). 2. They are actually torn by it but still own the slaves because they feel it is needed at this time (I suspect Jefferson was in this group). 3. They inherited a plantation that came with slaves. Once the property transfer was in order they freed the slaves but they were the legal "owner" for the time between inheritance and freeing them. 4. They were a slave owner, decided it was wrong and then became an abolitionist (John Newton). For the given facts these associated contexts are very different and we need to be very careful about presenting the given facts in a way that implies which, if any, of the contexts is true. If the only source for owning the slaves is a primary source then it probably isn't due. I do get that modern sensibilities consider slave ownership to be very notable. However, if WP:RS about the subject don't note the slave ownership then we shouldn't either. Else we are engaging in OR to establish relative WEIGHT. Springee (talk) 02:42, 29 July 2022 (UTC)[reply]
I agree that those 4 possibilities exist, but I don't agree that we need another source to tell us which it is. We can leave it unknown and allow the reader to make their own conclusions. We should avoid introducing SYNTH but just adding the information, provided it had appropriate sourcing and sufficient weight, is not default SYNTH IMHO. Andrevan@ 19:50, 30 July 2022 (UTC)[reply]
"You are creating the impression that Colonialperson was hypocritical": Not necessarily. Consider the similar (in logical structure, not in moral weight) "Joe Restaurant advocates for legally prohibiting smoking in restaurants, but allows smoking in his restaurant because it is currently legal and customers want it." This was a pretty typical stance in the US restaurant industry a few decades back: they didn't really want smoking in their restaurants, but they also didn't want a third of their customers leaving for places that permitted it, while only a tiny number of new customers would come to the lone smoke-free restaurant. It's not always hypocritical to advocate for a systemic change without making individual changes in advance. WhatamIdoing (talk) 01:43, 29 July 2022 (UTC)[reply]
In your example, this is presuming that "because it is currently legal and customers want it" is sourced. Without that, the phrase returns to synthesis. Masem (t) 01:47, 29 July 2022 (UTC)[reply]
I agree Andrevan@ 19:49, 30 July 2022 (UTC)[reply]
Without a source, none of it could appear in an article. However, I give this as an example of the logical mistake. Compare:
  • "Joe Restaurant advocated for legally prohibiting smoking in restaurants, but allowed smoking in his restaurant"
  • "Joe Colonialperson advocated for legally prohibiting slavery, but owned 5 slaves".
Hypocrisy isn't the only reason why someone would advocate banning a system they're exploiting. WhatamIdoing (talk) 22:21, 30 July 2022 (UTC)[reply]

On combining sources

I think that when editors are deciding whether to combine sources to get to a conclusion, I would like them to consider four things.

  1. Are the sources really comparable? (Are they equally rigorous? Do they use the same methodology? Are they definitely talking about the same place, or really comparable places? Are they definitely talking about the same time period, or really comparable time periods?)
  2. Does combining the sources get to a conclusion that's helpful or interesting to the reader?
  3. What's our purpose in combining them? Are we trying to lead readers towards a particular conclusion?
  4. Why can't we find a reliable source that's combined them already?

There are certainly cases where all four questions have good answers.—S Marshall T/C 19:22, 30 July 2022 (UTC)[reply]

These seem like valid questions, but the case that I think maybe we lack clarity on or we aren't properly communicating on (or maybe I am just wrong on this, that is indeed possible), is when combining sources simply for organizational purposes, making reasonable assumptions about the meaning of their content, and to show a change over time or otherwise construct good writing and communication, based on table stakes assumptions about the context of the information.
For example to question 3, the answer is emphatically no in all the cases I have offered, and I think explicitly using them to get to a conclusion does make it synth, where there seems to be a gray area is in what can be considered inappropriately implied. To question 2, I don't think presenting related facts that show different or conflicting views at different times implies that the person is a hypocrite and is therefore SYNTH. The person's views may have evolved, and we don't necessarily know why. Or to WhatamIdoing's point, sometimes there is a strategic or pragmatic reason to say A and do B. SYNTH is about original research and advancing new ideas, not the basic facts and assumptions needed to organize an article and write effectively about history or science or whatever.
For example, totally made-up and historically nonexistent: Congressperson Abraham Adams supported the General Sedition Act in 1863.[1] However by 1867 Adams told the New York Gazette that legislation pertaining to sedition was a violation of the right to free speech.[2] Adams voted against the 1877 Alien and Sedition Acts.[3] To your question #4, sometimes we just don't have that source which makes the connection. That doesn't mean they aren't still connected, but I contend that in order to fit these facts together, it's reasonable to know that if source 1 says Adams supported the act, source 2 shows that he opposed the act and is later in time, and source 3 says he voted against the act which is even later still, that this is a normal situation in politics. We aren't unduly implying that Adams' actions were hypocritical or creating original research. It is known in politics that if you take a position on an issue, and vote for or against that issue, those facts are related by virtue of the basic way the system and actors work. Andrevan@ 19:35, 30 July 2022 (UTC)[reply]
I'm afraid my ignorance of US politics is profound, so please forgive my failure to follow your point. How does this example combine sources to reach a conclusion?—S Marshall T/C 20:10, 30 July 2022 (UTC)[reply]
I agree, it does not "reach a conclusion," but editors above and in other discussions have argued that such description would be SYNTH/OR. Andrevan@ 20:12, 30 July 2022 (UTC)[reply]
Your example here is not like the previous examples, because of the source where Adams explained his change of mind, which thus is not an issue with synthesis. It would be synth if that statement was not available, and resulted in an implicit suggestion that Adams was hypocritical or the like, which we cannot do. The previous examples were similar - two facts were presented that presented contrary statements but without any input of why that contradiction existed, so it made the person written about appear hypocritical. Masem (t) 20:41, 30 July 2022 (UTC)[reply]
I am glad we agree this series of statements is OK, but Masem, source number 3 doesn't say why Adams voted against the bill, so by your earlier logic, if we create this sequence, we are implying that his 1867 statements, are the reason for his 1877 vote, even if this source didn't tie them together explicitly. I contend this is "fair synth" but you earlier, led me to believe you did not. Perhaps, I am mistaken. Or to use more of a real example: "Marjorie Taylor Greene was critical of NATO.[1] Marjorie Taylor Greene stated that the US should leave NATO[2]. Marjorie Taylor Greene voted that Finland and Sweden should not join NATO.[3]" Andrevan@ 20:44, 30 July 2022 (UTC)[reply]
I am saying that the key line in your example is the 1867 statements that explain the change of mind, which gives reason of why it is fine to mention, in sequence his earlier "for" vote and then his later "against" vote. Without his statement, putting the two vote stances next to each other creates synth in the implication of hypocritical nature. In the case of the MTG example, if you did not include the first statement that gave her stance on NATO, then putting the other two sets of sources together implies her being critical, which again we cannot do. But the first statement gives the necessary OR from RSes that then is fine to link them. Masem (t) 21:05, 30 July 2022 (UTC)[reply]
Masem, I think you're assuming far too much. Most people would not expect us to protect politicians against all hints of hypocrisy – the profession is rather known for that quality, or at least for prioritizing party loyalty above principles – but we shouldn't be fixated on such concerns. In this case, the politician might not have changed his mind at all. The difference could have been the bills in question (some forms of seditious speech are not legally protected free speech in this country; perhaps one bill infringed on free speech and the other didn't), or the difference could have been party politics (perhaps the real difference in the two bills had nothing to do with sedition or free speech, but an unrelated "Christmas tree" clause that the opposition party hung on it), or the difference could have been circumstantial (politicians are more likely to vote against free speech during a war), or the difference could have been the attitude of the voters, or any number of things.
The problem here is not "He voted for Bill #1, and he voted against Bill #2". If there is a NOR problem here, it is in claiming that the two votes were inconsistent (the "However" language) and claiming his second vote was motivated by free speech concerns. WhatamIdoing (talk) 22:35, 30 July 2022 (UTC)[reply]
A lot depends on context too. Let's say this lawmaker was a Democrat, and the way the votes fell were against normal democratic patterns. (eg say #1 was for more military spending and #2 against expanded immigration). Now, those statements may be factual, but just their presences without any further explanation or context (given that we don't normally report how a lawmaker votes on every bill) would appear to be critical of these voting patterns. That's why we prefer editors to wait and work from secondary sources (not primary news stories) that provide analysis and context to better describe, in this case, the political positions of a lawmaker, so that we absolutely avoid the potential OR by trying to analyze ourselves. Masem (t) 23:10, 30 July 2022 (UTC)[reply]
If their vote on the bill was suitably referenced in a number of reliable sources with non-trivial mentions, I believe that vote could be appropriate, if it relates to their positions on the topic, i.e. the context. If their vote was somehow unique or different, if anything, that is equal or more reason to mention it. Breaking with one's party is often notable. Assuming already verifiable, reliable, and due weight for notability reasons, this idea that we're doing original research by listing the votes that people have taken, simply because someone might interpret that negatively or critically, is quite objectionable to me. Andrevan@ 23:19, 30 July 2022 (UTC)[reply]
I would expect that if many sources documented these votes that there would be context in those sources to explain why they are important, if they are specifically calling that lawmaker by name and not just mentioning the voting call. The secondary-like content of those news stories is necessary to include, not just the primary-factual details that, without the secondary information, are just data points that we as editors cannot connect. We have gotten really really sloppy on this type of writing overall on WP, far too much focus on prose-line style writing rather than looking for summaries and analysis with editors thinking they know what's best to include. That's why it is very important to know that improper synth can come from combining sources, particularly those that are only carrying primary information and not secondary analysis. Masem (t) 23:29, 30 July 2022 (UTC)[reply]
If you have a number of reliable sources with non-trivial mentions, you should be able to provide a sourced explanation of why these votes matter and what they represent. WhatamIdoing (talk) 04:47, 31 July 2022 (UTC)[reply]

Combining sources is necessary -- we rightly expect and require that articles will have multiple sources. Selecting which information from each source to include is necessary -- we are meant to summarize what the reliable sources say, not obsessively regurgitate every detail. So the wording of SYNTH needs to be quite nuanced. There's a lot, in the previous discussion, about how you can violate SYNTH through the sequence of ideas -- the order in which you say things -- and I think that's true but when we're thinking about how to write policy, we need to let editors write articles. Most articles should present related facts in chronological order. It should not be a SYNTH violation to do so. In fact, I think SYNTH should be written conservatively with a lot of nuance.—S Marshall T/C 08:55, 31 July 2022 (UTC)[reply]

I fully agree with S Marshall on this 100%. Huggums537 (talk) 09:34, 31 July 2022 (UTC)[reply]
While it is important that articles should provide a good chronologic order to events relative to a topic, our goal should still be focused on a long-term, looking-back view of the topic, and that may mean that a strict chronologic order may not be ideal. For example, to use lawmakers again, a strict chronological order of how they voted is far less helpful and more prone to synthesis than working from the lawmaker's key stances on issues and what they did in their government role to support or oppose that. Or another way to look at this is that we should be working to emulate how secondary sources structure content, rather than trying to piece together too many primary sources, which is where SYNTH can easily come into play. Masem (t) 12:58, 31 July 2022 (UTC)[reply]
It is the case, that much synth/notsynth is a matter of the way the content is written (often with use of implication), and that it is sometimes a cross-over with some ways of undue emphasis. -- Alanscottwalker (talk) 13:52, 31 July 2022 (UTC)[reply]
Agreed with this Andrevan@ 17:29, 31 July 2022 (UTC)[reply]

Lexical cohesion in sources

I couldn't find any guidance on lexical cohesion in the policy and started a thread at WP:NORN#Treating lexical cohesion in sources.

Looking at the archives, might've been better suited for this talk page. Would appreciate a wider input at the WP:NORN discussion. PaulT2022 (talk) 23:35, 31 July 2022 (UTC)[reply]

If so then NOR won't help much. NOR doesn't say you have to use the words the sources use. It says your articles have to mean what the sources mean.
In this case the sources mean "war crimes" and I think you can and should say that.—S Marshall T/C 07:08, 1 August 2022 (UTC)[reply]
I don’t disagree… however, a caution is required: Different people can read the same source, and interpret the words it uses as “meaning” very different things. Blueboar (talk) 13:23, 1 August 2022 (UTC)[reply]
Exactly. I don't advocate writing those words in the policy, although there's what might turn into an essay about it in my userspace.—S Marshall T/C 17:36, 1 August 2022 (UTC)[reply]
I do think an essay summarizing these discussions might be worthwhile, maybe as part of WP:NOTOR. Andrevan@ 20:31, 1 August 2022 (UTC)[reply]
An expansion of Wikipedia:These are not original research#Paraphrasing?
Etymological fallacy and Semantic change are also things that need to be addressed. The advice that seems most salient to me is that when Major Authority™ says that something's name (or spelling) has changed, then you should use the new name even if you are citing a source that uses the old name. For example, the older names for what's now called Intellectual disability should be replaced by the current name (all the usual exceptions apply: in direct quotations, redirects, etc.). Similarly, if a group says that an old term is inappropriate for some reason (a reason that is compatible with encyclopedic purposes), then editors should use the more appropriate/clearer/less offensive term. Here I am thinking of words like Miscarriage, which most major medical organizations and patient groups declared to be preferable to the older term spontaneous abortion decades ago. The older term sounds like women are just popping out for an abortion on a whim. I am not thinking of euphemisms ("lost his battle with depression") or obviously problematic terms ("Don't call me a thief; call me a professional property reapportioner"). WhatamIdoing (talk) 23:26, 25 August 2022 (UTC)[reply]
Also: When a variety of sources use a variety of words, then that's an excellent opportunity to employ elegant variation. I understand that in some languages, students are taught to find the best word and use it repeatedly, but that's not considered good writing style in English. If half your sources say "Auto racing" and the other half say "Car racing", then use both. WhatamIdoing (talk) 23:31, 25 August 2022 (UTC)[reply]
I agree about the elegant variation piece. However in other cases there is the issue of WP:COMMONNAME. Andre🚐 23:39, 25 August 2022 (UTC)[reply]
There have been several discussions about this recently. Last weekend, I was thinking about some potential advice and realized that it would be possible, by dint of someone quietly cherry-picking sources and then insisting on using whatever name is most commonly used in the currently cited sources for an article, to end up with an article that isn't allowed to mention its title after the first sentence. WhatamIdoing (talk) 16:17, 26 August 2022 (UTC)[reply]

Community research should be encouraged! Community research would unleash the potential of humanity!

Original research can go in a special box and still be based on every other Wiki policy and principle, for example: consensus and neutral point of view.

I think this policy page flies only because in the early days of Wikipedia Jimbo Wales sent a mailing list post and everybody else now hops on. Altanner1991 (talk) 04:50, 31 August 2022 (UTC); edited 05:09, 31 August 2022 (UTC)[reply]

Disagreed. No original research is a very important policy. It's about the idea that Wikipedia isn't a place for publishing original thought. It's a place for compiling secondary source references to support distillation of verifiable information into general purpose reference. Andre🚐 04:54, 31 August 2022 (UTC)[reply]
This has 0% chance of happening, but anyway: instead of a separate box, how about a separate website? We can call it Wordpress, Blogspot, YouTube, or Twitter. Crossroads -talk- 04:51, 1 September 2022 (UTC)[reply]
Well, I thought it was the greatest suggestion. Wikimedia is far superior. Blogs and social networking? No, their lack of collaborative potential renders them irrelevant. Altanner1991 (talk) 05:44, 1 September 2022 (UTC)[reply]
There are all kinds of things that Wikipedia is WP:NOT. Some of them are even good in the right circumstance -- democracy, databases, speculation, websites. But original research is outside the scope of an encyclopedia. Shooterwalker (talk) 05:03, 1 September 2022 (UTC)[reply]
I suppose a sister project would have been a more conservative suggestion. Altanner1991 (talk) 05:45, 1 September 2022 (UTC)[reply]

Yes, I might agree that the encyclopedia is best (at least, at this point in time) kept "neat", meaning no major deviations from the traditional "encyclopedia concept". But the idea of "community-led research" in a sister project is something I would find exciting. Altanner1991 (talk) 05:51, 1 September 2022 (UTC)[reply]

i'm n00b. what of the case of simple calculations that any wikipedia reader can verify for themselves?

asking for this: https://en.wikipedia.org/wiki/Talk:Fischer_random_chess#How_do_I_go_about_adding_statistics?

I propose to add statistics that I'll calculate myself (eg how often white wins vs black wins vs draw) and then people can verify for themselves but it'll take about 15 minutes to verify. is this original research? Thewriter006 (talk) 13:39, 28 September 2022 (UTC)[reply]

Yes it is, don't do it. If your results are valid, perhaps you can just search existing publications for the figures you came up with, and then cite those publications. Do not add unsourced material to the article based on something you came up with yourself, even if you are the author of the definitive work on the applications of statistical analysis to chess. Mathglot (talk) 23:14, 28 September 2022 (UTC)[reply]
how many minutes is the cut-off here: is 10 seconds acceptable? Eg 'White has about, on average, a 7% increased advantage in these 90 positions (Evaluation is 0.1913) compared to the remaining 870 positions (Evaluation is 0.1790).' There's no source for 7%, but there is for 0.1913 and 0.1790. And then you can calculate for yourself 7% in 10 seconds. So 10 seconds is ok but 15 minutes is not. Hmmmm...what's the cut off? Or is the 7% even O.R. too?
P.S. This is chess960 not chess. ;) Thewriter006 (talk) 08:09, 29 September 2022 (UTC)[reply]
The whole % calculation assumes these are linear ratio scales, which is nontrivial. So while the calculation may (to some extent) be easy, the interpretation may be nonsensical (which is why it should not be added). E.g. it also makes no sense to claim that going from 32 Fahrenheit to 48 Fahrenheit is a 50% temperature increase - as becomes blatantly evident when we use the Celsius equivalent (going from 0 to 8.9 Celsius, which would amount to an infinite % of temperature increase). Arnoutf (talk) 15:08, 29 September 2022 (UTC)[reply]
Agree with Arnoutf. Yes, there is an evaluation of 0.1913. Yes, there is another evaluation of 0.1790. Yes, 0.1913 is about 7% greater than 0.1790. But if you're not an expert in writing about chess960, you won't know what an evaluation is, and what are valid ways to compare one evaluation to another. Which is why a reliable source should be making the comparison, and we should cite the reliable source. Jc3s5h (talk) 15:36, 29 September 2022 (UTC)[reply]
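The temperature point above can be illustrated with a short sketch. It applies the same percentage formula to the same physical change expressed in Fahrenheit, Celsius, and Kelvin. The helper names are invented for this example, and the Kelvin line is an addition not mentioned in the comment, included only because Kelvin is a scale with a true zero.

```python
def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32) * 5 / 9


def celsius_to_kelvin(c: float) -> float:
    return c + 273.15


def percent_increase(old: float, new: float) -> float:
    """Percentage change from old to new; infinite when the baseline is zero."""
    if old == 0:
        return float("inf")
    return (new - old) / old * 100


c_old, c_new = fahrenheit_to_celsius(32), fahrenheit_to_celsius(48)

f_change = percent_increase(32, 48)        # 50.0 on the Fahrenheit scale
c_change = percent_increase(c_old, c_new)  # infinite on the Celsius scale
k_change = percent_increase(celsius_to_kelvin(c_old), celsius_to_kelvin(c_new))

print(f_change, c_change, round(k_change, 2))
```

The three results disagree even though the physical change is identical, which is the sense in which a percentage is only interpretable on a ratio scale.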
  • When we say that simple calculations are not OR, we are talking about very basic arithmetic - adding two numbers together, converting feet into meters… things that the average 10 year old would understand. Statistical calculations are not that basic. When in doubt, cite a source. Blueboar (talk) 16:32, 29 September 2022 (UTC)[reply]
    Agree with Blueboar. Couldn't have said it better myself. Altanner1991 (talk) 03:24, 30 September 2022 (UTC)[reply]
    I think usually it should be allowable to use the kind of calculation the source intends be used. For example, if citing a table that was written with the expectation that the reader would interpolate between values, and the value being looked up is in between two tabular values, it would be appropriate for the Wikipedia editor to interpolate. Jc3s5h (talk) 17:33, 30 September 2022 (UTC)[reply]
  • I collegially differ from my colleagues' remarks above about WP:CALC, and I believe it should go much farther than they suggest. Mathematics is a lot like a foreign language. It is not needful that anyone should be able to understand your calculations. All that matters is that someone should be able to understand it. We have some really top notch mathematicians on Wikipedia who can verify the more rarefied calculations for you. And indeed, in practice, articles within the scope of WikiProject Mathematics do rightly allow some pretty advanced maths, because it's impossible to explain mathematics successfully without examples and we can't rip off examples from textbooks because of copyright.
For example, our article on Tensor product of modules has a footnote that reads:

First, if then the claimed identification is given by with . In general, has the structure of a right R-module by . Thus, for any -bilinear map f, f′ is R-linear

I wouldn't expect a humanities graduate to follow that. But I put it to you that it is a good and valid way of verifying the claim it makes, and my position is that it does and should fall within the scope of WP:CALC.—S Marshall T/C 18:28, 30 September 2022 (UTC)[reply]

Avoiding original research in determining superlatives

I have been having a discussion with others on Talk:Longest flights about how to verify "lists of superlatives." The case in question is a list of the longest flight currently operated by each type of aircraft. Since all commercial flight data is available on various commercial websites, in principle this information is just a case of sorting, which I suppose is a routine calculation as allowed by this policy. However, the number of flights is large enough that in practice this is done by a user running a script every week to scrape all of the flight data off of a flight data website and then sorting to find the longest flights. Is there an accepted way to cite these claims so that they are verifiable and do not fall afoul of the no-original-research policy? CapitalSasha ~ talk 13:36, 3 October 2022 (UTC)[reply]

That's synth because it requires making assumptions (like, did the carrier at one point offer a longer flight no longer offered?). Superlatives in WP's voice should always be taken as OR. Masem (t) 14:04, 3 October 2022 (UTC)[reply]
That was my initial thought, but it was pointed out that the same criticisms apply to lists like List of tallest buildings that seem to be well-accepted. (No actual source is provided saying that The Marina Torch in Dubai is the 77th-tallest building in the world.) CapitalSasha ~ talk 16:31, 3 October 2022 (UTC)[reply]
Except there, there are clear standards for how building height is measured and separate listing of these buildings relative to each other. Adding a new building to a well defined list like that as long as the standards for measurement have been set is not the same issue. Masem (t) 16:58, 3 October 2022 (UTC)[reply]
Standards for flight length have also been determined though. They are measured by great circle distance. FlyingScotsman72 (talk) 01:19, 5 October 2022 (UTC)[reply]
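To make concrete what "sorting by great circle distance" involves, here is a minimal sketch. It uses the haversine formula on a spherical Earth, and the radius constant, function names, and sample records are all assumptions for illustration. It is not the script mentioned above, and commercial sites may use a different Earth model.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius; sources differ slightly


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp guards against rounding pushing sqrt(a) just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def longest_flight_per_type(flights):
    """flights: iterable of (aircraft_type, route_label, distance_km) tuples."""
    longest = {}
    for aircraft_type, route, km in flights:
        if aircraft_type not in longest or km > longest[aircraft_type][1]:
            longest[aircraft_type] = (route, km)
    return longest


# Hypothetical records, not real flight data.
sample = [
    ("Type A", "P-Q", 9000.0),
    ("Type A", "P-R", 12000.0),
    ("Type B", "Q-R", 7000.0),
]
result = longest_flight_per_type(sample)
```

The calculation itself is routine. The assumptions raised in the thread (which flights count as currently operated, and whether the scraped data is complete) sit in the input list, not in the sorting.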

Let's finally split PSTS to its own page

This has come up time and again, and I think we should just do it. I realize that this will entail an RFC and probably some hand-wringing over whether the same words on a separate page still say the same thing. I get it; change is hard, and we want to get this right. But on the other side:

  • the only reason this was ever in this page is because we wanted to tell people that Wikipedia is not a primary source, so we'd appreciate if editors didn't just make stuff up themselves and stick it in articles (i.e., "original research", as in "Wikipedia is not a publisher of original thought", with the numbered list beginning with WP:NOT "Primary (original) research"),
  • whether a source is primary, secondary, or tertiary doesn't have much to do with whether a claim is verifiable and therefore not original research (nothing that's actually verifiable is a violation of OR),
  • the concept of PSTS is important to multiple policies and guidelines, not just this one. Actually, not even mostly this one. The words primary, secondary, and tertiary do not appear anywhere in this whole policy except in the one ===subsection===.

Looking it over, there will have to be a few changes, but they all seem surmountable. I might try to mock this up in the sandbox later, but so far, it looks like we'll need a new nutshell for the split-off policy (the existing one doesn't mention PSTS at all), and we'll need to decide whether PSTS should be called a "core" content policy in Template:Content policy list, or if it should be listed in "Other", next to BLP and NOT. There's also the more mechanical matter of repointing various shortcuts, but that's easy.

It looks to me like the lead of the current page won't need a single word changed, and it's possible that nothing else will either, except to copy the existing PSTS subsection to another page. The new page would probably benefit from a couple of introductory sentences.

@GregKaye, this is partly inspired by your comments above, so I'd like to know whether you see any problems with this. What do you (all) think?

WhatamIdoing (talk) 21:35, 14 October 2022 (UTC)[reply]

No.
WP:PSTS is the fundamental core of WP:NOR. Satisfying NOR requires a balance of primary and secondary sources, as nicely laid out at PSTS.
Verifiability is another policy. If you think NOR and V should be merged, let’s return to WP:A (a very good idea but failed catastrophically due to poor change management).
“Primary” means “original”. It appears in the title. “Secondary source” is mostly every source that is reputable and not primary.
PSTS is core policy. It is the meat of NOR. It immediately goes to source typing, which is essential in writing an encyclopedia as opposed to writing a random collection of facts. Wikipedia is the first. Google serves for the second. SmokeyJoe (talk) 23:25, 14 October 2022 (UTC)[reply]
The top of NOR says "The phrase "original research" (OR) is used on Wikipedia to refer to material—such as facts, allegations, and ideas—for which no reliable, published sources exist."
Therefore, I conclude:
  • reliable, published source exists – not OR
  • reliable, published source does not exist – OR
Note the complete absence of any words like primary or secondary in that definition. That's because they're not technically relevant to the question of whether a given claim is OR.
I disagree that any policy requires a "balance" of primary and secondary sources. This could only be true if you think that "balance" could involve zero primary sources, which is definitely the desirable "balance" for articles like Cancer. There is no reason for that article to cite any primary sources at all.
But even if you had the wrong balance of source types, the result wouldn't be "material—such as facts, allegations, and ideas—for which no reliable, published sources exist." It would just be another article with verifiable, non-OR contents that needed more work. WhatamIdoing (talk) 23:44, 14 October 2022 (UTC)[reply]
The top part, the current lede, is bloated and not very good. The early versions of the page began better. The part you quote is particularly poor. Your conclusions suggest that you are going straight to WP:V.
You disagree that WP:PSTS requires a balance of primary and secondary sources? I’m astounded. It squarely does, and it is the most important part of this core policy, to require a balance of primary sources (sources of facts) and secondary sources (evidence of interest, and contextualisation of those facts). This policy establishes the need for the balance so clearly that it need not be repeated elsewhere.
Failure against PSTS usually means the article needs more work. Where the balance utterly fails, all facts no secondary sources, it is the extreme case covered by WP:N, which is an explicit WP:DEL#REASON and is regularly enforced. SmokeyJoe (talk) 23:54, 14 October 2022 (UTC)[reply]
Which sections are "most important" isn't really the point. The point is that the fundamental definition of "OR" has nothing to do with historiography. PSTS could continue being the most important policy even if PSTS's words weren't located on the same page as WP:CALC's words.
The early versions of the page don't mention PSTS at all. See, e.g., the first day:
---
Wikipedia is not
the place for original research such as "new" scientific theories.
From a mailing list post by Jimbo Wales:
If your viewpoint is in the majority, then it should be easy to substantiate it with reference to commonly accepted reference texts.
If your viewpoint is held by a significant scientific minority, then it should be easy to name prominent adherents, and the article should certainly address the controversy without taking sides.
If your viewpoint is held by an extremely small minority, then whether it's true or not, whether you can prove it or not, it doesn't belong in Wikipedia, except perhaps in some ancilliary article. Wikipedia is not the place for original research.
---
Two months later, it got its first mention of primary sources, and that was to say that "Wikipedia is not the place for original research such as "new" theories. Wikipedia is not a primary source."
If you start an article with only plain, simple, obvious facts and no contextualization, you are not engaging in original research. You're just not writing a very good article. The point made in the early versions was that Wikipedia is not a publisher of original research. PSTS is not really about that rule at all. WhatamIdoing (talk) 00:44, 15 October 2022 (UTC)[reply]
If you start with just plain simple facts, you’re taking a wild chance that what you are writing about is Wikipedia-notable. It is extremely poor advice to tell a newcomer that they can do this, even if they can. SmokeyJoe (talk) 01:59, 15 October 2022 (UTC)[reply]
PSTS isn't about notability, and it isn't written for newcomers. WhatamIdoing (talk) 03:03, 15 October 2022 (UTC)[reply]
PSTS is the foundation of WP:N, the requirement that each article has two secondary sources. WP:N covers the extreme end of applicability of PSTS, where secondary sources don’t exist, and to attempt to write on the topic can only violate WP:NOR. WP:N doesn’t limit content, it is only for deletion/merge decisions. If you want to limit coverage of a subtopic within an article due to lack of subtopic notability, the policy basis for doing this is WP:NOR, specifically WP:PSTS.
All core policy should be considered to be written for newcomers. There is a history of leading Wikipedians using policy editing to engage in high-language debates with each other, but these pages should instead be regarded as basic policy that should be amongst the first pages that newcomers are pointed to, as is the case. SmokeyJoe (talk) 03:21, 15 October 2022 (UTC)[reply]
I'm struggling to understand how WP:N's reference to secondary source could possibly be harmed if the exact same words are on a page with a {{policy}} tag at the top that's called Wikipedia:Primary, secondary, and tertiary sources instead of being located on a page with the same tag at the top that's called Wikipedia:No original research.
When editors want to limit coverage of a subtopic within an article due to lack of subtopic pertinence, the most commonly invoked policy basis for doing this is WP:DUE, but even if you like to invoke PSTS for this, I again fail to see how that goal could possibly be harmed by putting the exact same words on a separate page. WhatamIdoing (talk) 05:27, 15 October 2022 (UTC)[reply]
Harm?? WP:N has a foundation in PSTS was the point.
DUE is good for most cases. PSTS may be better sometimes, like when someone wants to add data that no source ever commented on. SmokeyJoe (talk) 08:37, 15 October 2022 (UTC)[reply]
Let's stipulate that WP:N has a foundation in PSTS.
What would happen to WP:N if we decided to WP:MOVE this page to a different title? Nothing, right? Not a single word of PSTS would change, and WP:N would not be affected at all.
What would happen to WP:N if we decided to put WP:SYNTH on a separate page? Nothing, right? Not a single word of PSTS would change, and WP:N would not be affected at all.
I suggest to you that cutting and pasting the text of PSTS to a separate page, also marked as policy, still linked straight there by WP:N, without changing a single word of PSTS, would equally have no effect on WP:N.
I am literally asking you to tell me what could possibly change if WP:N links to these exact words: "Wikipedia articles should be based on reliable, published secondary sources and, to a lesser extent, on tertiary sources and primary sources...." either
  • in a subsection on a policy page, versus
  • at the top of a policy page.
What difference does the location of the words make to WP:N? WhatamIdoing (talk) 15:42, 20 October 2022 (UTC)[reply]
The word “viewpoint” necessarily implies a secondary source in the historiographical meaning. Jimbo’s post says that others’ secondary sources are required. SmokeyJoe (talk) 02:09, 15 October 2022 (UTC)[reply]
In practice, we don't use the historiographical meaning, and I don't think his famous comment about "viewpoints" has any connection. All he says about OR in that message is "Wikipedia is not the place for original research", and he says this at the end of this paragraph: "If your viewpoint is held by an extremely small minority, then _whether it's true or not, whether you can prove it or not_, it doesn't belong in Wikipedia, except perhaps in some ancilliary article", in a thread about (literally) whether a Wikipedia editor had proven Albert Einstein wrong about special relativity. Think about that. That is the origin of our rule against original research. It has nothing to do with the value of secondary sources. It has everything to do with crackpots making stuff up and trying to get it published as The Truth™ in Wikipedia. WhatamIdoing (talk) 05:21, 15 October 2022 (UTC)[reply]
I don’t know who is your “we”. Wikipedia should use the historiographical definitions because an encyclopedia is an historiographical work, as opposed to a science report, or journalism (the main competition).
Jimbo was responding to the late 1990s thing of many amateur physicists determined to publish their theories, anywhere. I think it diminished due to the arrival of good search engines, when they could search for their discoveries and discover that they weren’t new at all. SmokeyJoe (talk) 08:42, 15 October 2022 (UTC)[reply]
The English Wikipedia uses articles from celebrity magazines and breaking news as the sole basis for articles. Either:
  • We don't use the historiographical meaning of secondary, or
  • We don't technically require true secondary sources.
Take your pick, but don't waste your time trying to convince me that articles sourced entirely to WP:PRIMARYNEWS contain any source that a historian would, if looking back from even 20 years in the future, call a true secondary source.
If you feel like Wikipedia therefore isn't really an encyclopedia, then I won't contest your conclusion. Some may decry this and some may acclaim it, but regardless of individual opinions about whether it's desirable, it's a fact that we regularly accept articles that don't have any true secondary sources. WhatamIdoing (talk) 15:50, 20 October 2022 (UTC)[reply]
Cancer contains many primary sources. Note that source typing, primary vs secondary, is not inherent but depends on how the source is being used. SmokeyJoe (talk) 23:57, 14 October 2022 (UTC)[reply]
I didn't say that cancer doesn't cite primary sources; I said that it shouldn't. WhatamIdoing (talk) 00:35, 15 October 2022 (UTC)[reply]
It should. An article should stand alone. It needs to define things, and give examples. These go to primary sources. All pure secondary sources, all opinion not facts, like running editorials containing running commentary assuming you already know the topic, do not make acceptable articles. Articles need both facts and contextualisation. SmokeyJoe (talk) 01:58, 15 October 2022 (UTC)[reply]
A meta-analysis is a secondary source. It is, at its heart, a mathematical calculation. Do you think that meta-analyses are "opinion not facts"? Or is it your opinion that it's not a secondary source, even though multiple reliable sources say that it is?
It is common in scientific articles to source facts to secondary sources. High-school chemistry textbooks are not primary sources for facts about chemistry. WhatamIdoing (talk) 05:31, 15 October 2022 (UTC)[reply]
If it’s a standard analysis, used in its standard way, then it is neither opinion, nor a secondary source, it is just standard data processing. If the analysis is new, or its use is not standard, then the applicability and interpretations are opinion. To better do this test, can I have some real examples?
It is common in scientific articles to find all sorts of atrocious referencing and other nonsense. Wikipedia should do better than some common things. Wikipedia should never reference high-school textbooks. Among other things, it is not the purpose of a text book to be a reference work, but a teaching tool. SmokeyJoe (talk) 08:49, 15 October 2022 (UTC)[reply]
doi:10.1111/j.1471-0528.1990.tb01711.x is one of the most famous meta-analyses. The creative analysis comes in deciding which things to analyze, not in how one does the math. WhatamIdoing (talk) 15:57, 20 October 2022 (UTC)[reply]
I can’t agree that at its heart, a meta-analysis is a mathematical calculation.
“Opinion” is a simple typical example of a description of secondary source content. More generally it is anything that is transformative of the primary source information. SmokeyJoe (talk) 13:29, 15 October 2022 (UTC)[reply]
"I like dark chocolate" is an opinion, and it is not secondary material. WhatamIdoing (talk) 15:58, 20 October 2022 (UTC)[reply]
I think the question of whether WP:PSTS is both correct and good advice to newcomers needs to be resolved first. You appear to have a beef with PSTS. SmokeyJoe (talk) 02:02, 15 October 2022 (UTC)[reply]
My only concerns with PSTS are that:
  • It doesn't have much to do with editors making stuff up ("original research") and trying to cram it into Wikipedia ("Wikipedia is not a publisher of original thought"), so it doesn't belong on this page.
  • There are too few editors who understand that Wikipedia:Independent does not mean secondary.
Note that I haven't proposed changing a single word of PSTS. I just want it "physically" located on a separate page – a policy in its own right, not a subsection of a policy that doesn't mention PSTS at all outside of that one subsection. WhatamIdoing (talk) 05:35, 15 October 2022 (UTC)[reply]
“Making stuff up” is not what NOR is chiefly intended to counter; its focus is the creative combination of facts by editors.
I think your essays on source typing are excellent. I don’t know how splitting out PSTS would help there.
I think PSTS shouldn’t be split out because PSTS is the core of NOR. NOR needs to include source typing and the need to balance primary and secondary sources, as defined historiographically, as per the articles primary source and secondary source. SmokeyJoe (talk) 08:56, 15 October 2022 (UTC)[reply]
"Making stuff up" really is the focus and intent of NOR. SYNTH is all about editors making stuff up by saying "when I put this source next to that one, I get this new conclusion". The definition of NOR is "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" – in other words, "stuff made up by editors" (or, in these latter days, stuff copied by editors from obviously unreliable sources). WhatamIdoing (talk) 16:00, 20 October 2022 (UTC)[reply]
No, because at best this would be a bunch of laborious re-arranging for not much or any benefit, but also because I can see this is premised on the same erroneous 'it is not OR if any source anywhere on Earth says the same thing' POV discussed to death here and here in these very archives.
PSTS is on this page because synthesizing primary sources into a narrative is a form of original research and hence forbidden. It matters not one iota that the individual statements are supported by the primary sources. Wikipedians are to cite secondary sources, not write them (something very similar is said at WP:MEDRS, but the general principle of relying on secondary sources applies everywhere).
Do not combine material from multiple sources to reach or imply a conclusion not explicitly stated by any source. Similarly, do not combine different parts of one source to reach or imply a conclusion not explicitly stated by the source. If one reliable source says A and another reliable source says B, do not join A and B together to imply a conclusion C not mentioned by either of the sources. (emphasis added) Crossroads -talk- 02:40, 15 October 2022 (UTC)[reply]
This statement: "PSTS is on this page because synthesizing primary sources into a narrative" does not happen to be factually true. This would be clear if you had been editing back in the day, or even if you just spent all day reading the archives and stepping through the history of the policy.
PSTS is on this page because editors were fond of saying that "Wikipedia is not a primary source", and then they had to explain what that meant, and since none of the pages except NOT, which was already a mile long, said anything about this, the longer explanation ended up here.
SYNTH is wrong whether you do it with primary sources or secondary sources or tertiary sources or a combination of any of them. There is absolutely nothing about PSTS concepts that is relevant for understanding SYNTH. SYNTH could get along just fine if PSTS had never existed (NB: the same cannot be said for other policies and guidelines), and SYNTH will definitely get along just fine if the exact same words are present, complete with the exact same policy tag at the top of the page, on a separate page. WhatamIdoing (talk) 05:41, 15 October 2022 (UTC)[reply]
BTW, it's not "laborious" at all to split that section out. You can see it at Wikipedia:No original research/PSTS with a few notes from me in red. Because PSTS is not integrated into NOR, or even mentioned at all outside the one section, then removing it from NOR would take about ten seconds, and setting up the separate page took only a few minutes. WhatamIdoing (talk) 06:00, 15 October 2022 (UTC)[reply]
I don't think the history particularly matters when it comes to reasons for how things should be now. Regardless of how easy it is to move the text, one of the most common forms of OR is misuse of primary sources. That's a rationale for keeping it here. Crossroads -talk- 00:24, 17 October 2022 (UTC)[reply]
  • My primary concern with the current PSTS section is that it focuses the reader on the wrong thing… it focuses on evaluating the source, rather than evaluating the text of our articles - ie what we write, based on that source. OR does not stem from the type of source being used, but what we do with that source.
One advantage of splitting the PSTS section off into its own policy/guideline is that we could expand it… explaining HOW to use primary, secondary and tertiary sources appropriately, and HOW to avoid using them inappropriately. Blueboar (talk) 13:11, 15 October 2022 (UTC)[reply]
Interesting, Blueboar. Evaluating the source, typing the source as primary or secondary, depends on how it is being used. Can you point to an example of where editors have had trouble with this?
How to use appropriately, that is always going to be an essay. While much of the current word count could be better explained in a dedicated essay, the core point has to remain policy, surely? SmokeyJoe (talk) 13:21, 15 October 2022 (UTC)[reply]
Of course it does. That's why the core point should be moved to a separate policy page. I have absolutely no intention of "demoting" PSTS from policy status. In fact, I think it would be more accurate for you to think about this as a suggestion to "promote" it as its own, separate, stand-alone policy. WhatamIdoing (talk) 16:08, 20 October 2022 (UTC)[reply]
I endorse that reasoning, and would strongly support putting it into its own page. I'm thinking of this absolutely excellent essay, which would belong in a split-off version of this policy. Wikipedia:Identifying and using primary sources DFlhb (talk) 15:02, 12 November 2022 (UTC)[reply]
  • I don't think we should eliminate discussion of PSTS on this page, as their basic definition is essential to understanding when OR comes up, but I do think we would benefit from a guideline page to explain in more depth what these sources are, how to identify them (we have that primary essay, but there should be similar advice for all three), inclusion of what Blueboar says above, that a work can be primary for one topic and secondary for another, there's no catchall here. Perhaps there's also consideration of how that type of page would intersect with the existing WP:RS. I agree we cannot completely separate PSTS discussion from OR, but I think the matters around PSTS need more than essay pages to flesh out. --Masem (t) 13:37, 15 October 2022 (UTC)[reply]
    Can you share an example of why you really need to understand PSTS to figure out if someone's putting stuff in an article that isn't in any source?
    Indisputable NOR violations include:
    • I can't see the Earth curving, so it's flat. (No reliable source says this, so it's an OR violation.)
    • This tweet says he got married today, and this other tweet says he's in City this evening, so obviously the wedding happened in City. (Straightforward SYNTH)
    • Paul is an actor.[source saying he's not] (Assuming no other reliable sources say this, it's an OR violation.)
    I don't need to know which of these are primary, secondary, or tertiary sources to figure out that these are NOR violations. Every single one of them would be a NOR violation no matter which type of source was claimed. WhatamIdoing (talk) 16:06, 20 October 2022 (UTC)[reply]
    On your idea of keeping some basic information, NOR already has a section Wikipedia:No original research#Related policies. There are similar sections in WP:V and WP:NPOV. We could add a similar summary of WP:PSTS to that section, or use a Wikipedia:Summary style approach to shorten what's in the ==Using sources== section, with a {{Main}} link to the new policy page. WhatamIdoing (talk) 16:11, 20 October 2022 (UTC)[reply]
  • This has felt out of place for a while. They are related, but not any more than WP:V and WP:NPOV. WhatamIdoing is right that we can link to related policies and keep a short explanation of whatever is relevant. Shooterwalker (talk) 02:16, 21 October 2022 (UTC)[reply]

Credit tally

High-quality source X says Actor Y made films (plural, number unspecified) for Studio Z. Within the context of a given Wikipedia article, the specific number is pertinent. Is tallying up the relevant credits in IMDb (or an authoritative print filmography) to specify the number a routine calculation or original research? 24.90.253.80 (talk) 00:40, 25 October 2022 (UTC)[reply]

Can we sidestep the question and instead provide a list of the films? That is, avoid saying Joe Film made three films for Studio Z and instead write something like Joe Film made several films for Studio Z: Amazing Alice, Bob's Business, and Carl v. Carol. This could be awkward if the list is very long, but there is a lower risk of an OR challenge from it.
If you need a number, then it's usually okay to find a filmography that lists the films and count them up. Searching in different places to find all the films you can carries a bigger risk (both in terms of policy compliance and in terms of getting the wrong answer, which would be a very big problem). WhatamIdoing (talk) 06:16, 31 October 2022 (UTC)[reply]

directly related to the topic of the article

Can someone point me to the content in the body of this policy that justifies the bolded wording? I see no need for it.

To demonstrate that you are not adding original research, you must be able to cite reliable, published sources that are directly related to the topic of the article and directly support[a] the material being presented.

It makes sense to guard against coatracking "off-topic" content into an article, but what does the wording above have to do with OR, rather than just to off-topic content? The essence of OR is "content not based on RS". It is another matter when unsourced or reliably-sourced content is placed in the wrong article. It just doesn't belong there. -- Valjean (talk) (PING me) 23:57, 5 November 2022 (UTC)[reply]

I have seen cases (but can't recall) where editors use a whole host of RSes to come to a conclusion about a topic where none of those RSes actually make that claim directly, typically trying to claim some statement must be included by way of analogy, or often in cases of controversial material that is not seen as controversial by RSes, by pointing out analogies of other cases or other types of faulty logic to make their case. That falls out of the WP:SYNTH aspect, which is its own part of the OR policy. Masem (t) 00:03, 6 November 2022 (UTC)[reply]
Yes, that is an abuse of sources that can certainly be OR. SYNTH is one type of such abuse. I see that as related to our reasonable requirement that sources must "directly support the material being presented." I'm referring to something else. -- Valjean (talk) (PING me) 00:37, 6 November 2022 (UTC)[reply]
I've never liked this sentence as I don't think its meaning is clear. The meaning of "directly supports" is crystal clear and at the heart of NOR. But how can a source "directly support" a statement yet not be "directly related" to it? The only times I've seen "directly related" employed in a content dispute is by someone who argues that even though a statement is explicitly provided by a source, the source as a whole concerns another topic and so isn't "directly related". I think this is a misuse of the sentence, but what is an example of a proper usage that wouldn't be equally served by just having "directly support"? Zerotalk 01:39, 6 November 2022 (UTC)[reply]
A hypothetical example would be that someone would want to argue that specific actions Russia has done are war crimes, by way of citing numerous academic sources that point out that other similar acts in past wars were considered war crimes (directly supporting the information), but not a word from RSes that state that Russia's acts are also considered war crimes. The editor is creating inappropriate OR: while the material directly supports the information, it does not directly reference the topic. Masem (t) 02:06, 6 November 2022 (UTC)[reply]
I would say that's a case where the source does not directly support the statement, but only provides a basis for an argument. The statement "Russia committed war crimes" requires the argument part, which would be OR. Zerotalk 03:49, 6 November 2022 (UTC)[reply]
I would find that there are editors that would state that saying "here's all these RSes that said if a country did X those are war crimes" to justify "Russia doing X is a war crime", justifying that the RSes talking about war crimes are "directly related", presuming we're talking about an article like the Ukraine-Russian war. The lack of any source to connect "Russia doing X" to being a war crime is certainly a basis of argument, but I've seen editors try to apply this logic on other topics. I know this is all covered by the principle of SYNTH, but what I'm seeing is that the lede is trying to briefly capture the SYNTH section. Masem (t) 15:25, 6 November 2022 (UTC)[reply]
As a second hypothetical, one that also reminds me of what I've seen before, say we have a high quality RS that is a focus on a person X, likely a critique of their political or ideological position, which is 100% valid to use on the article about X. But within that we get a line like "Like Y, X shares (this view)." where Y is a different person that is only mentioned briefly in that context. In that case, that RS would not be sufficient to use to justify "Y has (this view)" on the article page about Y because the article, while mentioning Y, is not directly about the topic. Masem (t) 15:49, 6 November 2022 (UTC)[reply]
@Masem: I agree that we should devalue sources that support some text only by passing mention. Once I had a dispute about the use of a historical claim made in passing in a newspaper cooking column. The question here is whether the words "directly related" in the policy are intended to indicate this issue. If so, it isn't clear enough and needs expanding on rather than relying on editors to grasp the proper intention of those two words. Zerotalk 00:08, 7 November 2022 (UTC)[reply]
That's my point: You go to the body in the SYNTH and the "directly related" language is right there. It is not like that is magically appearing out of nowhere here. Masem (t) 01:49, 7 November 2022 (UTC)[reply]
User:Masem, that exact language is not found at the SYNTH section. The lead is the only place where that wording appears. Maybe you're thinking of some synonyms that mean the same thing? If so, please quote them here. -- Valjean (talk) (PING me) 15:17, 8 November 2022 (UTC)[reply]
Quoting from SYNTH ""A and B, therefore, C" is acceptable only if a reliable source has published the same argument concerning the topic of the article." Masem (t) 15:40, 8 November 2022 (UTC)[reply]
Okay, now I see what you mean. The wording in the lead, which is what I'm discussing, covers two things, whereas the SYNTH wording discusses only the first of the two.
LEAD: "sources that are directly related to the (1) topic of the article and directly support the (2) material being presented."
SYNTH: "source has published the same argument concerning the topic of the article."
Their mentions seem to be about very different topics. LEAD is about "related to the topic" and SYNTH is about the "same argument concerning the topic." The first is a meta aspect and the second is a very specific aspect. The first is about "Trump" in an article about Trump with no regard to any specifics, IOW the source must mention Trump. The second is about an intricate argument within the article about Trump, IOW the source must mention Trump and connect him to the argument about him. Slightly related, but not always.
The source should support the argument, and that requires it already is related to the topic of the article, IOW the part about "related to the topic of the article" seems superfluous. -- Valjean (talk) (PING me) 22:20, 8 November 2022 (UTC)[reply]

Suppose there is a Wikipedia article about Russia and consider the following case.

Russia targeted civilians in the war.[1] Targeting civilians is a war crime.[2]

where RS [1] is about Russia and RS [2] is not about Russia, yet it directly supports its sentence. This would be OR because it implies that Russia committed war crimes without an RS saying so. Having the phrase "published sources that are directly related to the topic of the article" would prevent this OR. Whereas just having the phrase "directly support[b] the material being presented" would allow the OR because the RS [2] directly supports the material that it is associated with, which is the sentence, "Targeting civilians is a war crime." Bob K31416 (talk) 09:47, 6 November 2022 (UTC)[reply]

That example is a textbook case of SYNTH. Ref [1] is a good source for the first sentence and ref [2] is a good source for the second sentence. Neither is problematic in isolation. However, the juxtaposition of the two sentences is clearly intended to tell the reader that Russia committed war crimes, which is not directly supported by either source. Zerotalk 10:08, 6 November 2022 (UTC)[reply]
I'm not sure whether you are agreeing or disagreeing with what I wrote. Could you explain more. Bob K31416 (talk) 10:42, 6 November 2022 (UTC)[reply]
In your example, the (unstated but clearly intended) conclusion that Russia committed war crimes is NOT directly supported by either of the sources. So this use of sources runs afoul of the "directly supports" rule (not to mention the SYNTH rule). It is not a case where the "directly related" part makes a difference. I'll posit that there is no case where the addition of "directly related" to the policy outlaws anything that is not already outlawed and, moreover, that the concept of "directly related" is too vague to be useful. Zerotalk 10:54, 6 November 2022 (UTC)[reply]
I think my original message refutes what you are saying, so I'll leave it at that, except to say that "published sources that are directly related to the topic of the article" has been a part of the policy's lead for at least 14 years and I have found it useful for understanding the policy. Bob K31416 (talk) 11:07, 6 November 2022 (UTC)[reply]
Bob K31416, your example is an excellent demonstration of SYNTH. That part of NOR is good and explains how one type of source abuse is covered by NOR. There are other types of source abuse that cause other problems. -- Valjean (talk) (PING me) 16:05, 6 November 2022 (UTC)[reply]
Val. Is this question, related to an ongoing discussion at Donald Trump's talkpage? GoodDay (talk) 15:18, 6 November 2022 (UTC)[reply]
User:GoodDay, it is triggered by that discussion, but because it is more of a policy question that has implications everywhere, I chose to discuss it here. We can't change policy at Talk:Donald Trump. If this results in a change that will affect that discussion, then we can deal with it there. -- Valjean (talk) (PING me) 16:00, 6 November 2022 (UTC)[reply]

Okay, let's approach this from a slightly different angle. Would we lose anything by eliminating that phrase? In what situation is that phrase actually necessary for THIS policy?

To demonstrate that you are not adding original research, you must be able to cite reliable, published sources that are directly related to the topic of the article and directly support[b] the material being presented.

How's that? I don't see that OFF-TOPIC is directly related to this policy. It's just off-topic and should not happen. Not all forms of source abuse are NOR. -- Valjean (talk) (PING me) 16:11, 6 November 2022 (UTC)[reply]

Perhaps this is the only policy level P&G that I know of that warns about using off-topic sources to try to justify content in articles. It is nutshell'ing this line in the body " "A and B, therefore, C" is acceptable only if a reliable source has published the same argument concerning the topic of the article." Masem (t) 16:14, 6 November 2022 (UTC)[reply]

It's explaining WP:SYNTH. We should not combine sources about A (the topic of the article) and B (sources not talking about the topic of the article) to imply C (some claim that B is somehow relevant to the article topic). Crossroads -talk- 23:06, 6 November 2022 (UTC)[reply]

This particular phrase, as written, does not explain SYNTH, though based on the archived discussions, I think it might have been intended to.
What it actually says is that editors shouldn't use sources that aren't about the subject of the article (e.g., do not cite medical journals while writing Box office, even if a journal article mentions box offices; do not use film industry magazines while writing SARS-CoV-2, even if a magazine article mentions that virus). A specific warning to "generally" avoid passing mentions was added around the same time. That line probably belongs to Wikipedia:Neutral point of view#Balancing aspects ("if all you can find is a passing mention, it probably doesn't belong in the article"), not to NOR anyway.
As these points are made elsewhere, and as the application is more general than absolute (e.g., if you are writing a sentence about the effect of pandemic lockdowns on movie theaters, you might cite a variety of sources about lockdown effects, and not exclusively sources that are primarily about Box office or SARS-CoV-2), I don't think that the words "are directly related to the topic of the article and" truly need to be in the first paragraph of the policy. SYNTH will still be 100% banned even if those exact words aren't in the lead. I am slightly inclined to remove those words, for less confusion and more concision. This should be understood as changing the wording but not the meaning of the overall policy. WhatamIdoing (talk) 02:19, 13 November 2022 (UTC)[reply]
I agree. I don't think it adds anything and also agree that nothing is lost by deleting those words. -- Valjean (talk) (PING me) 02:45, 13 November 2022 (UTC)[reply]

While I acknowledge the technical arguments for removing that text (in addition to keeping the text shorter, removing the text makes the policies more composable), I think there are practical benefits to keeping it. Synthesis is one of the more insidious challenges we have when building a neutral encyclopedia and whatever we can do to briefly explain our long-standing position to editors is beneficial. I agree that we should not bloat our policies yet IMO the clarity provided to our editors for this particular issue outweighs the minor loss of conciseness. Orange Suede Sofa (talk) 04:15, 14 November 2022 (UTC)[reply]

?? SYNTH is untouched by this. -- Valjean (talk) (PING me) 04:39, 14 November 2022 (UTC)[reply]
@Orange Suede Sofa: From the discussion here, it is clear that even highly experienced editors cannot agree on the purpose of those words. So far from clarifying anything, the evidence is that they are more confusing than helpful. SYNTH is far better described by its own section. Zerotalk 06:02, 14 November 2022 (UTC)[reply]
Bingo! The SYNTH explanation is good. We are not supposed to abuse sources by making content not backed by those sources. -- Valjean (talk) (PING me) 06:37, 14 November 2022 (UTC)[reply]
  • OK, I have a "directly related" question. Take a case where information changes over time. Take theory X which was originally, widely viewed as not true but was later found to be true. How do we deal with a case where a person/organization is declared by RSs to be wrong for supporting the theory but later RSs don't reverse that claim when the new information comes out? How does "directly related" apply? Consider Mr Smith's BLP says he was wrong when he claimed X. This is cited to sources that directly make the claim. A few years later understanding shifts on the topic. We don't have new sources saying "Mr Smith turned out to be correct". What should be done? One option would be to remove the accusation. That might be OK but it kind of buries that the person was publicly declared wrong. Essentially this would be saying the original RSs are no longer due because they are not accurate. Another thing that might happen is an editor says "Mr Smith was found to be correct" [source that says theory is true but doesn't mention Smith]. My feeling is this option is synth since the source didn't say "Smith was right". A third option is to simply state "Theory X has since been found to be correct [source stating X is true]". This is true to the sources but opens the question, is this a form of synth since it clearly implies Smith was right even though no sources state that. However, the actual claims are true to the sources and facts about the theory are, in my view, DUE when they relate directly to the nexus of Smith and the theory. Would removing "directly related" change this? Springee (talk) 13:50, 14 November 2022 (UTC)[reply]
I would say that in the scenario you lay out, a source saying that Mr. Smith is wrong should be considered obsolete… and thus no longer reliable except as a primary source for saying “X thought that Smith was wrong”. However… I would also argue that mentioning X’s obsolete opinion is UNDUE, unless that opinion is noted by more modern sources. Thus, the correct action is to omit the discussion of Mr. Smith’s rightness/wrongness altogether. What people in the past thought of him is now irrelevant. Blueboar (talk) 15:03, 14 November 2022 (UTC)[reply]
  • The issue is that SYNTH is widespread, and we should include clarification of it in the lead of this policy to remind editors to avoid it. Removing the phrase would IMO result in more cases of SYNTH popping up across the encyclopedia, which would just waste editor time. I strongly support keeping it if the choice is binary between keep or remove, but I'd also support removing and putting a clearer explanation of SYNTH in the lead. DFlhb (talk) 14:07, 14 November 2022 (UTC)[reply]
    It won't have any effect on SYNTH, and it's not representing SYNTH anyway. The words being discussed say "directly related to the topic of the article".
    What that says is: Please go revert your edit today to Alt-lite, because the source you cited is "directly related" to reviewing a couple of books about Anti-fascism, which means that the source is not directly related to the subject of Alt-lite. Similarly, a bunch of sources in your edit to Mike Cernovich are "directly related" to other subjects and only mention him in passing, or not at all (example), so you should go revert that, too.
    If you care about SYNTH, you should be looking at the immediately previous sentence, which says "This includes any analysis or synthesis of published material that serves to reach or imply a conclusion not stated by the sources." WhatamIdoing (talk) 21:15, 14 November 2022 (UTC)[reply]
    Being "wikistalked" (I'm kidding obviously!) by an editor I respect is an honor :)
    Very fair point. That exact thought actually occurred to me as I edited these articles after posting here. I see the limitations of my reasoning: there are many ways to use sources that aren't directly related, yet are used in a SYNTH-compliant way. I've reviewed the discussion on Donald Trump about the Iran plane thing (that prompted this) and there are likewise no issues with how the sources are used there.
    I support removing the "directly related" passage. DFlhb (talk) 22:07, 14 November 2022 (UTC)[reply]
    Thank you for the compliment, and also for reviewing that discussion, which I couldn't make myself read completely. WhatamIdoing (talk) 06:27, 15 November 2022 (UTC)[reply]

Talk:Trump

There is a discussion at Talk:Donald Trump#Airliner shot down that may benefit from editors familiar with this policy of No original research. Bob K31416 (talk) 10:05, 6 November 2022 (UTC)[reply]

"Secondary sources" extremely questionable!

The penetrating and annoying calls for "secondary sources" overlook the fact that these - if not illegally copied from some encyclopedia - are usually written by non-specialist journalists, and all too often with very little understanding and misleading interpretation of the facts. Perhaps you should take a look at the book "Factfulness". Hans J.J.G.Holm 2A02:8108:9640:1A68:31D1:F57B:3DCE:D718 (talk) 09:18, 22 November 2022 (UTC)[reply]