
Wikipedia talk:Verifiability

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 66.102.198.179 (talk) at 04:35, 1 June 2010 (→‎Independence: proposed amendment (re-tabled): 3 fatal flaws). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Journalists' blogs on their newspaper's web site

Could someone clarify the status of such blogs as reliable sources? Footnote #4 in the policy says "Some newspapers host interactive columns that they call blogs, and these may be acceptable as sources so long as the writers are professionals and the blog is subject to the newspaper's full editorial control. In March 2010, the Press Complaints Commission in the UK ruled that journalists' blogs hosted only on the websites of news organizations are subject to the same standards expected of that organization's print editions (see Plunkett, John. "Rod Liddle censured by the PCC", The Guardian, March 30, 2010)." To me, that says that a professional journalist's blog, posted on a UK newspaper's web site, is reliable. Is that incorrect? —Preceding unsigned comment added by Momma's Little Helper (talk • contribs) 13:42, 18 May 2010 (UTC)[reply]

This is a good question but will probably get more attention at WP:RSN (under a separate topic maybe). — e. ripley\talk 17:16, 18 May 2010 (UTC)[reply]
This is not a question about a particular source, but a request for clarification regarding a general policy. I appreciate SlimVirgin's response, below, which is my understanding as well. Momma's Little Helper (talk) 23:59, 18 May 2010 (UTC)[reply]
Yes, it's a reliable source because a professional journalist's blog on a newspaper website is just another word for a column. It's not a personal weblog. SlimVirgin talk contribs 17:17, 18 May 2010 (UTC)[reply]
Just to be clear, at least some of the genesis for Momma's Little Helper's question stems from questions about whether an article published in Yediot Aharonot about a living person is reliable or not (specific discussion here). Supporters of including the information argue that it should be included because some newspaper columnists from accepted reliable sources have written blog entries that summarize/recount the original article, but which contain no new reporting themselves. I contend that in that instance, a blog entry that does nothing more than describe another publication's story can't be used as a WP:RS. — e. ripley\talk 17:23, 18 May 2010 (UTC)[reply]
Of course some reliable sources are better than others. Have modified the Newspaper and magazine "blogs" section to reflect the source about the PCC ruling more accurately – note that The Spectator is a magazine, not a newspaper, and that the PCC expects the same standards in newspaper and magazine blogs that it would expect in comment pieces that appear in print editions. Not as useful as news pieces, but usable as comment pieces, presumably with in text attribution. . . dave souza, talk 19:04, 18 May 2010 (UTC)[reply]
Newspaper blogs aren’t directly comparable to columns in print. The difference for our purposes is in the lack of editing. As a rule of thumb: Anything in a print newspaper is reviewed by at least two people after the writer and before publication, but newspaper blogs are not edited.
See this article from the American Journalism Review (Dec 06 – Jan 07):
  • "Most newspaper blogs are self-edited."
  • "Nobody edits what I write before it goes online." – Daniel Rubin, full-time blogger at the Philadelphia Inquirer
  • "Some newspapers try to edit at least some of their blogs." Maurreen (talk) 06:55, 19 May 2010 (UTC)[reply]
Your last point seems to be the case in the UK, per the footnote (now part of the article itself) I quoted: "journalists' blogs hosted only on the websites of news organizations are subject to the same standards expected of that organization's print editions". Momma's Little Helper (talk) 13:38, 19 May 2010 (UTC)[reply]
This is all important to know, but we're missing a larger point: Even if a newspaper blog doesn't go through a proofreader or copy editor (which blogs at my publication did) or a top editor, they still carry the imprimatur of the newspaper or magazine. Professional journalists of high-caliber, WP:RS periodicals write to a necessarily high, professional standard and know that whatever they write is going to represent the newspaper or magazine.
I agree with Maurreen that an edited print column is a higher-quality source. I'm not sure there's a practical difference between a newspaper/magazine print column or online column a.k.a. blog, though: When you're talking about, say, the real-estate blog of Jay Romano of The New York Times, whether there's a top editor on that or not, it's still going to be up to the professional level of The New York Times. --Tenebrae (talk) 00:02, 25 May 2010 (UTC)[reply]

Draft new ATT proposal

For anyone interested, based on recent discussions here and on WT:NOR, I have started a draft version of a new proposed ATT policy under my user page, which includes the current versions of WP:V, WP:NOR and some material from WP:ATT. It is very much a work in progress, but comments would be appreciated on its talk page. Thanks, Crum375 (talk) 00:29, 24 May 2010 (UTC)[reply]

I'll take a look.
Some advice... if you really are serious about moving this forward... SHOUT it to the rooftops... post notice after notice about it in every venue you can... repeatedly. Keep people updated on the progress with yet more notices. One of the more common comments about the old ATT page was "I never knew this was in the works"... even though several hundred editors contributed to the drafting, a lot of people were still taken by surprise when it actually went live. You have a chance to avoid that error this time around.
Also, please remember that people have become attached to the existing policy pages... especially the core policies that they cite every day. You can expect knee-jerk opposition to the idea of a merger, even if the proposed policy says exactly the same thing using the same language. Take this opposition seriously.
I wish you luck, and offer my support. Blueboar (talk) 00:53, 24 May 2010 (UTC)[reply]
As far as SHOUTING, I fully agree, but it only makes sense to do so when there is a shoutable product. At the moment, it's just a first draft, and it can benefit greatly from comments and ideas. Thanks for the support. Crum375 (talk) 01:02, 24 May 2010 (UTC)[reply]
Offering my support as well. I'll look at the draft in detail later, but at first glance it seems to be going in very much the right direction.--Kotniski (talk) 06:37, 24 May 2010 (UTC)[reply]
Thank you. Any input would be much appreciated. Crum375 (talk) 14:51, 24 May 2010 (UTC)[reply]

Independence: proposed amendment (re-tabled)

In an earlier thread I proposed making the following change to the wording of WP:SOURCES:

Articles should be based upon reliable, third-party published sources with a reputation for fact-checking, accuracy and independence.

Independence means that a source is free from pressures associated with a strong connection to the subject matter (such as, but not limited to, family relationships, close political affiliation, business dealings or other beneficial interest) that may compromise, or can reasonably be expected to compromise, the source's reputation for reliability.

At the time, most of the objections to this amendment were in relation to using autobiography as a source in articles. Having thought about this issue, I am not against using autobiographical sources, but I realise that they are a potential minefield, in the sense that they are a form of self-published sources, and for that reason, are not strictly reliable in any case. For instance, if I were to quote an autobiographical source in an article about a living person, I would do so in a way that made it absolutely clear that I was doing so (e.g. "XYZ said in his book that...") to alert the reader to the fact that a person speaking about their own life may not be the unbiased source of information about events that affected them.

I feel that we should revisit this proposal, because independence is an important principle in the real world, that when compromised, can have catastrophic effects (the Enron Scandal comes to mind in this regard) for those who regard reliable sources as an important form of external verification. It seems to me that independence and reliability are two vital characteristics of high quality sources, and to ignore one or the other would fatally compromise this policy on verifiability. Would anyone care to support this proposed amendment? --Gavin Collins (talk|contribs) 08:54, 25 May 2010 (UTC)[reply]

This works fine for most articles... but not all. How would the proposal affect an article that is about the beliefs and practices of a specific group... say a religious group. I would think that the best (most reliable) sources would be those from within the religious group itself... sources that can speak with authority as to what the beliefs are (or are supposed to be). In other words, there are at least a few situations where articles should be based primarily on dependent sources. Blueboar (talk) 13:06, 25 May 2010 (UTC)[reply]
I agree. There are many situations where the best and most knowledgeable sources are the ones directly involved. We should of course strive to present, based on the reliable sources, all prevailing views about the topic, per WP:NPOV, but the majority of the sources and the highest quality ones may sometimes be closely related to the subject matter. Crum375 (talk) 13:52, 25 May 2010 (UTC)[reply]
I agree, but the problem is not solved by turning down this proposal, because "independence" and "third-party" mean almost the same thing. So by leaving "third-party" the problem remains. Jc3s5h (talk) 14:00, 25 May 2010 (UTC)[reply]
Also agree with Blueboar. Independence is important, but it's not a key criterion for inclusion of all information. We have guidelines indicating when it's appropriate to use self-published sources as sources on themselves; writing this into verifiability would call for eliminating that practice entirely, which would be highly problematic. —chaos5023 (talk) 14:01, 25 May 2010 (UTC)[reply]
I gather that this idea stems from recent discussions at WP:NOTE... I fully agree that notability needs to be established through independent sources (to show that someone other than those directly involved have taken note of the topic)... but that is a different issue from what we are talking about here. As a general rule, independence is good in a source... but not always. It depends on the topic. Blueboar (talk) 14:03, 25 May 2010 (UTC)[reply]
Yes. Articles should be "based" on third party sources (and also not on primary sources) to establish notability, but we don't require all sources, or even most sources, to be third party. To add the word "independent" and delve into its definitions would create a false impression. Crum375 (talk) 14:31, 25 May 2010 (UTC)[reply]
I have to disagree with this proposal. For some topics, particularly historical type stuff, it works great. I agree on Blueboar's concern, that while independence is important, it is not a requirement for all information that should be in an article, and would be concerned on how it could affect some topical areas. And how does one determine if something has a "reputation for independence"? I've never seen a source that does, really. Almost every media outlet is considered "biased" by some party or another, even academics. It wouldn't leave much sourcing available. -- AnmaFinotera (talk · contribs) 14:26, 25 May 2010 (UTC)[reply]
In answer to Blueboar, at present this policy says that "articles should be based upon reliable, third-party published sources with a reputation for fact-checking and accuracy". Although it is silent on the role of primary sources, essentially this policy says that articles should be based on third party, not primary sources alone. This point is the subject matter of Wikipedia:No original research#Primary, secondary and tertiary sources. That does not mean you can't use primary sources, or sources which are not strictly reliable such as self-published sources, but what WP:BURDEN makes clear is that without third party (secondary & tertiary) sources, a topic should not have its own article. I think Blueboar is wrong to say that there are a few situations where articles should be based primarily on dependent sources - my reading of policy says that can never be the case.
My proposal goes beyond the issue of weight that should be given to primary sources, to what makes a third party source reliable: if a source is not independent, then it is not reliable either: they are two sides of the same coin. Simply put, a lack of independence may compromise, or may reasonably be expected to compromise, a source's reputation for reliability. --Gavin Collins (talk|contribs) 15:01, 25 May 2010 (UTC)[reply]
That's not what our policies say. Per WP:PSTS, an article may not be based purely on primary sources. And per WP:N, an article must be based on reliable third party sources to establish notability. But as I noted above, "based on third party sources" does not mean that all sources, or even most sources, should be third party. And in many situations, the best and highest quality sources are connected to the subject matter. So in summary, we use third party sources to establish notability for the article's subject, and may use any reliable sources to fill in the details, per WP:V, WP:NOR and WP:NPOV. Crum375 (talk) 15:04, 25 May 2010 (UTC)[reply]
Gavin.collins is creating a useless tautology. First he defines independence as, in part, "a source is free from pressures...that may compromise, or can reasonably be expected to compromise, the source's reputation for reliability". Then he states "if a source is not independent, then it is not reliable either". That holds only if you are using Gavin.collins' definition of independence. Many sources may be closely connected to subject matter, but still have a strong reputation for accuracy and fact checking. An example would be the United States Supreme Court. These sources might be biased, or may be constrained to work within a biased framework, yet can still be reliable. Jc3s5h (talk) 15:25, 25 May 2010 (UTC)[reply]
I believe that editors should normally and automatically evaluate sources for independence—because someone with a significant conflict of interest really isn't a "third party"—but I think that explicitly spelling it out on this page might be just a bit WP:CREEPy. WhatamIdoing (talk) 18:01, 25 May 2010 (UTC)[reply]
If editors automatically evaluate sources for independence, then should we not say so in this policy? --Gavin Collins (talk|contribs) 21:34, 25 May 2010 (UTC)[reply]
There are at least three problems with this suggestion ... The 1st, and most important, is the term "compromise" which potentially could lead to libel lawsuits against WP if a talk page says a source's "reliability" (which could be considered synonymous with "personal integrity") has been compromised. The 2nd is the fact that the vast majority of subject matter experts are directly connected to their subjects of expertise by vocation and few outside of certain vocations even understand the subject matter. The 3rd is that nearly ALL modern published sources are constantly subject to economic pressures which determine what subjects and what perspectives of said subjects will be allowed (funded) to go to press. 66.102.198.179 (talk) 04:35, 1 June 2010 (UTC)[reply]

Self-published

Also related to the same conversation at WP:N, we now have an assertion that www.coca-cola.com, published by Coca-Cola, Inc., is not self-published. I think the rationale is that multinational corporations are too big to be capable of self-publishing a website.

Would anyone object to adding "corporate websites" to the list at WP:SPS? Or perhaps it would be more appropriate to provide a definition of non-self-published (i.e., something with both editorial independence [from the business side] and editorial control [of the reporters]). WhatamIdoing (talk) 17:54, 25 May 2010 (UTC)[reply]

A sufficient majority of the editors who regularly edit source-related policies and guidelines insist on talk pages (but do not reveal in the guidelines and policies themselves) that "self-published" refers to the actions of a single individual, or small band of individuals. Self-published, in the minds of this cohort, cannot refer to large organizations. The guidelines and policies have been written with this covert definition in mind. Thus, a change in the definition to include large organizations requires a serious re-write of all the source-related policies. Jc3s5h (talk) 18:00, 25 May 2010 (UTC)[reply]
"Self-published," as the policy makes clear, refers to individuals or small unprofessional groups publishing things like personal blogs. It doesn't refer to The New York Times or Coca Cola publishing material about itself. SlimVirgin talk contribs 18:03, 25 May 2010 (UTC)[reply]
Then it's (past) time to do that serious re-write. Corporate websites are basically advertising published by and for the corporation, with zero oversight by any other party (much less an independent one), and it is stupid to pretend that advertisements aren't published at the direction, cost, and discretion of the advertiser. (And just who do we think publishes coca-cola.com, if not Coca-Cola, Inc.? Martians? The US government?)
(And if anyone claims that the newspaper ads BP is running are properly published in the same sense that the news stories next to them are, I might actually scream.) WhatamIdoing (talk) 18:13, 25 May 2010 (UTC)[reply]
"Self publishing" in the Wikipedia sense does not refer to an organization publishing its own material, because that definition would include virtually all sources, including the New York Times. By "self publishing" we mean individuals or small groups who have no or minimal oversight levels. Coca Cola and other large corporations are extremely careful in what they publish, and have many vetting layers, so they are clearly a reliable source, though normally a primary one. This is clearly covered in WP:SOURCES and WP:SPS. Crum375 (talk) 18:19, 25 May 2010 (UTC)[reply]
No, it wouldn't. Stuff NYT published about itself (e.g., its subscription rates) should be considered self-published. Stuff it publishes about others should not. There is no oversight on a company's own website: Only Coca-Cola itself decides what goes on coca-cola.com.
And again: If Coca-Cola, Inc. isn't the legal person who is publishing that website, then just who is? WhatamIdoing (talk) 18:27, 25 May 2010 (UTC)[reply]
I think you have your terminology confused. The NYT does not magically become "self published" when it describes itself, and not "self published" for other topics. Its vetting mechanisms, as well as the physical publishing and printing mechanisms, are substantially the same, regardless of what they publish. The difference is that if they describe themselves, they become a "primary source" for that information, since they are directly involved in the topic being covered. This is true for all sources: if they describe a topic they are involved in, they are a primary source for it, otherwise they are a secondary source. Therefore, Coca Cola or other corporations are a primary source when describing their products. They can be very reliable, but they are still primary. Please read WP:SOURCES and WP:PSTS for more info. Crum375 (talk) 18:33, 25 May 2010 (UTC)[reply]
I don't have my terms confused: I'm using the standard, plain-English definitions, and I'm arguing that Wikipedia should, too.
I do think that you've never worked for or had any experience with a newspaper or other traditional form of media. Here are some facts that you don't seem to possess: Even at The New York Times, a distinction is made between advertisements (including self-advertisements) and editorial content. Circulation departments do not employ journalists to draft their published subscription terms. Publications like advertising rate cards are not considered editorial content and are completely exempt from journalistic editorial control: No editor named on the masthead at NYT has any authority over what the ad department publishes about the ad department's policies or fees.
I fully agree that coca-cola.com is a primary source for statements about Coca-cola, Inc and its products. The website is also a self-published source, because it is published by Coca-Cola, Inc. about Coca-Cola itself. Nobody except Coca-Cola, Inc. controls the publication of their corporate website; it is therefore self-published. The classic classification system (primary, secondary, and tertiary sources) does not include a fourth point called "self-published". Whether something is self-published or not is an entirely separate consideration from whether it is a primary, secondary, or tertiary source. To give a simple example, ISBN 9781556435218 (a dictionary) is (1) a tertiary source and (2) self-published. (The same can be said, BTW, for Noah Webster's works.) WhatamIdoing (talk) 20:09, 25 May 2010 (UTC)[reply]

(edit conflict)

One might reasonably argue that self-published speaks to the lack of a discernable and functional distinction between the author and the publisher. If one is employed by or otherwise subordinate to the other then they effectively have a single identity and consequently reduced reliability. The NYT editorials should be considered self published. "Staff writer" items about the NYT should too. Byline attributed articles by individual journalists gain some degree of additional credence, and very reputable journalists may rise to be arguably independent. One simple test is whether they are syndicated to papers that are not under the same ownership. There is however no case in which one would expect Coca-cola's advertising agency to publish material that Coca-cola was opposed to. That rather is why it is called an agency. LeadSongDog come howl! 20:25, 25 May 2010 (UTC)[reply]
(ec) When we say on Wikipedia "self published", we mean individuals or groups with no (or minimal) vetting layers, publishing content. When it comes to larger organizations with established formal mechanisms for accuracy checking and multiple individuals involved in content vetting, they are no longer "self published" by our definition. In any case, there are two independent criteria for source evaluation: reliability, which depends on vetting layers and reputation for accuracy checking, and type, primary vs. secondary, which depends on whether the published material is related to the source. To determine if a particular source is appropriate in a given situation, we need to decide its reliability and whether it's primary or secondary. A large corporation is a good source for technical details about its products, for example, but not an acceptable source for their notability, nor how they compare to competing products. Crum375 (talk) 20:31, 25 May 2010 (UTC)[reply]
Just to keep the pot boiling, what about audited accounts? Johnbod (talk) 19:48, 25 May 2010 (UTC)[reply]
It's obviously a primary source for the kind of statements we might be making (e.g., about a company's revenue during the previous year). I believe it would be better described as self-published than as non-self-published: It's usually part of a larger report (that the auditors don't sign off on), and it is the company itself that makes the audited statements available to the public. WhatamIdoing (talk) 20:09, 25 May 2010 (UTC)[reply]
That seems a very specious argument to me; it is obviously verifiable that it is a secondary source (by the auditors), and the auditors do review the whole report, although I accept there is a difference. It is of course also normally the only source for any figures in it; when a report is on the web there is little benefit in instead referencing a newspaper report for a simple figure, which will be reported without the extra detail in the notes to the accounts & so forth. In the same way newspapers are apt to ignore the differences between the audited and unaudited parts of the report when reporting. Obviously no professional analyst etc would prefer to take their figures from a newspaper report than from the actual accounts, and there are reasons for this. Johnbod (talk) 10:14, 26 May 2010 (UTC)[reply]
Self-published is not a synonym for unreliable. I have not said that the audited statement (or the annual report that contains it) isn't reliable. I have said that it is a primary source and that it is published by the author (the company), not by someone else (an independent/third-party publisher, or the CIA, or Martians). WhatamIdoing (talk) 17:17, 26 May 2010 (UTC)[reply]
It is part-written (most of the text in fact), audited, and if necessary re-written by the auditors (the only ones to sign the audited part), who are required by law to be independent, and published as a legal requirement. The authors of few other sources are required under pain of heavy legal penalties to be independent. Johnbod (talk) 19:18, 26 May 2010 (UTC)[reply]
Let's try this in smaller steps, with an example:
  1. Who wrote Enron's 1999 annual report?
  2. Who published Enron's 1999 annual report?
  3. Are the answers to Question 1 and Question 2 better described as being "the same" or "different"?
    • If "the same", then it is self-published.
    • If "different", then it is non-self-published.
This really isn't that hard: If the entity that writes (part or all) of something is the same entity that publishes it, then it is self-published. The fact that Enron hired (supposedly) independent auditors to help write, or at least sign off on, parts of Enron's annual report, which was clearly published directly by Enron, does not mean that Enron is not, in both legal and practical terms, both an author and the sole publisher of its 1999 annual report.
NB that "self-published" is not a fancy way of spelling "unreliable". It is a concise way of spelling "the author is the publisher". Self-published things are frequently quite reliable. WhatamIdoing (talk) 20:55, 26 May 2010 (UTC)[reply]

I would say that for the sake of WP, a company's own website is not self-published, as in practice we treat self-published sources and primary sources (such as companies' web sites) differently. Most self-published sources are a synonym for "You probably shouldn't use this", while primary sources are mostly a synonym for, "You can use this, but be careful with it." Angryapathy (talk) 20:45, 25 May 2010 (UTC)[reply]

I think the point is that when we say "self published", we mean that the individual human author publishes the material himself, with no formal vetting layers. Once multiple other independent people (even if employees of the same company) are involved in vetting the material, in some formal mechanism, it is no longer a "self" project. Crum375 (talk) 21:05, 25 May 2010 (UTC)[reply]
Crum, you have just defined The Mulberry Advance (a respectable, traditional, dead-tree, small-town newspaper with a single employee, and consequently zero vetting layers) as a self-published source, and press releases(!!) from multinational corporations and political campaigns, which tend to be signed off on by legal, marketing, and financial departments, as non-self-published sources.
This is not fundamentally a difficult concept: If "you" publish it "yourself", then it is "self-published". If "you" publish "their" work, then it is non-self-published. When "you" refers to a corporation or other non-human entity, then any employee or agent of the corporation is part of "you": The actions taken by Coca-Cola's own legal department really are considered the actions of the company, not the actions of individual, independent humans. WhatamIdoing (talk) 21:15, 25 May 2010 (UTC)[reply]
You're conflating "self-published" with "primary source", again.
They are different things. "self-published" refers to secondary sources, and refers to lack of editorial process and rigor. A newspaper which is accepted by a community and newspaper publishing peers, even if it has one employee, still meets a minimum definition of editorial process and rigor. A website, which nobody vets and can give feedback on, does not.
"Primary source" is where information comes from to start with. A company press release announcing a new product is a primary source, a company financial statement is a primary source, etc.
The distinctions between primary and secondary sources are a key factor in understanding information reliability. Georgewilliamherbert (talk) 21:25, 25 May 2010 (UTC)[reply]
WI, regarding your Mulberry Tree paper, if it's run by one man, yes it is technically "self-published". But, if it has a good reputation for accuracy and fact-checking, and reports on things other than the author/publisher himself, it could be used "carefully" to support some small-town events, for example. And as GWH noted above, once a paper like that is in long term circulation, it gets vetted by peers and readers. So in its favor are the facts that it has a good reputation for accuracy, and it's a secondary source, while in its disfavor is the fact that it's a one man show. So we probably won't use it to support contentious issues, but for births, weddings, and funerals, carefully, perhaps. Crum375 (talk) 21:45, 25 May 2010 (UTC)[reply]
The part of your definition that disturbs me most is the part in which you declare that any press release — say, the ones BP is putting out about its oil spill problems — is non-self-published simply because multiple employees of the company were involved in writing it.
(As a point of fact, The Mulberry Advance may not be self-published: the sole employee may not also be the publisher. Usually, with a small-town newspaper, the publisher is the business owner, rather than an employee. With a large outfit, there's more division; e.g., the publisher of the NYT is Arthur Ochs Sulzberger, Jr., and the owner is The New York Times Company.) WhatamIdoing (talk) 22:02, 25 May 2010 (UTC)[reply]
Observe that it is quite possible for a published tertiary source to be based on primary and secondary sources that never got published if those primary and secondary sources were reliably archived. In a forensic case for instance, a bullet or DNA sample is a primary source. Images and analyses of them may become the first published source on the topic when they get to court, but that should not be confused with it being a primary source of information. LeadSongDog come howl! 21:42, 25 May 2010 (UTC)[reply]
I disagree. A report which analyzes bullets and DNA samples, given lab data, is still primary, if it performs the original analysis of that data and shows how the lab results match (or don't match) the suspect. In other words, if it's a report produced by a professional investigator to make an original incriminating (or exculpating) case against a suspect, then it's primary, since that investigator is involved in analyzing the lab data, and making original conclusions about them. If someone else, like a news journalist, reads that report and describes it to the public, it would be secondary. Crum375 (talk) 21:55, 25 May 2010 (UTC)[reply]


George, I'm not conflating "self-published" with primary, etc.—but you are. Self-published does not "refer to secondary sources": it refers to publications that are published by the author (whether that author is a human or a corporation).
  1. Who published something and
  2. Whether the publication is primary/secondary/tertiary
are completely separate issues. All of the possible combinations exist. Consider these examples:
| | Example of self-published source | Example of properly published source |
|---|---|---|
| Example of primary source | Grandma posting at Blogspot about her house burning down. | The first-hand, "eyewitness" report in the local newspaper written by the reporter who was dispatched to the scene of the fire. |
| Example of secondary source | A meta-analysis posted on a researcher's own website. | A meta-analysis printed in a scholarly journal. |
| Example of tertiary source | The dictionary mentioned above. | The current version of Merriam-Webster |
Whether a source is self-published is completely separate from whether it is primary, secondary, or tertiary. Self-published sources can be any of these types. I think it would help a lot if people stopped pretending that Grandma's blog either doesn't exist, or isn't a primary source, or isn't self-published (and those are your three options, if you keep insisting that self-publication can only apply to secondary sources). WhatamIdoing (talk) 21:54, 25 May 2010 (UTC)[reply]
WhatamIdoing - I'm not confusing the two. I'm using "self-published source" in the same sense that it's used throughout Wikipedia WP:RS and WP:V discussions - as shorthand for "self-published secondary source". Your comments that you believe that www.coca-cola.com is a "self-published source" indicate that you don't understand the local terminology used by the policies. Primary sources are primary sources, no matter where they're found. Georgewilliamherbert (talk) 22:15, 25 May 2010 (UTC)[reply]
I see: you are talking about whether something is WikiJargonSelfPublished (that is, a special kind of self-publication, defined nowhere [e.g., notice that your claimed criterion of applying solely to secondary sources is not mentioned at SPS], only applicable to certain rarefied aspects of Wikipedia, and that excludes many self-published sources). I'm saying: Let's stop caring about whether something is WikiJargonSelfPublished. Let's write our policies so that they recommend that editors notice whether a source is RealWorldSelfPublished. WhatamIdoing (talk) 22:25, 25 May 2010 (UTC)[reply]

Break 1

WI, there is no wikijargon here. When we say "self published", we mean it in the most common way, i.e. an individual publishing stuff on his own, without help, or perhaps with a couple of his buddies. To take this simple definition and stretch it to cover Coca Cola, so a giant corporation becomes a "self publisher", that's a huge stretch, and we don't do it. Crum375 (talk) 22:34, 25 May 2010 (UTC)[reply]

Crum, there's a whole lot of WikiJargon here, if George's claim that primary sources can't be considered self-published on Wikipedia is true. For your made-up definition based on size -- a definition that is also not mentioned in SPS, I see -- I'd like to say that nobody else in the world -- nobody except a few stalwart preservers of this page on Wikipedia -- thinks that major corporations are too large to publish their own materials.
Who do you think is writing BP's press releases? Who do you think publishes ("makes them available to the public") them?
I can tell you what the answer is, according to the long-established norms of the publishing industry and all copyright laws: They are written as a work for hire for a very large corporation (this actually does make the employer/corporation legally the author), and they are directly published by the same very large corporation. They are neither written by, nor published by individuals, or Martians, or anyone other than their corporate author. WhatamIdoing (talk) 22:59, 25 May 2010 (UTC)[reply]
Again, you are missing the point. When we say "self published", we don't mean it literally, because by the literal meaning, virtually all sources do their own publishing, including the New York Times, Nature, and Scientific American. By "self published" we mean simply that an individual human being published the material himself. We don't include corporations in "self", as that would include nearly everyone and everything, and make the term practically meaningless. Crum375 (talk) 23:05, 25 May 2010 (UTC)[reply]
Again, you keep saying that stories in The New York Times publish themselves, and I keep telling you that The New York Times is published by Arthur Ochs Sulzberger, Jr. (an individual human) for The New York Times Company (the publication's owner). Lining up Wikipedia's definition with the definition that has been used, with substantial success, in the real world for literally centuries does not result in the alleged slippery slope.
If you mean "stuff published by an individual or very small group", then you need to say "stuff published by an individual or very small group". You'll still be left with the difficulty of explaining why stories written by a couple of professional journalists should be outranked by the excretions of a large committee made up of corporate lawyers and publicists, but at least people reading the page will actually know that you don't actually care about the relationship of the author and the publisher. WhatamIdoing (talk) 23:29, 25 May 2010 (UTC)[reply]
Yes, you are correct that that's the general meaning. The current verbiage is: "self-published media—including but not limited to books, newsletters, personal websites, open wikis, personal or group blogs, Internet forum postings, and tweets—are largely not acceptable." Can you suggest an improvement? Crum375 (talk) 23:44, 25 May 2010 (UTC)[reply]
Sure, I can give you several suggestions for improvement. One option is to start the section with an actual definition, e.g., "Self-publishing is the publication of books and other media by the authors of those works, rather than by established, third-party publishers." This, of course, has the effect of defining advertisements, corporate press releases, and spam from political campaigns as self-published works, even if the organization is quite large, and defining most newspaper articles as non-self-published, even if the newspaper is rather small, which I understand that you rather oppose.
If you don't like that, another option involves removing the misleading term "self-published" entirely, and substituting what you think the intent of the section is, i.e., "Sources published by individuals or small groups—including but not limited to books...—are largely not acceptable. Sources published by large groups, including self-promotional materials from major corporations, are largely considered reliable (although they may be subject to restricted use as primary sources)." NB that I don't think this is a good standard, but if you're right that this is what Wikipedia intends to communicate, then Wikipedia should have courage of its convictions and say so plainly, rather than camouflaging it as an issue of who issues the publication rather than an issue of how many humans have been involved in it. WhatamIdoing (talk) 00:22, 26 May 2010 (UTC)[reply]
We can't change established core policies on a whim, and I see no need to, as the core concepts have worked fairly well up to now. And I am sure you will see that even with your own suggestion, it's tricky to make changes in something that works. You suggest: "Sources published by individuals or small groups—including but not limited to books...—are largely not acceptable." The problem with saying outright "small groups" is that there are some fairly small groups which may be, to use your own example, a small town paper, with 4 employees, which do a great job publishing their local highly reputable newspaper, while there are other larger groups which have a website full of junk. So if we tried to nail this down, we'd run into all sorts of problems. This is why we leave it more flexible, by trying to convey the meaning of "self published" through the examples. Not perfect, but nothing ever is, and this works. Perhaps an essay can help, or a wiki glossary. Crum375 (talk) 00:35, 26 May 2010 (UTC)[reply]
IMO, the core concept has worked very badly, if we mean to say "Small groups are less reliable than large groups", and instead we tell editors, "Things published by the author are less reliable than things published by an independent publisher." If we mean to say that David is a worse source than Goliath, then we should say it. If we don't -- and I believe that we don't -- then let's not say "Self-published sources are worse than properly published sources" on the actual policy page, but then surprise everyone by claiming that these words actually mean that "Small groups are worse sources than large groups" instead of what they actually say. WhatamIdoing (talk) 01:32, 26 May 2010 (UTC)[reply]
There is no surprise, and no discrepancy. When we say "self published", we mean published by the authors themselves, as opposed to going through a vetting process and having other independent eyes inspecting and verifying the material prior to publication. We clarify here and elsewhere that reliability of sources relates to the number of vetting layers included in the publishing process. So it all fits in, and there are no surprises. Crum375 (talk) 01:44, 26 May 2010 (UTC)[reply]
Really? Are you firmly committed to that position?
Because I've been saying that "self-published means published by the author" all day long, and George has told me that I'm wrong, because self-publication can't happen to primary sources, and you've told me that I'm wrong because self-published means that it's only things published by an individual author or small group of authors, i.e., that there is some magic number above which a group of authors becomes too big to publish what they write themselves.
Perhaps you would now answer one of the questions I asked above: For the purpose of SPS, who is the author of these publications? And who is the publisher? And if you conclude that they are the same, are we agreed that these press releases are actually self-published, regardless of the fact that lots of individual employees were (presumably) involved in the publication?
And if you actually, finally believe that "Self-publishing is the publication of books and other media by the authors of those works, rather than by established, third-party publishers" (i.e., the basic dictionary definition is correct), can we include that statement in WP:SPS, so that people will quit saying things like "Self-publication is the self-publication of secondary sources" or "Self-publication is any kind of publication by individuals or small groups"? WhatamIdoing (talk) 03:38, 26 May 2010 (UTC)[reply]

Yes, when WP says "self published", for example in WP:SPS, we mean published by the authors, with minimal if any vetting layers. This is independent of their primacy: they can be primary, secondary or tertiary, but generally they are not a reliable source, except for information about themselves, if it's not unduly self-serving. As far as your BP press release, a press release from a large corporation, like any other publication it produces, is not "self published" in the WP meaning, because the authors of the releases have to run them by other employees (e.g. lawyers and executives) who vet them before release. Typically, such releases are primary sources, since they discuss subjects with which the authors and their employers are involved. But again, the issue for reliability is the number of vetting layers, not whether it's a corporation selling widgets or newspapers. Crum375 (talk) 03:55, 26 May 2010 (UTC)[reply]

So in your world, corporate authorship doesn't exist, right? So ten guys who publish a book are self-published, but the same ten authors, if they have the foresight to incorporate their business, publishing the same book, are not self-published? WhatamIdoing (talk) 03:59, 26 May 2010 (UTC)[reply]
When we say "self publish", we focus on the individual human authors who actually compose the words, and the number and quality of vetting layers between those original words and their final publication. If it's a corporation, odds are they would be more concerned about liability, and would be more likely to have a formal policy in place for vetting their published documents. When we say "self published material is unreliable", we mean material written and published by the authors, with minimal if any vetting, is unreliable. If there are 10 guys who incorporate, they would likely also set up formal corporate policies, including one for vetting their published materials. But the bottom line again is the number of vetting layers. The more known or expected layers, the higher the expected reliability. And corporate liability influences the number and quality of the vetting layers. All of this is independent of source primacy, though corporations typically write about themselves, and are therefore likely to be primary sources for that material. Crum375 (talk) 04:10, 26 May 2010 (UTC)[reply]
I don't think that you know what you're talking about. Incorporation makes these ten authors less liable for what they write. Incorporation means that if they screw up, nobody can take their homes away from them. If they screw up as a partnership, their personal assets are jointly and severally on the line.
But I have specified that we are publishing "the same book". Are you certain that "the same book", written by "the same ten authors", is published by the authors if they are a partnership, and is not published by the same ten authors if they have incorporated? WhatamIdoing (talk) 04:35, 26 May 2010 (UTC)[reply]
Incorporation protects the individual owners' homes, but it leaves their combined work effort, where they focus and invest their time and energy, very much open to lawsuits, a libel one in this case. So when they set up their corporation, they would normally take steps to minimize its legal risks, and one of the first steps would be to set up formal policies to ensure the quality of their product(s). This is especially true if there are more than a couple of employees. So yes, if the same book is written in a corporate environment where a quality management system is likely to have been instituted, including formal quality assurance policies, that book, along with any other product, is likely to be of higher quality and therefore more reliable. So in general, individuals publishing their own material without formal vetting layers are less reliable than corporations doing so. But if we have specific knowledge that an unincorporated group or organization has a formal quality assurance mechanism in place, with multiple pairs of eyes regularly vetting published output, that would count towards reliability too. The bottom line again: the more vetting layers, the more eyes expected to routinely inspect the output before publication, the more reliable it is. Crum375 (talk) 12:44, 26 May 2010 (UTC)[reply]
Yeah, you're not getting it.
Incorporation means that only the work product is on the line. Non-incorporation means everything is on the line: work product, homes, cars, bank accounts -- everything. Why would any (rational) person take more care to protect a small fraction of their assets than all of their assets, including that small fraction?
Incorporation of a small business in most of the US costs about $200, requires you to fill out one or two simple forms, and takes about an hour. That's it: Once the state processes the paperwork, your business is incorporated. Sloppy, lazy, careless people are perfectly capable of doing this. People who fully expect to get sued do this as a matter of course. None of this paperwork involves creating a quality management system, formal policies, or anything else. There are millions of corporations in the US that have only a single person involved in them. There are thousands of individuals who have created and maintain multiple corporations all by themselves.
But to drag you back to the subject: The question is not, "Is a corporate press release a reliable source on Wikipedia?" The question is, "Did the business both write and publish it?" (Or, was it written by Martians and published by the CIA, or something like that?)
Again: The question is not reliability. The question is only self-publication (is the author the same as the publisher). WhatamIdoing (talk) 17:35, 26 May 2010 (UTC)[reply]

And let me reply to your point of journalists vs. corporate lawyers. That's apples and oranges, because normally journalists would write a story about a third party, and be a secondary source, while the legal eagles would write about their own corporation, and be primary sources. Independent of their reliability (though we assume both groups are reliable), the latter would generally be used only to help describe the corporation itself, while the former could be used as a general source. Crum375 (talk) 23:53, 25 May 2010 (UTC)[reply]
Eyewitness reports are always primary sources, even if they are written by professional journalists, and this section makes no distinction, or even any mention at all, of whether a self-published source is primary or secondary. WhatamIdoing (talk) 00:26, 26 May 2010 (UTC)[reply]
Yes, if a journalist gives an eyewitness report, or tells about his family, that would be primary. This is why I used the word "normally", in "normally journalists would write a story about a third party." And we don't tell you if a self-published source is primary or secondary because it can be either one, and in general it has no effect on its reliability, since they are normally not accepted except to describe themselves (where they would typically, but not always, be primary). There is more detail about primary vs. secondary in WP:PSTS. Crum375 (talk) 00:41, 26 May 2010 (UTC)[reply]

Break 2

WI, grandma's blog is self published, but may well be secondary, if she describes things independent of herself. Reliability is independent of source primacy: a primary source can be highly reliable, like a crime lab result, while a secondary source can be highly unreliable, like grandma's blog. Crum375 (talk) 22:01, 25 May 2010 (UTC)[reply]
It's true that a blog could be a secondary source, but I have specified Grandma's posting about her own house burning down, and posting about an experience that directly happened to you is always a primary source. WhatamIdoing (talk) 22:04, 25 May 2010 (UTC)[reply]

Perhaps this dispute has to do with the fact that we use the notion "self-published" for at least two purposes, and in the case of large organisations publishing about themselves it makes sense to consider them as self-published for one of them, but not for the other:

  • The stricter rules that govern self-published material are a mechanism for discarding irrelevant stuff: Things that nobody wants to read but which the author absolutely wants published, for whatever reason. Such authors and their few loyal fans have a tendency to appear on Wikipedia to promote these sources. To keep Wikipedia clean, we try to minimise our use of such stuff.
  • When something has gone through a third-party publisher there is a presumption that usually there aren't any really bad POV excesses. Reasons include an editor who takes care to keep the quality of the publisher's publications high; that authors who get formally published are generally of higher quality; a higher degree of self-control when writing for a large audience; and that pure advertising usually doesn't get published in this way.

When the New York Times writes about itself we can assume that the first point doesn't apply. The second point actually depends on the media they are using: An article about the NYT in the NYT will probably be written to the newspaper's normal standards and therefore should not be regarded as self-published in either way. (Or if it is, there should be a common sense exception for such sources.) When the NYT self-publishes a book about itself it's a different matter. In this case there is a good chance that it crosses the line to advertising. The first problem does not apply in this case, but part of the second does. The situation isn't very different from a self-published paper by a respected scientist: Many of the problems of self-publishing don't apply, but we still consider it self-published.

Like almost everything, whether something is self-published or not is not purely a matter of black or white, or even of shades of grey. The world is full of colours.

There can be communication problems if one editor argues from the plain meaning of the word "self-published", one argues from its wiki meaning as literally defined in policies/guidelines, and yet another argues from the spirit of the policies/guidelines. All these approaches are valid, and they may lead to opposite results. Hans Adler 22:43, 25 May 2010 (UTC)[reply]

I'd like to refine your example: When Arthur Ochs Sulzberger, Jr. (the publisher) runs a story in The New York Times (the publication) about The New York Times Company (the owner), the first point (probably) doesn't apply. When that corporation publishes its advertising rate card, or an advertisement that is attempting to attract subscribers, then the first point does apply. WhatamIdoing (talk) 23:05, 25 May 2010 (UTC)[reply]

Self-published acceptable for "existence" or attribution?

The SP policy says that the rationale is that anyone can easily self-publish and "claim to be an expert in a certain field". But what if the work is referenced not to substantiate a fact, but rather to demonstrate the existence of an opinion?

For example, I publish a website critical of a cult I used to belong to, which includes original source documents (MSM excerpts, advertisements the group published, etc.), as well as the stories of other former members. I think that the site would be a suitable reference for WP article statements such as, "Former group member Michael Bluejay now publishes a website critical of the group, asserting that it is actually a mind-control cult", or "Former members of the group now say that the group operates as a mind-control cult." The site isn't used to substantiate some special fact, only to show the existence of a claim being made. So I don't think this use goes against the intent of the policy, since the intent of the policy is to prevent using self-published sources to justify facts, not the existence of claims.

Taking this a step further, I'm hoping to quote another former member who described his cult experience as a factor in the failure of his marriage. Here again, the article wouldn't claim as a fact that ex-members' marriages failed as a result of their being involved in the group, only that they made that claim. Is the site acceptable to show that the claim has been made?

Incidentally, the overwhelming majority of sources we're using for the article aren't self-published, but in a couple of cases my site is the only source available for the bit in question. I know my site isn't the best source, but I think it's better than nothing. The quote above by the former member about his marriage failure hasn't been published anywhere else. Hence my problem.

I'm keen on hearing others' thoughts about all this. Thanks, MichaelBluejay (talk) 07:11, 26 May 2010 (UTC)[reply]

My own personal view is that current policy on self-published sources is fundamentally flawed. The idea that "Self-published material may in some circumstances be acceptable" conflicts with the specific issue of independence, which I have set out above[1]. If the publisher and the author are one and the same person, then this source of information can never be classed as reliable, in the same way that what I am writing here on this talk page cannot be classed as being reliable either. A source without some form of oversight just does not operate even the weakest forms of checks or balances that provide a modicum of assurance that the source can be relied upon. I think that a source that is self-published is just one form of source which is not independent, because the author is the same person as the publisher. --Gavin Collins (talk|contribs) 08:20, 26 May 2010 (UTC)[reply]

Thanks, but I'm not sure you're listening to me. First, I gave an instance in which the publisher is *not* the author, but more importantly, you seemed to ignore my whole point that using a source to show the *existence* of thought (the source *itself* is such thought) seems to be quite a different matter from relying on a source to justify *facts*. Agreed that self-published sources are usually unreliable in the latter case, and the policy says as much, but doesn't really address the former. That's what I'm seeking some feedback on. MichaelBluejay (talk) 16:10, 26 May 2010 (UTC)[reply]

Really, Gavin? You're absolutely sure that Barack Obama's political website cannot possibly, under any circumstances, be a reliable source of the fact that Obama ran for President, or that his slogan was Yes we can? Is this on the theory that 100% of campaign staffers with access to the website are too stupid to know their own slogan? WhatamIdoing (talk) 17:22, 26 May 2010 (UTC)[reply]
Really, WhatamIdoing. Remember Watergate. It seems to me that Presidents can be good or bad sources, but when they publish their own stuff on their own site, I would not trust them with a barge pole. --Gavin Collins (talk|contribs) 21:59, 26 May 2010 (UTC)[reply]

Okay guys, you're really hijacking my section here. I had some very specific questions that I opened up to discussion. I'm hoping to get some comments on them. Thanks, MichaelBluejay (talk) 02:03, 27 May 2010 (UTC)[reply]

The article seems to have multiple reliable sources. I wouldn't think your self-published site should be used. Due to the sorts of claims made on your site, I don't believe Wikipedia can link to it for any reason. Nothing against you, or your site, but I think that sort of thing would be a bit of a problem for Wikipedia for legal reasons. Have you been recognized by a reliable third-party as being knowledgeable on this subject? If so, maybe your site could be used, but I still wouldn't think so: not with the subject matter of your site.  Chickenmonkey  02:35, 27 May 2010 (UTC)[reply]

Well, I've been quoted in newspaper articles as a former member of the group who is now critical of it, and I think at least one of them has mentioned that I run the website. I have a hard time believing that WP could ever be held accountable for content generated by its users. Has that ever happened? Finally, I'm really hoping to see the issue I raised discussed in a broad context, not just about my own website. That is, can a self-published site be a good source to show the *existence* of thought (i.e., the source *itself* is the evidence of that thought, inherently)? I know that a self-published source is usually insufficient to back statements of *fact*. What I'm asking about is entirely, completely different. MichaelBluejay (talk) 05:24, 27 May 2010 (UTC)[reply]

I think it can be acceptable. The problem is showing that this opinion is WP:DUE, not that the opinion exists. WhatamIdoing (talk) 05:29, 27 May 2010 (UTC)[reply]
I'm quite sure using a self-published website to show the existence of an opinion/thought is acceptable, but only if that site is reliable -- which would mean third-party sources have recognized the site as knowledgeable and reliable.
I don't know of any instances in particular where Wikipedia has been held accountable for content generated by its users, but potentially libelous material is advised against in WP:BLP -- which doesn't apply to groups, but I think specifically of Scientology in this particular situation. That doesn't mean we shouldn't address criticisms of groups. That just means we should take special care to ensure adherence to policies like WP: Undue weight, WP: Notability, WP: No original research, and this one WP: Verifiability. In situations like yours, where a group is being labeled a "cult", we must be especially careful to rely solely on very reliable sources.  Chickenmonkey  05:57, 27 May 2010 (UTC)[reply]
In any case you would have a WP:COI issue. I can't conceive of circumstances where a WP editor should be citing or discussing a website that he publishes, particularly one advocating a (self-professed) POV. Even making a talkpage argument for its citation seems pretty questionable. Leave it up to others to write that article. At the very most, I would ask other editors on the talkpage if they thought the source was useful, while being sure to disclose my conflict there. LeadSongDog come howl! 12:48, 27 May 2010 (UTC)[reply]

Thank you. We do have an admin leading the rewrite of the article and policing the sources. But anyway, again, I'm really hoping to see the issue I raised discussed in a broad context, not just about my own website. That is, can a self-published site be a good source to show the *existence* of thought (i.e., the source *itself* is the evidence of that thought, inherently)? I know that a self-published source is usually insufficient to back statements of *fact*. What I'm asking about is entirely, completely different. MichaelBluejay (talk) 05:04, 28 May 2010 (UTC)[reply]

Bottom line

The bottom line is that we can reasonably assume of our reliable sources that they have a professional structure in place for checking facts and legal issues before publication. They have professional people who are paid to say "no, don't publish this." That is absent with personal websites and blogs where an individual or a small unprofessional group is publishing straight to the website. That is what we mean by self-published: that no one stands between the writer and publication. There are no checks and balances. No one is paid to say no.

That is never the case with The New York Times or Coca Cola or the White House, whether they're writing about themselves or something else. When writing about themselves, those organizations are primary sources of information about themselves, but they are never self-published sources within the meaning of this policy. SlimVirgin talk contribs 17:45, 26 May 2010 (UTC)[reply]

Then we need to quit using the words "self-published" (meaning: the author is the publisher) and start saying what we mean, e.g., "Bureaucratic procedures improve reliability" or "If a corporate press release is vetted by the same number of humans as a story in The New York Times, then these sources should be considered equally reliable."
Right now, we're saying that we care whether the author is the same person as publisher on the page, but SV and Crum believe that we mean the number of humans involved in writing and publishing the source is the major issue. This is, at best, misleading. If you really believe that this is the point of this section, then I dare you to change it to say, in plain language, what it allegedly means. WhatamIdoing (talk) 18:38, 26 May 2010 (UTC)[reply]
The term "self-published" has been used in this policy to mean the same thing for years, and I haven't seen it cause confusion in articles, so I can't see any benefit in changing it. And the policy already says what you recommend it ought to say: "In general, the best sources have a professional structure in place for checking or analyzing facts, legal issues, evidence, and arguments; as a rule of thumb, the greater the degree of scrutiny given to these issues, the more reliable the source." SlimVirgin talk contribs 18:43, 26 May 2010 (UTC)[reply]
I've always interpreted the policy in the same way as WhatamIdoing. You can have 27 layers of vetting, but if the vetters belong to the same organization as the author, and they are vetting information about that organization, then I think that counts as self-published. Do we really trust a corporation to be objective or even 100% accurate in what information they post on their own website or in their own press releases? Heck no. A corporate website is a marketing tool, not a bastion of journalistic ethics. This doesn't exclude those websites from being used as sources in some cases; the corporation is clearly an expert on itself and thus some of the information could be appropriately included. Karanacs (talk) 19:11, 26 May 2010 (UTC)[reply]
I agree, Karanacs, but that just means they're primary sources. It doesn't make them self-published. There's no benefit in extending the definition of "self-published" to include anyone or any company writing about itself. And in any event when the White House, for example, writes about itself, it still has multiple layers of professional oversight to get through. Plus, we don't require of any of our sources that they be objective or 100 percent accurate. SlimVirgin talk contribs 22:02, 26 May 2010 (UTC)[reply]

If this is truly how Wikipedia defines "self-published":

By "self publishing" we mean individuals or small groups who have no or minimal oversight levels. Crum375 (talk) 18:19, 25 May 2010 (UTC)

The way Wikipedia defines "self-published" is absolutely correct. By "absolutely correct" I, of course, mean "completely wrong". See? I can misuse words, too (I apologize for being frank). If a corporation puts out a press release, that's self-published, because the entity itself wrote it. No matter how many vetting layers it has, all of those vetting layers exist within one entity. Whether something is "self-published" is completely separate from whether it can be used as a reliable source. If some actor publishes, on their website, the movies they've been in: that information is "self-published". That would also be a "primary source". Does that mean we inherently can't trust this information? No. If that same actor publishes, on their website, a blog that says "The director called me the best actor ever": that information is "self-published". That would also be a "secondary source". It's obviously not reliable. The same "self-published" source, both reliable at times and unreliable at others. With that said, if a corporation, on their website, publishes an indisputable fact (when they were founded, for instance), that's "self-published" and a "primary source". You get the point.

I do completely understand the intention of saying "we try not to use self-published material, such as blogs, etc, because they may not be reliable", but that has nothing to do with material being "self-published". At the very least, there needs to be a discussion on correcting how Wikipedia is telling its editors to interpret "self-published". Why would it be acceptable for Wikipedia to redefine a word?

As was said at WT:N:

There's no sense in a small number of editors making up its own definitions. These terms are used in a certain way in the academic world, so we should try to stick to that usage. It varies a little between subjects, but what's being suggested here is something I don't recognize at all. SlimVirgin talk contribs 17:32, 25 May 2010 (UTC)

By the way, "By 'self publishing' we mean individuals or small groups who have no or minimal oversight levels." So, how many oversight levels is enough? How many individuals is too many?  Chickenmonkey  20:36, 26 May 2010 (UTC)[reply]

Like everything in life, it comes down to editorial judgment and common sense. The New York Times has many vetting layers, while Podunk Weekly may have just a couple, but the point is that when we say "self published" we mean the author can press a couple of buttons and have his output published. When it's not self published, it means the author needs to have several other people, ideally paid professionals, vet and/or approve his version before it's published. When legal vetting for potential liability is involved, it's even better. If it's a couple of guys and their dog publishing something, it's self published. If it's a big corporation, like the NYT or Coca Cola, it's not. In general, corporations are more reliable than a loose bunch of individuals, but we need to evaluate the number of vetting layers and their quality, and decide whether the author is able to just press the "Save" button and publish, or whether he's got a bunch of hoops to jump through. The former being "self published", the latter generally not. Crum375 (talk) 21:04, 26 May 2010 (UTC)[reply]
I should clarify, I'm not arguing that the policy should change. It reads plainly that "self-published media [...] are largely unacceptable." That's fine, because most "self-published" media shouldn't be accepted as reliable. What I am/was commenting on is the discussion that has taken place on this talk page where "self-published" has been misunderstood. That said, the phrase "what we mean is XXX" should not have to be used. If it does, the policy should probably be more clearly written: perhaps to include "corporate publications" as S Marshall has stated below. Currently the policy doesn't say not to use self-published sources, but that they are "largely unacceptable". It seems, and I may be wrong, that many editors believe this policy to be meaning that we should not use self-published sources.  Chickenmonkey  22:19, 26 May 2010 (UTC)[reply]
  • "Published" is where someone independent of the author assumes the risk, responsibility and profit involved with releasing the material. "Self-published" is where the author takes the risk, responsibility and profit. "Corporate publication" is a separate category. It's like being "published" in that the author isn't the person who takes the risk—the author's boss does—but because the author and the person taking responsibility aren't independent of each other, it resembles self-publication as well.—S Marshall T/C 21:29, 26 May 2010 (UTC)[reply]
No, in the wiki meaning "independence" is not really an issue. You can have the chairman of the New York Times write an opinion column, and his words would still be vetted by other employees including corporate lawyers, before publication. They are clearly not "independent", but they are still different pairs of eyes, and they are paid to review and vet material, no matter whose, before it goes out. And that's the essence of "self publishing": if you can press a "Save" button and publish your material whenever you want to, it's "self published". If you need to run it by a bunch of people whose job it is to review publications before they go out, then it's not self published. Corporations are generally not self published, because they have professional vetting mechanisms in place. And it doesn't matter who is the "boss", only whether the vetting mechanisms are in place to correct mistakes and help reduce liability. Crum375 (talk) 22:12, 26 May 2010 (UTC)[reply]
S Marshall is right, and every single reliable source supports his definition. WhatamIdoing (talk) 23:14, 26 May 2010 (UTC)[reply]
(edit conflict)Well, Crum375's view has the benefit of being simple and easy to understand, but I think that when we're discussing verifiability in terms of the reliability of sources, "independence" is an issue. I think it's overly simplistic to disregard it. My position is that the trustworthiness of a source isn't a binary, yes-or-no thing. There's a continuum.

For example, let's imagine an article on a soft drink, where there's a dispute about the potential health effects of drinking it. The sources might be: (1) a full series of experiments by a professor at Oxford University, published by Oxford University Press; (2) a corporate press release by the manufacturer; and (3) a television interview with someone who got sick after drinking a can of Moxi-pop. Technically, all of these sources have been seen by someone who's paid to edit them, but you wouldn't give them equal weight, would you?

This is why I say it's about risk, responsibility and profit. The television interview gets least weight because the TV station is not to be held responsible for what the person being interviewed has said, even if there's been so much editing that the interview has been cut from half an hour to two minutes. The corporate press release gets the next most, but there's no independence and there's a profit motive. What you'd believe, and what you'd want the article to be based on, is the academic study.—S Marshall T/C 23:34, 26 May 2010 (UTC)[reply]

PS: I want to clarify that I'm not saying a corporate press-release is to be treated the same as a self-published book or a blog post. I don't think that's right. I think that for many matters, we can take what a large company says about itself at face value. If the company's website says it has 325 employees and a turnover of $40 million, then we can report that as fact.—S Marshall T/C 23:46, 26 May 2010 (UTC)[reply]
If you have a soft drink manufacturer writing about its soft drinks, it would be a primary source. Such sources, though often reliable, cannot be used to support contentious claims, and cannot be interpreted or analyzed, and otherwise must be used very cautiously, per WP:PSTS. We would normally require a secondary source, which by definition should not be involved with Coca Cola, to report on its health effects. Also, if it's a scientific issue, we'd prefer to rely on high quality peer-reviewed publications or mainstream media. Crum375 (talk) 23:52, 26 May 2010 (UTC)[reply]
Then I don't think we're disagreeing. :) Up until that post, you seemed to be saying that a press release was as reliable as an independently-published article, and I do understand why WhatamIdoing would want to challenge that.—S Marshall T/C 00:01, 27 May 2010 (UTC)[reply]
Reliability is not a blank check. A primary source can be very reliable to tell us the temperatures in Moscow over the last century, but not about their significance. A press release, which is also primary, can be very reliable to tell us that Corporation X has decided to add widget Y to its product line, but not to compare it to its competition, or evaluate its health effects. Crum375 (talk) 01:22, 27 May 2010 (UTC)[reply]

Bottom line, redux

Experienced Wikipedia editors and admins understand what we mean by the policy and believe it's sufficiently clear. Objections complaining about jargon or denotations of particular words are worthwhile to clear up individual confusion regarding points. However - now that we have cleared up those points, the policy stands as it has stood for some time. It has withstood the test of time and multiple challenges by those seeking to abuse the site in some manner or another.

All Wikipedia policy is subject to ongoing review and evolution, but the proper venue for this sort of review is the Village Pump. I predict that a discussion there will with near unanimity support the existing policy, but anyone who believes that it's fundamentally flawed should feel welcome to take it up there and attempt to change people's minds. Georgewilliamherbert (talk) 21:35, 26 May 2010 (UTC)[reply]

I think it is flawed. We refer to sources as being self-published, but what we really mean is that the source is not independent. I think this is the venue to discuss this issue, as I feel that if we make it clear what independence is, then this policy will be clearer as to what is a self-published source, and why they are not considered to be reliable. --Gavin Collins (talk|contribs) 22:17, 26 May 2010 (UTC)[reply]
No, we don't mean by "self published" not independent. See also my reply above. What we mean by "self published" is that the author can just press a "Save" button and publish his stuff whenever he wants. Once you introduce a professional vetting layer, and legal liability scrutiny, by multiple people, and every author (even "the boss") needs to jump through these hoops before his stuff is published, then it's not self published. Crum375 (talk) 22:25, 26 May 2010 (UTC)[reply]
Which is to say that Wikipedia has invented its very own definition of "self-published", that has absolutely nothing to do with whether the author is the publisher, but instead is being used as a secret code word for "big organization" or "editorial control". This purported definition of "self-published" directly contradicts every reliable source and introduces concepts totally outside reliable sources.
I don't mind praising editorial control on this page. I just don't want editorial control to be spelled s-e-l-f-p-u-b-l-i-s-h-e-d, to the predictable confusion of every editor who isn't in on the joke. WhatamIdoing (talk) 22:37, 26 May 2010 (UTC)[reply]
I still can't see the point of wanting to change the way we use "self-published." Sources writing about themselves are primary sources, and that somewhat limits the ways in which they may be used. What is the benefit of attaching a new term to them? SlimVirgin talk contribs 22:44, 26 May 2010 (UTC)[reply]
The issue of independence is broader than whether a source is self-published or not. Sometimes groups of commentators can club together to give the impression that their views are independent, say by starting a magazine that endorses their viewpoint. A good example of this is the article Socionomics; the sources for this turn of phrase are basically a group of stock market analysts trying to promote their proprietary brand of predicting new stock market trends, despite the fact that accepted economic theory (e.g. Random walk hypothesis) says you can't. It's an article topic that has been deleted several times, but basically because it fails the independence test as it is defined in WP:N. Now where does that test come from? All of the other principles contained in the notability guideline originate from one or more of Wikipedia's content policies. I think independence is a topic that has been omitted from WP:V, or is not set out explicitly, and has become subsumed by WP:SPS. Perhaps I am completely wrong on this issue, but this is what I suspect to be the case. --Gavin Collins (talk|contribs) 22:59, 26 May 2010 (UTC)[reply]
Gavin, my question is why the concept of a primary source isn't enough. A primary source is an involved source. I'm wondering why that isn't sufficient for the purpose of judging independence. SlimVirgin talk contribs 23:53, 26 May 2010 (UTC)[reply]

You can have a bunch of people get together and produce a wonderful, accurate, highly reliable and reputable product, or they can produce pure junk. This is why we specify that a reliable source must have a reputation for accuracy and fact checking. If the reputation is that a source produces junk, then it's not reliable, regardless of whether it's self published or not. Crum375 (talk) 23:10, 26 May 2010 (UTC)[reply]

The point? How about so that it's intelligible to people who don't already know the secret code words? How about so we live up to your (quoted above) claim that Wikipedia shouldn't make up its own definitions of basic terms? So that Wikipedia's policies don't look like they were written by (or interpreted by) people too stupid to figure out a dictionary definition? So that editors won't keep saying "it's self-published" when they actually mean "it's not independent" or "there's no editorial control"? So that Wikipedians won't waste time with conversations like "That's self-published" — "No, it's not, the author is X and the publisher is Y" — "No, I mean that it's 'WikiJargonSelfPublished', which the real world calls 'written by one person and published by another person, but there aren't very many humans altogether'".
(Gavin is right that self-publications can be independent, and non-self-publications may be non-independent.) WhatamIdoing (talk) 23:09, 26 May 2010 (UTC)[reply]
Could you stop being so rude, please? We have our own use of lots of terms on WP. Original research, for example. Verifiability is another example. When I cautioned against making up terms I was talking about primary/secondary sources. As for self-published, the White House would laugh to hear themselves called self-published. They wouldn't laugh to hear a blog called that. I think our use of it is quite normal. But I've completely lost track of what you're arguing or trying to achieve. SlimVirgin talk contribs 23:57, 26 May 2010 (UTC)[reply]

There are no "code words": self publishing means the author presses a "Save" button and publishes the material himself. If he has to run it by multiple vetting layers before it can be published, then it's no longer "self" published. No codes. Crum375 (talk) 23:13, 26 May 2010 (UTC)[reply]

Every single reliable source disagrees with your made-up definition. WhatamIdoing (talk) 23:15, 26 May 2010 (UTC)[reply]
Here is the very first definition Google came up with for "'self publishing' definition": "Definition # The simplest definition of self-publishing is when an author produces and publicizes his own book for public consumption." (on www.ehow.com/facts_5017513_definition-selfpublishing.html) Crum375 (talk) 23:27, 26 May 2010 (UTC)[reply]
And you'll note that there's not one word in that definition that says "unless he had a dozen lawyers sign off on it" or "except when a corporate author is involved" or "except when there are multiple vetting layers" or any of the other unsupportable claims you've been adding to it. An author who runs his publication through ten million vetting layers, and publishes it himself, is still self-published -- according to the very definition that you quote. WhatamIdoing (talk) 23:33, 26 May 2010 (UTC)[reply]
Not at all. If you work for the New York Times, write an article with some other reporters, and have your corporation publish it, it is not self published. That's because the article would have to go through vetting layers and probably legal scrutiny before it can go out. Crum375 (talk) 23:42, 26 May 2010 (UTC)[reply]
Okay, let's go back to a more basic concept. Do you know what "publication" means? It means "producing and distributing" materials (publicity, mentioned in your definition, is helpful to the distribution, but not actually essential to the act).
Do we agree that if the author is the publisher, it is self-published — full stop, no exceptions, no further conditions, that whenever the author=publisher, then the material is self-published? WhatamIdoing (talk) 23:47, 26 May 2010 (UTC)[reply]
The words "author", "publishing", "publication" and "publisher" mean different things in different contexts. This is why we should focus on the process. If the human individual writing a document can press a button at will, and distribute the document to the public, then it's "self published". If the document must be scrutinized by different professionals before it can go out, then it's not self published. Trying to make this more complicated than this will only add confusion and gain nothing. Crum375 (talk) 23:59, 26 May 2010 (UTC)[reply]
The process is spelled e-d-i-t-o-r-i-a-l o-v-e-r-s-i-g-h-t. If that's what you actually want to address in this particular section on this page, then you need to "correct your spelling", as it were.
NB that I (strongly) favor editorial oversight; I just don't agree that it is spelled anything like s-e-l-f-p-u-b-l-i-c-a-t-i-o-n. WhatamIdoing (talk) 00:10, 27 May 2010 (UTC)[reply]
I don't entirely agree, no. There are exceptions. For example, if The Times publishes an article that's about itself, then that's "published", not "self-published". And besides, it doesn't seem to address the basic division between you two. I think Crum375's view is about whether the author and the publisher are the same person, and I think your view is about whether the author and the publisher are the same corporation.—S Marshall T/C 23:55, 26 May 2010 (UTC)[reply]
The publication does not publish itself.
I agree that Crum is having difficulties with the concept of corporate authorship, but he is also apparently convinced that "did the same person/entity write it and publish it" is the same issue as "was there any editorial oversight"? WhatamIdoing (talk) 00:10, 27 May 2010 (UTC)[reply]
Oh, I'm sure Crum375 understands corporate authorship. I'm concerned that both of you have at times seemed to be taking binary, on-off positions, where one was saying "Corporate authorship isn't self-publication!" and the other was saying "Corporate authorship is self-publication!" Neither of those is true. Corporate authorship is corporate authorship, and it has its own place on the hierarchy of trustworthiness. It's above "self-published" but below "published by a reputable, independent source".—S Marshall T/C 00:20, 27 May 2010 (UTC)[reply]
Crum's comments above indicate rather plainly that he believes that the author of a press release written as a work for hire is the employee rather than the employer. While this is true in a biological sense, it is absolutely untrue in a legal/copyright sense. WhatamIdoing (talk) 00:27, 27 May 2010 (UTC)[reply]
Well, I'm not concerned about that. What matters to me is how to evaluate the sources I see, and I don't pretend a corporate press release is "self-published" or "published". It's neither; in terms of trustworthiness, a corporate document doesn't fit in either of those two slots. It's a different thing that editors need to think about in a different way.—S Marshall T/C 00:37, 27 May 2010 (UTC)[reply]
A press release by a corporation is not SPS, because its author cannot send it to the public unless it undergoes professional scrutiny by multiple layers. It is generally a reliable source for WP purposes, but almost always a primary source, because it discusses the corporation in which the author is involved. And primary sources can only be used in limited ways and very carefully, per WP:PSTS. Crum375 (talk) 00:46, 27 May 2010 (UTC)[reply]
I think there are more shades of grey than that, and for me, it rather depends on the corporation. I'm sure a press release by MacDonalds undergoes professional scrutiny, but I wouldn't want to see a press release by Mrs Miggins' Pie Shop treated in quite the same way!—S Marshall T/C 00:52, 27 May 2010 (UTC)[reply]
It all depends on the topic. Mrs Miggins' press release announcing her latest Apple Pies would be a very reliable source for a description of her products, while MacDonald's would be OK for their latest menu addition. Neither one would be considered a reliable source for competing products. Crum375 (talk) 00:59, 27 May 2010 (UTC)[reply]
Well, maybe not always "very reliable" for a description of the products. "Mrs Miggins sells apple pies," fine. "Mrs Miggins sells the most popular apple pies in Oklahoma," I would not think is fine. But I'd believe MacDonalds if they said "MacDonalds sells the most popular burgers in the United States." Essentially, I think some press releases are more trustworthy than others.—S Marshall T/C 01:29, 27 May 2010 (UTC)[reply]
You may believe MacDonalds, but WP would require a better source for claims relating to competition. But we could always use in-text attribution and say, "According to MacDonald's, they sell more burgers than Burger King." The same for Miggins if she says she sells more than the store across the street. Bottom line: press releases by corporations are generally reliable primary sources, and generally to describe themselves. Crum375 (talk) 01:54, 27 May 2010 (UTC)[reply]
Again, I'm concerned that this is too simplistic and in fact I think it contradicts your earlier position, which was that verifiability is enhanced by the number of people paid to check the content. MacDonalds' press release would've passed before a lot of editorial eyes, but Mrs Miggins' one might've been written by the sales manager and sent to the newspaper. When it comes to verifiability, not all corporate sources are equal.—S Marshall T/C 09:46, 27 May 2010 (UTC)[reply]

Yes, you are correct that source "reliability" is not black and white, but it's also very much a function of the statement we are trying to support. There is no "absolute reliability", and we expect sources to be more knowledgeable than others about topics close to them, although less objective. So corporations can tell us very reliably how many types of widgets they sell, but not how good they are for your health, or how they stack up compared to the competition. And although a bigger corporation has more vetting layers of higher quality than the smaller ones, when it comes to their own products they would likely be equally accurate, because the small corporation typically has a simpler product line and less information to process, so things may balance out. Crum375 (talk) 11:17, 27 May 2010 (UTC)[reply]

Let's all agree that the trustworthiness of a source is a continuum rather than a yes/no issue on whether it's self-published, and move on from that. On the specific example of the health effects of a product, the manufacturer can be very reliable. For example, here in the UK, thanks to COSHH, many manufacturers have to issue product safety data sheets, and a COSHH sheet for Portland cement is a highly reliable source for the health consequences of handling it even if it comes from the manufacturer. But a press release from the same maker would not be so reliable, because the press release doesn't have to adhere to the same exacting standards.—S Marshall T/C 12:04, 27 May 2010 (UTC)[reply]
Yes, source reliability is a continuum, as you say, and some are more reliable than others for the same information. This is where we have to use our editorial judgment and decide (by consensus) for each particular situation which source is best, among potentially many which are at least minimally "reliable". Crum375 (talk) 12:13, 27 May 2010 (UTC)[reply]

Specific proposal

I propose adding the following text to the top of WP:SPS:

Self-publication is the publication of a work by its author, without the involvement of an established, third-party publisher. This includes any and all individuals, small groups, and corporate authors who publish their own works on paper, electronically, or in any other media form, so long as the author is also the publisher.



Self-published and non-self-published sources may or may not be independent of the subject.[1] Self-published and non-self-published sources can be primary, secondary, or tertiary sources.[2]

  1. ^ Examples: A book that is both written and published by a historian about the Roman Empire is an independent, self-published source. Memoirs written by a retired politician and published by a major publishing house is a non-independent, non-self-published source.
  2. ^ Examples: A blog posting about a house fire, written by the person whose house burned down, is a primary, self-published source. A newspaper story about the same fire, written by a reporter on the scene, is a primary, non-self-published source.

I propose keeping all of the other text in the section the same (except, possibly, cleaning up the comma splice in the first sentence). This does not change the policy; it only provides a basic, verifiable (e.g., [2][3][4]) definition of what a self-published source is and dispels some unverifiable myths whose existence is amply proven by this discussion. WhatamIdoing (talk) 23:10, 26 May 2010 (UTC)[reply]

Your definition would rule out the New York Times as a source, since it is a corporation which publishes material on its own. Crum375 (talk) 23:33, 26 May 2010 (UTC)[reply]
No, it wouldn't. The New York Times is published by Arthur Ochs Sulzberger, Jr. — not by itself, not by its journalists, not by its owner, not by its editors — and that is practically the definition of an "established publisher", which the above directly names as a criterion for non-self-published materials. WhatamIdoing (talk) 23:36, 26 May 2010 (UTC)[reply]
The New York Times is a corporation which publishes its own material. It creates the content, vets it, and publishes it. According to your proposal, it would be a "self publisher" and therefore unreliable. Crum375 (talk) 23:45, 26 May 2010 (UTC)[reply]
You're missing some critical subtleties: The New York Times Company (the corporation) publishes its own materials — advertising rate cards, for example. It does not publish The New York Times (the paper). It owns the paper. The publisher is Arthur Ochs Sulzberger, Jr., and nothing in The New York Times (paper) except his rare publisher editorials is both written and published by him (and therefore self-published). WhatamIdoing (talk) 23:57, 26 May 2010 (UTC)[reply]
You're trying to introduce unnecessary confusion. Again I have to ask: what would be the benefit of the change in terms of how we approach sources in articles? SlimVirgin talk contribs 23:59, 26 May 2010 (UTC)[reply]
No, I'm trying to get editors to quit telling me that a source written by X and published by Y is self-published, and conversely that a source written and published by exactly the same entity is non-self-published if enough lawyers were involved. Editorial oversight is not the same as non-self-publication. WhatamIdoing (talk) 00:06, 27 May 2010 (UTC)[reply]
If a source written by X and published by Y is known or expected to have multiple vetting layers, or scrutiny by paid professionals who check it for accuracy and potential liability before it's able to go out to the public, then it's not self published. If author X is able to just press a button and send his material to the public, then it is self published. That's the basic framework. Use your common sense for in-between cases, where the guiding principle is how many vetting layers, including legal scrutiny, exist between the original author and the published document. Crum375 (talk) 00:14, 27 May 2010 (UTC)[reply]
Show me the reliable source that adds anything like your "guiding principle". Just show me a dictionary definition that says "Anything with editorial oversight or layers of vetting can't be self-published, even if the author is exactly the same as the publisher." Just one reliable source that says "Whether a publication is self-published is determined by how many vetting layers, including legal scrutiny" are involved. Just one decent source -- that's all I need. WhatamIdoing (talk) 00:18, 27 May 2010 (UTC)[reply]

I quoted you a source above, which says "The simplest definition of self-publishing is when an author produces and publicizes his own book for public consumption." There are two separate issues here, self publishing and reliability, which are very related. The more vetting layers, the more we consider a source "reliable" for WP purposes. But to be not self published, you just need to have "some" editorial oversight, so you can't just "produce and publicize your own book for public consumption", per the above definition. Not being self published does not make you a reliable source, but it is a step in the right direction. Crum375 (talk) 00:31, 27 May 2010 (UTC)[reply]

Nope, I'm not finding the word "vetting" in your definition. Do you? Do you see "produces and publicizes his own book for public consumption without editorial oversight" in the definition? Or does the sentence stop without mentioning this concept? WhatamIdoing (talk) 00:49, 27 May 2010 (UTC)[reply]
Vetting means that other professionals are involved in scrutinizing the product for liability, accuracy, and other issues. If that exists, then by definition it won't be an "author [who] produces and publicizes his own book for public consumption". Professional vetting and legal scrutiny by others prior to publication is clearly the opposite of producing and publicizing your own material, i.e. not SPS. Crum375 (talk) 01:05, 27 May 2010 (UTC)[reply]
I said, "show me a source," not "draw your own, unsupported conclusions based on your assumption of the most common circumstance that results in self-publication". As far as I can tell, your source does not say a single word about editorial oversight. It does not make an exception for the author who hired twenty lawyers to vet the manuscript, even if he took every single bit of their advice. I see nothing that says "produces and publicizes his own material except when the author is a forty-person committee in a large corporation with a formal structure that requires six vice presidents to sign off before publication".
What I see is "an author [who] produces and publicizes his own book for public consumption" -- end of sentence, no further conditions, no exceptions.
Let me be clearer: You apparently believe that "A source is not self-published, even if the author and the publisher are the same individual human, so long as enough paid professionals have vetted the manuscript". Show me a reliable source that says this plainly and directly enough to include that statement, in Self-publishing -- without violating WP:NOR.
(I believe that no such source exists, because nobody except you defines self-publishing this way, but I'm willing to change my mind if you can produce a source.) WhatamIdoing (talk) 01:17, 27 May 2010 (UTC)[reply]
You said, "You apparently believe that 'A source is not self-published, even if the author and the publisher are the same individual human, so long as enough paid professionals have vetted the manuscript'": Correct. If the publisher of the NYT writes an editorial in his paper, he will have lots of paid professionals vetting his manuscript, so he will not be publishing his own material by himself. And because he doesn't do it himself, he is not "self published", and neither is any other reporter or columnist on his payroll. They must all have their manuscripts undergo professional scrutiny by others before publication, so they are not "self published". Again, we use the term "self published" on WP in its most common sense, which corresponds to the common definition I linked to above, i.e. material published by the author without being screened by professionals prior to publication. Crum375 (talk) 01:33, 27 May 2010 (UTC)[reply]
(edit conflict) I'm sorry, but that makes no sense, to me. The amount of vetting affects the reliability of a source, as you've said. It has no bearing on whether the source is self-published.  Chickenmonkey  01:48, 27 May 2010 (UTC)[reply]
The amount of vetting affects the reliability, but essentially any vetting by paid professionals means the author is no longer doing the entire publication process by himself, and according to the definition I provided above, the source is no longer an SPS. Crum375 (talk) 02:01, 27 May 2010 (UTC)[reply]


So where's your source? If your additional restriction really is "the most common sense", then why is it not defined that way in any reliable source? If your made-up insertion of editorial oversight is "the most common sense", then why doesn't even one single reliable source mention that issue? Why do you keep claiming that the definition linked above includes language like "without being screened by professionals prior to publication", when nobody can actually find words like that in the source you're supposedly citing? WhatamIdoing (talk) 01:43, 27 May 2010 (UTC)[reply]

As I mentioned above, we use the term SPS in its most common meaning, which is an author doing the entire publication process on his own. I provided a link to that definition above. Therefore, once you introduce additional scrutiny and vetting layers, that author is no longer doing the entire publication process on his own, hence that is no longer an SPS. Thus the NYT is not an SPS, as expected. Not sure what else you want. Crum375 (talk) 01:59, 27 May 2010 (UTC)[reply]

I want a source that says your bit about "the entire publication process on his own". Your source doesn't say that. WhatamIdoing (talk) 02:10, 27 May 2010 (UTC)[reply]
My source says the "author produces and publicizes his own book for public consumption". You seem to read into it that it can include an entire professional organization which helps produce that material. But by your reading, it would make the NYT "self published". So clearly your reading is incorrect, and "author produces" means the author does it, not others. That is also the common understanding of every person: "self published" book means you write it up, and print it, without letting professionals screen it for accuracy or legal issues. Crum375 (talk) 02:58, 27 May 2010 (UTC)[reply]

For one, self-published does not equal unreliable. For two, The New York Times is a publication which publishes the work of reputable journalists reporting on third-party information. If The New York Times published a piece on how great it is, that would be of questionable reliability, right?  Chickenmonkey  23:58, 26 May 2010 (UTC)[reply]
No. On WP "self published" means an individual author who can press a button at will and distribute his material to the public. The New York Times is not "self published" by this definition, because we focus on the individual, not the corporation. And when the NYT writes about itself it becomes a primary source for that material; still reliable, but must be used very cautiously and in a restricted fashion. Crum375 (talk) 00:04, 27 May 2010 (UTC)[reply]
You keep saying "on Wikipedia". Why should Wikipedia redefine words? I agree that The New York Times is not "self published", but for apparently different reasons. I understand primary, secondary, and tertiary sources.  Chickenmonkey  00:09, 27 May 2010 (UTC)[reply]
Wikipedia does not redefine words. But some common words, which can have multiple meanings, are defined on WP so we can all be on the same page. Crum375 (talk) 00:16, 27 May 2010 (UTC)[reply]
It makes sense, if there are multiple meanings to a word, for Wikipedia to dictate which definition it is going to adhere to. That's true. However, the definition you're offering is one entirely created by some unknown entity, because WP:V does not contain it. Reading WP:V, I am entirely fine with how it is currently written. Where is this definition written that you keep quoting? I feel like I'm arguing, and I don't want it to seem like I am. So, if it does seem that way, I apologize ahead of time. In my opinion, I am on the same page as Wikipedia policy and you're quoting something I've never heard before.  Chickenmonkey  00:31, 27 May 2010 (UTC)[reply]

We define reliable sources by saying in WP:V, "In general, the best sources have a professional structure in place for checking or analyzing facts, legal issues, evidence, and arguments; as a rule of thumb, the greater the degree of scrutiny given to these issues, the more reliable the source." This tells us that reliable sources have in place multiple layers of vetting, for legal issues, technical accuracy, etc. The WP:SPS section tells us "Anyone can create a website or pay to have a book published, then claim to be an expert in a certain field. For that reason self-published media—including but not limited to books, newsletters, personal websites, open wikis, personal or group blogs, Internet forum postings, and tweets—are largely not acceptable." Clearly it focuses on "self published" as material published directly by the authors without professional vetting layers. SPS and RS are related, though not one-to-one, in that SPS is generally not RS, except about the author. But being non SPS does not make a source automatically reliable. Crum375 (talk) 00:40, 27 May 2010 (UTC)[reply]

That is, exactly, what it says. "The best sources" not "The only sources we can use". Also, "including but not limited to", means that is not the complete listing of materials that fall under "self-published". Yes it focuses on these materials, but it states that there are other materials that also fall under "self-published". I'm not concerned with your definition of "self-published" because I feel WP:V states it fine. Can we at least agree that just because a source is "self-published" (under either definition) doesn't necessarily mean it isn't reliable?  Chickenmonkey  01:01, 27 May 2010 (UTC)[reply]
As long as we agree that SPS is a source published by an author with minimal or no editorial oversight or legal screening, yes, it can be used as a reliable source to describe the author himself, per WP:SPS. Crum375 (talk) 01:16, 27 May 2010 (UTC)[reply]
A self-published source is a source published by the author of said source. We can agree to disagree on that. Also, per WP:SPS, a self-published source can be used as a reliable source on a subject upon which the author is proven to be knowledgeable, except with WP:BLP. I believe this discussion, or at least my part in it, has reached an end. Good day.  Chickenmonkey  01:30, 27 May 2010 (UTC)[reply]
Importantly, the quotation that Crum provides here is not in WP:SPS. It's in WP:SOURCES, and it is equally true for both self- and non-self-published sources. SOURCES does not say, "Non-self-published sources have a professional structure in place for checking or analyzing facts...": It applies to both self-published and non-self-published sources; to primary, secondary, and tertiary sources; to independent and non-independent sources. This is a description of editorial oversight. It is not a statement about whether the author and publisher are the same. (And again: I fully support the use of sources with editorial oversight. The issue here is whether Wikipedia is going to make up a definition of "self-published" as meaning "without editorial oversight" instead of following the reliable sources.) WhatamIdoing (talk) 01:08, 27 May 2010 (UTC)[reply]
There are no outside sources which tell WP how to formulate its policies. We use "self published" in the most common sense: a source which is typically an author publishing his own material, which is an individual pressing a 'Save' button to publish his stuff, with no additional layers of legal scrutiny or accuracy checking by professionals. And as such, an SPS is unreliable, per WP:SOURCES and WP:SPS, except as a possible source about the author himself. Crum375 (talk) 01:16, 27 May 2010 (UTC)[reply]
My definition includes your scenario; it also includes the scenario in which an individual presses the 'Save' button to publish his stuff, with dozens of layers of legal scrutiny, accuracy checking by professionals, an imprimatur from the Pope himself, and an endorsement from the Dalai Lama. My definition includes this scenario because all of the reliable sources include this scenario, because the common-sense definition includes this scenario, and because conflating self-publication with editorial oversight (or primary/secondary/tertiary status; or independence from the subject matter) creates confusion among editors who are told "press releases aren't self-published" when they most certainly are (and thus, for example, are ineligible under WP:N). WhatamIdoing (talk) 01:23, 27 May 2010 (UTC)[reply]
A press release by a corporation is very definitely not "self published" by the common definition, which I linked to above, and if your definition makes it so, then your definition is plainly wrong. A press release by a corporation undergoes serious vetting by multiple paid individuals, as well as legal counsel, before it goes out. This is definitely not "self publishing" by the common or WP definition, which says that "self publishing" is an author publishing his material "by himself". In the press release of a corporation, there is no single individual who makes up a press release and presses a button to publish it, any more than the NYT publisher can write his editorial that way. Corporations screen their output through multiple layers and therefore that's not "self publishing", not the NYT, and not Coca Cola. Crum375 (talk) 02:12, 27 May 2010 (UTC)[reply]
Are you aware that the definition you link above is entirely silent on the issue of "serious vetting by multiple paid individuals, as well as legal counsel"? That those words really, truly do not appear in the definition you cite? That there is, in fact, absolutely no prohibition in that definition against the self-published author hiring a large team of paid individuals and legal counsel? WhatamIdoing (talk) 02:27, 27 May 2010 (UTC)[reply]
The definition says "The simplest definition of self-publishing is when an author produces and publicizes his own book for public consumption." If there is a professional organization which vets the material, checking it for accuracy and screening it for legal issues, then the author is not producing his own book. Other professional people are doing it. Those other professionals are what makes the NYT not a "self published source". Crum375 (talk) 02:34, 27 May 2010 (UTC)[reply]
Your cited definition says the self-published "author produces and publicizes his own book". It does not say that the "author produces and publicizes his own book all by himself".
There's nothing in your cited definition that prohibits the author from hiring a proofreader to read the book, an artist to design the cover, a printer to print the book, a binder to bind the book, a magazine to tell readers about the book — not one word about these things. Your definition says, "author produces and publicizes his own book" and stops. WhatamIdoing (talk) 03:03, 27 May 2010 (UTC)[reply]

As I tried to explain to you, if you use your interpretation that "author produces" means "author plus a professional organization which screens the material produce", it would include the NYT as a "self published source". Ergo, your interpretation is wrong, and "author produces" means what it says, and nothing more. Crum375 (talk) 03:18, 27 May 2010 (UTC)[reply]

If you're right, then I'm sure you'll have no trouble showing me a reliable source that says self-publication is "author produces and publicizes his own book, but not when the author plus a professional organization which screens the material produces and publicizes his own book". Any reliable source. One reliable source. All I need from you is exactly one, single, solitary source that doesn't stop dead when it gets to the end of the plain statement that "author publishes his own work," but keeps going on with your caveats about professionals or layers or vettings.
And if you cannot possibly find one single reliable source that says this, then perhaps you'll admit that your personal, private definition is neither verifiable nor the "common" definition. WhatamIdoing (talk) 03:48, 27 May 2010 (UTC)[reply]
There is no "personal private definition". There is the most common definition which I have linked to above, which says that SPS means "the author produces his own book." It is you who is trying to interpret "the author produces" to include an organization of professionals screening the book for accuracy and legal liability issues. This interpretation would make the NYT a "self published source", so it's clearly wrong. If you think it's not wrong, then you have to explain why. Crum375 (talk) 04:00, 27 May 2010 (UTC)[reply]
"The author produces his own book" does not preclude the author from hiring help. Is it your theory that if the author doesn't physically print the pages himself, or hires a proofreader, or asks his lawyer to look it over, that it quits being "his own book"? WhatamIdoing (talk) 04:27, 27 May 2010 (UTC)[reply]
If the author has a paid professional organization in place, then that combined mechanism, the author as the producer of the raw data and the other staff as the means for vetting it for accuracy and potential liability, stops being "the author produces his own book." It might be the author's company produces his own book, if he is the owner, but that's not the common definition of "self" as in "self published". And again, if we accept your interpretation that the definition of "self published" does allow for a professional vetting organization to be there along with the author, it would include the NYT as "self published", so clearly your interpretation defies common sense. Crum375 (talk) 11:28, 27 May 2010 (UTC)[reply]
Is it your theory that the existence of a paid, professional staff makes the author not be the author, or that it makes the author not be the publisher?
Or perhaps I've misunderstood, and you think that the definition's use of the singular is the critical point? That is, that "the author publishes his own work" is self-publishing, and "the authors publish their own work" is non-self-publishing? WhatamIdoing (talk) 17:32, 27 May 2010 (UTC)[reply]
See my reply just below from 11:45, 27 May 2010 (UTC). Crum375 (talk) 17:42, 27 May 2010 (UTC)[reply]
I have not been finding myself to be lucky in interpreting your responses. Your response below, for example, says to me, "I have conflated self-publication with the absence of editorial oversight, and I think that WP:SPS says that reliability is sometimes low if the Library of Congress record has the same name in the "author" and "publisher" fields, but that it means that reliability is sometimes low if there is no editorial oversight".
(IMO, both are true on Wikipedia: self-publication is less reliable than non-self-publication, and publication (whether self- or non-self-) without editorial oversight is less reliable than publication with editorial oversight. IMO this page should have a "layers of vetting" section very similar to the "self-publication" section, and also an "independent from the subject matter" section -- but NB that they would be separate sections.)
Since I don't seem to be understanding your response, please consider answering exactly the questions I've asked:
Given that the author, if acting alone, would unquestionably be considered the publisher for my hypothetical source:
  1. If a paid, professional staff is involved, does the author-publisher quit being the author?
  2. If a paid, professional staff is involved, does the author-publisher quit being the publisher?
  3. If multiple humans write a source, and exactly the same multiple humans publish it, is this maybe "selves-publication" or "non-self-publication" in your mind? WhatamIdoing (talk) 18:12, 27 May 2010 (UTC)[reply]
Here's a practical example: Kelly Link self-published the chapbook 4 Stories by founding Small Beer Press. Small Beer Press has an editor. Does that mean Kelly Link did not self-publish her chapbook?
Source: about.com  Chickenmonkey  04:40, 27 May 2010 (UTC)[reply]
When an author owns his own publishing company, we need to make a judgment call as to whether it is "self published". If it's a large company which publishes works other than those by the same author, then it's more likely to be not "self published". So the NYT is clearly not self published, even when the chairman writes an opinion piece in it. If it's a cover for a vanity press, then it's still self-published. What we are looking for are paid professional vetters, who can tell the author, "this fact is wrong," or "if we publish this, we'll get sued." If it's basically the author pushing all buttons and publishing whatever he wants, he is self published. Crum375 (talk) 11:45, 27 May 2010 (UTC)[reply]
The source I provided from about.com regards Kelly Link as "self-published". The New York Times is published by The New York Times Company. Therefore, no, it is not self-published. For that matter, the journalists working for The New York Times are not self-published either; they are published by Arthur Ochs Sulzberger, Jr. It's a gray area, for sure. We, on Wikipedia, have to judge the journalistic integrity of such organizations as The New York Times, USA Today, and The National Enquirer, separately. None of them are "self-published", but have varying degrees of reliability. The same applies to materials that are "self-published", such as: corporate press releases, personal websites, and certain books. It doesn't matter if an editor regards a source as "self-published" or not, the same judgment of integrity is made. Therefore, you can continue believing your definition of "self-published" and I will continue believing my definition of "self-published".  Chickenmonkey  18:39, 27 May 2010 (UTC)[reply]
The NYT and its various related companies are effectively one single entity: there is no one company which writes the material and another which vets it. There is one combined entity which writes, vets and publishes the material. This is true for most newspapers, and of course it's not "self published" in the WP sense. When we say "self published", we mean it in the "vanity press" sense, that an individual author can write a document, press a few buttons and send it out to the world, perhaps after paying some fee. If he has to go through an organized system of professional fact checking and legal scrutiny, including people who can tell him "no, this fact is wrong", or "if you publish this, we'll go under", it's no longer a "vanity press" or a "self publishing" operation. Corporate press releases, although normally primary sources, are never "self published", since they normally undergo professional vetting and legal scrutiny before they are released. The only definition of "self published" that clearly distinguishes a small newspaper and a vanity press operation is that "self published" means you as an individual write the material, press some buttons and release the material yourself, with minimal or no interference. Crum375 (talk) 19:00, 27 May 2010 (UTC)[reply]

The New York Times is not "self-published" in any sense, WP or otherwise. The New York Times is published by The New York Times Company, just like The Boston Globe. Journalism is filled with gray areas. Yes, The New York Times and The Boston Globe are part of the same entity -- The New York Times Company --, but in terms of journalism, they are separate. This is especially true in how Wikipedia should treat them and the journalists they employ. In terms of reliability, they must all be judged separately based on their integrity. Corporate press releases should be treated differently from journalism, whether you think it's "self-published" or not. If The New York Times Company puts out a press release, it is "self-published" even though it probably went through internal vetting layers, unlike The New York Times Company publishing The New York Times or The Boston Globe.  Chickenmonkey  19:42, 27 May 2010 (UTC)[reply]

As I tried to explain above, that the NYT consists of several related corporations makes no difference, because you don't have one company writing and another company vetting. And in any case, most smaller newspapers are just one company, which does the writing and the vetting, and they are still not "self published" in the WP sense, any more than any other corporation. Crum375 (talk) 19:48, 27 May 2010 (UTC)[reply]
The New York Times doesn't consist of several related corporations. The New York Times Company does, is that what you mean? I don't believe newspapers are corporations and they should not be treated as such. I'm not saying any newspaper is self-published. Newspapers have writers, editors, and publishers. If a company releases a press release, like the one I linked to above, the company writes it and releases it and it doesn't have to worry about journalistic integrity.  Chickenmonkey  20:24, 27 May 2010 (UTC)[reply]
"I don't believe newspapers are corporations": If you can find one major newspaper which is not a corporation I'd be surprised. Can you please tell me which major newspaper (or other news organization) is not a corporation? Crum375 (talk) 20:35, 27 May 2010 (UTC)[reply]
I may be wrong, but I would say newspapers are publications which are owned and published by corporations. If you are correct, that newspapers are corporations, then that would mean they are recognized as separate legal entities from their owners. I don't believe this to be the case.  Chickenmonkey  20:56, 27 May 2010 (UTC)[reply]
I have no idea what you mean, sorry. You say, "newspapers are publications which are owned and published by corporations". What legal entity is a "publication"? Crum375 (talk) 21:10, 27 May 2010 (UTC)[reply]
Publication#Legal definition and copyright I'm certainly not an expert on this (or anything, really heh). My understanding is, newspapers are the publication of collected works of which the copyright holders have granted permission to be published, through their employment with the newspaper -- much like the writer of a book grants a publishing company permission to distribute said book.  Chickenmonkey  21:25, 27 May 2010 (UTC)[reply]

I think you may be confused. A publication is a piece of work, it's not an organization. A newspaper or magazine, as a legal entity or organization, is normally a corporation, as far as I know, though I am willing to learn more. Crum375 (talk) 21:54, 27 May 2010 (UTC)[reply]

No, really: A newspaper is a publication. See the first sentence of Newspaper: "A newspaper is a regularly scheduled publication..." See wikt:newspaper, "A publication, usually published daily or weekly..." Using "newspaper" to refer to "the entity that prints the publication" is just a figure of speech (a metonym). WhatamIdoing (talk) 22:00, 27 May 2010 (UTC)[reply]
Yes, exactly what WhatamIdoing has said here.  Chickenmonkey  22:47, 27 May 2010 (UTC)[reply]
Not at all. A newspaper is also a pile of paper, so it all depends on context. I was asking what kind of a legal entity is the company or organization which produces a newspaper, i.e. the legal entity that pays the various individuals to produce it and is liable for their output. As far as I know, it is generally a corporation, but I am willing to learn otherwise. Crum375 (talk) 23:27, 27 May 2010 (UTC)[reply]
Yes: A newspaper is a publication — a pile of (printed) paper (with a certain type of content). Its legal status is essentially the same as a book — a thing (with intellectual property implications), not a group of people or an activity. A newspaper is not the business organization that writes, produces, and distributes it.
It is possible for the business that owns the publication to be any form of business, e.g., sole proprietorship, partnership, LLC, corporation — anything. In the US, where incorporation is a trivial act and provides asset protection and tax benefits, most, but certainly not all, newspapers are owned by a corporation. The New York Observer and The Philadelphia Inquirer, for example, are owned by limited liability companies. The companies (whatever their form) fairly often own more than one newspaper; for example, in my local area, the daily paper is one of about 50 owned by a large, privately held corporation, and then there's a small-town weekly that is owned by a company that also happens to own the weekly in the next town over. Most of the employees work on both publications; altogether, the business has one publisher, one editor, and one reporter (and five non-editorial employees) writing, producing, and distributing the two newspapers.
In those parts of the world where incorporation is more of a bother, the smaller businesses might be less likely to choose incorporation. WhatamIdoing (talk) 01:35, 28 May 2010 (UTC)[reply]

Crum, can you look at these two sentences, and tell me if they are identical?

  1. "Self-publishing is when an author produces and publicizes his own book for public consumption."
  2. "Self-publishing is when an author produces and publicizes his own book for public consumption, with minimal or no interference." WhatamIdoing (talk) 19:16, 27 May 2010 (UTC)[reply]
No, they are not "identical", but they are equivalent, and I can accept them both as defining "self publishing". Self publishing on WP is a guy and his dog pushing some buttons and getting their stuff out. When professionals stand in the way, who are paid to say "no" when the material doesn't pass muster, it's not self publishing in the common—and WP—sense. Crum375 (talk) 19:43, 27 May 2010 (UTC)[reply]
Okay: You personally believe that the common definition of self-publication includes "with minimal or no interference", right?
Now: Can you produce an actual, reliable source that includes any words even remotely like "with minimal or no interference"? Pretend that I'm trying to put sentence #2 into the lead of Self-publishing, and I need a source that directly says something like "with minimal or no interference" to overcome a WP:BURDEN challenge from an editor who says that the words "with minimal or no interference" are not supportable. Can you identify such a source for me -- one that says something like "with minimal or no interference" in a plain, direct, NOR-compliant fashion? WhatamIdoing (talk) 19:57, 27 May 2010 (UTC)[reply]
Yes, I do believe that "self publishing" is when you can distribute your stuff to the public "with minimal or no interference", and this is very compatible with the definition from the source I gave you, as well as the wiki article. But we are not discussing here the wiki article — that discussion belongs on the article's talk page. The topic here is the verifiability policy, and for policies we need working definitions, which are based on or related to the corresponding wiki articles, but not directly tied to them. Therefore, you won't normally see us providing sources or references inside policy pages, because they are not articles. Content policies need to clarify what we expect editors to do when writing articles, not explain to them how the world works. But in any case, I don't see a big difference between our working definition and the article, or the definition I linked to above. Crum375 (talk) 20:13, 27 May 2010 (UTC)[reply]

Another break

Do you agree that your additional criterion, while compatible with the definition, is not actually part of the definition you provided? WhatamIdoing (talk) 20:17, 27 May 2010 (UTC)[reply]
If you were to require a mathematically exact definition, none of this would work, and we'd have no encyclopedia. For our own working definition, and within the boundaries of common sense, the WP definition and the definition I linked to, as well as the current self publishing wiki article, are all the same or very similar. Crum375 (talk) 20:30, 27 May 2010 (UTC)[reply]
I'm not looking for a mathematically exact definition. You have asserted that the common definition includes words to the effect of "with minimal or no interference". If that sort of language really is part of the common definition, then you should have no trouble at all showing me a source that contains language at least remotely like that.
If, on the other hand, that criterion really isn't part of the common definition, then can you and I acknowledge that apparent fact -- a bit like adults who, having looked into the cupboard and discovered there's absolutely no food in it, don't keep saying that there really is a whole lot of food in the cupboard, even though we can't see any? WhatamIdoing (talk) 20:38, 27 May 2010 (UTC)[reply]
I am sorry for not following your cupboard analogy. In fact, I am not really sure what your point is. We have a working definition on WP which is consistent with those commonly used, such as the one I gave you and the wiki article. That definition is essentially that "self publishing" is a guy publishing his own stuff, with no paid professionals to stop him when he is wrong or when there are legal issues. It seems to me we are going around in circles. Crum375 (talk) 20:45, 27 May 2010 (UTC)[reply]
I look into the sources, and I find no words like "with minimal or no oversight".
You look into the sources, and you also find no words like "with minimal or no oversight", right?
But you tell me that words like "with minimal or no oversight" are actually part of the common definition -- despite the fact that neither of us can actually find words like this in any reliable source, right?
If this criterion is actually part of the common definition, why can't you show me one source that actually includes this criterion? Is this the kind of "common" like "so popular that nobody goes there any more", or "so common that none of the sources mention it", or "so full of food, that the cupboard is completely empty", i.e., "common" meaning "exactly the opposite of common"?
Is it, in fact, possible that this allegedly common criterion is not actually part of any source's definition -- that the common definition actually doesn't include a condition like "with minimal or no oversight"? WhatamIdoing (talk) 20:55, 27 May 2010 (UTC)[reply]

Again, for the working definitions in WP policies we don't need to follow some external definition verbatim, it's enough to be close, esp. since the external ones may vary. In this case, the wiki article says, "Self-publishing is the publishing of books, micropublishing on-line works and other media by the authors of those works", the definition I linked to says "self-publishing is when an author produces and publicizes his own book for public consumption", and WP effectively defines it as the guy who can press some buttons, and with no paid professionals or a vetting mechanism to stop him, can send his stuff out to the world. Those are all consistent with each other, and as I said, I think we are going around in circles. Crum375 (talk) 21:19, 27 May 2010 (UTC)[reply]

Are we agreed, then, that "with minimal or no oversight", or words to that effect, don't actually appear in any reliable definition? That, in fact, it is possible for an author to subject his work to an enormous level of oversight, and still publish his own work himself, and -- no matter how reliable the source might be as a result of that vetting -- the resulting publication is actually, technically, according to every definition you've found, still technically self-published, solely on the grounds that the name of the author is exactly the same as the name of the publisher?
Because that's what I want to add to this policy: a plain, bald statement that "Self-publication is the publication of a work by its author, without the involvement of an established, third-party publisher" -- and to stop there, without adding any unverifiable statements about oversight.
Does that work for you? Based on what you've learned, do you agree that the common definition of self-published does not actually include any statements about layers of vetting or editorial oversight? Do you think that the above definition is wrong? WhatamIdoing (talk) 22:07, 27 May 2010 (UTC)[reply]
No, you are missing or ignoring the key point. The issue is not that the author has "subjected his work to enormous oversight". That could still be self published. What will make it not self published is if there is a structure in place, of paid professionals, whose job it is to vet the author's raw output, for factual accuracy and legal liability, and to actually stop such output from being published if it fails to meet their standards. Even if the author happens to be the "big boss", like "Chief" in the Daily Planet, they can still tell him, "this is wrong, we can't say that", or "this is libelous, we'll get sued if we print it", etc. Just having the author himself vet his output, or subject it to vetting as you call it, when he can at any point decide, "this is enough", and press the "Save" button, is still "self publish". So again, "Chief" is not self published, even if his "Daily Planet" is very small, as long as there is a professional organization in place to control his output before it goes out. Crum375 (talk) 22:43, 27 May 2010 (UTC)[reply]
Okay, I think that "enormous oversight" is the same as "a structure in place, of paid professionals, whose job it is to vet the author's raw output, for factual accuracy and legal liability". Do you?
I think that the ability "to actually stop such output from being published if it fails to meet their standards" is a defining duty of the publisher. (The publisher is the person who decides whether it goes to the public.) Thus, if someone other than the author can "actually stop such output from being published if it fails to meet their standards", then the author is not the publisher. Do you agree? WhatamIdoing (talk) 22:50, 27 May 2010 (UTC)[reply]
No, "enormous oversight" is not the same as "a structure in place, of paid professionals, whose job it is to vet the author's raw output, for factual accuracy and legal liability." The question is only whether there is such structure, not whether there is "oversight" of any kind. And the key point is that those professionals who vet the output before it's published are paid to stop it if it's libelous or factually incorrect, etc. You are forcing the position of a "publisher", which may or may not exist. That publisher could be just a figurehead, or someone who does actual vetting, or completely nonexistent as an individual, in which case it is simply the corporate "person". But there is no need for any such "publisher" for the professional structure to be in place and strictly enforced, which will make the publication not "self published". And no, I don't agree with "if someone other than the author can 'actually stop such output from being published if it fails to meet their standards', then the author is not the publisher." Take my "Chief" example from above. Chief could be the publisher, not the editor in chief, but he could still be stopped from releasing junk or libel. He could potentially override the objections, but not likely, because all those objecting pros would make excellent plaintiff witnesses if he is subsequently sued for libel. Crum375 (talk) 23:40, 27 May 2010 (UTC)[reply]
Just what is your structure supposed to be doing, if not overseeing things? Twiddling its thumbs?
Yes, I really am forcing us to confront the idea of a publisher. I'm doing that because every single relevant reliable source believes that self-publication requires this idea that publishers exist.
In my world (and the world of the reliable sources) a publisher can be an individual human, a group of humans, or a corporation. Whatever entity decides to produce and distribute a publication is the publisher. Every single publication was published by someone. Publications do not spring forth fully formed from the forehead of Zeus: they are deliberately published by their publishers.
In my world (and the world of the reliable sources), an author can be an individual human, a group of humans, or a corporation. Whatever entity writes the publication is the author. Every single publication was authored by someone. Publications do not spring forth fully formed from the forehead of Zeus: they are deliberately written by their authors.
Publications are written by people we call "authors" and published by people we call "publishers".
If Chief can be stopped by someone — really, truly stopped, against Chief's own will, not merely advised in the strongest possible terms that this is a disastrous idea that will land them all in court and told that if Chief doesn't come to his senses right now, then the lawyer is going to call for the nice men in the white coats to bring a padded truck ASAP, because Chief really, really, really needs to make a different decision — then Chief is not the publisher, because Chief is not the entity that actually decides to produce and distribute the publication.
Do you see how this works? "Person who writes = Author". "Person who publishes = Publisher". "Person who follows other people's directions about publication = Not the publisher". WhatamIdoing (talk) 23:57, 27 May 2010 (UTC)[reply]

The simplistic division of labor you specify, into an author (or authors) vs. a publisher (or publishers), is incorrect, even in large publishing houses, because there are intermediaries between the ones who do the writing and the ones who approve the release for publication. In fact, those intermediaries, who do much of the factual and legal legwork, contribute significantly to the overall vetting structure. But you don't need to have this polarized structure at all. You can have an author, a vetting organization, perhaps some final proof reading, and a final mechanical publishing step. The "big boss" could be uninterested in the daily details, only that the material is well vetted. And he himself may end up writing some stuff, which he expects to be vetted no less than that of his underlings. In such cases there is no "authors" vs. "publishers" dichotomy, simply one or more authors, a professional vetting structure, and a purely mechanical final publishing step. The "publisher" concept is not needed to make the output reliable, and not "self published". Again, on WP by "self published" we specifically refer to the concept of individuals creating and distributing their own content, with minimal if any professional vetting. This is the most common definition of the term, and this is what we mean by it in our content policies. Crum375 (talk) 00:21, 28 May 2010 (UTC)[reply]

I agree that it can, on occasion, be difficult to identify exactly who the authors are, and who the publishers are. The fact that it's hard to figure out exactly which names to put in the "author" blank or the "publisher" blank on a form — and the fact that far more people might be involved than those who do authoring and those who do publishing — does not change the fact that every single publication was authored by someone and was published by someone, or the fact that when these "someones" are the same, then the publication is self-published according to every single reliable source on the planet.
Your definition is NOT "the most common definition". It is your own, made-up definition. You have failed to find a single source that shares your opinion. To quote Jimbo, if it is true that "with minimal if any professional vetting" is part of the common definition, then it should be easy to supply a reference, shouldn't it?
I'm really not sure how to get this through to you, without resorting to something that sounds a lot like I'm saying, "Liar, liar pants on fire": There is not one single reliable source on this planet that restricts self-publication to situations in which the author is the same as the publisher "with minimal if any professional vetting." Not one. For you to keep saying that this criterion, which cannot be found anywhere, is "the most common", is verging on intellectual dishonesty. If your definition cannot be found anywhere, it is NOT "the most common definition". Definitions that cannot be found in any dictionary anywhere in the world are NOT "common" definitions. WhatamIdoing (talk) 00:46, 28 May 2010 (UTC)[reply]
My definition is the same as the common one that I supplied, and consistent with the current wiki article. The point about the vetting is simply extra explanation, since many editors could see a typical newspaper as "self published", i.e. when it clearly isn't so under our definition. So what distinguishes a small newspaper from the guy and his dog who just push a button to publish their stuff? The professional structure that the newspaper has in place to vet the material. This is our working definition, so editors will be able to know when an operation is self published, like the button-pushing guy, vs. not self published, like the small newspaper or equivalent. Crum375 (talk) 00:57, 28 May 2010 (UTC)[reply]
That's a good explanation of "the specific kinds of self-published sources that we are particularly worried about" (and IMO with good cause); it is not, however, an accurate definition of "self-published". It is self-published-plus, not just plain self-published.
I think we need to address self-published ("plain", not "plus") in this section. I think that we need another, separate section that addresses failure to vet things. I believe that "Is the author the publisher?" is a completely separate question from "Did that source use reasonable controls?".
For example, Dressed to Kill (book) is self-published: The authors are the only people involved in the tiny business that nominally published it. It is also badly vetted and talks absolute nonsense about the cause of breast cancer.
By contrast, Coca-Cola's website was written by the business, and published by the business -- which makes it "self-published" -- but it is well vetted, with excellent controls (within reasonable limits, that are apparent to you). The fact that it's well reviewed doesn't change the identity of the author or the publisher (and therefore its status as a self-publication), but that fact does change the reliability.
Finally, consider the case of the ravings of a crackpot, or a hoax, that is published by another, completely separate entity: Consider, e.g., The Archko Volume, written by W. D. Mahan (now dead, and mercy on his soul) and currently published by the major publishing house McGraw-Hill (among others). This is not self-published, but it is badly vetted. No matter how much distance is between the author and the publisher, it is still a badly vetted hoax -- but it is (no longer) self-published.
I proposed the addition of a four-sentence definition/description above; you basically seem to agree with it at this point, but you seem to want it to talk about more than just self-publication. Here's the text; would you read it again, and let me know if this works for you, strictly as a definition of the single consideration of self-publication, not as a catalog of every important consideration of reliability?

Self-publication is the publication of a work by its author, without the involvement of an established, third-party publisher. This includes any and all individuals, small groups, and corporate authors who publish their own works on paper, electronically, or in any other media form, so long as the author is also the publisher. Self-published and non-self-published sources may or may not be independent of the subject.[1] Self-published and non-self-published sources can be primary, secondary, or tertiary sources.[2]

  1. ^ Examples: A book that is both written and published by a historian about the Roman Empire is an independent, self-published source. Memoirs written by a retired politician and published by a major publishing house are a non-independent, non-self-published source.
  2. ^ Examples: A blog posting about a house fire, written by the person whose house burned down, is a primary, self-published source. A newspaper story about the same fire, written by a reporter on the scene, is a primary, non-self-published source.
Thanks, WhatamIdoing (talk) 02:00, 28 May 2010 (UTC)[reply]
    I disagree with the premise that "an established, third-party publisher" is required, or even a key criterion, for sources not to be "self published". There are many reliable sources which have no "third party publisher", but have a good vetting structure in place and are therefore not "self published" within the wiki policy meaning. The insistence on a "third party publisher" would turn these reliable sources into "self published sources", which on WP are considered by default unreliable, so all existing articles which depend on such sources would be in limbo, for no good reason. Clearly this won't fly. Crum375 (talk) 02:08, 28 May 2010 (UTC)[reply]
    I took that from a source, but I think other sources would say "established or third-party" rather than "established and third-party".
    Would inserting the word "or" reconcile you to this definition? WhatamIdoing (talk) 02:53, 28 May 2010 (UTC)[reply]
    I don't see how it would help anything. If you take all existing reliable sources which have a good reputation for accuracy and fact checking, with a professional vetting structure, and have no identifiable individual or corporate "publisher" in place other than the organization or group proper, they would be instantly deemed "self published" and hence unreliable by this change. Crum375 (talk) 03:01, 28 May 2010 (UTC)[reply]
    "Self-published" does not equal "unreliable", and I don't see how this change would alter any source, anyway.  Chickenmonkey  03:05, 28 May 2010 (UTC)[reply]
    "self published" = "unreliable" on WP. In general, such sources can only tell us about themselves, if they are not unduly self serving. Crum375 (talk) 03:14, 28 May 2010 (UTC)[reply]
    That's simply not what it says. It says self-published sources are "largely not acceptable". Self-published sources should be used with caution. We use our best discretion to determine if the self-publisher is reliable. Self-published sources should never be used as third-party sources on living people.  Chickenmonkey  03:23, 28 May 2010 (UTC)[reply]

    "Largely not acceptable" means that by default they are unacceptable. You have to make a special case to show that they are acceptable, and normally it would be only when they talk about themselves, per WP:SELFPUB. Crum375 (talk) 03:31, 28 May 2010 (UTC)[reply]

    What? I'm not an expert on the publishing field; so, I can accept that I may not be one hundred percent clear on what exactly "self-published" means (though, I think I do), but I know what "largely" means and it's not "completely".  Chickenmonkey  03:35, 28 May 2010 (UTC)[reply]
    If you read the entire WP:SPS section you'll see what "largely" means. It refers to two special SPS exceptions: one is the established expert allowance, and the other is any SPS describing itself. There are no other listed exclusions, and "largely" refers to those two only. Crum375 (talk) 03:41, 28 May 2010 (UTC)[reply]
    That's what I said. We use our discretion to determine if the self-publisher is reliable (either is an expert, or is talking about the self-publisher itself). The fact that there are exceptions means "self-published" does not equal "unreliable".  Chickenmonkey  03:49, 28 May 2010 (UTC)[reply]
    The vast majority of sources don't meet those exceptions, which means that for all of them, being downgraded from reliable (today) to unreliable (because of being reclassified as SPS) would cause all articles currently relying on them to disappear, or be drastically chopped. This is for sources that have an internal structure run by paid professionals for vetting their published material, and do a good job of fact checking and liability screening. I don't think you'd get this kind of decimation of previously-reliable sources to be accepted by the community. Crum375 (talk) 04:50, 28 May 2010 (UTC)[reply]

    Confusion

    Crum, I'm confused by your response. You don't think that the publisher of a large newspaper would generally be considered an "established publisher"? Or is your concern that editors whose answer to the question, "Who is the publisher?" is "I haven't the foggiest idea" would declare that "unknown publisher" is "same name as the author"? WhatamIdoing (talk) 03:09, 28 May 2010 (UTC)[reply]

    Where did I say that the publisher of a newspaper is not considered "established"? And yes, if you have a group where no one in particular is identified as "the publisher", its publisher would be by default the group itself, which is a tautology, since every source is a publisher in a sense, because it published the material in question. Crum375 (talk) 03:19, 28 May 2010 (UTC)[reply]
    You said, "if they...have no identifiable individual or corporate "publisher" in place other than the organization or group proper, they would be instantly deemed "self published"..."
    On what grounds do you claim that anything published by an organization or group would be deemed self-published? On the grounds that the editor can't figure out who the publisher is, and sloppily equates "I don't know" with "Author's name"? On the grounds that the editor thinks that a newspaper publisher isn't "an established publisher"?
    This really isn't that hard:
    • This source was written by ("author") Clifford Krauss and John M. Broder. It was published by ("publisher") Arthur Ochs Sulzberger, Jr. Are these the same? No? Then it is not self-published.
    • This source was written by Giles Whittell. It was published by a member of the News International Group (very probably Times Newspapers Limited). Is "Giles Whittell" the same as "a member of the News International Group"? No? Then it is not self-published -- even though there is "no identifiable individual or corporate "publisher" in place other than the organization or group proper."
    It's exactly the same analysis: "Do the author and publisher match?" is the question, not "Who is the publisher?". So where's the problem? WhatamIdoing (talk) 03:50, 28 May 2010 (UTC)[reply]
    The problem is that you say "as long as the author is also the publisher", and in a different place above you said that both "author" and "publisher" are amorphous terms that could cover groups of individuals. Therefore, for any smaller organization, where the exact composition of the "author" and "publisher" groups is vague or has significant overlap (even by their own members), you'd end up with a situation where author (group) = publisher (group), therefore, "self published", therefore, unreliable. So according to your proposal, such an organization would be deemed unreliable and not usable as a general source even if they had a structure in place, consisting of paid professionals, for vetting their output. This is unacceptable, because it conflicts with WP's concept of "self published", and would render many currently well sourced articles unsourced. Clearly this does not make sense. Crum375 (talk) 04:03, 28 May 2010 (UTC)[reply]
    Yes: Where author (group) = publisher (group), it is self-published.
    No: Where author (group) = publisher (group), it is not (necessarily) unreliable.
    Self-publication is not a fancy way of spelling unreliable. Many self-published sources are highly reliable sources. WhatamIdoing (talk) 04:46, 28 May 2010 (UTC)[reply]
    No, groups can be so amorphous that you won't know for sure who is the "publisher" and who is the "author", yet they can have a high quality structure in place for fact checking and legal screening. They can have a reputation for accuracy and fact checking, and yet they'd become "self published" overnight and hence unreliable if your version becomes policy. That's clearly unacceptable. Much WP content is based on such sources, and we can't downgrade them to "unreliable" and "self published" just because you introduce a new requirement for "third party" publisher. Crum375 (talk) 04:59, 28 May 2010 (UTC)[reply]
    If you do not know who the author is, or you do not know who the publisher is, then you cannot determine whether a work is self-published.
    Consequently, in those situations in which "you won't know for sure who is the "publisher" and who is the "author"", the only answer you can give to "Is this self-published?" is I DON'T KNOW. Do you understand that?
    Would you be happier if the first proposed sentence were shortened to "Self-publication is the publication of a work by its author"? I had thought that an explicit exception for the involvement of an established publisher would have pleased you, by making it perfectly clear that The Times is not self-published. WhatamIdoing (talk) 15:57, 28 May 2010 (UTC)[reply]
    The problem is that your proposal leaves the critical terms vague. Who is "author"? One human? Two? Twenty? The entire organization, or a significant subset? And same for "publisher", is it a human person? A corporate entity? What happens in the case of a smaller organization where many people participate in the writing and publishing process, who are the author and publisher in such cases? Is the material "self published" if both are amorphous?
    To summarize, as I see it, your proposal will at best add confusion, at worst knock down a big fraction of article content which is currently based on reliable sources which have a good reputation for accuracy and fact checking with good structure in place for vetting their content by professionals. I just don't see any gain. Crum375 (talk) 16:07, 28 May 2010 (UTC)[reply]
    You don't think that we can use basic, plain-English, dictionary definitions for "author" and "publisher", e.g., wikt:author#Noun and wikt:publisher#Noun, and leave the application to the editors' best judgment?
    In defining self-published, I see a substantial gain: No editor will be able to look at this definition and say, "Ah, self-published means 'without lots of layers of vetting, no matter who the author and publisher are'" or "I see, on Wikipedia, self-published means 'a secondary source written and published by the same person, but not a primary source written and published by the same person'". And since both of those absolutely erroneous definitions have been put forward on this and other pages, surely defining the term as meaning "not these errors" would reduce the confusion that is being caused by these erroneous, unverifiable assertions.
    Do you think that it is desirable to have these demonstrably false definitions circulating around Wikipedia? WhatamIdoing (talk) 16:29, 28 May 2010 (UTC)[reply]
    (ec) Trying to reduce confusion by adding more is pouring gasoline on a fire. There is no necessary connection between "self published" and the primacy of sources — self published sources can be primary, secondary or tertiary, but they are almost always considered unreliable as WP sources, per WP:SPS. And SPS does mean that it is a source where the author and publisher are essentially the same, typically one human, without any layers of vetting such as fact checking and legal scrutiny. So it seems to me that all your proposal would achieve is more confusion, and not help in any way I can see. I believe SV asked you this before: can you provide an example of an article where the present policy language fails, in your view? Crum375 (talk) 16:56, 28 May 2010 (UTC)[reply]
    Gavin provided an example below, and since WP:N directly references WP:SPS on this point, notability is a relevant example.
    The absence of any definition at all on this page permitted an editor to make up his own definition that conflated primacy with self-publication. Do you actually think that it is desirable to have these demonstrably false definitions circulating around Wikipedia, uncontradicted by this page? WhatamIdoing (talk) 17:16, 28 May 2010 (UTC)[reply]

    There are lots of people around with misconceptions about our policies. In most cases, I think it's because they don't read them carefully. Do you have your own example of an article you were personally involved in, where in your view a change in WP:SPS wording would have helped? Crum375 (talk) 19:31, 28 May 2010 (UTC)[reply]

    My personal experience is being told by SlimVirgin at WP:Notability that Coca-Cola, Inc.'s website is not written and published by the company.
    I am hoping to not have to bloat WP:N with an apparently endless list of statements like, "Corporate websites are not evidence of notability, no matter how many 'layers of vetting' you believe they have. Slick brochures are not evidence of notability, even if they're secondary sources..." (and so forth).
    I am hoping, in fact, that we can all agree that a company's own advertisements are both written and published by the company, and therefore self-published, and so we can stick with our existing, brief, simple "no self-published stuff" statement.
    In my experience, having an admin like SlimVirgin assert that Coca-Cola's website isn't self-published as far as Wikipedia is concerned is exactly the kind of nonsense that multiplies into endless problems.
    But never mind my motivation: I ask again -- do you, or don't you, choose to go on record as saying that in your opinion Wikipedia benefits from not contradicting these demonstrably false definitions of self-publication? WhatamIdoing (talk) 19:59, 28 May 2010 (UTC)[reply]
    WAID, forget "self publishing" for the moment. In the case of most companies, their website is typically very reliable, but is also a primary source. This is because it generally describes the company itself, its products and its activities, hence it is involved in the content being described, and is considered a primary source for it, per WP:PSTS. As you know, we are not allowed to use primary sources to establish notability, and we are also not allowed to base an article solely on them. So the issue in your case is not if a company's website is "self published" (generally no), or "reliable" (generally yes), but whether it is primary, which it most often is. And for notability purposes, or to create an article without other secondary sources, it can't be used. Hopefully this helps. Crum375 (talk) 21:12, 28 May 2010 (UTC)[reply]
    No, you forget the circumstances that prompted my proposal, and answer my direct and repeated question.
    • You agree that these definitions are false.
    • You agree these false definitions exist on Wikipedia.
    Are you actually prepared to go on record as saying that Wikipedia is best served by not contradicting these demonstrably false definitions? WhatamIdoing (talk) 21:31, 28 May 2010 (UTC)[reply]
    You say that these are "demonstrably false", but you have yet to demonstrate an article you were involved in where there was any problem. The Coca Cola issue is hopefully resolved, since that was a primary source, and I don't see anything "false" anywhere, nor anything "contradictory" that needs fixing. If you can point to something specific, please do. And again, please show a real article you were involved in where the policies as currently written were problematic, in your view. Crum375 (talk) 22:06, 28 May 2010 (UTC)[reply]
    Am I to understand that you now believe that a source can only be self-published if it is a secondary source? I thought we had agreed (1) that such a definition was given [as Wikipedia's definition, not merely somewhere in the world] and (2) that such a definition was actually false. Have you changed your mind on that point? WhatamIdoing (talk) 22:43, 28 May 2010 (UTC)[reply]
    Sorry, not following. Can you show me where I said or implied that "a source can only be self-published if it is a secondary source"? I thought I made it quite clear before that self-published sources can come in all flavors: primary, secondary or tertiary. And I am still waiting for you to demonstrate a problem in the existing policy which you encountered while working on an article. Crum375 (talk) 01:37, 29 May 2010 (UTC)[reply]
    You haven't made that particular claim, but other editors have: '"self-published" refers to secondary sources'. Do we once again agree
    1. that "self-published" does not refer to secondary sources AND
    2. that this false definition was put forth by a mistaken, but well-intentioned editor? WhatamIdoing (talk) 02:14, 29 May 2010 (UTC)[reply]

    I am sorry but I don't have any special knowledge why other people do or say things. As I tried to explain several times now, SPS can come in all flavors: primary, secondary and tertiary. All of them are normally unreliable, except in limited circumstances, as described in WP:SPS. I am still waiting for you to show me the example from an article you have worked on, where you consider the existing policy problematic. Crum375 (talk) 02:31, 29 May 2010 (UTC)[reply]

    I'm going to take the above as an admission that you are aware that these demonstrably false definitions exist, and that you, personally, believe these other editors to be wrong.
    Now: You are requesting evidence that these false definitions have caused problems in the main namespace. Please show me the policy that says the community can't, or shouldn't, address known misconceptions until after the problem's blown up in the main namespace. WhatamIdoing (talk) 02:56, 29 May 2010 (UTC)[reply]
    I am sorry, but I haven't made any "admissions" about any "demonstrably false" anything — I am simply describing to you what the current policies say. It sounds from the above that you have not encountered any specific problem with the policies, but are trying to prevent future ones. But the way we design policies is to handle known cases, otherwise they become unrealistic or impractical. This is why it's important for you to provide actual examples where the current policy runs into problems, because then we can try to find a way to cover the problem case, without causing breakage elsewhere. In most situations, at least in my experience, the problem is not the policy itself but the way it is applied (or misapplied). Crum375 (talk) 03:07, 29 May 2010 (UTC)[reply]
    So you think that the linked statement didn't happen? Or you think that primary sources can't be self-published? Or, perhaps while you object to the term "admission", you really do think that an experienced editor really did espouse that definition, and you really do think that definition really is wrong? WhatamIdoing (talk) 03:25, 29 May 2010 (UTC)[reply]
    This is a talk page for a policy, and our goal is to improve the policy. I try to avoid guessing why editor X says Y, and I normally just assume good faith. I don't object to the word "admission" on principle, but I have not made any "admissions" that I am aware of. As for primary sources and SPS, I tried to explain this several times now, but I'll try one last time: SPS can come in all flavors: primary, secondary and tertiary. Which logically also means that each of those three flavors can be either SPS or not SPS. Not sure what more can be said about it. Crum375 (talk) 03:53, 29 May 2010 (UTC)[reply]



    It should be clear to all by now that reliability and self publication are independent characteristics. As for "who is author" and "who is publisher" the answers are the same as for anything else here. They are who reliable sources say they are. If the publication has any real credence it will be on WorldCat and in at least one depositary library such as the British Library or Library of Congress. It will have been catalogued to indicate the publisher. For major newspapers there will often be books about the newspaper, but they will not always reflect current information about the publisher. For that purpose official sources such as EDGAR are often useful, wherein legally required filings identify publishers that are publicly traded corporations. For a normally trustworthy publication or publisher, we extend that trust to the assumption that the byline correctly attributes the article to its authentic author (if for no other reason than that writers don't like seeing their work attributed to others and tend to make a stink when it happens). In short, "I don't know" is irrelevant. "I can't cite" matters. LeadSongDog come howl! 16:44, 28 May 2010 (UTC)[reply]

    Grandma is an author

    'Self-published sources' is an odd term... it seems to be more appropriate to an era where it would be very uncommon for one to have the ability or means to 'publish' (i.e. make available to a material amount of people) their opinions or reporting. Although I believe the term Vanity press was used to refer to on one hand, those with the means who could publish whatever fit their 'vanity' to a large amount of people, and on the other hand some hardscrabble authors who believed so much in their work they paid the way for their works to get published - some known today, likely many unsuccessful and unknown today. Self published sources would not always be primary sources, but when being secondary sources, when do they become reliable? To continue the analogies:
    1. Grandma blogs about her house burning down
    2. Gma blogs about her neighbor's house burning down
    3. Gma blogs about her neighbor's house burning down on a web site she has run for her retirement community for years, which has been mentioned and linked to multiple times by a local newspaper
    4. Gma blogs about her neighbor's house burning down on the local newspaper web site using some type of user content area
    5. Gma blogs about her neighbor's house burning down as a paid contributor to the local newspaper
    6. Gma blogs about her neighbor's house burning down and her blog post is converted to article format and published in the paper version of the local newspaper
    7. Gma writes about her neighbor's house burning down for the New York Times

    Maybe she stops being 'self-published' around step 4 or 5... when does she become a reliable source? I don't think the two necessarily are tied together.Cander0000 (talk) 03:10, 27 May 2010 (UTC)[reply]

    The distinction being proposed here appears to only add confusion, not relieve it. I'm not even sure if it's accurate or true, and I certainly cannot understand what "real-life" editing issues it could possibly hope to solve. Jayjg (talk) 06:45, 27 May 2010 (UTC)[reply]
    The Grandma analogy is clever - maybe we can kick her around a bit more to develop a shared understanding of this issue. Going back to the start of the analogy, let's speculate that "Grandma started the fire in order to make a bogus insurance claim". From this perspective, Grandma may not be a self-published source in all cases, but she has a strong connection with the subject matter, to the point where she is suspected of not being a disinterested source of information. Thus, whether the source is self-published or not, if the source is not independent of the subject, the reliability of the source must be brought into question regardless of whether it is a primary, secondary or tertiary source. --Gavin Collins (talk|contribs) 09:26, 27 May 2010 (UTC)[reply]
    I think that #5 is usually when Grandma is normally considered non-self-published, although the details of #4's set up matter here. (I assume that Grandma isn't the newspaper's publisher of record.)
    The fact that Grandma in #5 is the author of material published by someone else does not tell us whether Grandma has a massive conflict of interest/that she's independent of the subject matter, or whether the story was properly vetted (e.g., if the editors were out sick that week because Grandma poisoned them all with her famous fruitcake so that she could publish her story without any interference). These other factors definitely affect reliability, but they don't tell us whether Grandma is both the author and the publisher. WhatamIdoing (talk) 20:14, 27 May 2010 (UTC)[reply]
    You forgot one scenario... Grandma worked as a fire marshal for 40 years and has written several well critiqued books about firefighting... and then blogs about her neighbor's house burning down. Her blog is completely self-published... but due to the "expert exception" her blog might be considered highly reliable. Blueboar (talk) 20:20, 27 May 2010 (UTC)[reply]
    I agree: In that scenario, Retired Fire Marshal Grandma is both a highly reliable source and a self-published source. WhatamIdoing (talk) 20:40, 27 May 2010 (UTC)[reply]

    Example?

    It's hard to follow what the aim of the above is. Could someone give a real example of a problem that's being caused by our current use of "self-published"? SlimVirgin talk contribs 21:41, 27 May 2010 (UTC)[reply]

    Yes: The current lack of an explicit definition has resulted in three editors (including yourself) asserting that "self-published" means something different from what every single reliable source says it means. You and Crum have asserted that it means "self-published without editorial oversight", and George has asserted that it means "self-published secondary source".
    If we provide a plain statement of what self-published means, according to every single reliable source, then editors will not be able to make up their own unverifiable definitions to suit themselves, and they will not be able to falsely tell other editors that "Wikipedia defines self-published this way", when it does no such thing. WhatamIdoing (talk) 22:10, 27 May 2010 (UTC)[reply]
    What I'm requesting is a real example from an article showing that the way the policy uses "self-published" is causing a problem. SlimVirgin talk contribs 22:45, 27 May 2010 (UTC)[reply]
    I could give the example of a dispute about self-published sources regarding the author Dan Willis, for whom notability was claimed using only some forum postings and book listings. There was clearly a difference of opinion between the two administrators and myself about what is self-published, but the ambiguity of the current definition may be the source of the disagreement. Although Dan Willis may or may not have written the coverage cited in the article himself, the fact that he could have done so, or his publisher or agent could have done so, suggests to me that a self-published source is one that cannot be classed as being reliable because it has not been the subject of editorial oversight (aka fact checking), but also because the source is not independent of the subject matter. WP:SPS as it is written is a combination of these two missing characteristics: editorial oversight and independence. I think we need to separate the two to make this policy clear. --Gavin Collins (talk|contribs) 14:05, 28 May 2010 (UTC)[reply]
    You're talking about notability now, Gavin, and this policy isn't about that. But using Dan Willis, can you give me an example from that dispute of a source where calling it a primary source wasn't enough? That is, an example of a source that needed the words "self-published" to be added to the description before it could be disqualified. SlimVirgin talk contribs 16:00, 28 May 2010 (UTC)[reply]

    Newspaper and magazine "blogs"

    WP:NEWSBLOG currently says "Posts left by readers may never be used as sources." But is this not like letters to the editor? Does it not depend on whether the person or organisation writing can be identified as a notable expert on the subject under discussion? For example there was a magazine published during the Victorian period, Notes and Queries Online, which was a sort of editorially overseen paper blog. Information from N&Q was cited and included in academic publications. In most of the cases I have seen, academic selection of information extracted from N&Q was from known experts in their field. -- PBS (talk) 22:18, 26 May 2010 (UTC)[reply]

    Letters to the editor are screened in some way. Posts on blogs aren't, so we have no idea who's writing them. SlimVirgin talk contribs 22:42, 26 May 2010 (UTC)[reply]
    SV is working from incomplete information. Posts on some blogs are screened; posts on others are not. Carefully moderated blogs might have approximately the same level of editorial oversight as the letters to the editor page. (Which is to say: neither of these are particularly strong sources.) WhatamIdoing (talk) 23:12, 26 May 2010 (UTC)[reply]
    Can you give an example of a newspaper blog that screens posts in that way? SlimVirgin talk contribs 23:50, 26 May 2010 (UTC)[reply]
    The BBC isn't quite a newspaper, but Stephanie Flanders has a blog here that I would certainly accept as a reliable source.—S Marshall T/C 00:08, 27 May 2010 (UTC)[reply]
    I think you may be confused. Blogs on respected media are actually columns, and are considered reliable per WP:NEWSBLOG. What is at question here are reader comments posted to blogs or online articles. Crum375 (talk) 00:19, 27 May 2010 (UTC)[reply]
    Oh. Yes, I see that; I took SV's remark in isolation and replied to it. But now that I've read it, I'm not sure that PBS's question is about reader comments on blogs. I think it's more about these.—S Marshall T/C 00:28, 27 May 2010 (UTC)[reply]
    Saying that a reader post to a blog "might have approximately the same level of editorial oversight as the letters to the editor page", even if true (and that is unlikely) is hardly a ringing endorsement - quite the opposite, in terms of being a reliable source. In any event, posts to newspaper blogs are only screened, at best, in the most rudimentary way; that is, if they contain obscenities, or are overtly racist, or are simple spam, or contravene some law, or the like. Letters to the Editor are carefully screened, if for no other reason than the fact that it is expensive to print newspapers, and the Letters to the Editor page has very limited space. On the other hand, space for user comments on blogs is essentially free, so just about anything is allowed. Jayjg (talk) 06:50, 27 May 2010 (UTC)[reply]
    If I were to write a letter to editor of The Times and forge the signature of Karl Marx it would be incumbent on the editor to a) check the postmark to ensure it hadn't somehow been caught up in the postal system for decades, b) decide based on content whether it deserved publication anyhow, and if it did, c) add an editorial comment to alert any unsuspecting readers to the improbable attribution. Few open blogs make even basic attempts to verify the author's identity. For a newspaper-operated open blog, it would be straightforward to use customer account numbers or require an automatic confirmation email, but this is seldom done. For registered contributors there may be a karma system, but it only goes so far. LeadSongDog come howl! 13:58, 27 May 2010 (UTC)[reply]
    Such trusting faith.  ;-)
    In a large newspaper, the editor probably won't even see the envelope, and s/he'll assume that the letter is written by some reader whose name really is "Karl Marx" — a Karl Marx, e.g., [5], not the Karl Marx. They don't know, or really even care, if you're using your real/everyday name. You can — and people do — use middle names, maiden names, and made-up names. Typed, unsigned, plain-paper letters that are supposedly from local politicians usually get double-checked, and letters that are obviously from the same crank (every newspaper has at least one) get discarded, but that's really about it. WhatamIdoing (talk) 17:01, 27 May 2010 (UTC)[reply]
    As an example, consider the New York Times blogs: They do screen comments (at the minimal level), but they also select comments for highlighting in their "Comments of the Moment" section. That section receives the same level of active editorial selection, and suffers from the same limitations on space, as the letters page of a typical newspaper. IMO the "Comments of the Moment" section is (1) still reader comments and (2) selected by an editor. WhatamIdoing (talk) 17:01, 27 May 2010 (UTC)[reply]

    Let us suppose that an opinion is published in a newspaper and added to the newspaper's blog site, in which the newspaper columnist claims that something in a recently published book was wrong, and that the author of the book replies on the blog with a clarification. It would be incumbent on the newspaper to check that the author of the book and the author of the acknowledgement were one and the same person (if not they would leave themselves open to legal action). I'm thinking along the lines of David Irving's letter to the Times in 1966, but published in this day and age as a reply in a blog on www.timesonline.co.uk/tol/comment/blogs/. -- PBS (talk) 23:22, 27 May 2010 (UTC)[reply]

    The external form of a medium is not the definitive factor. Obviously most of what appears in most such blogs is worthless, but this is not universally true. Posts left by readers stand on the authority of their authors. In most cases this is not much, in some cases it is. Each case of this sort has to be justified, but it's become a regular mode of expression by the most highly reputed authors and publications. DGG ( talk ) 05:15, 28 May 2010 (UTC)[reply]
    Quite! But WP:NEWSBLOG currently says "Posts left by readers may never be used as sources." (my emphasis). -- it comes down to the old distinction we make about reliable sources, "the piece of work itself (a document, article, paper, or book), the creator of the work (for example, the writer), and the publisher of the work". Usually a newspaper reader's comment to a blog fails on two of the criteria: the content and the creator. But occasionally, as with letters to the editor, a reader's posting will tick all three boxes. As such the blanket ban imposed by "may never" is not always going to benefit the project. Should the wording be modified, or should editors rely on WP:IAR? -- PBS (talk) 10:36, 31 May 2010 (UTC)[reply]

    Conflicting clauses

    Addition by SlimVirgin at 19:37, 9 April 2010 - All material in Wikipedia articles must be attributable to a reliable published source to show that it is not a Wikipedian's original research, but in practice not everything need actually be attributed. I can't understand it. Is there wide consensus about this addition to the rules? X-romix (talk) 09:47, 27 May 2010 (UTC)[reply]

    Makes sense to me. It must be theoretically possible to attribute it, but in practice there's often no need to actually attribute it.--Kotniski (talk) 09:51, 27 May 2010 (UTC)[reply]
    Not true. If an article's coverage is not attributed with an in line citation, how can you tell if it meets WP:BURDEN if that is the only coverage there is? --Gavin Collins (talk|contribs) 09:56, 27 May 2010 (UTC)[reply]
    Depends who "you" is. Being attributed with a citation is only one (possibly fairly minor) factor in whether a particular reader is able to check a particular statement. If it's something for which 1000 sources can readily be found just by Googling, for example, then it really doesn't help much if we provide a specific citation. (I'm not saying we shouldn't, just that it's not an absolute priority.)--Kotniski (talk) 11:29, 27 May 2010 (UTC)[reply]
    "In practice there's often no need to actually attribute it" sounds like WP:ATA#Crystal to me. --Gavin Collins (talk|contribs) 11:55, 27 May 2010 (UTC)[reply]
    Not sure what you mean, but the fact that in practice not everything needs to be attributed has been WP's core policy from day one. Essays, which are random thoughts by random anonymous individuals, without consensus, cannot override policies or guidelines, and I am not even sure that the essay you cite is conflicting with this policy. Crum375 (talk) 12:02, 27 May 2010 (UTC)[reply]
    While it is possible to find and cite a source that says "2+2=4", we all know that it would not contribute much to the accuracy or reliability of WP to cite every detail that trivial. Of course editors considering the base 3 POV may slap on a {{cn}} and argue on the talkpage "That's wrong, 2+2=11 in base 3!", leading to further refinement of the article (although in general human readers work in decimal arithmetic and our anthropocentric POV reflects that). We often don't cite things that are so trivial unless someone actively challenges them. Every citation takes time and effort away from other work and there is a modicum of editorial judgement involved. Slapping {{cn}} on every assertion in WP would not be constructive. So far, this approach has worked out pretty well as most editors actually do want to get things (small r) right.LeadSongDog come howl! 13:32, 27 May 2010 (UTC)[reply]
    The key is that things need to be cited "when challenged or likely to be challenged"... This concept has a long history on Wikipedia and has firm consensus. What causes all the commotion, confusion and angst are statements that were originally added with the idea that they were "unlikely to be challenged" (and so were not cited)... but which are now being challenged.
    I think we all understand what a challenge is... although we often disagree as to whether a specific challenge is legitimate (or frivolous). But whether legitimate or not, I think it has been established that the ultimate solution to any challenge is to provide a reliable source. We may feel that it is a pointless hassle to do so, but doing so will resolve the challenge.
    We don't have a clear statement as to what "likely to be challenged" means. Nor can we. This is because "likely to be challenged" is often a judgment call... and often depends on how controversial the topic is. A good rule of thumb is... "when in doubt, cite it". But even then, editors will frequently disagree as to whether a statement is "likely to be challenged". That is in the nature of Wikipedia. Leaving aside the addition of deliberate OR and POV pushing... we are always going to have editors who add material that they assume does not need a citation (because they are sure that it is correct and non-controversial), but sometimes they will be wrong in that assumption... On the other side, we are also always going to have editors who take an overly legalistic, literal approach to challenging (and challenge things that realistically don't have to be cited). We can not legislate either out of existence, no matter how we word this policy. What we can do is encourage our editors to do good research, to not take challenges personally, and ask them not to make challenges frivolously. Blueboar (talk) 14:24, 27 May 2010 (UTC)[reply]
    I agree with X-romix: this amendment cannot stand. I think it is a very personal interpretation of WP:BURDEN that "in practice not everything need actually be attributed". I think this policy can only be silent on this issue, because I read this as a licence to stall any possible challenge. In practice, the one statement that is not attributed is going to be the one challenged and disputed. --Gavin Collins (talk|contribs) 16:39, 27 May 2010 (UTC)[reply]
    It's not an "amendment", since it's always been part of the core policy, from the day WP was founded in its current format. And it works very well in practice, since all WP content is based on it: anything challenged, likely to be challenged, or quoted must be attributed, anything else can be unattributed until challenged. All material, without exception, must be attributable to a reliable source. Crum375 (talk) 16:53, 27 May 2010 (UTC)[reply]
    It's not any kind of "personal interpretation" either, just an obvious statement of fact - just click around Wikipedia for a few seconds and see what a large proportion of statements are not actually attributed, and consider what a pale shadow of its globally successful self Wikipedia would be if all those statements were to suddenly disappear.--Kotniski (talk) 17:05, 27 May 2010 (UTC)[reply]
    Yes, this is correct: The policy is, and AFAICT has always been, that material must be verifiable, not already verified. Entirely unreferenced articles (although bad practice) are actually 'legal' on Wikipedia, so long as it is possible to verify the information (e.g., by asking your favorite web search engine about it).
    Additionally, there is not, and has never been, any requirement to use inline citations (so long as there are no direct quotations and no challenges). See WP:CITE#General_reference: This section wouldn't exist if Wikipedia actually required inline citations in every article. WhatamIdoing (talk) 17:07, 27 May 2010 (UTC)[reply]
    I tried to say that with this edit, but was instantly reverted. I couldn't be bothered to bring the matter to the talk page at that time so I left it. Right now, the policy says inline citations are a "must", which is fairly typical: policies document how the kind of editor who's active on policy pages thinks things ought to be done, not how the kind of editor who writes content thinks things ought to be done.—S Marshall T/C 18:47, 27 May 2010 (UTC)[reply]
    Inline citations are a must for anything challenged or likely to be challenged, and for all quotations. That doesn't mean that every single thing needs a citation. SlimVirgin talk contribs 18:52, 27 May 2010 (UTC)[reply]
    I don't agree with you. Jimbo wrote: "If it is true, it should be easy to supply a reference". [6]. I think that an article that lacks sources is a wide gate for mass (hundreds in one article) "mistakes", original research, conflicts of interest, hoaxes, nonsense, and low-quality and non-neutral articles. Falsifiers, propagandists and original researchers do not want to show their sources. Conscientious users can always supply references in all paragraphs of their text, or in the bibliography section of their article. X-romix (talk) 23:03, 27 May 2010 (UTC)[reply]
    Certainly: The absence of sources can cause all sorts of problems. But the fact remains that unless and until some editor actually gets around to challenging a given statement, Wikipedia does not actually require editors to support the statement with an inline citation (except for direct quotations). WhatamIdoing (talk) 23:17, 27 May 2010 (UTC)[reply]
    The absence of sources can cause all sorts of problems. The presence of sources makes article better and more helpful for readers and students, allows to check and clean up article with some errors and falsifications, allows to evaluate the quality of used sources. So, WHY NOT provide all sources in Wikipedia? I don't understand WHY I need not to read and write articles, but talking and challenging with falsificators, clowns and political propagandists who can't find sources for their own texts. X-romix (talk) 13:39, 29 May 2010 (UTC)[reply]
    S Marshall, I see where you're coming from, but BURDEN really only applies when a challenge exists. Once that challenge exists, an inline citation is required. Consequently, in the context of that particular section, an inline citation is actually required. WhatamIdoing (talk) 22:53, 27 May 2010 (UTC)[reply]
    Gavin collins is wrong. Not all material on Wikipedia must be verified. The "verifiable not verified" soundbite is longstanding. Often facts are such common knowledge that to verify them would be ridiculous, or at least of little practical importance. Where the boundary is crossed is an editorial judgment on a case-by-case basis. If an editor wants individual facts or opinions verified or attributed, then they should: 1. do it themselves; 2. tag them; 3. remove them if dubious or potentially violating BLP. Fences&Windows 17:18, 27 May 2010 (UTC)[reply]
    ...or 4. discuss the issue on the article talk page. or 5. discuss the issue at WP:RSN... there are lots of ways to deal with material that you think should be sourced but isn't. Which one is the best way to deal with it depends on the specifics of the situation. But Fences is correct... you must be able to cite even the most obvious statements... but you are not necessarily required to actually cite them. (unless they are challenged or likely to be challenged.) Blueboar (talk) 22:51, 27 May 2010 (UTC)[reply]
    • I agree with Blueboar and F&W. Inline citations are not required except for specific disputed facts. Requiring them is way beyond the standard of any normal academic tertiary source, of even the highest standards. Things must be referenced, certainly, but only in a sufficiently exact manner to permit verification, and nothing more is generally necessary. The only sorts of writing I know which actually use inline references for everything are legal opinions and medical textbooks, two very specialized genres, notorious for their lack of general readability. DGG ( talk ) 05:03, 28 May 2010 (UTC)[reply]
    • After my attempt at a FA and long experience at AfD, I've learned that if it can be challenged, someone will challenge it. There are Wikipedians who assume everyone's either a marketer, a spammer or a vandal—and if they've done a lot of new pages patrol, that suspicion's understandable. I'm finding myself giving one or two citations a sentence. (I actually started two yesterday, which is very unusual for me; they're Hannah Monyer and Dominique Reiniche, both BLPs. You can see I've ended up writing to a pattern: first sentence, name, date of birth, nationality, profession, with inline citation. Second sentence, assertion of notability, with inline citation.) Experience with suspicious patrollers has taught me that if you want to get past NPP and through the AfD process, one citation per sentence is the only safe way. But the encyclopaedia shouldn't be like that.

      Constant inline citations that appear next to the material they support should not be necessary. In a logical world, I should be able to write, "The following three paragraphs are based on Thompson 2002, chapter III," and provided Thompson 2002's in the bibliography, that should suffice. As a longstanding Wikipedian and autoreviewer, I should arguably be able to source something to "Mr D Smith (with his credentials as an expert), personal correspondence" and editors should trust me.

      But there's a vociferous group of editors who believe that inline citations are there so that they can conveniently check (and quibble) every sentence. I think we've all seen the editors at AfD who will challenge a source because it's not in English. ("I can't check it!"—Well, mate, that's not my fault. I can prove I've done the research, but I'm not going to teach you German, or drive to your house with half the contents of my bookshelf so you can read the offline sources. Whether or not you can check it is not my problem.)

      If I were writing for de.wiki, or fr.wiki, or for any print encyclopaedia, it would be enough to provide a bibliography. Only en.wiki has the inline citation rule and it makes it unnecessarily hard to write.

      Another problem is that when I say these things, editors come back to me with alphabet soup links to policy, combined with strongly-worded sentences in the emphatic declarative. They say to me, "We do things like this, because of WP:THISRULE. It's supported by a longstanding consensus", apparently without understanding that this is a logical fallacy. I'm not talking about how we do things now. I'm talking about how, in a logical world, things ought to be done.—S Marshall T/C 08:39, 28 May 2010 (UTC)[reply]

    Simply put, the idea that "All material in Wikipedia articles must be attributable to a reliable published source... but in practice not" is rubbish policy. --Gavin Collins (talk|contribs) 09:03, 28 May 2010 (UTC)[reply]
    That addresses one small part of my argument rather than the whole. Can I take it you agree with everything else I say?—S Marshall T/C 09:08, 28 May 2010 (UTC)[reply]
    I sympathise, but like you I have found that adding citations is the only way to avoid disputes, and all of Wikipedia policies hinge on being able to review the sources, so attribution is key to resolving these disputes. If there is a fallback position that attribution is not necessary in every case, then these disputes can't be resolved. I have been in a long mediation case where a group of editors were adamant that just because content was unsourced, it could not be labelled as original research, but once the case got underway it was shown that all the unsourced content was original research. This is why I object strongly to this "get out of jail free" clause. --Gavin Collins (talk|contribs) 09:24, 28 May 2010 (UTC)[reply]
    You seem to think that all of Wikipedia is the subject of dispute. Maybe you work in more controversial subject areas, but in many areas people don't go around disputing things all the time, so there really isn't any compulsion (whatever that would mean) to add sources for every statement. It would be quite nice if we did have a citation for every statement, but we don't want to give people the impression that they're somehow breaking Wikipedia's rules any time they add statements without citations - that would drive away even more editors than we currently are doing.--Kotniski (talk) 11:11, 28 May 2010 (UTC)[reply]
    There's a nub of substance in S Marshall's comment above. When stubbing a new article we do need to have a suitable citation for the assertion that establishes the subject's notability. Otherwise it simply should not survive. OTOH there are indeed entire classes of articles, such as discographies, that are very sparsely cited without really compromising accuracy. Sometimes things are not "likely to be challenged" simply because it is too easy to check. LeadSongDog come howl! 12:56, 28 May 2010 (UTC)[reply]
    I disagree: Subjects are notable if the sources exist, not if the sources are already cited in the article. Lion, for example, was a notable subject in its very first, utterly unsourced stub, not a while later when someone started adding sources. If one of those aggressive NPPers tags an unsourced article for deletion because the NPPer is ignorant of the existence of sources, then the NPPer has screwed up — an understandable error, but still a mistake.
    Yes, if we want to avoid disputes, we'll all do the equivalent of defensive medicine on Wikipedia — but that doesn't mean that we must write and cite to defend against potential challenges. WhatamIdoing (talk) 16:12, 28 May 2010 (UTC)[reply]
    (edit conflict) Well, I write/translate a lot of biographies, and Wikipedia's BLP policy presently treats anything biographical as if it had anthrax, leprosy and bird flu, so yes, I probably do work in more "controversial" subject areas.

    In response to Gavin Collins: I've written a lot of material that the average editor couldn't check—because the average editor doesn't speak the language the source is written in, or because he doesn't have the same books on his bookshelf. Things don't have to be checkable by everyone.—S Marshall T/C 12:59, 28 May 2010 (UTC)[reply]

    I agree that standalone articles do need to have a suitable citation to establish the article's compliance with WP:BURDEN, which is why I think saying that it's "not necessary in practice" should be dropped as it is a licence for original research. However, I don't agree with LeadSongDog that there exist facts that are "too easy to check"; my experience with fictional topics is that a lot of crap is contained in unsourced articles such as Gaius Baltar that seems plausible, but in reality is actually made up, and because there are no citations it is impossible to check whether it is original research or not. I propose that we drop the "everything need actually be attributed" clause altogether because it encourages sloppy editing. --Gavin Collins (talk|contribs) 13:40, 28 May 2010 (UTC)[reply]
    Stand-alone articles do not need a suitable citation until their contents are challenged. BURDEN does not say "The burden of evidence lies with the editor who adds or restores material and he has to provide that source the very second he adds or restores material." BURDEN does not apply until some other editor comes along and says, "Really? Then WP:PROVEIT." WhatamIdoing (talk) 16:17, 28 May 2010 (UTC)[reply]

    Leadsongdog, discographies are not an example of articles with few sources, because each disc (record or CD) mentioned is a citation (unless, of course, the label, cover, or case does not mention that the person who is the subject of the discography was involved with the disc.)

    Gavin.collins, fiction is a poor example to mention when trying to give examples of unacceptable uncited statements, because unlike the real world, anything is possible in fiction, so there is no way to know if a statement is true except by reading/viewing the work of fiction.

    For a more realistic example, consider this example from an article Gavin.collins has worked on, Moneylender:

    Moneylenders who are unregulated, engage in predatory lending or seek to enforce loan agreements by illegal means such as extortion are commonly referred to as loan sharks.

    Anyone familiar with the English language knows this is true. Anyone unfamiliar with English can easily look it up in almost any dictionary. No citation is necessary. Gavin.collins, are you willing to say that anyone could possibly start a good-faith dispute about this statement? Jc3s5h (talk) 14:14, 28 May 2010 (UTC)[reply]

    (edit conflict) Well, fictional or pop culture topics are perhaps a different animal. They tend to be attractive to fans, and fans are, by and large, people with strong opinions who believe themselves knowledgeable about their subject. Only some are correct, and that's a place where verifiability standards need to set the bar very high.

    Personally I think Wikipedia's coverage of fictional topics is beyond excessive in proportion to our coverage of factual ones, and it blows my mind that we've got people writing FAs about The Halo Graphic Novel and the Metallica Discography when the biographies of three quarters of the Gottfried Wilhelm Leibniz Prizewinning scientists, and about ninety percent of the national-level politicians of Europe, are still redlinks.

    But producing an article that's unchallengeably-referenced in every respect results in proseline. My last attempt at that—History of Hertfordshire—has 170 separate references, 14 footnotes and 27 volumes in the bibliography. It's totally neutral in point of view and it's also virtually unreadable, partly because I could've organised it better, but mostly because there's nothing to it but lede followed by a seven thousand word list of facts that are pretending to be prose.—S Marshall T/C 14:19, 28 May 2010 (UTC)[reply]

    • ... and in response to Jc3s5h, you were the person who reverted my change to inline citations. The problem with the line you're taking is that things that ought to be uncontroversial, aren't. I ought to be able to write:

      A human is a hairless ape ultimately descended from tree-dwelling primates.

      But, there's the Christian right who will insist on describing that as "theory" and wanting what they would call "legitimate alternative views" to be represented in the article. Many of them have long since departed for safer climes like Conservapedia, but not all, and the truth is that no matter what you write, there's usually some raving lunatic or someone with a point to make who'll go around challenging what ought to be uncontroversial.—S Marshall T/C 14:27, 28 May 2010 (UTC)[reply]
    The statement S Marshall puts forward as something that ought to be uncontroversial is anything but. Many religious people believe that humans have immortal souls and will experience an afterlife, but other animals do not. If they are right, there is a huge difference between humans and other animals, and there is justification for saying that humans are not apes.
    The approach of the cite-every-sentence crowd in this discussion, of giving examples of statements that seem uncontroversial, but upon closer examination are controversial after all, is the wrong approach. This crowd should demonstrate that all statements are controversial, and thus need a citation. I don't think they can prove any such thing, so I don't think their argument can prevail. Jc3s5h (talk) 14:40, 28 May 2010 (UTC)[reply]
    In response to Jc3s5h, the quotation from the article Moneylender was written by me, and to be frank, it's not very good, as it contains a form of mass attribution ("they are commonly referred to") that could easily be challenged. I don't think we should encourage unsourced content, even if I have been guilty of adding some myself.
    The biggest objection I have to unsourced content is that it invites what I call "goldfish editing", namely lack of attribution encourages other editors to nibble away at unsourced coverage, changing a word or a sentence here and there, but with no memory of what went before. Eventually the content may be changed hundreds of times, without any substantial change being made. For example, until citations were added to the lead paragraph of the article Accountancy, the unsourced content was the subject of tiny goldfish edits virtually every day. We need to encourage editors to add citations, not unsourced content, which is why the "not everything" clause needs to go. --Gavin Collins (talk|contribs) 14:45, 28 May 2010 (UTC)[reply]
    Wikipedia is built on unsourced content. Adding unsourced content (if you know that you could source it if you wanted to) probably adds far more to the value of the encyclopedia than spending the same amount of time adding citations. So no, I don't agree that "we need to encourage editors to add citations, not unsourced content". At least, not everywhere - it would probably be true in areas that are known to be controversial, but most areas aren't.--Kotniski (talk) 14:53, 28 May 2010 (UTC)[reply]
    Perhaps the discography example was poorly chosen. I've stubbed quite a few articles on academic journals. These I usually start out by populating a few fields in template:infobox journal, particularly the Title, url, ISSN, and OCLC fields. The latter two are in effect citations to the respective WorldCat entries. They do not meet a normal form of citation, but they still provide sceptical editors with an easy means of verification that I'm not just engaging in WP:OR. The link to the publisher's website is helpful for populating things for which SPS are normally accepted as reliable, e.g. the editor, publisher, frequency parameters. If I'll assert that it is the official journal of a certain society then I populate the infobox with a link to that society's website. Then I'll write the first line of the article with information from those sources, tag it with {{journal-stub}} and hit "Save page". I have not, at that point, made any formal citation, but I have still linked to sources that support the accuracy of what I say. I've yet to have a problem with new page patrollers using this approach. It isn't the form of citation that really matters, but that the information can be readily verified by anyone who cares to. Only a few articles about journals grow to any great size or accumulate large numbers of citations, but even at this stub stage, they have utility to readers. I then add redirects from the various abbreviated titles (with and without periods). Suddenly, all over WP, dozens of redlinks turn blue. LeadSongDog come howl! 15:37, 28 May 2010 (UTC)[reply]
    Sure, unsourced additions to articles are not great and I always work from and cite sources myself. I would never encourage anyone to add unverified information into Wikipedia. But disallowing unsourced content in the way Gavin.collins proposes would be a major policy change. It would in effect allow removal of all unsourced content and speedy deletion of all unsourced articles, which would be massively disruptive. This is why the "if challenged or likely to be challenged" bit is sensible - it allows the sourcing or removal of unsourced material to proceed at a measured pace rather than in a blitz of deletion. Fences&Windows 15:40, 28 May 2010 (UTC)[reply]
    I am not saying this policy should state that unsourced content is not allowed, but that the statement "in practice not everything need actually be attributed" is rubbish guidance, and should be removed. --Gavin Collins (talk|contribs) 21:17, 28 May 2010 (UTC)[reply]
    Sounds like excellent guidance to me, correct and well stated. Why do you think it's "rubbish"? Crum375 (talk) 21:23, 28 May 2010 (UTC)[reply]
    Maybe I am the only one who sees the conflict in saying "All articles must be ...but in practice not". Maybe we should change this sentence to "all articles should be attributable to a reliable published source to show that it is not a Wikipedian's original research" and drop the get-out exemption at the end? --Gavin Collins (talk|contribs) 16:16, 29 May 2010 (UTC)[reply]
    As we said right at the start of this thread, there is no conflict. "Attributable" is not the same as "attributed", any more than "payable" is the same as "paid" or "breakable" the same as "broken". --Kotniski (talk) 16:20, 29 May 2010 (UTC)[reply]
    You are splitting hairs. If a sign says "You must never touch the overhead power line...but in practice you can", we would say that is a rubbish warning. It does not matter if the overhead power line is never to be "touched", or should not be considered "touchable"; it's the get-out clause at the end that is nonsense. --Gavin Collins (talk|contribs) 16:37, 29 May 2010 (UTC)[reply]
    Yes. But what does that have to do with the sentence we're discussing here, which isn't in anything like that form? --Kotniski (talk) 16:49, 29 May 2010 (UTC)[reply]
    I might be wrong on this, but it's a good analogy. On the one hand, this policy says that articles must be attributable to show that they are original research, but on the other it says not everything in practice need actually be attributed. I see a conflict between the two statements. Surely we can bring these two conflicting statements together to write one clear statement of guidance: articles should be attributable to show that they are not original research. --Gavin Collins (talk|contribs) 21:01, 29 May 2010 (UTC)[reply]

    Which policy says that "articles must be attributable to show that they are original research"? Crum375 (talk) 21:22, 29 May 2010 (UTC)[reply]

    I see no conflict at all and I truly wonder if many here are talking right past each other because those questioning the language added do not understand what the words ending "able" and "ed" mean in this context. The complete missing of the mark of the power line analogy above makes me think I'm right.
    • Attributable means capable of being attributed. It means that it is possible to find reliable sources for the information.
    • Attributed means sources are present in the article; the subject that was always attributable has had the reliable sources added to become attributed.

    There is no "get out of jail free" clause, and the statement can only be seen in that way if its meaning is misunderstood. So, following on from my definitions of the words, let me translate the text we are here about:

    "All material in Wikipedia articles must be attributable to a reliable published source to show that it is not a Wikipedian's original research, but in practice not everything need actually be attributed"

    It means:

    All material in Wikipedia articles must be capable of being verified with a reliable published source (it must be possible to place sources), because Wikipedia does not announce new things; information that has not already been published outside of Wikipedia is original research. Because we require that information be capable of being sourced, as noted by "attributable", but do not require actual sourcing for every statement that appears in article text, a corollary of the verifiability policy is that if someone questions whether something is capable of being sourced (whether it is attributable) by challenging it, then that capacity must be proved by producing and citing the actual source(s). Once that is done, the material has been attributed.--70.107.78.246 (talk) 22:16, 29 May 2010 (UTC)[reply]

    How many articles without sources have you already written? In some cases any other user can find a source, but in other cases it is too difficult (especially if the author used some propaganda or fiction). Jimbo wrote: "I heard it somewhere" pseudo information must be aggressively removed.[7] - now it is a part of the rule about verifiability. There are many factual inaccuracies (we found more than 100 factual errors in one "featured" article, and more than 50 errors in another "featured" article, written by arbiters of Wikipedia) and political propaganda without any sources. Propagandists and falsifiers do not like to show their sources and prefer to hide sources and names. X-romix (talk) 09:12, 30 May 2010 (UTC)[reply]
    So name and shame if necessary. That's part of why we have a full history of edits. Wikiblame can help, too. But for most articles, where propaganda isn't an issue, a {{cn}} tag serves the purpose. LeadSongDog come howl! 15:23, 31 May 2010 (UTC)[reply]
    We don't need to name and shame; we already have policies which say that this amendment conflicts with Wikipedia policy. Let's put the relevant statements side by side:
    • WP:V: "All material in Wikipedia articles must be attributable to a reliable published source to show that it is not original research, but in practice not everything need actually be attributed";
    • WP:OR: "Citing sources and avoiding original research are inextricably linked: to demonstrate that you are not presenting original research, you must cite reliable sources that provide information directly related to the topic of the article, and that directly support the information as it is presented".
    My experience in the Kender mediation is that any content that is not the subject of an inline citation is definitely original research. No ifs, no buts, unsourced content is crap - it can't be checked, and more than likely it has been made up. The example given that "Paris is the capital of France" is entirely misleading. A better example would be given by "Paris is the cultural capital of Europe" or "Paris is the city of lights". Who said this, where do these statements come from? These are the questions that only attribution can provide answers to. --Gavin Collins (talk|contribs) 21:16, 31 May 2010 (UTC)[reply]
    Gavin, you're confusing unattributed with unattributable. The former is okay in certain circumstances, the latter not allowed. SlimVirgin talk contribs 21:36, 31 May 2010 (UTC)[reply]
    Gavin, I admit I'm new to this discussion, but can you explain/expand on your point that "Paris is the city of lights" is "crap" if it isn't attributed? From where I stand, reading your post, you are saying that the sentence is crap if it sits in an article unattributed, but once it has a citation it becomes legit and "Truth"... Whether the sentence is good or not does not depend on all of us knowing it is good through observation. If the citation was in French and I can't speak French, then I must AGF that someone who speaks French will speak up if the citation does not say that; until then I must assume it does. How is this any different from assuming that the sentence without a source is in fact true? If an unsourced sentence is so dubious that I believe a citation is needed, I can go to the search engine of my choice and find one; if I don't find one, THEN it becomes crap. A "factoid" in an article is not Schrodinger's cat, in limbo between truth and crap until observed by an outsider. It seems your entire philosophy is based upon a complete lack of good faith in any Wikipedian's contributions. This is entirely against the philosophy on which Wikipedia is based. Once you go down the slippery slope of saying EVERYTHING must be attributed, not just attributable, you will see that the next step is that EVERYTHING must be verified by everyone; if one person can't verify something themselves then it is "crap". We simply cannot throw away AGF and become paranoid about everyone's motive for an addition. Camelbinky (talk) 23:17, 31 May 2010 (UTC)[reply]