Wikipedia:Google searches and numbers
This proposal has become dormant through lack of discussion by the community.
One of the biggest fallacies in determining the notability of a subject, which is part of determining whether a topic should have its own Wikipedia article, is the view that the results of a Google search can be used to assess notability. A Google search using the title or keywords of an article or subject has become known as a "Google test". It may be easy to view a subject as being notable solely because a Google search produces a huge number of hits, not notable because the search produces very few hits, or a hoax because it produces none at all. While such searches are indeed a very useful starting point, they do not in themselves determine notability or the lack thereof.
An obscure 1700s philosophical theory that is referenced in a number of widely respected older paper books may not show up in a Google search. But the absence of Google hits does not mean that this theory is non-notable or a hoax. In fact, this theory may be notable under Wikipedia's rules, as it is described in multiple reliable sources. On the other hand, a reality TV contestant's name may generate a thousand Google hits (fan chat pages and blog posts regarding his sex life), but none of these may be reliable sources.
When performing a plain web search, a very large number of hits may turn up, but most of these will probably not count as reliable sources. Google News, Google Books, and Google Scholar provide results that are more likely to be reliable sources. But you can only verify that these hits are reliable sources by reading the articles or books. While you may not be able to view all of them on the Google site itself, and many of them are previews, the search can at least show that the sources exist.
There is nothing wrong with pointing others to a list of these Google "hits" when trying to get others to improve an article or to save it when up for deletion, even within one's comment. This is actually a good idea if you are looking for others to help save an article. But the Google search results alone are not grounds for protecting an article from deletion.
Google searches are not references
It has become a practice in deletion discussions to quote a Google search or Google News search and say "look at all the results, there's your references" or "Two thousand Google hits, must be notable!" However, Google returns whatever it can find online, the vast majority of which is by no means a reliable source. Google News reprints large swathes of material that may or may not be reliable, may or may not be relevant to the subject of the article, and may or may not still be there by the time the AfD closes. (Note that a full citation of a news article found online, with the author, title, newspaper name, etc., is still valid even if the website is discontinued. However, a bare URL that no longer works may render an online source useless.)
If you find sources using Google related to a topic under discussion for deletion, great! But cite the exact reference or source you've found, rather than making a vague wave at the Google search numbers and saying that this large number proves the article's subject is notable, verifiable, and worth climbing the Reichstag over. The converse is also true: do not argue in AfDs that "Zero Google hits, must be non-notable."
Why are Google results not valid?
There are various reasons why the results of a Google search and their numbers mean nothing when it comes to establishing notability.
Wikipedia is not a dictionary
Wikipedia is not a dictionary. A dictionary focuses on words or phrases, exactly as they are titled, and generally without deviating from that title. Wikipedia is an encyclopedia, whose purpose is to describe a person, group, place, object, event, or concept. Any of these may be known by one or more titles or groups of words, and any such title may have more than one meaning. While every Wikipedia article has a title, it is not the title that defines the subject, but the information contained within.
Search engines like Google focus on words or phrases, such as the article title that one would likely enter as a query. For example, someone who wanted information on oil painting might enter the two words "oil painting" into a search engine (in quotes). This will likely produce plenty of web sites bearing the words "oil painting" in succession. As this is such a well-known concept, it is likely that many of these hits will be about oil painting. But the query may also produce a site that contains the words "She was eating a salad topped with olive oil, painting a picture of a tree, and listening to music." This sentence has the words "oil painting" in succession, and therefore would turn up in such a Google query. But it has nothing to do with oil painting.
If you were to enter the phrase "was running laps" into a search engine, you would get a number of hits that contain these words in that exact succession. The sentence fragment may appear on a site that reads something like "He was running laps at the local track." But this does not mean there should be an article titled Was running laps.
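The phrase-matching behaviour described above can be illustrated with a minimal Python sketch. It is a deliberately simplified model, not a description of how Google actually works: it assumes that a quoted query matches whenever its words appear consecutively once case and punctuation are ignored. The function names `tokenize` and `contains_phrase` and the "relevant" sample sentence are hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, discarding punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(text, phrase):
    """Return True if the words of `phrase` appear consecutively in `text`."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words across the text and compare it to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Hypothetical page text that really is about the subject.
relevant = "Oil painting is the process of painting with pigments bound in oil."
# The unrelated sentence quoted in this essay.
unrelated = ("She was eating a salad topped with olive oil, painting a picture "
             "of a tree, and listening to music.")
laps = "He was running laps at the local track."

for page in (relevant, unrelated):
    print(contains_phrase(page, "oil painting"))
print(contains_phrase(laps, "was running laps"))
```

Under this assumption, both the relevant and the unrelated sentence count as "hits" for "oil painting", and the fragment "was running laps" matches as well. A raw hit count cannot tell these cases apart, which is why each result has to be read before it is treated as a source.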
A Google search of the common word "if" produces several billion hits. On Wikipedia, the title If does not define the word "if". Rather, it leads to a disambiguation page displaying a long list of subjects, including many songs, that happen to be titled "If" or to have the initials IF. The meaning of the common word itself is a matter for a dictionary entry, and can only be written about on Wiktionary.
Many terms have multiple meanings
Many words, phrases, and other combinations of words have more than one meaning. For example, to most people the term "4:30" can refer to a time on the clock or to a biblical verse. But an article on either of these under this exact title would not be suitable. The title 4:30 is the name of a film. Not all Google hits for 4:30 will be sites pertaining to the film. Nevertheless, 4:30 is used on Wikipedia solely for the film.
The term Astro Boy has many uses. It is mostly known as a TV series, but there is also a disambiguation page listing other uses for this title. If a Google search of the term is performed, it is unclear how many results pertain to which meaning.
Not all websites are reliable sources
A Google search may produce hundreds, thousands, even millions of hits bearing the exact title of the article or other pages on the subject derived from key words. But only sites qualifying as reliable sources can be used to render a subject notable and to verify the accuracy of information. Most others do not qualify as permissible external links, let alone references.
Many, and often most, websites fail to qualify as reliable sources. There are many websites aimed at selling a product or service. Wikipedia is not an advertising space, and such sites linked from an article would violate Wikipedia's advertising policy. Others include blogs, self-published sources, clones of Wikipedia, and other non-neutral or unverifiable sources of information.
The best way to find actual reliable sources is not by a plain Google search, but with Google News, Books, and Scholar. Even so, a large number of hits there does not establish notability, nor are all sources found in the search reliable, either for that article or for any article. Still, sources meeting the criteria are easier to find this way.
Not all sources provide in-depth coverage
Even if you do find one or more sources considered "reliable" by some standard, it does not automatically mean that they are good enough to support a particular subject. For example, if you wanted to write an article on a street, you may find plenty of news articles that mention that street only in passing. Sure, googling will bring them up, and they may very well help establish notability for another subject. But with their trivial mentions, they do not bring notability to the street.
Listing Google search results
After reading this, you may think that listing the results of a Google search in a deletion debate is a bad thing. That is not true at all. Listing them may actually be helpful in saving an article from deletion.
While the Google results will not make or break the case, they may help others make the improvements necessary to save an article from deletion, or simply help editors agree on what should be done.
The editor who provides the listing of Google results may not be able to make the necessary improvements themselves. Doing so is not required. But others who see these results may be able to take care of this, or at least mention that these more specific sources do exist, even if they do not add the sources themselves (see WP:HASREFS).