Talk:Comparison of web search engines

From Wikipedia, the free encyclopedia
WikiProject Internet (Rated Start-class, High-importance)
This article is within the scope of WikiProject Internet, a collaborative effort to improve the coverage of the internet on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Start: This article has been rated as Start-Class on the project's quality scale.
High: This article has been rated as High-importance on the project's importance scale.
 

Neutrality issues

This page was nominated for speedy deletion per WP:Criteria for speedy deletion § G11 proposed for deletion by Hydrox and (when deletion was contested) {{multiple issues}} was amended with |npov=March 2012. Though I see no promotion or inaccuracies in the current revision (what can we do if most search engines are pure evil and DuckDuckGo was specifically designed not to be such?), it may be a good idea to think of additional criteria to compare search engines on. Probably some criteria should be removed (e.g. "Privacy: No sharing" and "Designed for anonymizer" seem to be good candidates). Ideas? — Dmitrij D. Czarkoff (talk) 15:20, 29 March 2012 (UTC)

Firstly, I did not nominate for speedy, but I suggested deletion through WP:PROD, which is a slightly different process. I am glad to see that there has been enough interest to contest the deletion.
But to get to the point, this is a complicated issue. First of all, your claim that "most search engines are pure evil" is not backed by reliable facts, but is to my understanding based on a widely propagated online Fear, Uncertainty and Doubt campaign. At times it's online commentators, and at times it's the competitors accusing each other through smear campaigns, as, for example, this video on Microsoft's official YouTube channel demonstrates.
What concerns me is that this "comparison" only compares "privacy" features, and on seemingly arbitrary grounds (the compared features are chosen so that all search engines except DuckDuckGo get the minimum score, and DuckDuckGo gets all points).
I would also like to see pinpointed the exact phrases in Google's and Microsoft's privacy policies that allow the companies to e.g. 'share' your data and 'track' you. I am convinced that these are not binary (yes/no) truths; the companies always place limitations on how they allow themselves to use the data that they collect. There are also limitations by law, which vary vastly from country to country. But my greater concern is that the article seems to be grossly simplifying matters just to give one search engine product an apparently better standing.
Also, the article says:
The list is necessarily a function of this definition; alternative definitions include the Open Source Definition and the Debian Free Software Guidelines.
This statement does not make any sense, because it does not specify on which definition of "Internet privacy / ethics" metrics were chosen, apart from the original contributor's despotism.
As such (I am sorry that I did not originally have time to explain this in such detail), this page appears to be promotion of the DuckDuckGo search engine. Per WP:NPOV, it's clear to me at least that we should treat all competing products in the marketplace equally, irrespective of whether they are open source or otherwise aligned with our (Wikipedia's) goals. I am sure that if someone had made a "comparison" wiki page that portrayed Google or Microsoft in an unequivocally positive fashion, there would be a small army of editors and sysadmins eagerly reverting and deleting the content. (As a sidenote, this also seems to demonstrate a systematic bias issue.)
I am looking forward to sourcing this article to reliable, verifiable sources, including to journalistic or academic 3rd party reviews and to specific clauses in these products' privacy policies. --hydrox (talk) 18:44, 30 March 2012 (UTC)
Just a couple of points:
  1. Being biased doesn't equal being spam, just as being critical doesn't equal spreading FUD.
  2. These claims are easily verifiable. Examples:
  3. Still I'm not sure that this comparison should be sourced at all: comparisons are generally subject to WP:SAL, and the sourcing of lists is currently the topic under discussion with no clear consensus. I believe that the sources should be in the articles on participants, as otherwise this page will get overwhelmed by references.
  4. Anyway, the only way this comparison can be fixed regarding WP:NPOV is to select criteria that are generally considered important in independent reliable sources (e.g. those two above). Thus I repeat my question: any ideas?
Dmitrij D. Czarkoff (talk) 21:31, 30 March 2012 (UTC)
I went ahead and removed a few columns: "Designed for anonymizer" and "Link to SSL (HTTPS) pages when possible". These both seem to be fringe features that are specific to DuckDuckGo, and thus not suitable for a general web search engine comparison.
The original contributor had claimed that Google 'shares' your data. I thought I had learned earlier that this is not the case, and lo and behold, they really do claim not to share users' data. Note that such a claim (that they are not sharing your data) has a bearing under EU privacy law: if they were actually sharing your data, they would become criminally liable. So I think we have good grounds to trust this primary source (Google's own blog).
Based on this I think the original contributor just invented these claims of privacy violations from his head without checking any sources or such. I have consequently tagged the article for factual inaccuracy. Have not yet checked, but I pretty much bet Microsoft and Yahoo too have similar clauses in their privacy policy as Google.
I failed to verify that Google uses "tracking" from the source that was provided (EFF instructions for removing your Google page history). At best, it seems to indicate that users can opt out of tracking if they so choose. --hydrox (talk) 13:40, 1 April 2012 (UTC)
I'm pretty sure that the "No internet censorship" column should feature the "a" (currently) note in its heading. It seems pretty obvious that this column discusses cases of internet censorship that are not legally imposed. I'll help you with EFF source verification: it says that until you opt out, your data is used. Furthermore, it states: "Note that disabling Web History in your Google account will not prevent Google from gathering and storing this information and using it for internal purposes. It also does not change the fact that any information gathered and stored by Google could be sought by law enforcement." Voila, claim is verified! — Dmitrij D. Czarkoff (talk) 15:44, 1 April 2012 (UTC)

Please, give me the link saying that user tracking is a feature, not a drawback. — Dmitrij D. Czarkoff (talk) 17:24, 2 April 2012 (UTC)

I see, everyone loves Google enough to remove material showing Google doing something that isn't well-accepted by the world. Still, referenced material should stay whoever likes or dislikes it. In the end, there's a talk page right here for everybody who feels his best feelings toward Google are offended. — Dmitrij D. Czarkoff (talk) 17:35, 2 April 2012 (UTC)

I notice that in some columns yes is green while no is red, but in other columns yes is red and no is green, based on which answer is "better." It doesn't seem like that judgement is neutral. In some circumstances non-personalized results are a feature--some people want to avoid personalized results, thus search engines which specifically avoid personalization. Are there any objections to removing the color from the tables to make them simply factual? The whole purpose of the color is to express an opinion about the valence of the words yes and no. Craig Butz (talk) 19:06, 2 March 2013 (UTC)

I have gone ahead and removed the yes and no templates from the personalized results column, as the color indicates a bias about whether personalization is desirable or not. Craig Butz (talk) 21:52, 5 March 2013 (UTC)

"Internet censorship" column

First, "Internet censorship" is at best a misleading name for the column because "censorship" is something governments do, not individual companies. Second, if we were to rename it to something like "results filtering" or some such, I disagree that this should be a simple "Yes/No" table column at all. There is a wide range of nuance in what would constitute such result filtering, and that cannot be adequately captured in a Yes/No button, like you can with "Is HTTPS available?" This subject is far too nuanced and contentious to be adequately summarized in a table Yes/No column. The discussions should appear at the articles for each search engine. Agree? Zad68 (talk) 17:45, 2 April 2012 (UTC)

I don't see the problem with yes/no values. E.g. hiding child pornography from search results is a wanted feature in most cases, but if it happens with no user setting it is censorship. I also have no idea why you think that "censorship" is a word describing only governments' practices; it describe action regardless its subject. — Dmitrij D. Czarkoff (talk) 17:54, 2 April 2012 (UTC)
The classical, legal definition of "censorship" is indeed limited to something governments do. It isn't technically correct to call what TV networks do when they bleep the f-word out of movies "censorship." TV networks can broadcast the f-word all day long, but they will get fined for it, as well as probably lose advertising revenue. Theirs is a business decision. In particular, it was not "censorship" when Google refused to carry abortion-related ads. You write "it describe action regardless its subject," but this gets it backwards: it's not the subject matter, it's the actor. If we're writing an encyclopedia, we should choose words for their correct meaning. However I'll stipulate to the fact people use "censorship" much more casually. So, to our discussion:
  • First could you please read Criticism_of_Google#Censorship. All the items you brought up as examples of Google "censorship" are covered here, and then some. Let's go over each one.
  1. Google filters results per requirements from the DMCA
  2. Google refused to run advertisements for Oceana; Google policy has since changed
  3. Google 'delisted' Inquisition, (claiming) not due to content, but due to Inquisition's manipulation of results
  4. Google refused to run advertisements for abortion; Google got sued, policy has since changed
  5. Google results in certain locales removed hate-group results to comply with law, with link to Chilling Effects
  6. Google Auto Complete did not complete for Bittorrent, etc.; search results are still available
On the table you already put the qualifier "Excluding censorship, that is mandated by law, such as compliance to Strafgesetzbuch section 86a and Digital Millennium Copyright Act takedown notices" which covers some of these, Google policy has changed on some, and others relate to Google's advertising policy, and not the search results, and one was due to a group manipulating Google's ranking algorithm. Given this, it does not look like Google would qualify to be tagged with a Yes in this column. Do you agree? Again, I think the point here is obvious, that a Yes/No check box is not appropriate, this is far too nuanced. You'd have to write an unfeasibly enormous disclaimer or explanation for the column, bigger than the one you already started, and which is open to question. Zad68 (talk) 18:20, 2 April 2012 (UTC)
The "torrents" case completely qualifies as search censorship, as do some other cases I just don't have time to search for now. As a matter of fact, several of the censorship issues made me drop my use of Google in favor of DDG back when it was relatively less useful, so I already voted for Google's "yes" here with my feet. Still I fail to see the nuances that would make "yes"/"no" values difficult to set, so maybe you could explain that issue to me in more detail? — Dmitrij D. Czarkoff (talk) 18:39, 2 April 2012 (UTC)
If you find reliable secondary sources to back you up on the "other cases" you are thinking of, please bring them, you could add them to Criticism_of_Google#Censorship.
The "torrents" case is actually a very good subject to push forward here so that you understand my point. Yes or No, does Google deliver to you search results when you search for "Bittorrent"? If you answer Yes, there is no censorship here. Zad68 (talk) 18:45, 2 April 2012 (UTC)
IMHO this is a clear "yes": instant search is a feature of Google search product, so censoring something there means censoring in Google search. — Dmitrij D. Czarkoff (talk) 19:17, 2 April 2012 (UTC)
You avoided the question: Did you or did you not get results relevant to Bittorrent when you entered that search term into Google's search text entry box, and then pressed ENTER? Also you're now splitting things apart, you admit that Google's search feature has multiple relevant parts that are trying to be addressed under the table entry "Google." Zad68 (talk) 19:23, 2 April 2012 (UTC)
I avoided answering this question (answer: yes, I got them) because it isn't relevant to the topic. The very fact that Google blocks torrent query in "instant search" makes a censorship case. IMHO "censorship" field should be set to "yes" if any features show signs of censorship. — Dmitrij D. Czarkoff (talk) 20:08, 2 April 2012 (UTC)
Making some search results slightly less convenient to get to is not "censorship." It really seems like you are going far out of your way to slap Google with a big red Yes of censorship, using personal interpretations of words so far out of common understanding that it is really misrepresenting the truth. Please reconsider and be more reasonable. If we can't come to consensus on this, let's bring in a half-dozen outside editors to add balance to this discussion. Which direction would you like to go? Zad68 (talk) 14:10, 3 April 2012 (UTC)

The whole issue is completely binary: either there are issues on topic or not. If there are issues, it's "yes"; if there are no issues, it's "no". And this isn't related to my attitude towards Google: if you spot a source claiming a similar issue with my favorite DDG, I'll be just as passionate about giving it "a big red Yes of censorship". Period, and nothing to reconsider. — Dmitrij D. Czarkoff (talk) 16:34, 3 April 2012 (UTC)

I'll invite a wider audience of editors to this discussion. Zad68 (talk) 16:50, 3 April 2012 (UTC)
  • Zad68 & Czarkoff agree that Google handles the "Instant search" of searches starting with "Bittorrent" differently from how it handles other searches. Google does not show auto-complete results for searches starting with "Bittorrent" and it does for other search items.
  • Zad68 & Czarkoff agree that if you start a search with "Bittorrent" and press ENTER, Google produces appropriate results relevant to "Bittorrent"

Hey, guys, I'm here from the 3O board. I'm tending to agree with Zad68 that this specific instance (i.e. the lack of autocompletion to bittorrent) is *not* censorship. "Internet censorship is the control or suppression of the publishing of, or access to information on the Internet." Not autocompleting "bittorrent" affects neither publishing nor access to information about it. I can already hear the objection "but it makes it less convenient to access it, which affects access." Strictly speaking, that's probably true; my reply is twofold: first, there are *many* search terms that don't autocomplete, due to other suggestions offered first; it's not censorship there. I understand what you'd say about how this one term was singled out, but it's really just a technicality. If BitTorrent were significantly harder to search for than anything else, while not being impossible, then you'd have a point. Like, if there was a prompt("Are you sure you want to search for BitTorrent?"), that could be considered censorship, even though you can still get the search results, but not autocompleting isn't quite the same. Second: really? Really? C'mon. It's not *significantly* inconveniencing anyone. If this is the only thing that's being used to support the censorship claim, then I would oppose putting a "yes" in the censorship column.

Now, all that said, I would be surprised if this is really the best example that can be dug up, and really, if you just wanted to source a "yes" to this statement of Google Policy, that could be justifiable, since Google reserves the right to censor its results, even if it doesn't actually do so. I'd probably propose a "Possible" for that column; it doesn't have to be a boolean yes/no. JM2C. Writ Keeper 18:09, 3 April 2012 (UTC)

To me this policy statement doesn't warrant a "yes" in the column, as it doesn't prove an actual act of censorship; were it the only concern, I would oppose the "yes" tag. Also I just can't imagine a non-boolean answer to the question "Does XYZ search engine censor search results?" It may either censor somehow (and there is no real difference in the amount of effort/topics, as the reliability of the search engine is compromised) or not censor at all. In this regard I see the instant search/search box issue cited above as compromising reliability. — Dmitrij D. Czarkoff (talk) 19:48, 3 April 2012 (UTC)
But it doesn't compromise the search engine's reliability. The search engine is scarcely affected at all. On those grounds, I would say Google warrants a firm "no." (btw, I don't particularly agree with the policy making for a yes in that category, either, I just meant that it was another defensible line of argument, in the sense that Google doesn't guarantee freedom from censorship). Writ Keeper 19:59, 3 April 2012 (UTC)
See, this is a technical feature of Google's search product that was specifically manipulated. The last part is exactly the problematic one. Though we still have Criticism of Google#Censorship and there are some cases in the wild (e.g. [1], [2] and a lot of similar contexts). FWIW, Google omits the results when the publishers are suspected of "search optimization", which itself is censorship of a kind. I used the "Bittorrent" case as the most clear and offensive, but if it's minor, feel free to choose among the overwhelming number of other cases. — Dmitrij D. Czarkoff (talk) 20:08, 3 April 2012 (UTC)

Review this article: The Search Engine Backlash Against 'Content Mills'. The article says of DuckDuckGo's founder Weinberg: "The content he's blocking includes everything from "Made-for-Adsense" sites that offer no content at all to sites that offer only, in Weinberg's judgment, low-quality content designed specifically to rank highly in Google's search index." So there is valid Web content out there, relevant pages that could appear in DDG's search results, but DDG is blocking it, and you are not able to see those results. Is this not censorship? Shouldn't DDG then also have a Yes in the censorship column, according to the very broad interpretation being applied here? Zad68 (talk) 20:57, 3 April 2012 (UTC)

Indeed. Fixed in the table. — Dmitrij D. Czarkoff (talk) 22:38, 3 April 2012 (UTC)
The problem here is that we can't give any context to what "Internet censorship" is. This is what I hate about "Comparison" articles; there's no way to give context when you have to put either a yes or a no in the box. A word with such heavy connotations as "censorship" will mean different things to different people, and we can't actually specify which of those meanings we are talking about. Does the autosuggest thing in fact constitute censorship? In my mind, no, it clearly doesn't. But obviously reasonable people's opinions are differing, and we have no way to expand on what that "yes" might mean. Ditto with the search engine result manipulation; one man's censorship is another's honest attempt to improve the search algorithms by weeding out artificially inflated sites. I guess what I'm trying to say is that answering "does Google censor" is hard because it's not the right question to be asking. At the end of the day, I'm not sure there's going to be any way to boil this down to a yes/no while being neutral. Writ Keeper 22:52, 3 April 2012 (UTC)
As EVERY search engine will commit "Internet censorship" in this way, can we agree now that the column does not provide any useful, encyclopedic value? I do not want to throw the baby out with the bath water. If people want to look up the topic of censorship of this or that search engine, they can go to the individual articles for each engine. This article could be useful, but let's get rid of this column. Can we agree? Zad68 (talk) 01:04, 4 April 2012 (UTC)
I'd certainly agree to removing the column. Don't see a way to keep it neutral without a paragraph's worth of disclaimers and specifiers. Writ Keeper 04:24, 4 April 2012 (UTC)
Agree. — Dmitrij D. Czarkoff (talk) 07:34, 4 April 2012 (UTC)
Thank you! Zad68 (talk) 13:14, 4 April 2012 (UTC)

"Daily queries" ==> "Daily direct queries", adjusted down DDG number

I changed the column heading "Daily queries" to "Daily direct queries". The "Daily queries" number that was being given before for DDG included "API" queries, which provides only a limited subset of result functionality, see: http://duckduckgo.com/api.html I'd think that most readers of the article would expect "Daily queries" to be queries that end-users are running to provide full search result functionality, and you can't compare DDG's limited API query count against, for example, Google or Bing's full-query search count--not apples vs. apples. Zad68 (talk) 14:06, 7 May 2012 (UTC)

ixquick and ads issues

https://www.ixquick.com

Please add this search engine also... — Preceding unsigned comment added by 80.245.147.81 (talk) 08:54, 15 November 2012 (UTC)


Many of the web search engines use ads. This should be changed in the appropriate column (DDG and others use them).

Too few engines, too little info

Many search engines are missing, e.g., Buenosearch. Alta Vista is historically very important, yet it is not mentioned. There are also many specialized engines, web crawlers, and metasearchers such as Dogpile that are omitted. I'm sure a lot of the ?s can be replaced with just a little effort by people in the know. 211.225.33.104 (talk) 03:00, 6 January 2014 (UTC)

In Korea, Naver is the dominant search engine. 211.225.33.104 (talk) 03:01, 6 January 2014 (UTC)
If you know other search engines just add them. --Txt.file (talk) 08:35, 18 June 2014 (UTC)

Digital Rights Table Headings Unclear

Some of the sections for the "digital rights" table need to be clarified. For instance:

  • What does "Proxy view on search" mean?
  • What *kind* of information sharing does this table describe? Is this sharing with law enforcement, government, marketing companies, or what? Does this just refer to a subset or category of sharing? Voluntary or involuntary (subpoenas being involuntary)?
  • What are the criteria for inclusion in these comparisons? Is there a notability threshold? For instance, Blekko, WbSrch, Sogou, and Exalead are included on the [List_of_search_engines] page, but are not included here. — Preceding unsigned comment added by 207.162.213.254 (talk) 21:29, 6 May 2014 (UTC)