Surface web

From Wikipedia, the free encyclopedia

Revision as of 11:13, 23 April 2013

The Surface Web (also known as the Clearnet, the visible Web, or the indexable Web) is the portion of the World Wide Web that can be indexed by conventional search engines. The part of the Web that is not reachable this way is called the Deep Web. Search engines build a database of the Web using programs called spiders or Web crawlers, which begin with a list of known Web pages. The spider retrieves a copy of each page and indexes it, storing useful information that lets the page be retrieved quickly later. Any hyperlinks to new pages are added to the list of pages to be crawled. Eventually all reachable pages are indexed, unless the spider runs out of time [1] or disk space. The collection of reachable pages defines the Surface Web.
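
As an illustration of the crawl loop described above, the following is a minimal Python sketch, not the implementation of any particular search engine; names such as crawl, LinkExtractor, and max_pages are invented for this example, and a real crawler would add politeness delays, robots.txt checks, and a far more capable parser.

from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=100):
    """Breadth-first crawl: fetch each page, index it, queue its links."""
    queue = deque(seed_urls)   # pages still to be crawled
    seen = set(seed_urls)      # avoid fetching the same URL twice
    index = {}                 # url -> page content (a stand-in for a real index)

    while queue and len(index) < max_pages:   # stop when out of pages or budget
        url = queue.popleft()
        try:
            with urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue           # unreachable pages are simply skipped

        index[url] = html      # store the page so it can be retrieved later

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)   # resolve relative links
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)      # newly discovered pages join the frontier

    return index


if __name__ == "__main__":
    pages = crawl(["https://example.com/"], max_pages=5)
    print(f"indexed {len(pages)} page(s)")

The set of pages such a loop can ever reach from its seed list is, in effect, the Surface Web; anything it cannot discover or fetch falls outside it.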

For various reasons (e.g., the Robots Exclusion Standard, links generated by JavaScript and Flash, password protection), some pages cannot be reached by the spider. These 'invisible' pages are referred to as the Deep Web.
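
For example, the Robots Exclusion Standard works by publishing a robots.txt file that compliant crawlers consult before fetching a page. A hypothetical check using Python's standard urllib.robotparser module might look like the following; the site URL, user-agent string, and page path are placeholders.

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")   # the site's robots.txt (assumed URL)
rp.read()                                      # download and parse the rules

# can_fetch() returns False for paths the site has disallowed for this user agent,
# so a well-behaved spider never indexes them and they remain part of the Deep Web.
if rp.can_fetch("MyCrawler", "https://example.com/private/report.html"):
    print("allowed to crawl")
else:
    print("disallowed by robots.txt")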

A 2005 study [2] queried the Google, MSN, Yahoo!, and Ask Jeeves search engines with search terms from 75 different languages and determined that there were over 11.5 billion web pages in the publicly indexable Web as of January 2005.

As of 26 June 2012, the indexed web contains at least 13.52 billion pages.[3]

Common and Bergie web

Common web - the part of the Surface Web containing the most popular, general-interest pages, such as YouTube and Facebook, that ordinary users visit.

Bergie web - by contrast, the part in which illegal content can be found, such as warez or instructions distributed under the guise of educational purposes.

References

2. Univ. of Iowa study (Jan 2005). http://www.cs.uiowa.edu/~asignori/web-size/
3. The size of the World Wide Web. http://www.worldwidewebsize.com/

See also