Distributed crawling is carried out by an open-source client application installed on volunteers' personal computers (PCs). After authentication, the application registers each PC as a distributed crawling node. The crawler periodically receives tasks from the management console to download specified websites, parse their content, and submit the results to parsed-content storage. Crawling runs only while the user's computer is idle.
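A minimal sketch of such a node in Python may help illustrate the loop described above: register with the console, poll for tasks while idle, fetch and parse the assigned page, and submit the result. The console URL, endpoint names, payload fields, and the idle check are all assumptions for illustration, not the actual protocol.

```python
import time
import hashlib
import requests

CONSOLE = "https://console.example.org/api"  # hypothetical management-console URL


def register(token: str) -> str:
    """Authenticate and register this PC as a crawling node; returns a node id."""
    resp = requests.post(f"{CONSOLE}/register", json={"token": token})
    resp.raise_for_status()
    return resp.json()["node_id"]


def run_node(token: str, is_idle=lambda: True) -> None:
    node_id = register(token)
    while True:
        if not is_idle():                # crawl only while the PC is idle
            time.sleep(60)
            continue
        task = requests.get(f"{CONSOLE}/tasks", params={"node": node_id}).json()
        if not task:                     # no work assigned; poll again later
            time.sleep(60)
            continue
        page = requests.get(task["url"], timeout=30)
        parsed = {
            "url": task["url"],
            "text": page.text,           # a real parser would extract structured content here
            "digest": hashlib.sha256(page.content).hexdigest(),
        }
        # Submit the parse result to parsed-content storage via the console.
        requests.post(
            f"{CONSOLE}/results",
            json={"node": node_id, "task": task["id"], "result": parsed},
        )
```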
To improve accuracy, the management console compares parse results for the same content obtained from several independent crawlers. The consolidated results can then be stored for use by thematic and general-purpose search engines with different search algorithms, such as Google, Live, Yahoo!, and Froogle.
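The section does not specify how the console compares submissions, but one plausible scheme is majority voting over content digests: a result is accepted only when most nodes that crawled the same URL report an identical digest. The sketch below assumes the `digest` field from the node example above; the voting rule itself is an illustrative assumption.

```python
from collections import Counter


def reconcile(results: list[dict]) -> dict | None:
    """Pick the parse result most crawlers agree on, by content digest.

    `results` are submissions for the same URL from independent nodes,
    each carrying a 'digest' field (an assumed schema, as in the node sketch).
    """
    if not results:
        return None
    counts = Counter(r["digest"] for r in results)
    winner, votes = counts.most_common(1)[0]
    # Require agreement from a strict majority of nodes before accepting.
    if votes <= len(results) // 2:
        return None                      # no consensus; the URL could be re-queued
    return next(r for r in results if r["digest"] == winner)
```

Requiring agreement from multiple volunteer nodes guards against both transient fetch errors and a single misbehaving node submitting corrupted results.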