Internet research

This article is about using the Internet for research; for the field of research about the Internet, see Internet studies.

Internet research is the practice of using the Internet, especially the World Wide Web, for research. The internet is widely used and readily accessible to hundreds of millions of people in many parts of the world. It can provide practically instant information on most topics, and has a profound impact on the way ideas are formed and knowledge is created.

Research is a broad term. Here, it is used to mean "looking something up (on the Web)". It includes any activity in which a topic is identified and an effort is made to actively gather information in order to further understanding. Common applications of Internet research include personal research on a particular subject (something mentioned on the news, a health problem, etc.), students doing research for academic projects and papers, and journalists and other writers researching stories. It should be distinguished from scientific research (research following a defined and rigorous process) carried out on the Internet; from the straightforward retrieval of specific information, such as locating a name or phone number; and from research about the Internet.

Compared to the Internet, print physically limits access to information: a book has to be identified and then actually obtained. On the Internet, the Web can be searched, and typically hundreds or thousands of pages with some relation to the topic can be found within seconds. In addition, email (including mailing lists), online discussion forums (also known as message boards or BBSs), and other personal communication facilities (instant messaging, IRC, newsgroups, etc.) can provide direct access to experts and other individuals with relevant interests and knowledge. However, it remains difficult to verify a writer's credentials, and therefore the accuracy or pertinence of the information obtained (see also the article Reliability of Wikipedia).

Further difficulties in internet research center on search tool bias and on whether the searcher has sufficient skill to draw meaningful results from the abundance of material typically available.[1] The first resources retrieved may not be the most suitable for answering a particular question. For example, popularity is often a factor in ordering internet search results, but the most popular information is not always the most correct or the most representative of the breadth of knowledge and opinion on a topic.


Search skills

Part of the skill set for using internet research with finesse involves adding the Boolean search operators NOT (-) and OR (|) to search engine queries. “Advanced internet searching” is a term that organizations such as Monash University use to describe learning Boolean operators and choosing an appropriate search tool.
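The Boolean operators described above can be illustrated with a minimal sketch that assembles a query string. The function name build_query and its parameters are hypothetical, and the exact operator syntax accepted varies between search engines, so the output should be read as an illustration and not as the syntax of any particular engine.

```python
def build_query(required, alternatives=(), excluded=()):
    """Assemble a query string using OR (|) and NOT (-) operators.

    Hypothetical helper: operator syntax differs between search engines.
    """
    def quote(term):
        # Multi-word terms are wrapped in quotes so they are treated as a phrase.
        return f'"{term}"' if " " in term else term

    parts = [quote(term) for term in required]
    if alternatives:
        # Any one of the alternatives may match.
        parts.append("(" + " | ".join(quote(term) for term in alternatives) + ")")
    # Each excluded term is prefixed with a minus sign.
    parts.extend("-" + quote(term) for term in excluded)
    return " ".join(parts)


print(build_query(["jaguar"], ["speed", "habitat"], ["car"]))
# prints: jaguar (speed | habitat) -car
```

Excluding "car" here narrows an ambiguous word to the animal, which is the typical use of NOT when a term has an unwanted popular meaning.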

Assessing quality on the internet is more than a ‘difficulty’: specific search techniques have evolved to help reveal the quality of information found online. Context searching (using the inurl: field search) and the retrieval of endorsements (primarily using the link: field search, though other options exist) reveal much of the information needed to make a better quality assessment. A familiarity with context and endorsement searching is thus an internet research skill that lies beyond “advanced internet searching”.
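As a rough illustration of context and endorsement searching, the sketch below builds query strings using the inurl: and link: field operators named above. The helper names and the example values are hypothetical, and support for these operators differs between search engines and has changed over time, so a given engine may ignore or reject them.

```python
def context_query(keywords, url_fragment):
    """Restrict results to pages whose URL contains url_fragment."""
    return f"{keywords} inurl:{url_fragment}"


def endorsement_query(page_url):
    """Ask for pages that link to page_url (an 'endorsement' search)."""
    return f"link:{page_url}"


print(context_query("staff loyalty", "research"))
print(endorsement_query("example.org/report"))
```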

Once we recognize search tool bias and the superficial anonymity of internet information, we open the door to further search skills that personalize and sharpen the focus of our searching. Deep URL interpretation and recognition of format can counter some of the initial anonymity. Context searching can also assist here. There is also a skill in evolving a search question as we repeatedly make requests of search engines and grow more familiar with a topic and how it is organized.
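One way to picture URL interpretation is to split an address into the parts a searcher would read for clues about who published a page and in what format. The sketch below uses Python's standard urllib.parse module; the function name interpret_url, the choice of fields, and the example address are assumptions made for illustration only.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def interpret_url(url):
    """Break a URL into clues about publisher, site structure and format."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "host": host,
        # The last label of the host name (e.g. edu, gov, org) hints at the publisher type.
        "top_level_domain": host.rsplit(".", 1)[-1] if "." in host else "",
        # Directory names often show where the page sits within the site.
        "path_segments": [segment for segment in parts.path.split("/") if segment],
        # The file extension hints at the document format.
        "format": PurePosixPath(parts.path).suffix.lstrip(".").lower(),
    }


info = interpret_url("http://www.example.edu/library/guides/searching.pdf")
print(info["top_level_domain"], info["path_segments"], info["format"])
```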

Since internet information, particularly of a certain quality or standard, can be organized in ways other than word choice and prominence (the factors global search engines attend to), some information may require further search skills to retrieve. Familiarity with midpoints such as directories and ‘invisible’ databases, and attentiveness to further types of organization, may reveal the key to finding missing information. A thesaurus, for example, may prove critical to connecting information organized under the business term “staff loyalty” to information addressing the preferred nursing term “personnel loyalty” (MeSH entry for Medline by the [US] National Library of Medicine). This may only come to light by noticing the absence of nursing-related research when collecting “staff loyalty” related material. Experience in noticing the volume of information (particularly by repeatedly searching in a precise manner) and in looking for gaps in collections of information can help overcome this.


As the volume of internet information continues to rise, emerging search skills such as link companion searches and triangulation, as well as knowledge of well-established tools such as regional search engines, will be needed to retrieve information unlikely to reach the attention of those who depend on search engine ranking to reveal relevant information. This situation calls into question what constitutes an internet search (particularly how it differs from a search engine offering an internet recommendation) and what it means, and how difficult it is, to search the internet comprehensively.

Search tools

The most popular search tools for finding information on the internet include Web search engines, metasearch engines, Web directories, and specialty search services. A Web search engine uses software known as a Web crawler to follow the hyperlinks connecting the pages on the World Wide Web. The information on these Web pages is indexed and stored by the search engine. To access this information, a user enters keywords in a search form; the search engine queries its indices to find Web pages that contain the keywords and displays them in a search engine results page (SERP). The SERP typically lists hyperlinks and brief descriptions of the content found. Search results are ranked using complex algorithms, which take into consideration the location and frequency of keywords on a Web page, along with the quality and number of external hyperlinks pointing at the Web page.
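The indexing and ranking steps described above can be sketched with a toy inverted index. This is a deliberately simplified illustration, not how any real search engine works: the page texts and addresses are invented, the function names are hypothetical, and the score (keyword frequency plus a count of inbound links) is an assumed stand-in for the far more complex ranking algorithms used in practice.

```python
from collections import Counter, defaultdict


def build_index(pages):
    """Map each word to the pages containing it, with occurrence counts."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word, count in Counter(text.lower().split()).items():
            index[word][url] = count
    return index


def search(index, query, inbound_links=None):
    """Return pages containing every query word, best score first."""
    inbound_links = inbound_links or {}
    words = query.lower().split()
    if not words:
        return []
    # Keep only pages that contain all of the keywords.
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))

    def score(url):
        # Assumed scoring: keyword frequency plus number of inbound links.
        return sum(index[word][url] for word in words) + inbound_links.get(url, 0)

    return sorted(matches, key=lambda url: (-score(url), url))


pages = {
    "a.example": "web search engines index the web",
    "b.example": "web directories list sites",
    "c.example": "cooking recipes",
}
index = build_index(pages)
print(search(index, "web"))
print(search(index, "web", inbound_links={"b.example": 5}))
```

The second call shows how links pointing at a page can lift it above a page that merely repeats the keyword more often.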

A metasearch engine enables users to enter a search query once and have it run against multiple search engines simultaneously, creating a list of aggregated search results. Since no single search engine covers the entire Web, a metasearch engine can produce a more comprehensive search of the Web. Most metasearch engines automatically eliminate duplicate search results. However, metasearch engines have a significant limitation: the most popular search engines, such as Google, are not included because of legal restrictions.
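The aggregation and duplicate elimination described above might be sketched as follows. The result lists are invented, the function names are hypothetical, and both the crude address normalization (lowercasing and stripping a trailing slash) and the reciprocal-rank scoring are assumptions chosen for brevity, not the method of any actual metasearch engine.

```python
from collections import defaultdict


def normalize(url):
    """Crude normalization so trivially different addresses compare equal."""
    return url.strip().lower().rstrip("/")


def merge_results(result_lists):
    """Merge ranked result lists from several engines, removing duplicates."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            # A page ranked highly by several engines accumulates a higher score.
            scores[normalize(url)] += 1.0 / rank
    return sorted(scores, key=lambda url: (-scores[url], url))


engine_a = ["http://x.example/", "http://y.example"]
engine_b = ["http://y.example", "http://z.example"]
print(merge_results([engine_a, engine_b]))
```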

A Web directory organizes subjects in a hierarchical fashion that lets users investigate the breadth of a specific topic and drill down to find relevant links and content. Web directories can be assembled automatically by algorithms or handcrafted. Human-edited Web directories have the distinct advantage of higher quality and reliability, while those produced by algorithms can offer more comprehensive coverage. Some Web directories are broad in scope, such as DMOZ, Yahoo! and The WWW Virtual Library, covering a wide range of subjects, while others focus on specific topics.
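The hierarchical organization of a Web directory can be pictured as a tree of categories. In the minimal sketch below, the categories, addresses, the "_links" key and the function names are all invented for illustration and do not reflect the structure of any real directory.

```python
# Each category maps subcategory names to further categories;
# the assumed key "_links" holds the links filed directly under a category.
directory = {
    "Science": {
        "Biology": {"_links": ["http://bio.example"]},
        "Physics": {"_links": ["http://physics.example"]},
    },
    "Arts": {"_links": ["http://arts.example"]},
}


def drill_down(node, path):
    """Follow a list of category names from the top of the directory."""
    for category in path:
        node = node[category]
    return node


def all_links(node):
    """Collect the links under a category and all of its subcategories."""
    links = list(node.get("_links", []))
    for name, child in node.items():
        if name != "_links":
            links.extend(all_links(child))
    return links


print(all_links(drill_down(directory, ["Science"])))
```

Listing everything under "Science" shows the breadth of a topic, while a longer path such as ["Science", "Physics"] drills down to a narrower set of links.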

Specialty search tools enable users to find information that conventional search engines and metasearch engines cannot access because the content is stored in databases. In fact, the vast majority of information on the web is stored in databases that require users to go to a specific site and access it through a search form. Often, the content is generated dynamically. As a consequence, Web crawlers are unable to index this information. In a sense, this content is "hidden" from search engines, leading to the term invisible or deep Web. Specialty search tools have evolved to provide users with the means to quickly and easily find deep Web content. These specialty tools rely on advanced bot and intelligent agent technologies to search the deep Web and automatically generate specialty Web directories, such as the Virtual Private Library.

Internet Research Software

Internet Research Software enables you to capture information you find while performing internet research. This information can then be organized in various ways, including tagging and hierarchical trees. The goal is to collect information relevant to a specific research project in one place, so that it can be found and accessed again quickly.

These tools also allow captured content to be edited and annotated, and some can export it to other formats. Other common features include full-text search, which helps locate information quickly, and filters, which let you drill down to see only the information relevant to a specific query.
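The capture, tagging, annotation and search features described above can be sketched in a few lines. The classes Clip and ClipStore and all of their methods are hypothetical and do not correspond to the interface of any of the products in this section; the sketch only shows the general idea of keeping captured text together with tags and notes so that it can be found again.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """One captured piece of Web content (hypothetical structure)."""
    url: str
    text: str
    tags: set = field(default_factory=set)
    note: str = ""


class ClipStore:
    """In-memory collection of clips for one research project."""

    def __init__(self):
        self.clips = []

    def capture(self, url, text, tags=()):
        clip = Clip(url, text, set(tags))
        self.clips.append(clip)
        return clip

    def search(self, phrase):
        """Full-text search over captured text and annotations."""
        needle = phrase.lower()
        return [c for c in self.clips
                if needle in c.text.lower() or needle in c.note.lower()]

    def with_tag(self, tag):
        """Filter down to clips carrying a given tag."""
        return [c for c in self.clips if tag in c.tags]


store = ClipStore()
clip = store.capture("http://a.example", "Metasearch engines merge results.", ["search tools"])
clip.note = "Compare with directories."
store.capture("http://b.example", "Deep Web content sits in databases.", ["deep web"])
print([c.url for c in store.search("directories")])
print([c.url for c in store.with_tag("deep web")])
```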

By capturing and keeping information, you don't have to worry about Web pages or whole sites disappearing or becoming inaccessible. Internet Research Software greatly enhances internet research by enabling you to build knowledge and reuse it. Some of the most popular software includes Surfulater and WebResearch Professional for Windows, Evernote (multiple platforms), DEVONthink (Mac OS X), Springpad and Diigo (Web-based), and Scrapbook, a Firefox extension.

References

  1. Hargittai, E. (April 2002). "Second-Level Digital Divide: Differences in People’s Online Skills". First Monday. http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/942/864. Retrieved February 5, 2010.
