Citation index

A citation index is a kind of bibliographic database, an index of citations between publications, allowing the user to easily establish which later documents cite which earlier documents. A form of citation index is first found in 12th-century Hebrew religious literature. Legal citation indexes are found in the 18th century and were made popular by citators such as Shepard's Citations (1873). In 1960, Eugene Garfield's Institute for Scientific Information (ISI) introduced the first citation index for papers published in academic journals, first the Science Citation Index (SCI), and later the Social Sciences Citation Index (SSCI) and the Arts and Humanities Citation Index (AHCI). The first automated citation indexing was done by CiteSeer in 1997. Other sources for such data include Google Scholar.
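As an illustration only, the core lookup a citation index provides (which later documents cite a given earlier document) can be sketched by inverting a set of reference lists. The document identifiers and the function name `build_citation_index` below are hypothetical and do not correspond to any particular indexing service.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert reference lists: map each cited document to the documents citing it."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    # Sort the citing documents so lookups give a stable, readable order.
    return {doc: sorted(citers) for doc, citers in cited_by.items()}


# Hypothetical data: each key is a document, each value is its reference list.
references = {
    "paper_1975": ["paper_1960", "paper_1968"],
    "paper_1982": ["paper_1968"],
    "paper_1990": ["paper_1975", "paper_1968"],
}

index = build_citation_index(references)
print(index["paper_1968"])  # ['paper_1975', 'paper_1982', 'paper_1990']
```

A document that nothing cites, such as `paper_1982` here, simply has no entry in the resulting index.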

History

The earliest known citation index is an index of biblical citations in rabbinic literature, the Mafteah ha-Derashot, attributed to Maimonides and probably dating to the 12th century. It is organized alphabetically by biblical phrase. Later biblical citation indexes are in the order of the canonical text. These citation indexes were used both for general and for legal study. The Talmudic citation index En Mishpat (1714) even included a symbol to indicate whether a Talmudic decision had been overridden, just as in the 19th-century Shepard's Citations.[1][2] Unlike modern scholarly citation indexes, these early indexes covered references to only one work, the Bible.

In English legal literature, volumes of judicial reports included lists of cases cited in that volume starting with Raymond's Reports (1743) and followed by Douglas's Reports (1783). Simon Greenleaf (1821) published an alphabetical list of cases with notes on later decisions affecting the precedential authority of the original decision.[3]

The first true citation index dates to the 1860 publication of Labatt's Table of Cases...California..., followed in 1872 by Wait's Table of Cases...New York.... But the most important and best-known citation index came with the 1873 publication of Shepard's Citations.[3]

Major citation indexing services

General-purpose academic citation indexes include:

  • ISI (now part of Thomson Reuters) publishes the ISI citation indexes in print and on compact disc. They are now generally accessed through the Web under the name Web of Science, which is in turn part of the group of databases in the Web of Knowledge.
  • Elsevier publishes Scopus, available online only, which similarly combines subject searching with citation browsing and tracking in the sciences and social sciences.
  • Indian Citation Index is an online citation database that covers peer-reviewed journals published in India. It covers major subject areas such as the scientific, technical, medical, and social sciences, and includes the arts and humanities. The citation database is the first of its kind in India.

Each of these offers an index of citations between publications and a mechanism to establish which documents cite which other documents. They differ widely in cost: the ISI databases and Scopus are available by subscription (generally to libraries).

In addition, CiteSeer and Google Scholar are freely available online.

Citation analysis

Main article: Citation analysis

While citation indexes were originally designed for information retrieval, they are increasingly used for bibliometrics and other studies involving research evaluation. Citation data is also the basis of the popular journal impact factor.

There is a large body of literature on citation analysis, sometimes called scientometrics, a term invented by Vasily Nalimov, or more specifically bibliometrics. The field blossomed with the advent of the Science Citation Index, which now covers source literature from 1900 on. The leading journals of the field are Scientometrics, Informetrics, and the Journal of the American Society for Information Science and Technology. ASIST also hosts an electronic mailing list called SIGMETRICS at ASIST.[4] This method is undergoing a resurgence based on the wide dissemination of the Web of Science and Scopus subscription databases in many universities, and on universally available free citation tools such as CiteBase, CiteSeerX, Google Scholar, and the former Windows Live Academic (now available with extra features as Microsoft Academic Search).

Legal citation analysis is a citation analysis technique for legal documents. It helps readers understand inter-related regulatory compliance documents by exploring the citations that connect provisions to other provisions within the same document or between different documents. Legal citation analysis uses a citation graph extracted from a regulatory document, which could supplement e-discovery, a process that leverages technological innovations in big data analytics.[5][6][7][8]

History

In a 1965 paper, Derek J. de Solla Price described the inherent linking characteristic of the SCI as "Networks of Scientific Papers".[9] The links between citing and cited papers became dynamic when the SCI began to be published online. The Social Sciences Citation Index became one of the first databases to be mounted on the Dialog system[10] in 1972. With the advent of the CD-ROM edition, linking became even easier and enabled the use of bibliographic coupling for finding related records. In 1973, Henry Small published his classic work on co-citation analysis, which became a self-organizing classification system that led to document clustering experiments and eventually to an "Atlas of Science", later called "Research Reviews".
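The two relatedness measures mentioned above can be illustrated with a small sketch. Under the usual definitions, two documents are bibliographically coupled when their reference lists share entries, and two documents are co-cited when a later document cites both. The data and function names below are hypothetical, and the counting is a simplified illustration rather than the method of any particular database.

```python
from collections import Counter
from itertools import combinations


def bibliographic_coupling(references):
    """Count shared references for each pair of citing documents."""
    strength = Counter()
    for a, b in combinations(sorted(references), 2):
        shared = set(references[a]) & set(references[b])
        if shared:
            strength[(a, b)] = len(shared)
    return strength


def co_citation(references):
    """Count, for each pair of cited documents, how many documents cite both."""
    counts = Counter()
    for cited_list in references.values():
        for a, b in combinations(sorted(set(cited_list)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical data: P1..P3 are citing papers, A..C are the works they cite.
references = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}

coupling = bibliographic_coupling(references)
cocited = co_citation(references)
print(coupling[("P1", "P2")])  # 2 (P1 and P2 both cite A and B)
print(cocited[("A", "B")])     # 2 (A and B are cited together by P1 and P2)
```

Coupling strength is fixed once both papers are published, whereas co-citation counts can keep growing as new papers cite the pair.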

The topological and graphical nature of the worldwide citation network, an inherent property of the scientific literature, was described by Ralph Garner (Drexel University) in 1965.[11]

The use of citation counts to rank journals was a technique used in the early part of the nineteenth century, but the systematic, ongoing measurement of these counts for scientific journals was initiated by Eugene Garfield at the Institute for Scientific Information, who also pioneered the use of these counts to rank authors and papers. In a landmark paper of 1965, he and Irving Sher showed the correlation between citation frequency and eminence by demonstrating that Nobel Prize winners published five times the average number of papers, while their work was cited 30 to 50 times the average. Garfield reported on this phenomenon in a long series of essays on the Nobel and other prizes. The usual summary measure is known as the impact factor: the number of citations a journal receives in a given year to articles it published in the previous two years, divided by the number of articles it published in those two years. It is widely used, for both appropriate and inappropriate purposes; in particular, the use of this measure alone for ranking authors and papers is quite controversial.
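As a worked example of the impact factor, the sketch below reads the definition as citations received in one year to articles from the two preceding years, divided by the number of articles published in those two years. The journal figures are invented for illustration, and the function name `impact_factor` is hypothetical.

```python
def impact_factor(citations_to_prior_two_years, articles_in_prior_two_years):
    """Two-year impact factor: citations divided by citable articles."""
    if articles_in_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_to_prior_two_years / articles_in_prior_two_years


# Invented figures for a hypothetical journal:
# citations received this year to articles from each of the two previous years,
citations = 150 + 210
# and the number of articles the journal published in each of those years.
articles = 80 + 100

print(impact_factor(citations, articles))  # 2.0
```

Because the result is an average over a whole journal, it says little about any individual paper or author, which is one reason its use for ranking them is disputed.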

In an early study in 1964 of the use of citation analysis in writing the history of DNA, Garfield and Sher demonstrated the potential for generating historiographs, topological maps of the most important steps in the history of scientific topics. This work was later automated by E. Garfield, A. I. Pudovkin of the Institute of Marine Biology, Russian Academy of Sciences, and V. S. Istomin of the Center for Teaching, Learning, and Technology, Washington State University, and led to the creation of the HistCite[12] software around 2002.

Automatic citation indexing was introduced in 1998 by Lee Giles, Steve Lawrence and Kurt Bollacker[13] and enabled automatic algorithmic extraction and grouping of citations for any digital academic and scientific document. Where previous citation extraction was a manual process, citation measures could now scale up and be computed for any scholarly and scientific field and document venue, not just those selected by organizations such as ISI. This led to the creation of new systems for public and automated citation indexing, the first being CiteSeer (now CiteSeerX), soon followed by Cora, which focused primarily on the fields of computer science and information science. These were later followed by large-scale academic domain citation systems such as Google Scholar and Microsoft Academic. Such autonomous citation indexing is not yet perfect in citation extraction or citation clustering, with an error rate estimated by some at 10%, though a careful statistical sampling has yet to be done. As a result, place names such as Ann Arbor, Milton Keynes, and Walton Hall have been credited as authors with extensive academic output.[14] SCI claims to create automatic citation indexing through purely programmatic methods. Even the older records have a similar magnitude of error.
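To give a rough sense of what grouping citations involves, the sketch below clusters differently formatted citation strings by word overlap. This is not the algorithm used by CiteSeer or any other system named above; the tokenization rule, the Jaccard similarity threshold of 0.6, and the function names are all assumptions chosen for illustration.

```python
import re


def tokens(citation):
    """Lowercase words of four or more letters, plus four-digit numbers such as years."""
    return set(re.findall(r"[a-z]{4,}|\d{4}", citation.lower()))


def jaccard(a, b):
    """Share of tokens the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_citations(citations, threshold=0.6):
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups = []
    for citation in citations:
        for group in groups:
            if jaccard(tokens(citation), tokens(group[0])) >= threshold:
                group.append(citation)
                break
        else:
            groups.append([citation])
    return groups


# Two hypothetical formattings of one paper, followed by a different paper.
citations = [
    'C.L. Giles, K. Bollacker, S. Lawrence, "CiteSeer: An Automatic Citation Indexing System," 1998.',
    "Giles CL, Bollacker K, Lawrence S (1998). CiteSeer: an automatic citation indexing system.",
    "Price, D. J. de Solla (1965). Networks of Scientific Papers. Science 149.",
]

groups = group_citations(citations)
print(len(groups))  # 2
```

A heuristic of this kind fails in predictable ways: a threshold that is too loose merges distinct papers, and a parser that mistakes an address line for a name produces the spurious "authors" described above.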

References

  1. Bella Hass Weinberg, "The Earliest Hebrew Citation Indexes" in Trudi Bellardo Hahn, Michael Keeble Buckland, eds., Historical Studies in Information Science, 1998, p. 51ff
  2. Bella Hass Weinberg, "Predecessors of Scientific Indexing Structures in the Domain of Religion" in W. Boyd Rayward, Mary Ellen Bowden, eds., The History and Heritage of Scientific and Technological Information Systems, Proceedings of the 2002 Conference, 2004, p. 126ff
  3. Fred R. Shapiro, "Origins of Bibliometrics, Citation Indexing, and Citation Analysis: The Neglected Legal Literature", Journal of the American Society for Information Science 43:5:337-339 (1992)
  4. "The American Society for Information Science & Technology". The Information Society for the Information Age. Retrieved 2006-05-21.
  5. [1][dead link]
  6. Mohammad Hamdaqa and A. Hamou-Lhadj, "Citation Analysis: An Approach for Facilitating the Understanding and the Analysis of Regulatory Compliance Documents", in Proc. of the 6th International Conference on Information Technology, Las Vegas, USA
  7. Cat Casey and Alejandra Perez, "E-Discovery Special Report: The Rising Tide of Nonlinear Review". Hudson Global. Retrieved 1 July 2012.
  8. "What Technology-Assisted Electronic Discovery Teaches Us About The Role Of Humans In Technology - Re-Humanizing Technology-Assisted Review". Forbes. Retrieved 1 July 2012.
  9. Derek J. de Solla Price (July 30, 1965). "Networks of Scientific Papers" (PDF). Science 149 (3683): 510–515. doi:10.1126/science.149.3683.510. PMID 14325149.
  10. "Dialog, A Thomson Business". "Dialog invented online information services". Retrieved 2006-05-21.
  11. http://www.garfield.library.upenn.edu/rgarner.pdf
  12. Eugene Garfield, A. I. Pudovkin, V. S. Istomin (2002). "Algorithmic Citation-Linked Historiography—Mapping the Literature of Science". Presented at ASIS&T 2002: Information, Connections and Community. 65th Annual Meeting of ASIST in Philadelphia, PA. November 18–21, 2002. Retrieved 2006-05-21.
  13. C.L. Giles, K. Bollacker, S. Lawrence, "CiteSeer: An Automatic Citation Indexing System," DL'98 Digital Libraries, 3rd ACM Conference on Digital Libraries, pp. 89-98, 1998.
  14. Postellon DC (March 2008). "Hall and Keynes join Arbor in the citation indexes". Nature 452 (7185): 282. doi:10.1038/452282b. PMID 18354457.
