Citation graph

In information science and bibliometrics, a citation graph (or citation network) is a directed graph in which each vertex represents a document and each edge represents a citation from one document to another.
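As an illustration of this definition, the following minimal sketch stores a citation graph as two adjacency maps, one per edge direction. The document identifiers (`paper_A` and so on) and the function name `build_citation_graph` are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical document identifiers; each pair (citing, cited) is one directed edge.
citations = [
    ("paper_C", "paper_A"),
    ("paper_C", "paper_B"),
    ("paper_B", "paper_A"),
]

def build_citation_graph(pairs):
    """Return (references, cited_by) adjacency maps for (citing, cited) pairs."""
    references = defaultdict(set)  # out-edges: documents that a document cites
    cited_by = defaultdict(set)    # in-edges: documents that cite a document
    for citing, cited in pairs:
        references[citing].add(cited)
        cited_by[cited].add(citing)
    return references, cited_by

references, cited_by = build_citation_graph(citations)

# The in-degree of a vertex is the number of citations the document has received.
citation_counts = {doc: len(citers) for doc, citers in cited_by.items()}
```

Keeping both directions makes it cheap to answer either "what does this document cite?" or "who cites this document?", at the cost of storing each edge twice.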

The best-known example is probably the citation graph in which academic papers are the vertices, as described in the classic 1965 article "Networks of Scientific Papers"[1] by Derek J. de Solla Price. Another example is formed by court judgements, in which judges refer to earlier judgements to support their decisions; citation analysis in a legal context is therefore an important commercial field. Patents are another well-known example, since they must refer to earlier patents, which are known as prior art.

A typical application of a citation graph is to measure impact: the graph may be used in citation analysis as the basis for calculating measures of scientific impact, such as the h-index, and for studying the structure and development of different fields of academic inquiry.
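For example, the h-index of an author is the largest number h such that h of the author's papers have each received at least h citations. A minimal sketch of that calculation follows, assuming the per-paper citation counts have already been read off the citation graph (as the in-degrees of the author's papers); the function name `h_index` is illustrative.

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow further
    return h

# Five papers cited 10, 8, 5, 4 and 3 times: four of them have at least 4 citations.
example = h_index([10, 8, 5, 4, 3])
```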

The construction of citation graphs can be complicated in practice, as there is often no standard format for the citations in a bibliography. This makes the record linkage of the citations in a document to the documents they cite time-consuming. Worse, errors are often introduced into citations at any of the many stages of the publishing process. However, there is a long history of creating databases of citations, also known as citation indexes, and so there is a lot of information about such problems.
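The following is a deliberately naive sketch of such record linkage, not a description of how any particular citation index works. It assumes a hypothetical catalogue `known_titles` that maps document identifiers to titles, and it links a free-text citation to the title that shares the largest fraction of its words with the citation; the threshold of 0.7 is an arbitrary choice for the example.

```python
import re

def normalize(text):
    """Lowercase and split into words, discarding punctuation (ASCII only)."""
    return set(re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split())

def link_citation(raw_citation, known_titles, threshold=0.7):
    """Return the id of the best-matching known title, or None if none is close enough."""
    citation_words = normalize(raw_citation)
    best_id, best_score = None, 0.0
    for doc_id, title in known_titles.items():
        title_words = normalize(title)
        if not title_words:
            continue
        # Fraction of the title's words that also occur in the citation string.
        score = len(title_words & citation_words) / len(title_words)
        if score > best_score:
            best_id, best_score = doc_id, score
    return best_id if best_score >= threshold else None

# Hypothetical catalogue keyed by made-up identifiers.
known_titles = {
    "price1965": "Networks of Scientific Papers",
    "clough2015": "Transitive reduction of citation networks",
}

clean = link_citation(
    'D. J. de Solla Price, "Networks of scientific papers," Science 149, 510 (1965)',
    known_titles,
)
# One misspelled word still leaves three of the four title words matching.
typo = link_citation('Price, D. "Netwroks of Scientific Papers", Science (1965)', known_titles)
unmatched = link_citation("Some unrelated book, 1999", known_titles)
```

Word overlap tolerates differences in punctuation, capitalisation and ordering between citation styles, but it would confuse documents with near-identical titles; a real system would also compare authors, years and identifiers such as DOIs.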

In principle, each document has a unique publication date and can only refer to earlier documents. This means that the graph is not only directed but also acyclic; that is, it contains no directed cycles. A perfect citation graph is therefore an example of a directed acyclic graph. In practice this is not always true: for instance, an academic paper goes through several versions in the publishing process, and its bibliography may be updated at different times in a way that produces edges that apparently point forward in time. However, such citations appear to make up less than 1% of the total number of links in many cases.[2]
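Both properties can be checked directly on a citation graph. The sketch below assumes Python 3.9 or later (for the standard-library `graphlib` module), a graph stored as a mapping from each document to the set of documents it cites, and a hypothetical `pub_year` mapping that gives one publication year per document; all identifiers and years are invented for the example.

```python
from graphlib import CycleError, TopologicalSorter

def is_acyclic(references):
    """True if the graph (document -> set of cited documents) has no directed cycle."""
    try:
        # prepare() raises CycleError if the graph contains a cycle.
        TopologicalSorter(references).prepare()
    except CycleError:
        return False
    return True

def forward_in_time_fraction(references, pub_year):
    """Fraction of edges whose cited document is dated later than the citing one."""
    edges = [(citing, cited)
             for citing, targets in references.items()
             for cited in targets]
    if not edges:
        return 0.0
    forward = sum(1 for citing, cited in edges if pub_year[cited] > pub_year[citing])
    return forward / len(edges)

pub_year = {"A": 2001, "B": 2003, "C": 2005}

# An ideal graph: every citation points to an earlier document.
ideal = {"C": {"A", "B"}, "B": {"A"}, "A": set()}

# The same graph after A's bibliography is updated to cite the later paper C.
updated = {"C": {"A", "B"}, "B": {"A"}, "A": {"C"}}

ideal_ok = is_acyclic(ideal)
updated_ok = is_acyclic(updated)
updated_fraction = forward_in_time_fraction(updated, pub_year)
```

In the updated graph, one of the four edges points forward in time, and that single edge is enough to close the cycle between A and C.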

References


  1. Derek J. de Solla Price (July 30, 1965). "Networks of Scientific Papers". Science. 149 (3683): 510–515. Bibcode:1965Sci...149..510D. doi:10.1126/science.149.3683.510. PMID 14325149.
  2. James R Clough; Jamie Gollings; Tamar V Loach; Tim S Evans (2015). "Transitive reduction of citation networks". Journal of Complex Networks. 3 (2): 189–203. arXiv:1310.8224. doi:10.1093/comnet/cnu039. S2CID 10228152.
  • An, Yuan; Janssen, Jeannette; Milios, Evangelos E. (2004), "Characterizing and Mining the Citation Graph of the Computer Science Literature", Knowledge and Information Systems, 6 (6): 664–678, doi:10.1007/s10115-003-0128-3, S2CID 348227.
  • Yong, Fang; Rousseau, Ronald (2001), "Lattices in citation networks: An investigation into the structure of citation graphs", Scientometrics, 50 (2): 273–287, doi:10.1023/A:1010573723540, S2CID 413673.
  • Lu, Wangzhong; Janssen, J.; Milios, E.; Japkowicz, N.; Zhang, Yongzheng (2007), "Node similarity in the citation graph", Knowledge and Information Systems, 11 (1): 105–129, doi:10.1007/s10115-006-0023-9, S2CID 26234247.