Bibliographic coupling

Figure: Bibliographic coupling between documents A and B. Both cite documents C, D and E, so A and B have a bibliographic coupling strength of three.

Bibliographic coupling, like co-citation, is a similarity measure that uses citation analysis to establish a similarity relationship between documents. Bibliographic coupling occurs when two works reference a common third work in their bibliographies. It indicates that the two works probably treat related subject matter.[1]

Two documents are bibliographically coupled if they both cite one or more documents in common. The "coupling strength" of two documents increases with the number of cited documents they share. The figure illustrates the concept: documents A and B both cite documents C, D and E. Thus, documents A and B have a bibliographic coupling strength of 3, the number of elements in the intersection of their two reference lists.
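
As a minimal sketch of this definition, the following Python snippet computes the coupling strength as the size of the intersection of two reference lists. The function name `coupling_strength` is illustrative, and the extra cited documents F and G are hypothetical additions to the example in the figure.

```python
def coupling_strength(refs_a, refs_b):
    """Return the number of distinct documents cited by both reference lists."""
    return len(set(refs_a) & set(refs_b))


# A and B both cite C, D and E, as in the figure.
# F and G are hypothetical extra references that are not shared.
refs_a = ["C", "D", "E", "F"]
refs_b = ["C", "D", "E", "G"]

print(coupling_strength(refs_a, refs_b))  # 3
```

References that only one of the two documents cites (F and G here) do not affect the result, since only the intersection is counted.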

Similarly, two authors are bibliographically coupled if the cumulative reference lists of their respective oeuvres each contain a reference to a common document, and their coupling strength also increases with the number of cited documents that they share. If the cumulative reference list of an author's oeuvre is taken to be the multiset union of the reference lists of the documents that the author has co-authored, then the author bibliographic coupling strength of two authors (or, more precisely, of their oeuvres) is defined as the size of the multiset intersection of their cumulative reference lists.[2]
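
The multiset formulation can be sketched in Python with `collections.Counter`. This is an illustration under stated assumptions, not the implementation used in the cited work: it assumes the multiset union is additive (each paper in an oeuvre contributes one count per document it cites), and the function names and sample oeuvres are hypothetical.

```python
from collections import Counter


def cumulative_references(oeuvre):
    """Build an author's cumulative reference list as a multiset.

    `oeuvre` is a list of reference lists, one per paper. Assumption:
    each paper adds one count for every distinct document it cites.
    """
    total = Counter()
    for reference_list in oeuvre:
        total.update(set(reference_list))
    return total


def author_coupling_strength(oeuvre_x, oeuvre_y):
    """Size of the multiset intersection of two cumulative reference lists."""
    shared = cumulative_references(oeuvre_x) & cumulative_references(oeuvre_y)
    return sum(shared.values())


# Hypothetical oeuvres: each inner list is the reference list of one paper.
author_x = [["C", "D"], ["C", "E"]]          # C cited twice, D and E once
author_y = [["C", "D"], ["C"], ["D", "F"]]   # C and D cited twice, F once

print(author_coupling_strength(author_x, author_y))  # 3
```

The `&` operator on `Counter` objects keeps the minimum count per element, so C contributes 2 and D contributes 1, giving a strength of 3.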

Bibliographic coupling can be useful in a wide variety of fields, since it helps researchers find related research done in the past. By contrast, two documents are co-cited if both are independently cited by one or more other documents.

History

The concept of bibliographic coupling was introduced by M. M. Kessler of MIT in a paper published in 1963,[3] and has been embraced in the work of the information scientist Eugene Garfield.[4] It is one of the earliest citation analysis methods for computing document similarity. Some have questioned its usefulness, pointing out that two works may reference completely unrelated subject matter in the third. Furthermore, bibliographic coupling is a retrospective similarity measure,[5] meaning that the information used to establish the similarity relationship between documents lies in the past and is static: the bibliographic coupling strength of two documents cannot change over time, since their outgoing citations are fixed.

The co-citation analysis approach, introduced by Henry Small and published in 1973, addressed this shortcoming of bibliographic coupling by considering a document's incoming citations to assess similarity, a measure that can change over time. Additionally, the co-citation measure reflects the opinion of many authors and thus represents a better indicator of subject similarity.[6]
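
The difference between the static and the time-dependent measure can be illustrated with a small sketch. The citation graph, the document labels P1 and P2, and both function names are hypothetical; the co-citation count is taken here simply as the number of documents that cite both A and B.

```python
def coupling_strength(citations, a, b):
    """Number of documents cited by both a and b (outgoing citations)."""
    return len(citations.get(a, set()) & citations.get(b, set()))


def cocitation_count(citations, a, b):
    """Number of documents that cite both a and b (incoming citations)."""
    return sum(1 for refs in citations.values() if a in refs and b in refs)


# Hypothetical citation graph: citing document -> set of cited documents.
citations = {
    "A": {"C", "D", "E"},
    "B": {"C", "D", "E"},
    "P1": {"A", "B"},
}

before = (coupling_strength(citations, "A", "B"),
          cocitation_count(citations, "A", "B"))

# A later paper P2 cites both A and B; the reference lists of A and B stay the same.
citations["P2"] = {"A", "B", "C"}

after = (coupling_strength(citations, "A", "B"),
         cocitation_count(citations, "A", "B"))

print(before)  # (3, 1)
print(after)   # (3, 2)
```

The coupling strength of A and B stays at 3 because it depends only on their own reference lists, while the co-citation count rises from 1 to 2 as soon as a new paper cites both.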

In 1972 Robert Amsler published a paper[7] describing a measure for determining subject similarity between two documents by fusing bibliographic coupling and co-citation analysis.[8]

In 1981 Howard White and Belver Griffith introduced author co-citation analysis (ACA).[9] Not until 2008 did Dangzhi Zhao and Andreas Strotmann combine the work of White and Griffith with that of M. M. Kessler to define author bibliographic coupling analysis (ABCA), noting that, as long as authors are active, this metric is not static and that it is particularly useful when combined with ACA.[2]

In 2009, Gipp and Beel introduced a new approach termed co-citation proximity analysis (CPA). CPA is based on the concept of co-citation, but refines Small's measure in that it additionally considers the placement and proximity of citations within a document's full text. The assumption is that citations in closer proximity are more likely to exhibit a stronger similarity relationship.[10]

In summary, a chronological overview of citation analysis methods includes:

  • Bibliographic coupling (1963)
  • Amsler measure (1972)
  • Co-citation analysis (1973)
  • Author co-citation analysis (1981)
  • Author bibliographic coupling analysis (2008)
  • Co-citation proximity analysis (CPA) (2009)

Applications

Online sites that make use of bibliographic coupling include The Collection of Computer Science Bibliographies and CiteSeer.IST.

Further reading

For a summary of the progression of the study of citations, see Small (2001).[11] The paper is more a memoir than a research paper, filled with decisions, research expectations, interests and motivations, including the story of how Henry Small approached Belver Griffith with the idea of co-citation and they became collaborators, mapping science as a whole.

References

Bibliographic Coupling

  • Kessler, M. M. (1963). "Bibliographic coupling between scientific papers". American Documentation. 14 (1): 10–25. doi:10.1002/asi.5090140103. 
  • Kessler, M. M. (1963). "An experimental study of bibliographic coupling between technical papers". IEEE Transactions on Information Theory. 9 (1): 49. doi:10.1109/tit.1963.1057800.

Author Bibliographic Coupling

  • Zhao, D.; Strotmann, A. (2008). "Evolution of research activities and intellectual influences in information science 1996–2005: Introducing author bibliographic-coupling analysis". Journal of the American Society for Information Science and Technology. 59 (13): 2070–2086. doi:10.1002/asi.20910. 

Co-citation analysis

  • Small, Henry (1973). "Co-citation in the scientific literature: a new measure of the relationship between two documents". Journal of the American Society for Information Science. 24: 265–269. doi:10.1002/asi.4630240406. 
  • Small, Henry; Griffith, B. C. (1974). "The structure of scientific literatures (I) Identifying and graphing specialties". Science Studies. 4 (1): 17–40. doi:10.1177/030631277400400102. 
  • Griffith, B. C.; et al. (1974). "The structure of scientific literatures (II) Towards a macro- and micro-structure for science". Science Studies. 4 (4): 339–365. doi:10.1177/030631277400400402. 
  • Collins, H. M. (1974). "The TEA set: Tacit knowledge and scientific networks". Science Studies. 4 (2): 165–186. doi:10.1177/030631277400400203. 

Citation Studies in a More General Context

  • Henry Small (1978). "Cited Documents as Concept Symbols." Social Studies of Science, vol. 8, pp. 327–340.
  • Henry Small (1982). "Citation context analysis." In: Brenda Dervin and M. J. Voigt, eds., Progress in Communication Sciences, volume 3, pp. 287–310. Ablex Publishing, 1982.
  • Blair, David C.; Maron, M. E. (1985). "An evaluation of retrieval effectiveness for a full-text document-retrieval system". Communications of the ACM. 28 (3): 289–299. doi:10.1145/3166.3197. 
  • Brin, Sergey; Page, Lawrence (1998). "The anatomy of a large-scale hypertextual Web search engine". Computer Networks and ISDN Systems. 30 (1-7): 107–117. doi:10.1016/s0169-7552(98)00110-x. 
  • He, Yulan; Cheung Hui, Siu (2002). "Mining a web citation database for author co-citation analysis". Information Processing and Management: An International Journal. 38 (4): 491–508. doi:10.1016/s0306-4573(01)00046-2. 
  • S. Bradshaw (2003). "Reference directed indexing: Redeeming relevance for subject search in citation indexes." Proceedings of the European Conference on Research and Advanced Technology for Digital Libraries (ECDL), pp. 499–510.
  • Anna Ritchie, Simone Teufel & Stephen Robertson (2006). "Creating a test collection for citation-based IR experiments." Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pp. 391–398, June 4–9, 2006, New York.
  • Iwayama, Makoto; Fujii, Atsushi; Kando, Noriko; Marukawa, Yozo (2006). "Evaluating patent retrieval in the third NTCIR workshop". Information Processing and Management: An International Journal. 42 (1): 207–221. doi:10.1016/j.ipm.2004.08.012. 
  • Atsushi Fujii (2007). "Enhancing patent retrieval by citation analysis." SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 793–794. ACM
  • Trevor Strohman, W. Bruce Croft & David Jensen (2007). "Recommending citations for academic papers." Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23–27, 2007, Amsterdam, The Netherlands.
  • Anna Ritchie, Stephen Robertson & Simone Teufel (2008). "Comparing citation contexts for information retrieval." CIKM '08 Proceeding of the 17th ACM conference on Information and knowledge management. ACM
  • Malte Schwarzer, Moritz Schubotz, Norman Meuschke, Corinna Breitinger & Bela Gipp (2016). "Evaluating Link-based Recommendations for Wikipedia." JCDL '16 Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL). ACM

Notes

  1. Martyn, J. (1964). "Bibliographic coupling". Journal of Documentation. 20 (4): 236. doi:10.1108/eb026352.
  2. Zhao, D.; Strotmann, A. (2008). "Evolution of research activities and intellectual influences in information science 1996–2005: Introducing author bibliographic-coupling analysis". Journal of the American Society for Information Science and Technology. 59 (13): 2070–2086. doi:10.1002/asi.20910.
  3. Kessler, M. M. (1963). "Bibliographic coupling between scientific papers". American Documentation. 14 (1): 10–25. doi:10.1002/asi.5090140103.
  4. See for example Eugene Garfield, "Multiple Independent Discovery and Creativity in Science," Current Contents, Nov. 3, 1980, pp. 5-10, reprinted in Essays of an Information Scientist, vol. 4 (1979-80), pp. 660-665.
  5. Eugene Garfield, 2001. "From Bibliographic Coupling to Co-Citation Analysis via Algorithmic Historio-Bibliography". Presented at Drexel University, Philadelphia, PA.
  6. Henry Small, 1973. "Co-citation in the scientific literature: A new measure of the relationship between two documents". Journal of the American Society for Information Science (JASIS), volume 24(4), pp. 265-269. doi:10.1002/asi.4630240406
  7. Robert Amsler, Dec. 1972. "Applications of citation-based automatic classification", Linguistics Research Center, University of Texas at Austin, Technical Report 72-14.
  8. Class Amsler, written by Bruno Martins and developed by the XLDB group of the Department of Informatics of the Faculty of Sciences of the University of Lisbon in Portugal.
  9. White, Howard D.; Griffith, Belver C. (1981). "Author Cocitation: A Literature Measure of Intellectual Structure". Journal of the American Society for Information Science. 32 (3): 163–171. doi:10.1002/asi.4630320302.
  10. Bela Gipp and Joeran Beel, 2009. "Citation Proximity Analysis (CPA) – A new approach for identifying related work based on Co-Citation Analysis". In Proceedings of the 12th International Conference on Scientometrics and Informetrics (ISSI'09), Rio de Janeiro (Brazil), 2009, pp. 571-575.
  11. Small, Henry (2001). "Belver and Henry". Scientometrics. 51 (3): 489–497.
  12. Bela Gipp, Norman Meuschke & Mario Lipinski, 2015. "CITREC: An Evaluation Framework for Citation-Based Similarity Measures based on TREC Genomics and PubMed Central". In Proceedings of the iConference 2015, Newport Beach, California, 2015.

External links

Jeppe Nicolaisen, Bibliographic coupling in Birger Hjørland, ed., Core Concepts in Library and Information Science