From Wikipedia, the free encyclopedia

Arnetminer is a free online service used to index and search academic social networks.


Arnetminer is designed to search and perform data mining operations against academic publications on the Internet, using social network analysis to identify connections between researchers, conferences, and publications.[1] This allows it to provide services such as expert finding, geographic search, reviewer recommendation, association search, course search, academic performance evaluation, and topic modeling.
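As a rough illustration of how connections between researchers can be derived from publication data, the sketch below builds a co-authorship graph and ranks one author's collaborators by number of joint papers. The sample records, field names, and the functions `build_coauthor_graph` and `strongest_collaborators` are hypothetical and are not taken from Arnetminer; the system's actual analysis is described in its published papers.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records; not taken from Arnetminer's data.
publications = [
    {"title": "Paper A", "venue": "KDD", "authors": ["Alice", "Bob"]},
    {"title": "Paper B", "venue": "ICDM", "authors": ["Alice", "Carol"]},
    {"title": "Paper C", "venue": "KDD", "authors": ["Alice", "Bob", "Dave"]},
]


def build_coauthor_graph(pubs):
    """Return {author: {coauthor: number of joint papers}}."""
    graph = defaultdict(lambda: defaultdict(int))
    for pub in pubs:
        # Each unordered pair of distinct authors on a paper adds one joint paper.
        for a, b in combinations(sorted(set(pub["authors"])), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def strongest_collaborators(graph, author, k=3):
    """Rank an author's coauthors by joint-paper count, breaking ties by name."""
    neighbours = graph.get(author, {})
    return sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


graph = build_coauthor_graph(publications)
top = strongest_collaborators(graph, "Alice")
print(top)
```

The same edge-counting idea could be extended to researcher-conference or researcher-topic links by pairing authors with the `venue` field instead of with each other.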

Arnetminer was created as a research project in social influence analysis, social network ranking, and social network extraction. The development of the system has given rise to a number of peer-reviewed papers. It has been in operation for more than three years, and has indexed 1,300,000 researchers and more than three million publications.[2] The research was funded by the Chinese National High-tech R&D Program and the National Science Foundation of China.

Arnetminer is commonly used in academia to identify relationships among researchers and their work and to draw statistical correlations about them. It has attracted 2,766,356 independent IP accesses from 220 countries. The product has been used in Elsevier's SciVerse platform,[3] and by academic conferences such as SIGKDD, ICDM, PKDD, and WSDM.


Arnetminer automatically extracts researcher profiles from the web. It collects and identifies the relevant pages, then uses a unified approach to extract data from the identified documents. It also extracts publications from online digital libraries using heuristic rules.

It integrates the extracted researcher profiles with the extracted publications, using the researcher name as the identifier. Because different researchers can share a name, a probabilistic framework has been proposed to deal with name ambiguity during integration. The integrated data is stored in a researcher network knowledge base (RNKB).
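To make the integration step and the name ambiguity problem concrete, the sketch below joins profiles and publications on the researcher name, then splits one name's papers into groups that share at least one coauthor. This grouping is a deliberately simple heuristic chosen for illustration; it is not the probabilistic framework referred to above. All records, field names, and function names are hypothetical.

```python
# Hypothetical extracted records; the field names are assumptions.
profiles = {"Wei Wang": {"affiliation": "Example University"}}
publications = [
    {"title": "P1", "authors": ["Wei Wang", "Alice"]},
    {"title": "P2", "authors": ["Wei Wang", "Alice", "Bob"]},
    {"title": "P3", "authors": ["Wei Wang", "Carol"]},
]


def integrate_by_name(profiles, pubs):
    """Attach each publication to every profile whose name matches an author."""
    merged = {name: {**info, "publications": []} for name, info in profiles.items()}
    for pub in pubs:
        for author in pub["authors"]:
            if author in merged:
                merged[author]["publications"].append(pub["title"])
    return merged


def split_by_coauthors(name, pubs):
    """Group a name's papers so that papers sharing a coauthor fall together."""
    clusters = []  # each entry: (set of coauthors, list of titles)
    for pub in pubs:
        if name not in pub["authors"]:
            continue
        coauthors = set(pub["authors"]) - {name}
        titles = [pub["title"]]
        remaining = []
        for cluster_coauthors, cluster_titles in clusters:
            if cluster_coauthors & coauthors:
                # Overlap: absorb the existing cluster into the new one.
                coauthors |= cluster_coauthors
                titles = cluster_titles + titles
            else:
                remaining.append((cluster_coauthors, cluster_titles))
        remaining.append((coauthors, titles))
        clusters = remaining
    return [titles for _, titles in clusters]


merged = integrate_by_name(profiles, publications)
groups = split_by_coauthors("Wei Wang", publications)
print(merged["Wei Wang"]["publications"])
print(groups)
```

Joining on the name alone attaches all three papers to a single profile, while the coauthor grouping suggests they may belong to two different people who share the name. A probabilistic approach would weigh several such signals together rather than rely on one hard rule.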

The other principal products in this area are Google Scholar, Elsevier's Scirus, and the open source project CiteSeer.


Arnetminer was initiated and created by Professor Jie Tang of Tsinghua University, China, and was first launched in March 2006. The following list summarizes its updates since then:

  • March 2006, Version 0.1: functions include researcher profiling, expert search, conference search, and publication search. The system was developed in Perl.
  • August 2006, Version 1.0: the system was re-implemented in Java.
  • July 2007, Version 2.0: new functions include researcher interest mining, association search, and survey paper finding (unavailable now).
  • April 2008, Version 3.0: new functions include query understanding, a new GUI, and search log analysis.
  • November 2008, Version 4.0: new functions include graph search, topic modeling, and NSF/NSFC funding information extraction.
  • April 2009, Version 5.0: new functions include profile editing, an open API service, Bole search, and course search (unavailable now).
  • December 2009, Version 6.0: new functions include academic performance evaluation, user feedback, and conference analysis.
  • May 2010, Version 7.0: new functions include name disambiguation, paper-reviewer recommendation, and ArnetPage creation.
  • March 2012, Version II: renamed AMiner, with all code rewritten and the GUI redesigned. New functions include geographic search and the ArnetAPP platform.


Arnetminer has published several datasets for academic research purposes, including DBLP+citation[4] (a dataset that augments data from the Digital Bibliography & Library Project (DBLP) with citations), Name Disambiguation,[5] and Social Tie Analysis.[6] Further datasets and source code for research are also available.[7]



  1. ^ Jie Tang; Jing Zhang; Limin Yao; Juanzi Li; Li Zhang; Zhong Su (2008). "ArnetMiner: extraction and mining of academic social networks". Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (New York: ACM). 
  2. ^ "Arnetminer: introduction". Retrieved 28 May 2010. 
  3. ^ "SciVerse - HUB - Home". Retrieved 24 April 2012. 
  4. ^ "DBLP Papers + Citation Relationship". Retrieved 24 April 2012. 
  5. ^ "Name Disambiguation". Retrieved 24 April 2012. 
  6. ^ "Inferring Social Ties in Large Networks". Retrieved 24 April 2012. 
  7. ^ "Open Data and Codes by Arnetminer". Retrieved 24 April 2012. 


Further reading

  • Chi Wang, Jiawei Han, Yuntao Jia, Jie Tang, Duo Zhang, Yintao Yu, and Jingyi Guo. Mining Advisor-Advisee Relationships from Research Publication Networks. In Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'2010).
  • Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. Social Influence Analysis in Large-scale Networks. In Proceedings of the Fifteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'2009). pp. 807–816.
  • Jie Tang, Ruoming Jin, and Jing Zhang. A Topic Modeling Approach and its Integration into the Random Walk Framework for Academic Search. In Proceedings of 2008 IEEE International Conference on Data Mining (ICDM'2008). pp. 1055–1060.
  • Jie Tang, Limin Yao, Duo Zhang, and Jing Zhang. A Combination Approach to Web User Profiling. ACM Transactions on Knowledge Discovery from Data (TKDD), (vol. 5 no. 1), Article 2 (December 2010), 44 pages.