Social genome

From Wikipedia, the free encyclopedia

The social genome is the collection of data about members of a society that is captured in ever-larger and ever-more complex databases (e.g., government administrative data, operational data, and social media data). Some have used the term digital footprint to refer to the traces left by an individual.


The term has been used in two distinct ways. First, "social genome" appeared in a letter to the editor submitted to Science in response to a seminal article by King on using big data for social science.[1] The letter[2] was published, but the term was edited out of the published version. The original submission states, “A well-integrated federated data system of administrative databases updated on an ongoing basis could hold a collective representation of our society, our social genome.” Kum and others have continued to use the term since 2011, and it was defined in a peer-reviewed article in 2013,[3] which states, “Today there is a constant flow of data into, out of, and between ever-larger and ever-more complex databases about people. Together, these digital traces collectively capture our social genome, the footprints of our society.” In 2014, a vision paper[4] on population informatics elaborated further on the term.

Second, separately and at about the same time, a group of researchers led by the Brookings Institution started the Social Genome Project, which built a data-rich model to map the pathway to the middle class by tracing the life course from birth until middle age. The project's first paper[5] was published in 2012.

References


  1. King, Gary (2011-02-11). "Ensuring the Data-Rich Future of the Social Sciences". Science. 331 (6018): 719–721. doi:10.1126/science.1197872. ISSN 0036-8075. PMID 21311013.
  2. Kum, Hye-Chung; Ahalt, Stanley; Carsey, Thomas M. (2011-06-10). "Dealing with data: governments records". Science. 332 (6035): 1263. doi:10.1126/science.332.6035.1263-a. PMID 21659589.
  3. Kum, Hye-Chung; Ahalt, Stanley (2013-01-01). "Privacy-by-Design: Understanding Data Access Models for Secondary Data". AMIA Joint Summits on Translational Science Proceedings. 2013: 126–130. ISSN 2153-4063. PMC 3845756. PMID 24303251.
  4. Kum, Hye-Chung; Krishnamurthy, A.; Machanavajjhala, A.; Ahalt, S.C. (2014-01-01). "Social Genome: Putting Big Data to Work for Population Informatics". Computer. 47 (1): 56–63. doi:10.1109/MC.2013.405. ISSN 0018-9162.
  5. "Pathways to the Middle Class: Balancing Personal and Public Responsibilities". The Brookings Institution. 2012-09-20. Retrieved 2015-11-28.

External links