Data shadow

From Wikipedia, the free encyclopedia

Data shadows refer to the information that a person leaves behind unintentionally while taking part in daily activities such as checking e-mail, scrolling through social media or using a debit or credit card.[1][2][3]

The term data shadow was coined in 1972 by Kerstin Anér, a member of the Swedish legislature.[4]

The generated information can form a highly detailed record of an individual's daily trails, including their thoughts and interests, whom they communicate with, and information about the organizations with which they work or interact.[2] This information can be dispersed across a dozen organizations and servers, depending on how it is used.[5] The activities of institutions and organizations are tracked along with those of individuals. Data shadows are closely linked with data footprints, which are defined as the data left behind by individuals themselves through online activities, communications, and transactions.[2] In a chapter for the book Geography and Technology, researcher Matthew Zook and his co-authors note that data shadows have come about as a result of people becoming "digital individuals" and that these shadows are continually evolving and changing.[6] Data shadows are used to model and predict political opinions and to make inferences about a person's political values or susceptibility to advertising.[1]

Digital footprint

Data or digital footprints are obtained by monitoring and tracking individuals’ digital activities. They give companies such as Facebook and Google an incentive to invest in obtaining the data generated from these footprints so that it can be sold to marketers.[7] As illustrated by Bodle, users are willing to give up their information to companies they trust.[8] Although collecting individuals’ data raises several ethical concerns, it can be valuable for healthcare data analytics and new health services.[9] For instance, access to such data can help healthcare services shed light on the causes of disease and on the effects or side effects that treatments might have, and can facilitate analysis tailored to the individual's characteristics.[10]


Dataveillance gives rise to data shadows since it allows for the identification, classification and representation of individuals or organizations.[11] Dataveillance is defined as a mode of surveillance that tracks, monitors or regulates an individual through their digital activity, including their personal details and social media activities.[5] In 2013, Edward Snowden’s revelations about the National Security Agency's PRISM program showed that the agency would “receive” emails, video clips, photos, voice and video calls, social networking details, logins and other data held by a range of US internet firms.[12] It was also revealed that corporate social networks share their information with intelligence agencies.[13] Platform owners such as Google and Facebook anchor the trust of their users and reassure them that their information is protected by presenting corporate codes of conduct such as “Do no evil” and “Making the world transparent and connected”.[13] However, as pointed out by Bodle, platform owners are themselves collecting users' information and using it for purposes they deem necessary.[8]


The European Union has brought about a long-awaited revision of its data protection framework, the Data Protection Directive.[14] Under this framework, European Union countries are required to remove individuals' personal data upon request if the information is obsolete or irrelevant.[15] Information privacy is defined as the right that individuals have over data about themselves when it is handled by a third party,[2] and holds that the data should not be available to any organization or person without their approval.[2] However, this is not the case when data is collected and collated for marketers: online marketers use cookies, spyware, adware and similar tools to capture rich data about their customers.[16] State agencies also collect citizen data for security purposes.[2] In the aftermath of the 9/11 terrorist attacks, US national security agencies increased their collation and exchange of information in order to strengthen the United States Intelligence Community (USIC) and to minimize potential threats.[17]


  1. ^ a b Howard, Philip N. (2005). New Media Campaigns and the Managed Citizen. New York, NY: Cambridge University Press. pp. 93, 144. ISBN 9780521612272.
  2. ^ a b c d e f Kitchin, Rob (2014). The data revolution: big data, open data, data infrastructures & their consequences. Sage Publications Ltd. pp. xvii, 222 pages.
  3. ^ Koops, E.J (2011). "Forgetting footprints, shunning shadows: A critical analysis of the 'right to be forgotten' in big data practice". SCRIPTed. 8 (3): 229–256. SSRN 1986719.
  4. ^ Bellovin, Steven (2021-06-29). "Where Did "Data Shadow" Come From?". CircleID.
  5. ^ a b Raley, Rita (2013). "Dataveillance and Countervailance". "Raw Data" Is an Oxymoron. MIT Press. pp. 121–146. ISBN 9780262518284.
  6. ^ Zook, Matthew; Dodge, Martin; Aoyama, Yuko; Townsend, Anthony (2004-03-31). "New Digital Geographies: Information, Communication, and Place". Geography and Technology. Springer Science & Business Media. p. 169. ISBN 9781402018718.
  7. ^ Wyner, Gordon (2015). "Digital footprints abound". 27 (1): 16.
  8. ^ a b German, Kathleen M.; Drushel, Bruce (2011). The ethics of emerging media: information, social norms, and new media technology. Continuum. p. 279.
  9. ^ Harjumaa, Marja; Saraniemi, Saila; Pekkarinen, Saara; Lappi, Minna; Similä, Heidi; Isomursu, Minna (2017-12-05). "Feasibility of digital footprint data for health analytics and services: an explorative pilot study". BMC Medical Informatics and Decision Making. 16 (1): 139. doi:10.1186/s12911-016-0378-0. PMC 5112682. PMID 27829413.
  10. ^ Kostkova, Patty; Brewer, Helen; de Lusignan, Simon; Fottrell, Edward; Goldacre, Ben; Hart, Graham; Koczan, Phil; Knight, Peter; Marsolier, Corinne (2016). "Who Owns the Data? Open Data for Healthcare". Frontiers in Public Health. 4: 7. doi:10.3389/fpubh.2016.00007. ISSN 2296-2565. PMC 4756607. PMID 26925395.
  11. ^ Selwyn, Neil (2017-12-05). "Data entry: towards the critical study of digital data and education". Learning, Media and Technology. 40 (1): 64–82. doi:10.1080/17439884.2014.921628. S2CID 143752752.
  12. ^ Kelion, Leo (2013-06-25). "Q&A: Prism internet surveillance". BBC News. Retrieved 2017-11-25.
  13. ^ a b van Dijck, José (2014). "Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology". Surveillance & Society. 12 (2). ISSN 1477-7487.
  14. ^ Gutwirth, Serge (2013). European data protection: coming of age (1st ed.). Springer. p. 437. Retrieved 25 November 2017.
  15. ^ Walker, R. K. (2012). "The Right to be Forgotten". Hastings Law Journal. 64: 257–261.
  16. ^ Ashworth, Laurence; Free, Clinton (August 26, 2006). "Marketing Dataveillance and Digital Privacy: Using Theories of Justice to Understand Consumers' Online Privacy Concerns". Journal of Business Ethics. 67 (2): 107–123. doi:10.1007/s10551-006-9007-7. ISSN 0167-4544. S2CID 143800212.
  17. ^ Mace, Robyn R (2009). Intelligence, Dataveillance, and Information Privacy. Springer Berlin Heidelberg.