Social profiling: Difference between revisions

From Wikipedia, the free encyclopedia
RayBall (talk | contribs)
Added paragraphs
RayBall (talk | contribs)
Added text and photo
Revision as of 17:54, 23 April 2017

Social profiling is the process of constructing a user's profile from his or her publicly and voluntarily shared social data. In general, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology.[1] The growing number of successful social networks, including but not limited to LinkedIn, Google+, Facebook and Twitter, provides many media and platforms for sharing this information.[2]

Social Profile and Social Data

A person's social data refers to the personal data that they publicly and voluntarily share, either online or offline[3] (for more information, see social data revolution). A large amount of this data, including one's language, location and interests, is shared through social media and social networks. Taken together, this information can be used to construct a person's social profile.

Personalized Meta-Search Engines

The ever-increasing volume of online content has reduced the effectiveness of centralized search engines' results,[4][5] which can no longer satisfy users' demand for information. One possible solution that would increase the coverage of search results is the meta-search engine,[4] an approach that collects information from numerous centralized search engines. This creates a new problem: the collection process generates too much data and too much noise. A newer technique, the personalized meta-search engine, therefore refers to a user's profile (largely the social profile) to filter the search results. A user's profile can combine a number of things, including but not limited to "a user’s manual selected interests, user’s search history", and personal social network data.[4]
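The filtering idea described above can be illustrated with a minimal sketch. It is not the method of the cited work: the function name `personalized_meta_search`, the reciprocal-rank merging rule, the `weight` parameter and the example URLs are all assumptions made for illustration. Results from several engines are merged, and each result is then boosted according to how many of the user's profile terms appear in its snippet.

```python
def personalized_meta_search(engine_results, profile_terms, weight=1.0):
    """Merge ranked results from several engines and re-rank them by profile overlap.

    engine_results: dict mapping an engine name to a ranked list of (url, snippet).
    profile_terms:  iterable of interest terms taken from the user's profile.
    weight:         hypothetical tuning knob for how strongly the profile matters.
    """
    merged = {}  # url -> (merged rank score, snippet)
    for ranked in engine_results.values():
        for rank, (url, snippet) in enumerate(ranked, start=1):
            base, _ = merged.get(url, (0.0, snippet))
            # Reciprocal-rank merging: results ranked highly by many engines score higher.
            merged[url] = (base + 1.0 / rank, snippet)

    profile = {term.lower() for term in profile_terms}

    def score(item):
        _, (base, snippet) = item
        words = set(snippet.lower().split())
        # Fraction of profile terms found in the snippet (0.0 for an empty profile).
        overlap = len(words & profile) / len(profile) if profile else 0.0
        return base + weight * overlap

    return [url for url, _ in sorted(merged.items(), key=score, reverse=True)]
```

With an empty profile the function reduces to plain rank merging; raising `weight` makes the user's interests dominate the ordering.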

Social Media Profiling

According to Warren and Brandeis (1890), the disclosure and misuse of private information can hurt people's feelings and cause considerable damage in their lives.[6] Social networks give people access to intimate online interactions; therefore, information access control, information transactions, privacy issues, and connections and relationships on social media have become important research fields and are subject to general public concern. According to Ricard Fogues and his co-authors, "any privacy mechanism has at its base an access control", which dictates "how permissions are given, what elements can be private, how access rules are defined, and so on".[7] Current access control for social media accounts still tends to be very simplistic: there is very limited diversity in the categories of relationships available for social network accounts. On most platforms, a user's relationships to others are categorized only as "friend" or "non-friend", so people may leak important information to "friends" inside their social circle who are not necessarily the users with whom they consciously want to share it.[7] The following sections are concerned with social media profiling and what profiling information on social media accounts can achieve.
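The difference between the binary "friend"/"non-friend" model and a finer-grained relationship model can be made concrete with a small sketch. Everything here is hypothetical and is not taken from any platform's real API: the `Post` class, the two check functions and the relationship categories ("family", "coworker") are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    owner: str
    text: str
    audience: frozenset  # relationship categories allowed to see the post


def can_view_binary(friends, owner, viewer):
    """Binary model: every 'friend' of the owner sees everything."""
    return viewer == owner or viewer in friends.get(owner, set())


def can_view_relationship(relationships, post, viewer):
    """Relationship-based model: the viewer's category must be in the post's audience."""
    if viewer == post.owner:
        return True
    # Category is None when the owner has no relationship with the viewer.
    category = relationships.get(post.owner, {}).get(viewer)
    return category in post.audience
```

Under the binary check, a coworker who is a "friend" sees a post meant only for family; the relationship-based check denies that request because the coworker's category is not in the post's audience.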

Privacy Leak

A great deal of information is voluntarily shared on online social networks, and many people assume that their accounts on different platforms will not be linked unless they grant permission for such links. However, according to Diane Gan, information gathered online enabled "target subjects to be identified on other social networking sites such as Foursquare, Instagram, LinkedIn, Facebook and Google+, where more personal information was leaked".[8]

The majority of social networking platforms use the "opt out approach" for their features. If users wish to protect their privacy, it is their own responsibility to check and change the privacy settings, as a number of them are left at their default option.[8] Major social network platforms have developed geo-tagging functions, which are in popular use. This is concerning because 39% of users have experienced profile hacking; 78% of burglars have used major social media networks and Google Street View to select their victims; and 54% of burglars attempted to break into empty houses after people posted their statuses and geo-locations.[9]

Facebook

The formation and maintenance of social media accounts and their relationships with others are associated with various social outcomes.[10] For many firms, customer relationship management is essential and is partially done through Facebook.[11] Before the emergence and prevalence of social media, customer identification was based primarily on information that a firm could acquire directly:[12] for example, through a customer's purchasing process or the voluntary completion of a survey or loyalty program. However, the rise of social media has greatly reduced the reliance on building a customer's profile or model from such directly acquired data. Marketers now increasingly seek customer information through Facebook;[11] this may include a variety of information that users disclose to all or some Facebook users: name, gender, date of birth, e-mail address, sexual orientation, marital status, interests, hobbies, favorite sports team(s), favorite athlete(s), favorite music and, more importantly, Facebook connections.[11]

However, because of the design of the privacy policy, acquiring true information on Facebook is no trivial task. Facebook users often either refuse to disclose true information or make it visible only to friends, and the users who "LIKE" a page are also hard to identify. To profile and cluster users online, marketers and companies can and will access the following kinds of data: the gender, IP address and city of each user through the Facebook Insight page; who "LIKED" a certain user; a list of all the pages that a person "LIKED" (transaction data); the other people that a user follows (even beyond the first 500, which usually cannot be seen); and all publicly shared data.[11] By finding the commonalities in users' interests and applying computational data methods, marketers can easily cluster and categorize users, and even identify them.
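As a rough illustration of how commonalities in "LIKED" pages can group users, the sketch below compares users by the Jaccard similarity of their liked-page sets and greedily assigns each user to the first sufficiently similar cluster. This is an assumption-laden toy, not the clustering method of the cited study: the function names, the 0.5 threshold and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_likes(likes, threshold=0.5):
    """Greedy single-pass grouping of users by the pages they liked.

    likes: dict mapping a user id to the set of pages that user liked.
    Each user joins the first cluster whose seed user is at least `threshold`
    similar; otherwise the user starts a new cluster.
    """
    clusters = []  # list of (seed user's pages, member user ids)
    for user, pages in sorted(likes.items()):
        for seed_pages, members in clusters:
            if jaccard(pages, seed_pages) >= threshold:
                members.append(user)
                break
        else:
            clusters.append((set(pages), [user]))
    return [members for _, members in clusters]
```

A greedy pass like this depends on the order in which users are visited; it is used here only because it keeps the example short.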

Image: Streamd.in Application

Twitter

First appearing on the Internet in March 2006, Twitter is a platform on which users can connect and communicate with any other user in just 140 characters.[8] Like Facebook, Twitter is also a crucial channel through which users leak important information, often unconsciously, that others can access and collect.

According to Rachel Nuwer, in a sample of 10.8 million tweets by more than 5,000 users, the posted and publicly shared information was enough to reveal a user's income range.[13] Daniel Preoţiuc-Pietro, a postdoctoral researcher at the University of Pennsylvania, and his colleagues were able to categorize 90% of users into their corresponding income groups. After being fed into a machine-learning model, their collected data generated reliable predictions about the characteristics of each income group.[13]

The accompanying image shows a mobile app called Streamd.in, which displays live tweets on Google Maps using the geo-location details attached to each tweet and traces the user's movement in the real world.

Profiling Photos on Social Network

The advent and ubiquity of social media networks have boosted the role of images and the dissemination of visual information.[14] Much of the visual information on social media conveys messages from the author, location information and other personal information. In a study by Cristina Segalin, Dong Seon Cheng and Marco Cristani, profiling the photos in users' posts was found to reveal personal traits such as personality and mood.[14] The study introduces convolutional neural networks (CNNs), building on the main characteristics of computational aesthetics (CA), which emphasizes "computational methods," "human aesthetic point of view," and "the need to focus on objective approaches",[14] as defined by Hoenig (2005). This tool can extract and identify content in photos.

Tags

In a study called "A Rule-Based Flickr Tag Recommendation System", the authors propose personalized tag recommendations,[15] based largely on user profiles and other web resources. Such recommendations have proven useful in many areas: "web content indexing", "multimedia data retrieval," and enterprise Web searches.[15]

Delicious

Flickr

Zooomr

Marketing and Social Profiling

Marketers and retailers are increasing their market presence by creating their own pages on social media, on which they post information, ask people to like and share posts to enter contests, and much more. Studies show that a person spends about 23 minutes per day on a social networking site on average.[16] Companies both small and large are therefore investing in gathering user behavior information, ratings, reviews and more.[17]

Tools for Social Profiling

Klout

Klout is a popular online tool that assesses a user's social influence through social profiling. It takes several social media platforms (such as Facebook and Twitter) and numerous other aspects into account and generates a score from 1 to 100 for the user. Whether it is the number of likes on a post or the number of connections on LinkedIn, social media contains plentiful personal information, which Klout condenses into a single score that indicates a person's influence.

According to a study by Chad Edwards and colleagues called "How Much Klout do You Have...A Test of System Generated Cues on Source Credibility", Klout scores can influence people's perceived credibility.[18] As the Klout score becomes a popular single-score method of assessing people's influence, it can be a convenient tool and a biased one at the same time. A study by David Westerman and colleagues of how social media followers influence people's judgments illustrates the possible bias that Klout may contain.[19] In that study, participants were asked to view six identical mock Twitter pages with only one major independent variable: the number of page followers. The results show that pages with either too many or too few followers lose credibility, despite their similar content. The Klout score may be subject to the same bias.[19]

While the Klout score is sometimes used during the recruitment process, this use remains controversial.

Kred

Follower Wonk

Keyhole

Consequences

Social Credit Score in China

The Chinese government hopes to establish a "social-credit system" that aims to score the "financial creditworthiness of citizens", social behavior and even political behavior.[20] This system will combine big data and social profiling technologies. According to Celia Hatton of BBC News, everyone in China will be expected to enroll in a national database that includes and automatically evaluates their fiscal information, political behavior, social behavior and daily life, including minor traffic violations, to produce a single score that rates a citizen's trustworthiness.[21]

Credibility scores, social influence scores and other comprehensive evaluations of people are not rare in other countries. However, China's "social-credit system" remains controversial, as this single score can reflect every aspect of a person's life.[21] Indeed, “much about the social-credit system remains unclear”.[20]

References

  1. ^ Kanoje, Sumitkumar; Mukhopadhyay, Debajyoti; Girase, Sheetal (2016). "User Profiling for University Recommender System using Automatic Information Retrieval". Procedia Computer Science. 78: 5–12.
  2. ^ Vu, Xuan Truong; Abel, Marie-Hélène; Morizet-Mahoudeaux, Pierre (2015-10-01). "A user-centered and group-based approach for social data filtering and sharing". Computers in Human Behavior. Computing for Human Learning, Behaviour and Collaboration in the Social and Mobile Networks Era. 51, Part B: 1012–1023. doi:10.1016/j.chb.2014.11.079.
  3. ^ Fontinelle, Amy (2017-02-06). "Social Data". Investopedia. Retrieved 2017-04-03.
  4. ^ a b c Saoud, Zakaria; Kechid, Samir (2016-04-01). "Integrating social profile to improve the source selection and the result merging process in distributed information retrieval". Information Sciences. 336: 115–128. doi:10.1016/j.ins.2015.12.012.
  5. ^ Lawrence, Steve; Giles, C. Lee (1999-07-08). "Accessibility of information on the web". Nature. 400 (6740): 107–107. doi:10.1038/21987. ISSN 0028-0836.
  6. ^ Warren, Samuel D.; Brandeis, Louis D. (December 1890). "The Right to Privacy". Harvard Law Review. IV.
  7. ^ a b Fogues, Ricard; Such, Jose M.; Espinosa, Agustin; Garcia-Fornes, Ana (2015-05-04). "Open Challenges in Relationship-Based Privacy Mechanisms for Social Network Services". International Journal of Human–Computer Interaction. 31 (5): 350–370. doi:10.1080/10447318.2014.1001300. ISSN 1044-7318.
  8. ^ a b c Gan, Diane; Jenkins, Lily R. (2015-03-23). "Social Networking Privacy—Who's Stalking You?". Future Internet. 7 (1): 67–93. doi:10.3390/fi7010067.
  9. ^ "Social Media And Crime". Retrieved 2017-04-23.
  10. ^ Park, Namkee; Lee, Seungyoon; Kim, Jang Hyun (2012-09-01). "Individuals' personal network characteristics and patterns of Facebook use: A social network approach". Computers in Human Behavior. 28 (5): 1700–1707. doi:10.1016/j.chb.2012.04.009.
  11. ^ a b c d van Dam, Jan-Willem; van de Velden, Michel (2015-02-01). "Online profiling and clustering of Facebook users". Decision Support Systems. 70: 60–72. doi:10.1016/j.dss.2014.12.001.
  12. ^ Zhu, Feng; Zhang, Xiaoquan (Michael) (2013-05-29). "Impact of Online Consumer Reviews on Sales: The Moderating Role of Product and Consumer Characteristics". Journal of Marketing. 74 (2): 133–148. doi:10.1509/jmkg.74.2.133.
  13. ^ a b "Money Talks--and Tweets: Start Your Search!". eds.a.ebscohost.com. Retrieved 2017-04-23.
  14. ^ a b c Segalin, Cristina; Cheng, Dong Seon; Cristani, Marco (2017-03-01). "Social profiling through image understanding: Personality inference using convolutional neural networks". Computer Vision and Image Understanding. Image and Video Understanding in Big Data. 156: 34–50. doi:10.1016/j.cviu.2016.10.013.
  15. ^ a b Cagliero, Luca; Fiori, Alessandro; Grimaudo, Luigi (2013-01-01). Ramzan, Naeem; Zwol, Roelof van; Lee, Jong-Seok; Clüver, Kai; Hua, Xian-Sheng (eds.). Social Media Retrieval. Computer Communications and Networks. Springer London. pp. 169–189. doi:10.1007/978-1-4471-4555-4_8. ISBN 9781447145547.
  16. ^ "Facebook Dominates, the Emergence of reddit and Hulu: Taking a Look at 4 Years of Distracting Websites at RescueTime". RescueTime Blog. 2011-10-03. Retrieved 2017-04-07.
  17. ^ Institute of Electrical and Electronics Engineers; IEEE Communications Society (2011-01-01). 2011 IEEE 5th International Conference on Internet Multimedia Systems Architecture and Application : [IMSAA 11] : December 12-13, 2011, Bangalore, India. IEEE. ISBN 9781457713286. OCLC 835764725.
  18. ^ Edwards, Chad; Spence, Patric R.; Gentile, Christina J.; Edwards, America; Edwards, Autumn (2013-09-01). "How much Klout do you have … A test of system generated cues on source credibility". Computers in Human Behavior. 29 (5): A12–A16. doi:10.1016/j.chb.2012.12.034.
  19. ^ a b Westerman, David; Spence, Patric R.; Van Der Heide, Brandon (2012-01-01). "A social network as information: The effect of system generated reports of connectedness on credibility on Twitter". Computers in Human Behavior. 28 (1): 199–206. doi:10.1016/j.chb.2011.09.001.
  20. ^ a b "China invents the digital totalitarian state". The Economist. Retrieved 2017-04-14.
  21. ^ a b Hatton, Celia (2015-10-26). "China 'social credit': Beijing sets up huge system". BBC News. Retrieved 2017-04-14.