Social data analysis

Social data analysis is a style of analysis in which people work in a social, collaborative context to make sense of data. The term was introduced by Martin Wattenberg in 2005[1] and has more recently also been discussed as big social data analysis in relation to big data computing.

Social data analysis comprises two main constituent parts: 1) data generated from social networking sites (or through social applications), and 2) sophisticated analysis of that data. In many cases the analysis requires real-time (or near real-time) data analytics, measurements that understand and appropriately weigh factors such as influence, reach, and relevancy, an understanding of the context of the data being analyzed, and consideration of time horizons. In short, social data analytics involves analyzing social media in order to understand and surface the insights embedded within the data.[2]

Basic definition

In a social data analysis system or network, users store data sets and create visual representations of them. The data sets and visualisations are accessible to other users of the network or website, who can create new visualisations and associated commentary from the same data sets. The discussion mechanisms often use frameworks such as blogs and wikis to drive this social exploration and collaborative intelligence.

This is a new slant on business intelligence: social exploration of data can lead to serious analysis and important insights that the initiating user did not envisage or explore.

How to get social data

With the development of Web 2.0, social networks have become increasingly popular, and a growing number of researchers analyse social data in the hope of finding interesting results. Social data can be retrieved from a variety of social networks and websites, such as Twitter, Facebook, We Feel Fine and Wikipedia. Most of these services provide an API, which makes retrieval straightforward: a client sends a request to the website, and the website returns the requested data as XML or JSON. Access to more private or restricted data may require paying for the API. Social data can also be collected by adding social login to an application.
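
As a minimal sketch of the response side of this pattern, the following Python snippet extracts the same list of posts from a JSON body and from an XML body using only the standard library. The payloads, the field names (posts, post, user, text) and the parse_posts helper are hypothetical and do not correspond to any particular network's API; a real client would obtain the body from an HTTP request, usually authenticated with an API key.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies; every real API defines its own schema.
JSON_BODY = '{"posts": [{"user": "alice", "text": "hello"}, {"user": "bob", "text": "hi"}]}'
XML_BODY = (
    "<posts>"
    "<post><user>alice</user><text>hello</text></post>"
    "<post><user>bob</user><text>hi</text></post>"
    "</posts>"
)


def parse_posts(body, fmt):
    """Return a list of {'user', 'text'} dicts from a JSON or XML body."""
    if fmt == "json":
        data = json.loads(body)
        return [{"user": p["user"], "text": p["text"]} for p in data["posts"]]
    if fmt == "xml":
        root = ET.fromstring(body)
        return [
            {"user": post.findtext("user"), "text": post.findtext("text")}
            for post in root.findall("post")
        ]
    raise ValueError(f"unsupported format: {fmt}")
```

Normalising both formats into the same list of dictionaries lets later analysis code ignore which wire format a given service happened to return.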

Indexing data in bulk can be harder than querying simple APIs. Six Apart was the first social media company to provide a (free) firehose of all the posts in its network, delivered over XMPP. Twitter later provided a firehose as well, as did companies such as Spinn3r, DataSift, and Gnip.
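
The difference between a firehose and a request-based API can be illustrated with a small sketch of a stream consumer. It assumes, purely for illustration, that each line of the stream is one JSON-encoded post with user and text fields; this is not the format of any provider named above (the Six Apart stream, for instance, was delivered over XMPP). The iter_firehose and index_by_user helpers and the simulated stream are hypothetical.

```python
import io
import json
from collections import defaultdict


def iter_firehose(stream):
    """Yield one decoded post per non-empty line, skipping malformed lines."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # blank keep-alive line
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # drop a corrupt record instead of stopping the stream


def index_by_user(posts):
    """Group post texts by user as they arrive."""
    index = defaultdict(list)
    for post in posts:
        index[post["user"]].append(post["text"])
    return dict(index)


# Simulated stream standing in for a long-lived network connection.
SAMPLE = io.StringIO(
    '{"user": "alice", "text": "first"}\n'
    "\n"
    "not json\n"
    '{"user": "bob", "text": "second"}\n'
    '{"user": "alice", "text": "third"}\n'
)

index = index_by_user(iter_firehose(SAMPLE))
```

Because the consumer is a generator, posts are indexed one at a time as they arrive instead of being loaded into memory in bulk, which matters when the stream never ends.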

Methods of analysis

In most cases, the goal is to find relationships between social data and some other event, or to use the results of social data analysis to predict events. Notable articles in this field include "Twitter mood predicts the stock market"[3] and "Predicting the present with Google Trends".[4] Such analyses typically rely on statistical methods, machine learning, or data mining.
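
One of the simplest statistical checks for such a relationship is a lagged correlation: does a social signal on one day move together with another series some days later? The sketch below computes a Pearson correlation at a chosen lag in plain Python. The two series are invented for illustration, the function names are hypothetical, and this is not the method used in the cited articles, which apply their own, more elaborate models.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(signal, target, lag):
    """Correlate signal[t] with target[t + lag] for a non-negative lag."""
    if lag < 0 or lag >= len(signal):
        raise ValueError("lag must satisfy 0 <= lag < len(signal)")
    if lag == 0:
        return pearson(signal, target)
    return pearson(signal[:-lag], target[lag:])


# Invented daily series: target repeats mood with a one-day delay.
mood = [1, 2, 3, 2, 1, 2, 3, 4]
target = [0, 1, 2, 3, 2, 1, 2, 3]

same_day = lagged_correlation(mood, target, 0)
next_day = lagged_correlation(mood, target, 1)
```

In this constructed example the correlation is much stronger at a lag of one day than on the same day. A high lagged correlation is only a starting point: it does not by itself show that the social signal predicts or causes the other series.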

Universities around the world have begun to offer graduate programs in social data analysis.

Key concepts

Several factors, noted above, are important to keep in mind when discussing social data analytics:[2]

  • Sophisticated Data Analysis: what distinguishes social data analytics from sentiment analysis is the depth of the analysis. Social data analysis takes into consideration a number of factors (context, content, sentiment) to provide additional insight.
  • Time consideration: windows of opportunity are significantly limited in the field of social networking. What is relevant one day (or even one hour) may not be relevant the next, so being able to process and analyze the data quickly is imperative.
  • Influence Analysis: understanding the potential impact of specific individuals can be key to understanding how messages resonate. It is not just about quantity; it is also very much about quality.
  • Network Analysis: social data is also interesting in that it migrates, grows, or dies depending on how it is propagated through the network. This is how viral activity starts and spreads.
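
The time, influence, and network factors above can be made concrete with a small sketch over reshare events. The event data, the (resharer, original author, hour) layout, and the helper names are all hypothetical; direct reshare counts and cascade size are only crude stand-ins for the influence and reach measures a real system would use.

```python
from collections import defaultdict, deque

# Hypothetical reshare events: (resharer, original_author, hour_of_event).
EVENTS = [
    ("bob", "alice", 1),
    ("carol", "alice", 2),
    ("dave", "bob", 3),
    ("erin", "dave", 30),
    ("frank", "zoe", 2),
]


def within_window(events, start, end):
    """Keep events whose hour falls in the half-open window [start, end)."""
    return [event for event in events if start <= event[2] < end]


def direct_influence(events):
    """Count the distinct users who directly reshared each author."""
    resharers = defaultdict(set)
    for resharer, author, _hour in events:
        resharers[author].add(resharer)
    return {author: len(users) for author, users in resharers.items()}


def cascade_reach(events, source):
    """Count users reached by chains of reshares that trace back to source."""
    spread = defaultdict(set)
    for resharer, author, _hour in events:
        spread[author].add(resharer)
    seen = {source}
    queue = deque([source])
    while queue:  # breadth-first walk along the reshare edges
        user = queue.popleft()
        for nxt in spread[user]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) - 1  # exclude the source itself


first_day = within_window(EVENTS, 0, 24)
```

Here alice has only two direct resharers, but her cascade reaches four users overall and three within the first 24 hours, which illustrates why propagation through the network and the time window both matter alongside raw counts.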

References

  1. Wattenberg, Martin (2005). "Baby Names, Visualization, and Social Data Analysis". IEEE Symposium on Information Visualization.
  2. IBM Emerging Technology - jStart - On the Horizon - Social data analytics
  3. Bollen, Johan; Mao, Huina; Zeng, Xiaojun (2011). "Twitter mood predicts the stock market". Journal of Computational Science. 2 (1): 1–8.
  4. Choi, Hyunyoung; Varian, Hal (2012). "Predicting the present with Google Trends". Economic Record. 88 (s1): 2–9.