Big data requires advanced technologies to process large quantities of data within acceptable time frames. The traditional means by which official statistics are analyzed and disseminated, both commercially and privately, involves a large capital outlay for data life-cycle management and infrastructure. Additionally, the business processes by which official statistics are extracted and disseminated are inefficient and costly, and they require specific expertise.(1)
An evolution in analytics has emerged with the introduction of cloud-based self-service models. In 2006 the Australian Bureau of Statistics (ABS) led the world in the online dissemination of statistics based on a desktop self-service model, followed by the release of TableBuilder(2) (formerly CDATA Online) in 2008, powered by Space-Time Research’s SuperWEB2 platform. This was the very first instance of a government providing self-service dissemination of official statistics extracted directly from unit record microdata.(3)
A step forward has now made Big Data Analytics a reality in the cloud. In 2012 Space-Time Research embarked on a research and development project to re-architect the technology involved in desktop self-service analytics. The goal was to get analytics off the desktop and into the cloud. The project became known as SuperDataHub.
Before this model existed, a typical user would have started by searching out data cubes and downloading them into specialized desktop software. Storage, archiving and retrieval would have to be managed by the user, placing a considerable burden on IT systems and requiring considerable technical expertise. The vision was for organisations that produce official statistics to maintain data sovereignty without restricting inter-agency data merging or third-party application development. End users would be able to upload and merge proprietary data with official statistics without being granted access through the organisational firewall. A cloud-based platform would enable community access, sharing and commentary on official statistics to further promote evidence-based decision making.
There are several software development firms leading the way in such computing. These include the obvious large vendors, IBM and Microsoft, along with smaller firms such as Space-Time Research, Tableau, Birst and QlikView, to name a few.
Cloud Analytics is designed to make official statistical data readily categorized and available with the click of a mouse via the user's web browser. Cubes are available in the cloud, so the user no longer has to spend time and money downloading, storing or archiving large volumes of data. The user does not have to install expensive software, perform updates or upgrade to newer versions. All this is now handled on external servers hosted by various providers.
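To make the notion of a cube concrete, the following is a minimal Python sketch, not the method used by TableBuilder or SuperDataHub. It assumes a handful of invented unit records, counts them across two dimensions to form a small cube, and then attaches a proprietary per-region figure to each cell. All names (`unit_records`, `build_cube`, `merge_by_region`, the `stores` field) and all values are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical unit records (microdata): one dict per person.
# The values are invented for illustration only.
unit_records = [
    {"region": "North", "age_group": "0-14"},
    {"region": "North", "age_group": "15-64"},
    {"region": "North", "age_group": "15-64"},
    {"region": "South", "age_group": "15-64"},
    {"region": "South", "age_group": "65+"},
]


def build_cube(records, dimensions):
    """Count records for each combination of the chosen dimensions."""
    return Counter(tuple(r[d] for d in dimensions) for r in records)


def merge_by_region(cube, proprietary):
    """Attach a proprietary per-region value to each (region, age_group) cell."""
    return [
        {
            "region": region,
            "age_group": age_group,
            "count": count,
            # None when the proprietary data has no entry for the region.
            "stores": proprietary.get(region),
        }
        for (region, age_group), count in sorted(cube.items())
    ]


cube = build_cube(unit_records, ["region", "age_group"])

# Hypothetical proprietary data uploaded by an end user: stores per region.
proprietary = {"North": 4, "South": 1}

merged = merge_by_region(cube, proprietary)
for row in merged:
    print(row)
```

In a cloud-hosted model, the counting step would run on the provider's servers against the full microdata, and the user would see only the aggregated cells in the browser.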
The real benefit to the world of analytics is that it brings all the advantages of cloud computing to data exploration, analysis and sharing. Organisations no longer face the task of managing individual client applications and data; there is one copy located on a central cloud-based server. Every user has the latest version without the IT department spending endless time performing updates on individual machines.
The development of Cloud Analytics has yielded higher-velocity data, better computing techniques and semantic networks. This type of model could lead the way to automated correlation of datasets, providing users with information opportunities we have never seen (or thought of) before.(4)
- 1 Tracey Savage, “Statistics New Zealand's Move to Process-oriented Statistics Production: progress, lessons and the way forward”.
- 2 http://www.abs.gov.au/websitedbs/censushome.nsf/home/tablebuilder
- 3 “Making Census Data Available with CDATA Online”, Department of Innovation, May 25, 2010. http://showcase.govspace.gov.au/item/making-census-data-available-with-cdata-online/
- 4 A. Naish, “Cloud-Based Self Service Analytics”. http://www.statistics.gov.hk/wsc/CPS109-P12-S.pdf