In marketing, geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups. It makes quantitative comparisons of multiple characteristics, on the assumption that the differences within any group should be smaller than the differences between groups.
The information technologies employed in geodemographic segmentation include geographic information systems and database management software:
- Geographic information system: a business tool for interpreting data that consists of a demographic database, digitized maps, a computer and software.
- Database management software: a computer program in which data are captured, updated, maintained and organized for effective use and manipulation.
Geodemographic segmentation is based on two simple principles:
- People who live in the same neighborhood are more likely to have similar characteristics than are two people chosen at random.
- Neighborhoods can be categorized in terms of the characteristics of the population which they contain. Two neighborhoods can be placed in the same category if they contain similar types of people, even when they are widely separated geographically.
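The first principle can be illustrated with a minimal sketch that compares the average distance between residents of the same neighborhood with the average distance between residents of different neighborhoods. The data, the two variables and the function names are hypothetical and chosen only for illustration. Real systems use many more variables and standardize them first.

```python
from itertools import combinations
from math import dist

# Hypothetical residents described by (age, annual income in thousands),
# grouped by the neighborhood they live in.
neighborhoods = {
    "A": [(34.0, 20.0), (36.0, 25.0), (33.0, 22.0)],
    "B": [(61.0, 80.0), (64.0, 85.0), (59.0, 78.0)],
}

def mean_within_distance(groups):
    """Average distance between pairs drawn from the same neighborhood."""
    pairs = [dist(p, q)
             for members in groups.values()
             for p, q in combinations(members, 2)]
    return sum(pairs) / len(pairs)

def mean_between_distance(groups):
    """Average distance between pairs drawn from different neighborhoods."""
    pairs = [dist(p, q)
             for (_, a), (_, b) in combinations(groups.items(), 2)
             for p in a for q in b]
    return sum(pairs) / len(pairs)

within = mean_within_distance(neighborhoods)
between = mean_between_distance(neighborhoods)
print(f"within: {within:.1f}, between: {between:.1f}")
```

If the first principle holds for a dataset, the within-neighborhood figure should be clearly smaller than the between-neighborhood figure, as it is for this toy data.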
Clustering algorithms in geodemographic segmentation
Different algorithms lead to different results, but there is no single best approach for selecting an algorithm, just as no algorithm offers any theoretical proof of its certainty (Grekousis and Hatzichristos 2012). One of the most frequently used techniques in geodemographic segmentation is the widely known k-means clustering algorithm. In fact, most of the current commercial geodemographic systems are based on a k-means algorithm. Still, clustering techniques coming from artificial neural networks, genetic algorithms, or fuzzy logic are more efficient within large, multidimensional databases (Brimicombe 2007).
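As a rough illustration of the k-means idea, the following sketch implements the standard assignment and update steps (Lloyd's algorithm) in plain Python. The six areas, their two variables and the choice of starting centroids are hypothetical. This is not the procedure of any particular commercial system, which would typically standardize and weight many more variables.

```python
from math import dist

def k_means(points, centroids, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each area joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if not members:
                new_centroids.append(old)  # leave an empty cluster in place
                continue
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members)))
        if new_centroids == centroids:  # nothing moved: converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical small areas: (share of residents over 65, share of renters).
areas = [(0.10, 0.80), (0.12, 0.75), (0.15, 0.85),
         (0.45, 0.20), (0.50, 0.15), (0.48, 0.25)]

labels, centroids = k_means(areas, centroids=[areas[0], areas[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Each area receives exactly one label, which is the "hard" assignment that the fuzzy methods discussed later relax.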
Neural networks can handle non-linear relationships, are robust to noise and exhibit a high degree of automation. They do not assume any hypotheses regarding the nature or distribution of the data and they provide valuable assistance in handling problems of a geographical nature that, to date, have been impossible to solve. One of the best known and most efficient neural network methods for achieving unsupervised clustering is the Self-Organizing Map (SOM). SOM has been proposed as an improvement over the k-means method, for it provides a more flexible approach to census data clustering. The SOM method was used by Spielman and Thill (2008) to develop a geodemographic clustering of a census dataset concerning New York City.
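The core SOM update can be sketched in a few lines. The example below is a deliberately simplified one-dimensional map (a chain of four nodes) trained on hypothetical data. The deterministic initialization between the first and last sample, the linear decay schedules and all parameter values are assumptions made for illustration, not the setup used by Spielman and Thill (2008).

```python
from math import dist, exp

def best_matching_unit(nodes, row):
    """Index of the node whose weight vector is closest to the sample."""
    return min(range(len(nodes)), key=lambda i: dist(row, nodes[i]))

def train_som(data, n_nodes=4, epochs=50, lr0=0.5, radius0=2.0):
    """Train a one-dimensional self-organizing map (a chain of nodes)."""
    dim = len(data[0])
    first, last = data[0], data[-1]
    # Assumed deterministic start: spread nodes between two samples.
    nodes = [[first[d] + (last[d] - first[d]) * i / (n_nodes - 1)
              for d in range(dim)]
             for i in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays
        radius = max(radius0 * (1 - frac), 0.5)  # neighborhood shrinks
        for row in data:
            bmu = best_matching_unit(nodes, row)
            for i, node in enumerate(nodes):
                # Nodes near the winner on the chain are pulled along with it.
                influence = exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    node[d] += lr * influence * (row[d] - node[d])
    return nodes

# Hypothetical small areas: (share of residents over 65, share of renters).
areas = [(0.10, 0.80), (0.12, 0.75), (0.15, 0.85),
         (0.45, 0.20), (0.50, 0.15), (0.48, 0.25)]

nodes = train_som(areas)
for row in areas:
    print(row, "-> node", best_matching_unit(nodes, row))
```

Unlike k-means, the nodes are linked on a grid, so areas mapped to neighboring nodes are also similar to one another. This ordering is what makes the map useful for exploring census data.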
Another way of characterizing an individual polygon’s similarity to all the regions is based on fuzzy logic. The basic concept of fuzzy clustering is that an object may belong to more than one cluster. In binary logic, set membership is a yes-or-no matter: an object either belongs to a cluster or it does not. Fuzzy clustering instead allows a spatial unit to belong to several clusters with varying membership values. Most studies concerning geodemographic analysis and fuzzy logic employ the Fuzzy C-Means algorithm and the Gustafson-Kessel algorithm (Grekousis and Hatzichristos 2012, Feng and Flowerdew 1999).
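The idea of graded membership can be sketched with a minimal Fuzzy C-Means implementation using the standard membership and centroid update formulas. The data, the fuzzifier value m = 2 and the starting centroids are hypothetical. The Gustafson-Kessel variant, which adapts the distance measure for each cluster, is not shown.

```python
from math import dist

def memberships(points, centroids, m=2.0):
    """Fuzzy C-Means membership of every point in every cluster."""
    power = 2.0 / (m - 1.0)
    table = []
    for p in points:
        # Clamp distances to avoid division by zero at a centroid.
        d = [max(dist(p, c), 1e-12) for c in centroids]
        table.append([1.0 / sum((d[k] / d[j]) ** power for j in range(len(d)))
                      for k in range(len(d))])
    return table

def update_centroids(points, table, m=2.0):
    """Centroids are means of the points weighted by membership ** m."""
    centroids = []
    for k in range(len(table[0])):
        weights = [row[k] ** m for row in table]
        total = sum(weights)
        centroids.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(points[0]))))
    return centroids

def fuzzy_c_means(points, centroids, m=2.0, iterations=30):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        table = memberships(points, centroids, m)
        centroids = update_centroids(points, table, m)
    return memberships(points, centroids, m), centroids

# Hypothetical small areas: (share of residents over 65, share of renters).
# The last area lies between the two obvious groups.
areas = [(0.10, 0.80), (0.12, 0.75), (0.15, 0.85),
         (0.45, 0.20), (0.50, 0.15), (0.48, 0.25),
         (0.30, 0.50)]

table, centroids = fuzzy_c_means(areas, centroids=[areas[0], areas[3]])
for area, row in zip(areas, table):
    print(area, [round(u, 2) for u in row])
```

Each row of memberships sums to one. Areas close to a cluster center get a membership near one for that cluster, while the in-between area is shared roughly evenly between the two, which a hard classification could not express.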
Geodemographic segmentation systems
Well-known geodemographic segmentation systems include Claritas Prizm (US), Tapestry (US), CAMEO (UK), ACORN (UK) and MOSAIC (UK). New systems targeting subgroups of the population are also emerging. For example, Segmentos examines the geodemographic lifestyles of Hispanics in the United States. Both MOSAIC and ACORN use onomastics to infer ethnicity from residents' names.
The CAMEO Classifications are a set of consumer classifications that are used internationally by organisations as part of their sales, marketing and network planning strategies.
CAMEO UK has been built at postcode, household and individual level and classifies over 50 million British consumers. It has been built to accurately segment the British market into 68 distinct neighbourhood types and 10 key marketing segments.
Internationally, Global CAMEO is the largest consumer segmentation system in the world, covering 40 nations. There is also a single global classification, CAMEO International, which segments across borders.
CAMEO was developed and is maintained by Callcredit Information Group.
A Classification Of Residential Neighbourhoods (Acorn) is developed by CACI in London. It is the only geodemographic tool currently available that is built using current year data rather than 2011 Census information. Acorn is used to analyse and understand consumers in order to increase engagement with customers and service users and to deliver strategies across all channels. Acorn segments all 1.9 million UK postcodes into 6 categories, 18 groups and 62 types.
Mosaic UK is Experian’s people classification system. It was originally created by Prof Richard Webber (visiting Professor of Geography at King's College London) in association with Experian. The latest version of Mosaic was released in 2009. It classifies the UK population into 15 main socio-economic groups and, within these, 67 different types.
Mosaic UK is part of a family of Mosaic classifications that covers 29 countries including most of Western Europe, the United States, Australia and the Far East.
Mosaic Global is Experian's global consumer classification tool. It is based on the simple proposition that the world's cities share common patterns of residential segregation. Mosaic Global is a consistent segmentation system that covers over 400 million of the world’s households using local data from 29 countries. It has identified 10 types of residential neighbourhood that can be found in each of the countries.
In Australia, geoSmart is a geodemographic segmentation system based on the principle that people with similar demographic profiles and lifestyles tend to live near each other. It is developed by an Australian supplier of geodemographic solutions, RDA Research.
geoSmart geodemographic segments are produced from the Australian Census (Australian Bureau of Statistics) demographic measures and modeled characteristics, and the system is updated for recent household growth. The clustering creates a single segment code that is represented by a descriptive statement or a thumbnail sketch.
geoSmart is mainly used for database segmentation, customer acquisition, trade area profiling and letterbox targeting, although it can be used in a broad range of other applications.
The Output Area Classification
The Output Area Classification (OAC) is the UK Office for National Statistics' (ONS) free and open geodemographic segmentation based upon the UK Census of Population 2011. It uses 41 census variables to assign areas to a three-tier classification of 7, 21, and 52 groups.
The perceived advantages of OAC over commercial classifications stem from the fact that the methodology is open and documented, and that the data are open and freely available to both the public and commercial organizations, subject to licensing conditions.
OAC has a wide variety of potential applications, from geographic analysis to social marketing and consumer profiling. The UK public sector is one of the main users of OAC.
ESRI Community Tapestry
This method classifies US neighborhoods into 67 market segments, based on socioeconomic and demographic factors, then consolidates these 67 segments into 14 types of LifeModes with names such as "High Society", "Senior Styles", and "Factories and Farms". The smallest spatial granularity of data is produced at the level of the U.S. Census Block Group.
References
Brimicombe, A. J., 2007. A dual approach to cluster discovery in point event data sets. Computers, Environment and Urban Systems, 31, 4–18.
Feng, Z., Flowerdew, R., 1999. The use of fuzzy classification to improve geodemographic targeting. In B. Gittings (Ed.), Innovations in GIS 6 (pp. 133–144). London: Taylor & Francis.
Grekousis, G., Hatzichristos, T., 2012. Comparison of two fuzzy algorithms in geodemographic segmentation analysis: The Fuzzy C-Means and Gustafson–Kessel methods. Applied Geography, 34, 125–136. http://dx.doi.org/10.1016/j.apgeog.2011.11.004
Spielman, S. E., Thill, J. C., 2008. Social area analysis, data mining and GIS. Computers, Environment and Urban Systems, 32, 110–122.