In marketing, geodemographic segmentation is a multivariate statistical classification technique for discovering whether the individuals of a population fall into different groups. It makes quantitative comparisons of multiple characteristics, on the assumption that the differences within any group should be smaller than the differences between groups.
The information technologies employed in geodemographic segmentation include geographic information systems and database management software.
- Geographic information system: a business tool for interpreting data that consists of a demographic database, digitized maps, a computer and software.
- Database management software: a computer program in which data are captured on the computer, updated, maintained and organized for effective use and manipulation of data.
Geodemographic segmentation is based on two simple principles:
- People who live in the same neighborhood are more likely to have similar characteristics than are two people chosen at random.
- Neighborhoods can be categorized in terms of the characteristics of the population which they contain. Any two neighborhoods can be placed in the same category, i.e., they contain similar types of people, even though they are widely separated.
Clustering algorithms in geodemographic segmentation
Different algorithms lead to different results, but there is no single best approach for selecting among them, just as no algorithm offers a theoretical proof of the certainty of its results (Grekousis and Hatzichristos 2012). One of the most frequently used techniques in geodemographic segmentation is the widely known k-means clustering algorithm; in fact, most current commercial geodemographic systems are based on it. Still, clustering techniques drawn from artificial neural networks, genetic algorithms, or fuzzy logic are more efficient within large, multidimensional databases (Brimicombe 2007).
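As an illustration of the k-means idea mentioned above, the following is a minimal sketch in plain Python, not the procedure of any commercial system. The six small areas and their two standardized variables (say, share of older residents and share of renting households) are made up for the example, and the deterministic farthest-point initialisation is an assumption chosen only so the result is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start (an assumption): take the first point, then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its members."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical areas: (share of older residents, share of renting households)
areas = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
         (0.9, 0.8), (0.85, 0.95), (0.8, 0.9)]
labels, centroids = k_means(areas, k=2)
print(labels)
```

Each area receives exactly one label, which is the crisp assignment that the neural and fuzzy alternatives relax. With real census data the variables would normally be standardized first so that no single variable dominates the distance.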
Neural networks can handle non-linear relationships, are robust to noise and exhibit a high degree of automation. They make no assumptions about the nature or distribution of the data, and they provide valuable assistance in handling problems of a geographical nature that, to date, have been impossible to solve. One of the best known and most efficient neural network methods for unsupervised clustering is the Self-Organizing Map (SOM). SOM has been proposed as an improvement over the k-means method, for it provides a more flexible approach to census data clustering. Spielman and Thill (2008) used the SOM method to develop a geodemographic clustering of a census dataset for New York City.
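To make the SOM idea concrete, here is a small sketch of a one-dimensional self-organizing map in plain Python. It is an assumption-laden toy, not the configuration used by Spielman and Thill: the number of nodes, the linear decay of the learning rate and neighbourhood radius, the Gaussian neighbourhood function, and the sample data are all illustrative choices.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))


def train_som(data, n_nodes, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D chain of nodes; inputs are assumed to lie in [0, 1]."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    radius0 = n_nodes / 2
    for epoch in range(epochs):
        decay = 1 - epoch / epochs           # shrinks linearly towards 0
        lr = lr0 * decay                     # learning rate
        radius = max(radius0 * decay, 0.5)   # neighbourhood width
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j in range(n_nodes):
                # Nodes near the winner on the chain are pulled more strongly.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    weights[j][d] += lr * h * (x[d] - weights[j][d])
    return weights


# Hypothetical standardized census variables for six small areas
areas = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
         (0.9, 0.8), (0.85, 0.95), (0.8, 0.9)]
som_weights = train_som(areas, n_nodes=4)
area_nodes = [best_matching_unit(som_weights, a) for a in areas]
print(area_nodes)
```

Unlike k-means, the nodes are linked by a neighbourhood function, so nearby nodes on the chain tend to represent similar kinds of area. Practical applications typically use a two-dimensional grid of nodes rather than a chain.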
Another way of characterizing an individual polygon’s similarity to all the regions is based on fuzzy logic. The basic concept of fuzzy clustering is that an object may belong to more than one cluster. In binary logic, set membership is limited to a yes-or-no definition, meaning that an object either belongs to a cluster or does not. Fuzzy clustering allows a spatial unit to belong to more than one cluster, with varying membership values. Most studies concerning geodemographic analysis and fuzzy logic employ the Fuzzy C-Means algorithm and the Gustafson-Kessel algorithm (Grekousis and Hatzichristos 2012, Feng and Flowerdew 1999).
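The notion of graded membership can be sketched with a bare-bones Fuzzy C-Means loop in plain Python. This is a simplified illustration under stated assumptions: the fuzzifier m = 2, the fixed number of iterations, the choice of two data points as initial centres, and the seven hypothetical areas are all made up for the example. The Gustafson-Kessel variant, which adapts the distance metric per cluster, is not shown.

```python
import math


def membership_row(point, centers, m):
    """Membership of one point in every cluster; the values sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    for j, d in enumerate(dists):
        if d == 0.0:  # point sits exactly on a centre: full membership
            return [1.0 if k == j else 0.0 for k in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def update_centers(points, memberships, m):
    """Each centre is the mean of all points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    centers = list(initial_centers)
    for _ in range(iterations):
        memberships = [membership_row(p, centers, m) for p in points]
        centers = update_centers(points, memberships, m)
    # Recompute memberships so they match the final centres.
    memberships = [membership_row(p, centers, m) for p in points]
    return memberships, centers


# Hypothetical areas; the last one lies between the two obvious groups.
areas = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
         (0.9, 0.8), (0.85, 0.95), (0.8, 0.9),
         (0.5, 0.55)]
memberships, centers = fuzzy_c_means(areas, [areas[0], areas[3]])
for area, row in zip(areas, memberships):
    print(area, [round(u, 2) for u in row])
```

Areas inside a tight group get a membership close to 1 in one cluster, while the in-between area receives comparable membership in both, which a crisp classification would hide.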
Geodemographic segmentation systems
Well-known geodemographic segmentation systems include Claritas Prizm (US), Tapestry (US), CAMEO (UK), ACORN (UK) and MOSAIC (UK). New systems targeting subgroups of the population are also emerging. For example, Segmentos examines the geodemographic lifestyles of Hispanics in the United States. Both MOSAIC and ACORN use onomastics to infer ethnicity from residents' names.
The CAMEO Classifications are a set of consumer classifications used internationally by organisations as part of their sales, marketing and network planning strategies. CAMEO UK is built at postcode level and classifies over 42 million British consumers. It segments the British market into 57 distinct neighbourhood types and 10 key marketing segments. CAMEO was developed and is maintained by Callcredit Marketing Solutions.
A Classification Of Residential Neighborhoods (Acorn) is developed by CACI in London. It is the only geodemographic tool currently available that is built using current-year data rather than 2011 Census information. Acorn helps to analyse and understand consumers in order to increase engagement with customers and to deliver strategies across all channels. Acorn segments all 1.9 million UK postcodes into 6 categories, 18 groups and 62 types.
Mosaic UK is Experian’s people classification system. It was originally created by Prof Richard Webber (visiting Professor of Geography at King's College London) in association with Experian. The latest version of Mosaic was released in 2009. It classifies the UK population into 15 main socio-economic groups and, within these, 67 different types.
Mosaic UK is part of a family of Mosaic classifications that covers 29 countries including most of Western Europe, the United States, Australia and the Far East.
Mosaic Global is Experian's global consumer classification tool. It is based on the simple proposition that the world's cities share common patterns of residential segregation. Mosaic Global is a consistent segmentation system that covers over 400 million of the world’s households using local data from 29 countries. It has identified 10 types of residential neighbourhood that can be found in each of the countries.
In Australia, geoSmart is a geodemographic segmentation system based on the principle that people with similar demographic profiles and lifestyles tend to live near each other. It is developed by an Australian supplier of geodemographic solutions, RDA Research.
geoSmart geodemographic segments are produced from the Australian Census (Australian Bureau of Statistics) demographic measures and modeled characteristics, and the system is updated for recent household growth. The clustering creates a single segment code that is represented by a descriptive statement or a thumbnail sketch.
In Australia, geoSmart is mainly used for database segmentation, customer acquisition, trade area profiling and letterbox targeting, although it can be used in a broad range of other applications.
The Output Area Classification
The Output Area Classification (OAC) is the UK Office for National Statistics' (ONS) free and open geodemographic segmentation based upon the UK Census of Population 2001. It uses 41 census variables to classify output areas into a three-tier hierarchy of 7, 21 and 52 groups. It is expected that a revised and enhanced version of OAC will become available with the release of the UK 2011 Census data in roughly 2013.
The perceived advantages of OAC over commercial classifications stem from the fact that the methodology is open and documented, and the data are open and freely available. This means that OAC is not a black box, and it is free to use.
OAC has a wide variety of potential applications, from locational analysis to social marketing and consumer profiling. The UK public sector is increasingly taking up OAC, as it represents a real cost saving during a time of recession.
Brimicombe, A. J., 2007. A dual approach to cluster discovery in point event data sets. Computers, Environment and Urban Systems, 31, 4–18.
Feng, Z., Flowerdew, R., 1999. The use of fuzzy classification to improve geodemographic targeting. In B. Gittings (Ed.), Innovations in GIS 6 (pp. 133–144). London: Taylor & Francis.
Grekousis, G., Hatzichristos, T., 2012. Comparison of two fuzzy algorithms in geodemographic segmentation analysis: The Fuzzy C-Means and Gustafson–Kessel methods. Applied Geography, 34, 125–136. http://dx.doi.org/10.1016/j.apgeog.2011.11.004
Spielman, S. E., Thill, J. C., 2008. Social area analysis, data mining and GIS. Computers, Environment and Urban Systems, 32, 110–122.