Parallel coordinates
Parallel coordinates is a common way of visualizing high-dimensional geometry and analyzing multivariate data.
To show a set of points in an n-dimensional space, a backdrop is drawn consisting of n parallel lines, typically vertical and equally spaced. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the position of the vertex on the ith axis corresponds to the ith coordinate of the point.
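The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a plotting routine: the function name `to_polyline` and the `spacing` parameter are assumptions introduced here, and the axes are assumed to be the vertical lines x = 0, spacing, 2·spacing, and so on.

```python
def to_polyline(point, spacing=1.0):
    """Return the polyline vertices (x, y) for one n-dimensional point.

    The i-th axis is assumed to be the vertical line x = i * spacing;
    the vertex on that axis sits at height point[i].
    """
    return [(i * spacing, value) for i, value in enumerate(point)]


# Two points in 3-dimensional space become two polylines with 3 vertices each.
points = [(1.0, 4.0, 2.0), (3.0, 0.5, 5.0)]
polylines = [to_polyline(p) for p in points]
```

Each polyline can then be passed to any line-drawing routine; in practice the coordinates are usually rescaled first so that all axes share a common vertical range.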
History
Parallel coordinates were invented by Philbert Maurice d'Ocagne in 1885,[1] and were independently rediscovered and popularised by Alfred Inselberg[2] in 1959, who developed them systematically as a coordinate system from 1977 onward. Notable applications include collision avoidance algorithms for air traffic control (1987, three US patents), data mining (US patent), computer vision (US patent), optimization, process control and, more recently, intrusion detection, among others.
Higher dimensions
Adding more dimensions in parallel coordinates (often abbreviated ||-coords or PCs) involves adding more axes. The value of parallel coordinates is that certain geometrical properties in high dimensions transform into easily seen 2D patterns. For example, a set of points on a line in n-space transforms to a set of polylines (or curves) in parallel coordinates all intersecting at n − 1 points, one for each pair of adjacent axes. For n = 2 this yields a point-line duality, which is why the mathematical foundations of parallel coordinates are developed in projective rather than Euclidean space. Also known are the patterns corresponding to (hyper)planes, curves, several smooth (hyper)surfaces, proximities, convexity and, recently, non-orientability.[3] Since the process maps k-dimensional data onto a 2D space, some loss of information is expected. The loss of information can be measured using Parseval's identity (or energy norm).
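The point-line duality for n = 2 can be checked numerically. The sketch below assumes two vertical axes at x = 0 and x = d; the helper names `segment_line` and `intersect` are introduced here for illustration only. Points on the line x2 = m·x1 + b (with m ≠ 1) map to lines that all pass through the single point (d / (1 − m), b / (1 − m)).

```python
def segment_line(point, d=1.0):
    """Slope and intercept of the line through (0, x1) and (d, x2)."""
    x1, x2 = point
    return (x2 - x1) / d, x1


def intersect(line_a, line_b):
    """Intersection of two non-parallel lines given as (slope, intercept)."""
    (slope_a, icpt_a), (slope_b, icpt_b) = line_a, line_b
    x = (icpt_b - icpt_a) / (slope_a - slope_b)
    return x, slope_a * x + icpt_a


m, b, d = -2.0, 3.0, 1.0
# Three points on the Cartesian line x2 = m * x1 + b.
pts = [(t, m * t + b) for t in (0.0, 1.0, 2.5)]
lines = [segment_line(p, d) for p in pts]

meet_01 = intersect(lines[0], lines[1])
meet_02 = intersect(lines[0], lines[2])
expected = (d / (1 - m), b / (1 - m))  # the dual point of the line
```

When m = 1 the image lines are parallel and have no finite intersection, so the dual point lies at infinity; handling this case uniformly is what the projective setting provides.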
Statistical considerations
When used for statistical data visualisation there are three important considerations: the order, the rotation, and the scaling of the axes.
The order of the axes is critical for finding features, and in typical data analysis many reorderings will need to be tried. Some authors have come up with ordering heuristics which may create illuminating orderings.[4]
A rotation of the axes corresponds to a translation in parallel coordinates: if the lines intersect outside the region between the parallel axes, a rotation can translate the intersection so that it falls between them. The simplest example of this is rotating an axis by 180 degrees.[5]
The necessity of scaling stems from the fact that the plot is based on interpolation (linear combination) of consecutive pairs of variables.[5] The variables must therefore be on a common scale, and many scaling methods can be considered as part of the data preparation process to reveal more informative views.
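As one example of such a scaling method, the sketch below applies min-max scaling so that every variable spans the interval [0, 1]. The choice of min-max scaling, the function name `min_max_scale` and the convention of placing constant variables at 0.5 are assumptions made for this illustration; standardisation or rank-based scaling are equally plausible choices.

```python
def min_max_scale(rows):
    """Rescale each column of `rows` to the interval [0, 1].

    A column whose values are all equal is mapped to 0.5 (an arbitrary
    convention chosen here to avoid division by zero).
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [
            (v - lo) / (hi - lo) if hi > lo else 0.5
            for v, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Three observations of three variables on very different scales.
rows = [[1.0, 200.0, 0.1], [3.0, 400.0, 0.1], [2.0, 1000.0, 0.1]]
scaled = min_max_scale(rows)
```

After scaling, every axis of the plot can share the same vertical extent, so no single variable dominates the picture merely because of its units.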
A smooth parallel coordinate plot is achieved with splines.[6] In the smooth plot, every observation is mapped into a parametric line (or curve), which is smooth, continuous on the axes, and orthogonal to each parallel axis. This design emphasizes the quantization level for each data attribute.[5] If Fourier interpolation of degree equal to the data dimensionality is used, an Andrews plot[7] is obtained.
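For reference, the curve drawn for each observation in an Andrews plot is the finite Fourier series f(t) = x1/√2 + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + …, usually evaluated for t between −π and π. The following is a minimal sketch of that formula; the function name `andrews_curve` is an assumption introduced here, and the code only evaluates the curve rather than drawing it.

```python
import math


def andrews_curve(point, t):
    """Evaluate the Andrews curve of one observation at parameter t."""
    total = point[0] / math.sqrt(2)
    for k, value in enumerate(point[1:], start=1):
        freq = (k + 1) // 2  # frequencies 1, 1, 2, 2, 3, 3, ...
        # Odd positions use sine terms, even positions use cosine terms.
        basis = math.sin(freq * t) if k % 2 == 1 else math.cos(freq * t)
        total += value * basis
    return total


# Sample one observation's curve at 5 evenly spaced values of t in [-pi, pi].
observation = (1.0, 2.0, 3.0)
samples = [andrews_curve(observation, -math.pi + i * math.pi / 2) for i in range(5)]
```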
Reading
Inselberg (1997) gave a full review of how to read relational patterns in parallel coordinates visually.[8] When most lines between two parallel axes are roughly parallel to each other, this suggests a positive relationship between the two dimensions. When the lines cross in a kind of superposition of X-shapes, the relationship is negative. When the lines cross randomly or are parallel, this shows that there is no particular relationship.
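These visual cues can be quantified in a simple way. Two line segments drawn between adjacent axes cross exactly when the two observations appear in opposite order on the two axes, so the fraction of crossing pairs is near 0 for a positive relationship and near 1 for a negative one. The sketch below is an illustration under that observation, not a method taken from the cited review; the function name `crossing_fraction` is hypothetical, and both axes are assumed to point in the same direction.

```python
from itertools import combinations


def crossing_fraction(a, b):
    """Fraction of observation pairs whose segments cross between two axes.

    `a` and `b` hold the values of the same observations on two adjacent
    axes. Two segments cross when their order on axis `a` is reversed on
    axis `b`; ties are counted as non-crossing.
    """
    pairs = list(combinations(range(len(a)), 2))
    crossings = sum(1 for i, j in pairs if (a[i] - a[j]) * (b[i] - b[j]) < 0)
    return crossings / len(pairs)


x = [1, 2, 3, 4]
positive = crossing_fraction(x, [10, 20, 30, 40])  # 0.0: no segments cross
negative = crossing_fraction(x, [40, 30, 20, 10])  # 1.0: every pair crosses
mixed = crossing_fraction(x, [20, 40, 10, 30])     # 0.5: no clear relationship
```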
Software
Although there are many papers about parallel coordinates, only a few notable software packages are publicly available for converting databases into parallel coordinates graphics.[9] Notable software includes Parvis, XDAT, Mondrian, GGobi, and Macrofocus High-D. Among libraries, Protovis.js[10] and D3.js[11][12] provide basic examples, and more complex examples are also available.[13][14] D3.Parcoords.js[15] (a D3-based library) and the Macrofocus High-D API (a Java library), both specifically dedicated to creating ||-coords graphics, have also been published.
References
- d'Ocagne, Maurice (1885). Coordonnées parallèles et axiales : Méthode de transformation géométrique et procédé nouveau de calcul graphique déduits de la considération des coordonnées parallèles. Paris: Gauthier-Villars.
- Inselberg, Alfred (1985). "The Plane with Parallel Coordinates". Visual Computer. 1 (4): 69–91. doi:10.1007/BF01898350.
- Inselberg, Alfred (2009). Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications. Springer. ISBN 978-0387215075.
- Yang, Jing; Peng, Wei; Ward, Matthew O.; Rundensteiner, Elke A. (2003). "Interactive Hierarchical Dimension Ordering Spacing and Filtering for Exploration of High Dimensional Datasets" (PDF). IEEE Symposium on Information Visualization (INFOVIS 2003): 3–4.
- Moustafa, Rida; Wegman, Edward J. (2006). "Multivariate continuous data – Parallel Coordinates". Graphics of Large Datasets: Visualizing a Million. Springer. pp. 143–156. ISBN 978-0387329062.
- Moustafa, Rida; Wegman, Edward J. (2002). "On Some Generalizations of Parallel Coordinate Plots" (PDF). Seeing a million, A Data Visualization Workshop, Rain am Lech (nr.), Germany.
- Andrews, David F. (1972). "Plots of High-Dimensional Data". International Biometric Society. 18 (1): 125–136. JSTOR 2528964.
- Inselberg, A. (1997), "Multidimensional detective", Information Visualization, 1997. Proceedings., IEEE Symposium on, pp. 100–107
- Kosara, Robert (2010). "Parallel Coordinates".
- Bostock, Mike (2011). "Protovis.js: Parallel Coordinates".
- Bostock, Mike (2012). "D3.js: Parallel Coordinates".
- Davies, Jason (2011). "Parallel Coordinates".
- Chang, Kai (2012). "Nutrient Contents - Parallel Coordinates".
- http://bl.ocks.org/syntagmatic
- Chang, Kai (2012). "Parallel Coordinates (beta)".
Further reading
- Heinrich, Julian and Weiskopf, Daniel (2013) State of the Art of Parallel Coordinates, Eurographics 2013 - State of the Art Reports, pp. 95-116
- Moustafa, Rida (2011) Parallel coordinate and parallel coordinate density plots, Wiley Interdisciplinary Reviews: Computational Statistics, Vol 3(2), pp. 134-148.
External links
- Alfred Inselberg's Homepage, with Visual Tutorial, History, Selected Publications and Applications
- An Investigation of Methods for Visualising Highly Multivariate Datasets by C. Brunsdon, A. S. Fotheringham & M. E. Charlton, University of Newcastle, UK
- Parallel coordinates plot in GGobi
- Parallel coordinates plot in the public-domain software package XmdvTool
- Using Curves to Enhance Parallel Coordinate Visualisations by Martin Graham & Jessie Kennedy, Napier University, Edinburgh, UK
- Clustergram: A graph for visualizing cluster analyses, based on the parallel coordinates of each observation's cluster mean over the number of potential clusters (implemented in R).
- XDAT – a free, GPL-licensed, Java-based software for plotting parallel coordinates.
- Parallel Coordinates, a tutorial by Robert Kosara
- High-D A multi-platform commercial tool for creating parallel coordinates visualizations (with examples)
- Parallel coordinates plot in Omniscope