Multiway data analysis
Multiway data analysis is a method of analyzing large data sets by representing the data as a multidimensional array. The proper choice of array dimensions and analysis techniques can reveal patterns in the underlying data undetected by other methods.
The study of multiway data analysis was first formalized at a conference held in 1988, which produced the first text specifically devoted to the field, Coppi and Bolasco's Multiway Data Analysis. At that time, the application areas for multiway analysis included statistics, econometrics and psychometrics. Applications have since expanded to include chemometrics, agriculture, social network analysis and the food industry.
Composition of multiway data analysis
Multiway data analysts use the term way to refer to a dimension of the data while reserving the word mode for the methods or models used to analyze the data.
In this sense, data can be classified by its number of ways:
- One-way data is a vector, with a single data value for each discrete or continuous value of the single dimension.
- Two-way data is a matrix, with a single data value for each discrete or continuous value of two separate dimensions; a spreadsheet can be used to visualize such data in the case of discrete dimensions.
- Three-way data can be viewed as a stack of matrices (or similarly, as a workbook of multiple spreadsheets), adding a third dimension. Such data might represent the temperature at different locations (two-way data) sampled at different times (the third dimension, leading to three-way data).
- Four-way data, using the same spreadsheet analogy, can be represented as a file folder full of separate workbooks.
- Five-way data and six-way data can be represented by similarly higher levels of data aggregation.
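The list above maps directly onto array dimensions. The following is a minimal sketch using NumPy; the sizes (4 by 3 locations, 5 time samples, 2 sensors) and the variable names are illustrative assumptions, not taken from any real data set.

```python
import numpy as np

# Hypothetical sizes: a 4 x 3 grid of locations, 5 time samples, 2 sensors.
one_way = np.arange(5, dtype=float)                  # vector: one value per time sample
two_way = np.arange(12, dtype=float).reshape(4, 3)   # matrix: one value per (x, y) location

# Three-way: a stack of matrices, one "spreadsheet" per time sample.
three_way = np.stack([two_way + t for t in range(5)], axis=2)

# Four-way: a "folder of workbooks", here one three-way array per sensor.
four_way = np.stack([three_way, 2.0 * three_way], axis=3)

for name, arr in [("one-way", one_way), ("two-way", two_way),
                  ("three-way", three_way), ("four-way", four_way)]:
    print(f"{name}: ndim={arr.ndim}, shape={arr.shape}")
```

The number of ways corresponds to `ndim`, and each entry of `shape` gives the number of levels along one way.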
In general, the several dimensions represented in the data set may be measured at different times, or in different places, using different methodologies, and may contain inconsistencies such as missing data or discrepancies in data representation.
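Missing values of the kind mentioned above are often marked explicitly so that summaries can skip them. The sketch below assumes a hypothetical three-way temperature array (x, y, time) with synthetic values and uses NaN as the missing-data marker; real data sets may encode gaps differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical three-way data: temperature at a 4 x 3 grid over 5 time samples.
temps = rng.normal(loc=15.0, scale=2.0, size=(4, 3, 5))

# Mark a few readings as missing (NaN is the assumed missing-data marker).
temps[1, 2, 3] = np.nan
temps[0, 0, :2] = np.nan

# Count missing readings per time sample by summing over both location ways.
missing_per_time = np.isnan(temps).sum(axis=(0, 1))

# Average over the time way while ignoring the missing readings.
mean_per_location = np.nanmean(temps, axis=2)

print("missing per time sample:", missing_per_time)
print("mean per location shape:", mean_per_location.shape)
```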
The multiway model refers to the selection of the number and nature of dimensions used to represent the data available. The ultimate goal is to reduce the multiple dimensions down to one or two (by detecting the patterns within the data) that can then be presented to human decision-makers.
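One simple way to reduce a three-way array to two dimensions for presentation is to unfold it into a matrix and keep the two leading principal components. This is only a sketch of one possible approach, not a method prescribed by the text; the data are synthetic and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic three-way data (6 locations x 4 variables x 5 times):
# one dominant pattern plus a small amount of noise.
a, b, c = rng.normal(size=6), rng.normal(size=4), rng.normal(size=5)
data = np.einsum("i,j,k->ijk", a, b, c) + 0.01 * rng.normal(size=(6, 4, 5))

# Unfold along the first way: each row is one location,
# each column is one (variable, time) combination.
unfolded = data.reshape(data.shape[0], -1)
unfolded = unfolded - unfolded.mean(axis=0)   # center each column

# Principal components via the singular value decomposition.
u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
scores = u[:, :2] * s[:2]                     # two coordinates per location
explained = s**2 / np.sum(s**2)               # share of variance per component

print("scores shape:", scores.shape)
print("variance explained by first component:", round(float(explained[0]), 4))
```

Unfolding discards the distinction between the second and third ways, which is the main reason dedicated multiway models are often preferred over this shortcut.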
Multiway data analysis is applied to the problem of finding hidden multilinear structure in multiway data sets. Examples of application fields include:
- Computer vision
- Electroanalytical chemistry
- Process analysis
- Social network analysis/web-mining
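Hidden multilinear structure of the kind these applications look for is often modelled as a sum of rank-one terms, known as the CP or PARAFAC model. The text does not name a specific model, so the following is an assumption chosen for illustration: a plain alternating least squares fit written with NumPy, where `cp_als` is a hypothetical helper and the data are synthetic.

```python
import numpy as np

def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a CP model with `rank` components to a three-way array by plain ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.normal(size=(I, rank))
    B = rng.normal(size=(J, rank))
    C = rng.normal(size=(K, rank))
    errors = []
    for _ in range(n_iter):
        # Each update solves a linear least squares problem for one factor
        # matrix while the other two are held fixed.
        A = np.einsum("ijk,jr,kr->ir", X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        approx = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(X - approx) / np.linalg.norm(X))
    return (A, B, C), errors

# Synthetic 6 x 5 x 4 array built from two hidden rank-one components.
rng = np.random.default_rng(1)
true_factors = [rng.normal(size=(n, 2)) for n in (6, 5, 4)]
X = np.einsum("ir,jr,kr->ijk", *true_factors)

(A, B, C), errors = cp_als(X, rank=2)
print(f"relative error after first sweep: {errors[0]:.3g}, after last: {errors[-1]:.3g}")
```

Plain ALS can converge slowly on some data; practical work usually relies on a maintained tensor library with better initialization and stopping rules.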
Multiway processing is the execution of the chosen multiway model or models to transform multiway data into the form required by a particular multiway application. Data generated with a potentiometric electronic tongue provide a typical example of such processing.
References
- Coppi, R.; Bolasco, S., eds. (1989). Multiway Data Analysis. Amsterdam: North-Holland. ISBN 9780444874108.
- Kroonenberg, Pieter M. (2008). Applied Multiway Data Analysis. Wiley Series in Probability and Statistics. Vol. 702. John Wiley & Sons. p. xv. ISBN 9780470237991.
- Bro, Rasmus (20 November 1998). Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications (Ph.D. thesis). University of Amsterdam.
- Acar, Evrim; Yener, Bulent. Unsupervised Multiway Data Analysis: A Literature Survey (Thesis). Rensselaer Polytechnic Institute.
- Cartas, Raul; Mimendia, Aitor; Legin, Andrey; del Valle, Manel (2011). "Multiway Processing of Data Generated with a Potentiometric Electronic Tongue in a SIA System". Electroanalysis. doi:10.1002/elan.201000642.