Singular spectrum analysis
Singular spectrum analysis (SSA) combines elements of classical time series analysis, multivariate statistics, multivariate geometry, dynamical systems and signal processing. Its roots lie in the classical Karhunen (1946)–Loève (1945, 1978) spectral decomposition of time series and random fields and in the Mañé (1981)–Takens (1981) embedding theorem.
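In practice, SSA is a nonparametric spectral estimation method based on embedding a time series $\{X(t) : t = 1, \ldots, N\}$ in a vector space of dimension $M$. SSA proceeds by diagonalizing the $M \times M$ lag-covariance matrix $\mathbf{C}_X$ of $X(t)$ to obtain spectral information on the time series, assumed to be stationary in the weak sense. The matrix $\mathbf{C}_X$ can be estimated directly from the data as a Toeplitz matrix with constant diagonals (Vautard and Ghil, 1989), i.e., its entries $c_{ij}$ depend only on the lag $|i - j|$:
$$c_{ij} = \frac{1}{N - |i - j|} \sum_{t=1}^{N - |i - j|} X(t)\, X(t + |i - j|).$$
An alternative way to compute $\mathbf{C}_X$ is by using the $N' \times M$ "trajectory matrix" $\mathbf{D}$ that is formed by $M$ lag-shifted copies of $X(t)$, which are $N' = N - M + 1$ long; then
$$\mathbf{C}_X = \frac{1}{N'} \mathbf{D}^{\mathrm{T}} \mathbf{D}.$$
The $M$ eigenvectors $\mathbf{E}_k$ of the lag-covariance matrix $\mathbf{C}_X$ are called temporal empirical orthogonal functions (EOFs). The eigenvalues $\lambda_k$ of $\mathbf{C}_X$ account for the partial variance in the direction $\mathbf{E}_k$, and the sum of the eigenvalues, i.e., the trace of $\mathbf{C}_X$, gives the total variance of the original time series $X(t)$. The name of the method derives from the singular values $\lambda_k^{1/2}$ of the normalized trajectory matrix $\mathbf{D}/\sqrt{N'}$.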
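The two estimates of the lag-covariance matrix described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function names, the synthetic test series and the choice to centre the series first are assumptions made here, not part of the cited papers.

```python
import numpy as np

def lag_covariance_toeplitz(x, M):
    """Toeplitz estimate: entry (i, j) depends only on the lag |i - j|."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: the series is centred before analysis
    N = x.size
    # c[k] = 1/(N - k) * sum_t x(t) x(t + k) for lags k = 0, ..., M-1
    c = np.array([np.dot(x[:N - k], x[k:]) / (N - k) for k in range(M)])
    lags = np.abs(np.subtract.outer(np.arange(M), np.arange(M)))
    return c[lags]

def lag_covariance_trajectory(x, M):
    """Estimate from the N' x M trajectory matrix D: C = D^T D / N'."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n_rows = x.size - M + 1  # N' = N - M + 1
    D = np.column_stack([x[j:j + n_rows] for j in range(M)])
    return D.T @ D / n_rows

def eofs_and_eigenvalues(C):
    """Eigen-decomposition of a symmetric matrix, sorted by decreasing eigenvalue."""
    lam, E = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1]
    return lam[order], E[:, order]

# Hypothetical test series: a period-25 oscillation plus white noise.
rng = np.random.default_rng(0)
t = np.arange(400)
x = np.sin(2 * np.pi * t / 25) + 0.3 * rng.standard_normal(t.size)

M = 40
C_toep = lag_covariance_toeplitz(x, M)
C_traj = lag_covariance_trajectory(x, M)
lam, E = eofs_and_eigenvalues(C_traj)  # columns of E are the EOFs
```

The two estimates generally differ slightly for a finite series; only the first is exactly Toeplitz, while the second is positive semi-definite by construction.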
Decomposition and reconstruction
Projecting the time series onto each EOF yields the corresponding temporal principal components (PCs) $\mathbf{A}_k$:
$$A_k(t) = \sum_{j=1}^{M} X(t + j - 1)\, E_k(j), \qquad t = 1, \ldots, N - M + 1.$$
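In matrix form, the projection onto the EOFs is the product of the trajectory matrix with the matrix of eigenvectors. The following NumPy sketch illustrates this; the helper name `principal_components` and the synthetic series are hypothetical, and the series is centred as an assumption.

```python
import numpy as np

def principal_components(x, M):
    """Return eigenvalues, EOFs (columns of E) and PCs (columns of A)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: centred series
    n_rows = x.size - M + 1
    # Trajectory matrix: column j holds the series shifted by j time steps.
    D = np.column_stack([x[j:j + n_rows] for j in range(M)])
    lam, E = np.linalg.eigh(D.T @ D / n_rows)
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]
    # A[t, k] = sum_j x[t + j] * E[j, k]  (zero-based version of the PC formula)
    A = D @ E
    return lam, E, A

rng = np.random.default_rng(1)
t = np.arange(400)
x = np.sin(2 * np.pi * t / 25) + 0.3 * rng.standard_normal(t.size)
lam, E, A = principal_components(x, M=40)
```

With this normalization the mean square of each PC equals the corresponding eigenvalue, which is one way to check an implementation.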
An oscillatory mode is characterized by a pair of nearly equal SSA eigenvalues and associated PCs that are in approximate phase quadrature (Ghil et al., 2002). Such a pair can efficiently represent a nonlinear, anharmonic oscillation, because a single pair of data-adaptive SSA eigenmodes will often capture the basic periodicity of an oscillatory mode better than methods with fixed basis functions, such as the sines and cosines used in the Fourier transform.
The window width $M$ determines the longest periodicity captured by SSA. Signal-to-noise separation can be obtained by merely inspecting the slope break in a "scree diagram" of eigenvalues $\lambda_k$ or singular values $\lambda_k^{1/2}$ vs. $k$. The point $k^* = S$ at which this break occurs should not be confused with a "dimension" $D$ of the underlying deterministic dynamics (Vautard and Ghil, 1989).
A Monte-Carlo test (Allen and Robertson, 1996) can be applied to ascertain the statistical significance of the oscillatory pairs detected by SSA. The entire time series or parts of it that correspond to trends, oscillatory modes or noise can be reconstructed by using linear combinations of the PCs and EOFs, which provide the reconstructed components (RCs) $\mathbf{R}_K$:
$$R_K(t) = \frac{1}{M_t} \sum_{k \in K} \sum_{j = L_t}^{U_t} A_k(t - j + 1)\, E_k(j);$$
here $K$ is the set of EOFs on which the reconstruction is based. The values of the normalization factor $M_t$, as well as of the lower and upper bounds of summation $L_t$ and $U_t$, differ between the central part of the time series and the vicinity of its endpoints (Ghil et al., 2002).
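One common way to implement the reconstruction is to average the rank-reduced trajectory matrix along its anti-diagonals; the number of entries on each anti-diagonal then plays the role of the normalization factor, which is smaller near the endpoints. The sketch below takes that route. The function names and the test series are hypothetical, and the series is assumed to be centred.

```python
import numpy as np

def ssa_decompose(x, M):
    """Eigenvalues, EOFs and PCs of a centred series (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n_rows = x.size - M + 1
    D = np.column_stack([x[j:j + n_rows] for j in range(M)])
    lam, E = np.linalg.eigh(D.T @ D / n_rows)
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]
    return lam, E, D @ E

def reconstruct(A, E, K):
    """Reconstructed component built from the EOF indices listed in K."""
    n_rows, M = A.shape[0], E.shape[0]
    N = n_rows + M - 1
    D_K = A[:, K] @ E[:, K].T  # part of the trajectory matrix carried by K
    rc = np.zeros(N)
    counts = np.zeros(N)       # number of terms per time point (normalization)
    for j in range(M):
        rc[j:j + n_rows] += D_K[:, j]
        counts[j:j + n_rows] += 1
    return rc / counts

rng = np.random.default_rng(2)
t = np.arange(400)
signal = np.sin(2 * np.pi * t / 25)
x = signal + 0.3 * rng.standard_normal(t.size)

M = 40
lam, E, A = ssa_decompose(x, M)
rc_pair = reconstruct(A, E, [0, 1])           # leading oscillatory pair
rc_all = reconstruct(A, E, list(range(M)))    # all modes: recovers centred series
```

Summing over all EOFs returns the centred series exactly, so the RCs form an additive decomposition of the data.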
Multivariate extension
Multi-channel SSA (or M-SSA) is a natural extension of SSA to an $L$-channel time series of vectors or maps with $N$ data points $\{X_l(t) : l = 1, \ldots, L;\ t = 1, \ldots, N\}$. In the meteorological literature, extended EOF (EEOF) analysis is often assumed to be synonymous with M-SSA. The two methods are both extensions of classical principal component analysis (PCA) but they differ in emphasis: EEOF analysis typically utilizes a number $L$ of spatial channels much greater than the number $M$ of temporal lags, thus limiting the temporal and spectral information. In M-SSA, on the other hand, one usually chooses $L \le M$. Often M-SSA is applied to a few leading PCs of the spatial data, with $M$ chosen large enough to extract detailed temporal and spectral information from the multivariate time series (Ghil et al., 2002).
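A minimal sketch of one way to set up M-SSA is to place the trajectory matrices of the individual channels side by side and diagonalize the resulting block covariance matrix. The layout of the augmented matrix, the function name `mssa` and the three-channel test data are assumptions made for illustration.

```python
import numpy as np

def mssa(X, M):
    """X has shape (N, L): N time points, L channels. Returns lam, E, A."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # assumption: each channel is centred
    N, L = X.shape
    n_rows = N - M + 1
    # One N' x M trajectory block per channel, stacked horizontally.
    blocks = [np.column_stack([X[j:j + n_rows, l] for j in range(M)])
              for l in range(L)]
    D = np.hstack(blocks)   # shape (N', L * M)
    C = D.T @ D / n_rows    # (L*M) x (L*M) covariance with cross-channel lags
    lam, E = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]
    return lam, E, D @ E    # space-time EOFs in E, PCs in the columns of D @ E

# Hypothetical data: three channels sharing one oscillation at different phases.
rng = np.random.default_rng(3)
t = np.arange(300)
phases = np.array([0.0, 0.8, 1.6])
X = np.sin(2 * np.pi * t[:, None] / 20 + phases) \
    + 0.3 * rng.standard_normal((t.size, 3))
lam, E, A = mssa(X, M=30)
```

Here the number of channels (3) is much smaller than the window width (30), in line with the usual M-SSA choice described above.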
Spatio-temporal gap filling
The gap-filling version of SSA can be used to analyze data sets that are unevenly sampled or contain missing data (Kondrashov and Ghil, 2006). For a univariate time series, the SSA gap-filling procedure utilizes temporal correlations to fill in the missing points. For a multivariate data set, gap filling by M-SSA takes advantage of both spatial and temporal correlations. In either case: (i) estimates of missing data points are produced iteratively, and are then used to compute a self-consistent lag-covariance matrix $\mathbf{C}_X$ and its EOFs $\mathbf{E}_k$; and (ii) cross-validation is used to optimize the window width $M$ and the number of leading SSA modes to fill the gaps with the iteratively estimated "signal," while the noise is discarded.
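The iterative part of the procedure, step (i), can be illustrated with a deliberately simplified univariate sketch. It is not the published algorithm: the window width and number of modes are fixed by hand rather than chosen by cross-validation, all retained modes are used from the first iteration, and the function names, iteration count and test series are assumptions.

```python
import numpy as np

def ssa_smooth(x, M, n_modes):
    """Reconstruct a centred series from its n_modes leading SSA modes."""
    n_rows = x.size - M + 1
    D = np.column_stack([x[j:j + n_rows] for j in range(M)])
    lam, E = np.linalg.eigh(D.T @ D / n_rows)
    E = E[:, np.argsort(lam)[::-1][:n_modes]]  # leading EOFs only
    D_k = (D @ E) @ E.T
    rc = np.zeros(x.size)
    counts = np.zeros(x.size)
    for j in range(M):                          # anti-diagonal averaging
        rc[j:j + n_rows] += D_k[:, j]
        counts[j:j + n_rows] += 1
    return rc / counts

def ssa_gap_fill(x, M, n_modes, n_iter=100):
    """Fill NaN entries of x by iterating SSA reconstruction (simplified)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Initial guess: mean of the observed points at every missing position.
    filled = np.where(missing, np.nanmean(x), x)
    for _ in range(n_iter):
        mean = filled.mean()                    # re-centre at each iteration
        rc = ssa_smooth(filled - mean, M, n_modes) + mean
        filled[missing] = rc[missing]           # observed values are never altered
    return filled

# Hypothetical test: a clean oscillation with one block gap and isolated gaps.
t = np.arange(300)
truth = np.sin(2 * np.pi * t / 25)
x = truth.copy()
x[100:110] = np.nan
x[[37, 150, 200]] = np.nan
filled = ssa_gap_fill(x, M=40, n_modes=2)
```

In a full implementation, step (ii) would repeat this for several candidate window widths and mode counts, withholding some observed points to score each choice.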
Brief history
Broomhead and King (1986a, b) proposed to use SSA and M-SSA in the context of nonlinear dynamics for the purpose of reconstructing the attractor of a system from measured time series. These authors provided an extension and a more robust application of the Mañé (1981)-Takens (1981) idea of reconstructing dynamics from a single time series.
Ghil, Vautard and their colleagues (Vautard and Ghil, 1989; Ghil and Vautard, 1991; Vautard et al., 1992) noticed the analogy between the trajectory matrix of Broomhead and King, on the one hand, and Karhunen (1946)–Loève (1945) principal component analysis in the time domain, on the other. Thus, SSA can be used as a time-and-frequency domain method for time series analysis, independently of attractor reconstruction and including cases in which the latter may fail.
The papers dealing with the methodological aspects and the applications of SSA number in the hundreds. Introductions to and reviews of the literature are provided by Elsner and Tsonis (1996), Danilov and Zhigljavsky (1997), Golyandina et al. (2001), and Ghil et al. (2002). More recently, SSA has also been used for tool condition monitoring and bearing fault detection.
References
- Allen, M.R., and A.W. Robertson: "Distinguishing modulated oscillations from coloured noise in multivariate datasets", Clim. Dyn., 12, 775–784, 1996.
- Broomhead, D.S., and G.P. King: "Extracting qualitative dynamics from experimental data", Physica D, 20, 217–236, 1986a.
- Broomhead, D.S., and G.P. King: "On the qualitative analysis of experimental dynamical systems". Nonlinear Phenomena and Chaos, Sarkar S (Ed.), Adam Hilger, Bristol, pp. 113–144, 1986b.
- Danilov, D. and Zhigljavsky, A. (Eds.). (1997): Principal Components of Time Series: the Caterpillar method, University of St. Petersburg Press. (In Russian.)
- Elsner, J.B. and Tsonis, A.A.: Singular Spectrum Analysis: A New Tool in Time Series Analysis, Plenum Press, 1996.
- Ghil, M., and R. Vautard: "Interdecadal oscillations and the warming trend in global temperature time series", Nature, 350, 324–327, 1991.
- Ghil, M., M. R. Allen, M. D. Dettinger, K. Ide, D. Kondrashov, et al. (2002): "Advanced spectral methods for climatic time series", Rev. Geophys., 40(1), 3.1–3.41. doi:10.1029/2000RG000092
- Golyandina, N., Nekrutkin, V. and Zhigljavsky, A. (2001): Analysis of Time Series Structure: SSA and related techniques. Chapman and Hall/CRC. ISBN 1584881941
- Karhunen, K.: Zur Spektraltheorie stochastischer Prozesse, Ann. Acad. Sci. Fenn. Ser. A1, Math. Phys., 34, 1946.
- Kondrashov, D., and M. Ghil: Spatio-temporal filling of missing points in geophysical data sets, Nonlin. Processes Geophys., 13, 151–159, 2006.
- Loève, M.: Probability Theory, Vol. II, 4th ed., Springer-Verlag, 1978.
- Mañé, R.: "On the dimension of the compact invariant sets of certain nonlinear maps". Dynamical Systems and Turbulence, Eds. D. A. Rand and L. S. Young, Springer-Verlag, New York, pp. 230–242, 1981.
- Takens, F.: "Detecting strange attractors in turbulence". Dynamical Systems and Turbulence, D. A. Rand and L.-S. Young (Eds.), Springer-Verlag, New York, pp. 366–381, 1981.
- Vautard, R., and M. Ghil: "Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series", Physica D, 35, 395–424, 1989.



