Orthogonal array

In mathematics, an orthogonal array (more specifically, a fixed-level orthogonal array) is a "table" (array) whose entries come from a fixed finite set of symbols (for example, {1,2,...,v}), arranged in such a way that there is an integer t so that for every selection of t columns of the table, all ordered t-tuples of the symbols, formed by taking the entries in each row restricted to these columns, appear the same number of times. The number t is called the strength of the orthogonal array. Here are two examples:

1	1	1
2	2	1
1	2	2
2	1	2

0	0	0	0
0	0	1	1
0	1	0	1
0	1	1	0
1	0	0	1
1	0	1	0
1	1	0	0
1	1	1	1

The first example is an orthogonal array with symbol set {1,2} and strength 2. The four ordered pairs (2-tuples) formed by the rows restricted to the first and third columns, namely (1,1), (2,1), (1,2) and (2,2), are all the possible ordered pairs of the two-element set, and each appears exactly once. The second and third columns give (1,1), (2,1), (2,2) and (1,2); again, all possible ordered pairs appear once each. The same holds for the first and second columns. This is thus an orthogonal array of strength two.

In the second example,^[1] the rows restricted to the first three columns contain the 8 possible ordered triples consisting of 0's and 1's, each appearing once. The same holds for any other choice of three columns. Thus this is an orthogonal array of strength 3.
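The strength condition can be checked mechanically. The following is a minimal sketch in Python; the function name has_strength is chosen here for illustration and does not come from the cited sources. It counts, for every choice of t columns, how often each t-tuple occurs.

```python
from itertools import combinations, product
from collections import Counter

def has_strength(rows, symbols, t):
    """Return True if every choice of t columns shows each t-tuple equally often."""
    k = len(rows[0])
    expected = len(rows) / len(symbols) ** t
    for cols in combinations(range(k), t):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        # every possible t-tuple must occur, and all with the same frequency
        if any(counts[tup] != expected for tup in product(symbols, repeat=t)):
            return False
    return True

first = [(1, 1, 1), (2, 2, 1), (1, 2, 2), (2, 1, 2)]
second = [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0),
          (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0), (1, 1, 1, 1)]

print(has_strength(first, [1, 2], 2))   # True
print(has_strength(first, [1, 2], 3))   # False: only 4 of the 8 triples occur
print(has_strength(second, [0, 1], 3))  # True
print(has_strength(second, [0, 1], 4))  # False: only 8 of the 16 4-tuples occur
```

The check is exhaustive over column subsets, so it is only practical for small arrays.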

A mixed-level orthogonal array is one in which each column may have a different number of symbols. An example is given below.

Orthogonal arrays generalize, in a tabular form, the idea of mutually orthogonal Latin squares. These arrays have many connections to other combinatorial designs and have applications in the statistical design of experiments, coding theory, cryptography and various types of software testing.

Definition

For t ≤ k, an orthogonal array of type (N, k, v, t) – an OA(N, k, v, t) for short – is an N × k array whose entries are chosen from a set X with v points (a v-set) such that in every subset of t columns of the array, every t-tuple of points of X is repeated the same number of times. The number of repeats is usually denoted λ.

In many applications these parameters are given the following names:

N is the number of experimental runs,

k is the number of factors,

v is the number of levels,

t is the strength, and

λ is the index.

The definition of strength leads to the parameter relation

N = λv^t.
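As a small illustration of this relation, the sketch below recovers λ from N, v and t and rejects parameter sets for which v^t does not divide N. The function name index_of is an assumption made for this example.

```python
def index_of(N, v, t):
    """Return the index λ = N / v**t, or raise if v**t does not divide N."""
    lam, remainder = divmod(N, v ** t)
    if remainder:
        raise ValueError(f"no OA({N}, k, {v}, {t}) exists: {v}**{t} does not divide {N}")
    return lam

print(index_of(4, 2, 2))    # 1, the strength 2 example in the introduction
print(index_of(8, 2, 3))    # 1, the strength 3 example in the introduction
print(index_of(18, 3, 2))   # 2, as for an OA(18, 7, 3, 2)
```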

An orthogonal array is simple if it does not contain any repeated rows. (Subarrays of t columns may have repeated rows, as in the OA(18, 7, 3, 2) example pictured in this section.)

An orthogonal array is linear if X is a finite field F_q of order q (q a prime power) and the rows of the array form a subspace of the vector space (F_q)^k.^[2] The second example in the introduction is linear over the field F₂. Every linear orthogonal array is simple.
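Both properties can be tested directly for small arrays. The sketch below is restricted to prime q, where arithmetic in F_q is ordinary arithmetic modulo q; handling proper prime powers would need a finite-field implementation that is not attempted here. The function names are illustrative.

```python
def is_simple(rows):
    """An array is simple when no row is repeated."""
    return len(set(rows)) == len(rows)

def is_linear_over_prime_field(rows, p):
    """Check that the rows form a subspace of (F_p)^k for a prime p.

    Over a prime field, a nonempty finite set closed under addition is
    already a subspace, because scalar multiples are repeated sums.
    """
    row_set = set(rows)
    return bool(row_set) and all(
        tuple((a + b) % p for a, b in zip(r, s)) in row_set
        for r in row_set for s in row_set
    )

second = [(0, 0, 0, 0), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0),
          (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0), (1, 1, 1, 1)]

print(is_linear_over_prime_field(second, 2))  # True
print(is_simple(second))                      # True
```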

In a mixed-level orthogonal array, the symbols in the columns may be chosen from different sets having different numbers of points, as in the following example:^[3]

0	0	0	0	0
1	1	1	1	0
0	0	1	1	1
1	1	0	0	1
0	1	0	1	2
1	0	1	0	2
0	1	1	0	3
1	0	0	1	3

This array has strength 2:

Any pair of the first four columns contains each of the ordered pairs (0, 0), (0, 1), (1, 0) and (1, 1) two times.
Columns 4 and 5 – or column 5 with any one of the other columns – contains each ordered pair (i, j) once, where i = 0 or 1 and j = 0, 1, 2, or 3.

It may thus be denoted OA(8, 5, 2⁴4¹, 2), as is discussed below. The expression 2⁴4¹ indicates that four factors have 2 levels and one has 4 levels.

As in this example, there is no single "index" or repetition number λ in a mixed-level orthogonal array of strength t: each subarray of t columns can have a different λ.
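The differing repetition numbers can be seen by counting pairs in the example array. In this sketch the helper pair_repetitions is a name introduced for illustration; it reports, for each pair of columns (numbered from 1), how often every ordered pair occurs, or None if the pair of columns is not balanced.

```python
from itertools import combinations, product
from collections import Counter

mixed = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 0), (0, 0, 1, 1, 1), (1, 1, 0, 0, 1),
         (0, 1, 0, 1, 2), (1, 0, 1, 0, 2), (0, 1, 1, 0, 3), (1, 0, 0, 1, 3)]
levels = [2, 2, 2, 2, 4]   # number of symbols in each column

def pair_repetitions(rows, levels):
    """Map each column pair to its repetition number, or None if unbalanced."""
    result = {}
    for c1, c2 in combinations(range(len(levels)), 2):
        counts = Counter((row[c1], row[c2]) for row in rows)
        all_pairs = product(range(levels[c1]), range(levels[c2]))
        seen = {counts[pair] for pair in all_pairs}
        result[(c1 + 1, c2 + 1)] = seen.pop() if len(seen) == 1 else None
    return result

for cols, lam in pair_repetitions(mixed, levels).items():
    print(cols, lam)   # 2 for pairs among columns 1-4, 1 for pairs involving column 5
```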

Terminology and notation

The terms symmetric and asymmetric are sometimes used for fixed-level and mixed-level. Here symmetry refers to the property that all factors have the same number of levels, not to the "shape" of the array: a symmetric orthogonal array is almost never a symmetric matrix.

The notation OA(N, k, v, t) is sometimes contracted so that one may, for example, write simply OA(k, v),^[4] as long as the text makes clear the unstated parameter values. In the other direction, it may be expanded for mixed-level arrays. Here one would write OA(N, k, v₁···v_k, t), where column i has v_i levels. This notation is usually shortened when values v are repeated, so that one writes OA(8, 5, 2⁴4¹, 2) for the example at the end of the last section, rather than OA(8, 5, 2·2·2·2·4, 2). In similar fashion, one may shorten OA(N, k, v, t) to OA(N, v^k, t) for fixed-level arrays.

This OA notation does not explicitly include the index λ, but λ can be recovered from the other parameters via the relation N = λv^t. This is effective when the parameters all have specific numerical values, but less so when a class of orthogonal arrays is intended. For example, when indicating the class of arrays having strength t = 2 and index λ=1, the notation OA(N, k, v, 2) is insufficient to determine λ by itself. This is typically remedied by writing OA(v², k, v, 2) instead. While notations that explicitly include the parameter λ do not have this problem, they cannot easily be extended to denote mixed-level arrays.

Some authors define an OA(N, k, v, t) as being k × N rather than N × k. In such cases the strength of the array is defined in terms of a subset of t rows rather than columns.

Except for the prefix OA, the notation OA(N, k, v, t) is the same as that introduced by Rao.^[5] While this notation is very common, it is not universal. Hedayat, Sloane and Stufken^[6] recommend it as standard, but list eight alternatives found in the literature, and there are others.^[8]

Examples

An example of an OA(16, 5, 4, 2); a strength 2, 4-level design of index 1 with 16 runs:

1	1	1	1	1
1	2	2	2	2
1	3	3	3	3
1	4	4	4	4
2	1	4	2	3
2	2	3	1	4
2	3	2	4	1
2	4	1	3	2
3	1	2	3	4
3	2	1	4	3
3	3	4	1	2
3	4	3	2	1
4	1	3	4	2
4	2	4	3	1
4	3	1	2	4
4	4	2	1	3

An example of an OA(27, 5, 3, 2) (written as its transpose for ease of viewing):^[9]

0	0	0	0	0	0	0	0	1	1	1	1	1	1	1	1	1	2	2	2	2	2	2	2	2	2
0	0	1	1	1	2	2	2	0	0	0	1	1	1	2	2	2	0	0	0	1	1	1	2	2	2
1	2	0	1	2	0	1	2	0	1	2	0	1	2	0	1	2	0	1	2	0	1	2	0	1	2
0	0	1	1	1	2	2	2	2	2	2	0	0	0	1	1	1	1	1	1	2	2	2	0	0	0
1	2	1	2	0	2	0	1	0	1	2	1	2	0	2	0	1	0	1	2	1	2	0	2	0	1

This example has index λ = 3.

Trivial examples

An array consisting of all k-tuples of a v-set, arranged so that the k-tuples are rows, automatically ("trivially") has strength k, and so is an OA(v^k, k, v, k). Any OA(N, k, v, k) would be considered trivial since such arrays are easily constructed by simply listing all the k-tuples of the v-set λ times.
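A trivial array of this kind can be generated in a few lines. The sketch below uses the symbol set {0, ..., v − 1} and a function name, trivial_oa, chosen for illustration.

```python
from itertools import product

def trivial_oa(v, k, lam=1):
    """List every k-tuple over {0, ..., v-1} lam times: an OA(lam * v**k, k, v, k)."""
    return [row for row in product(range(v), repeat=k) for _ in range(lam)]

array = trivial_oa(v=3, k=2, lam=2)
print(len(array))   # 18
print(array[:4])    # [(0, 0), (0, 0), (0, 1), (0, 1)]
```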

Mutually orthogonal Latin squares

An OA(n², 3, n, 2) is equivalent to a Latin square of order n. For k ≤ n+1, an OA(n², k, n, 2) is equivalent to a set of k − 2 mutually orthogonal Latin squares of order n. Such index one, strength 2 orthogonal arrays are also known as Hyper-Graeco-Latin square designs in the statistical literature.

Let A be a strength 2, index 1 orthogonal array on an n-set of elements, identified with the set of natural numbers {1,...,n}. Choose and fix, in order, two columns of A, called the indexing columns. Because the strength is 2 and the index is 1, all ordered pairs (i, j) with 1 ≤ i, j ≤ n appear exactly once in the rows of the indexing columns. Here i and j will in turn index the rows and columns of an n × n square. Take any other column of A and fill the (i, j) cell of this square with the entry that is in this column of A and in the row of A whose indexing columns contain (i, j). The resulting square is a Latin square of order n. For example, consider this OA(9, 4, 3, 2):

1	1	1	1
1	2	2	2
1	3	3	3
2	1	2	3
2	2	3	1
2	3	1	2
3	1	3	2
3	2	1	3
3	3	2	1

By choosing columns 3 and 4 (in that order) as the indexing columns, the first column produces the Latin square

1	2	3
3	1	2
2	3	1

while the second column produces the Latin square

1	3	2
3	2	1
2	1	3

These two squares, moreover, are mutually orthogonal. In general, the Latin squares produced in this way from an orthogonal array will be orthogonal Latin squares, so the k − 2 columns other than the indexing columns will produce a set of k − 2 mutually orthogonal Latin squares.
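The construction just described can be sketched as follows for the OA(9, 4, 3, 2) above. The function names are illustrative, and columns are numbered from 0 in the code, so the indexing columns 3 and 4 become 2 and 3.

```python
def latin_square(oa, index_cols, fill_col, n):
    """Build the n x n square indexed by two columns of a strength 2, index 1 array."""
    r, c = index_cols
    square = [[None] * n for _ in range(n)]
    for row in oa:
        square[row[r] - 1][row[c] - 1] = row[fill_col]
    return square

def are_orthogonal(a, b):
    """Two squares are orthogonal when superimposing them yields n*n distinct pairs."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n

oa = [(1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
      (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
      (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1)]

# Columns 3 and 4 (0-based: 2 and 3) index the square; columns 1 and 2 fill it.
first = latin_square(oa, (2, 3), 0, 3)
second = latin_square(oa, (2, 3), 1, 3)
print(first)    # [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(second)   # [[1, 3, 2], [3, 2, 1], [2, 1, 3]]
print(are_orthogonal(first, second))  # True
```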

This construction is completely reversible and so strength 2, index 1 orthogonal arrays can be constructed from sets of mutually orthogonal Latin squares.^[10]
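The reverse direction can be sketched in the same way: each cell (i, j) contributes one row consisting of i, j and the entries of the squares at that cell. The function name oa_from_squares is an assumption made for this example, and the input squares are assumed to be mutually orthogonal Latin squares of the same order.

```python
def oa_from_squares(squares):
    """Turn mutually orthogonal Latin squares of order n into an
    OA(n*n, len(squares) + 2, n, 2) whose first two columns index the cells."""
    n = len(squares[0])
    return [
        (i + 1, j + 1) + tuple(sq[i][j] for sq in squares)
        for i in range(n) for j in range(n)
    ]

squares = [
    [[1, 2, 3], [3, 1, 2], [2, 3, 1]],
    [[1, 3, 2], [3, 2, 1], [2, 1, 3]],
]
for row in oa_from_squares(squares):
    print(row)
```

Applied to the two squares obtained earlier, this returns the rows of the OA(9, 4, 3, 2) with its columns reordered as 3, 4, 1, 2.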

Latin squares, Latin cubes and Latin hypercubes

Orthogonal arrays provide a uniform way to describe these diverse objects which are of interest in the statistical design of experiments.

Latin squares

As mentioned in the previous section, a Latin square of order n can be thought of as an OA(n², 3, n, 2). Actually, the orthogonal array can lead to six Latin squares since any ordered pair of distinct columns can be used as the indexing columns. However, these six squares are conjugates of one another, belong to the same main class, and are considered equivalent. For concreteness we shall always assume that the first two columns in their natural order are used as the indexing columns.

Latin cubes

In the statistics literature, a Latin cube is a three-dimensional n × n × n matrix consisting of n layers, each having n rows and n columns such that the n distinct elements which appear are repeated n² times and arranged so that in each layer parallel to each of the three pairs of opposite faces of the cube all the n distinct elements appear and each is repeated exactly n times in that layer.^[11]

Note that with this definition a layer of a Latin cube need not be a Latin square. In fact, no row, column or file (the cells of a particular position in the different layers) need be a permutation of the n symbols.^[12]

A Latin cube of order n is equivalent to an OA(n³, 4, n, 2).^[9]

Two Latin cubes of order n are orthogonal if, among the n³ pairs of elements chosen from corresponding cells of the two cubes, each distinct ordered pair of the elements occurs exactly n times. A set of k − 3 mutually orthogonal Latin cubes of order n is equivalent to an OA(n³, k, n, 2).^[9] An example of a pair of mutually orthogonal Latin cubes of order three was given as the OA(27, 5, 3, 2) in the Examples section above.

Unlike the case with Latin squares, in which there are no constraints, the indexing columns of the orthogonal array representation of a Latin cube must be selected so as to form an OA(n³, 3, n, 3).

Latin hypercubes

An m-dimensional Latin hypercube of order n of the rth class is an n × n × ... × n m-dimensional matrix having n^r distinct elements, each repeated n^{m − r} times, and such that each element occurs exactly n^{m − r − 1} times in each of its m sets of n parallel (m − 1)-dimensional linear subspaces (or "layers"). Two such Latin hypercubes of the same order n and class r with the property that, when one is superimposed on the other, every element of the one occurs exactly n^{m − 2r} times with every element of the other, are said to be orthogonal.^[13]

A set of k − m mutually orthogonal m-dimensional Latin hypercubes of order n is equivalent to an OA(n^m, k, n, 2), where the indexing columns form an OA(n^m, m, n, m).

History

The concepts of Latin squares and mutually orthogonal Latin squares were generalized to Latin cubes and hypercubes, and orthogonal Latin cubes and hypercubes by Kishen (1942).^[14] Rao (1946) generalized these results to arrays of strength t. The present notion of orthogonal array as a generalization of these ideas, due to C. R. Rao, appears in Rao (1947),^[15] with his generalization to mixed-level arrays appearing in 1973.^[16]

Rao initially used the term "array" with no modifier, and defined it to mean simply a subset of all treatment combinations – a simple array. The possibility of non-simple arrays arose naturally when making treatment combinations the rows of a matrix. Hedayat, Sloane and Stufken^[17] credit K. Bush^[18] with the term "orthogonal array".

Other constructions

Hadamard matrices

There exists an OA(4λ, 4λ − 1, 2, 2) if and only if there exists a Hadamard matrix of order 4λ.^[19] To proceed in one direction, let H be a Hadamard matrix of order 4λ in standardized form (first row and column entries are all +1). Delete the first row and take the transpose to obtain the desired orthogonal array.^[20] The following example illustrates this. (The reverse construction is similar.)

The order 8 standardized Hadamard matrix below (±1 entries indicated only by sign),

+	+	+	+	+	+	+	+
+	+	+	+	−	−	−	−
+	+	−	−	+	+	−	−
+	+	−	−	−	−	+	+
+	−	+	−	+	−	+	−
+	−	+	−	−	+	−	+
+	−	−	+	+	−	−	+
+	−	−	+	−	+	+	−

produces the OA(8, 7, 2, 2):^[21]

+	+	+	+	+	+	+
+	+	+	−	−	−	−
+	−	−	+	+	−	−
+	−	−	−	−	+	+
−	+	−	+	−	+	−
−	+	−	−	+	−	+
−	−	+	+	−	−	+
−	−	+	−	+	+	−

Using columns 1, 2 and 4 as indexing columns, the remaining columns produce four mutually orthogonal Latin cubes of order 2.
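The construction can be reproduced with a short script. This is a minimal sketch that hard-codes the order 8 matrix above, writing the minus sign as an ASCII hyphen; the helper name balanced is illustrative.

```python
from itertools import combinations
from collections import Counter

H = """
++++++++
++++----
++--++--
++----++
+-+-+-+-
+-+--+-+
+--++--+
+--+-++-
""".split()

# Delete the first row, then transpose: column j of H (minus its top entry)
# becomes row j of the orthogonal array.
oa = [tuple(row[j] for row in H[1:]) for j in range(len(H))]

def balanced(rows, cols):
    """True if every sign pattern occurs equally often in the chosen columns."""
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    return len(counts) == 2 ** len(cols) and len(set(counts.values())) == 1

print(all(balanced(oa, pair) for pair in combinations(range(7), 2)))  # True
print(balanced(oa, (0, 1, 3)))  # True: columns 1, 2 and 4 have strength 3
print(balanced(oa, (0, 1, 2)))  # False: column 3 is the product of columns 1 and 2
```

The last check shows why the indexing columns have to be chosen with care: not every triple of columns forms a strength 3 subarray.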

Codes

Let C ⊆ (F_q)^n be a linear code of dimension m with minimum distance d. Then C^⊥ (the orthogonal complement of the vector subspace C) is a (linear) OA(q^{n − m}, n, q, d − 1), where λ = q^{n − m − d + 1}.^[22]
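A small instance makes the statement concrete. The sketch below takes C to be the binary repetition code of length 4 (so q = 2, n = 4, m = 1, d = 4), an example chosen here rather than taken from the cited source, and computes C^⊥ by brute force. It is restricted to prime q, and the function names are illustrative. The theorem then predicts an OA(8, 4, 2, 3) with λ = 2^{4 − 1 − 4 + 1} = 1.

```python
from itertools import combinations, product
from collections import Counter

def span(generators, q):
    """All F_q-linear combinations of the generator rows (q prime)."""
    n = len(generators[0])
    return {
        tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % q for i in range(n))
        for coeffs in product(range(q), repeat=len(generators))
    }

def dual(code, n, q):
    """Vectors of (F_q)^n orthogonal to every codeword, found by brute force."""
    return sorted(
        x for x in product(range(q), repeat=n)
        if all(sum(a * b for a, b in zip(x, c)) % q == 0 for c in code)
    )

q, n = 2, 4
C = span([(1, 1, 1, 1)], q)                             # repetition code, m = 1
d = min(sum(1 for s in c if s) for c in C if any(c))    # minimum distance
rows = dual(C, n, q)                                    # the 8 even-weight vectors

for cols in combinations(range(n), d - 1):
    counts = Counter(tuple(r[c] for c in cols) for r in rows)
    print(cols, sorted(counts.values()))   # each triple of columns: eight counts of 1
```

The rows obtained are exactly those of the strength 3 example in the introduction.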

Applications

Threshold schemes

Secret sharing (also called secret splitting) consists of methods for distributing a secret amongst a group of participants, each of whom is allocated a share of the secret. The secret can be reconstructed only when a sufficient number of shares, of possibly different types, are combined; individual shares are of no use on their own. A secret sharing scheme is perfect if every collection of participants that does not meet the criteria for obtaining the secret has no more knowledge of the secret than an individual with no share.

In one type of secret sharing scheme there is one dealer and n players. The dealer gives shares of a secret to the players, but only when specific conditions are fulfilled will the players be able to reconstruct the secret. The dealer accomplishes this by giving each player a share in such a way that any group of t (for threshold) or more players can together reconstruct the secret but no group of fewer than t players can. Such a system is called a (t, n)-threshold scheme.

An OA(v^t, n+1, v, t) may be used to construct a perfect (t, n)-threshold scheme.^[23]

Let A be the orthogonal array. The first n columns will be used to provide shares to the players, while the last column represents the secret to be shared. If the dealer wishes to share a secret S, only the rows of A whose last entry is S are used in the scheme. The dealer randomly selects one of these rows and gives player i the entry in column i of this row as that player's share.
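The scheme can be illustrated with the OA(9, 4, 3, 2) from the section on mutually orthogonal Latin squares, which gives a (2, 3)-threshold scheme with secrets in {1, 2, 3}. The following is a minimal sketch; the function names deal, candidates and reconstruct are assumptions made for this example, and players are numbered 0, 1, 2.

```python
from secrets import choice

# OA(9, 4, 3, 2): v = 3, t = 2, so n = 3 players; the last column holds the secret.
OA = [(1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
      (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
      (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1)]

def deal(secret):
    """Pick a random row ending in the secret; player i gets the entry in column i."""
    row = choice([r for r in OA if r[-1] == secret])
    return {player: row[player] for player in range(3)}

def candidates(shares):
    """Secrets still consistent with the given shares, passed as {player: share}."""
    return {r[-1] for r in OA if all(r[p] == s for p, s in shares.items())}

def reconstruct(shares):
    """Recover the secret from at least t = 2 shares."""
    possible = candidates(shares)
    if len(possible) != 1:
        raise ValueError("shares do not determine a unique secret")
    return possible.pop()

shares = deal(secret=2)
print(reconstruct({0: shares[0], 2: shares[2]}))   # 2
print(sorted(candidates({1: shares[1]})))          # [1, 2, 3]: one share reveals nothing
```

Because the array has strength 2 and index 1, any two shares pin down a single row, while a single share is consistent with every possible secret.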

Factorial designs

A factorial experiment is a statistically structured experiment in which several factors (watering levels, antibiotics, fertilizers, etc.) are applied to each experimental unit at finitely many levels, which may be quantitative or qualitative.^[24] In a full factorial experiment all combinations of levels of the factors need to be tested. In a fractional factorial design only a subset of the treatment combinations is used.

An orthogonal array can be used to design a fractional factorial experiment. The columns represent the various factors and the entries are the levels at which the factors are observed. An experimental run is a row of the orthogonal array, that is, a specific combination of factor levels. The strength of the array determines the resolution of the fractional design. When using one of these designs, the treatment units and trial order should be randomized as much as the design allows. For example, one recommendation is that an appropriately sized orthogonal array be randomly selected from those available, and that the run order then be randomized.

Mixed-level designs occur naturally in the statistical setting.

Quality control

Orthogonal arrays played a central role in the development of Taguchi methods by Genichi Taguchi, which took place during his visit to the Indian Statistical Institute in the early 1950s. His methods were successfully applied and adopted by Japanese and Indian industries and were subsequently also embraced by US industry, albeit with some reservations. Taguchi's catalog^[25] contains both fixed- and mixed-level arrays.

Testing

Orthogonal array testing is a black box testing technique which is a systematic, statistical way of software testing.^[26]^[27] It is used when the number of inputs to the system is relatively small, but too large to allow for exhaustive testing of every possible input to the system.^[26] It is particularly effective in finding errors associated with faulty logic within computer software systems.^[26] Orthogonal arrays can be applied in user interface testing, system testing, regression testing and performance testing. The combinations of factor levels comprising a single treatment are chosen so that their responses are uncorrelated, and hence each treatment contributes a distinct piece of information. The net effect of organizing the experiment in such treatments is that the same information is gathered in the minimum number of experiments.
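As an illustration of how an array translates into test cases, the sketch below maps the rows of the OA(4, 3, 2, 2) from the introduction (with levels rewritten as 0 and 1) onto three two-valued test parameters. The factor names and values are hypothetical and chosen only for this example.

```python
from itertools import combinations, product

# OA(4, 3, 2, 2) from the introduction, with symbols 1, 2 rewritten as 0, 1.
OA = [(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "browser": ["Firefox", "Chrome"],
    "os": ["Linux", "Windows"],
    "locale": ["en", "de"],
}

names = list(factors)
test_cases = [
    {name: factors[name][level] for name, level in zip(names, row)}
    for row in OA
]
for case in test_cases:
    print(case)

# Every pair of factors sees all four level combinations across the 4 cases,
# whereas exhaustive testing would need 2**3 = 8 cases.
for a, b in combinations(names, 2):
    covered = {(case[a], case[b]) for case in test_cases}
    assert covered == set(product(factors[a], factors[b]))
```

Strength 2 guarantees coverage of all pairwise combinations only; faults that need three specific settings at once may still be missed.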

Notes

[1] Hedayat, Sloane & Stufken 1999, Table 1.3
[2] Stinson 2003, p. 225
[3] Hedayat, Sloane & Stufken 1999, Table 9.10(b)
[4] Stinson 2003, p. 140
[5] Rao 1947, p. 129
[6] Hedayat, Sloane & Stufken 1999, p. 2
[7] Stinson 2003, p. 225
[8] See, for example, ^[7].
[9] Dénes & Keedwell 1974, p. 191
[10] Stinson 2003, pp. 140–141, Section 6.5.1
[11] Dénes & Keedwell 1974, p. 187 credit the definition to Kishen (1950, p. 21)
[12] In the combinatorialist's preferred definition, each row, column and file would contain a permutation of the symbols, but this is only a special type of Latin cube called a permutation cube.
[13] Dénes & Keedwell 1974, p. 189
[14] Raghavarao 1988, p. 9
[15] Raghavarao 1988, p. 10
[16] Rao 1973, p. 354
[17] Hedayat, Sloane & Stufken 1999, p. 4
[18] Bush 1950
[19] Hedayat, Sloane & Stufken 1999, Theorem 7.5
[20] Stinson 2003, p. 225, Theorem 10.2
[21] Stinson 2003, p. 226, Example 10.3
[22] Stinson 2003, p. 231, Theorem 10.17
[23] Stinson 2003, p. 262, Theorem 11.5
[24] Street & Street 1987, p. 194, Section 9.2
[25] Taguchi 1986
[26] Pressman, Roger S (2005). Software Engineering: A Practitioner's Approach (6th ed.). McGraw–Hill. ISBN 0-07-285318-2.
[27] Phadke, Madhav S. "Planning Efficient Software Tests". Phadke Associates, Inc. Numerous articles on utilizing Orthogonal Arrays for Software and System Testing.

References

Box, G. E. P.; Hunter, W. G.; Hunter, J. S. (1978). Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building. John Wiley and Sons. ISBN 9780471093152.
Bush, K. A. (1950). Orthogonal arrays (PhD). University of North Carolina.
Dénes, J.; Keedwell, A. D. (1974), Latin Squares and Their Applications, New York-London: Academic Press, ISBN 0-12-209350-X, MR 0351850
Hedayat, A.S.; Sloane, N.J.A.; Stufken, J. (1999), Orthogonal Arrays, Theory and Applications, New York: Springer
Kishen, K. (1942), "On latin and hyper-graeco cubes and hypercubes", Current Science, 11: 98–99
Kishen, K. (1950), "On the construction of latin and hyper-graeco-latin cubes and hypercubes", J. Indian Soc. Agric. Statistics, 2: 20–48
Raghavarao, Damaraju (1988). Constructions and Combinatorial Problems in Design of Experiments (corrected reprint of the 1971 Wiley ed.). New York: Dover.
Raghavarao, Damaraju; Padgett, L. V. (2005). Block Designs: Analysis, Combinatorics and Applications. World Scientific.
Rao, C.R. (1946), "Hypercubes of strength d leading to confounded designs in factorial experiments", Bulletin of the Calcutta Mathematical Society, 38: 67–78
Rao, C.R. (1947), "Factorial experiments derivable from combinatorial arrangements of arrays", Supplement to the Journal of the Royal Statistical Society, 9 (1): 128–139, doi:10.2307/2983576, JSTOR 2983576
Rao, C. R. (1973). "Some combinatorial problems of arrays and applications to design of experiments". In Srivastava, Jagdish N. (ed.). A Survey of Combinatorial Theory. North Holland. ISBN 0-7204-2262-0.
Stinson, Douglas R. (2003), Combinatorial Designs: Constructions and Analysis, New York: Springer, ISBN 0-387-95487-2
Street, Anne Penfold & Street, Deborah J. (1987). Combinatorics of Experimental Design. Oxford U. P. [Clarendon]. ISBN 0-19-853256-3.
Taguchi, Genichi (1986). Orthogonal Arrays and Linear Graphs. Dearborn, MI: American Supplier Institute.

External links

This article incorporates public domain material from the National Institute of Standards and Technology
