Combinatorial number system

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In mathematics, and in particular in combinatorics, the combinatorial number system of degree k (for some positive integer k), also referred to as combinadics, is a correspondence between natural numbers (taken to include 0) N and k-combinations, represented as strictly decreasing sequences ck > ... > c2 > c1 ≥ 0. Since the latter are strings of numbers, one can view this as a kind of numeral system for representing N, although the main utility is representing a k-combination by N rather than the other way around. Distinct numbers correspond to distinct k-combinations, and produce them in lexicographic order; moreover the numbers less than \tbinom nk correspond to all k-combinations of { 0, 1, ..., n − 1}. The correspondence does not depend on the size n of the set that the k-combinations are taken from, so it can be interpreted as a map from N to the k-combinations taken from N; in this view the correspondence is a bijection.

The number N corresponding to (ck,...,c2,c1) is given by

N=\binom{c_k}k+\cdots+\binom{c_2}2+\binom{c_1}1.

The fact that a unique sequence so corresponds to any number N was observed by D.H. Lehmer.[1] Indeed a greedy algorithm finds the k-combination corresponding to N: take ck maximal with \tbinom{c_k}k\leq N, then take ck − 1 maximal with \tbinom{c_{k-1}}{k-1}\leq N - \tbinom{c_k}k, and so forth. Finding the number N, using the formula above, from the k-combination (ck,...,c2,c1) is also known as "ranking", and the opposite operation (given by the greedy algorithm) as "unranking"; these operations are known by those names in most Computer algebra systems, and in computational mathematics.[2][3]

The originally used term "combinatorial representation of integers" is shortened to "combinatorial number system" by Knuth,[4] who also gives a much older reference;[5] the term "combinadic" is introduced by James McCaffrey [6] (without reference to previous terminology or work).

Unlike the factorial number system, the combinatorial number system of degree k is not a mixed radix system: the part \tbinom{c_i}i of the number N represented by a "digit" ci is not obtained from it by simply multiplying by a place value.

The main application of the combinatorial number system is that it allows rapid computation of the k-combination that is at a given position in the lexicographic ordering, without having to explicitly list the k-combinations preceding it; this allows for instance random generation of k-combinations of a given set. Enumeration of k-combinations has many applications, among which software testing, sampling, quality control, and the analysis of lottery games.

Ordering combinations[edit]

A k-combination of a set S is a subset of S with k (distinct) elements. The main purpose of the combinatorial number system is to provide a representation, each by a single number, of all \tbinom nk possible k-combinations of a set S of n elements. Choosing, for any n, {0, 1, ..., n − 1} as such a set, it can be arranged that the representation of a given k-combination C is independent of the value of n (although n must of course be sufficiently large); in other words considering C as a subset of a larger set by increasing n will not change the number that represents C. Thus for the combinatorial number system one just considers C as a k-combination of the set N of all natural numbers, without explicitly mentioning n.

In order to ensure that the numbers representing the k-combinations of {0, 1, ..., n − 1} are less than those representing k-combinations not contained in {0, 1, ..., n − 1}, the k-combinations must be ordered in such a way that their largest elements are compared first. The most natural ordering that has this property is lexicographic ordering of the decreasing sequence of their elements. So comparing the 5-combinations C = {0,3,4,6,9} and C' = {0,1,3,7,9}, one has that C comes before C', since they have the same largest part 9, but the next largest part 6 of C is less than the next largest part 7 of C'; the sequences compared lexicographically are (9,6,4,3,0) and (9,7,3,1,0). Another way to describe this ordering is view combinations as describing the k raised bits in the binary representation of a number, so that C = {c1,...,ck} describes the number

2^{c_1}+2^{c_2}+\cdots+2^{c_k}

(this associates distinct numbers to all finite sets of natural numbers); then comparison of k-combinations can be done by comparing the associated binary numbers. In the example C and C' correspond to numbers 10010110012 = 60110 and 10100010112 = 65110, which again shows that C comes before C'. This number is not however the one one wants to represent the k-combination with, since many binary numbers have a number of raised bits different from k; one wants to find the relative position of C in the ordered list of (only) k-combinations.

Place of a combination in the ordering[edit]

The number associated in the combinatorial number system of degree k to a k-combination C is the number of k-combinations strictly less than C in the given ordering. This number can be computed from C = { ck, ..., c2, c1 } with ck > ... > c2 > c1 as follows. From the definition of the ordering it follows that for each k-combination S strictly less than C, there is a unique index i such that ci is absent from S, while ck, ..., ci+1 are present in S, and no other value larger than ci is. One can therefore group those k-combinations S according to the possible values 1, 2, ..., k of i, and count each group separately. For a given value of i one must include ck, ..., ci+1 in S, and the remaining i elements of S must be chosen from the ci non-negative integers strictly less than ci; moreover any such choice will result in a k-combinations S strictly less than C. The number of possible choices is \tbinom{c_i}i, which is therefore the number of combinations in group i; the total number of k-combinations strictly less than C then is

\binom{c_1}1+\binom{c_2}2+\cdots+\binom{c_k}k,

and this is the index (starting from 0) of C in the ordered list of k-combinations. Obviously there is for every N ∈ N exactly one k-combination at index N in the list (supposing k ≥ 1, since the list is then infinite), so the above argument proves that every N can be written in exactly one way as a sum of k binomial coefficients of the given form.

Finding the k-combination for a given number[edit]

The given formula allows finding the place in the lexicographic ordering of a given k-combination immediately. The reverse process of finding the k-combination at a given place N requires somewhat more work, but is straightforward nonetheless. By the definition of the lexicographic ordering, two k-combinations that differ in their largest element ck will be ordered according to the comparison of those largest elements, from which it follows that all combinations with a fixed value of their largest element are contiguous in the list. Moreover the smallest combination with ck as largest element is \tbinom{c_k}k, and it has ci = i − 1 for all i < k (for this combination all terms in the expression except \tbinom{c_k}k are zero). Therefore ck is the largest number such that \tbinom{c_k}k\leq N. If k > 1 the remaining elements of the k-combination form the k − 1-combination corresponding to the number N-\tbinom{c_k}k in the combinatorial number system of degree k − 1, and can therefore be found by continuing in the same way for N-\tbinom{c_k}k and k − 1 instead of N and k.

Example[edit]

Suppose one wants to determine the 5-combination at position 72. The successive values of \tbinom n5 for n = 4, 5, 6, ... are 0, 1, 6, 21, 56, 126, 252, ..., of which the largest one not exceeding 72 is 56, for n = 8. Therefore c5 = 8, and the remaining elements form the 4-combination at position 72 − 56 = 16. The successive values of \tbinom n4 for n = 3, 4, 5, ... are 0, 1, 5, 15, 35, ..., of which the largest one not exceeding 16 is 15, for n = 6, so c4 = 6. Continuing similarly to search for a 3-combination at position 16 − 15 = 1 one finds c3 = 3, which uses up the final unit; this establishes 72=\tbinom85+\tbinom64+\tbinom33, and the remaining values ci will be the maximal ones with \tbinom{c_i}i=0, namely ci = i − 1. Thus we have found the 5-combination {8, 6, 3, 1, 0}.

Applications[edit]

One could use the combinatorial number system to list or traverse all k-combinations of a given finite set, but this is a very inefficient way to do that. Indeed, given some k-combination it is much easier to find the next combination in lexicographic ordering directly than to convert a number to a k-combination by the method indicated above. To find the next combination, find the smallest i ≥ 2 for which ci ≥ ci−1+2 (if there is no such i, take i = k+1); then increase ci−1 by one and set all cj with j < i − 1 to their minimal value j − 1. If the k-combination is represented by its characteristic vector, that is as a binary value with k bits 1, then the next such value can be computed without any loop using bitwise arithmetic: the following function will advance x to that value or return false:

// find next k-combination
bool next_combination(unsigned long& x) // assume x has form x'01^a10^b in binary
{
  unsigned long u = x & -x; // extract rightmost bit 1; u =  0'00^a10^b
  unsigned long v = u + x; // set last non-trailing bit 0, and clear to the right; v=x'10^a00^b
  if (v==0) // then overflow in v, or x==0
    return false; // signal that next k-combination cannot be represented
  x = v +(((v^x)/u)>>2); // v^x = 0'11^a10^b, (v^x)/u = 0'0^b1^{a+2}, and x ← x'100^b1^a
  return true; // successful completion
}

This is called Gosper's hack;[7] corresponding assembly code was described as item 175 in HAKMEM.

On the other hand the possibility to directly generate the k-combination at index N has useful applications. Notably, it allows generating a random k-combination of an n-element set using a random integer N with 0\leq N<\tbinom nk, simply by converting that number to the corresponding k-combination. If a computer program needs to maintain a table with information about every k-combination of a given finite set, the computation of the index N associated to a combination will allow the table to be accessed without searching.

See also[edit]

References[edit]

  1. ^ Applied Combinatorial Mathematics, Ed. E. F. Beckenbach (1964), pp.27−30.
  2. ^ http://www.site.uottawa.ca/~lucia/courses/5165-09/GenCombObj.pdf
  3. ^ http://www.sagemath.org/doc/reference/sage/combinat/subset.html
  4. ^ Knuth, D. E. (2005), "Generating All Combinations and Partitions", The Art of Computer Programming, 4, Fascicle 3, Addison-Wesley, pp. 5−6, ISBN 0-201-85394-9 .
  5. ^ Pascal, Ernesto (1887), Giornale di Matematiche 25, pp. 45−49 
  6. ^ McCaffrey, James (2004), Generating the mth Lexicographical Element of a Mathematical Combination, Microsoft Developer Network 
  7. ^ Knuth, D. E. (2009), "Bitwise tricks and techniques", The Art of Computer Programming, 4, Fascicle 1, Addison-Wesley, p. 54, ISBN 0-321-58050-8