# Minkowski's question-mark function

In mathematics, the Minkowski question-mark function, denoted by ?(x), is a function possessing various unusual fractal properties, defined by Hermann Minkowski in 1904. It maps quadratic irrational numbers to rational numbers on the unit interval, via an expression relating the continued fraction expansions of the quadratics to the binary expansions of the rationals, given by Arnaud Denjoy in 1938. In addition, it maps rational numbers to dyadic rationals, as can be seen by a recursive definition closely related to the Stern–Brocot tree.

## Definition

If $[a_{0};a_{1},a_{2},\ldots ]$ is the continued-fraction representation of an irrational number x, then

$\operatorname {?} (x)=a_{0}+2\sum _{n=1}^{\infty }{\frac {\left(-1\right)^{n+1}}{2^{a_{1}+\cdots +a_{n}}}},$

whereas if $[a_{0};a_{1},a_{2},\ldots ,a_{m}]$ is a continued-fraction representation of a rational number x, then

$\operatorname {?} (x)=a_{0}+2\sum _{n=1}^{m}{\frac {\left(-1\right)^{n+1}}{2^{a_{1}+\cdots +a_{n}}}}.$

## Intuitive explanation
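The finite sum in the definition above can be evaluated directly. The following is a minimal sketch, not a reference implementation: the function name `qmark_from_cf` and its argument layout (the integer part `a0`, then an array holding the partial quotients a1, ..., am) are assumptions made for illustration.

```c
#include <stddef.h>

/* Evaluate ?(x) for x = [a0; a1, ..., am] from the finite sum
 *   a0 + 2 * sum_{n=1..m} (-1)^(n+1) / 2^(a1 + ... + an).
 * `a` holds a1..am, each assumed to be >= 1 (illustrative interface). */
double qmark_from_cf(long a0, const unsigned *a, size_t m)
{
    double sum = 0.0;
    double term = 1.0;   /* 2^-(a1 + ... + an), updated per partial quotient */
    double sign = 1.0;   /* (-1)^(n+1), starting at n = 1 */

    for (size_t n = 0; n < m; ++n) {
        for (unsigned j = 0; j < a[n]; ++j)
            term *= 0.5;             /* halve once per unit of a_n */
        sum += sign * term;
        sign = -sign;
    }
    return (double)a0 + 2.0 * sum;
}
```

For example, 3/5 = [0; 1, 1, 2] gives 2(1/2 − 1/4 + 1/16) = 5/8. Truncating a periodic expansion such as [0; 2, 1, 2, 1, …] after enough terms approximates the value of ? at the corresponding quadratic irrational.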

To get some intuition for the definition above, consider the different ways of interpreting an infinite string of bits beginning with 0 as a real number in [0, 1]. One obvious way to interpret such a string is to place a binary point after the first 0 and read the string as a binary expansion: thus, for instance, the string 001001001001001001001001... represents the binary number 0.010010010010..., or 2/7. Another interpretation views a string as the continued fraction $[0;a_{1},a_{2},\ldots ]$, where the integers $a_{i}$ are the run lengths in a run-length encoding of the string. The same example string 001001001001001001001001... then corresponds to $[0;2,1,2,1,2,1,\ldots ]=({\sqrt {3}}-1)/2$. If the string ends in an infinitely long run of the same bit, we ignore it and terminate the representation; this is suggested by the formal "identity":

$[0;a_{1},\dots ,a_{n},\infty ]=[0;a_{1},\dots ,a_{n}+{\frac {1}{\infty }}]=[0;a_{1},\dots ,a_{n}].$

The effect of the question-mark function on [0, 1] can then be understood as mapping the second interpretation of a string to the first interpretation of the same string, just as the Cantor function can be understood as mapping a ternary (base-3) representation to a base-2 representation. Our example string gives the equality

$\operatorname {?} \left({\frac {{\sqrt {3}}-1}{2}}\right)={\frac {2}{7}}.$

## Recursive definition for rational arguments

For rational numbers in the unit interval, the function may also be defined recursively; if p/q and r/s are reduced fractions such that |ps − rq| = 1 (so that they are adjacent elements of a row of the Farey sequence) then

$\operatorname {?} \left({\frac {p+r}{q+s}}\right)={\frac {1}{2}}\left[\operatorname {?} \left({\frac {p}{q}}\right)+\operatorname {?} \left({\frac {r}{s}}\right)\right].$

Using the base cases

$\operatorname {?} \left({\frac {0}{1}}\right)=0\quad {\text{ and }}\quad \operatorname {?} \left({\frac {1}{1}}\right)=1,$

it is then possible to compute ?(x) for any rational x, starting with the Farey sequence of order 2, then 3, etc.
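The mediant recursion can be carried out in exact integer arithmetic. The sketch below is one possible arrangement, under the assumption that the argument a/b lies in [0, 1]; the type `dyadic` and the function `qmark_rational` are hypothetical names introduced here. It relies on the observation that ?(p/q) and ?(r/s) are always adjacent multiples of the same power of two, so only the left value needs to be stored.

```c
/* Hypothetical result type: the dyadic rational num / 2^exp. */
typedef struct {
    unsigned long long num;
    unsigned exp;
} dyadic;

/* ?(a/b) for 0 <= a <= b, b > 0, by repeated mediants.
 * Invariant: ?(p/q) = num / 2^exp and ?(r/s) = (num + 1) / 2^exp.
 * Overflow is not handled; deep fractions such as 99/100 exceed 64 bits. */
dyadic qmark_rational(unsigned long long a, unsigned long long b)
{
    unsigned long long p = 0, q = 1;   /* left neighbour  p/q = 0/1 */
    unsigned long long r = 1, s = 1;   /* right neighbour r/s = 1/1 */
    dyadic y = {0, 0};                 /* ?(0/1) = 0 */

    if (a == 0)
        return y;                      /* base case ?(0/1) = 0 */
    if (a == b) {
        y.num = 1;                     /* base case ?(1/1) = 1 */
        return y;
    }
    for (;;) {
        unsigned long long m = p + r, n = q + s;   /* mediant m/n */
        y.num *= 2;                    /* rescale left value to 2^-(exp+1) */
        y.exp += 1;
        if (a * n == m * b) {          /* a/b equals the mediant */
            y.num += 1;                /* midpoint of the two neighbours */
            return y;
        }
        if (a * n > m * b) {           /* a/b lies right of the mediant */
            p = m; q = n;
            y.num += 1;
        } else {                       /* a/b lies left of the mediant */
            r = m; s = n;
        }
    }
}
```

With this sketch, 1/3 yields 1/4 and 3/5 yields 5/8, matching the values obtained from the continued-fraction formula.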

If $p_{n-1}/q_{n-1}$ and $p_{n}/q_{n}$ are two successive convergents of a continued fraction, then the matrix

${\begin{pmatrix}p_{n-1}&p_{n}\\q_{n-1}&q_{n}\end{pmatrix}}$

has determinant ±1. Such a matrix is an element of GL(2, Z), the group of 2 × 2 integer matrices with determinant ±1. This group is related to the modular group.

## Self-symmetry

The question mark is clearly visually self-similar. A monoid of self-similarities may be generated by two operators S and R acting on the unit square and defined as follows:

${\begin{aligned}S(x,y)&=\left({\frac {x}{x+1}},{\frac {y}{2}}\right),\\[5px]R(x,y)&=(1-x,1-y).\end{aligned}}$

Visually, S shrinks the unit square to its bottom-left quarter, while R performs a point reflection through its center.

A point on the graph of ? has coordinates (x, ?(x)) for some x in the unit interval. Such a point is transformed by S and R into another point of the graph, because ? satisfies the following identities for all x ∈ [0, 1]:

${\begin{aligned}\operatorname {?} \left({\frac {x}{x+1}}\right)&={\frac {\operatorname {?} (x)}{2}},\\[5px]\operatorname {?} (1-x)&=1-\operatorname {?} (x).\end{aligned}}$

These two operators may be repeatedly combined, forming a monoid. A general element of the monoid is then

$S^{a_{1}}RS^{a_{2}}RS^{a_{3}}\cdots$

for positive integers $a_{1},a_{2},a_{3},\ldots$. Each such element describes a self-similarity of the question-mark function. This monoid is sometimes called the period-doubling monoid, and all period-doubling fractal curves have a self-symmetry described by it (the de Rham curve, of which the question mark is a special case, is a category of such curves). The elements of the monoid are in correspondence with the rationals, by means of the identification of $a_{1},a_{2},a_{3},\ldots$ with the continued fraction $[0;a_{1},a_{2},a_{3},\ldots ]$. Writing S and T for the actions of S and R on the x-coordinate alone, both

$S:x\mapsto {\frac {x}{x+1}}$

and

$T:x\mapsto 1-x$

are linear fractional transformations with integer coefficients, so the monoid may be regarded as a subset of the modular group PSL(2, Z).

The question mark function provides a one-to-one mapping from the quadratic irrationals to the non-dyadic rationals, thus allowing an explicit proof of countability of the former. These can, in fact, be understood to correspond to the periodic orbits for the dyadic transformation. This can be explicitly demonstrated in just a few steps.

Define two moves: a left move and a right move, valid on the unit interval $0\leq x\leq 1$ as

$L_{D}(x)={\frac {x}{2}}$ and $L_{C}(x)={\frac {x}{1+x}}$

and

$R_{D}(x)={\frac {1+x}{2}}$ and $R_{C}(x)={\frac {1}{2-x}}.$

The question mark function then obeys a left-move symmetry

$L_{D}\circ ?=?\circ L_{C}$

and a right-move symmetry

$R_{D}\circ ?=?\circ R_{C}$

where $\circ$ denotes function composition. These can be arbitrarily concatenated. Consider, for example, the sequence of left-right moves $LRLLR.$ Adding the subscripts C and D, and, for clarity, dropping the composition operator $\circ$ in all but a few places, one has:

$L_{D}R_{D}L_{D}L_{D}R_{D}\circ ?=?\circ L_{C}R_{C}L_{C}L_{C}R_{C}$

Arbitrary finite-length strings in the letters L and R correspond to the dyadic rationals, in that every dyadic rational can be written both as $y=n/2^{m}$ for integers n and m and as a finite-length string of bits $y=0.b_{1}b_{2}b_{3}\cdots b_{m}$ with $b_{k}\in \{0,1\}.$ Thus, every dyadic rational is in one-to-one correspondence with some self-symmetry of the question mark function.
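The move symmetry can be checked on a concrete word with exact fractions. The sketch below is illustrative only: the type `frac` and the functions `reduced` and `apply_moves` are hypothetical helpers, and the argument is assumed to lie in [0, 1]. Letters are applied right to left, as in function composition.

```c
#include <stddef.h>

/* Hypothetical fraction type num/den, kept in lowest terms. */
typedef struct { long num, den; } frac;

/* Euclid's algorithm for non-negative arguments. */
long gcd_long(long a, long b)
{
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

/* Build num/den in lowest terms (den > 0 assumed). */
frac reduced(long num, long den)
{
    long g = gcd_long(num, den);
    frac f = { num / g, den / g };
    return f;
}

/* Apply a word such as "LRLLR" to x, rightmost letter first.
 * kind == 'D' uses the dyadic moves L_D, R_D;
 * any other kind uses the continued-fraction moves L_C, R_C. */
frac apply_moves(const char *word, size_t len, char kind, frac x)
{
    for (size_t i = len; i-- > 0; ) {
        long n = x.num, d = x.den;
        if (kind == 'D')
            x = (word[i] == 'L') ? reduced(n, 2 * d)        /* L_D: x/2     */
                                 : reduced(n + d, 2 * d);   /* R_D: (1+x)/2 */
        else
            x = (word[i] == 'L') ? reduced(n, n + d)        /* L_C: x/(1+x) */
                                 : reduced(d, 2 * d - n);   /* R_C: 1/(2-x) */
    }
    return x;
}
```

Starting from x = 1, where ?(1) = 1, the word LRLLR gives 3/8 on the C side and 5/16 on the D side. This is consistent with the symmetry, since 3/8 = [0; 2, 1, 2] and the defining sum gives ?(3/8) = 2(1/4 − 1/8 + 1/32) = 5/16.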

Some notational rearrangements can make the above slightly easier to express. Let $g_{0}$ and $g_{1}$ stand for L and R. Function composition extends this to a monoid, in that one can write $g_{010}=g_{0}g_{1}g_{0}$ and generally, $g_{A}g_{B}=g_{AB}$ for some binary strings of digits A, B, where AB is just the ordinary concatenation of such strings. The dyadic monoid M is then the monoid of all such finite-length left-right moves. Writing $\gamma \in M$ as a general element of the monoid, there is a corresponding self-symmetry of the question mark function:

$\gamma _{D}\circ ?=?\circ \gamma _{C}$

### Isomorphism

An explicit mapping between the rationals and the dyadic rationals can be obtained by introducing a reflection operator

$r(x)=1-x$

and noting that both

$r\circ R_{D}\circ r=L_{D}$ and $r\circ R_{C}\circ r=L_{C}.$

Since $r^{2}=1$ is the identity, an arbitrary string of left-right moves can be re-written as a string of left moves only, followed by a reflection, followed by more left moves, a reflection, and so on, that is, as $L^{a_{1}}rL^{a_{2}}rL^{a_{3}}\cdots$ which is clearly isomorphic to $S^{a_{1}}TS^{a_{2}}TS^{a_{3}}\cdots$ from above.

Evaluating some explicit sequence of $L_{D},R_{D}$ at the function argument $x=1$ gives a dyadic rational; explicitly, it is equal to $y=0.b_{1}b_{2}b_{3}\cdots b_{m}$ where each $b_{k}\in \{0,1\}$ is a binary bit, zero corresponding to a left move and one corresponding to a right move. The equivalent sequence of $L_{C},R_{C}$ moves, evaluated at $x=1$, gives a rational number $p/q.$ It is explicitly the one provided by the continued fraction $p/q=[a_{1},a_{2},a_{3},\cdots ,a_{j}]$, which is rational because the sequence $(a_{1},a_{2},a_{3},\cdots ,a_{j})$ is of finite length. This establishes a one-to-one correspondence between the dyadic rationals and the rationals.

### Periodic orbits of the dyadic transform

Consider now the periodic orbits of the dyadic transformation. These correspond to bit-sequences consisting of a finite initial "chaotic" sequence of bits $b_{0},b_{1},b_{2},\cdots ,b_{k-1}$, followed by a repeating string $b_{k},b_{k+1},b_{k+2},\cdots ,b_{k+m-1}$ of length $m$. Such repeating strings correspond to a rational number. This is easily made explicit. Write

$y=\sum _{j=0}^{m-1}b_{k+j}2^{-j-1}$

for the value of one period. Since the bits repeat with period $m$, one then has

$\sum _{j=0}^{\infty }b_{k+j}2^{-j-1}=y\sum _{j=0}^{\infty }2^{-jm}={\frac {y}{1-2^{-m}}}.$

Tacking on the initial non-repeating sequence, one clearly has a rational number. In fact, every rational number can be expressed in this way: an initial "random" sequence, followed by a cycling repeat. That is, the periodic orbits of the map are in one-to-one correspondence with the rationals.
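The geometric-series argument above can be turned into exact arithmetic. In the sketch below, the names `frac` and `eventually_periodic_value` are hypothetical, bit strings are passed as text made of the characters '0' and '1', and the repeating block is assumed to be non-empty and short enough that nothing overflows 64 bits.

```c
/* Hypothetical fraction type num/den. */
typedef struct { unsigned long long num, den; } frac;

/* Euclid's algorithm. */
unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
    while (b != 0) { unsigned long long t = a % b; a = b; b = t; }
    return a;
}

/* Value of the binary expansion 0.(pre)(per)(per)(per)... in lowest terms.
 * With k = strlen(pre) and m = strlen(per) >= 1:
 *   value = (head + block / (2^m - 1)) / 2^k,
 * where head and block are pre and per read as binary integers.
 * block / (2^m - 1) equals y / (1 - 2^-m) with y = block / 2^m. */
frac eventually_periodic_value(const char *pre, const char *per)
{
    unsigned long long head = 0, block = 0, pow_k = 1, pow_m = 1;

    for (const char *c = pre; *c; ++c) {
        head = 2 * head + (unsigned long long)(*c - '0');
        pow_k *= 2;
    }
    for (const char *c = per; *c; ++c) {
        block = 2 * block + (unsigned long long)(*c - '0');
        pow_m *= 2;
    }

    frac f;
    f.num = head * (pow_m - 1) + block;
    f.den = pow_k * (pow_m - 1);

    unsigned long long g = gcd_ull(f.num, f.den);
    f.num /= g;
    f.den /= g;
    return f;
}
```

For instance, an empty prefix with repeating block 010 gives 2/7, the binary value of the example string used earlier, and the prefix 0 followed by the repeating block 01 gives 1/6.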

### Periodic orbits as continued fractions

Such periodic orbits have an equivalent periodic continued fraction, per the isomorphism established above. There is an initial "chaotic" orbit, of some finite length, followed by a repeating sequence. The repeating sequence generates a periodic continued fraction satisfying $x=[a_{n},a_{n+1},a_{n+2},\cdots ,a_{n+r},x].$ This continued fraction has the form

$x={\frac {\alpha x+\beta }{\gamma x+\delta }}$

with the $\alpha ,\beta ,\gamma ,\delta$ being integers, and satisfying $\alpha \delta -\beta \gamma =\pm 1.$ Explicit values can be obtained by writing

$S\mapsto {\begin{pmatrix}1&0\\1&1\end{pmatrix}}$

for the shift, so that

$S^{n}\mapsto {\begin{pmatrix}1&0\\n&1\end{pmatrix}}$

while the reflection is given by

$T\mapsto {\begin{pmatrix}-1&1\\0&1\end{pmatrix}}$

so that $T^{2}=I$. Both of these matrices are unimodular, arbitrary products remain unimodular, and result in a matrix of the form

$S^{a_{n}}TS^{a_{n+1}}T\cdots TS^{a_{n+r}}={\begin{pmatrix}\alpha &\beta \\\gamma &\delta \end{pmatrix}}$

giving the precise value of the continued fraction. As all of the matrix entries are integers, this matrix belongs to the projective modular group $PSL(2,\mathbb {Z} ).$

Clearing the denominator in the fixed-point equation gives $\gamma x^{2}+(\delta -\alpha )x-\beta =0.$ It is not hard to verify that the solutions to this meet the definition of quadratic irrationals. In fact, every quadratic irrational can be expressed in this way. Thus the quadratic irrationals are in one-to-one correspondence with the periodic orbits of the dyadic transform, which are in one-to-one correspondence with the (non-dyadic) rationals, which are in one-to-one correspondence with the dyadic rationals. The question mark function provides the correspondence in each case.
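The passage from a repeating block to a quadratic equation can be sketched in integer arithmetic. Note the assumptions: instead of the S and T products used above, this sketch multiplies the standard matrices with rows (0, 1) and (1, a), each representing the map x ↦ 1/(a + x), and it reads the purely periodic fraction as x = 1/(a_1 + 1/(a_2 + ⋯ + 1/(a_r + x))). The names `quadratic` and `periodic_cf_quadratic` are hypothetical.

```c
#include <stddef.h>

/* Coefficients of a*x^2 + b*x + c = 0 (hypothetical helper type). */
typedef struct { long a, b, c; } quadratic;

/* For a purely periodic continued fraction with repeating block
 * period[0..r-1], return the quadratic gamma*x^2 + (delta-alpha)*x - beta = 0
 * obtained from x = (alpha*x + beta) / (gamma*x + delta). */
quadratic periodic_cf_quadratic(const unsigned *period, size_t r)
{
    long alpha = 1, beta = 0, gamma = 0, delta = 1;   /* identity matrix */

    for (size_t i = 0; i < r; ++i) {
        long a = (long)period[i];
        /* Right-multiply by (0 1; 1 a), the matrix of x -> 1/(a + x). */
        long new_alpha = beta;
        long new_beta  = alpha + beta * a;
        long new_gamma = delta;
        long new_delta = gamma + delta * a;
        alpha = new_alpha;
        beta  = new_beta;
        gamma = new_gamma;
        delta = new_delta;
    }

    quadratic q = { gamma, delta - alpha, -beta };
    return q;
}
```

For the repeating block (2, 1) this produces 2x² + 2x − 1 = 0, whose positive root is (√3 − 1)/2, the quadratic irrational from the earlier example. The block (1) produces x² + x − 1 = 0.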

## Properties of ?(x)

The question-mark function is a strictly increasing and continuous, but not absolutely continuous function. The derivative is defined almost everywhere, and can take on only two values, 0 (its value almost everywhere, including at all rational numbers) and $+\infty$ . There are several constructions for a measure that, when integrated, yields the question-mark function. One such construction is obtained by measuring the density of the Farey numbers on the real number line. The question-mark measure is the prototypical example of what are sometimes referred to as multi-fractal measures.

The question-mark function maps rational numbers to dyadic rational numbers, meaning those whose base two representation terminates, as may be proven by induction from the recursive construction outlined above. It maps quadratic irrationals to non-dyadic rational numbers. In both cases it provides an order isomorphism between these sets, making concrete Cantor's isomorphism theorem according to which every two unbounded countable dense linear orders are order-isomorphic. It is an odd function, and satisfies the functional equation ?(x + 1) = ?(x) + 1; consequently x → ?(x) − x is an odd periodic function with period one. If ?(x) is irrational, then x is either algebraic of degree greater than two, or transcendental.

The question-mark function has fixed points at 0, 1/2 and 1, and at least two more, symmetric about the midpoint. One is approximately 0.42037. It was conjectured by Moshchevitin that these are the only five fixed points.
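The nontrivial fixed point near 0.42 can be located numerically by bisection on ?(x) − x. The sketch below is an illustration under stated assumptions: `qmark01` is a simplified mediant descent restricted to [0, 1] with a fixed number of steps, `fixed_point` is a hypothetical helper, and the bracket is assumed to satisfy ?(lo) < lo and ?(hi) > hi.

```c
/* Approximate ?(x) for 0 <= x <= 1 using 60 mediant steps
 * (simplified, fixed-depth variant; not tuned for edge cases). */
double qmark01(double x)
{
    long long p = 0, q = 1, r = 1, s = 1;   /* p/q <= x < r/s */
    double y = 0.0, d = 1.0;

    for (int i = 0; i < 60; ++i) {
        long long m = p + r, n = q + s;     /* mediant m/n */
        d /= 2;
        if (x < (double)m / (double)n) {
            r = m; s = n;                   /* go left: next bit is 0 */
        } else {
            y += d;                         /* go right: next bit is 1 */
            p = m; q = n;
        }
    }
    return y;
}

/* Bisection on g(x) = ?(x) - x, assuming g(lo) < 0 < g(hi). */
double fixed_point(double lo, double hi)
{
    for (int i = 0; i < 50; ++i) {
        double mid = 0.5 * (lo + hi);
        if (qmark01(mid) < mid)
            lo = mid;                       /* root lies to the right */
        else
            hi = mid;                       /* root lies to the left */
    }
    return 0.5 * (lo + hi);
}
```

The bracket [0.4, 0.45] is suitable because ?(2/5) = 3/8 < 2/5 while ?(9/20) = 61/128 > 9/20; calling `fixed_point(0.4, 0.45)` should then return a value between 0.420 and 0.421.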

In 1943, Raphaël Salem raised the question of whether the Fourier–Stieltjes coefficients of the question-mark function vanish at infinity. In other words, he wanted to know whether or not

$\lim _{n\to \infty }\int _{0}^{1}e^{2\pi inx}\,\operatorname {d?} (x)=0.$

This was answered affirmatively by Jordan and Sahlsten, as a special case of a result on Gibbs measures.

The graph of the Minkowski question-mark function is a special case of the fractal curves known as de Rham curves.

## Algorithm

The recursive definition naturally lends itself to an algorithm for computing the function to any desired degree of accuracy for any real number, as the following C function demonstrates. The algorithm descends the Stern–Brocot tree in search of the input x, and sums the terms of the binary expansion of y = ?(x) on the way. As long as the loop invariant qr − ps = 1 remains satisfied there is no need to reduce the fraction m/n = (p + r)/(q + s), since it is already in lowest terms. Another invariant is p/q ≤ x < r/s. The for loop in this program may be analyzed somewhat like a while loop, with the conditional break statements in the first three lines making up the condition. The only statements in the loop that can possibly affect the invariants are in the last two lines, and these can be shown to preserve the truth of both invariants as long as the first three lines have executed successfully without breaking out of the loop. A third invariant for the body of the loop (up to floating point precision) is y ≤ ?(x) < y + d, but since d is halved at the beginning of the loop before any conditions are tested, our conclusion is only that y ≤ ?(x) < y + 2d at the termination of the loop.

To prove termination, it is sufficient to note that the sum q + s increases by at least 1 with every iteration of the loop, and that the loop will terminate when this sum is too large to be represented in the primitive C data type long. However, in practice, the conditional break when y + d == y is what ensures the termination of the loop in a reasonable amount of time.

    /* Minkowski's question-mark function */
    double minkowski(double x) {
        long p = x;
        if ((double)p > x) --p; /* p=floor(x) */
        long q = 1, r = p + 1, s = 1, m, n;
        double d = 1, y = p;
        if (x < (double)p || (p < 0) ^ (r <= 0))
            return x; /* out of range ?(x) =~ x */
        for (;;) {
            /* invariants: q * r - p * s == 1
               && (double)p / q <= x && x < (double)r / s */
            d /= 2;
            if (y + d == y)
                break; /* reached max possible precision */
            m = p + r;
            if ((m < 0) ^ (p < 0))
                break; /* sum overflowed */
            n = q + s;
            if (n < 0)
                break; /* sum overflowed */

            if (x < (double)m / n) {
                r = m;
                s = n;
            } else {
                y += d;
                p = m;
                q = n;
            }
        }
        return y + d; /* final round-off */
    }


## Probability distribution

Restricting the Minkowski question mark function to ?: [0,1] → [0,1], it can be used as the cumulative distribution function of a singular distribution on the unit interval. This distribution is symmetric about its midpoint, with raw moments of about m1 = 0.5, m2 = 0.290926, m3 = 0.186389 and m4 = 0.126992, and so a mean and median of 0.5, a standard deviation of about 0.2023, a skewness of 0, and an excess kurtosis of about −1.147.
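The quoted moments can be estimated numerically through the quantile function, that is, the inverse of ? on [0, 1]: if U is uniform on [0, 1], then the inverse applied to U has distribution function ?, so the k-th raw moment is the integral of the k-th power of the inverse over [0, 1]. The sketch below is a rough check rather than a precise computation. The names `quantile` and `raw_moment` are hypothetical, and the inverse is evaluated by reading the binary digits of y as left and right steps in the Stern–Brocot tree.

```c
/* Inverse of ? at a double y with 0 <= y < 1: each binary digit of y
 * selects a mediant step (0 = left, 1 = right). A double has finitely
 * many binary digits, so the loop ends once they are used up. */
double quantile(double y)
{
    long long p = 0, q = 1, r = 1, s = 1;   /* current interval [p/q, r/s] */

    while (y > 0.0) {
        y *= 2.0;                           /* shift out the next bit */
        if (y >= 1.0) {
            y -= 1.0;
            p += r; q += s;                 /* bit 1: left end moves to mediant */
        } else {
            r += p; s += q;                 /* bit 0: right end moves to mediant */
        }
    }
    return (double)p / (double)q;
}

/* Midpoint-rule estimate of the k-th raw moment over `cells` equal cells.
 * The integrand is monotone with range [0, 1], so the error is at most
 * 1/cells. */
double raw_moment(int k, int cells)
{
    double sum = 0.0;

    for (int i = 0; i < cells; ++i) {
        double x = quantile((i + 0.5) / cells);
        double xk = 1.0;
        for (int j = 0; j < k; ++j)
            xk *= x;                        /* x^k by repeated multiplication */
        sum += xk;
    }
    return sum / cells;
}
```

With 4096 cells the estimates of m1 and m2 should agree with the quoted values to roughly three decimal places; finer grids tighten the bound accordingly.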

## Conway box function

Since the Minkowski question mark function is a strictly-increasing continuous bijection from the real numbers to the real numbers, it has an inverse function called the Conway box function. So $\square (?(x))=x$ and $?(\square (y))=y$ .

The box function maps the dyadic rationals to the rationals, and the non-dyadic rationals to the quadratic irrationals. A cardinality argument shows that no matter how many times this is then repeated, almost all real numbers are unboxable in this sense.