In probability theory, one says that an event happens almost surely (sometimes abbreviated as a.s.) if it happens with probability one. The concept is analogous to the concept of "almost everywhere" in measure theory. Although in many basic probability experiments there is no difference between almost surely and surely (that is, entirely certain to happen), the distinction is important in more complex cases relating to some sort of infinity. For instance, the term is often encountered in questions that involve infinite time, regularity properties or infinite-dimensional spaces such as function spaces. Basic examples of use include the law of large numbers (strong form) or continuity of Brownian paths.
The terms almost certainly (a.c.) and almost always (a.a.) are also used. Almost never describes the opposite of almost surely: an event that happens with probability zero happens almost never.
Let be a probability space. An event happens almost surely if . Equivalently, happens almost surely if the probability of not occurring is zero: . More generally, any event (not necessarily in ) happens almost surely if is contained in a null set: a subset of some such that . The notion of almost sureness depends on the probability measure . If it is necessary to emphasize this dependence, it is customary to say that the event occurs -almost surely or almost surely .
"Almost sure" versus "sure"
The difference between an event being almost sure and sure is the same as the subtle difference between something happening with probability 1 and happening always.
If an event is sure, then it will always happen, and no outcome not in this event can possibly occur. If an event is almost sure, then outcomes not in this event are theoretically possible; however, the probability of such an outcome occurring is smaller than any fixed positive probability, and therefore must be 0. Thus, one cannot definitively say that these outcomes will never occur, but can for most purposes assume this to be true.
Throwing a dart
For example, imagine throwing a dart at a unit square (i.e. a square with area 1) wherein the dart will impact exactly one point, and imagine that this square is the only thing in the universe besides the dart and the thrower. There is physically nowhere else for the dart to land. Then, the event that "the dart hits the square" is a sure event. No other alternative is imaginable.
Now, notice that since the square has area 1, the probability that the dart will hit any particular sub-region of the square equals the area of that sub-region. For example, the probability that the dart will hit the right half of the square is 0.5, since the right half has area 0.5.
Next, consider the event that "the dart hits the diagonal of the unit square exactly". Since the area of the diagonal of the square is zero, the probability that the dart lands exactly on the diagonal is zero. So, the dart will almost never land on the diagonal (i.e. it will almost surely not land on the diagonal). Nonetheless the set of points on the diagonal is not empty and a point on the diagonal is no less possible than any other point, therefore theoretically it is possible that the dart actually hits the diagonal.
The same may be said of any point on the square. Any such point P will contain zero area and so will have zero probability of being hit by the dart. However, the dart clearly must hit the square somewhere. Therefore, in this case, it is not only possible or imaginable that an event with zero probability will occur; one must occur. Thus, we would not want to say we were certain that a given event would not occur, but rather almost certain.
Tossing a coin
Consider the case where a coin is tossed. A coin has two sides—heads and tails—and therefore the event that "heads or tails is flipped" is a sure event. There can be no other result from such a coin, assuming that it cannot land on its edge or be plucked out of the sky and never land.
Now consider the single "coin toss" probability space , where the event occurs if heads is flipped, and if tails. For this particular coin, assume the probability of flipping heads is from which it follows that the complement event, flipping tails, has .
Suppose we were to conduct an experiment where the coin is tossed repeatedly, and it is assumed each flip's outcome is independent of all the others. That is, they are i.i.d.. Define the sequence of random variables on the coin toss space, where . i.e. each records the outcome of the 'th flip.
The event that every flip results in heads, yielding the sequence , ad infinitum, is possible in some sense (it does not violate any physical or mathematical laws to suppose that tails never appears), but it is very, very improbable. In fact, the (limit of the) probability of tails never being flipped in an infinite series is zero. To see why, note that the i.i.d. assumption implies that the probability of flipping all heads over flips is simply . Letting yields zero, since by assumption. Note that the result is the same no matter how much we bias the coin towards heads, so long as we constrain to be greater than 0, and less than 1.
Thus, though we cannot definitely say tails will be flipped at least once, we can say there will almost surely be at least one tails in an infinite sequence of flips. (Note that given the statements made in this paragraph, any predefined infinitely long ordering, such as the digits of pi in base two with heads representing 1 and tails representing 0, would have zero-probability in an infinite series. This makes sense because there are an infinite number of total possibilities and .)
However, if instead of an infinite number of flips we stop flipping after some finite time, say a million flips, then the all-heads sequence has non-zero probability. The all-heads sequence has probability , while the probability of getting at least one tails is and the event is no longer almost sure.
Asymptotically almost surely
In asymptotic analysis, one says that a property holds asymptotically almost surely (a.a.s.) if, over a sequence of sets, the probability converges to 1. For instance, a large number is asymptotically almost surely composite, by the prime number theorem; and in random graph theory, the statement "G(n,pn) is connected" (where G(n,p) denotes the graphs on n vertices with edge probability p) is true a.a.s. when pn > for any ε > 0.
- Convergence of random variables, for "almost sure convergence"
- Degenerate distribution, for "almost surely constant"
- Almost everywhere, the corresponding concept in measure theory
- Infinite monkey theorem, a theorem using the aforementioned terms
- p 186, Stroock, D. W. (2011). Probability theory: an analytic view. Cambridge university press.
- Grädel, Erich; Kolaitis, Phokion G.; Libkin, Leonid; Marx, Maarten; Spencer, Joel; Vardi, Moshe Y.; Venema, Yde; Weinstein, Scott (2007). Finite Model Theory and Its Applications. Springer. p. 232. ISBN 978-3-540-00428-8.
- Jacod, Jean; Protter, (2004). Probability Essentials. Springer. p. 37. ISBN 978-3-540-438717.
- Friedgut, Ehud; Rödl, Vojtech; Rucinski, Andrzej; Tetali, Prasad (January 2006). "A Sharp Threshold for Random Graphs with a Monochromatic Triangle in Every Edge Coloring". Memoirs of the American Mathematical Society (AMS Bookstore) 179 (845): 3–4. ISSN 0065-9266.
- Spencer, Joel H. (2001). "0. Two Starting Examples". The Strange Logic of Random Graphs. Algorithms and Combinatorics 22. Springer. p. 4. ISBN 978-3540416548.