Talk:Random variable

WikiProject Statistics (Rated C-class, Top-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated C-class, Top-importance)

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.

Definition is not correct

The text says that there are two types of random variables, discrete and continuous, which is unfortunately not true. If a random variable is not discrete, which means that the set of its realizations is not countable, it does not necessarily mean that it has a density. For instance, if {1,2, [3;4]} represents the set of the realizations of a random variable, such that {1}, {2}, [3;4] occur with non-zero probability, then it is not possible to construct a continuous distribution function although the realizations are not countable. (Preceding unsigned comment added by Bcserna (talk, contribs) 12:43, 21 October 2009)

Wouldn't your example be considered a "mixed" type? The "mixed" type is mentioned in the last sentence of that same paragraph. (Perhaps, we should clarify that there are actually 3 types, instead of 2?) Jwesley78 (talk) 12:54, 21 October 2009 (UTC)
A variable for which there does not exist a countable set with probability 1 is a continuous variable, whether or not there exists a probability density (with respect to Lebesgue measure), and whether or not there exist individual points with non-zero probability. In connection with the example given by Bcserna, it is true that the distribution function is discontinuous, but "continuous random variable" is not synonymous with "random variable with a continuous distribution function". It is true that most commonly when people refer to a "continuous random variable" they have in mind an absolutely continuous distribution with a probability density, but this is not a requirement. However, the present wording of this section of the article is faulty in more than one respect, for example because it asserts that in the continuous case the probability of any one value is always zero, and because it asserts that in the discrete case the probability of one value is never zero. I shall try rewording it in an attempt to clarify the issue. JamesBWatson (talk) 20:51, 22 October 2009 (UTC)
But the article also asserts that: "This categorisation into types is directly equivalent to the categorisation of probability distributions". This is contradictory with the given definition of a continuous random variable. --Tomek81 (talk) 18:30, 18 March 2010 (UTC)
I think the point in the above example is that one of the possible outcomes is, in some sense, interval-valued. Melcombe (talk) 10:01, 23 October 2009 (UTC)
I don't understand that. What do you mean by "outcomes"? For any random variable on a subset of the real line events can be intervals (whether the variable is continuous or discrete). The point of the example, as I understand it, is that it is possible to have a distribution in which part of the probability is concentrated on individual points (as in a discrete variable) and part spread out over an interval (as in a typical continuous distribution). Such a distribution, as Bcserna correctly realized, does not have a continuous distribution function. Nevertheless the expression "continuous random variable" includes variables having such distributions. JamesBWatson (talk) 13:58, 26 October 2009 (UTC)
My reading of the OP was that [3;4] represented an interval, and that a possible outcome/observation is the interval [3;4], so that the value of the random variable would be the interval [3;4]. Of course there would be ways of data-coding that could indicate that and lead to a discrete distribution. It is unfortunate that the article doesn't start by saying that it is starting with/only about scalar, real-valued random variables. I have recently added a "see also" to multivariate random variable, but it is not good as an extension of what is here either. Melcombe (talk) 16:56, 26 October 2009 (UTC)
I think the paragraph is fine as it stands, does anyone other than the original poster disagree with that? 018 (talk) 18:51, 18 March 2010 (UTC)
Yes, I disagree. The definition of continuous random variables is not standard. Plus, with the given non-standard definition, Example 1 under Functions of random variables is incorrect, since that example assumes P(X=c)=0 for continuous variables (which should have been used as the definition, but it's not).--216.239.45.4 (talk) 19:00, 18 March 2010 (UTC)
So you think that the reference (Rice's textbook) is non-standard? Rice is, as far as I can tell, the most common intro stats text book. What is the sense of standard if Rice is non-standard? 018 (talk) 22:33, 18 March 2010 (UTC)
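For readers following this thread, the mixed case raised by Bcserna can be sketched numerically. The snippet below is only an illustration: the weights (1/4 on the point 1, 1/4 on the point 2, and 1/2 spread uniformly over [3, 4]) are assumptions chosen here, not values from the discussion. It shows a distribution function with jumps at 1 and 2 and a continuous ramp on [3, 4], so some single values carry positive probability while others carry none.

```python
import random

# Hypothetical weights (not from the discussion): atoms at 1 and 2,
# plus a uniform component on the interval [3, 4].
P_ONE, P_TWO, P_INTERVAL = 0.25, 0.25, 0.5


def cdf(x: float) -> float:
    """Distribution function F(x) = P(X <= x) of the mixed variable."""
    total = 0.0
    if x >= 1:
        total += P_ONE  # jump of size P_ONE at x = 1
    if x >= 2:
        total += P_TWO  # jump of size P_TWO at x = 2
    if x >= 3:
        total += P_INTERVAL * min(x - 3.0, 1.0)  # linear ramp on [3, 4]
    return total


def sample(rng: random.Random) -> float:
    """Draw one realization: an atom with probability 1/2, else a point of [3, 4]."""
    u = rng.random()
    if u < P_ONE:
        return 1.0
    if u < P_ONE + P_TWO:
        return 2.0
    return rng.uniform(3.0, 4.0)


def point_mass(x: float, eps: float = 1e-9) -> float:
    """Approximate P(X = x) as the size of the jump of F at x."""
    return cdf(x) - cdf(x - eps)
```

Under these assumed weights, `point_mass(1.0)` is 0.25 while `point_mass(3.5)` is negligible, which is the situation both sides of the thread are describing: the set of realizations is uncountable, yet the distribution function is not continuous.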

How can a definition be incorrect when we don't have a definition? The lead section contains just some introductory notes. Anyway, there are 3 “pure” types of random variables: discrete, (absolutely) continuous, and singular. Any random variable is representable as a sum (or mixture) of these three, where the sum is meant in the sense of probability distributions: F_X = \alpha F_D + \beta F_C + (1-\alpha-\beta) F_S. Reference: Lukacs (1970). Characteristic functions. London: Griffin.  // stpasha »  19:13, 18 March 2010 (UTC)
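As a worked illustration of this decomposition, consider the example from the start of the thread with assumed weights (these numbers are hypothetical, chosen only for the illustration): P(X=1) = P(X=2) = 1/4 and the remaining mass 1/2 spread uniformly over [3,4]. In the notation of the formula above, this would give:

```latex
\[
  F_X = \alpha F_D + \beta F_C + (1-\alpha-\beta) F_S,
  \qquad \alpha = \tfrac12,\quad \beta = \tfrac12,\quad 1-\alpha-\beta = 0,
\]
\[
  F_D(x) = \tfrac12\,\mathbf{1}\{x \ge 1\} + \tfrac12\,\mathbf{1}\{x \ge 2\},
  \qquad
  F_C(x) = \min\bigl(\max(x-3,\,0),\,1\bigr).
\]
```

Here F_D is a pure step function (the discrete part), F_C has density 1 on [3,4] (the absolutely continuous part), and the singular part carries no weight. For instance F_X(2.5) = (1/2)(1) + (1/2)(0) = 1/2.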

Are singular random variables a pathological example? i.e. we usually don't state things in introductory articles like, any function with countable singularities... maybe that would be better for a "see also." 018 (talk) 22:49, 18 March 2010 (UTC)
A singular cdf is continuous but non-differentiable. This is indeed a “pathological example” in the sense that such rv's cannot be observed “in practice”. And that’s why I’m saying that we need a proper definition section, because this stuff cannot be discussed in the lead.
The OP of this thread is correct: the claim in the lead is incorrect, and if it is supported by a reference then the reference is incorrect (or misinterpreted).  // stpasha » 
I cited a paragraph in the reference (by page). Why don't you read it and decide what you think. If it is misinterpreted, let's change it. Otherwise, I think we have to pick our reference and I'd argue for Rice's book over Lukacs and I think we could count references or count classes that use it as a basis for choosing. 018 (talk) 23:12, 18 March 2010 (UTC)
I don't have either the Rice (1999) or the Lukacs (1970) book right now; however, an authoritative reference would be something like Billingsley (1995) (which incidentally I don't have either), so it seems a trip to the library is in order. But regardless of what the correct definition is, I think the entire discussion of the discrete/continuous/mixed topic should be moved out of the lead section, probably into its own section. The topic is just too subtle and perhaps too complicated for the lead.
As of right now, the lead only says where random variables are used, but fails to mention what a random variable IS. I see this as a gargantuan drawback.  // stpasha » 
Actually, I just reread Rice, and I don't think it really comments on the question at hand. I agree with you that the issue of the definition is a little problematic. Part of the problem is that the definition of a random variable is not intuitive so I think the idea is to not come out and say it because it will confuse and require lots of explaining. BTW, I'm not defending this as a good writing style. 018 (talk) 00:26, 19 March 2010 (UTC)

The introduction contradicts itself. It claims that "There are two types of random variables: discrete and continuous." Later it claims that "For a continuous random variable, the probability of any specific value is zero." It is not true that for every non-discrete random variable, the probability of a specific value is zero. Later in the same paragraph such "mixed" variables which are neither discrete nor continuous are mentioned, which contradicts the statement that there are only discrete and continuous variables. Tomek81 (talk) 20:04, 21 November 2010 (UTC)

Codomain of a random variable: observation space?

See Wikipedia talk:WikiProject Mathematics#Codomain of a random variable: observation space?. Boris Tsirelson (talk) 16:53, 27 March 2010 (UTC)

random variables don't have to be real-valued!

This treatment of random variables is just weird for an encyclopedia article on the subject. Many people daily use random variables that are integer-valued, vector-valued, sequence- or string-valued, complex-valued, function-valued (e.g., a process), state-valued (e.g., in a Markov chain over some arbitrary state set), etc. Yet the article seems to have been written by someone with a peculiar focus on real-valued random variables:

A random variable can be thought of as an unknown value that may change every time it is inspected. Thus, a random variable can be thought of as a function mapping the sample space of a random process to the real numbers. [And then the section goes on to encode the events "heads" and "tails" as 0 and 1, as if "heads" and "tails" weren't perfectly good ways to describe the outcomes already.]

a random variable is a (total) function whose domain is the sample space, usually mapping events to real numbers.

Typically, the observation space is the real numbers with a suitable measure.

In addition, the section on "equivalence of random variables" also restricts itself to real-valued RVs without stating this restriction. (Presumably this material should be part of the section on "real-valued random variables," if it belongs in this article at all.)

I think this is quite misleading. First, it suggests that real-valued RVs may have some kind of privileged status in the theoretical setup, or at the very least are more convenient. Second, it gives a falsely narrow picture of how random variables are used in practice. Third, it is likely to confuse the kind of naive reader who doesn't yet understand what a random variable is -- all this discussion of how outcomes are real numbers will encourage a naive reader to confuse outcomes with the other objects in probability theory that really do have to be real numbers, such as measures of sets. (Concretely: The reader is told that an RV is a function mapping events to real numbers, and this is illustrated with a die example involving two functions, one of which returns integer values like 1,2,3,4,5,6, the other of which returns obviously real values like 1/6. It sure looks like the second one ought to be the random variable!)

How about revising the article to emphasize rather than attempt to suppress the diversity of domains for random variables? (I am employing the common usage of the term "domain", a usage that should probably be noted in the article: One often says that the "domain" of a boolean-valued RV is {true,false}, even though properly speaking, the RV is a function whose range is {true,false}.) Eclecticos (talk) 07:44, 1 June 2010 (UTC)

I agree with most of the above, and I have made a few changes in the article to address some of the valid criticisms that Eclecticos has made. I am sure further changes along the same lines could usefully be made. However, I do think that for the inexperienced reader it is helpful to emphasise examples using real-number valued variables. Expressing the ideas in the more abstract setting of measurable sets is likely to completely lose the majority of readers, who have no idea what a "measurable set" is, and do not have the mathematical background to pick up that understanding by simply following a link to the Wikipedia article on the topic. Another good reason for emphasising real-valued examples is that in practice most people using random variables are using real-valued random variables. It seems to me to be a question of striking a balance: to suggest (as parts of the article seemed to) that random variables inherently have real numbers as their values is certainly unhelpful, but to give special prominence to real-valued variables is likely to be helpful. One of the changes I have made is to express two examples in non-numerical form, and then to show that they can be expressed as real numbers if desired. However, adding some more non-real-valued examples to the article would be helpful too, particularly ones which cannot usefully be expressed as real numbers. JamesBWatson (talk) 11:11, 1 June 2010 (UTC)
We've discussed this issue before, and nobody was able to come up with a reliable reference for the definition where a random variable would be non-real (see the random element though). The problem here probably is because most authors would like to speak of the expected value of the random variable, which might not exist for an arbitrary “observation space”. // stpasha » 04:09, 2 June 2010 (UTC)
I agree with Stpasha. Boris Tsirelson (talk) 06:46, 2 June 2010 (UTC)
Well, I must say that I was puzzled when I first read that a random variable would be real-valued by definition. (I remember this from the Handbook for Applied Cryptography.) But that's not what I learned in my math lectures, AFAIR. And it doesn't really make too much sense to restrict it to real numbers, as pointed out by Eclecticos. It seems more like a convenience matter to make it real-valued (and this would in fact cover integers and rationals as well). But see Wolfram's definition of a random variable. Nageh (talk) 07:15, 2 June 2010 (UTC)
I'd say, Wolfram's text is a bit Bourbaki-ish. I like it when speaking to (good) students of math department. But I did not expect it to be preferred in a (non-mathematical) encyclopedia. Boris Tsirelson (talk) 08:18, 2 June 2010 (UTC)
I wouldn't say it is preferred. But as for an encyclopedia I think it should cover both aspects. I think it should still provide a rather generic definition first, without going into details, and state that random values are often considered real-valued functions. Later in the article an exact formulation regarding measurable spaces should be provided. This means I mostly agree with the current situation of the article (rather than the previous one). Nageh (talk) 12:02, 2 June 2010 (UTC)
I think usage varies greatly. But one respected dictionary of statistics does specifically define "random variable" as "a real-valued function on a sample space", going on to distinguish discrete and continuous random variables. It has separate entries for "random process", "random event", "random series" and "random linear graph". However it is probably more important that this article fits in with other wikipedia articles which, I think, take the approach that a random variable is real-valued unless specifically stated (as being multi-dimensional or complex-valued). The question of what "random variable" should mean in Wikipedia articles seems an important one and it might be good to aim eventually for some statement in something like Wikipedia:WikiProject Mathematics/Conventions. Personally, I prefer the scalar-valued interpretation of "random variable", with other possibilities stated explicitly. Melcombe (talk) 10:08, 2 June 2010 (UTC)
I agree with Melcombe. Let me remind you that I have checked 4 books and all 4 define "random variable" as "a real-valued function on a sample space". (Believe me I did not post-select; these were all appropriate books on my shelf; or do not believe and check yourself on your shelf.) It would be quite inconvenient to say in the majority of cases "the expectation of a real-valued random variable", "the median of a real-valued random variable", "the sum of two real-valued random variables" etc etc. (all these notions are inapplicable to random elements of general measurable spaces). It is just not optimal to make many cases harder in order to make rare cases easier. Boris Tsirelson (talk) 14:59, 2 June 2010 (UTC)
Because it does not actually simplify, but even contradicts common sense. If you draw a ball from an urn, and X denotes a random variable describing its color, then you'll want to compute P(X=red), and not map it to a real value first. Even though random variables are repeatedly defined as real-valued, it really was confusing me when I saw that definition first. Sure you may compute expected values etc. but for some spaces it simply doesn't make sense. (What is the expected color of a ball drawn at random?) And AFAICT, simple examples like drawing colored balls from urns are those you start with when you learn probability theory and stochastics in school/introductory lectures. Nageh (talk) 15:27, 2 June 2010 (UTC)
As for me, you talk about a random color (more formally, a random element of the set of colors), which is completely legitimate; but why call it random variable?
Surely you have something to reply, but anyway: two different points of view are presented, each having its strong and weak features. What now? Should one of them be chosen? Or both, combined? In which proportion? According to the number of supporting WP editors? To their insistence? To the number of supporting reliable sources? Something else? Boris Tsirelson (talk) 17:49, 2 June 2010 (UTC)
From the point of view of analysis, the “random variable” ball color is not particularly interesting. The only thing which is interesting is the event “ball color = red”, which is already a binomial random variable, meaning that it takes real values 0 and 1. Ok, it's probably hard to see the point in this simple example, so consider another one. Suppose we have a “random variable” shape of an amoeba cell. If you're a biologist you might be highly interested in this random variable, you may observe it as many times as you want under your microscope, you may take pictures and write long papers about it. But if you want to actually analyze this “random variable”, you'll have to cast it into the domain of real numbers first. Thus you'll have an area of amoeba, its circumference, its diameter, its surface energy (a measure of curvature), etc. With these real-valued random variables we can already calculate summary statistics, run regressions, draw scatterplots — that is, apply the statistical analysis. With the shape “random variable” we can't do anything except taking pretty pictures. // stpasha » 08:43, 3 June 2010 (UTC)
Yes, pretty pictures. Also some highly abstract math: introduce an appropriate space of all "shapes", endow it with a sigma-algebra, consider probability measures on it, prove that they lead to standard probability spaces etc. In fact, I like this math; but a typical reader of WP likes it much less, I believe. Boris Tsirelson (talk) 10:08, 3 June 2010 (UTC)
What is a typical WP reader? See, even though I rely on a math background, I am not a mathematician. And I can only repeat that I find it highly confusing when random variables are introduced as real-valued "by definition". What is wrong with stating them as functions into some observation space, and pointing out that often this can be considered R? Anyway, I don't vote here, and I don't insist, just expressing an opinion. Nageh (talk) 10:27, 3 June 2010 (UTC)
It would be helpful to insert the amoeba example into this article and/or into the "random element" article. It could clarify the relation between random variables and random elements. Boris Tsirelson (talk) 11:09, 3 June 2010 (UTC)
Stpasha, a few remarks in response. You focus on statistics but random variables start out as part of probability theory; they just have applications to statistics. Let A (for "amoeba") be your shape-valued random variable. First, there are many reasonable things to ask about A itself without extracting real numbers from it, such as its entropy, or its mutual information with another random variable (which represents the same amoeba 1 second earlier, or a polygon that approximates A). Second, you are pointing out that C = circumference(A) and D = diameter(A) are also random variables and are real-valued. That is true, but it just highlights why it is important to regard A to be a random variable as well! If you know the distribution of A, then you immediately obtain the joint distribution over the pair (C,D); you may not be able to define the distribution over (C,D) without reference to A. Third, you suggest that you can't "do any statistics" with A except through the lens of real-valued variables like C and D. But that seems odd to me. For example, I'm sure you'd agree that estimators and decision rules are part of statistics. Suppose you have an estimator for the circumference C based on noisy observations A1, A2, ..., which are themselves shape-valued random variables that are conditionally independent given A. The bias, variance, risk of this estimator involve integrating over the possible values for A1, A2, ... directly. Eclecticos (talk) 12:43, 20 June 2010 (UTC)
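The urn example discussed in this thread can be made concrete with a small sketch. Everything below is an assumption made for illustration (the contents of the urn, the function names): it models a color-valued random variable as a function on a finite sample space, computes P(X = red) directly, and then shows the real-valued indicator of the event "X = red", whose expectation gives the same number.

```python
from fractions import Fraction

# Hypothetical urn: each outcome is one ball, and all balls are equally likely.
URN = ["red", "red", "blue", "green", "blue", "red"]
SAMPLE_SPACE = range(len(URN))


def color(ball: int) -> str:
    """Color-valued random variable X: maps an outcome (ball index) to a color."""
    return URN[ball]


def prob(event) -> Fraction:
    """P(event) under the uniform measure; `event` is a predicate on outcomes."""
    favourable = sum(1 for w in SAMPLE_SPACE if event(w))
    return Fraction(favourable, len(SAMPLE_SPACE))


def is_red(ball: int) -> int:
    """Real-valued (0/1) indicator of the event 'X = red'."""
    return 1 if color(ball) == "red" else 0


def expectation(f) -> Fraction:
    """E[f] for a real-valued function f on the sample space."""
    return sum(Fraction(f(w)) for w in SAMPLE_SPACE) / len(SAMPLE_SPACE)


p_red = prob(lambda w: color(w) == "red")  # computed without any real-valued coding
e_indicator = expectation(is_red)          # same number, via the indicator variable
```

Both points of view in the thread show up here: `color` is a perfectly usable function on the sample space even though `expectation(color)` would be meaningless, and the indicator `is_red` recovers a real-valued variable whenever one is needed.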

Hi, original poster here (sorry to have started this discussion and then not checked back). I think it will be clearer for me to respond collectively rather than inline.

Stpasha asks for a reliable reference for the general definition of random variables. Here is one (via Google Books) from a textbook, Fristedt & Gray (1996), p. 11. (This was merely the first book I tried: I happened to run into a textbook via Google when checking the definition of conditionally exchangeable sequence, so I checked its definition of random variable.)

My guess is that any graduate-level probability theory textbook would use this general definition, because it is laying out the theoretical foundations of the field. However, I understand that some statistics textbooks (particularly introductory or applied ones) will focus on the real-valued case. That is the focus of traditional statistics and is rich enough to fill a first textbook with theorems about moments, the bias and variance of estimators, particular distributions over reals, etc.

Traditional statistics aside, however, modern statisticians often deal with random variables that are not real-valued. I think the relevant question is this: If you are a statistician and you want to refer to such an object, what do you call it? Well, I have read hundreds of machine learning papers that refer to such objects as random variables ... whereas I have never seen the term "random element" used as a standalone technical term (it is only used in a phrase like "random element of set S," meaning a random variable ranging over the elements of S). People do refer to random sequences, random graphs, etc., but these are understood to be special cases of random variables: "random graph" is just short for "graph-valued random variable."

I gather from the random element article that when Fréchet (1948) generalized the classical definition of "random variable," he thought it would be less confusing to his readers if he introduced a new term "random element" for the generalization. However, I am asserting that modern usage simply does not bother to make this distinction.

I have participated in several oral qualifying examinations of Ph.D. students where the student stumbled on the type of the mathematical objects involved, and so was asked by one or another statistics professor to give the formal definition of a random variable. The desired answer was always simply that it is a measurable function on a probability space. Not a measurable real-valued function. In fact, in these exams, the random variable that triggered the question typically ranged over sequences or graphs or whatever was being studied in the student's thesis proposal.

Boris asks how we should make the decision about this article. My take is this: When a first-year computer science graduate student is struggling to read a research paper, and is using the web to fill in gaps in his or her prob/stats background, the first Google hit for "random variable" is this article. So this article should make it easier, not harder, for the student to understand the paper. The papers that I give my first-year graduate students almost always use "random variable" in the broader sense, so the current article will only confuse them (i.e., it is worse than useless to them). Nor can I usefully email this link to current students to remind them of what a random variable is formally. I would be happy to cite examples of current papers that require the broader definition.

Several people are rightly worrying about presentation. I think a good order would (1) give an intuitive notion of random variables as unknown quantities (and mention that they can be assumed to be real-valued unless otherwise specified), (2) observe that to talk about multiple correlated random variables they all need to be functions of the same underlying outcome space, (3) give the formal measure-theoretic definition of required properties of these functions, (4) give examples. Eclecticos (talk) 12:43, 20 June 2010 (UTC)
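For reference, the formal definition mentioned in step (3) is commonly written along the following lines. The symbols are notation chosen for this note, not taken from the article; the real-valued convention defended elsewhere in this thread corresponds to the special case given in the last line.

```latex
Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(E, \mathcal{E})$ a
measurable space. A map $X \colon \Omega \to E$ is called a random variable
(or random element of $E$) if it is measurable:
\[
  X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}
  \qquad \text{for every } B \in \mathcal{E}.
\]
Its distribution is the probability measure on $(E, \mathcal{E})$ given by
\[
  P_X(B) = P\bigl(X^{-1}(B)\bigr), \qquad B \in \mathcal{E}.
\]
The real-valued case takes $(E, \mathcal{E}) = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$,
where $\mathcal{B}(\mathbb{R})$ is the Borel $\sigma$-algebra.
```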

Well, I have only one objection — to the phrase "My guess is that any graduate-level probability theory textbook would use this general definition". When starting this discussion I just listed four books that do the opposite (and these were just all the relevant books on my shelf at that moment). But I agree that Fristedt & Gray is a good example on the other side. About "several oral qualifying examinations of Ph.D. students", well, I have no such statistics; probably you are right. About "a good order" (1–4) I have no objections. Boris Tsirelson (talk) 15:42, 20 June 2010 (UTC)
Beyond Fristedt & Gray, I think I could quickly turn up several other references supporting my usage (on WP itself, by the way, see the discussion at random variate, which defines random variable as I do). Here's a clear statement on p. 40 of Michael I. Jordan's 2005 NIPS tutorial: "In elementary probability theory, random variables are defined as functions whose ranges are the reals. In more advanced probability theory, one lets random variables range over more general spaces, including function spaces and spaces of measures."
Just for the record: my first note and discussion, further discussion. And here are the four books: "Probability: theory and examples" by Richard Durrett, "Probability with martingales" by David Williams, "Theory of probability and random processes" by Leonid Koralov and Yakov Sinai, and "Measure theory and probability theory" by Krishna Athreya and Soumendra Lahiri. Not at all books for statisticians! Just graduate textbooks in probability. Boris Tsirelson (talk) 15:52, 20 June 2010 (UTC)
Boris: thanks for the refs. It looks like I have to withdraw my statement that in modern usage, "random variable" is always taken to be the more general term and encompasses the notions of "random vector", "random sequence", etc.
  • Athreya and Lahiri (2006) restrict "random variable" to real-valued even though random vectors, sequences, etc. are very much within the scope of the book -- they define all of these terms on p. 191 as distinct special cases of measurable functions.
  • Koralov & Sinai (2010) go even farther: they actually restrict even the term "measurable function" to real-valued functions! However, they are inconsistent: when they get around to defining "random process" and "random field" on p. 171, they say that "all the random variables X_t [in the process or field] are assumed to take values in a common measurable space ... in particular, we shall encounter real- and complex-valued processes, processes with values in R^d, and others with values in a finite or countable set [emphasis mine]." So they seem to assume the general definition when they need it.
I think the lesson is that terminology varies. If you're writing a textbook, you have room to include definitions, and then you are entitled to define your terms in a way that will allow you to state your theorems in as few words as possible. If you have one chapter of theorems that are specifically about random real numbers and another chapter specifically about random sequences, then you might want to have separate short names for those things. I think this is why Athreya & Lahiri choose their terminology as they do. On the other hand, if your theorems are about conditional independence, inference, and sampling, as in the graphical models literature, then your results are mainly indifferent to the types of the random objects involved, and so you want a term that is general enough to cover any type. The term usually used in this general setting, in my experience, is "random variable" (and not "random element" or "random object" or "measurable function"). Eclecticos (talk) 03:26, 21 June 2010 (UTC)
Eclecticos, when I think of a random variable, I think of something I can manipulate with the typical machinery. Can you help me out: in the amoeba example, A is the "shape" of an amoeba and C is its circumference, can you please tell me what the definition and meaning of E(A), Var(A), Cov(A,C) is. Thanks. 018 (talk) 16:10, 20 June 2010 (UTC)
Oh, those expressions are not defined -- they're not well-typed, unless one has defined appropriate sum and product operations on shapes. Like ordinary variables in math or in computer science, random variables have types, which indicate what you can do with them. (Similarly, you may reasonably ask about the distribution of the random quantity C+3, but there is no random quantity A+3 since that expression is not well-typed. On the other hand, there is a random quantity rotate(A,π). Closer to your question, there is a conditional expectation of C given A, but not vice-versa.) Eclecticos (talk) 03:26, 21 June 2010 (UTC)
Anyway, I think the amoeba discussions are somewhat tangential. You and stpasha are asking the interesting question of why statisticians would ever want to theorize about random quantities that are not real-valued. But the fact is that they do (sometimes under names like "random measure," "random process", "random field", "random complex number," etc.). The question here is just about terminology. Eclecticos (talk) 03:26, 21 June 2010 (UTC)
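The remark about random variables having types can be sketched in code. The following is a toy model only: the `Polygon` class, the two-outcome sample space, and the helper names are all hypothetical stand-ins for the amoeba shape A and its circumference C. It shows that C + 3 and rotate(A, π) are meaningful, while A + 3 is rejected because no addition is defined on shapes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Polygon:
    """Toy stand-in for a 'shape': a closed polygon given by its vertices."""
    vertices: tuple  # tuple of (x, y) pairs


def perimeter(shape: Polygon) -> float:
    """Real-valued summary of a shape (plays the role of 'circumference')."""
    pts = shape.vertices
    return sum(math.dist(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts)))


def rotate(shape: Polygon, angle: float) -> Polygon:
    """Shape-valued function of a shape: rotation about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return Polygon(tuple((c * x - s * y, s * x + c * y) for x, y in shape.vertices))


# Hypothetical sample space with two equally likely outcomes.
SHAPES = {
    "small": Polygon(((0, 0), (1, 0), (1, 1), (0, 1))),
    "large": Polygon(((0, 0), (2, 0), (2, 2), (0, 2))),
}


def A(outcome: str) -> Polygon:
    """Shape-valued random variable."""
    return SHAPES[outcome]


def C(outcome: str) -> float:
    """Real-valued random variable obtained as a function of A."""
    return perimeter(A(outcome))


# E[C] is well defined because C is real-valued; E[A] has no meaning here.
expected_C = sum(C(w) for w in SHAPES) / len(SHAPES)

shifted = C("small") + 3            # fine: C + 3 is a real-valued quantity
turned = rotate(A("small"), math.pi)  # fine: rotate(A, pi) is shape-valued
# A("small") + 3 would raise TypeError: no addition is defined on Polygon.
```

The design point is only that the available operations depend on the codomain: expectations and sums need arithmetic on the values, whereas probabilities of events and functions of the variable do not.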
Eclecticos, there are a few different types of generalizations of RVs. There are generalizations that allows E(X) to make sense: vector valued, complex valued, integer valued. Then there are the ones where E(X) makes no sense at all (amoeba valued). The point is the terminology is specifically designed to make it so that all RVs have the right "type" so that E(X) is defined. I do see the value of thinking of the frontier of an amoeba as a shape and performing operations on it. But my point is semantic: RVs are the real valued things (though they generalize slightly). 018 (talk) 03:49, 21 June 2010 (UTC)
Now the article is inconsistent. I see, we took on the burden of saying "real valued" whenever needed. (This is what I tried to avoid but was not convincing.) Now please add it to the "Moments" section, the "Equality in mean" section, the "Convergence" section, and wherever else it is needed (and not only in this article). And what about "Functions of random variables"? Either random variables should be real-valued, or functions should be general. Boris Tsirelson (talk) 13:19, 21 June 2010 (UTC)

Okay, so the current version of the page now has this general definition but never uses anything but random variables (remember, the real valued is implied). Clearly we need a section on properties of non-real-valued random variables and examples. 018 (talk) 16:41, 21 June 2010 (UTC)

We need to merge the random element article into this one, since both names can in fact be used interchangeably.
There is a bigger problem with the explanation of the concept of a random variable though. There seems to be a huge gap between the mathematical definition of a random variable, as a function which maps the sample space into the “observation space”, and the real-life examples of random variables, such as age, height, temperature, income, race, etc. The problem is mainly that the “sample space” is a highly theoretical construct, and is not actually observable in real life. Usually we rely on Kolmogorov’s theorem, which states that given a distribution function one can construct the sample space and the measurable mapping X such that the resulting random variable will have the desired distribution. I’m not sure if this kind of theorem can be extended to our new generic definition of a random variable, but probably it can. Anyway, the meaning of this theorem is that we can alternatively define random variables in terms of their distribution functions (a much more comprehensible definition), so that Kolmogorov’s theorem becomes the theorem about the equivalence of two definitions.  // stpasha »  07:55, 22 June 2010 (UTC)
Why?? Just the opposite. The age and height are functions from the population (endowed with the uniform distribution) to the real line. This is the simplest case, easy to understand. The next step is, replacing the finite actually existing population with a more abstract, usually continuous set of possibilities; but the philosophy is the same: each point of a probability space is a possible man/woman. And ultimately: each point of a probability space is a possible state of the Universe. Kolmogorov's theorem is a technical means needed only for infinite dimension. For dimension two (age and height) the plane endowed with the given joint distribution IS such a probability space. What is the problem? Boris Tsirelson (talk) 10:33, 22 June 2010 (UTC)
I'm not a big fan of the finite-population case, so let's talk about the random variable “height of a person chosen at random in year 2100”. Then the sample space Ω will be the space of all potential persons who could live in 2100. Then an elementary element ω will correspond to a single possible person. At this point my imagination betrays me. What could possibly be the sigma-algebra on such set Ω? In order to check that the random variable “height” conforms to the definition, we need to know this sigma-algebra... How to specify the probability function on Ω? Once we have this function we can push-forward it to the observation space and determine the distribution of the height r.v. I don't think these are simple questions, especially since we know that people don't just pop into our universe out of nowhere, but instead are generated through some very nontrivial processes, and thus can themselves be seen as random variables, say, originating from the space of DNA configurations.  // stpasha »  13:15, 23 June 2010 (UTC)
stpasha, Deming wrote a piece on treating a census as a sample that you might find enlightening. 018 (talk) 14:19, 23 June 2010 (UTC)
stpasha, do not be more serious than it is usually made. Philosophically, it is the space of possibilities. But mathematically in each specific case we know what to ask. As I wrote, if we want to discuss age and height, then the plane endowed with the given joint distribution is such a probability space. And, yes, if afterwards we decide to add the income, then we replace the probability space "on the fly". This is usual. We have a canonical, measure preserving, map from the new space to the old space, and so we transfer the old random variables to the new space, and forget the old one. And if we add infinitely many coordinates then indeed Kolmogorov theorem is useful. Otherwise Lebesgue-Stieltjes measures are what we need (or even less, just integration, if the joint density exists). Well, you may be dissatisfied by this approach, but then check whether this is your POV or you have sources in support. Boris Tsirelson (talk) 14:52, 23 June 2010 (UTC)
Which is more-or-less exactly my point. “Not being more serious than it is usually made” means that there are certain conventions how the definition has to be applied in practice, and this layer of explanation is missing from the article. Say, if a simple person, who is not a probability guru, comes to this page and tries to figure out whether “the height” is indeed a random variable or not, using the definition provided, then I'm afraid he/she won't succeed. (Oh, and I don't know why I thought the Skorokhod's construction was called the Kolmogorov's theorem >.>)  // stpasha »  05:22, 25 June 2010 (UTC)
You are welcome to help to "the simple person" by explanation... Boris Tsirelson (talk)
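The "replace the probability space on the fly" step described above can be made concrete on a toy example. The following is only a sketch under stated assumptions: the two finite spaces, their probabilities, and the helper names `project` and `pushforward` are all hypothetical and chosen for illustration. It checks that forgetting the income coordinate is a measure-preserving map from the new space to the old one, and that the height variable transferred to the new space keeps its distribution.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical "old" probability space: points are (age, height_cm).
old_space = {
    (30, 170): Fraction(1, 2),
    (30, 180): Fraction(1, 4),
    (40, 170): Fraction(1, 4),
}

# Hypothetical "new" space after adding income: points are (age, height_cm, income).
new_space = {
    (30, 170, 20): Fraction(1, 4),
    (30, 170, 50): Fraction(1, 4),
    (30, 180, 50): Fraction(1, 4),
    (40, 170, 20): Fraction(1, 8),
    (40, 170, 80): Fraction(1, 8),
}


def project(point):
    """Canonical map from the new space to the old one: forget the income."""
    age, height, _income = point
    return (age, height)


def pushforward(space, f):
    """Distribution of f under the probability measure stored in `space`."""
    dist = defaultdict(Fraction)
    for point, prob in space.items():
        dist[f(point)] += prob
    return dict(dist)


def height_old(point):
    """Height as a random variable on the old space."""
    return point[1]


def height_new(point):
    """The same variable transferred to the new space via the projection."""
    return height_old(project(point))


# The projection pushes the new measure forward to exactly the old measure.
projection_preserves_measure = pushforward(new_space, project) == old_space

# Consequently the transferred variable has the same distribution as before.
same_distribution = (
    pushforward(new_space, height_new) == pushforward(old_space, height_old)
)
```

In this toy setting nothing beyond finite sums is needed; Kolmogorov's theorem would only enter if infinitely many coordinates were added, as noted in the thread.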

Proposed middle ground[edit]

How about we define it like this: a random variable has the reals (or some subset) as its range, and then we say there is a generalization of random variable that allows the range to be any measurable space, noting that some authors simply call this a random variable. This makes expositional sense because it starts simple and then adds levels of complexity. In the words of Eclecticos, for students from classes similar to the ones I took, "this article should make it easier, not harder, for the student to understand the paper." In its current state, it fails that test. But we can make it work for both students and keep in line with the sources. 018 (talk) 14:12, 22 June 2010 (UTC)

This discussion alludes to random sequences among other random variables in the broad sense. It seems to me that our article Random sequence, after its second sentence, shifts attention to sequences of values (candidate random variates) rather than sequences of functions (candidate random variables). Primarily it concerns assessing the randomness of particular binary sequences (domain N or one of its elements, codomain {0,1}), which seems to me the proper scope of our article Algorithmic randomness. Right?
What's at stake here? Is it to improve this article standalone or to agree an expository strategy for all articles in some sense? (Probability and statistics may be a candidate scope of agreement.) If the latter, then our hope is that editors generally work along with this article and perhaps its cousin Randomness (where "Random" redirects); links to this article or these two will be generally useful to readers of all articles. --P64 (talk) 22:55, 1 August 2010 (UTC)
Algorithmic probability has its own terminology, not always consistent with that of "classical" probability. Usually the context suggests which one is used. Boris Tsirelson (talk) 08:00, 2 August 2010 (UTC)

I really hate this article now. What is a moment generating function now? What is a moment? What exactly is the disadvantage of including the broader definition in a section about a broader distribution? If we don't, most of the article needs to be qualified. 018 (talk) 23:12, 21 August 2010 (UTC)

the lead is a monster[edit]

From WP:LEAD, "The lead should contain no more than four paragraphs..." I would also point out, I don't think a monster 400-word paragraph is what was envisioned. I'm not a good person to do this because I think the present emphasis on the general definition is counterproductive and an example of how Wikipedia math articles are written for math Ph.D. students. 018 (talk) 03:01, 18 October 2010 (UTC)

OK, check out the current version and tell me what you think. Benwing (talk) 05:50, 18 October 2010 (UTC)
Definitely, better. Some remarks:
(a) "arbitrary types such as sequences, trees, sets, shapes, manifolds and functions" — why only two of these (trees and functions) are linked?
(b) "The expected value and variance of aggregations such as random vectors and random matrices is defined as the aggregation of the corresponding quantity computed over each individual element." — this is true for the expected value (since it is linear) but not variance (since it is quadratic); the natural counterpart of variance for a random vector is its covariance matrix, not just its diagonal, the vector of variances.
(c) "That is, given measurable spaces (Ω,Σ) and (Ω',Σ'), a random variable is a function" — no, it is a function from a probability space (not just a measurable space!) to a measurable space.
Boris Tsirelson (talk) 06:38, 18 October 2010 (UTC)
OK, I've tried to address these concerns. The definition in (c) was not mine, and seems to duplicate the formal definition below, so I just shortened it. Benwing (talk) 08:34, 18 October 2010 (UTC)
Nice. Boris Tsirelson (talk) 09:18, 18 October 2010 (UTC)
That is much better. Thanks. 018 (talk) 14:57, 18 October 2010 (UTC)
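A small numerical check of remark (b) above may help. This is a sketch only: the two-point random vector (X, Y) with Y = X and the helper names `expect` and `cov` are assumptions made for illustration, not anything taken from the article. It shows that the expected value can be computed componentwise, while the vector of componentwise variances loses the off-diagonal entries of the covariance matrix, which are needed, for example, to get Var(X + Y).

```python
from fractions import Fraction

# Hypothetical random vector (X, Y): X is uniform on {0, 1} and Y = X,
# so the two components are perfectly correlated.
outcomes = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}


def expect(f):
    """Expected value of f(vector) under the distribution in `outcomes`."""
    return sum(p * f(v) for v, p in outcomes.items())


# Expectation is linear, so it is computed component by component.
mean = (expect(lambda v: v[0]), expect(lambda v: v[1]))


def cov(i, j):
    """Covariance between components i and j of the random vector."""
    return expect(lambda v: (v[i] - mean[i]) * (v[j] - mean[j]))


# The natural second-order object is the full covariance matrix ...
covariance_matrix = [[cov(0, 0), cov(0, 1)], [cov(1, 0), cov(1, 1)]]

# ... whose diagonal is the vector of componentwise variances.
componentwise_variances = [cov(0, 0), cov(1, 1)]

# Var(X + Y) equals the sum of ALL entries of the covariance matrix,
# not just the sum of the diagonal.
var_sum = expect(lambda v: (v[0] + v[1] - mean[0] - mean[1]) ** 2)
sum_all_entries = sum(sum(row) for row in covariance_matrix)
sum_diagonal = sum(componentwise_variances)
```

Here `var_sum` agrees with `sum_all_entries` but not with `sum_diagonal`, which is the sense in which variance, being quadratic, does not aggregate elementwise.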
I completely agree that the lede is too big. Also, nowhere in the lede is the actual definition of a random variable: that it is a measurable function between measurable spaces. The definition below is *wrong* (it says from a probability space, but you don't need a distribution to be a random variable.) I'm going to invest half an hour in the lede, please don't revert again without discussion. MisterSheik (talk) 01:34, 21 October 2010 (UTC)
Also, a good reference for probability theory is Durrett's book, which is freely available: http://www.math.duke.edu/~rtd/PTE/pte.html. See page 12.

Removing prob. distribution stuff[edit]

I removed most of the probability distribution stuff since the definition of distributions should be in that article. This section was good, but doesn't belong here:

Er, OK, but where did you put it? I don't see the examples in probability density function. I also think that if you take this stuff out, you need to include a reference in this article to where to find it. At the minimum, you need a section "Functions of random variables" that includes a short discussion of what this means and a link to where to find a further discussion, including how to compute the density of a function of a random variable. Similarly there should be a section "Moments of random variables". Basically, everything that is relevant to random variables needs to be mentioned in this article; just because the detailed discussion might be more relevant elsewhere doesn't mean the issue should be removed entirely. Benwing (talk) 03:37, 22 October 2010 (UTC)
I think that all that stuff should be in probability distribution because a random variable -- even a real-valued random variable -- need not have a distribution, and therefore need not have an expected value or a moment or any of that stuff. These things are all properties of the distribution (the measure itself) not the random variable (the measurable map.) I realize that the article about probability distribution needs a lot of work, but does it make sense to make up for it by putting all the stuff that belongs there over here? Imagine that you're looking to find out what a random variable is and you're flooded with all of this information that is only vaguely related to random variables -- it's good information, sure, but it doesn't answer your question at all. My vote for what should happen is that probability distribution be fixed. The merges that were suggested by user:Stpasha sound like good ideas. This section about "functions of random variables" is hard to place. Maybe it belongs here, but then it doesn't need to go into distributions. What do you think? MisterSheik (talk)
The problem with getting into probability distributions here is it gives the wrong impression -- it makes you think that the random variable has the probabilities built-in. All the random variable does is gather up the events into big pieces that can be measured. E.g., if you had a die, you could have an indicator random variable for "roll is less than 3". There's no mention of distributions, except the guarantee that whatever the underlying probabilities of the outcomes on the die are, they have to agree with the probabilities of the indicator variable. So, why fool the reader into thinking that there are probabilities? Why not link to prob. distribution and over there say "probability distributions with real support have statistics such as expected value, ..."
"a random variable need not have a distribution"?? This is your Point Of View. The standard definition in most general case is: a random variable is a measurable map from a probability space to a measurable space. If you prefer "from a measurable space to another measurable space", it is your original idea. Boris Tsirelson (talk) 09:18, 22 October 2010 (UTC)
Every random variable has a distribution (although not a density), a characteristic function, and a notion of expected value attached to it. That is the difference between a random variable and a mere measurable function. The section about probability distributions should be trimmed down, but cannot be omitted altogether, since it is one of the most important properties of a random variable. The transformation of random variables section may not be entirely appropriate here; perhaps move it to the probability density function article? Because it really talks about the transformation of densities (the transformation of distribution functions is actually quite trivial: a superposition of two maps).  // stpasha »  13:44, 22 October 2010 (UTC)
I was just copying the definition out of http://www.math.duke.edu/~rtd/PTE/PTE4_Jan2010.pdf . (He defines random variable as a measurable map to the reals.) If there's another good reference that suggests otherwise, then I'm happy to leave the definition using prob. spaces. I agree that the other information should be in the density function page. MisterSheik (talk) 15:20, 22 October 2010 (UTC)
If you're talking about his definition at the beginning of paragraph 1.2, then he says that a random variable is a map from Ω to R, without specifying what that Ω is. But already at the bottom of this page he claims that “if X is a random variable, then X induces a probability measure on R called the distribution”. Thus he meant that Ω was a probability space all along.  // stpasha »  17:34, 22 October 2010 (UTC)
I'm talking about the section entitled Random Variables. It clearly shows that Omega is the sample space. It says A measurable map is.... and then it says if the target is the reals, then it's called a random variable. I'm considering that this is not the case, but I don't see how there could be another interpretation. MisterSheik (talk) 18:14, 22 October 2010 (UTC)
So it means he has two inconsistent definitions of a random variable. And later on, when he talks about the expected value of a random variable, the probability distribution miraculously reappears... It'll be easier to write to the author of this manual and ask him to rectify the definition.  // stpasha »  00:27, 23 October 2010 (UTC)
Just another reference point here ... in the MathWorld definition here: [1] it specifically mentions that a r.v. is a function from a probability space to a measurable space, quoting Doob 1996. Benwing (talk) 07:28, 23 October 2010 (UTC)
You're right: it seems inconsistent. I think we should go with the probability space definition. However, I maintain that probability distribution and its consequences should be explained in that article. MisterSheik (talk) 08:07, 23 October 2010 (UTC)
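The die example raised earlier in this thread can be written out to show what each side is pointing at. The sketch below is illustrative only: the `fair` and `loaded` measures and the helper name `distribution` are assumptions. The map itself is the same function on the sample space in both cases; its distribution appears only once a probability measure on the sample space is fixed, which is exactly what the "from a probability space" definition supplies.

```python
from fractions import Fraction

# Sample space of one roll of a die.
sample_space = [1, 2, 3, 4, 5, 6]


def indicator_less_than_3(outcome):
    """The measurable map: 1 if the roll is less than 3, otherwise 0."""
    return 1 if outcome < 3 else 0


def distribution(rv, measure):
    """Push the measure on the sample space forward through the map rv."""
    dist = {}
    for outcome in sample_space:
        value = rv(outcome)
        dist[value] = dist.get(value, Fraction(0)) + measure[outcome]
    return dist


# Two hypothetical probability measures on the same sample space.
fair = {w: Fraction(1, 6) for w in sample_space}
loaded = {
    1: Fraction(1, 2),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
    5: Fraction(1, 10),
    6: Fraction(1, 10),
}

# Same map, different underlying measures, hence different distributions.
dist_fair = distribution(indicator_less_than_3, fair)
dist_loaded = distribution(indicator_less_than_3, loaded)
```

Under the probability-space definition agreed on above, the measure is part of the data, so the distribution of the indicator is determined and is not an optional extra.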

Functions of random variables[edit]

This section was removed. You may check the old revision just before the removal.

The lead[edit]

"to each possible outcome of a random event" — really? Or rather, of a random trial? (or "experiment") Boris Tsirelson (talk) 12:03, 22 October 2010 (UTC)

I think this has to be random event. The events are defined to be sets of outcomes, and the outcomes are the things that are mapped by the random variable. MisterSheik (talk) 18:18, 22 October 2010 (UTC)
Really, you think so? Then, what about the probability of this event? It is the probability that the random variable is defined! It must be equal to 1 (if you do not want to replace the standard notion of a random variable with your original one). Thus, in order to make the formulation correct, we have to say
"to each possible outcome of a random event of probability 1"
which is a formally correct, but ugly and puzzling formulation. And no wonder: the notion "event" is not intended for such use. Boris Tsirelson (talk) 16:08, 23 October 2010 (UTC)
Let me add that you can successfully implement your idea of a good article only when you are familiar with the topic of the article. Otherwise you'd better suggest your changes on the talk page. Boris Tsirelson (talk) 16:12, 23 October 2010 (UTC)
I read the article as saying "to each possible outcome of any random event" rather than "to each possible outcome of a particular random event". I think it's unnecessarily ambiguous. Why not replace it with "the sample space". If the link doesn't satisfy, then "the sample space (the set of all possible outcomes.)". MisterSheik (talk)
"to each possible outcome of any random event" is a strange idea, since each outcome belongs to a lot of different random events, and they all are irrelevant to the point. Boris Tsirelson (talk) 07:02, 24 October 2010 (UTC)
Well, I did "possible outcome, that is, element of a sample space". Boris Tsirelson (talk) 13:01, 24 October 2010 (UTC)

about deletions[edit]

MisterSheik, you cannot cannot cannot simply remove stuff that you think belongs elsewhere without doing all the work yourself of moving it, integrating it into the new page, and making sure the old page provides sufficient context about what was moved that the relevant part of the new page is easily located. Just dumping it in a talk page is not a proper substitute; you're basically saying "OK, I think the house needs to be rewired, so I'm just going to remove all the wiring that I think doesn't belong and dump it in the corner; if you need any of that wiring, here's where to find it."

I understand how that can be frustrating, but in fairness there was talk on this very talk page describing how moments did not belong on the page and so on. It's the writer's responsibility to make sure that his work is in the right place too. I'll discuss the deletion that you find objectionable below to explain why I still think it makes sense to remove it. MisterSheik (talk) 08:22, 23 October 2010 (UTC)

I understand your desire to maintain technical correctness, and I totally sympathize with your instinct to write the article in a "pure" fashion that lays everything out linearly and concisely, the way you might in a good grad-level math textbook. I've done the same for WP articles on subjects where I really am an expert, such as Old English phonology. The problem however is that if you do this, the article is only readable by experts, which is exactly what you do not want, since 99.9% of the readers are likely to be non-experts. I could (maybe) argue that Old English phonology is a sufficiently esoteric subject that the typical reader will be linguistically savvy enough to make sense of a statement like "the pairs /k/~/tʃ/ and /ɡ/~/j/ are almost certainly distinct phonemes synchronically in Late West Saxon ..." without explaining the IPA symbols used and without bothering to explain what a "phoneme" is or what "synchronically" means ... But this is definitely not the case for random variables.

This is a great point, and thank you for taking the time to explain it so clearly. I do agree with you that accessibility is very important. I accepted that the first paragraph of the article be non-technical, but I still think that one sentence in the entire four paragraph lede to make it clear that there exists a formal, clear way of talking about this concept would be great. No math symbols, just "In the measure theoretic formulation of prob. theory, a random variable is a measurable function from a probability space to a measurable space." One super short paragraph for other readers. MisterSheik (talk) 08:22, 23 October 2010 (UTC)

You should read WP:TECHNICAL. The article needs to be geared towards the average reader, not the expert. Ask yourself, your measure-theoretic concerns about whether a random variable technically must include a distribution or not, are they relevant to the average non-expert reader? Will the average non-expert reader be served by a discussion here about the distribution of a random variable, its expectation and variance, functions of a random variable, etc.? If the answers are "no" and "yes", respectively, then you need to discuss this stuff here. If you feel your measure-theoretic concerns are important, then by all means discuss them in the relevant "for experts only" section.

I totally agree, but it's humiliating to presume that the non-expert is also too stupid to learn about measurable spaces. He may not be interested, and in that case he can just read three short paragraphs in the lead and know what a rv is. If he wants the measure theoretic formulation he only has to read one line, and then read about measure theory; if those articles are kept short, then he can reasonably teach himself measure theory on wikipedia. MisterSheik (talk) 08:22, 23 October 2010 (UTC)

But keep in mind that, based on my experience and the experience of others I've worked with, the average reader is going to be rather confused about what a random variable is, how to think about it intuitively, what the relationship between a probability distribution and a random variable is, what it means to apply a function to a random variable, how you actually go about deriving the distribution of a function of a random variable, etc. In fact, the average reader will be sufficiently confused that if you just move stuff over into probability distribution they may well have a hard time figuring out how to apply the stuff there to a random variable, even though it seems totally obvious to you. Hence, it helps to be really specific and include lots of examples. Once you straighten out this article I'm going to add some more stuff about some of the areas I was very confused about, which I think will be very useful. Benwing (talk) 07:55, 23 October 2010 (UTC)

I really think that having a clear separation between what shows up on the rv page and the distribution page will help cement the conceptual differences. Duplicating all of probability theory on every related page doesn't make things easier to understand. What we have to consider is that the reader gets tired, and every reader will only read a given number of words. It's better to keep the pages succinct. That means organizing things and defining related comments succinctly, and providing links instead of putting in entire sections that redefine the related concepts. MisterSheik (talk) 08:22, 23 October 2010 (UTC)
You've made a number of good points, and I'll respond to them, but first I must note that you've missed the main point that I'm trying to make, which is that you've deleted a lot of important material without moving it anywhere or supplying a link to it in this article. The result is that the article is in a bad state. You don't seem to show any interest in fixing the problems, so I've gone ahead and fixed it myself by undoing the deletions.
As for whether and how to separate probability distributions from random variables, the point is not that you have to duplicate everything everywhere, but that you need to include a sufficient discussion of all relevant matters on the page to which it's relevant. In this case you've left the article with no discussion whatsoever of functions of random variables, moments of random variables, etc. How is a general reader to know where to find this info? It won't be obvious for them to go look on probability distribution (and the info isn't even there anyway, since you never moved it). Also, as for your comment about "humiliating to presume that the non-expert is also too stupid to learn about measurable spaces", this isn't what I said. Rather, it's simply not relevant to the vast majority of readers, and many of them will be confused if you insert too much stuff about measurable spaces, since they will have never even heard of the concept. I really think you should actually go and read WP:TECHNICAL and take it to heart. Here's why it's not a good idea to litter the "for non-experts" sections with expert-only material: For the non-expert, such material will be confusing and make it hard to find the stuff that's relevant to them; for the expert who already knows about measure theory, they will expect a discussion of it and can simply go down and find the relevant section. Benwing (talk) 00:31, 24 October 2010 (UTC)
How can you possibly believe that I want to make everything too technical? Look at what I removed: some measure theoretic stuff about the composition of measurable functions. Who is going to miss that section?
Here's my suggestion though: why not rewrite these things I removed using one sentence for every paragraph. Moment can be explained in one sentence with a link. Same with "functions of random variables".
Also, we need to decide if r.v. means real-valued random variable. If you want it to mean that, it should say that in the first line of the article. Otherwise, I would suggest having some non-real-valued examples.
I'm sorry that I don't have time to rewrite everything to do with probability theory on wikipedia. MisterSheik (talk) 06:49, 24 October 2010 (UTC)
What a sorrow :-) Boris Tsirelson (talk) 07:15, 24 October 2010 (UTC)
OK, as for making things too technical, what I was referring to was the fact that you were letting concerns that are relevant only for experts drive the structure of the article. Maybe, possibly a random variable can have no distribution (but the other posters disagree on this), but this is a very technical issue, because from a practical standpoint, r.v.s always do have distributions. Also your suggestion to take out paragraphs and replace them with a short sentence might work, but I suspect it would simply render the article too confusing for non-experts. I'd have to see what you actually wrote to be sure. Keep in mind that statistical topics like random variables are very tricky for people to get their heads around, and even connections between topics that seem super-obvious to you will not be at all obvious to them. Hence, spelling out explicitly how everything connects and giving lots of examples is good. Obviously you can go overboard but I think it's better to err on the side of too much text rather than too little. As an example: a function of a random variable is a tricky concept, and someone who doesn't already grok it will not get it from a one-sentence description. Benwing (talk) 08:48, 24 October 2010 (UTC)
BTW your comment about non-real-valued examples is a great one, thanks for making it! I'll see about incorporating it.
In general, I really do appreciate your input into this article and I think others do as well ... almost all Wikipedia editors in the process of learning Wikipedia culture, expected etiquette, etc. occasionally have hiccups -- don't sweat it ... Keep up the good work! Benwing (talk) 08:48, 24 October 2010 (UTC)
Thanks for your well-worded comments. I do try to keep everyone in mind, but it's true that I am most concerned about people who have the patience and mathematical background to learn, even though there are plenty of other people who might be put off by mathematical language, or text without examples. I think that's one benefit of having many editors on wikipedia: they balance the "concerns that drive the structure of the articles." MisterSheik (talk) 23:25, 24 October 2010 (UTC)

Examples[edit]

A random variable is real valued. Every single source defines it that way. Now, there are also non-real valued random variables, but then they are preceded by a qualifier: a complex-valued r.v., a vector-valued r.v., an (E, ℰ)-valued r.v., and so on. But if you omit the qualifier, then it has to be the real-valued random variable. Which is why the first example given in the Examples section is wrong; I have deleted it but then somebody put it back :( Also there is no point in giving the examples of the probability mass functions before you actually tell the reader what that is. The second example strikes me with its stupidity... Really, if X is “the number rolled on a die”, then who needs the “clarification” that X = 1 if 1 is rolled, 2 if 2 is rolled, ..., 6 if 6 is rolled ???  // stpasha »  08:05, 24 October 2010 (UTC)

I put it back because I was trying to undo deletions that MisterSheik had made. I didn't realize this was your work instead. I'm not sure what exactly you had done before, so either point me to what exactly it was and I'll delete it again, or go and delete it yourself. Benwing (talk) 08:27, 24 October 2010 (UTC)
Nonetheless, I don't believe it to be the case that if you just say "a random variable" without qualification, it must necessarily be real-valued (often yes, but always, no). Now, AFAIK a random variable is normally real valued, but can also take other types. In such a case, my English intuition tells me that if you just see "a random variable" without qualification, its interpretation is determined by context, with "real-valued" as the default. In other words, some contexts will make it so that there's only one logical interpretation; but if the context doesn't do that, then the "default" of "real-valued" wins out. For example, if you're describing how the Gibbs sampling algorithm works, and you describe each node in the Bayesian graphical model as being or having "a random variable" (with no qualification), then you are obviously talking about a general r.v. of arbitrary type, not a real-valued r.v. Benwing (talk) 08:27, 24 October 2010 (UTC)
I'm not sure I see your point there. In the standard Gibbs sampling algorithm every node is a real-valued pseudo-random variable. Now, of course the algorithm could be extended to non-real valued r.v.’s, but if you want to talk about arbitrary random variables without qualifying their type it’s better to use the term “random element”.
One of the principal reasons why we want random variables to be real-valued is because then they possess several nice properties, such as cumulative distribution function, characteristic function, expectation operator; all these might not be defined if the random variable has arbitrary type (say, X = {heads, tails} random variable).  // stpasha »  19:46, 24 October 2010 (UTC)
I just want to point out two things: The Durrett text, which is just one source that we all have access to, defines r.v. as real-valued. I liked the suggestion (whoever's it was) to merge random element into this page, and if we do that, then what would the distinction between random element and random variable be? MisterSheik (talk) 23:17, 24 October 2010 (UTC)
OK, but in the realm of natural language processing (NLP), which I work in, it's quite common to have non-real-valued random variables. Bishop's book "Pattern Recognition and Machine Learning" uses the term "random variable" in connection with general algorithms (Gibbs sampling, Variational Bayes), etc. where it's quite clear that he does not intend to restrict the algorithms to real numbers -- in fact the examples tend to include more vector-valued r.v.s than real numbers. Nowhere does "random element" appear anywhere in this text, nor in any other texts or research papers that I've seen in the field of NLP, despite the fact that many if not most of the r.v.s are not real-valued. Benwing (talk) 07:11, 27 October 2010 (UTC)

Expectation of what?[edit]

"Expectation of a random variable"? Or rather, "Expectation of the distribution of a random variable"? The second option may be tempting, but leads to problems, as follows.

  • If XY almost surely then E(X)≤E(Y). Can we reformulate it in terms of distributions? Yes I can (can you?) but it becomes more technical and less intuitive.
  • E(X+Y)= E(X)+E(Y) (always, not just for independent random variables). Can we reformulate it in terms of distributions? Yes I can (can you?) but it becomes much more technical and less intuitive.
  • Also, think about Jensen's inequality, first of all, (E(X))^2≤E(X^2), and many other statements.

Boris Tsirelson (talk) 07:08, 24 October 2010 (UTC)
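
The three statements above can be checked on a toy example. The following is only a sketch, assuming a hypothetical finite sample space (two fair coin flips); the names `omega`, `P`, `E`, `X` and `Y` are illustrative and do not come from the article. Random variables are treated as functions on the sample space, so E(X+Y) is computed pointwise even though X and Y are dependent:

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two fair coin flips, each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
P = {w: Fraction(1, 4) for w in omega}

def E(rv):
    """Expectation of a random variable, i.e. of a function on the sample space."""
    return sum(rv(w) * P[w] for w in omega)

def X(w):
    """Indicator that the first flip is heads."""
    return 1 if w[0] == "H" else 0

def Y(w):
    """Total number of heads; X <= Y at every outcome, and X, Y are dependent."""
    return w.count("H")

# Monotonicity: X <= Y pointwise, hence E(X) <= E(Y).
monotone = all(X(w) <= Y(w) for w in omega) and E(X) <= E(Y)

# Linearity: holds without any independence assumption.
linear = E(lambda w: X(w) + Y(w)) == E(X) + E(Y)

# Jensen's inequality for the square: (E(Y))^2 <= E(Y^2).
jensen = E(Y) ** 2 <= E(lambda w: Y(w) ** 2)

print(monotone, linear, jensen)
```

Stating the same facts purely in terms of distributions would require the joint distribution of (X, Y), which is the extra technicality mentioned above.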

Mathematics or probability theory[edit]

Could anybody explain why in the lead sentence the topic is associated with the probability theory? Since the random variable is a mathematical variable, and used as a pattern in probability theory and statistics, it would be more common to mention only mathematics.--Kiril Simeonovski (talk) 18:56, 13 November 2010 (UTC)

Are you sure that "the random variable is a mathematical variable"? I am not. There is no formal notion of a "variable" in mathematics (there are sets, functions etc., but not "variables"), but even informally, a "variable" in mathematics (whatever it reasonably means) has no probability distribution. A random variable has. Boris Tsirelson (talk) 20:22, 13 November 2010 (UTC)
Indeed, there are variables in mathematics, described as changeable values. Unlike random variable, other variables do not have probability distribution, but other types of distributions (note the frequency distribution). However, you're right. There is not exact definition about what in general a variable should include.--Kiril Simeonovski (talk) 23:13, 15 November 2010 (UTC)
I agree with Boris, "variable" has a range of meanings within mathematics. In this case, a "random variable" is a particular type of mathematical object, one that (as I think about it anyway) has no value a priori but that has associated with it a probability distribution function that determines values it could take on. I don't understand what Kiril means by "used as a pattern". I would say there are lots of types of objects in mathematics, one type is a "random variable", which has applications in probability and statistics. While I wouldn't be surprised if they are in some way isomorphic to something with broader applications in math, I think that's stretching it. --Ben FrantzDale (talk) 17:00, 14 November 2010 (UTC)
Thanks. I consider probability theory a branch of mathematics, the same as algebra, geometry or calculus. The global usage and importance of the random variable in the field of probability theory made me compare it with other key mathematical terms, such as plane in geometry, or permutation in combinatorics. The lead sentences of both plane and permutation mention "mathematics" (not "geometry" and "combinatorics"), and therefore I used to do the same with the random variable (in general, no probability without random variable, as no geometry without plane). My notion is that the articles concerned with probability theory are very well covered, which could be used to trim the other mathematical articles.--Kiril Simeonovski (talk) 22:38, 15 November 2010 (UTC)

First sentence[edit]

Right now the first sentence says "In probability theory, a random variable, or stochastic variable, is a way of assigning a value to each possible outcome, that is element of a sample space." This seems very awkward: a rv is a way of assigning? each possible outcome of what? How about changing it to something like "In probability theory [or better, in probability and statistics], a random variable or stochastic variable is a variable that can take on any of various values -- one for each possible situation that could arise (each possible element of a sample space)"? Duoduoduo (talk) 21:47, 14 November 2010 (UTC)

Hi Duoduoduo, good catch there. "a rv is a way of assigning?" >> "r.v. is a function that assigns a value to each possible outcome" is more accurate. "each possible outcome of what?" >> It just tells about it in the second sentences, but sure it could be combined with the first sentence for the clarity. ~ Elitropia (talk) 22:19, 14 November 2010 (UTC)
And now I see "a random variable ... is a variable whose value results from a measurement on some random process"; the problem is that technically "random process" is a notion more complicated than "random variable" and in fact defined in terms of random variables. --Boris Tsirelson (talk) 16:20, 19 February 2011 (UTC)
Not only that: there is no "measurement" involved. The first sentence confuses the mathematical notion of random variable, which is well defined, with a supposedly empirical notion of a 'random' or 'stochastic' natural process (in the colloquial sense) that in practice is not defined empirically or otherwise, and is most likely meaningless. Better to stick to the purely mathematical definition. illywhacker; (talk) 21:11, 31 January 2012 (UTC)
A random variable is not a variable at all (nor is it random). That should probably be clarified. 69.110.145.10 (talk) 16:14, 19 September 2012 (UTC)

Examples[edit]

I found the first example very confusing. It starts with "the following random variable: X=either head or tail." However, the definition of a random variable given just below is "a random variable is a measurable function". In the example, where is the function? What is the variable of the function, and what is the outcome depending on the variable? I find that very unclear: it seems to define the state space only. - Nicolas. 152.81.114.116 (talk) 15:00, 28 December 2010 (UTC)

Thanks for pointing this out. Hope my revision has answered your objection. Duoduoduo (talk) 15:33, 28 December 2010 (UTC)
Thanks, it's indeed much better. I'm wondering if the notation X(w=head)=1, X(w=tail)=0 would be better ? Or X(w)=1 iff w=head ? It would clearly indicate the state space and would show the random variable as a function. I'm not expert enough to decide by myself. 152.81.114.116 (talk) 15:48, 28 December 2010 (UTC)
How does it look now? Duoduoduo (talk) 20:42, 28 December 2010 (UTC)
Thanks. I guess this is better - but I'm not a mathematician! 152.81.114.116 (talk) 21:59, 29 December 2010 (UTC)

The real numbers with a suitable measure[edit]

Does "a suitable measure" refer to the probability measure, P? If so, maybe this could be made explicit. If not, why do the reals need a measure here? They're said to serve as the first item of the tuple (E,script E), but no measure was required for (E,script E) where E is not R. Dependent Variable (talk) 16:40, 2 March 2011 (UTC)

I agree, and remove the "suitable measure". --Boris Tsirelson (talk) 16:53, 2 March 2011 (UTC)

"try and improve comprehensibility"[edit]

The former phrase "Formally, it is a function from a probability space, typically to the real numbers" was correct. The new phrase "and the probability (or probability density for continuous random variables) of each of these values is defined by a probability space, typically restricted to the real numbers" is either incorrect or vague. In which way does the probability space define the probability? What is typically restricted to the real numbers, the probability space?! Boris Tsirelson (talk) 17:06, 8 February 2012 (UTC)

Now it is much better (I think so); but the phrase "discrete (ie it may assume any of a specified set of exact values)" disturbs me. Each real number is an "exact value", isn't it? Thus, say, the segment [0,1] is an example of a "specified set of exact values"... Boris Tsirelson (talk) 21:29, 8 February 2012 (UTC)

I agree that it is not a paragon of clarity but I think they are saying that you could not list out value such that your list would have non-zero measure. Sometimes, you are better off forcing the reader to click on the term in question. 018 (talk) 03:21, 9 February 2012 (UTC)
Well, now these are clickable, and "set" replaced with "list". Boris Tsirelson (talk) 06:58, 9 February 2012 (UTC)

Die[edit]

The sample space is the set of outcomes of the experiment. Possible outcomes are the number of eyes shown on the upper side of the die. These are numbers. The rv has as its values these numbers. This is a typical example of rv's of the type X(ω)=ω.Nijdam (talk) 06:58, 21 June 2012 (UTC)

No, the possible outcomes are not numbers. They are the possible states of affairs after the random event. A state of affairs is something like "there is a 6 shown on the upper side of the die". A clear difference needs to be made between the outcomes of the random event and the value of the random variable. Else things won't make sense to anyone. --rtc (talk) 17:11, 21 June 2012 (UTC)
On one hand, this makes sense. But on the other hand, we work with mathematical models, not with the reality itself. A mathematical model is built out of mathematical objects. "The set of an apple and an orange" is a common abuse of the mathematical language, but really the mathematical universe contains no fruits; we encode fruits (and other real things) by mathematical objects. Likewise, you cannot send an apple by email, but you can send a picture encoded by bits, – and it is a conventional substitute for the apple... Boris Tsirelson (talk) 20:52, 21 June 2012 (UTC)
This is the classic "platonist" view of mathematics, which is by no means uncontroversial. There are views that hold the mathematical world in fact to be not as separate from the real world as the Platonists claim. Anyway, the state of affairs of a die after rolling it is not the number it shows, and thus they should not use the same mathematical representation even if that's okay as far as theory is concerned. It's just too confusing and does not give readers the right intuition. --rtc (talk) 21:05, 21 June 2012 (UTC)
So, I am not uncontroversial, :-( but classic. :-) Boris Tsirelson (talk) 05:49, 22 June 2012 (UTC)

Take a look at sample space. Nijdam (talk) 21:18, 21 June 2012 (UTC)

Maybe both approaches could be mentioned. Indeed, the "nonclassic" approach is also not uncontroversial. Boris Tsirelson (talk) 05:49, 22 June 2012 (UTC)
The recently added "example" (of a pair of measures related to individuals in a population) emphasises the utility of regarding the sample space as a generic set of items, leaving the "numerical value" of a random variable to be treated as part of the "function" in the "functions defined on a probability space," rather than as part of the sample space. Melcombe (talk) 09:36, 22 June 2012 (UTC)

My lead and definition cleanups[edit]

The changes I made today were motivated by my frustration in trying to relate what this article said to what my wife's textbook said, when I was trying to help her interpret what a random variable is. I found the lead here to be overburdened with distracting generalization, and the definition to be a confusing and imprecise mixture of concepts. I hope that what I have done makes it more understandable, and precise enough to be not too offputting to the mathematicians. Let's talk if you think it is incorrect, not precise enough, not complete enough, or not understandable enough, and we can work on that. I added a couple of refs I came up with, too. Dicklyon (talk) 18:11, 26 April 2013 (UTC)

Nijdam, I don't understand this edit. I think the opening sentence of the lead made more sense before it, but I can't interpret it after. Can you explain the intent? Either way, it's not quite right, since in the continuous case a probability distribution does not associate a probability with an outcome, just with measurable sets of possible outcomes. I don't think we can skip that, just need the easy way to put it. I'll try. Dicklyon (talk) 18:21, 26 April 2013 (UTC)

I was very unhappy with the definition, so I tried to improve it, keeping as much as possible of what was already there. Anyway, if you want to rewrite the definition it's fine with me. Nijdam (talk) 16:59, 27 April 2013 (UTC)
I understand now that your change to "A random variables is defined on a set of possible outcomes..." was correct; I misinterpreted some sources. The other half of the sentence, which you didn't fix, was however not improved. I've replaced all that with a different approach now. Please take a look. Dicklyon (talk) 06:17, 28 April 2013 (UTC)
"A random variable is defined by a set of possible outcomes (the sample space Ω) and a probability distribution. The random variable associates each subset of possible outcomes with a real number, known as the probability of a sample outcome being in that subset of possible values." – Really? The sample space is (generally) NOT a set of values of the random variable. Consider a real-valued random variable. Its values are real numbers, but points of the sample space are (generally) not.
The function that associates a real number with each (measurable) subset of possible OUTCOMES is the probability measure given on Omega, before introducing this or that random variable. In the example below it is the uniform distribution on the set of all persons (within the city or whatever).
In contrast, the function that associates a real number with each (measurable) subset of possible VALUES is exactly the distribution of this random variable. The distribution is also a probability measure, but specific to this random variable, and sitting on reals. Boris Tsirelson (talk) 15:49, 27 April 2013 (UTC)
"For example, in an experiment a person may be chosen at random, and one random variable may be the person's height." – Yes! The person is a point of the sample space. His/her height is a real number, but the person is not. Boris Tsirelson (talk) 15:51, 27 April 2013 (UTC)
A random variable HAS a distribution; but the random variable is much more than a distribution. The distinction is hidden when dealing with only one random variable (the height), but becomes clear when we deal with two (height and weight). Assuming that a random variable IS (the same as) distribution we must agree that a pair of random variables IS a pair of distributions. But they are (generally) correlated! They have a joint (2-dim) distribution! Boris Tsirelson (talk) 15:57, 27 April 2013 (UTC)
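
A small sketch of this point, assuming a made-up population of four persons with invented heights and weights (all names and numbers here are hypothetical and purely illustrative). The two candidate weight variables have the same distribution, yet their joint distributions with height differ, so a random variable carries more information than its distribution alone:

```python
from collections import Counter
from fractions import Fraction

# Hypothetical sample space: four persons, chosen uniformly at random.
persons = ["p1", "p2", "p3", "p4"]

# Invented values; each dict is a random variable, i.e. a function on persons.
height = {"p1": 160, "p2": 160, "p3": 180, "p4": 180}
weight_a = {"p1": 60, "p2": 60, "p3": 80, "p4": 80}  # tied to height
weight_b = {"p1": 60, "p2": 80, "p3": 60, "p4": 80}  # unrelated to height

def distribution(rv):
    """Distribution of rv under the uniform probability measure on persons."""
    counts = Counter(rv(p) for p in persons)
    return {value: Fraction(c, len(persons)) for value, c in counts.items()}

# Same marginal distribution for both weight variables...
marginal_a = distribution(weight_a.get)
marginal_b = distribution(weight_b.get)

# ...but different joint distributions with height.
joint_a = distribution(lambda p: (height[p], weight_a[p]))
joint_b = distribution(lambda p: (height[p], weight_b[p]))

print(marginal_a == marginal_b, joint_a == joint_b)
```

The joint distribution can only be formed because height and weight are evaluated at the same person, which is what the sample space provides.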
Boris, I think I agree, but sources are all over the place on this. Do you have one we could cite for the way you like to do it? Dicklyon (talk) 17:19, 27 April 2013 (UTC)
Every mathematical textbook defines all these notions in a crystal clear way. But probably there are a lot of non-mathematical textbooks with all kinds of, hmmm, viewpoints. I like this online source: Virtual Laboratories in Probability and Statistics. Boris Tsirelson (talk) 17:38, 27 April 2013 (UTC)
And maybe this: Hazewinkel, Michiel, ed. (2001), "Random_variable", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4 . Boris Tsirelson (talk) 18:56, 27 April 2013 (UTC)
The first does a good job of making the r.v. be a measurement on a random sample; but its view of what an r.v. is seems at odds with many others. It says "the important point is simply that a random variable is a function defined on the sample space S." Others say the r.v. maps subsets of S to probabilities, and presume that the possible values of the r.v. are the elements of S. That second one doesn't seem to include the idea of functions on samples, nor support your idea that an r.v. is more than a distribution. But I find it very hard to understand, like most sources that try to be mathematical about this topic. That's what I mean by all over the place. Maybe we can incorporate the range clearly? From the intelligible but over-general first one through the rigorous but inscrutable second one? Dicklyon (talk) 20:16, 27 April 2013 (UTC)

OK, I read and compared some more books, and I see that my concept was really wrong. I'll try to fix it now that I see better what they're saying... Dicklyon (talk) 01:00, 28 April 2013 (UTC)

Done. Comments? Dicklyon (talk) 04:58, 28 April 2013 (UTC)

Now indeed the concept is corrected, and I have only smaller remarks about formulations.
"A random variable is defined on a set of possible outcomes (the sample space Ω) and a probability distribution that associates each outcome with a probability." — I fail to parse it. "...defined on a set...and a...distribution..."? Defined (also) ON a distribution? Probably you mean something like this: "...defined on a set...endowed with a probability measure (the set and the measure form a probability space)..." But I am afraid, it is too much for a single phrase, to introduce the two ideas (both "random variable" and "probability space") simultaneously. As a result it is again unclear, when the word "outcome" stands for a point of the sample space (=a point of the probability space), and when for a real number.
"a random variable...associates each set of possible outcomes with a number" — no, a random variable itself is about points, not sets. (A point of Omega to a point of R.) Its distribution is about sets. Also the probability measure on Omega is.
Boris Tsirelson (talk) 06:05, 28 April 2013 (UTC)
Oops. Both of those comments are on a fragment that I meant to remove. See if it's OK now without it. Dicklyon (talk) 06:14, 28 April 2013 (UTC)
"derivable from a probability space describing the sample space" — I'd say, "derivable from a probability measure that turns the sample space into a probability space".
"the probability of which is the same probability as the random variable being in the range of real numbers" — it may seem to be a coincidence of two predefined probabilities. Rather, "the random variable being in the range of real numbers" IS that event (it is a natural-language form of \{\omega:a<X(\omega)<b\} usually abbreviated to just a<X<b), and one DEFINES P_X((a,b)) as P(\{\omega:a<X(\omega)<b\}). Here P is the prob. measure on Omega, and P_X is the distribution of X.
Boris Tsirelson (talk)
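
The definition P_X((a,b)) = P({ω : a<X(ω)<b}) can be sketched on a finite example. This assumes a hypothetical fair die whose sample points are deliberately encoded as labels rather than numbers; `omega`, `P`, `X`, `event` and `P_X` are illustrative names only:

```python
from fractions import Fraction

# Hypothetical sample space: six states of a fair die, encoded as labels, not numbers.
omega = ["face_1", "face_2", "face_3", "face_4", "face_5", "face_6"]

# Probability measure P on omega (uniform, by assumption).
P = {w: Fraction(1, 6) for w in omega}

def X(w):
    """Random variable: maps a sample point to the number shown."""
    return int(w.split("_")[1])

def event(a, b):
    """The event {w : a < X(w) < b}, a subset of the sample space."""
    return {w for w in omega if a < X(w) < b}

def P_X(a, b):
    """Distribution of X on the interval (a, b), defined through P."""
    return sum(P[w] for w in event(a, b))

print(sorted(event(1, 4)), P_X(1, 4))
```

Here P lives on the sample space, while P_X lives on the reals and is derived from P and X, not given independently.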
"Furthermore, the notion of a "range of values" here must be generalizable to the non-pathological subset of reals known as Borel sets." — Yes; but this is not a condition imposed on "functions for defining random variables"; it is rather an implication of (a) the definition of Borel sets and (b) the definition of a probability space.
"take on values that vary continuously within one or more real intervals" — I'd say, that fill one or more real intervals; otherwise the reader may think that it is about something that is changing in time, continuously.
Boris Tsirelson (talk)
"(corresponding to an uncountable infinity of events, subsets of Ω)" — these possible values correspond to points of Ω, not subsets. For a COUNTABLY infinite Ω we already have uncountable infinity of events, subsets of Ω, but this is not the point.
Boris Tsirelson (talk)
"In examples such as these, the sample space (the set of all possible persons) is often suppressed, since it is mathematically hard to describe, and the possible values of the random variables are then treated as a sample space." Hard to describe? Rather, since it is irrelevant AFTER we have at hand the joint distribution of all relevant random variables (including indicators of all relevant events, if any). (In particular, if only one random variable is relevant, and we have its distribution.) And indeed, at this point we may use R^n (where n is the number of the relevant random variables) as the sample space, and the joint distribution as the probability measure.
Boris Tsirelson (talk)
"it is easier to track their relationship if..." — quite an understatement. It is impossible to even think about their relationship unless...come from the same random person.
Boris Tsirelson (talk) 07:09, 28 April 2013 (UTC)
That all sounds good, but since you understand it much better than I do, maybe you should take the next crack at it. It's hard for me to walk the fine line of being mathematically correct and still intelligible when I don't quite follow the deep math. Dicklyon (talk) 17:16, 28 April 2013 (UTC)
The problem is that I surely would leave the fine line to the math side. Boris Tsirelson (talk) 17:34, 28 April 2013 (UTC)

Introductory sentence[edit]

I would prefer as introductory sentence: In probability and statistics, a random variable or stochastic variable is a variable whose observed value is dependent on chance. Nijdam (talk) 17:02, 27 April 2013 (UTC)

Continuous[edit]

Somewhere above there has been a discussion about when a r.v is called continuous. I always called a r.v. continuous when its cdf is absolutely continuous, i.e. the rv has a density. Which means that besides discrete and continuous rv's there are also mixtures.Nijdam (talk) 11:52, 28 April 2013 (UTC)

Well, then also singular (continuous cdf with no abs.cont. part), and then also mixtures, of (most generally) three components. But these are technicalities, not for beginners, I'd say. Boris Tsirelson (talk) 13:28, 28 April 2013 (UTC)

Yes, I was confused there, too, thinking that continuous included mixtures. Looking at more books, I find what Tsirel points out: there are continuous and absolutely continuous. I'm not sure what to do about the "singular" ones, so I have barely alluded to their existence. Can anything useful and comprehensible be added, or is that just for mathematically sophisticated readers? Dicklyon (talk) 17:12, 28 April 2013 (UTC)

See "Singular distribution" (and the links therefrom). I can only add that the distribution of the sum of the random series \sum_{n=1}^\infty \pm\frac1{a^n} is singular for some a and absolutely continuous for other a. Boris Tsirelson (talk) 17:41, 28 April 2013 (UTC)
But wait, I can say more!
Singular distributions on the line are rather exotic because there is no integer between 0 and 1. In contrast, singular distributions on the plane are not exotic, since there is an integer between 0 and 2. An example: the uniform distribution on the circle x^2+y^2=1. It is evidently atomless, but it is concentrated on a set of zero area; thus, it cannot have a 2-dim density. You can easily find more such examples, for instance the joint distribution of two functionally related (absolutely) continuous random variables. In fact, Gaussian measures (allowed to degenerate) may be atomic, singular, and abs. cont. See also Multinormal_distribution#Degenerate_case. Boris Tsirelson (talk) 18:25, 28 April 2013 (UTC)
However, why here? All that would go to the "Probability distribution" article. Boris Tsirelson (talk) 19:00, 28 April 2013 (UTC)
OK, I made a minimal change linked to singular distribution. Maybe that's enough? Dicklyon (talk) 23:22, 28 April 2013 (UTC)

The quotation at the end of the lead was less than useless, more suitable for a zen koan than an encyclopedia article. I've removed it, and added a citation. -Bryanrutherford0 (talk) 14:28, 25 July 2013 (UTC)

Confusing sentence[edit]

From my point of view, it's somehow confusing when, in the very first paragraph it says:

" As opposed to other mathematical variables, a random variable conceptually does not have a single, fixed value (even if unknown); rather, it can take on a set of possible different values, each with an associated probability."

I don't know of any conceptual definition of the term "variable" that involves "a single, fixed value". Indeed, I would call that a constant (as in the expression "ax+b", where a and b are constants and x is the variable). From that, I don't see how this opposition of types of variables is correct or valid. Even if it is valid, to me it's not clear at all, and I think it would be even less clear for people with little or no mathematical background. 190.30.74.132 (talk) 03:55, 21 February 2014 (UTC)

I agree. I've changed this formulation. Boris Tsirelson (talk) 05:25, 21 February 2014 (UTC)