Wikipedia talk:WikiProject Probability

/Archive1 02:23, 18 August 2005 (UTC)

Specification

This section includes the discussion of the specification to help authors write the entries on the distributions while maintaining as much harmony as possible between the different distributions. The actual specification will be presented on the main page.

Specifications of Standard Usage

Capitalization

We should decide on whether to capitalize the names of the distributions when we refer to them in passing, e.g., do we talk about the Gamma distribution or the gamma distribution? I have been using capitalization because it seems like a proper name. Acuster 07:54, 19 August 2005 (UTC)

We generally have to go with what's most common in the relevant literature. If there are several equally common alternatives, I'm generally in favor of preserving the case of any conventional piece of notation that the distribution may be named for. I personally write "Gamma distribution", and "Beta distribution", but "chi-squared distribution" and "zeta distribution", because the first two involve upper-case Greek letters and the last two lower-case Greek letters. For other distributions it's pretty clear: "F-distribution", "t-distribution"; "Cauchy distribution", "Wishart distribution" (based on proper names); "binomial distribution", "exponential distribution" (not based on proper names). I tend to hesitate when it comes to "normal distribution" vs. "Normal distribution". --MarkSweep 08:55, 19 August 2005 (UTC)
Yes - see talk pages on Gamma distribution and chi-squared distribution. PAR 09:14, 19 August 2005 (UTC)

Inline math

A question: how do we add inline math so it's elegant? I've tried adding it with 'math' tags but it looks goofy: e.g. \gamma(). Acuster 08:01, 19 August 2005 (UTC)

That's a complex issue. I'd say what you did was perfectly fine for inline math. If you prefer TeX's Computer Modern typeface for all math formulas, you can switch on mandatory PNG rendering in your user preferences. --MarkSweep 08:57, 19 August 2005 (UTC)
We had a discussion on this, but I can't remember the page. (MarkSweep, do you remember the page?) I've tried to summarize the results in the section on inline math. PAR 09:14, 19 August 2005 (UTC)

Specification of a Standard Layout

Prototype Layouts and Contents for Reference

I'd suggest using the Normal distribution as our prototype for the continuous distributions, because it's currently the most complete and elegant page and possibly the most important distribution overall; certainly it has the most text, both here and on MathWorld. Acuster 07:57, 19 August 2005 (UTC)

I'd additionally recommend the article on the exponential distribution, since it includes a discussion of Bayesian estimation. A discussion of (semi-)conjugate priors for the normal mean and variance/precision is currently missing from the article on the normal distribution. --MarkSweep 08:45, 19 August 2005 (UTC)

Working Layout

This section should be a working space for hashing out ideas on the layout, and the specification on the previous page can be used when we have consensus. Acuster 07:05, 20 August 2005 (UTC)

I think that the talk page is for carrying on a discussion and the project page is for laying out an editable list of specifications - MarkSweep, is that right? PAR 12:20, 20 August 2005 (UTC)
It doesn't matter much to me: We can have a discussion about the specification first, and then move it to the project page later. --MarkSweep 21:06, 20 August 2005 (UTC)
== Overview ==

== ?Examples? ==
  A section to provide the non-technical public with examples of the distribution 
  and its use. Would such a section be useful and possible to create?

== History ==

== Specification of the normal distribution ==
=== Probability density function ===
=== Cumulative distribution function ===

=== Generating functions ===
==== Moment generating function ====
==== Characteristic function ==== 

== Properties ==
=== Moments ===
=== Generating normal random variables ===
=== The central limit theorem ===
=== Infinite divisibility ===
=== Standard deviation ===

== Related distributions ==
=== ?Generalizations of the distribution? ===
  This section would present and link to distributions which are more general 
  forms of the distribution presented on the page. For the exponential, this 
  would include the Erlang and Gamma. For the Gamma this would include the 
  several "generalized gamma".

== Occurrence ==
=== E.g. 1 ===
=== E.g. 2 ===

== Estimation of parameters ==
=== Maximum likelihood estimation of parameters ===
=== Unbiased estimation of parameters ===
=== Bayesian estimation of parameters ===

== See also ==

== References ==

== External links ==

We need to make sure that we give enough background in the lead paragraphs and in the first couple of sections. I seem to recall that "Overview" sections are deprecated and that a summary and/or high-level overview should go into the lead paragraph (before the first section heading). --MarkSweep 21:06, 20 August 2005 (UTC)

Specification of the distribution: Notation for discrete PMFs

As I see it, there are basically three choices of notation for discrete probability mass functions. Consider a one-parameter family like the Zeta distribution, whose parameter is called s. We try to use k as the main argument:

  1. Vector notation (advocated initially by PAR) would write the probability as f_k(s) with (\forall s) \sum_k f_k(s)=1. I call this "vector notation" because f(s) can be thought of as a column vector (or even a stochastic matrix), and k indexes the kth component of that vector.
  2. Unary function notation (advocated, apparently, by Michael Hardy) would write the probability as f_s(k) with (\forall s) \sum_k f_s(k)=1. There are plenty of precedents for writing function parameters as subscripts, but the disadvantage is that this may become hard to read with several parameters.
  3. General function notation (advocated by yours truly) would write the probability as f(k\mid s) with (\forall s) \sum_k f(k\mid s)=1. This is exactly the same as the conventions currently used for continuous distributions, and it has the advantage that several parameters are easily accommodated.

The problem with options 1 and 2 is that they are easy to confuse. Option 3 is unambiguous. --MarkSweep 17:59, 18 August 2005 (UTC)
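As a rough illustration of option 3 (not part of the original discussion), the zeta distribution's mass function can be written as f(k | s) = k^(-s) / ζ(s) for k = 1, 2, ... and s > 1, and the normalization condition \sum_k f(k\mid s)=1 checked numerically. The helper names `zeta_normalizer` and `zeta_pmf` are made up for this sketch, and the normalizer is only a truncated-sum approximation of ζ(s).

```python
import math


def zeta_normalizer(s, terms=10_000):
    """Approximate zeta(s) for s > 1: partial sum plus an integral tail estimate."""
    if s <= 1:
        raise ValueError("the zeta distribution requires s > 1")
    partial = sum(k ** -s for k in range(1, terms + 1))
    # Approximate the remaining terms by the integral of x**-s from terms + 0.5 to infinity.
    tail = (terms + 0.5) ** (1 - s) / (s - 1)
    return partial + tail


def zeta_pmf(k, s, norm):
    """f(k | s): k is the argument, s the parameter, norm the precomputed zeta(s)."""
    return k ** -s / norm


s = 2.0
norm = zeta_normalizer(s)  # should be close to pi**2 / 6 for s = 2
# Summing over a finite range of k only approaches 1 from below.
total = sum(zeta_pmf(k, s, norm) for k in range(1, 10_001))
print(norm, math.pi ** 2 / 6, total)
```

Keeping k as the function argument and s as a trailing parameter mirrors the f(k | s) layout, so extra parameters would simply be appended to the argument list.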

Ok, after thinking about it, I like the third choice. Its similarity to the continuous notation is a plus, and rational number subscripts in the first choice worry me even though they are countable. The second is worse because it is hard to specify a particular instance using a real number as a subscript and they are not countable. PAR 18:32, 18 August 2005 (UTC)
I made these changes, just to keep up to date. PAR 09:15, 19 August 2005 (UTC)
I also prefer the third option with a caveat. I use pipes to denote conditionals and semi-colons to denote parameters. So N(x; \mu, \sigma) instead of N(x | \mu, \sigma) because it's entirely possible to have a conditional with parameters (such as N(x; \sigma | \mu) or N(x | \mu ; \sigma), though I only recall doing such for Bayesian stuff). Cburnett 20:33, August 19, 2005 (UTC)
I tend to do the opposite: The primary separator for me is the pipe, and I use semicolons pretty much like commas, except that they suggest some sort of grouping. I think the rationale for the pipe is that everything to its left jointly sums/integrates to unity. --MarkSweep 21:12, 20 August 2005 (UTC)
At first I had no preference between the semicolon and the bar, but I have looked it up in my main two books, "Data Reduction..." by Bevington and Robinson, and "Statistical Theory" by Lindgren. Both use the semicolon, Lindgren uses f(x;a,b,c), Bevington uses PB(x;a,b,c). The bottom line is, I now favor the semicolon with f and F or p and P. PAR 21:36, 20 August 2005 (UTC)
I don't have strong feelings either way when it comes to pipe vs. semicolon. I'd like to point out, though, that the pipe seems to be more widely used on Wikipedia already, so it would be less work to standardize on pipes. However, since you seem to be prepared to make all those changes, I won't stop you. ;-) If you do make any changes, maybe point to this discussion in your edit summary. Since these conventions and guidelines are all very recent, people may not be aware of them, or may want to voice an opinion, plus it's a good chance to get more editors involved. --MarkSweep 02:55, 26 August 2005 (UTC)

At this point, I just think some standard notation should be used. For the most part, any choice in notation is arbitrary. If it jibes with book A & B but not C & D, then so be it... Cburnett 22:10, August 31, 2005 (UTC)
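For reference, a sketch of how the separator conventions debated above would look for the normal density, assuming \sigma denotes the standard deviation. The density itself is standard; which separator to prefer is exactly what the thread leaves open.

```latex
% Semicolon convention: parameters follow the semicolon.
f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Pipe convention: the same function, written with a bar.
f(x\mid\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Mixed form: conditioning on \mu, with \sigma treated as a fixed parameter.
f(x\mid\mu;\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

In every form, the quantity to the left of the first separator is the one that integrates to unity.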


Science pearls

Hello, please notice this project. I hope that the List of publications in statistics will be adopted by the probability project. Thanks, APH 06:51, 13 September 2005 (UTC)

Glossary

I made a Glossary of probability and statistics. Any suggestions as to how to better integrate it into the other pages on this topic?

I'm also interested for the WikiProject:General Audience - having a glossary makes things easier to understand. Flammifer 15:43, 21 September 2005 (UTC)

Gambler's fallacy

  • A reader has sent an e-mail to the Wikimedia help desk raising concerns about this article and has left a message at the bottom of the discussion page explaining his reasons. I would be grateful if someone could have a look at it.

Capitalistroadster 09:37, 19 December 2005 (UTC)

The article is fine, the reader is confused. I answered in Talk:Gambler's fallacy. PAR 16:20, 19 December 2005 (UTC)

Articles for the Wikipedia 1.0 project

Hi, I'm a member of the Wikipedia:Version_1.0_Editorial_Team, which is looking to identify quality articles in Wikipedia for future publication on CD or paper. We recently began assessing using these criteria, and we are looking for A-class, B-class, and Good articles, with no POV or copyright problems. Can you recommend any suitable articles? Please post your suggestions here. Cheers, Shanel 20:36, 9 March 2006 (UTC)

as a user

As a nonspecialist who nonetheless uses bibliometric and other probability distributions, I suggest that "An important note is that WikiProject Probability is not concerned with writing articles for an audience who is trying to learn probability theory, except as is appropriate for encyclopedic articles. Writing content with a primarily pedagogical intent is the province of Wikibooks. However, in time WikiProject Probability may develop into a project to write a Wikibook on the subject" might be a little elitist, depending on the interpretation.

Some knowledge of probability occurs in many fields, and some reference to it in many articles. The articles have to explain what they are talking about in terms that are both comprehensible to the reader of that article, and also accurate within the limits of what those readers can comprehend. The example that led me here is Zipf's law, which does need an exact discussion, but also needs enough of a discussion that its relationship to the other bibliometric distributions can be understood. It is possible to write at such a level; I mention the papers of Stephen J Bensman in JASIST and elsewhere. These explanations could be written by two different groups of people: those like myself, who may get it wrong but know what the audience can hope to understand, or people like those on the project here. If they can write them for us to use, or write the first part of the article in a way we can use, I think it would be better than the alternative (to avoid conflict with the articles here, they'd have different names, as seems to be frequently done; see Bradford's law). I ask for advice and opinion. I think previous conventional encyclopedias have not done this very well, and we might hope for better. (My mental model is the parallel probability and statistics courses given at universities aimed at different audiences. In terms of the needs of my subject field, if the mathematicians cannot teach or write so the library students will understand, the librarians teach the subject themselves. I've been doing just that, and it makes me uncomfortable because I am aware I do not know enough.) DGG 08:30, 19 September 2006 (UTC)

Is this project still active?

Things seem kind of quiet around here, is anybody active? --Salix alba (talk) 23:03, 28 September 2006 (UTC)

Apparently not, for my request for assistance (above) was not replied to. DGG 02:10, 29 September 2006 (UTC)

Project directory

Hello. The WikiProject Council has recently updated the Wikipedia:WikiProject Council/Directory. This new directory includes a variety of categories and subcategories which will, with luck, potentially draw new members to the projects who are interested in those specific subjects. Please review the directory and make any changes to the entries for your project that you see fit. There is also a directory of portals, at User:B2T2/Portal, listing all the existing portals. Feel free to add any of them to the portals or comments section of your entries in the directory. The three columns regarding assessment, peer review, and collaboration are included in the directory for both the use of the projects themselves and for that of others. Having such departments will allow a project to more quickly and easily identify its most important articles and its articles in greatest need of improvement. If you have not already done so, please consider whether your project would benefit from having departments which deal in these matters. It is my hope that all the changes to the directory can be finished by the first of next month. Please feel free to make any changes you see fit to the entries for your project before then. If you should have any questions regarding this matter, please do not hesitate to contact me. Thank you. B2T2 00:21, 26 October 2006 (UTC)

Wikipedia Day Awards

Hello, all. It was initially my hope to try to have this done as part of Esperanza's proposal for an appreciation week to end on Wikipedia Day, January 15. However, several people have once again proposed the entirety of Esperanza for deletion, so that might not work. It was the intention of the Appreciation Week proposal to set aside a given time when the various individuals who have made significant, valuable contributions to the encyclopedia would be recognized and honored. I believe that, with some effort, this could still be done. My proposal is to, with luck, try to organize the various WikiProjects and other entities of Wikipedia to take part in a larger celebration of its contributors to take place in January, probably beginning January 15, 2007. I have created yet another new subpage for myself (a weakness of mine, I'm afraid) at User talk:Badbilltucker/Appreciation Week where I would greatly appreciate any indications from the members of this project as to whether and how they might be willing and/or able to assist in recognizing the contributions of our editors. Thank you for your attention. Badbilltucker 18:18, 30 December 2006 (UTC)

"Infinite monkey theorem in popular culture" nominated for deletion

Infinite monkey theorem in popular culture has been nominated for deletion. Please express opinions at Wikipedia:Articles for deletion/Infinite monkey theorem in popular culture. Michael Hardy 05:40, 6 August 2007 (UTC)

Please opine here

Infinite monkey theorem in popular culture was deleted in an irregular way when this WikiProject had not been notified of the proposal for deletion. Please express opinions here: Wikipedia:Deletion_review/Log/2007_August_6#Infinite_monkey_theorem_in_popular_culture. Michael Hardy 22:57, 6 August 2007 (UTC)

Will "infinite monkey theorem in popular culture" ever get restored

(See infinite monkey theorem and infinite monkey theorem in popular culture.)

The discussion is continuing, and it's been moved to Wikipedia:Deletion review/Infinite monkey theorem in popular culture. So far 16 favor "relisting" the article, which could result in restoring it, and 16 endorse its deletion.

The predominant argument for endorsing deletion, as nearly as I can tell, is that since Wikipedia's math community irresponsibly neglected to offer its assistance in the deliberations originally, it should be punished by being forbidden to help later. POV: Frankly, I think that argument disregards the actual intended purpose of the process, which is to serve the interest of improving Wikipedia. Punishing the math community for not helping originally, by forbidding it to help later, is being made the purpose of the process instead. Michael Hardy 05:04, 10 August 2007 (UTC)

infinite monkey process continues....

Wikipedia:Articles for deletion/Infinite monkey theorem in popular culture (second nomination)

The long process continues. Now it is necessary for everyone who has an opinion on whether the article should be kept, to post their views at Wikipedia:Articles for deletion/Infinite monkey theorem in popular culture (second nomination). Click on that link and write either

  • Keep for this reason and this reason and this reason...
or
  • Delete for this reason and this reason and this reason...
Michael Hardy 04:55, 11 August 2007 (UTC)

Update: By my quick count, 17 "keep", 25 "delete" so far, plus various gradations such as "trim" or "merge" or "trim and merge", etc. Anyone with an opinion should speak up now before this closes. Michael Hardy 22:43, 15 August 2007 (UTC)

Hazard ratio

Could a statistician clarify hazard ratio a bit, or at least put in an example? I'm coming across it in medical studies and the article here is too mathematically dense to parse properly - it delves into limit equations without taking the time to explain what the ratio itself actually means. --Firien need help? 12:09, 1 November 2007 (UTC)

Revised navigational template

After some discussion at Template talk:ProbDistributions#Too large I've drafted a revised navigational template to replace {{ProbDistributions}}, which has grown too large. For the moment the draft version is at User:Qwfp/tempprobdist. To keep all the conversation in one place, please post comments at Template talk:ProbDistributions not here. Thanks, Qwfp (talk) 15:33, 24 February 2008 (UTC).

Inline "math" tags?

I was surprised to read this:

  • In-line math - should be done with <math> tags.

I disagree. I'll be back to argue the point later. I think this sort of thing should fit in with WP:MOSMATH. Michael Hardy (talk) 16:08, 8 January 2009 (UTC)

Statistics portal at Featured portal candidates

Portal:Statistics is being considered for featured quality status, at the Featured portal candidates process. Comments would be appreciated at Wikipedia:Featured portal candidates/Portal:Statistics. —G716 <T·C> 01:28, 9 June 2009 (UTC)

Bhatia–Davis inequality

I've just written a short article titled Bhatia–Davis inequality. It could use work both on itself and on links to it from other articles. Michael Hardy (talk) 18:12, 18 August 2009 (UTC)

Attempt to re-activate

I have followed the advice given for attempting to reactivate a project, by reducing the old content of the project page. I have also copied across some of the useful content from the Statistics project page. There was (and still is, but now on a subpage) some useful material on structuring articles on probability distributions. Would this be better elsewhere? I hope others will join in this effort to improve the Probability project. Melcombe (talk) 15:47, 24 June 2010 (UTC)

Renaming discussion regarding article Copula (statistics)

The proposed renaming being discussed at Talk:Copula (statistics)#Requested move may be of interest to members of this WikiProject. Favonian (talk) 08:21, 2 August 2011 (UTC)