# Talk:Conjugate prior

WikiProject Statistics (Rated Start-class, Low-importance)

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated Start-class, Low-importance; Field: Probability and statistics)

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

You may want to see some of the pages on Empirical Bayes, the Beta-Binomial Model, and Bayesian Linear Regression. Charlesmartin14 23:43, 19 October 2006 (UTC)

This article and http://en.wikipedia.org/wiki/Poisson_distribution#Bayesian_inference disagree about the hyperparameters of the posterior. —Preceding unsigned comment added by 98.202.187.2 (talk) 00:57, 12 May 2009 (UTC)

This was my comment last night, sorry I didn't sign it, I've just corrected this on the page. Bazugb07 (talk) 14:32, 12 May 2009 (UTC)

Could someone fill in the table for the multivariate normal and Pareto distributions? 128.114.60.100 06:21, 21 February 2007 (UTC)

It would be nice to state which parameters mean what, since the naming in the table does not correspond to the naming on the pages for the corresponding distributions. (At the moment I have trouble figuring out which of the hyperparameters of the prior for the normal with unknown mean and variance belong to the inverse gamma distribution and which to the normal.) —Preceding unsigned comment added by 129.26.160.2 (talk) 10:44, 14 September 2007 (UTC)

For the Gamma likelihood with a prior over the rate parameter, the posterior hyperparameters are $\alpha_0+n\alpha,\ \beta_0+\sum_{i=1}^n x_i$ for any $n$. This is in the Fink reference. Paulpeeling (talk) 11:53, 24 May 2008 (UTC)
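For anyone who wants to check this update rule numerically, here is a small Python/SciPy sketch (illustrative numbers of my own choosing; note that SciPy's gamma distribution takes scale = 1/rate). It renormalises prior × likelihood on a grid and compares the result with the claimed closed form Gamma(α₀ + nα, β₀ + Σxᵢ):

```python
import numpy as np
from scipy import stats

# Gamma(shape=a, rate=r) data with known shape a; Gamma(a0, rate=b0)
# prior on the unknown rate r.  All numbers are illustrative.
a, a0, b0 = 2.0, 3.0, 1.5
rng = np.random.default_rng(0)
x = rng.gamma(shape=a, scale=0.5, size=20)  # true rate 2.0
n, s = len(x), x.sum()

# Unnormalised log-posterior on a grid of candidate rates; the Gamma
# likelihood contributes n*a*log(r) - r*sum(x) up to a constant.
grid = np.linspace(1e-3, 10, 4000)
log_post = stats.gamma.logpdf(grid, a0, scale=1.0 / b0) + n * a * np.log(grid) - grid * s
post = np.exp(log_post - log_post.max())
post /= post.sum() * (grid[1] - grid[0])  # renormalise numerically

# Claimed closed form: Gamma(a0 + n*a, rate = b0 + sum(x)).
closed = stats.gamma.pdf(grid, a0 + n * a, scale=1.0 / (b0 + s))
max_err = np.abs(post - closed).max()
```

With these numbers the grid posterior and the closed form agree to within numerical error, consistent with the Fink reference.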

May want to consider splitting the tables into scalar and multivariate conjugate distributions.

Changed "assuming dependence" (under normal with no known parameters) to "assuming exchangeability". "Dependence" is wrong; "independence" is better, but since technically that should be "independence, conditional on parameters", I replaced it with the usual "exchangeability" for brevity. 128.59.111.72 (talk) 00:48, 18 October 2008 (UTC)

Wasn't the "dependence" referring to dependence among the parameters, not the data? --128.187.80.2 (talk) 23:00, 30 March 2009 (UTC)

## Family of distributions

How does one tell whether two distributions are conjugate priors? What distinguishes "families"?

## Incorrect posterior parameters

Has anyone else noticed the posterior parameters are wrong? At least according to DeGroot (1970), the multivariate normal posterior in terms of precision is listed incorrectly: it should be what the table lists for the multivariate normal in terms of the covariance matrix. I don't have time to make these changes right now or to check the other posterior parameters for accuracy, but someone needs to double-check these tables. Maybe I'll do it when I'm not so busy. Also, the Fink (1995) article disagrees with DeGroot on a number of points, so I question its legitimacy, given that the latter is published work and the former is an ongoing report. Maybe it should be removed as a source? DeverLite (talk) 23:22, 8 January 2010 (UTC)

I just implemented the multivariate Gaussian with the Normal-Wishart conjugate prior according to the article and found that the posterior does not integrate to one. I corrected the posterior distribution in that case, but the others probably also need to be checked. — Preceding unsigned comment added by 169.229.222.176 (talk) 01:55, 20 August 2012 (UTC)

To prevent confusion, it should be made clear that the Student's t distribution specified as the posterior for the multivariate normal cases is a multivariate Student's t distribution parametrized by a precision matrix, not by covariance as in the Wikipedia article on the multivariate Student's t distribution. — Preceding unsigned comment added by 169.229.222.176 (talk) 02:03, 20 August 2012 (UTC)

I was just looking at it and it looked wrong to me. The precision-based posterior parameters should have no inversions (as can be seen in the univariate case, for example). I can fix that according to DeGroot's formulation. --Olethros (talk) 15:35, 14 September 2010 (UTC)

## Marginal distributions

I think it would be useful to augment the tables with the "marginal distribution" as well. The drawback here is the tables will widen, and they are already pretty dense. Thoughts? --128.187.80.2 (talk) 23:00, 30 March 2009 (UTC)

I am not clear what you mean by marginal distribution here... if it is what I first thought (the marginal distribution of the observations), then these marginals might find a better and more useful place in an article on compound distributions. Or is it the marginal distribution of new observations conditional on the existing observations, marginalised over the parameters (i.e. predictive distributions)? Melcombe (talk) 08:59, 31 March 2009 (UTC)
I was referring to the marginal distribution of the observations (not the predictive distribution). I often use this page as a reference guide (much simpler than pulling out my copy of Gelman et al.), and at times I have wanted to know the marginal distribution of the data. Granted, many books don't include this information, but it would be useful. As an example, in the Poisson-Gamma model, $\mathbf{x} \sim \mathrm{NegBin}\left( \alpha, \frac{\beta}{1+\beta} \right)$ (when the gamma is parameterized by rate). This information is largely contained in Negative binomial#Gamma-Poisson_mixture, but that article does not specifically mention that this is the marginal distribution of the data in the Bayesian setting. Plus, it would be more convenient to have the information in one place. Your proposal to put it on a dedicated page may be a reasonable compromise, since the tables are already large and this information is used much less frequently. --128.187.80.2 (talk) 17:27, 1 April 2009 (UTC)
I thought that giving such marginal distributions would be unusual in a Bayesian context, but I see that Bernardo & Smith do include them in the table in their book... but they do this by having a separate list of results for each distribution/model, which would be a drastic rearrangement of what is here. An article on compound distributions does seem to be needed for its own sake. Melcombe (talk) 13:21, 2 April 2009 (UTC)
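The NegBin claim above is easy to verify numerically. A Python/SciPy sketch with arbitrary hyperparameters (note SciPy's conventions: nbinom.pmf(k; n, p) = C(k+n-1, k) p^n (1-p)^k, and gamma takes scale = 1/rate):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Gamma(alpha, rate=beta) prior on the Poisson mean; illustrative values.
alpha, beta = 3.0, 2.0

def marginal_pmf(k):
    """Marginal of the data: Poisson likelihood integrated against the prior."""
    val, _ = quad(lambda lam: stats.poisson.pmf(k, lam)
                  * stats.gamma.pdf(lam, alpha, scale=1.0 / beta), 0, np.inf)
    return val

# Claimed closed form: NegBin(alpha, beta / (1 + beta)).
p = beta / (1.0 + beta)
errs = [abs(marginal_pmf(k) - stats.nbinom.pmf(k, alpha, p)) for k in range(10)]
max_err = max(errs)
```

The numerically integrated mixture and the negative binomial pmf agree to quadrature accuracy, which is the Gamma-Poisson mixture result from the linked article restated as the Bayesian marginal of the data.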

## Poisson-Gamma

It keeps getting changed back and forth, but I have the hyperparameters as: $\alpha + n,\ \beta + \sum_{i=1}^n x_i$

-- There is certainly a problem as it currently stands. The Wikipedia page on the gamma distribution explicitly gives both forms. In particular, $k = \alpha$, $\beta = 1/\theta$. Hence the update rules must be consistent with this notation. I have corrected this for now. —Preceding unsigned comment added by 129.215.197.80 (talk) 15:23, 20 January 2010 (UTC)

Please add discussion if this is incorrect before changing it! —Preceding unsigned comment added by Occawen (talkcontribs) 05:06, 6 December 2009 (UTC)
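One way to settle the back-and-forth is a direct numerical check. A sketch with made-up counts, using the rate convention (β = 1/θ) discussed above: renormalise prior × Poisson likelihood on a grid and compare it with Gamma(α + Σxᵢ, β + n), i.e. the data sum added to the shape and the sample size added to the rate:

```python
import numpy as np
from scipy import stats

# Poisson counts with a Gamma(alpha, rate=beta) prior on the mean lambda.
# Counts and hyperparameters are illustrative.
alpha, beta = 2.0, 1.0
x = np.array([3, 0, 2, 5, 1])
n, s = len(x), x.sum()

# Unnormalised log-posterior on a grid; the Poisson likelihood
# contributes sum(x)*log(lam) - n*lam up to a constant.
grid = np.linspace(1e-3, 15, 5000)
log_post = stats.gamma.logpdf(grid, alpha, scale=1.0 / beta) + s * np.log(grid) - n * grid
post = np.exp(log_post - log_post.max())
post /= post.sum() * (grid[1] - grid[0])  # renormalise numerically

# Rate convention: posterior should be Gamma(alpha + sum(x), rate = beta + n).
closed = stats.gamma.pdf(grid, alpha + s, scale=1.0 / (beta + n))
max_err = np.abs(post - closed).max()
```

With these numbers the grid posterior matches Gamma(α + Σx, β + n); swapping the two update terms gives a visibly different density, so whichever convention the table adopts, it must match the parameterisation stated on the gamma distribution page.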

## Most unintelligible article on Wikipedia

Just a cheeky comment to say that this is the hardest article to understand of all those I've read so far. It assumes a lot of background knowledge of statistics. Maybe a real-world analogy or example would help clarify what a conjugate prior is. Abstractions are valuable but people need concrete examples if they want to jump in half-way through the course. I'm really keen to understand the relationship between the beta distribution and the binomial distribution, but this article (and the ones it links to) just leave me befuddled. 111.69.251.147 (talk) 00:39, 21 June 2010 (UTC)
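For the beta-binomial relationship specifically, the whole idea fits in a few lines of Python (a toy sketch; the numbers are mine): put a Beta(a, b) prior on a coin's heads-probability, observe s heads in n flips, and the renormalised prior × likelihood lands exactly on Beta(a + s, b + n - s). That closure under updating - the posterior staying in the same family as the prior - is what "conjugate" means.

```python
import numpy as np
from scipy import stats

# Beta(a, b) prior on a coin's heads-probability theta; observe s heads
# in n flips.  Conjugacy says the posterior is Beta(a + s, b + n - s).
a, b = 2.0, 2.0
n, s = 10, 7

grid = np.linspace(1e-4, 1 - 1e-4, 2000)
unnorm = stats.beta.pdf(grid, a, b) * stats.binom.pmf(s, n, grid)
post = unnorm / (unnorm.sum() * (grid[1] - grid[0]))  # renormalise numerically

closed = stats.beta.pdf(grid, a + s, b + n - s)
max_err = np.abs(post - closed).max()
```

So the prior's hyperparameters act like "pseudo-counts" of heads and tails that the data simply adds to, which is the intuition the article could lead with.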

### Another less cheeky comment.

The paragraphs from the start through to the contents - great. The rest - incomprehensible. I have no doubt that if you already know the content it is probably superb, but I saw a long trail of introduced jargon going seemingly in no particular direction. I was looking for a "what" and some "why do this", but I did not find it here. Many thanks for the opening paragraphs. Yes, I may be asking you to be the first ever to actually explain Bayesian (conjugate) priors in an intuitive way. [not logged in] — Preceding unsigned comment added by 131.203.13.81 (talk) 20:13, 1 August 2011 (UTC)

So, working through the example (thanks for including one - it is my only hope of working out what it all means): "If we sample this random ..." - the f here is not the f of a few lines above; the x corresponds to the "s,f" counts; and the value q is the θ from a few lines above. I'm rewriting it on my own page just to get the example clear. — Preceding unsigned comment added by 131.203.13.81 (talk) 03:12, 10 August 2011 (UTC)

wat

Simple English sans maths in the intro would be great. —Preceding unsigned comment added by 78.101.145.17 (talk) 14:48, 24 March 2011 (UTC)

The external link is broken. Should I remove it? — Preceding unsigned comment added by 163.1.211.163 (talk) 17:38, 12 December 2011 (UTC)

## Wrong posterior

Some of the posteriors are wrong. I just discovered one:

For "Normal with known precision τ", prior on μ (mean): the posterior variance is $(\tau_0 + n\tau)^{-1}$. — Preceding unsigned comment added by 173.19.34.157 (talk) 04:09, 15 May 2012 (UTC)
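A grid check agrees with this (a sketch with arbitrary numbers; here τ is the known data precision and τ₀ the prior precision on μ):

```python
import numpy as np
from scipy import stats

# Normal data with known precision tau; Normal(mu0, variance 1/tau0)
# prior on the mean mu.  Numbers are illustrative.
tau, mu0, tau0 = 2.0, 0.0, 1.0
x = np.array([0.5, 1.2, -0.3, 0.8])
n = len(x)

grid = np.linspace(-4, 4, 4000)
log_like = -0.5 * tau * ((x[:, None] - grid) ** 2).sum(axis=0)  # up to a constant
log_post = stats.norm.logpdf(grid, mu0, 1.0 / np.sqrt(tau0)) + log_like
post = np.exp(log_post - log_post.max())
post /= post.sum() * (grid[1] - grid[0])  # renormalise numerically

# Claimed posterior: precision tau0 + n*tau, with precision-weighted mean.
tau_n = tau0 + n * tau
mu_n = (tau0 * mu0 + tau * x.sum()) / tau_n
closed = stats.norm.pdf(grid, mu_n, 1.0 / np.sqrt(tau_n))
max_err = np.abs(post - closed).max()
```

So the posterior precision is indeed the sum τ₀ + nτ, and the posterior variance its inverse, matching the comment above.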

## That Table

Yeah... That table, while informative, is not formatted very well. It wasn't clear at first what the Posterior Hyperparameters column represented, or what any of the variables meant in the Posterior Predictive column. — Preceding unsigned comment added by 129.93.5.131 (talk) 05:00, 10 December 2013 (UTC)