Talk:Bayesian linear regression

WikiProject Statistics (Rated C-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as High-importance on the importance scale.
WikiProject Mathematics (Rated C-class, Low-importance)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mathematics rating: C-Class, Low-importance. Field: Probability and statistics.

Bayes or Empirical Bayes?

This isn't a description of Bayesian linear regression. It describes the Empirical Bayes approach to linear regression rather than a fully Bayesian approach. (Empirical Bayes methods peek at the data to ease the computational burden.) It also assumes a natural conjugate prior, which is reasonable in many cases but is a serious constraint. The fully Bayesian approach with non-conjugate priors is nowadays tractable in all but the largest models through the use of Markov chain Monte Carlo techniques. In my view, this article represents a particular and rather dated approach. Blaise 21:53, 10 April 2007 (UTC)

Can you fix it and check my rating? Thanks - Geometry guy 14:19, 13 May 2007 (UTC)
I don't agree with the first comment: this is a full Bayesian approach, as opposed to a maximum likelihood approach. It's true that a conjugate prior is limiting, but it is also an enabler for application to very high-dimensional input spaces. This same approach can also be extended into a kind of adaptive-variance Kalman filter in cases where the model parameters are expected to change through time. In short, I think the page is very valuable, but I would also like to see how to derive the matrix A.
Malc S, January 2009.
— Preceding unsigned comment added 12:02, 30 January 2009
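
As an illustration of the point raised at the top of this thread, here is a minimal sketch of a fully Bayesian fit with a non-conjugate prior using a random-walk Metropolis sampler. Everything specific in it is an assumption made for illustration: a single slope with no intercept, a known noise standard deviation, a Laplace prior on the slope, and toy data. It is not the method described in the article.

```python
import math
import random


def log_posterior(beta, xs, ys, noise_sd=1.0, prior_scale=1.0):
    """Unnormalised log posterior of a single slope (illustrative model).

    Gaussian likelihood with known noise_sd, Laplace (non-conjugate) prior.
    """
    sq_err = sum((y - beta * x) ** 2 for x, y in zip(xs, ys))
    log_lik = -sq_err / (2.0 * noise_sd ** 2)
    log_prior = -abs(beta) / prior_scale
    return log_lik + log_prior


def metropolis(xs, ys, n_samples=5000, step=0.2, seed=0):
    """Random-walk Metropolis sampler for the slope."""
    rng = random.Random(seed)
    beta = 0.0
    current = log_posterior(beta, xs, ys)
    samples = []
    for _ in range(n_samples):
        proposal = beta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
        samples.append(beta)
    return samples


# Toy data, made up for this sketch.
xs = [0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]

samples = metropolis(xs, ys)
kept = samples[1000:]  # discard an (arbitrary) burn-in period
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

The step size and burn-in length here are arbitrary choices; a real analysis would tune them and check convergence.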

Some questions

I've got some questions about this article. What this method gives you is a weighted combination of some prior slope and a new slope estimate from new data.

Q1) The weights are determined by A. In this one-dimensional case I presume this is just a variance. There are no details as to how A would be calculated or estimated. Does anyone know?

Q2) I could set this problem up using classical statistics, I think. I'd say "let's make a prediction based on a weighted combination of the prior slope and the new slope". Then I'd do some algebra to derive an expression for the weight. Does anyone have any idea whether the final answer would be much, if any, different?

thanks 20:09, 14 October 2007 (UTC)
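
The weighted combination described in these questions can be sketched for the one-dimensional case. This is a minimal illustration under stated assumptions, not necessarily the article's derivation: a single slope with no intercept, a Gaussian prior on the slope, and a known noise variance. Under those assumptions the weights are precisions (inverse variances); the function and variable names are hypothetical.

```python
def posterior_slope(prior_mean, prior_var, xs, ys, noise_var):
    """Posterior mean and variance of a slope under a Gaussian prior.

    Assumes y = slope * x + noise, with known noise variance (illustrative).
    """
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    ols_slope = sxy / sxx                # slope estimate from the new data

    prior_precision = 1.0 / prior_var    # how much the prior is trusted
    data_precision = sxx / noise_var     # how much the data are trusted
    total_precision = prior_precision + data_precision

    weight_on_data = data_precision / total_precision
    post_mean = (1.0 - weight_on_data) * prior_mean + weight_on_data * ols_slope
    post_var = 1.0 / total_precision
    return post_mean, post_var, weight_on_data


# Toy example: the data lie exactly on slope 2, the prior is centred on 0.
mean, var, w = posterior_slope(prior_mean=0.0, prior_var=1.0,
                               xs=[1.0, 2.0, 3.0], ys=[2.0, 4.0, 6.0],
                               noise_var=1.0)
print(mean, var, w)
```

In this sketch a very large prior variance pushes the weight on the data towards 1, so the posterior mean approaches the ordinary least-squares slope.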


Maybe a section on computation would be helpful. When the amount of data is large, direct computation can be difficult. — Preceding unsigned comment added 14:00, 7 August 2011

I would suggest that the computations be kept separate from the analytic results using the conjugate prior. I'm doing research, and I come to this page, and all I want is the posterior. It is there, but it is buried in computations. I don't care about the computations; I already know how to do them. I'm not saying get rid of the computations, but not having the posterior, posterior mean, etc., readily available makes this article less useful. -- (talk) 19:08, 9 September 2012 (UTC)

Suggested merge

I've added a suggested merge tag, suggesting the page be merged into ridge regression, since this is essentially a Bayesian interpretation of ridge regression, which has exactly the same mathematics. Jheald (talk) 17:39, 11 November 2012 (UTC)
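
The correspondence behind this suggestion can be checked numerically. The sketch below assumes a zero-mean isotropic Gaussian prior on the coefficients and a known noise variance, in which case the posterior mean coincides with the ridge estimate when the penalty equals noise_var / prior_var. The two-coefficient setup, the helper names, and the toy data are assumptions for illustration; for simplicity the intercept column is penalised as well, which ridge implementations often avoid.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return [(a22 * b[0] - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]


def gram(rows, ys):
    """Return X^T X and X^T y for a two-column design matrix."""
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(2)] for i in range(2)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(2)]
    return xtx, xty


def ridge(rows, ys, lam):
    """Ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    xtx, xty = gram(rows, ys)
    a = [[xtx[i][j] + (lam if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    return solve_2x2(a, xty)


def posterior_mean(rows, ys, noise_var, prior_var):
    """Posterior mean under a N(0, prior_var * I) prior, known noise_var."""
    xtx, xty = gram(rows, ys)
    a = [[xtx[i][j] / noise_var + (1.0 / prior_var if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    b = [v / noise_var for v in xty]
    return solve_2x2(a, b)


# Toy data: first column is a constant, second is the predictor.
rows = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.0]]
ys = [1.2, 2.9, 5.1, 6.2]
noise_var, prior_var = 0.5, 2.0

beta_ridge = ridge(rows, ys, lam=noise_var / prior_var)
beta_bayes = posterior_mean(rows, ys, noise_var, prior_var)
print(beta_ridge, beta_bayes)
```

The agreement shown here concerns only the point estimate; the Bayesian treatment additionally yields a posterior covariance, which this sketch does not compute.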