# Talk:Scaled inverse chi-squared distribution

WikiProject Statistics (Rated C-class, Mid-importance)

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

WikiProject Mathematics (Rated C-class, Low-importance; Field: Probability and statistics)

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

## Mode

Shouldn't the mode be $\frac{\nu \sigma^2}{\nu-2}$, because $\chi_{\nu}^2(x)$ is maximized when $x = \nu-2$?
— Preceding unsigned comment added by 128.2.82.177 (talkcontribs) 01:13, 6 March 2006‎

No, the correct denominator for the mode is $\nu+2$:

1) See "Bayesian Data Analysis, 2nd edition" by Gelman et al., p. 574.
2) I have verified that the derivative of the stated probability density function
$f(x)=\frac{(\sigma^2\nu/2)^{\nu/2}}{\Gamma(\nu/2)}~ \frac{\exp\left[ \frac{-\nu \sigma^2}{2 x}\right]}{x^{1+\nu/2}}$
with respect to x is zero at the stated mode
$x_{mode}=\frac{\nu \sigma^2}{\nu+2}$
PAR 09:36, 11 July 2006 (UTC)
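For anyone who wants to double-check this without redoing the calculus, here is a quick numerical sanity check in plain Python: it evaluates the pdf exactly as stated above on a fine grid and locates the maximum. The parameter values $\nu = 5$, $\sigma^2 = 2$ are arbitrary choices for illustration.

```python
import math

def scaled_inv_chi2_pdf(x, nu, sigma2):
    """PDF of the scaled inverse chi-squared distribution, as stated above."""
    return ((sigma2 * nu / 2) ** (nu / 2) / math.gamma(nu / 2)
            * math.exp(-nu * sigma2 / (2 * x)) / x ** (1 + nu / 2))

nu, sigma2 = 5.0, 2.0

# Grid search for the maximizer of the density on (0, 20)
xs = [i * 1e-4 for i in range(1, 200000)]
x_max = max(xs, key=lambda x: scaled_inv_chi2_pdf(x, nu, sigma2))

# Candidate modes from the discussion above
mode_plus = nu * sigma2 / (nu + 2)    # nu + 2 in the denominator
mode_minus = nu * sigma2 / (nu - 2)   # nu - 2 in the denominator
```

The grid maximum lands on $\nu\sigma^2/(\nu+2)$ (here $10/7 \approx 1.43$), not $\nu\sigma^2/(\nu-2)$, agreeing with the Gelman et al. reference.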
More generally, note that we shouldn't expect the two modal values to correspond.
If we have two parametrisations of a probability distribution, in terms of x and y(x), then we can write the pdfs as
$p(y) = \frac{dP}{dy}; \qquad p(x) = \frac{dP}{dx} = \frac{dP}{dy}\frac{dy}{dx}$
where P is the cumulative probability.
The modal value of y occurs when the rate of change of the probability density for y is zero, i.e.
$0 = \frac{d p(y)}{dy} = \frac{d^2 P}{dy^2}$
But the similar expression for the modal value of x gives
$0 = \frac{d p(x)}{dx} = \frac{d}{dx} \left(\frac{dP}{dy}\frac{dy}{dx}\right)$
$= \frac{d^2 P}{dy^2}\left(\frac{dy}{dx}\right)^2 + \frac{dP}{dy} \frac{d^2 y}{dx^2}$
In general this will not be zero when $\frac{d^2 P}{dy^2} = 0$, unless either $\frac{dP}{dy} = 0$ or $\frac{d^2 y}{dx^2} = 0$ at that value as well.
The take-home message is therefore that for most distributions, the most probable value of x will not be at the value corresponding to the most probable value of y, for any transformation y(x) more complicated than the linear y = ax + b. Jheald (talk) 16:44, 21 October 2012 (UTC)
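A concrete instance of this point, assuming nothing beyond the standard chi-squared density: take $y \sim \chi^2_\nu$ and $x = 1/y$. The mode of $y$ is $\nu - 2$, but the mode of $x$ comes out at $1/(\nu+2)$, not $1/(\nu-2)$. A plain-Python grid search makes the shift visible ($\nu = 6$ is an arbitrary choice):

```python
import math

nu = 6.0

def chi2_pdf(y):
    """Standard chi-squared density with nu degrees of freedom."""
    return y ** (nu / 2 - 1) * math.exp(-y / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

def inv_chi2_pdf(x):
    """Density of X = 1/Y via change of variables: |dy/dx| = 1/x^2."""
    return chi2_pdf(1.0 / x) / x ** 2

# Mode of y: expect nu - 2 = 4
ys = [i * 1e-4 for i in range(1, 300000)]
y_mode = max(ys, key=chi2_pdf)

# Mode of x = 1/y: expect 1/(nu + 2) = 0.125, NOT 1/(nu - 2) = 0.25
xs = [i * 1e-5 for i in range(1, 100000)]
x_mode = max(xs, key=inv_chi2_pdf)
```

The extra factor $\left(\frac{dy}{dx}\right)^2$ and the $\frac{dP}{dy}\frac{d^2 y}{dx^2}$ term above are exactly what drags the mode from $1/(\nu-2)$ down to $1/(\nu+2)$.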

## Request move

The name is more proper without the dashes --Kupirijo 23:16, 31 March 2007 (UTC)

I agree. I propose that the article be moved to "Scaled_inverse_chi-square_distribution". Any objections? Domminico (talk) 17:33, 24 August 2008 (UTC)

Dashes removed. Jheald (talk) 12:11, 21 October 2012 (UTC)

## Name

What is the name of this distribution? Is it "scaled-inverse-$\chi^2$" or "scale-inverse-$\chi^2$"? The article uses both. Personally I prefer "scaled", since it describes where the distribution comes from: it's a "scaled" version of the inverse-$\chi^2$ distribution. But maybe there's a reason or argument for "scale"? Domminico (talk) 20:27, 24 July 2008 (UTC)

## ML estimate of ν/2

Shouldn't the maximum likelihood estimate of $\frac{\nu}{2}$ read (mind the minus sign)

$\ln\left(\widehat{\frac{\nu}{2}}\right) - \psi\left(\widehat{\frac{\nu}{2}}\right) = \frac{1}{N}\sum_i \ln\left(x_i\right) + \ln\left(\frac{1}{N}\sum_i\frac{1}{x_i}\right)$

where I have also made use of the fact that the maximum likelihood estimate for $\sigma^2$ is $\widehat{\sigma^2} = \frac{N}{\sum_i\frac{1}{x_i}}$.

Ikingut 10:31, 4 February 2011 (UTC)
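If it helps, the stationarity condition with the minus sign does check out numerically against a direct maximization of the profile log-likelihood. The sketch below (plain Python; the sample is simulated, the grid bounds are arbitrary, and digamma is approximated by a central difference on $\ln\Gamma$ since the standard library has none) substitutes $\widehat{\sigma^2} = N / \sum_i (1/x_i)$ into the log-likelihood and maximizes over $a = \nu/2$:

```python
import math
import random

random.seed(0)
nu_true, sigma2_true, N = 6.0, 2.0, 2000

# Simulate: if c ~ chi^2_nu then x = nu * sigma^2 / c is scaled-inverse-chi-squared.
# (chi^2_nu is Gamma with shape nu/2 and scale 2.)
data = [nu_true * sigma2_true / random.gammavariate(nu_true / 2, 2.0)
        for _ in range(N)]

S = sum(1.0 / x for x in data)        # sum of reciprocals
L = sum(math.log(x) for x in data)    # sum of logs

def profile_loglik(a):
    """Log-likelihood with sigma^2 profiled out at its MLE N/S (a = nu/2)."""
    return N * (a * math.log(a * N / S) - math.lgamma(a)) - a * N - (1 + a) * L

# Grid search over a = nu/2 on [0.5, 8]
a_hat = max((0.5 + 0.001 * i for i in range(7500)), key=profile_loglik)

def digamma(a, h=1e-5):
    """Central-difference approximation to psi(a) = d/da ln Gamma(a)."""
    return (math.lgamma(a + h) - math.lgamma(a - h)) / (2 * h)

# Residual of the proposed condition:
#   ln(a) - psi(a) = (1/N) * sum(ln x_i) + ln((1/N) * sum(1/x_i))
residual = math.log(a_hat) - digamma(a_hat) - (L / N + math.log(S / N))
```

At the grid maximizer the residual is essentially zero (up to the grid spacing), so the equation with the plus sign on $\ln\left(\frac{1}{N}\sum_i\frac{1}{x_i}\right)$ and the minus sign on $\psi$ is the consistent one.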

## Parameterisation seems to be perverse

The parametrisation given here matches that in Gelman et al. (1996/2003), and perhaps also elsewhere.

But surely more natural would be for the distribution to be that of $x = 1/s^2$, where $s^2$ is the sample variance of $\nu$ samples from $N(0, \sigma_N)$? The scaling parameter $\sigma^2$ used here is the exact reciprocal of that variance, $\sigma^2 = 1 / \sigma_N^2$.

I must work through more closely exactly what Gelman et al. are doing with this distribution, but on the face of it, it seems to me that it would be much more intuitive to present the mean, mode, etc. of $1/s^2$ as scaling as

${\mathrm{Mean}}({1/s^2}) = \frac{1}{\sigma_N^2}\frac{\nu}{\nu-2} \qquad \mathrm{and} \qquad {\mathrm{Mode}}({1/s^2}) = \frac{1}{\sigma_N^2}\frac{\nu}{\nu+2},$

rather than with σ2 as presented. Jheald (talk) 12:34, 21 October 2012 (UTC)
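A quick Monte Carlo check of that scaling claim for the mean (plain Python; $\sigma_N$, $\nu$, and the trial count are arbitrary choices, and $s^2$ is taken as the mean square about the known zero mean, giving $\nu$ degrees of freedom):

```python
import random

random.seed(1)

nu, sigma_N = 8, 1.5
trials = 100000

acc = 0.0
for _ in range(trials):
    # s^2: mean square of nu draws from N(0, sigma_N^2)
    s2 = sum(random.gauss(0.0, sigma_N) ** 2 for _ in range(nu)) / nu
    acc += 1.0 / s2

mc_mean = acc / trials
# Claimed scaling: Mean(1/s^2) = (1/sigma_N^2) * nu / (nu - 2)
predicted = (1.0 / sigma_N ** 2) * nu / (nu - 2)
```

The simulated mean of $1/s^2$ matches $\frac{1}{\sigma_N^2}\frac{\nu}{\nu-2}$ to within Monte Carlo error, consistent with $\nu s^2 / \sigma_N^2 \sim \chi^2_\nu$ and $E[1/\chi^2_\nu] = 1/(\nu-2)$.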

Update. I think what I wasn't appreciating is that what this is a distribution for in the Bayesian context (namely σ2) is rather different from what it is a distribution for (namely 1/s2) when it is the distribution of an inverse chi-squared variable. Since the article is called "inverse chi-squared distribution", I have made a stab at re-writing the lead from that perspective, then mentioning that the Bayesian context is a bit different. I've also changed the letter used for the scaling parameter from σ2 to τ2, since in the context of the inverse chi-squared variable it is τ2 that makes rather more sense; this should avoid confusion like mine about σ2 in this context, which led to my comments above. I've mentioned that σ2 is used for this parameter in the Bayesian context, and I intend to add a rather more detailed section about how the distribution occurs in Bayesian estimation of an unknown variance, perhaps using some of the material now at Normal distribution. Hope that that is all right with everybody. Jheald (talk) 21:52, 25 October 2012 (UTC)