Talk:Multivariate statistics

From Wikipedia, the free encyclopedia
WikiProject Statistics (Rated Start-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as High-importance on the importance scale.

possible additions

Here are a few possible additions to this page, both as a note for myself and as an invitation for discussion:

  • Brief discussion of ordination and classification and how the multivariate techniques listed fit into those categories.
  • Link to factor analysis and mention of how it and PCA are related.
  • Link to cluster analysis.
  • Discussion of the kinds of research questions for which multivariate statistics might be most appropriate.

--Belgrano 23:44, 13 August 2005 (UTC)

Currently this page is a little misleading, as it does not differentiate between models in which more than one variable is modelled as random (multivariate) and those, such as what is commonly called multiple regression, in which only one variable is modelled as random but many explanatory variables are included. The latter are really univariate, and I find it helpful to describe them as multivariable, although that is not a perfect term since a random variable is also a variable. The advantage shows when one uses multiple explanatory variables in a true multivariate regression: one can then use both terms to describe the model. 17:20, 30 November 2007 (UTC)

In multiple regression you can have explanatory variables that are random! —Preceding unsigned comment added by (talk) 06:01, 10 June 2009 (UTC)

I agree somewhat with the previous comment. The words multivariate and multivariable have separate and distinct meanings in statistics. Multivariate is used when there are multiple dependent variables (aka outcomes, responses), while multivariable refers to the case where there are multiple predictors (aka independent variables). Some of the methods listed on this page, such as regression and logistic regression, are not obviously multivariate, although they could be. When they are, they are usually referred to as multivariate regression. (talk) 04:01, 29 February 2008 (UTC)Wade

Good points. People get those confused all the time. Makewater (talk) 20:11, 6 April 2009 (UTC)

I too agree with the above comment. It is a basic error that many people who use stats make, and it needs to be clarified. —Preceding unsigned comment added by Astrovouk (talk • contribs) 04:36, 15 October 2009 (UTC)

regression analysis restricted to the linear case?

I don't understand why you are listing "regression analysis" as being only linear regression, and listing logistic models as a separate thing. Aren't linear regression and logistic regression, as well as probit, tobit, etc., different models of regression, all of them included in "regression analysis"? —Preceding unsigned comment added by (talk) 05:57, 10 June 2009 (UTC)

Yes - this was strange, and since regression isn't normally considered a multivariate technique, I've removed it. RMGunton (talk) 23:59, 16 March 2010 (UTC)

Removing regression

I agree with the comment above that "regression" isn't normally considered a multivariate technique (and that "multivariable" seems a better term). It therefore doesn't belong on this page; if it did then there should also be discussion of 2-way ANOVA, etc - indeed the whole category of general linear models, since these allow for multiple predictors. I'm taking the liberty of removing this now. Redundancy analysis is the multivariate analogue of regression. RMGunton (talk) 23:51, 16 March 2010 (UTC)

The point you make is a valid one. Makewater (talk) 14:18, 1 September 2010 (UTC)
The point here shows a confusion between "multivariate regression", in which the dependent variables can be multidimensional, and "multiple regression" in which the independent variables are multidimensional. Thus, in "multivariate regression", the residual for a given observation is multidimensional and hence there is a need to treat the statistical dependence between the elements of the residual vector. Melcombe (talk) 08:44, 2 September 2010 (UTC)
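
As a rough illustration of the distinction drawn above, here is a minimal Python sketch using NumPy. The data are simulated, and every name in it (`fit_least_squares`, `noise_cov`, the coefficient values) is made up for the example rather than taken from the article or any source. It fits the same least-squares model twice: once with a single response (multiple regression) and once with two correlated responses (multivariate regression). In the second case each observation's residual is a vector, so its covariance matrix has off-diagonal terms that need to be considered.

```python
import numpy as np


def fit_least_squares(X, Y):
    """Least-squares fit of Y = X B + E; Y may have one or several columns."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    residuals = Y - X @ coef
    return coef, residuals


rng = np.random.default_rng(0)
n = 200

# Design matrix shared by both models: intercept plus two predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# Multiple regression: several predictors, ONE response (assumed coefficients).
y = X @ np.array([[1.0], [2.0], [-1.0]]) + rng.normal(size=(n, 1))
coef_multiple, resid_multiple = fit_least_squares(X, y)

# Multivariate regression: same predictors, TWO correlated responses.
noise_cov = np.array([[1.0, 0.8], [0.8, 1.0]])  # assumed error covariance
E = rng.multivariate_normal(np.zeros(2), noise_cov, size=n)
B = np.array([[1.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])  # assumed coefficients
Y = X @ B + E
coef_multivariate, resid_multivariate = fit_least_squares(X, Y)

# Each row of resid_multivariate is a 2-dimensional residual vector; the
# off-diagonal entries of its covariance estimate capture the dependence
# between the elements of that vector.
resid_cov = resid_multivariate.T @ resid_multivariate / (n - X.shape[1])

print("multiple regression residual shape:    ", resid_multiple.shape)
print("multivariate regression residual shape:", resid_multivariate.shape)
print("estimated residual covariance:\n", resid_cov)
```

The point estimates of the coefficients are the same as fitting each response column separately; what changes in the multivariate case is that inference has to account for the full residual covariance rather than a single error variance.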

Other material removed

The following entry was under "Types of Analysis" but it appears to be simply an example of a paper that uses multivariate statistics. As such, it doesn't seem to belong here, unless the page were extended with an "examples" section (but this could be almost infinitely long). RMGunton (talk) 21:19, 4 August 2011 (UTC)

  1. Geostatistics for complex geological phenomena also involves multivariate statistics. The modeling in these multiple-point geostatistics can be performed using pattern-based modeling [1].


Is the InsightsNow link in the external links spam? It has no relevance to multivariate stats; it is just the landing page of a company. — Preceding unsigned comment added by (talk) 16:11, 7 January 2013 (UTC)

Is LDA limited to the use of normally distributed data?

Is LDA limited to the use of normally distributed data? Yes, according to here [[1]].

No, according to [[2]] or — Preceding unsigned comment added by (talk) 18:36, 12 February 2014 (UTC)

  1. ^ Honarkhah, M. and Caers, J., 2010, Stochastic Simulation of Patterns Using Distance-Based Pattern Modeling, Mathematical Geosciences, 42: 487-517.