Talk:Analysis of variance
| WikiProject Statistics | Rated C-class, High-importance |
|---|---|

| WikiProject Mathematics | Rated Start-Class |
|---|---|
| Scope | This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of Mathematics on Wikipedia. |
| Mathematics rating | Start Class, High Priority; Field: Probability and statistics |
| Page views | One of the 500 most frequently viewed mathematics articles. |
Mergers?
A variety of mergers has been proposed. Please add to the discussion at Talk:Linear regression.
An automated Wikipedia link suggester has some possible wiki link suggestions for the Analysis_of_variance article:
- Can link degrees of freedom: ...hbox{Error}} + SS_{\hbox{Treatments}}</math> The number of degrees of freedom (abbreviated ''df'') can be partitioned in a similar way an... (link to section)
Additionally, there are some other articles which may be able to link to this one (also known as "backlinks"):
- In Chi-squared test, can backlink analysis of variance: ...en approximately chi-squared tests; e.g., ''F''-tests in the analysis of variance and ''t''-tests are likelihood-ratio tests, but the test st...
- In Conjoint analysis (in marketing), can backlink analysis of variance: ...ike SPSS or SAS. == Analysis== The computer uses monotonic analysis of variance or linear programming techniques to create utility function...
Notes: The article text has not been changed in any way; Some of these suggestions may be wrong, some may be right.
Feedback: I like it, I hate it, Please don't link to — LinkBot 11:28, 1 Dec 2004 (UTC)
Frankly, I don't think this page is very clear. It would be nice to add a few words about the types of problems ANOVA is applied to. The second bullet describing classes ("Random-effects models assume that the data describe a hierarchy of different populations whose differences are constrained by the hierarchy.") is hard to understand for the non-expert. The meaning of SS is not explained (I guess it means "sum of squares"?). Is there anybody with a background in statistics who could improve upon this article? --Agd 20:28, 9 November 2005 (UTC)
Agree that the page is not clear. Needs major revision. 212.179.209.45 01:01, 16 January 2006 (UTC)
Agreed this page lacks many anova tests. I was going to use this as a reference before I got my stats book for class, and to my dismay there is not a single formula. NormDor 11:56, 17 March 2006 (UTC)
I came here after reading the biographical entry on R. A. Fisher. I agree with the foregoing comments that this page doesn't do a good job of explaining what Analysis of variance is about, or used for, that is useful for the numerate reader who is not already familiar with the topic. 198.161.198.74 (talk) 19:40, 15 January 2010 (UTC)
"Example of one-way ANOVA"?
The sections labeled "Example of [...]" are not actually examples of the ANOVA procedure, they are examples of study designs on which analysis by ANOVA might be useful. A fine distinction, but an important one I think. Should revise either the section headings or the sections' contents.
- I updated the language, and it should reflect what you noted. Chris53516 13:53, 15 August 2006 (UTC)
Sorry, but this page is really completely useless. It's a bit like an "About" box for ANOVA. It gives you a hint about what it might be, but without any actual worked examples showing you how to get the results you need and then how to interpret those results, nothing useful is learned. Barfly42 15:24, 17 December 2006 (UTC)
ANOVA definition
I changed the definition of ANOVA (the first paragraph). The old definition was not general enough. ANOVA can be used to compare several distributions, but that is just one example of its applications. Some work should be devoted to changing the definitions of fixed and random effects. Gideon Fell 10:33, 14 March 2007 (UTC)
In the examples section it says: "one-way ANOVA with repeated measures". Shouldn't it be "two way ANOVA" since the experiment of using the same subjects with repeated measurements matches the description of "two way ANOVA" from the Overview section. —Preceding unsigned comment added by 190.64.28.235 (talk) 15:21, 27 October 2009 (UTC)
Factorial ANOVA
According to "Statistics for experimenters" by Box, Hunter and Hunter the use of ANOVA makes no sense for factorial experiments (section 5.10 "Misuse of the ANOVA for 2^k factorial experiments"). This appears to be because each combination of factors only has a single degree of freedom, so the value actually calculated is equivalent to referral to a t table. This seems to be a common misconception. —The preceding unsigned comment was added by 129.215.37.22 (talk) 20:46, 14 March 2007 (UTC).
- Please provide a precise reference, and exact statement, since this cannot be correct.
- It is standard statistical practice, recommended by Box Hunter Hunter (following Fisher, for example), to have replicates! With replicates, some error degrees of freedom exist. (Perhaps they were referring to extremely expensive experiments, with no replications?) Kiefer.Wolfowitz (talk) 21:14, 19 January 2010 (UTC)
- I also don't see how this could be correct. The relationship between the t-test and the F-test in groups of size 2 is clear, but, for example, relying only on a sequence of t-tests ignores the potential for interactions. I don't have access to the reference, but this confusing (and unsubstantiated) claim has been around for more than half a year; perhaps it should be removed? Pools.KarmaHorn (talk) 18:28, 2 March 2010 (UTC)
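The t-test/F-test relationship mentioned in the replies above can be checked numerically. The following is a minimal sketch, assuming a comparison of two groups (one numerator degree of freedom) and a pooled-variance t statistic; the data and the function names `pooled_t` and `one_way_f` are made up for illustration and do not come from Box, Hunter and Hunter.

```python
from math import isclose, sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) standard error."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def one_way_f(groups):
    """One-way ANOVA F statistic: MS_treatments / MS_error."""
    values = [y for g in groups for y in g]
    grand = mean(values)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_treat / (len(groups) - 1)) / (ss_error / (len(values) - len(groups)))


# Made-up measurements at two levels of a single factor
low = [4.1, 5.0, 6.2, 5.5]
high = [6.8, 7.4, 8.1, 7.0]

t_stat = pooled_t(low, high)
f_stat = one_way_f([low, high])
# With two groups the F statistic equals the square of the pooled t statistic
print(isclose(t_stat ** 2, f_stat, rel_tol=1e-9))
```

This only illustrates the single-degree-of-freedom case. It says nothing about interactions, which is the point raised above against relying on a sequence of t-tests.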
There is a link to "Factorial ANOVA" in the 'Overview' section, which links back to the ANOVA page(this same article!). This is stupid. Either the link should be removed, or a "Factorial ANOVA" page should be added, or it should be an anchor going to a "Factorial ANOVA" section. —Preceding unsigned comment added by Mr.maddamsetti (talk • contribs) 20:10, 23 March 2010 (UTC)
ANOVA table
Could do with at least one example of an ANOVA table here, either with numbers in or showing notation for sums of squares, mean squares etc. Maybe also worth mentioning skeleton ANOVA tables, i.e. showing with entries only for df. I may add these myself at some point... Qwfp (talk) 18:18, 19 January 2008 (UTC)
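As a starting point for the kind of table requested above, here is a minimal sketch that builds a one-way ANOVA table (sums of squares, degrees of freedom, mean squares, F) from raw data. The function name `one_way_anova_table` and the data are invented for illustration; a real article example would need sourced data.

```python
from statistics import mean


def one_way_anova_table(groups):
    """Return rows (source, SS, df, MS, F) of a one-way ANOVA table."""
    values = [y for g in groups for y in g]
    grand = mean(values)
    n_total, k = len(values), len(groups)

    # Between-group (treatment) and within-group (error) sums of squares
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ss_total = sum((y - grand) ** 2 for y in values)

    df_treat, df_error = k - 1, n_total - k
    ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error
    return [
        ("Treatments", ss_treat, df_treat, ms_treat, ms_treat / ms_error),
        ("Error", ss_error, df_error, ms_error, None),
        ("Total", ss_total, n_total - 1, None, None),
    ]


# Made-up data: three treatments, three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova_table(groups)
for source, ss, df, ms, f in table:
    print(source, ss, df, ms, f)
```

A skeleton table, as mentioned above, would keep only the source and df columns (here 2, 6 and 8).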
ANOVA and visualization
It might be well worth the effort to add a section describing visualization techniques for ANOVA, using plots such as boxplots and others. I am willing to have a go at it, but I don't know who is responsible for this article and don't want to step on anyone's toes... (P.S.: I am currently doing my second degree in biostatistics.) Talgalili (talk) 11:49, 1 December 2008 (UTC)
- I think this could be a very good idea (if the visualization is appropriate of course) Bgeelhoed (talk) 09:57, 4 December 2008 (UTC)
- Seconded. Finereach (talk) 11:01, 4 January 2009 (UTC)
Inconsistent notation
In the article there are four different notations, which unnecessarily confuse the reader:

Which notation should Wikipedia / the editors opt for?
Might I suggest: 
where
A is treatment (factor A)
T is total
E is error
What are your thoughts on this?
Ostracon (talk) 15:57, 16 August 2009 (UTC)
- I don't know enough to comment on that, but the article should also link to or provide definitions for what the different sum-of-squares sums mean. Someone who doesn't have any idea of what SS_{treatment} means, say, should be able to get a definition within at most a click. --24.17.142.210 (talk) 17:55, 1 July 2010 (UTC)
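For reference, the definitions asked for in the reply above can be written out for the one-way case. This is a sketch only: it assumes k treatment groups with n_i observations y_ij each, and it borrows the subscripts A, T and E from the legend in the suggestion above; the formulas proposed in that comment did not survive on this page, so this is not a reconstruction of them.

```latex
% Assumed layout: k groups, n_i observations y_{ij} in group i, N = \sum_i n_i,
% group means \bar{y}_{i\cdot}, grand mean \bar{y}_{\cdot\cdot}.
\begin{align*}
SS_T &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_{\cdot\cdot} \right)^2
  && \text{(total)} \\
SS_A &= \sum_{i=1}^{k} n_i \left( \bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot} \right)^2
  && \text{(treatment, factor A)} \\
SS_E &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_{i\cdot} \right)^2
  && \text{(error)} \\
SS_T &= SS_A + SS_E, \qquad N - 1 = (k - 1) + (N - k)
\end{align*}
```

The last line is the partition of the sum of squares and of the degrees of freedom referred to in the LinkBot excerpt near the top of this page.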
Assumptions section has errors
ANOVA assumes neither
- independence, because the randomization distribution has a covariance-symmetric (CS) covariance matrix with a small negative correlation between different replications (see Chapter 2 Section 14 of Bailey or Chapter 6 of Hinkelmann and Kempthorne)
nor
- normality (same reason: the randomization distribution allows inference, as emphasized by Charles S. Peirce and Ronald A. Fisher).
What is true is that the p-values of the randomization test of the ANOVA null-hypothesis are well approximated by the p-values of the F test using the F-distribution (Chapter 6, Hinkelmann & Kempthorne).
(Of course, it is easier to teach the mechanics of ANOVA testing by assuming a so-called "normal" linear model and using the F-distribution.)
Therefore, the "Assumptions" section needs revision, imho. Kiefer.Wolfowitz (talk) 17:10, 24 November 2009 (UTC)
- Seeing no objections, I wrote a short discussion of randomized experiments and anova. Kiefer.Wolfowitz (talk) 17:57, 4 January 2010 (UTC)
- When I have more energy, I plan to put the "textbook normal-model" approach first and then follow with the randomization (design-based) approach second. I think that this will be easier for non-statisticians, particularly since the randomization-based analysis is explained in textbooks but apparently not in Wikipedia. Kiefer.Wolfowitz (talk) 00:49, 5 January 2010 (UTC)
- I reordered the two "approaches" (models first, then randomization tests). It would be useful to mention permutation tests as good (See Lehmann's TSH, for a theorem recommended by Paul Rosenbaum, which I don't have in front of me) for data coming from possibly non-randomized sources. Kiefer.Wolfowitz (talk) 15:26, 17 January 2010 (UTC)
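The randomization test discussed in this thread can be sketched in a few lines. This is a minimal Monte Carlo version under stated assumptions: a completely randomized one-way design, made-up data, and hypothetical function names (`f_statistic`, `randomization_p_value`). It estimates the randomization p-value by re-assigning observations to treatments at random; it does not reproduce the derivations in Hinkelmann and Kempthorne.

```python
import random
from statistics import mean


def f_statistic(values, sizes):
    """One-way ANOVA F statistic for `values` cut into consecutive groups."""
    groups, start = [], 0
    for n in sizes:
        groups.append(values[start:start + n])
        start += n
    grand = mean(values)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Assumes no arrangement has zero within-group variation
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_treat / (len(sizes) - 1)) / (ss_error / (len(values) - len(sizes)))


def randomization_p_value(groups, n_shuffles=2000, seed=0):
    """Monte Carlo estimate of the randomization-test p-value."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    values = [y for g in groups for y in g]
    observed = f_statistic(values, sizes)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(values)  # re-assign observations to treatments at random
        # Small relative tolerance so ties with the observed value are counted
        if f_statistic(values, sizes) >= observed * (1 - 1e-9):
            hits += 1
    # Count the observed assignment itself so the estimate is never zero
    return (hits + 1) / (n_shuffles + 1)


# Made-up data: three treatments, three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
p_value = randomization_p_value(groups)
print(p_value)
```

To see the approximation claimed above, this estimate would be compared with the upper tail of the F distribution with (2, 6) degrees of freedom at the observed F, for example from an F table or `scipy.stats.f.sf` (not used here, to keep the sketch dependency-free).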
Attributing work on the Anova on Ranks
There is a well known saying in mathematics and statistics: "A mathematician is only given credit for his discoveries that his colleagues agree to give him." Quoting an expository article by Seaman a decade after the discovery work of Sawilowsky on the rank transform is not only unfair, but is typical of the shallow scholarship that is becoming legendary on Wiki. So, I've decided to jump right in and set the record straight, although by now I know that scholarship and Wiki warriors rarely peacefully coexist.
I also took the opportunity to move the references misplaced in the middle of the article to the end, and put them in alpha order. —Preceding unsigned comment added by 141.217.105.21 (talk) 15:09, 4 January 2010 (UTC)
- Two comments, on substance and style.
- First, on substance: Rank methods were used long ago for ANOVA---e.g., by H. B. Mann, Kruskal Wallis, Milton Friedman. This article should not attribute such methods "first in 1981" to Conover & Iman!
- Is it wise to cite so many articles of Sawilowsky, while neglecting Lehmann, Hodges, Hajek, etc.? IMHO, it is generally better to direct Wikipedia readers first to standard books by researchers (roughly in order of increasing difficulty):
- Hollander, Wolfe. Nonparametric Statistical Methods. (Reliable cookbook).
- Hettmansperger & McKean. Robust Nonparametric Statistical Methods.
- Erich Lehmann. Nonparametrics: Statistical Methods Based on Ranks.
- Hajek, Sidak, Sen. Theory of Rank Tests. ??
- P.K. Sen, Madan Puri. varia.??
- I don't find Seaman listed in the index in Hollander & Wolfe, Hajek, or Hettmansperger. Hettmansperger & McKean have nice things to say about Sawilowsky's 1989-1990 work (e.g. p. 204). Unpublished simulation studies should rarely or never be cited as "proving" things, when there are mathematical proofs of asymptotic properties of tests (coupled with serious simulation studies published in good journals, which are needed for finite-sample behavior). (I have heard that Conover & Iman is a good textbook.)
- Second comment, on style. Please refrain from making disparaging remarks about the Wikipedia project and the "scholarship" of the editors associated with this page. Please continue to suggest improvements or discuss problems on the Talk page here; please direct criticisms of Wikipedia to the appropriate fora elsewhere. Such contributions are most welcome.
- Thank you for your consideration. Kiefer.Wolfowitz (talk) 16:28, 4 January 2010 (UTC)
- I think what would be helpful is if you took a look at the Conover and Iman 1981 "Bridge" article. They go to great lengths to differentiate the different types of ranking procedures. Although there are some rank transformations that result in a known statistic, e.g., Wilcoxon Rank Sum, there are many others that do not result in a known statistic, and chief among them for purposes of the current discussion is the ANOVA. With regard to the rank transform, Iman (Conover's student), was the first to examine it, and it took place in the mid-1970s. Iman went on to become the President of the American Statistical Association.
- In exactly the wiki warrior spirit I mentioned above, feel free to eliminate any or all references to Sawilowsky - after all, until now his work wasn't mentioned! However, after you read the famous "Bridge" article, you will discover that Lehmann, Hodges, Hajek, et al., although world-renowned and major contributors to nonparametrics in general, and ranking (and aligned ranks) procedures in particular, had zero contributions to the hundreds of articles on the rank transform on ANOVA in general, and in terms of interaction effects in particular.
- I'm not sure which "unpublished" simulation studies (plural) you are referring to. The first study to show contrary results to Iman and Conover was Sawilowsky's dissertation; however, the primary results were subsequently published in (1) Communications in Statistics (1987), (2) Journal of Educational Statistics (1989), and (3) Review of Educational Research (1990). The first is a standard stat journal, JES is the premier stat journal of the American Educational Research Association, and RER is the premier synthesis journal in the social and behavioral sciences. If an encyclopedia has any interest in chronology, the dissertation (1985) could be mentioned.
- I added G. L. Thompson's asymptotic article that begins with the inadmissibility of interactions on the RT ANOVA. She, among others, subsequently published numerous asymptotic studies confirming Sawilowsky's MC results. Thompson (JASA, 1991, p. 410) attributed the discovery of the failure of the rank transform in ANOVA to Sawilowsky's (et al.) Monte Carlo work, as did Akritas (JASA, 1991, p. 410), both calling him and his colleagues "careful data analysts". 141.217.105.21 (talk) 18:36, 5 January 2010 (UTC)
- I thank the other editors for responding to discussion here and in particular making thoughtful changes to the article, which clarify things greatly.
- That said, I do think that the recent edits have rendered this article too negative towards standard methods of rank-transformations. For example, in 2004, the review journal Statistical Science (of the Institute for Mathematical Statistics) had a special issue on nonparametric statistics, in which many authors discussed rank-based methods. Because of the previously mentioned textbooks and such review essays, I do believe that this article should have a first (positive) paragraph about the usefulness of rank-based methods as a general heuristic, whose properties seem to work best in simple designs.
- Then let us keep a caution that for complicated designs (e.g. factorial-treatment designs), some care should be exercised (as the other editors have documented, at least to my satisfaction); here, let us keep the (updated and improved) text currently available.
- Would that be acceptable?
- Finally, please refrain from labeling me as a "wiki warrior", for actions I've never committed in editing articles (or even suggested on a Talk page), e.g., removing references to Sakilowsky. (On the contrary, I referenced a favorable discussion of Sakilowsky's work!)
- Thank you. Kiefer.Wolfowitz (talk) 23:22, 6 January 2010 (UTC)
- Your labeling of other editors as "wiki warriors" is particularly inappropriate given your recent writing of an article defining wiki warriors: Each of your definitions directly attacks the intentions of the editor, not behavior, and is therefore against the Wikipedia policy of "assuming good faith". Kiefer.Wolfowitz (talk) 23:50, 6 January 2010 (UTC)
- I went to the Sawilowsky page. I don't know why you have misspelled the name (I have a name that is hard to spell so I'm sensitive to it), nor do I know why you self-plagiarized (en.wikipedia.org/wiki/Plagiarism#The_concept_of_self-plagiarism) your comments by repeating them here. I agreed with your point there (as probably will most) and agree with it here, but you really don't need to promote your point on multiple pages.
- As to the substance, my reading is that the ranking procedure is known to work for some famous stats and there was nothing negative to report about them. To support your suggestion, can more of the famous stats where it works be cited? But it bothers me that it works for only one type of t test and not the other, and even I don't call the t a complicated test! Does it work with multiple regression? So, I don't quite yet understand what is so favorable about this as a procedure that it should be addressed as a positive general heuristic. Can you explain?
- I think other technical experts (which I am not) should weigh in on this. 68.43.236.244 (talk) 04:50, 7 January 2010 (UTC)
- Hi Kiefer.Wolfowitz. Sorry for this addition, and for being long-winded. I went back to the ANOVA page and now I really don't get your concern. There are three paragraphs on the ranking, of which the first two are 100% positive. (I would say it is too positive!) It is only the third paragraph on the subject that talks about it not working. That raises the legitimate question of why the material in the 3rd paragraph should be hidden. I for sure would want to know, after working for an hour to get the answer, whether my stats really work or not! 68.43.236.244 (talk) 05:00, 7 January 2010 (UTC)
Factorial experiments: Rank and anova
Hettmansperger and McKean's book "Robust Nonparametric Statistical Methods"
- Hettmansperger, T. P.; McKean, J. W. (1998). Robust nonparametric statistical methods. Kendall's Library of Statistics. 5 (First ed.). London: Edward Arnold. pp. xiv+467. ISBN 0-340-54937-8, 0-471-19479-4. MR1604954
state that the "R transform" works well, even for small samples (McKean and Seavers), pages 254-258. Distinguishing the R-transform from the Rank-transform is difficult for the public. Maybe the article should discuss the R-transform (following Hettmansperger & McKean, the best authority known to this amateur) first and foremost. Then the article can continue to discuss the rank-transform, and mention that its use seems to be deprecated (for some time). Would that be agreeable? Sincerely, Kiefer.Wolfowitz (talk) 21:35, 19 January 2010 (UTC)
- The aligned ranks procedure of H&M is marginally better than the Blair-Sawilowsky Fawcett-Salter, but only because the latter has minor Type I inflations (e.g., nominal alpha = .05 in some cases may lead to a rise in Type I error to .055, which also likely accounts for why it is more powerful than the former) (Headrick, T., & Sawilowsky, S., January, 1999, The best test for interaction in factorial ANOVA and ANCOVA. Statistics Symposium on Selected Topics in Nonparametric Statistics. Gainesville, FL). Until there are algorithms available in statistical packages, H&M remains difficult to compute.
- The main point is that doing as you propose will require a major rewrite, because aligned-rank procedures are often complicated to conduct, and indeed have no relation to the pure rank transform, to which the named npar statistics are equivalent (e.g., Spearman's rho, Wilcoxon Rank-Sum/Mann-Whitney U, etc.)
- Furthermore, the jury is still out on aligned ranks methods (including H&M, Puri and Sen, etc.), with many layouts yet to be studied and shown to be valid. Edstat (talk) 18:12, 31 January 2010 (UTC)
- Thanks for your helpful and informative answer, which satisfies me that the status quo is good enough. Kiefer.Wolfowitz (talk) 20:51, 31 January 2010 (UTC)
SAS recommendations
I looked in the recent SAS Linear Models and Mixed Linear Models books, and they contain no references to rank (as far as I can see). Would an editor please either provide a current reference or delete/modify the statements about SAS? Again, Statistical Science in 2004 had a lot of papers on rank-based procedures in its special issue on nonparametrics, so it doesn't seem useful to include a reference to rank-based methods in the 1980s. Kiefer.Wolfowitz (talk) 05:04, 14 January 2010 (UTC)
- I have no problem if some editor wants to review ALL SAS documentation to see if they have rescinded their recommendation. However, hiding history is decidedly un-encyclopedic. The point in the current text is authors in prestigious statistical outlets (JASA, AS, etc.) recommended this procedure, and software companies (SAS is an "e.g.") followed suit. This caused untold destruction in the analysis of data, for those who know what a Type I error of 1 means. I can't imagine why someone would want this to happen again!
- Having a lot of papers on rank-based procedures that don't address the specific issue at hand moots a 2004 date vs. 1980s. I raise the question: what is the desire (bolding a subtitle?) to cover up the history of the failure of this statistic?141.217.105.193 (talk) 13:11, 15 January 2010 (UTC)
- Well, well, well. In the SAS/STAT 9.2 user's guide, 2008, p. 291, Intro to nonparametric analysis: "Many nonparametric methods analyze the ranks of a variable rather than the original values. Procedures such as PROC NPAR1WAY calculate the ranks for you and then perform appropriate nonparametric tests. However, there are some situations in which you use a procedure such as PROC RANK to calculate ranks and then use another procedure to perform the appropriate test. See the section “Obtaining Ranks” on page 297 for details."
- And from page 297: "The primary procedure for obtaining ranks is the RANK procedure in Base SAS software. Note that the PRINQUAL and TRANSREG procedures also provide rank transformations. With all three of these procedures, you can create an output data set and use it as input to another SAS/STAT procedure or to the IML procedure. For more information, see the chapter “The RANK Procedure” in the Base SAS Procedures Guide. Also see Chapter 70, “The PRINQUAL Procedure,” and Chapter 90, “The TRANSREG Procedure.” In addition, you can specify SCORES=RANK in the TABLES statement in the FREQ procedure. PROC FREQ then uses ranks to perform the analyses requested and generates nonparametric analyses. For more discussion of the rank transform, see Iman and Conover (1979); Conover and Iman (1981); Hora and Conover (1984); Iman, Hora, and Conover (1984); Hora and Iman (1988); and Iman (1988)."
- So it seems to me that SAS originally made their recommendation in 1985 and 1987 based on recommendations on such fine publications as JASA and the AS, and continue to do so in 2008! Hmmm. I wonder which of SAS's quoted papers were reviewed in Mathematical Reviews and are listed on MathSciNet? 141.217.105.193 (talk) 13:30, 15 January 2010 (UTC)
- My point was that SAS's complementary documentation for its modules on linear models --- for Linear Models and for Linear Mixed Models --- doesn't seem to mention rank-transformations. Our article is on Anova, not on nonparametrics.
- I thank the other editor(s) for updating the reference about SAS's module on nonparametric methods, which continues to recommend the rank-transform.
- Regarding R-transforms, which are rank-based methods (but not rank transforms), see above. Such methods were featured not only in Statistical Science 2004 but also in the nonparametric/robust article(s) in JASA 2000's series of short surveys of statistics. Kiefer.Wolfowitz (talk) 15:22, 17 January 2010 (UTC) (UPDATED to distinguish rank-transforms from R-transforms. Kiefer.Wolfowitz (talk) 16:26, 20 January 2010 (UTC) )
Effect size measures section
This section is confusing two different effect size measures when it states the following:
"The generally-accepted regression benchmark for effect size comes from (Cohen, 1992; 1988): 0.20 is a minimal solution (but significant in social science research); 0.50 is a medium effect; anything equal to or greater than 0.80 is a large effect size (Keppel & Wickens, 2004; Cohen, 1992)."
"Nevertheless, alternative rules of thumb have emerged in certain disciplines: Small = 0.01; medium = 0.06; large = 0.14 (Kittler, Menard & Phillips, 2007)."
The first paragraph refers to rule of thumb guidelines for categorizing Cohen's d. The second paragraph refers to rule of thumb guidelines for categorizing eta-squared.
68.54.107.114 (talk) 02:17, 11 January 2010 (UTC)AmateurStatistician
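The distinction drawn above can be made concrete by computing both measures on the same data. This is a minimal sketch with made-up numbers and hypothetical function names (`cohens_d`, `eta_squared`); it only shows that the two measures live on different scales, so benchmarks for one cannot be applied to the other.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled_var)


def eta_squared(groups):
    """Proportion of the total sum of squares associated with group membership."""
    values = [y for g in groups for y in g]
    grand = mean(values)
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_total = sum((y - grand) ** 2 for y in values)
    return ss_treat / ss_total


# Made-up data for two groups
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]

d = cohens_d(control, treated)
eta2 = eta_squared([control, treated])
print(d, eta2)
```

For these data d is 1.0 while eta-squared is about 0.27: the same difference would be read against the 0.20/0.50/0.80 guidelines in the first quoted paragraph for d, and against the 0.01/0.06/0.14 guidelines in the second for eta-squared.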
One vs Two -way vs Multivariate ANOVA
Generally, the ANOVA information seems somewhat disordered. There is a One-way ANOVA page on Wikipedia, but the two-way ANOVA page redirects here without giving any reasonable comparison or differentiation between the two. Since this is one of the most widely used tests in the social sciences, the distinctions should be stated in clear and simple terms. JakubHampl (talk) 17:31, 17 April 2010 (UTC)
"Due to"
Perhaps something should be said about ANOVAs not always having explanatory power wrt causality (in observational studies). This is perhaps most controversial in heritability estimates, particularly in human subjects. From Stoltenberg, S. F. (1997). "Coming to terms with heritability". Genetica 99 (2–3): 89–96. doi:10.1023/A:1018366705281. PMID 9463077.
“However, the language that surrounds the partitioning of variance is prone to misunderstanding in its own right (Lewontin, 1974; Kempthorne, 1978), therefore I avoid using terms such as ‘due to’ or ‘caused by’ when referring to the statistical relations between an independent variable and a dependent variable (e.g., in an analysis of variance [ANOVA]), but instead use terms such as ‘associated with’ to avoid deterministic implications.”
The papers cited which go into more detail on this are: Lewontin and Kempthorne. Note that "due to" is used here right in the lead. Tijfo098 (talk) 05:06, 26 October 2010 (UTC)
- I've changed 'due' to 'attributable' in the lead as a start. Qwfp (talk) 08:00, 26 October 2010 (UTC)
- This seems like a sound improvement. Thanks for alerting us to pay more attention to our use of "due to". Best regards, Kiefer.Wolfowitz (talk) 17:29, 26 October 2010 (UTC)
- I'm not sure it's an improvement, since a reasonable person might think "attributable to" means "caused by". But certainly "due to" is a commonplace usage, and potentially misleading for the reason Kempthorne mentions. Michael Hardy (talk) 20:47, 26 October 2010 (UTC)
References
I added the cleanup tag because of the sheer size of the reference list and its unclear usefulness to the article. Ideally, the entries should be references directly used in the article, through footnotes. Here, they seem not to be attached to footnotes and are more of a "further reading" section (see WP:CITE and WP:LAY), but the list still contains duplications. For example, SAS user guides from 25 and 23 years ago are listed. Same book, different editions: why are they both listed? Is there content that is in one but not the other, with both relevant to Anova? Why not cite a more recent edition, which would be more up to date and accessible, instead? It seems to me quality should be picked over quantity.--137.122.49.102 (talk) 19:04, 3 November 2010 (UTC)
- The old SAS manuals were added to document the notability of the research of Shlomo Sawilowsky, by editor User:Edstat, particularly cautioning against the (standard) use of univariate rank-transformations for multivariate data (like in Puri and Sen in the early 1970s); this univariate rank-transformation approach is of lesser interest today, following the nonparametric multivariate analysis of David Tyler, Hannu Oja, etc. Since this page is an encyclopedia article on ANOVA and not a monograph on the history of minor topics in ANOVA, there is no reason to mention topics that are ignored by standard and reliable references on ANOVA (e.g. as judged by JASA reviews and citations). Thanks, Sincerely, Kiefer.Wolfowitz (talk) 19:24, 3 November 2010 (UTC)
- (P.S. Having written this, I reach for a pre-emptive aspirin or two! 19:24, 3 November 2010 (UTC))
- Oh, I see what's happening, it's actually more of a style issue. Though it looks better now, what really should be done is to make footnotes in the text like so <ref name=...>{{cite ...}}</ref>, then the <references/> or {{reflist}} of the notes section will take care of the citing and links. Since Wikipedia is not a paper journal, the citation style should conform to the best Wikipedia standards and use footnotes, instead of the old-fashioned, paper-journal AuthorName (year) style that makes the reader fish out the reference in the references section. There's more on this at WP:CITE, and it's not a bad idea to see how it's done in good articles (see for example, how the references are done in maple syrup, or foie gras for another approach when books are cited multiple times at different pages, thus there's a "notes" section and a "references" one). Thanks for working on it though.--137.122.49.102 (talk) 14:50, 4 November 2010 (UTC)
- Thanks for the encouragement. I'll return next week to improve the formatting of the citations and the citations themselves, e.g. finding the eelworms example of Bailey, etc. Best regards, Kiefer.Wolfowitz (talk) 20:26, 4 November 2010 (UTC)
- WP:PAREN is a perfectly acceptable style for WP:Inline citations, and is used on thousands of Wikipedia articles, especially when the academics in the given field use that style normally. As a rule, we don't switch based on personal preferences, e.g., my personal belief that ref tags are better than simple, newbie-friendly parenthetical citations. WhatamIdoing (talk) 01:07, 5 February 2011 (UTC)
Sawilowsky & students on univariate rank transformation
Following the anonymous editor's concerns, I removed this section, but include it here for archival purposes and to facilitate its use in a stand-alone article:
Quotation: ANOVA on ranks
When the data do not meet the assumptions of normality, the suggestion has arisen to replace each original data value by its rank (from 1 for the smallest to N for the largest), then run a standard ANOVA calculation on the rank-transformed data. Conover and Iman (1981) provided a review of the four main types of rank transformations. Commercial statistical software packages (e.g., SAS, 1985, 1987, 2008) followed with recommendations to data analysts to run their data sets through a ranking procedure (e.g., PROC RANK) prior to conducting standard analyses using parametric procedures.
This rank-based procedure has been recommended as being robust to non-normal errors, resistant to outliers, and highly efficient for many distributions. It may result in a known statistic (e.g., Wilcoxon Rank-Sum / Mann-Whitney U), and indeed provide the desired robustness and increased statistical power that is sought. For example, Monte Carlo studies have shown that the rank transformation in the two independent samples t test layout can be successfully extended to the one-way independent samples ANOVA, as well as the two independent samples multivariate Hotelling's T² layouts (Nanna, 2002).
Conducting factorial ANOVA on the ranks of original scores has also been suggested (Conover & Iman, 1976, Iman, 1974, and Iman & Conover, 1976). However, Monte Carlo studies by Sawilowsky (1985a; 1989 et al.; 1990) and Blair, Sawilowsky, and Higgins (1987), and subsequent asymptotic studies (e.g. Thompson & Ammann, 1989; "there exist values for the main effects such that, under the null hypothesis of no interaction, the expected value of the rank transform test statistic goes to infinity as the sample size increases," Thompson, 1991, p. 697), found that the rank transformation is inappropriate for testing interaction effects in a 4x3 and a 2x2x2 factorial design. As the number of effects (i.e., main, interaction) that are non-null grows, and as the magnitude of the non-null effects increases, there is an increase in Type I error, resulting in a complete failure of the statistic with as high as a 100% probability of making a false positive decision. Similarly, Blair and Higgins (1985) found that the rank transformation increasingly fails in the two dependent samples layout as the correlation between pretest and posttest scores increases. Headrick (1997) discovered the Type I error rate problem was exacerbated in the context of Analysis of Covariance, particularly as the correlation between the covariate and the dependent variable increased. For a review of the properties of the rank transformation in designed experiments see Sawilowsky (2000).
A variant of rank-transformation is 'quantile normalization' in which a further transformation is applied to the ranks such that the resulting values have some defined distribution (often a normal distribution with a specified mean and variance). Further analyses of quantile-normalized data may then assume that distribution to compute significance values. However, two specific types of secondary transformations, the random normal scores and expected normal scores transformation, have been shown to greatly inflate Type I errors and severely reduce statistical power (Sawilowsky, 1985a, 1985b).
According to Hettmansperger and McKean[1] "Sawilowsky (1990)[2] provides an excellent review of nonparametric approaches to testing for interaction" in ANOVA.
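The procedure described in the quotation (replace each value by its rank, then run the usual one-way ANOVA on the ranks) can be sketched as follows. This is only an illustrative sketch in plain Python: the function names and the three groups of scores are hypothetical and come from none of the cited studies, and the code computes the F statistic only, not its p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) to N (largest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def one_way_f(groups):
    """Return the one-way ANOVA F statistic and its two degrees of freedom."""
    pooled = [x for g in groups for x in g]
    grand = mean(pooled)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


def rank_transform_f(groups):
    """Rank the pooled data, split the ranks back into groups, then run ANOVA."""
    ranks = average_ranks([x for g in groups for x in g])
    ranked_groups, start = [], 0
    for g in groups:
        ranked_groups.append(ranks[start:start + len(g)])
        start += len(g)
    return one_way_f(ranked_groups)


# Hypothetical scores for three groups; the 40.0 is a deliberate outlier.
groups = [[1.2, 2.5, 3.1, 2.2], [3.9, 4.4, 5.0, 40.0], [6.1, 7.3, 6.8, 8.0]]
f_raw, df1, df2 = one_way_f(groups)
f_rank, _, _ = rank_transform_f(groups)
print(f"raw F({df1},{df2}) = {f_raw:.2f}; rank-transformed F = {f_rank:.2f}")
```

The sketch covers only the one-way layout. The factorial and interaction cases that the quotation warns about are not addressed by it.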
Supporting references
I believe that most of these books and articles are related to Sawilowsky's publications or unpublished writings, and were added in excellent faith by Edstat; I add this in good faith (having just removed many references that were zealously added by me, when I was evangelizing for generalized randomized block designs!). I'll come back and look for references to them in other sections. Again, they would be very useful in an article about academics closely associated with Sawilowsky (not necessarily on Wikipedia) or in a stand-alone article on rank transforms, if that is a notable topic (e.g. is it covered in statistical encyclopedias or recent surveys in notable reliable journals?). Thanks Kiefer.Wolfowitz (talk) 19:43, 3 November 2010 (UTC)
- Blair, R. C., & Higgins, J. J. (1985). "A Comparison of the Power of the Paired Samples Rank Transform Statistic to that of Wilcoxon’s Signed Ranks Statistic". Journal of Educational and Behavioral Statistics 10 (4): 368–383. doi:10.3102/10769986010004368.
- Blair, R. C., Sawilowsky, S. S., & Higgins, J. J. (1987). "Limitations of the rank transform in factorial ANOVA". Communications in Statistics: Computations and Simulations B16: 1133–1145.
- Conover, W. J., & Iman, R. L. (1976). "On some alternative procedures using ranks for the analysis of experimental designs". Communications in Statistics A5: 1349–1368.
- Conover, W. J. & Iman, R. L. (1981). "Rank transformations as a bridge between parametric and nonparametric statistics". American Statistician 35 (3): 124–129. doi:10.2307/2683975. JSTOR 2683975. http://is.ba.ttu.edu/conover/Dr.Conover.htm.
- Ferguson, George A., Takane, Yoshio. (2005). "Statistical Analysis in Psychology and Education", Sixth Edition. Montréal, Quebec: McGraw–Hill Ryerson Limited.
- Headrick, T. C. (1997). Type I error and power of the rank transform analysis of covariance (ANCOVA) in a 3 x 4 factorial layout. Unpublished doctoral dissertation, University of South Florida.
- Helsel, D. R., & Hirsch, R. M. (2002). Statistical Methods in Water Resources: Techniques of Water Resources Investigations, Book 4, chapter A3. U.S. Geological Survey. 522 pages.[1]
- Hettmansperger, T. P.; McKean, J. W. (1998). Robust nonparametric statistical methods. Kendall's Library of Statistics. 5 (First ed.). London: Edward Arnold. pp. xiv+467 pp.. ISBN 0-340-54937-8, 0-471-19479-4. MR1604954
- Iman, R. L. (1974). "A power study of a rank transform for the two-way classification model when interactions may be present". Canadian Journal of Statistics 2 (2): 227–239. doi:10.2307/3314695. http://jstor.org/stable/3314695.
- Iman, R. L., & Conover, W. J. (1976). A comparison of several rank tests for the two-way layout (SAND76-0631). Albuquerque, NM: Sandia Laboratories.
- King, Bruce M., Minium, Edward W. (2003). Statistical Reasoning in Psychology and Education, Fourth Edition. Hoboken, New Jersey: John Wiley & Sons, Inc. ISBN 0-471-21187-7
- Keppel, G. & Wickens, T.D. (2004). Design and analysis: A researcher's handbook (4th ed.). Upper Saddle River, NJ: Pearson Prentice–Hall.
- Kittler, J.E., Menard, W. & Phillips, K.A. (2007). "Weight concerns in individuals with body dysmorphic disorder". Eating Behaviors 8 (1): 115–120. doi:10.1016/j.eatbeh.2006.02.006. PMC 1762093. PMID 17174859. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1762093.
- Nanna, M. J. (2002). "Hotelling's T2 vs. the rank transformation with real Likert data". Journal of Modern Applied Statistical Methods 1: 83–99.
- Pierce, C.A., Block, R.A. & Aguinis, H. (2004). "Cautionary note on reporting eta-squared values from multifactor anova designs". Educational and Psychological Measurement 64 (6): 916–924. doi:10.1177/0013164404264848.
- SAS Institute. (1985). SAS/stat guide for personal computers (5th ed.). Cary, NC: Author.
- SAS Institute. (1987). SAS/stat guide for personal computers (6th ed.). Cary, NC: Author.
- SAS Institute. (2008). SAS/STAT 9.2 User's guide: Introduction to Nonparametric Analysis. Cary, NC. Author.
- Sawilowsky, S. (1985a). Robust and power analysis of the 2x2x2 ANOVA, rank transformation, random normal scores, and expected normal scores transformation tests. Unpublished doctoral dissertation, University of South Florida.
- Sawilowsky, S. (1985b). "A comparison of random normal scores test under the F and Chi-squared distributions to the 2x2x2 ANOVA test". Florida Journal of Educational Research 27: 83–97.
- Sawilowsky, S. (1990). "Nonparametric tests of interaction in experimental design". Review of Educational Research 60 (1): 91–126.
- Sawilowsky, S. (2000). "Review of the rank transform in designed experiments". Perceptual and Motor Skills 90 (2): 489–497. doi:10.2466/PMS.90.2.489-497. PMID 10833745.
- Sawilowsky, S., Blair, R. C., & Higgins, J. J. (1989). "An investigation of the type I error and power properties of the rank transform procedure in factorial ANOVA". Journal of Educational Statistics 14 (3): 255–267. doi:10.2307/1165018. http://jstor.org/stable/1165018.
- Strang, K.D. (2009). Using recursive regression to explore nonlinear relationships and interactions: A tutorial applied to a multicultural education study. Practical Assessment, Research & Evaluation, 14(3), 1–13. Retrieved 1 June 2009 from: [2]
- Thompson, G. L. (1991). "A note on the rank transform for interactions". Biometrika 78 (3): 697–701. doi:10.1093/biomet/78.3.697.
- Thompson, G. L., & Ammann, L. P. (1989). "Efficiencies of the rank-transform in two-way models with no interaction". Journal of the American Statistical Association 84 (405): 325–330.
I note that Hettmansperger and McKean is notable and reliable, given that the writers were asked to be head editors of e.g. the Statistical Science special issue on nonparametrics or to write the JASA 2000 article reviewing nonparametrics and robust statistics. (I am happy that, as first noted in the article on Sawilowsky, H & McK have nice comments in a few pages about Professor Sawilowsky.) I don't see why the other articles should stay in an article on Anova here, unless they are cited by reliable books on ANOVA. Thanks, Kiefer.Wolfowitz (talk) 19:47, 3 November 2010 (UTC)
Discussion (of Univariate Rank transformation)
Let the discussion begin! Kiefer.Wolfowitz (talk) 19:32, 3 November 2010 (UTC)
Power of the noncentral F-test: Planning studies
The F-test is used in planning the experiment and the anova, because the non-centrality parameter shifts the F-distribution to the right. Using t-tests to plan experiments, as Bailey does in an otherwise fine book, results in larger numbers of subjects than needed, in many cases. This is not discussed, despite it being the main motivation. (Non-central t-distributions are less readily accessible, and don't appear in textbooks on Anova.)
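As a sketch of the planning calculation referred to here, assume the standard one-way fixed-effects layout with k treatment groups, group sizes n_i, total size N, group means μ_i and a common error variance σ². These symbols are introduced only for illustration and do not appear in the comment above. Under those assumptions the F-statistic follows a noncentral F-distribution, and the power at level α is the probability that it exceeds the central critical value:

```latex
\lambda = \frac{1}{\sigma^{2}} \sum_{i=1}^{k} n_i \left(\mu_i - \bar{\mu}\right)^{2},
\qquad
\bar{\mu} = \frac{1}{N} \sum_{i=1}^{k} n_i \mu_i ,
\qquad
\text{power} = \Pr\!\left[ F'_{k-1,\,N-k}(\lambda) > F_{1-\alpha;\,k-1,\,N-k} \right]
```

Because λ grows with the group sizes, a study can be planned by increasing the n_i until the power reaches the desired value.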
Removed confounded experiment: Autopsy reports aliasing of treatments and groups
Examples
This example has no randomized assignment of treatment to subjects. It seems that group-status is perfectly confounded with treatment, so this is a worthless "experiment". Kiefer.Wolfowitz (talk) 20:48, 3 November 2010 (UTC)
Example removed
In a first experiment, Group A is given vodka, Group B is given gin, and Group C is given a placebo. All groups are then tested with a memory task. A one-way ANOVA can be used to assess the effect of the various treatments (that is, the vodka, gin, and placebo).
In a second experiment, Group A is given vodka and tested on a memory task. The same group is allowed a rest period of five days and then the experiment is repeated with gin. The procedure is repeated using a placebo. A one-way ANOVA with repeated measures can be used to assess the effect of the vodka versus the impact of the placebo.
In a third experiment testing the effects of expectations, subjects are randomly assigned to four groups:
- expect vodka—receive vodka
- expect vodka—receive placebo
- expect placebo—receive vodka
- expect placebo—receive placebo (the last group is used as the control group)
Each group is then tested on a memory task. The advantage of this design is that multiple variables can be tested at the same time instead of running two different experiments. Also, the experiment can determine whether one variable affects the other variable (known as interaction effects). A factorial ANOVA (2×2) can be used to assess the effect of expecting vodka or the placebo and the actual reception of either.
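The 2×2 factorial analysis named in the third experiment can be sketched with the usual balanced-design sums of squares. The memory scores below are made up purely for illustration (three hypothetical subjects per cell), and the function and variable names are assumptions rather than part of the removed example. Only the F statistics are computed; turning them into p-values would need an F-distribution table or a statistics library.

```python
from statistics import mean

# Hypothetical memory scores (not from any study), keyed by
# (what the subject expects, what the subject receives); 3 subjects per cell.
cells = {
    ("vodka", "vodka"): [4, 5, 6],
    ("vodka", "placebo"): [7, 8, 9],
    ("placebo", "vodka"): [5, 6, 7],
    ("placebo", "placebo"): [10, 11, 12],
}


def balanced_two_way(cells):
    """Sums of squares, degrees of freedom and F statistics for a balanced two-factor layout."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(cells[(a_levels[0], b_levels[0])])  # replicates per cell
    grand = mean(x for obs in cells.values() for x in obs)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    ss = {
        "expect": n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        "receive": n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        "interaction": n * sum(
            (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
            for a in a_levels for b in b_levels
        ),
        "error": sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs),
    }
    df = {
        "expect": len(a_levels) - 1,
        "receive": len(b_levels) - 1,
        "interaction": (len(a_levels) - 1) * (len(b_levels) - 1),
        "error": len(a_levels) * len(b_levels) * (n - 1),
    }
    mse = ss["error"] / df["error"]
    f = {k: (ss[k] / df[k]) / mse for k in ("expect", "receive", "interaction")}
    return ss, df, f


ss, df, f = balanced_two_way(cells)
for effect in ("expect", "receive", "interaction"):
    print(f"{effect}: SS = {ss[effect]:.1f}, df = {df[effect]}, F = {f[effect]:.1f}")
print(f"error: SS = {ss['error']:.1f}, df = {df['error']}")
```

The formulas used here rely on every cell having the same number of subjects. An unbalanced layout would need a different decomposition.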
Euclidean geometry
In a balanced design, the factors induce an orthogonal decomposition of a Euclidean space; and the converse holds (see Bailey). First project the data onto the mean-value subspace, and then consider that subspace's orthogonal complement, which then needs to be intersected with the treatment and block subspaces (which may have further decompositions). The squared Euclidean norm of the projected residuals is the sum of squares. The degrees of freedom are the dimensions of the subspaces.
With this orthogonality (orthomodularity), the sums of squares add nicely, regardless of any normality of the residuals.
This geometric account of Anova is given in friendlier fashion in Bailey, in Christensen, and in the very friendly Saville & Woods (in 2 volumes) for example. It should be given here. Kiefer.Wolfowitz (talk) 21:11, 3 November 2010 (UTC)
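A small numerical sketch may make the geometric account concrete. It assumes the simplest case, a balanced one-way layout with no blocks, and uses made-up responses; the variable names are hypothetical. The data vector is split into its projection onto the mean-value subspace, a treatment component orthogonal to the mean, and a residual, and the code checks that the pieces are mutually orthogonal so that their squared norms add.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical balanced one-way layout: 3 treatments with 2 units each.
y = [3.0, 5.0, 6.0, 8.0, 10.0, 10.0]
labels = [0, 0, 1, 1, 2, 2]

# Projection onto the mean-value subspace (the span of the all-ones vector).
grand = sum(y) / len(y)
mean_part = [grand] * len(y)

# Projection onto the span of the treatment indicators, minus the mean part,
# gives the treatment component in the orthogonal complement of the mean.
levels = sorted(set(labels))
level_mean = {
    g: sum(v for v, lab in zip(y, labels) if lab == g) / labels.count(g)
    for g in levels
}
fitted = [level_mean[lab] for lab in labels]
treat_part = [f - m for f, m in zip(fitted, mean_part)]

# Whatever is left lies in the residual subspace.
resid_part = [v - f for v, f in zip(y, fitted)]

ss_mean = dot(mean_part, mean_part)
ss_treat = dot(treat_part, treat_part)
ss_resid = dot(resid_part, resid_part)

# Subspace dimensions, i.e. the degrees of freedom.
df_mean, df_treat = 1, len(levels) - 1
df_resid = len(y) - len(levels)

print("cross products:", dot(mean_part, treat_part),
      dot(mean_part, resid_part), dot(treat_part, resid_part))
print("squared norms:", ss_mean, ss_treat, ss_resid, "total:", dot(y, y))
print("dimensions:", df_mean, df_treat, df_resid)
```

Nothing in the check uses normality of the residuals; the additivity follows from orthogonality alone.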
Careful review suggested
This article suffers from everything from obtuse pedagogy (it's essentially useless) to downright inaccurate information about ANOVA, its assumptions, and its small sample robustness and power properties. (The ANOVA F test of difference in means is robust to departures from independence, homoscedasticity, and/or normality? Tell that to the hundreds of Monte Carlo studies published since 1980!) A thorough reading of the Monte Carlo literature after 1980 would benefit this article greatly. My suggestion is that the current editors step back and ask for some help, preferably not from the asymptotic maths lobby, but from qualified applied statisticians who have read the literature post 1980 (but for starters, read Glass, Peckham, & Sanders, 1972; Bradley, 1969, 1972, etc.; Blair, 1980, 1981, 1985, etc.; Sawilowsky, 1990, 1992, etc.) It's just a suggestion - don't reach for the aspirin or saltines. Edstat (talk) 03:49, 15 November 2010 (UTC)
- WP:Be bold: If you see something that can be improved, improve it! Qwfp (talk) 07:20, 15 November 2010 (UTC)
- I rewrote the relevant sentences for greater specificity. I removed the strong claim that referenced Lindmann, because the leading researchers I cite are more guarded in their endorsements of the F-test for Anova's null hypothesis. Kiefer.Wolfowitz (talk) 18:45, 15 November 2010 (UTC)
- The robustness that is referenced is that associated with comparing the p-values from F-tests with the p-values from randomization test of the null-hypothesis (when there has been randomized assignment, or with the permutation test when there need not be random assignment but power is desired against all alternative distributions, following Lehmann & Rosenbaum). This is the benchmark discussed in Hinkelmann & Kempthorne, the reference cited.
- The article doesn't deny that associates of Sawilowsky have found alternatives for which their simulation studies show problems. If such studies were considered important enough to be highlighted in the leading textbooks or the most reliable surveys on ANOVA, then please write an appropriately lengthed paragraph on them. But please consider this question: Aren't the books cited a reasonable selection of the best books on ANOVA, by many standards? Sincerely, Kiefer.Wolfowitz (talk) 15:31, 15 November 2010 (UTC)
- Thanks for starting to relook at some of the issues. The "standard" you are referring to is inappropriate. Under non-normality, the Anova test's robustness is poorer than the permutation test, and in fact one way to fix its Type I error problems is to turn it into a permutation test! As for power, the comparison is not reasonable, because the power spectrum of the ANOVA follows the power spectrum of the permutation test, whereas nonparametric alternatives are greatly higher! Moreover, under heteroscedasticity, what good is it to compare the Anova to the permutation test, when the latter is non-robust to that violation? (See, e.g., Boik). Lehmann retired before Monte Carlo studies could be conducted on a PC, Hinkelmann writes well but is not a world class researcher, and Kempthorne was a pioneer who lived prior to most of the work on the ANOVA conducted by Monte Carlo studies. There are plenty of good textbooks that discuss this, you can start with Wilcox.Edstat (talk) 23:02, 15 November 2010 (UTC)
- EdStat, I have tried to introduce the randomization-perspective in many articles on WP, following ASA & RSS guidelines for a first course in statistics, that the distinction between (randomized) experiments and observational studies is important: Neither the ASA nor the RSS specify why this distinction matters. The answer of Peirce, early Fisher, Kempthorne, and Basu is that the randomization design allows a test of the null-hypothesis using an objective known probability distribution: These arguments leave open the choice of a statistic. Kempthorne noted that the randomization test using the F-statistic gave similar results as the F-test. All else is commentary. ;) Kiefer.Wolfowitz (talk) 14:16, 16 November 2010 (UTC)
- Please re-read what I wrote. Permutation tests have maximum (against all alternatives) power (see Lehmann for conditions). Permutation tests need not have maximum power against some alternative(s); apparently, you refer to some simulation studies of some alternative. Kiefer.Wolfowitz (talk) 14:18, 16 November 2010 (UTC)
- I continue to point out how poor this entry is - just ask ANYONE who wants to learn about the method what they get from reading this entry. Statements such as "Some popular designs have the following anovas:" are just downright silly, as are a dozen other statements. Edstat (talk) 20:48, 18 November 2010 (UTC)
- Editor Qwfp invited you to contribute improvements. At the very least, please list the mistakes. Thanks, Kiefer.Wolfowitz (talk) 21:55, 18 November 2010 (UTC)
- Qwfp, here is the latest example of what I meant. After Kiefer.Wolfowitz's stalking my other edits to delete them on other pages he has now twice deleted an important condition in ANOVA (namely, factorial ANOVA is less robust to testing ordinal interactions than disordinal interactions in the absence of population normality) - a very real, practical concern to anyone actually wanting to use ANOVA. First, I assumed good faith that he just didn't like the section it was in, so I moved it to what may be a better place. But, he then deleted it a 2nd time, this time with the caustic remark in the Edit summary: "off-topic promotion of Sawilowski again." On his talk page he explains his reason - why, after all he is a statistician! Thus, he has no problem with deleting a reference to this: "Underlying assumptions of factorial ANOVA...I add a fifth consideration that is nearly universally overlooked. It is most important to stress that testing for ordinal interactions (Figure 14.8) in factorial ANOVA can be more severely debilitating than test for disordinal interactions (Figure 14.7) when underlying assumptions are violated" Sawilowsky, (2007, Real Data Analysis: A volume in quantitative methods in education and the behavioral sciences: Issues, Research, and Teaching, American Educational Research Association, Educational Statisticians, IAP:Charlotte, NC, ISBN 978-1059311-564-7.) Kiefer.Wolfowitz' opinion obviously trumps citations from that book! That is what I meant by "been there, done that, got wikified." So no, I won't be making any more edits to this page as long as bullying on this page persists, and stalking my other edits on other pages persists, along with the litany of false personal attacks Kiefer.Wolfowitz makes whenever ANY editor crosses him! Edstat (talk) 13:44, 19 November 2010 (UTC)
- Edstat, your personal attacks do not improve your argumentation and the sympathy with which outside editors typically view complaints. (Please observe that editor Melcombe is a useful counterexample to your claim that I mistreat editors: He just deleted the Mathematical Reviews link for Pfanzagl's book, with the "useless" characterization. Please see whether he and I have ever had an edit war, even though we usually come from different perspectives, and he can be frank at times. You know that I can be frank and sometimes irritable, I assume!)
- Edstat, Please try to consider in an article on ANOVA (not on factorial experiments) whether — even before you provide a link to the article on factorial experiments, or explain what they are (apart from saying that they can be arbitrarily complex, which holds only if the number of experimental units is infinite) — it is prudent to promote another finding by Sawilowsky, which does not appear in the most reliable references. I repeat that I removed text and references that I had earlier added, all for the part of making the article more readable (following a reader's complaint, above). Sincerely, Kiefer.Wolfowitz (talk) 14:14, 19 November 2010 (UTC)
- Yes, I admit when I am wrong. You have not made personal attacks against "ANY" editor who you disagree with, which was an exaggeration and I apologize. I must amend to say you make personal attacks and stalk several editors, of which I am one. Furthermore, it is obnoxious for you to decide you are the arbiter of what constitutes a "reliable reference". To call a publication by the American Educational Research Association, an academic organization of perhaps 90,000 Ph. D.s, not a "reliable reference" is obnoxious at the least, and a violation of wikipedia rules for certain, and you know it. However, my past experience with you is that further discussion is fruitless. Let this page continue to be a laughingstock of those who actually use ANOVA in their professional careers. Goodbye. Edstat (talk) 14:22, 19 November 2010 (UTC)
- I have made no statements about the AERA. Kiefer.Wolfowitz (talk) 14:32, 19 November 2010 (UTC)
Heteroscedasticity: Variable variance
Editor Edstat raised concerns about non-normality, and about heteroscedasticity (alternatively, differing variances, or a failure of homoscedasticity!), etc. In the section on the randomization analysis, references to Cox and to Kempthorne are given to support the statement that a proper randomization procedure and unit-treatment additivity imply constant variance. This result explains why both Cox & Kempthorne (and Rosenbaum, Rubin, Imbens, Abadie, Angrist, etc.) emphasize proper randomization and why they emphasize the unit-treatment additivity assumption. When this unit-treatment additivity is implausible, the analysis is more difficult (although local average unit-treatment additivity saves much of the standard analysis). While the article's few paragraphs are not a substitute for a textbook, they at least sketch the central issues, and reference the most reliable sources. Edstat's claim that normality is so important is not supported by the analysis by these authors, who are usually regarded as the most reliable sources. Kiefer.Wolfowitz (talk) 17:25, 19 November 2010 (UTC)
- "A reaction to 'Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance'", Blair, R. C. (1981), Review of Educational Research, 51(4), 499-507. doi: 10.3102/00346543051004499. I hesitate mentioning this reference, which is one among many, given that after all, you are a statistician. Edstat (talk) 17:38, 19 November 2010 (UTC)
- Again, the best authors base their analysis on the randomization distribution, or at least like their degrees of freedom determined by the randomization distribution, which is determined by the assignment mechanism.
- If the study was not a randomized experiment but only an observational study, and you want to focus on specific alternatives (rather than the general class specified by Lehmann & Rosenbaum), then of course violations of "normality" matter, as you say. But such studies are so bad in general that they receive little emphasis in the anova literature in statistics.
- Please see references in the article on statistics education for work by statisticians to help education, which is notorious for the lack of controlled experiments for evaluating teaching (Thomas Cook): Thomas D. Cook, Randomized Experiments in Educational Policy Research: A Critical Examination of the Reasons the Educational Evaluation Community has Offered for Not Doing Them, Educational Evaluation and Policy Analysis 24 (2002), no. 3, 175-199. Kiefer.Wolfowitz (talk) 23:45, 19 November 2010 (UTC)
- (1) I take it you never bothered to look up the reference I gave. You assumed from the journal title that it was an observational study. Neither the motivating study by Glass, Peckham, and Sanders (1972) in RER, nor Blair's RER study, is an observational study. They are both Monte Carlo studies based on mathematical distributions. I would suggest you also examine Sawilowsky's Monte Carlo work on real data sets (or if you prefer, Harrel's, Serlin's, Zumbo's, Zimmerman's, Kromrey's, Ferron's, H. Kesselman's, Kesselman's, R. Ramsey's, Ramsey's, Wilcox's, Huberty's, G. Thompson's, Higgins's, J. Bradley's, Beretvas', Dayton's, de Leeuw's, Feng's, Hambleton's, Huck's, Kirk's, J. Levin's, Lix's, Lomax's, Micceri's, Onghena's, F. Schmidt's, Singer's, S. Weinberg's, S. Wise's, Maxwell's, Toothaker's, Grissom's, Peng's, Becker's, Appelbaum's, Beasley's, and 100's of others' Monte Carlo work if you prefer because you think citing peer-reviewed literature is "promoting" Sawilowsky) except it's obvious you are oblivious to the entire genre of literature, and don't bother to look up citations even when you ask for them. (2) "best authors"? - your bias is showing again! Tell me, "statistician", which wikipage has so-designated the "best authors"? (3) From your comment on the ANOVA on ranks page, it is obvious that you really have no idea what randomization of subjects is all about. (4) "randomization distribution", when the topic is layout? Are you reading what you are writing? (5) You quote "Cook", which was outdated before he even tried to update Campbell and Stanley (1963)? My suggestion, "statistician" - why not actually go to a library and read up a bit on the discipline? Wikipedia deserves you, and you deserve wikipedia! I've already stated I won't edit any page you are working on. I'll also no longer respond to any of your "discussion" page diatribes either. So you, and your editing cabal, will get the last word! Edstat (talk) 03:43, 28 November 2010 (UTC)
- As I wrote, the permutation test has optimality properties against all alternatives. You are discussing specific alternatives, for which other alternatives can have better power. Kiefer.Wolfowitz (talk) 11:51, 28 November 2010 (UTC)
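For readers following this thread, the randomization (permutation) reference distribution of the F-statistic can be computed by brute force in a small case. The sketch below is illustrative only: the two groups of responses are hypothetical, the function names are assumptions, and it takes no side on the power and robustness questions argued above. It enumerates every way of assigning the pooled responses to two equal-sized groups and reports the share of assignments whose F is at least as large as the observed one.

```python
from itertools import combinations
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups."""
    pooled = [x for g in groups for x in g]
    grand = mean(pooled)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(pooled) - len(groups)))


def randomization_p_value(group_a, group_b):
    """Share of all reassignments whose F is at least the observed F."""
    pooled = group_a + group_b
    observed = f_statistic([group_a, group_b])
    hits = total = 0
    for chosen in combinations(range(len(pooled)), len(group_a)):
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so that ties with the observed value are counted.
        if f_statistic([first, second]) >= observed - 1e-9:
            hits += 1
    return hits / total


# Hypothetical responses; assumes no reassignment has zero within-group spread.
treated = [1.0, 2.0, 3.0, 4.0]
control = [5.0, 6.0, 7.0, 8.0]
p = randomization_p_value(treated, control)
print(f"randomization p-value = {p:.4f}")
```

Full enumeration is feasible only for small samples; larger designs would usually sample reassignments at random instead.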
Extremely unclear
After reading this article, I am still left with absolutely no idea how this technique is actually employed. There are many references to "treatments" -- is it used exclusively in medical research? A fully-worked example (including computation) would be a great boon. 121a0012 (talk) 05:59, 5 January 2011 (UTC)
Effect size section self-contradictory
Please see the following section (copied below):
"Though, considering that η2 are comparable to r2 when df of the numerator equals 1 (both measures proportion of variance accounted for), these guidelines may overestimate the size of the effect. If going by the r guidelines (0.1 is a small effect, 0.3 a medium effect and 0.5 a large effect) then the equivalent guidelines for eta-squared would be the squareroot of these, i.e. 0.01 is a small effect, 0.09 a medium effect and 0.25 a large effect, and these should also be applicable to eta-squared. When the df of the numerator exceeds 1, eta-squared is comparable to R-squared (Levine & Hullett, 2002)."
Note that it is self-contradictory. First it says "η2 are comparable to r2 when df of the numerator equals 1" and later says "When the df of the numerator exceeds 1, eta-squared is comparable to R-squared". Any suggestions on which is correct?
I also suggest that this section is removed until consensus is reached.
Trevorzink (talk) 02:28, 6 April 2011 (UTC)
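A quick numerical check may help with the question. The sketch below uses made-up scores for two groups (so the numerator has 1 df) and hypothetical helper names. It verifies that eta-squared from the one-way ANOVA equals the squared Pearson correlation between the scores and a 0/1 group indicator, and it shows that the figures quoted for eta-squared are the squares, not the square roots, of the r benchmarks. It does not address the multi-df case, where the natural comparison is the R-squared of a regression on the group indicators.

```python
from statistics import mean


def eta_squared(groups):
    """Between-groups sum of squares divided by the total sum of squares."""
    pooled = [x for g in groups for x in g]
    grand = mean(pooled)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_total = sum((x - grand) ** 2 for x in pooled)
    return ss_between / ss_total


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores for two groups, so the numerator has 1 df.
group_a = [2.0, 3.0, 4.0]
group_b = [5.0, 6.0, 7.0]
indicator = [0.0] * len(group_a) + [1.0] * len(group_b)
scores = group_a + group_b

eta2 = eta_squared([group_a, group_b])
r2 = pearson_r(indicator, scores) ** 2
print(f"eta-squared = {eta2:.4f}, r squared = {r2:.4f}")

# Squaring (not taking square roots of) the conventional r benchmarks.
for r in (0.1, 0.3, 0.5):
    print(f"r = {r} -> r squared = {round(r * r, 4)}")
```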