Wikipedia talk:WikiProject Statistics

From Wikipedia, the free encyclopedia

This page is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.


Thank you for your efforts

I know this isn't the place for this, but as someone who had no statistics (it was still possible in the 1980s to get a PhD in maths with no stats) but desperately needs to explain stuff to her children, who are studying two types of engineering, I REALLY appreciate all the effort that has gone into making the statistics pages understandable at every level (starting low and going up). (Having worked on the mathematics Wikipedia pages, I know how difficult it is to come to a consensus.) Thank you. Lfahlberg (talk) 13:28, 22 January 2014 (UTC)

"Free statistical software"

Free statistical software is an odd article, seemingly lifted as a whole from Citizendium and giving avuncular advice about statistics freeware (despite its title, not free software) but also statistics in general. Bits within it strike me as worthy of retention though, somewhere. Any ideas? -- Hoary (talk) 00:53, 18 February 2014 (UTC)

Statistical software redirects to List of Statistical Packages. My feeling is that there should be a page about statistical software in general. Then the list page can state what licenses they use. Is there anything unique about Free Statistical Software other than its license?
On a separate note, what are the guidelines regarding using content from Citizendium? Jonpatterns (talk) 10:08, 18 February 2014 (UTC)
No, this has nothing to do with license. It's not about the libre, it's about the gratuit. Offhand I can't think of examples that are the former and not the latter (and thus wouldn't be in the article), but the article has plenty of examples of what are the latter but not the former. Is there anything special about software that's gratuit? A lot of people would say no, but I'd guess that most of these either have their software paid for by others or have a reliable and ample salary. Personally, I'm not at all unsympathetic to the idea of publicizing the gratuit. Prices could easily be added to Comparison of statistical packages. Oh no, wait: they can't. ("An article should not include product pricing or availability information unless there is a source and a justified reason for the mention.") -- Hoary (talk) 12:49, 18 February 2014 (UTC)
I think the article should be renamed Statistical software. The advantages of libre licensing and gratuit pricing can be discussed as part of the article. The exact pricing of a solution isn't needed; it can simply be said to be no/low/med/high cost. Jonpatterns (talk) 13:26, 18 February 2014 (UTC)
The advantage of the gratuit should be obvious. Of course, some people may think of the cliché "There's no such thing as a free lunch" and extend it to software: surely what costs no money must be a trojan or similar? Curiously, the article freeware doesn't do anything to allay such fears; but if any article should allay them, that is it. After all, there's no obvious reason to think that the pluses and imaginable minuses of statistics freeware as compared with for-money statistics software would differ from those for software with other applications. Likewise, the pluses and imaginable minuses of free software (and the related open-source software) can be discussed in those articles. Meanwhile, I'm not at all sure that software can be described as no, low, medium or high cost unless we can present sources commenting on this. Plus these terms have debatable meaning: I'm a stingy bastard, so what's "medium cost" for you can be "high cost" to me; everyone you know qualifies for the academic discount for product X, but it doesn't apply to me; North Americans are quoted prices for Y that I have to grudgingly concede are "medium", but here in Japan Y is only available via one firm, which trebles its price (citing its addition of Japanese documentation, which I don't want anyway); etc., etc. -- Hoary (talk) 09:15, 19 February 2014 (UTC)
We do have lists that categorize as freeware, free software, or commercial software, which is quite specific and generally easy to source. - MrOllie (talk) 15:44, 19 February 2014 (UTC)

Popular pages tool update

As of January, the popular pages tool has moved from the Toolserver to Wikimedia Tool Labs. The code has changed significantly from the Toolserver version, but users should notice few differences. Please take a moment to look over your project's list for any anomalies, such as pages that you expect to see that are missing or pages that seem to have more views than expected. Note that unlike other tools, this tool aggregates all views from redirects, which means it will typically have higher numbers. (For January 2014 specifically, 35 hours of data is missing from the WMF data, which was approximated from other dates. For most articles, this should yield a more accurate number. However, a few articles, like ones featured on the Main Page, may be off).

Web tools, to replace the ones at tools:~alexz/pop, will become available over the next few weeks at toollabs:popularpages. All of the historical data (back to July 2009 for some projects) has been copied over. The tool to view historical data is currently partially available (assessment data and a few projects may not be available at the moment). The tool to add new projects to the bot's list is also available now (editing the configuration of current projects coming soon). Unlike the previous tool, all changes will be effective immediately. OAuth is used to authenticate users, allowing only regular users to make changes to prevent abuse. A visible history of configuration additions and changes is coming soon. Once tools become fully available, their toolserver versions will redirect to Labs.

If you have any questions, want to report any bugs, or there are any features you would like to see that aren't currently available on the Toolserver tools, see the updated FAQ or contact me on my talk page. Mr.Z-bot (talk) (for Mr.Z-man) 05:28, 23 February 2014 (UTC)

AfC submission

Wikipedia talk:Articles for creation/Variance of Effect-size. FoCuSandLeArN (talk) 12:55, 24 February 2014 (UTC)

Another AfC submission

I just created Articles for creation/Abundance estimation. I intend to add to it over the next few weeks/months, but I welcome any help or additions. I intend it to be an overview of abundance estimation methods and their applications. My main focus is currently on mark-recapture methods, so I would particularly welcome input/additions about other methods. Jamesmcmahon0 (talk) 21:33, 5 March 2014 (UTC)

Audience considerations

I've just read the articles on Wiener and Gaussian processes. I am not a mathematician, but I am a social scientist with an interest in research methodologies. I was hoping to find a clear description of cases where an assumption of normal distribution is sound. I am working on a paper where qualitative interviews indicated heterogeneity in a key behavior, so we have looked at splitting our groups using k-means cluster analysis, based on continuous behavior observation data. We found that previously used groupings of observations, where agents had been assumed to have homogeneous behavior, had heterogeneous behavior and that individuals clustered together in multiple equilibria. We have had a lot of push-back from the statisticians and stats-trained researchers in the group, because they claimed at first to not understand the method and then said the findings were probably exaggerated.

I came to Wikipedia with these concerns: what conceptual framework supports an assumption of homogeneity or heterogeneity? What tests are available to establish one or the other? What types of cause-and-effect relationships underlie equilibrium processes that exist in reality? Basically, I wanted to turn the argument around and ask them to question their assumptions in the same light they were questioning my work.

I searched the web for "empirical support for homogeneity and normal distributions" and saw the word "process" with wikipedia in the search results, and thought I was on the right track for finding information about the causal/conceptual framework, like an operational model, a process flow diagram or at least a textual description of what characteristics typify these sorts of processes, or something like that. But, I was completely unprepared to understand what I was reading. It was not helpful or useful to me at all.

I don't know in general about all of the articles in the math/stats project at Wikipedia, but these articles were not accessible to me. I think they would be inaccessible to any non-mathematician. The sort of 'text book talk' in proofs and formulas can be helpful. I've really appreciated the project's sensitivity and specificity articles. But in these articles there was nothing but 'text book talk'. I had no frame of reference to understand these articles.

Maybe it is my applied research background that cripples me in the more basic research and math theory arena, but it seems like the audience for wikipedia should be somewhat like that of an encyclopedia, not a text book. And definitely not an advanced undergraduate/graduate school level textbook.

So, all I can say in response to my colleagues, for now, is "your assumption contradicts the beliefs of the real people we are claiming to study" and "I've shown that there isn't a tendency toward an equilibrium between our three core behavioral indices, but toward multiple points of equilibrium". I am guessing they will reply "we know better than the people we are studying, they don't realize their equilibrium-seeking tendencies" and "all you've shown is something so confusing that we don't understand it and that you don't know how to do things the old-fashioned, tried and true way".

I thought the wikipedia articles would help explain how empirical single-equilibrium processes occur, something about the standard approach for supporting an assumption of equilibrium and if and how homogeneity relates to the discussion and... And all I found were pieces written to an audience so specific that I didn't learn a single thing, although the figures did say something to me, but I can't explain what because the article didn't say.

I don't want this to be a place to settle a dogmatic/ideological score, but I do think the audience should be considered in a more meaningful way. I wanted to find information that could help me make sense of complicated math stuff, but it was over my head. I'm sorry to see that.

Wikipedia talk:Articles for creation/Multivariate metamodelling of dynamic models

Dear statistics experts: I am guessing that this old AfC submission is about some kind of statistical analysis. It will be deleted shortly unless someone decides that it is a notable topic and should be kept and improved. Just saying... —Anne Delong (talk) 00:45, 4 April 2014 (UTC)