Talk:Defense independent pitching statistics
|WikiProject Baseball||(Rated C-class, High-importance)|
Removed "xFIP has the highest correlation with future ERA of all the pitching metrics" because the citation given doesn't support that text.
Somebody keeps editing this page to remove the contributions of Clay Dreslough and Bill James to this topic, and to add the name 'TangoTiger'. I'm new to Wiki (I just opened an account after a few weeks of 'anonymous' editing) so I don't know if it's vanity editing or just a misunderstanding.
I'm a college student currently writing a term paper on the development of DIPS (which, as the name implies, are a GROUP of statistics, and are not equivalent to 'dERA' which is only one statistic). This is just one of the factual errors with the current article. I have interviewed several SABR members and constructed the following chronology that I would like to add to this article:
1) In the 1980s, Bill James develops 'Component ERA' (CERA). This estimates a pitcher's ERA from his component stats (strikeouts, hits allowed, etc.), in essence to deduce how "lucky" a pitcher was (if a pitcher's ERA was HIGHER than his CERA, then he was unlucky in the sense that he allowed more runs than he "SHOULD HAVE", based on his other pitching stats).
2) In the late 1990s, Clay Dreslough develops 'DICE' (Defense Independent Component ERA) for his computer baseball game. He noticed that when CERA doesn't match actual ERA, most of the error is in 'hits allowed'. He doesn't publish his work but he does use the stat in his game.
2a) In July 2000, Mr. Dreslough publishes 'DICE' on his web site. He posts the formula and indicates that DICE is a better predictor of a pitcher's ERA in the following season. (http://www.sportsmogul.com/content/dice.htm).
3) In January 2001, Voros McCracken posts his article on DIPS. I edited this page to add a link to this article (http://www.baseballprospectus.com/article.php?articleid=878), but it has since been deleted. Where Dreslough showed that DICE is a better PREDICTOR of ERA in the following season, McCracken did the research to figure out WHY this is the case -- namely that the ability to prevent hits on balls in play is not predictable from one season to the next. Thus, he argues, K, BB, HBP and HR are the only stats worth using to evaluate a pitcher's effectiveness. These are the stats used in deriving dERA (defense-independent ERA).
Voros' work was extremely controversial in that baseball scouts have always assumed that pitchers strongly affect not just WHETHER a batter hits a pitch, but how WELL.
4) The most well-researched counter-argument to McCracken's claims is a recent article by Tom Tippett, the creator of Diamond Mind Baseball, another computer game (www.diamondmind.com). His research is presented in his article at http://www.diamond-mind.com/articles/ipavg2.htm. When this page is unblocked, I plan to summarize this article here, as his findings, although also controversial, do represent the leading edge of research in this area.
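The "hits on balls in play" idea in point 3 can be made concrete with batting average on balls in play (BABIP). The following is only a minimal sketch, assuming the commonly used definition (H - HR) / (AB - K - HR + SF); the function name and the sample stat line are made up for illustration and do not come from any of the articles cited above.

```python
def babip(h, hr, ab, k, sf=0):
    """Batting average on balls in play allowed by a pitcher.

    Home runs and strikeouts are excluded because the fielders never
    get a chance to make a play on them (assumed standard definition).
    """
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play for this stat line")
    return (h - hr) / balls_in_play


# Hypothetical season line, for illustration only.
season = dict(h=180, hr=20, ab=760, k=150, sf=5)
print(round(babip(**season), 3))
```

McCracken's claim, as described above, is that this number varies from season to season far more than a pitcher's K, BB, HBP and HR rates do.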
I have never heard of 'TangoTiger', nor has anyone I have interviewed.
- I don't know why someone keeps editing the page to put a reference to him in it, but TangoTiger is the name of Tom Tango's blog. Tom Tango is a fairly well-known baseball researcher who published "The Book: Playing the Percentages in Baseball". Blankfrackis (talk) 20:10, 21 August 2008 (UTC)
Edit of 2/12/06
This article went through a series of edits over several weeks in which three different anon addresses and account names attempted to delete the contributions of one researcher and substitute those of another researcher. A Google of the two researchers revealed that the deleted name produced just under 50,000 hits, and the substituted name produced just under 500 hits. In each case I reverted the text to preserve the name of the original, better known researcher.
An admin suggested that I edit the article instead of reverting it. Tonight (or really this morning) I did so, and both researchers' work is now reflected. Although I enjoy baseball stats, the material as posted goes deeper than my interest level in such things. I will leave it to other editors to determine if the new content is worth preserving or not. My involvement was based on the Wikipedia values that replacing someone else's work with your own is Wrong, and so long as the other researcher's work remains intact I intend to bow out of this issue. OverInsured 06:56, 13 February 2006 (UTC)
- While I was doing the edit tonight, edits were made that AGAIN deleted one researcher and added the other. I replaced these edits with the merged version that gives more or less equal treatment to both researchers. Please do not delete references to the work of rivals without prior discussion on this Talk page. OverInsured 07:06, 13 February 2006 (UTC)
Edit of 2/13/06
I realize that OverInsured is a mod and I'm not. But this is getting frustrating. I just added a large section describing the logic behind McCracken's research and results and it was reverted once again. I am NOT A VANDAL. I have inserted my work back in, while maintaining OverInsured's addition of the FIP formula.
I do find his logic for reversion to be spurious. A Google search on a nom de plume shows more hits than one for an actual person, so it means that research is somehow more widely recognized? I have left the reference to 'TangoTiger', but I believe this page would be more complete if, like Bill James, Tom Tippett and Clay Dreslough, we had an actual name we could credit his work to, along with a dated research article instead of a small reference inside an undated internet posting related to an entirely different subject. As best I can tell, the reason TangoTiger's formula is so similar to Mr. Dreslough's is because it was copied, and therefore it doesn't bear inclusion in this article -- which should focus on original DIPS and related research. HatTrick
- Let me tell you how to get me to bow out, and it's really simple. Stop erasing the other guy's work! I've already added the bulk of the text that Dreslough previously added, which could have been considered a vanity post. Now that I typed it in it's officially not vanity! If you can be content with having your work added WITHOUT erasing someone else's you've gotten what you wanted! If you'd just left his work alone at the start and added your own I don't have anywhere near the expertise to contradict you, so this need to delete others is what's gotten you reverted consistently for two months. If you have to erase rivals' work to be satisfied you're simply out of touch with both Wiki spirit and policy. OverInsured 08:30, 13 February 2006 (UTC)
I'm getting tired of re-entering the same content instead of working on new content. This is the second revert in 24 hours and your justification is just plain false. If you actually READ my edits last night, you would have realized:
1) I DID NOT delete TangoTiger's formula, his name or "his work" (although I don't know exactly what that means, since it appears that "his work" consists of vandalizing this page -- but since you seem so avid about his name remaining on this page, I left it as is).
2) I DID NOT add Dreslough's name. It was already on the page (from your edits). In fact, I specifically added his formula WITHOUT giving him credit. Although Clay has been helpful to me in my research, I have no agenda to promote his name, and that is why in last night's edit, and again in today's edit, I HAVE NOT added his name.
3) That by reverting this page, you are undoing my work and creating SEVERAL factual errors, including:
- 3a) The statement that DIPS and dERA are "equivalent" (they are as equivalent as "Kenya" and "Africa" are equivalent)
- 3b) The idea that FIP and DICE are "conflicting formula" (they are in fact essentially the SAME formula, which is why the chronology is somewhat important, and I've yet to find anyone who thinks that TangoTiger has made a "contribution" to this field). Nevertheless, you are adamant that his name remain, so I will simply add the names of other researchers and describe their contributions, so hopefully the reader understands how this research and these statistics tie together and evolved from each other. Still, I really don't understand how Wiki's mission requires that we include every bit of info that is ever added to a page without deleting that which is irrelevant or vandalous. If I go add my own name to the 'Batting Average' page right now (because I am using this stat, among many, in my paper, in the same way that TangoTiger appears to have co-opted the DICE formula), will it be impossible for another editor to remove my name?!?
- 3c) That a "ball in play" is equivalent to a "ball hit in the field of play". It is not.
- 3d) The elimination of "hit batters" from the formula.
- 3e) Grammatical errors (such as "is" instead of "in") etc.
If you feel the need to edit this page in some way that I'm not understanding, PLEASE do it by editing the page itself to add the content you need to add. PLEASE do not REVERT the page, undoing the work I'm trying to add. HatTrick EST 17:07 13 Feb 2006
- It will be interesting to see how this article evolves and what your long-term intentions are for the text as future edits are made. (I liked the other structure better, but since I reorganized it that's natural.) For the record, however, for now the current rewrite meets the criteria I set down for not reverting the edits: Tangotiger's work remains listed, and has not been supplanted by someone else's. No one researcher's work is presented as the ultimate guiding light. I hope we get good peer review on the details since my baseball interests in statistics go in other directions. I intend to stay on the sidelines if the above criteria continue to be met. OverInsured 01:34, 14 February 2006 (UTC)
Edit as of 2/15/2006
I'm Tangotiger. First time posting here, so please let me know if I'm doing anything incorrectly.
I just came across this page.
This statement: "As best I can tell, the reason TangoTiger's formula is so similar to Mr. Dreslough's is because it was copied" is totally false. Statements like this should simply not be allowed to be made. I have enough "witnesses" who would back up my claim at the old Baseball Boards (later Fanhome.com, now Scout.com) about how I developed FIP.
I have also corresponded a multitude of times with Voros, and therefore, anyone wanting to know about me could have simply gone to Voros and asked him. He also would back up how FIP was developed.
The best research on DIPS is presented on my site at: http://www.tangotiger.net/solvingdips.pdf
I have also corresponded with Tom Tippett many, many times, and he has told me that he wished he had read that report prior to doing his. Since he is also cited on this page, he too could have been asked about me and my research.
I'm not sure who was asked about "do you know tangotiger", but starting with the people already listed on this page would have been a good start.
Probably the largest online presence of sabermetric following is at baseballthinkfactory.org (previously known as baseballprimer.com, co-founded by Sean Forman of baseball-reference.com, easily the most popular sabermetric online database). They have their own wiki site, with this page on DIPS: http://digamma.net/btfwiki/index.php/DIPS
Now that we have this background, this page should be cleaned up appropriately, ensuring that everyone's work is documented properly.
The initial timeline misses Voros' critical posts on Usenet in the late 90s, as well as followups at the Baseball Boards, and prior to his Baseball Prospectus article.
I had not read the DICE article on Sports Mogul until today. That article was probably posted roughly around when I posted my Defensive Spectrum article, an article that was completely inspired by Voros' work.
That Clay ran a regression on the components to come up with his equation is noteworthy. The more important part is when it was published.
Kevin Harlow derived an equation, that is essentially FIP / DICE, at its core, here: http://members.cox.net/~harlowk22/DIPS-GS.html
You can skip the math and jump down to the last 5 or 6 lines. Tom Tippett, in his article, essentially came up with the same thing as I did (mine was published at the now-defunct Fanhome/Baseball Boards).
So, everyone has come up with the same concept of the 3/13/-2. It was first published either by Clay or me (I'd have to go back through my archives to really get the date), though Clay certainly used it privately prior to me. Where Clay showed it as a regression, I derived its basis, as Tippett then did, and Harlow also did, all independently.
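For readers following the thread, the shared "3/13/-2" core can be sketched as below. This is only an illustration under stated assumptions: the DICE form follows the published Sports Mogul formula (a fixed 3.00 plus the weighted term, with hit batters counted alongside walks), while the FIP form is shown with walks only by default and a league constant whose default value of 3.20 is a placeholder chosen for the example, not a figure taken from this discussion. The function names are hypothetical.

```python
def weighted_core(hr, bb, k, ip, hbp=0):
    """The common 13/3/-2 term: (13*HR + 3*(BB+HBP) - 2*K) / IP.

    `ip` must be true fractional innings (200 1/3 innings is 200.333...,
    not the box-score notation 200.1).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def dice(hr, bb, hbp, k, ip):
    """DICE: a fixed 3.00 plus the weighted term, hit batters included."""
    return 3.00 + weighted_core(hr, bb, k, ip, hbp=hbp)


def fip(hr, bb, k, ip, constant=3.20, hbp=0):
    """FIP: the same weighted term plus a league constant.

    The default constant is an assumed placeholder; in practice it is
    chosen so that league-average FIP matches league-average ERA.
    """
    return constant + weighted_core(hr, bb, k, ip, hbp=hbp)


# Hypothetical stat line: 20 HR, 50 BB, 5 HBP, 180 K in 200 innings.
print(dice(hr=20, bb=50, hbp=5, k=180, ip=200))
print(fip(hr=20, bb=50, k=180, ip=200))
```

With hit batters included and the constant set to 3.00, the two functions return the same number, which is the sense in which the formulas are "essentially the same" at their core.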
The true architect has to remain Voros, as he is the one that turned the whole online sabermetric community upside down with his initial findings and claims. A substantial portion of this page should be devoted to Voros, and little blurbs should be devoted to the rest of us.
And I also see references like "Page is locked. I'm guessing 'TangoTiger' is messing with it". Please remove all such speculations. I've never been here before today.
- This is great guidance for all of us! This answers most of the open questions that led to a series of conflicting edits and puts everyone's contributions in perspective. It's also refreshing to hear someone give the lion's share of the credit to others. Do you want to edit the text to reflect these comments? If not, next week when I have time I could take a crack at it and then you, Dreslough, HatTrick and other editors could take a look at it and edit further. My interests in baseball don't run along these lines so I can't judge the statistical elements, but if no one else wants to I believe I could merge the facts as presented to give McCracken the founding credit and show how others' research went on from there. OverInsured 21:49, 15 February 2006 (UTC)
I will take a crack at it, and I will get Voros to confirm my text, to ensure its accuracy. I will post it here, and then you can revise as you see fit. There are also plenty of guys at baseballthinkfactory.org who have followed DIPS as long, or longer, than I have, and they'd be able to offer a different/more accurate perspective than I could. I'll ask them to chime in with edits, once you feel it's in a ready state to be edited. Thanks...
- That sounds great! The whole idea of Wikipedia is for the real experts to edit the articles so we learn from the people who really know. BTW, if you put 4 tildes (~) after your post the system will sign your user name and add a time stamp, which makes tracking discussions easier. Welcome! OverInsured 22:09, 15 February 2006 (UTC)
Hey this is Voros. I really don't want to add or edit anything here. I'm of the opinion the principals involved shouldn't take part in an encyclopedic entry. Feel free to tear me a new one if you like, whatever folks think is warranted.
What I would like to do is establish Nov 18, 1999 as a baseline of sorts. This is, as far as I know, my first public discussion of DIPS. Fortunately it is on usenet so:
There you go. There's some other stuff after that folks can hunt up if they wish (particularly on USENET), but that's really far beyond what I think Wikipedia would need to cover here. I just wanted to lay down a date so folks can work from it.
Whatever people want to do from there is fine with me. Credit when not accompanied by large piles of cash is overrated. :D
Love Wikipedia and honored to even get a mention here. Keep up the fine work folks...
Voros 08:12, 23 February 2006 (UTC)Voros
I apologize for not adding any Wiki content. My time isn't what I thought it would be. I'll try to stay in the loop on this one. Tangotiger 19:47, 13 March 2006 (UTC)
Note about citations -- June 21, 2006
I have not made any substantive edits (additions, deletions, corrections). I have only added citations within the body of the text. Although a few of these were given at the bottom, some "original" sources were missing and some of those were hard to dig out. When the text refers to a particular article or essay or blog as a source for, say, a formula or later discussion, it really ought to provide the footnote reference immediately. Wikipedia can't be THE ORIGINAL source, so its essays need to be documented by links to original formulas, essays, and so forth. I hope what I've done is helpful. Mack2 15:27, 21 June 2006 (UTC)
I'm not sure I understand what the struggle has been all about on this article. But it strikes me as very odd that in an article devoted to defense independent pitching statistics (a term that McCracken coined), we are offered the formulas for DICE and FIP but nobody has bothered to post McCracken's own formula, at least in simplified form. Mack2 07:03, 22 June 2006 (UTC)
- The reason that McCracken's own formula is missing is that it is quite complicated. We have a link to it, but it would take the better part of a page to describe it. What this article needs right now is a more proper description of the work done by others besides McCracken on this subject. Right now, Tippett's is the only major work mentioned, when at the very least, Tom Tango, Mitchel Lichtman, David Gassko, and JC Bradbury each contributed at least as much to the body of knowledge on the subject. Your edits to source this article, however, have been much appreciated. Bibigon 07:49, 22 June 2006 (UTC)
- Thank you. One point, however. It's clear from a reading of Rob Neyer's column of April 24, 2001 (a day after the full article was published on BP) and April 26 (with extended comments from readers including Craig Wright and Bill James) that while the 1999 Usenet "article" (which was really just a preliminary fragment of McCracken's research) sparked interest among baseball research fanatics, only the publication in 2001 really brought wide attention to his theory. It was new(s) to Bill James, for example. And it was news to Neyer: http://espn.go.com/mlb/s/2001/0115/1017090.html. (It also appears that Neyer was able to present a simplified formula for DIPS in his column, without all the details.) And Alan Schwarz reports in his book that the real explosion of interest followed the BP and Neyer/ESPN exposure, and didn't come before that. That's when "all hell broke loose," as Schwarz quotes McCracken as saying. Mack2 14:14, 22 June 2006 (UTC)
- DIPS Version 2.0 by Voros McCracken
- Pitching Defense Independent Pitching Stats, Version 2.0 Formula by Voros McCracken
- Pitching and Defense: How Much Control Do Hurlers Have? by Voros McCracken
- Counterpoint: Pitching and Defense by Keith Woolner
- From The Mailbag - Special Edition: Pitching and Defense by Voros McCracken and Keith Woolner
- Larry Mahnken's DIPS Worksheets
- Tom Tango's Tangotiger Site, introducing FIP
- DICE explanation page at sportsmogul.com
- Jay Jaffe's summary and updating of DIPS statistics, 2004
- Solving DIPS by Erik Allen, Arvin Hsu and Tom Tango
- DIPS Revisited by Mitchel Lichtman
- Another Look at DIPS by JC Bradbury
- Batted Balls and DIPS by David Gassko
- DIPS, LIPS, and HIPS by David Gassko
- Uncovering DIPS by David Gassko
- "Prospectus Toolbox: Dying Quails and Pitchers BABIP" by Derek Jacques
Criticism of DIPS page
1. Actual equation for DIPS or DIPS 2.0 is not given -- only a "simpler" equation is given.
2. As a statistical reference, this -- and any other similar measure, whether in the baseball portal or not -- should answer basic statistical questions. These should include:
a. Is the process being explained a stochastic process?
b. Does the system fall into a Wiener process or Brownian motion process?
c. Is the process subject to a partial differential equation?
3. There is no mention of this, but I would think that studying the system of baseball, wherein a pitcher throws and a batter makes contact, would fall under a stochastic process whereby we could compute, based on factors derived from the pitcher, the hitter and the defense, a fair comparison across pitchers.
4. There is no criticism section at all in the main DIPS page.