# Base runs

Base runs (BsR) is a baseball statistic invented by sabermetrician David Smyth to estimate the number of runs a team "should" have scored given its component offensive statistics, as well as the number of runs a hitter creates or a pitcher allows. It measures essentially the same thing as Bill James' Runs Created, but as sabermetrician Tom M. Tango points out, Base Runs models the reality of the run-scoring process significantly better than any other "run estimator".

## Purpose and formula

The purpose and formula of Base Runs were described in Smyth's Base Runs Primer.

Base Runs was primarily designed to provide an accurate model of the run-scoring process at the Major League Baseball level, and it accomplishes that goal very well: in recent seasons, BsR has had the lowest RMSE of any of the major run estimation methods. In addition, Base Runs can claim something no other run estimator can: its accuracy holds up even in the most extreme circumstances and leagues. For instance, when a solo home run is hit, Base Runs correctly predicts that the batting team scored one run. By contrast, Runs Created predicts 4 runs for a solo HR, and most linear weights-based formulas predict a number close to 1.4 runs. This is because each of these models was developed to fit the sample of a 162-game MLB season; they work well when applied to that sample, but are woefully inaccurate when taken out of the environment for which they were designed. Base Runs, on the other hand, can be applied to any sample at any level of baseball (provided you can calculate the B multiplier), because it models the way the game of baseball operates rather than just a 162-game season at the highest professional level. This means Base Runs can be applied to high school or even Little League statistics.
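The solo home run comparison can be checked with a short Python sketch. It assumes the general Base Runs form A*B/(B+C) + D, with A as baserunners, B as the advancement factor, C as outs and D as home runs, together with one commonly cited set of B coefficients that this article does not itself state. The function names `base_runs` and `runs_created_basic` are hypothetical, and the Runs Created version shown is the basic one, (H + BB) * TB / (AB + BB). For a solo home run A is zero, so the choice of B coefficients does not affect the Base Runs result.

```python
def base_runs(ab, h, tb, hr, bb):
    """Base Runs in the general form A*B/(B+C) + D.

    The B coefficients below are an assumption (one commonly cited
    version); other versions of the B factor exist.
    """
    a = h + bb - hr  # baserunners, excluding batters who homered
    b = (1.4 * tb - 0.6 * h - 3.0 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h       # outs
    d = hr           # a home run always scores the batter
    if b + c == 0:
        return float(d)  # nothing happened except (possibly) home runs
    return a * b / (b + c) + d


def runs_created_basic(ab, h, tb, bb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# One plate appearance: a solo home run (1 AB, 1 H, 4 TB, 1 HR, 0 BB).
bsr = base_runs(ab=1, h=1, tb=4, hr=1, bb=0)
rc = runs_created_basic(ab=1, h=1, tb=4, bb=0)

print(f"Base Runs:    {bsr}")  # 1.0
print(f"Runs Created: {rc}")   # 4.0
```

Because home runs are counted separately in D and removed from A, the estimate for a lone home run is exactly one run, whereas the multiplicative Runs Created form treats the four total bases as if they advanced other runners.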

## Weaknesses of base runs

From the TangoTiger wiki:

"Base Runs adheres to more of the fundamental constraints on run scoring than most other run estimators, but it is by no means perfectly compliant. Some examples of shortcomings:

• BsR will sometimes give a negative estimate; this happens when the B factor is negative.
• BsR will sometimes project many more than three runners left on base per inning, despite the fact that three is the upper limit. For example, if walks have a B coefficient of .1, an inning with 10 walks and 3 outs will yield an estimate of 10*1/(1+3) = 2.5 runs, meaning that 7.5 runners must have been stranded.
• Tangotiger's research found that BsR overvalued events within the .500-.800 team OBP range.

One avenue for possible improvement in the model is the scoring rate estimator B/(B + C). There is no deep theory behind this construct; it was chosen because it worked empirically. It is possible that a better score rate estimator could be developed, although it would most likely have to be more complex than the current one."
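The stranded-runner example in the quoted list can be reproduced numerically. The sketch below uses the same quantities as the quotation (10 walks, 3 outs, a B coefficient of .1 per walk) and assumes the general form A*B/(B+C) + D with no home runs, so D is zero. The helper name `bsr_from_factors` is hypothetical.

```python
import math


def bsr_from_factors(a, b, c, d=0.0):
    """Base Runs from its four factors: A * B / (B + C) + D."""
    return a * b / (b + c) + d


walks = 10        # every batter who reached base did so by a walk
outs = 3          # C factor: outs in the inning
b_per_walk = 0.1  # B coefficient for a walk, as given in the quotation

a = walks               # A factor: baserunners
b = b_per_walk * walks  # B factor: total advancement value

score_rate = b / (b + outs)            # share of baserunners estimated to score
runs = bsr_from_factors(a, b, outs)    # estimated runs for the inning
stranded = walks - runs                # runners implied to be left on base

print(f"score rate: {score_rate:.2f}")
print(f"runs:       {runs:.1f}")
print(f"stranded:   {stranded:.1f}")
```

Under these assumptions the score rate is 0.25, giving 2.5 runs and 7.5 stranded runners, well above the limit of three runners left on base in an inning.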