# Talk:Benford's law

WikiProject Mathematics (Rated B-class, Low-importance)
Field: Probability and statistics
One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated B-class, High-importance)


## Explanation

The section entitled "Explanation" is difficult to understand. Please remember that Wikipedia is a general encyclopædia and is equally likely to be viewed by individuals of average mathematical knowledge as it is to be viewed by specialists. 68.49.208.76 06:30, 2 September 2007 (UTC)

You are correct - but accurate, easy-to-understand explanations of technical issues are very, very hard to write. Wikipedia is full of excellent articles that are useless to 99% of humanity for that very reason. And Benford's Law is particularly tricky to explain to laymen; it's so counter-intuitive. - DavidWBrooks 11:53, 2 September 2007 (UTC)

Please take a look into: http://www.dspguide.com/ch34.htm. In short, it explains the "law" as an artifact of the manipulation of the data. Pretty well written and easy to understand. Marco 12:51, 14 September 2008 (UTC)

I dunno. This is the heart of its argument, quoted:
In answer to our question, the logarithmic pattern of leading digits derives solely from sf(g) and the convolution, and not at all from pdf(g).
Putting that into layman's terms will be - well, interesting. Feel free to take a shot at it, though. - DavidWBrooks (talk) 12:54, 14 September 2008 (UTC)
When you get past all the making-a-lot-out-of-quantitative-details, he's just saying that if a probability distribution is broad and reasonably flat on a log-scale, then Benford's law is expected to hold. That's an easy point to explain and get across. Just put a broad, smooth distribution on a properly-labeled log scale. Maybe, for comparison, put a really sharp distribution on a log scale. Readers will be able to look at the log scale and see how Benford's law should hold in the first case but not the second.
The problem is that a "properly-labeled log scale" includes labels for 1,2,3,4,5,6,7,8,9,10,20,30,etc., and it's a bit tricky to do that in the programs I have. I'll try though.... --Steve (talk) 19:22, 14 September 2008 (UTC)
The logarithmic-scale probability density function for an exponential decay process. The area under the curve between two points is proportional to the probability that the function has a value between those two points. Note that this looks different from the conventional depiction of exponential decay. This is because the x-axis is distorted by the logarithmic scaling, so the height also has to be distorted for the area under the curve to be correct.
Here's a start. There's a published paper that says that exponential-decay probability functions satisfy Benford's Law to within a few percent. This matches up with the fact that the probability-density function is reasonably smooth over about two orders of magnitude, as shown in this pic. The text would describe how, since 30.1% of the horizontal (logarithmic) number line lies between 1's and 2's, it's not surprising that around 30.1% of the area under this particular curve lies between 1's and 2's. --Steve (talk) 20:50, 14 September 2008 (UTC)
OK, I decided against that particular picture but added the appropriate explanation and citations with diagrams. Hope it's helpful. --Steve (talk) 05:36, 9 October 2008 (UTC)
The new material looks good. Derek farn (talk) 11:12, 9 October 2008 (UTC)
Unfortunately, the new material is original research. We therefore must forget it. Wikipedia is an advanced technology for collecting conservative garbage! --Javalenok (talk) 12:23, 24 October 2011 (UTC)
Not original research: The argument is the same as the cited article in "The American Statistician". [Well, the discussion above was in 2008, and the "American Statistician" article was in 2009. So if you had objected in 2008, you might have had a point... But right now it is definitely not original research.] --Steve (talk) 21:42, 24 October 2011 (UTC)
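The claim earlier in this thread, that an exponential-decay distribution follows Benford's law to within a few percent, can be checked numerically. The following is only a minimal sketch: the rate of 1 and the range of decades summed over are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

def exponential_first_digit_probs(rate=1.0, decades=range(-12, 6)):
    """P(leading digit = d) for X ~ Exponential(rate), summed decade by decade."""
    probs = {}
    for d in range(1, 10):
        total = 0.0
        for k in decades:
            lo, hi = d * 10.0**k, (d + 1) * 10.0**k
            # P(lo <= X < hi) = exp(-rate*lo) - exp(-rate*hi)
            total += math.exp(-rate * lo) - math.exp(-rate * hi)
        probs[d] = total
    return probs

# Benford's law for comparison: P(d) = log10(1 + 1/d)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
exp_probs = exponential_first_digit_probs()

for d in range(1, 10):
    print(d, f"Benford {benford[d]:.3f}", f"exponential {exp_probs[d]:.3f}")
```

Changing `rate` shifts the curve along the log axis and changes the digit frequencies slightly, which is one way to see that the agreement is approximate rather than exact.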

Berger and Hill sum this up well by saying, "A simple explanation? Are you sure?"--Jack Upland (talk) 02:02, 25 May 2014 (UTC)

A probability distribution on a log scale. In this example, the ratio of the red to blue width is very close to the ratio of the red to blue area. So it follows Benford's law, pretty accurately, but not perfectly.
Jack -- To the right is what we're talking about. I personally think that this is a "simple explanation". Your comment implies that you disagree. Do you think that it's "not simple", or "not an explanation", or both? Why? Could you elaborate? --Steve (talk) 06:30, 28 May 2014 (UTC)
I was referring to this article - [1] - which I cited on the page. They know far more than I do, but personally I find it mysterious that I can pick up an atlas and demonstrate Benford's Law, and I have not found an explanation that I understand or even believe.--Jack Upland (talk) 10:19, 28 May 2014 (UTC)


Ah, gotcha. In the notation of that article, I have been emphasizing (4) (or (3) with an improved wording that negates their complaints), following the Fewster article. I.e., "Benford's law occurs when, if you plot the probability distribution on a log scale, it's spread out pretty smoothly over many orders of magnitude."
They are retorting that this is not a complete explanation, because I have not said why real-world data-sets often have probability distributions that span several orders of magnitude relatively smoothly. Well, I agree with that! The answer is, indeed, not simple. The answer for "stock market prices" is certainly different than the answer for "area of islands".
For stock market prices, they tend to change proportionally, like a $1 stock would go up or down by 1% ($0.01) with a comparable frequency as a $1000 stock would go up or down by 1% ($10). When you have that kind of dynamics, acting over a long period of time, the probability distribution (on a log scale) tends to get more and more smooth and spread out.
For the population of cities and towns, actually the same explanation still works I think. A town of 200 people might grow to 250 people or shrink to 150, with a similar frequency as a town of 200,000 people might grow to 250,000 people or shrink to 150,000. So again, you get the dynamic where the probability distribution (on a log scale) tends to get more and more smooth and spread-out over time.
For the area of islands, on the other hand, I don't know why the probability distribution would be pretty broad and smooth on a log scale. I don't know enough about the geological processes that cause island formation.
So, in that respect, I tend to agree with that paper. We should not expect to find a single simple explanation that applies to every real-world situation where you find Benford's law.
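The proportional-change argument made above for stock prices and town populations can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real data: every value starts at 100, each step multiplies it by a random factor near 1, and the step count, spread, and seed are arbitrary.

```python
import math
import random

def first_digit(x):
    """Leading significant digit of a positive float."""
    return int(f"{x:.17e}"[0])

def simulate(n_values=3000, n_steps=200, sigma=0.15, start=100.0, seed=0):
    """Apply repeated proportional (multiplicative) random changes."""
    rng = random.Random(seed)
    values = [start] * n_values
    for _ in range(n_steps):
        # Each step is a percentage change, independent of the current size.
        values = [v * math.exp(rng.gauss(0.0, sigma)) for v in values]
    return values

def digit_frequencies(values):
    counts = [0] * 10
    for v in values:
        counts[first_digit(v)] += 1
    return {d: counts[d] / len(values) for d in range(1, 10)}

freqs = digit_frequencies(simulate())
for d in range(1, 10):
    print(d, f"simulated {freqs[d]:.3f}", f"Benford {math.log10(1 + 1 / d):.3f}")
```

At the start every value has leading digit 1; after many multiplicative steps the values are spread over several orders of magnitude and the leading-digit frequencies come out close to the Benford values.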

The conclusion of the paper is: "Although many facets of BL now rest on solid ground, there is currently no unified approach that simultaneously explains its appearance in dynamical systems, number theory, statistics, and real-world data. In that sense, most experts seem to agree with Fewster that the ubiquity of BL, especially in real-life data, remains mysterious." I don't think you agree with that. (And incidentally I can't find that in Fewster's paper.)--Jack Upland (talk) 00:45, 30 May 2014 (UTC)

I would say that "there is no unified approach that explains why you often find probability distributions that, on a log scale, are spread out pretty smoothly over many orders of magnitude, in many different fields like dynamical systems, number theory, statistics, and real-world data". I think it can happen for many different reasons.
It's quite similar to saying "there is no unified approach that explains why multiplication occurs in many different fields like dynamical systems, number theory, ..." It occurs when calculating the area of a rectangle, and when calculating compound interest, and when writing down laws of physics, etc. etc. What a coincidence! Or is there a "unified approach" that explains why multiplication occurs everywhere? Not really. You can try, but it would be so extremely vague that there's no real point!
(Loosely speaking, whenever you are doing primarily multiplication and division, rather than addition and subtraction, you can easily wind up with data that is spread out pretty smoothly over many orders of magnitude.)
I agree with Berger and Hill that there is "currently no unified approach", but I would drop the word "currently". :-P
I agree with you that the reference to Fewster makes no sense. Maybe it's from personal communication, rather than what Fewster wrote in her paper. --Steve (talk) 18:18, 29 May 2014 (UTC)
Update: I rewrote a bit, mainly merging the Limitations section with the Explanations section. I added a footnote with quotes from the Berger and Hill paper that I think make it very clear that they accept the argument in the Limitations section, and don't consider it a "simple explanation" because it does not explain why we so often come across data-sets that, when plotted as a probability distribution on a log-scale, vary smoothly over several orders of magnitude. --Steve (talk) 18:44, 30 May 2014 (UTC)

## Dispersion should not be too small

Normally never mentioned: the dispersion or variance should not be "too small". A kind of proof in Nordisk Matematisk Tidskrift from 1965 (or thereabouts) has that condition included in the proof. —Preceding unsigned comment added by 130.226.230.8 (talkcontribs) 16:18, 16 May 2008

## Error

I'm not convinced the log10 should change to log100 just because we look at two digits instead of one. The number is still base10.

Here's an example: Numbers that start with 1 should comprise 30.1% of the numbers. If we subdivide all the numbers beginning with 1 into 10,11,12,...19, we should expect these ten sub-numbers to add up to the 30.1% expectation of all numbers beginning with 1. Using Log10 (and NOT Log100) yields:

10 - 4.14%

11 - 3.78%

12 - 3.48%

13 - 3.22%

14 - 3.00%

15 - 2.80%

16 - 2.63%

17 - 2.48%

18 - 2.35%

19 - 2.23%

(summing the distributions)

yields 30.1%

. . .which is exactly what we would expect.

Using Log100, on the other hand, will yield only half of the expected value. You can duplicate this result for all the ranges 1-9.

Caleb B caleb@tcad.net —Preceding unsigned comment added by 69.29.42.173 (talk) 21:21, 10 June 2008 (UTC)

You are right. If you take log 100, then the cumulative probability is log_{100} (100) - log_{100} (10) = 1/2, whereas it should be 1. The right probability of a group of digits n (=10,...,99) is log_{10} (n+1) - log_{10} (n), which is also the probability mentioned in Hill's paper. 131.155.15.29 (talk) 09:46, 9 July 2008 (UTC)
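The arithmetic in this section can be reproduced with a short sketch. It uses the formula quoted above, log_{10}(n+1) - log_{10}(n), for each two-digit leading group n, and also sums the base-100 variant to show that it reaches only one half. The function and variable names are hypothetical.

```python
import math

def leading_group_prob(n):
    """Benford probability that a number's leading digits form the integer n."""
    return math.log10(n + 1) - math.log10(n)

two_digit = {n: leading_group_prob(n) for n in range(10, 100)}

# Sub-groups 10..19 should add up to the one-digit probability for "1".
starts_with_1 = sum(two_digit[n] for n in range(10, 20))
total = sum(two_digit.values())

# The base-100 variant disputed in this section.
flawed_total = sum(math.log(n + 1, 100) - math.log(n, 100) for n in range(10, 100))

for n in range(10, 20):
    print(n, f"{100 * two_digit[n]:.2f}%")
print("sum for 10..19:", f"{100 * starts_with_1:.1f}%")
print("sum for 10..99:", f"{total:.3f}", " base-100 variant:", f"{flawed_total:.3f}")
```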

## scale invariance

I'm not sure the example with feet and yards is correct. I'm not a statistician, but what I understand based on Harold Jeffreys' book (1939; Chap. 3) is that the problem arises when you are dealing with scales that have a different number of dimensions. There is no reason to think that a measurement squared should have a different distribution than a measurement cubed. Imagine that you have cubes of random sizes. There is no reason to think that the distribution of the volumes of the cubes should look different from the distribution of the areas. If the measurements gave a uniform distribution for the area (equal numbers of each digit), it would give a non-uniform distribution for the volume and vice versa. The only distribution that doesn't change when you transform from the area to the volume (or take any other power) is to assign the uniform probability to the logarithm. This also illustrates why Benford's law only applies to scales that can't be negative. A cube can't have a negative volume or a negative area. There is an asymmetry in things that can only be positive because when you go towards the negatives you hit a wall at zero whereas you can go to infinity towards the more positive values. Measurements that can be negative don't have this asymmetry. Going towards the negatives is the mirror of going towards the more positives and thus things tend to get distributed uniformly.--BenE (talk) 04:50, 6 December 2008 (UTC)

Strictly, cubes can have negative volume, if they can have negative sides. But the whole thing is related to scale invariance which happens to be associated with power laws and so in turn follows the solution to the dimension problem. Whether you prefer
log(2)-log(1) = log(6)-log(3), or
log(2)-log(1) = log(sqrt(2))-log(sqrt(1))+log(sqrt(20))-log(sqrt(10)) = log(cbrt(2))-log(cbrt(1))+log(cbrt(20))-log(cbrt(10))+log(cbrt(200))-log(cbrt(100))
as an illustration is a matter of personal choice, though I think the former is easier to understand in the context of first digits.--164.36.38.240 (talk) 10:26, 8 January 2009 (UTC)

The point about scale invariance should be trivially obvious. For things like river length or income there is no obvious natural unit to count in. We could measure river length in miles, kilometres, or any historical measure and the law still applies. Similarly income will usually be measured in the local currency but the law will hold true whether we use USD, Indian Rupees or gold ounces (for the same data). This is in contrast to things like population where the obvious unit to count in is people.

The "why" link looks like some individual is just showing their mathematical naivete. Either put an explanation there or leave it alone but asking "why" is basically saying "I don't get it" - which is fine but not something anyone else needs to know. —Preceding unsigned comment added by 88.104.110.246 (talk) 17:28, 1 March 2011 (UTC)
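The unit-independence point discussed in this section can be illustrated with a small deterministic sketch. The data here are an assumption made purely for illustration: hypothetical lengths in feet, evenly spaced on a log scale over exactly six decades, then converted to yards.

```python
import math

def first_digit(x):
    """Leading significant digit of a positive float."""
    return int(f"{x:.17e}"[0])

def digit_share(values, digit=1):
    """Fraction of the values whose leading digit equals `digit`."""
    return sum(1 for v in values if first_digit(v) == digit) / len(values)

# Hypothetical data: lengths in feet, log-uniform from 1 up to 10**6.
POINTS_PER_DECADE = 10_000
lengths_ft = [10 ** (i / POINTS_PER_DECADE) for i in range(6 * POINTS_PER_DECADE)]
lengths_yd = [x / 3 for x in lengths_ft]  # same lengths, different unit

print("feet :", f"{digit_share(lengths_ft):.4f}")
print("yards:", f"{digit_share(lengths_yd):.4f}")
print("log10(2):", f"{math.log10(2):.4f}")
```

The share of leading 1s is essentially the same in both units because rescaling only shifts a log-uniform distribution along the log axis. Data that are narrow on a log scale would not behave this way.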

## Paper on the arXiv

I have added a link to a paper on the arXiv, which discusses Shannon entropy and Benford's law and whatnot. Perhaps someone might wish to add something to the page from that, and move the link from "external links" to "references". —Preceding unsigned comment added by Paul Murray (talkcontribs) 00:08, 23 January 2009 (UTC)

## Nonsense in the text

It is simply not true that the probability of something is just the area under a curve when drawn in logarithmic scale.

${\displaystyle \Pr(a<x<b)=\int _{a}^{b}p(x)\,dx}$

Making a substitution x = 10^y (y is, hence, the coordinate on the logarithmic axis) yields

${\displaystyle \Pr(a<x<b)=\int _{\log _{10}a}^{\log _{10}b}p(10^{y})\,10^{y}\ln(10)\,dy}$

I.e. the probability is the area under the curve f(y) = ln(10) p(10^y) 10^y, which is completely different —Preceding unsigned comment added by 147.231.27.150 (talk) 14:19, 11 February 2009 (UTC)

Right, the transformation needs to be accounted for in the integrand. Where is the problem in the article? Baccyak4H (Yak!) 14:37, 11 February 2009 (UTC) improved OP math markup Baccyak4H (Yak!) 14:41, 11 February 2009 (UTC)
This is Footnote [5] in the article:
"Note that if you have a regular probability distribution (on a linear scale), you have to multiply it by a certain function to get a proper probability distribution on a log scale: The log scale distorts the horizontal distances, so the height has to be changed also, in order for the area under each section of the curve to remain true to the original distribution. See, for example, [1]"
--Steve (talk) 09:23, 12 February 2009 (UTC)
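The rescaling described in the footnote can be checked numerically. The sketch below assumes an exponential density with rate 1 as the example distribution. It integrates the density on the linear axis, the rescaled density on the log axis, and, for contrast, the unrescaled density on the log axis. All function names are hypothetical.

```python
import math

def pdf_linear(x, rate=1.0):
    """Exponential density on the ordinary (linear) x-axis."""
    return rate * math.exp(-rate * x)

def pdf_log10(y, rate=1.0):
    """Density of y = log10(x): the linear density times dx/dy = ln(10) * 10**y."""
    x = 10.0 ** y
    return math.log(10) * x * pdf_linear(x, rate)

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule numerical integration of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a, b = 1.0, 2.0
exact = math.exp(-a) - math.exp(-b)  # P(a < X < b) in closed form
area_linear = integrate(pdf_linear, a, b)
area_log = integrate(pdf_log10, math.log10(a), math.log10(b))
# Without the rescaling factor the area on the log axis is not a probability.
area_naive = integrate(lambda y: pdf_linear(10.0 ** y), math.log10(a), math.log10(b))

print(f"exact {exact:.6f}  linear {area_linear:.6f}  log-scale {area_log:.6f}  naive {area_naive:.6f}")
```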

## "Mathematical statement"

Benford's law is a loosely-formulated implication: it says that if you consider numbers drawn from certain natural sources, then their first digits will conform to a certain specific distribution. The section called "mathematical statement" gives a precise formulation of the conclusion. Making the hypothesis mathematically precise is much more problematic. It's claimed that Hill was the first to give a precise mathematical proof of Benford's law, but in fact what he did was to give a mathematical proof of a mathematical statement, which one may or may not agree captures the essence of Benford's law.

By the way, within the article it's hard to figure out what earlier material is cited by the claim that "Ted Hill proved the result about mixed distributions mentioned above." The word "mixed" doesn't appear elsewhere in the article, and I'm guessing that the reference is to the section on multiple probability distributions. Ishboyfay (talk) 03:44, 20 February 2009 (UTC)

I agree, if we say that Benford's law is an approximate empirical statement, then it can't be mathematically proven as such. This calls for more careful phrasing than we have now. For your second question, yes it's the multiple probability distributions, I put in a link to clarify. :-) --Steve (talk) 07:23, 20 February 2009 (UTC)
(I'm basically agreeing with the above.) The first sentence of the misnamed section Mathematical statement reads:
"More precisely, Benford's law states that the leading digit d (d ∈ {1, …, b − 1}) in base b (b ≥ 2) occurs with probability P(d) = log_b(d + 1) − log_b(d) = log_b((d + 1)/d)."
But as much as this "kind of" states it "correctly", this is not a mathematical statement at all, since it is extremely unclear what the word "probability" means here. Alas, the word "probability" has no meaning at all here.
I'm not saying that Benford's law *cannot* be stated mathematically, but that is a very slippery endeavor.
I strongly recommend that Wikipedia stick to true statements and avoid false, and -- like the above sentence -- meaningless ones.Daqu (talk) 18:26, 11 May 2009 (UTC)

## Prime numbers may follow benfords law

http://www.physorg.com/news160994102.html —Preceding unsigned comment added by 208.71.237.254 (talk) 17:58, 11 May 2009 (UTC)

Not really. The first digits of primes up to 10^n are fairly evenly distributed for large n. The picture is different for primes up to 2×10^n for large n or 3×10^n etc. but even then it does not approach Benford's law. --Rumping (talk) 12:07, 20 November 2009 (UTC)
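The reply above can be explored with a short sieve. This is only a sketch: the limits 10^6 and 2×10^6 are arbitrary choices for illustration, and the helper names are hypothetical.

```python
import math
from collections import Counter

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def leading_digit_shares(limit):
    """Share of primes <= limit having each leading digit, plus the prime count."""
    primes = primes_up_to(limit)
    counts = Counter(int(str(p)[0]) for p in primes)
    return {d: counts[d] / len(primes) for d in range(1, 10)}, len(primes)

shares_a, n_a = leading_digit_shares(10**6)
shares_b, n_b = leading_digit_shares(2 * 10**6)

for d in range(1, 10):
    print(d, f"up to 1e6: {shares_a[d]:.3f}", f"up to 2e6: {shares_b[d]:.3f}",
          f"Benford: {math.log10(1 + 1 / d):.3f}")
```

The leading-1 share depends strongly on where the cut-off falls, which matches the remark that the picture differs between 10^n and 2×10^n.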

## Income differences?

Can you really count income among "distributions that cover many orders of magnitude rather smoothly"? Is there a significant portion of the population that has ten times more, and ten times less, income than the average? Mumiemonstret (talk) 12:59, 25 September 2009 (UTC)

Well in the US, see here, 80% earn 15,000 to 150,000 dollars. So yeah, I guess that's mostly within one order of magnitude. Can you think of a better example? Or we could soften the wording: "distributions that span several orders of magnitude rather smoothly" :-) --Steve (talk) 03:02, 26 September 2009 (UTC)

I saw my removal of the WP:LINKFARM was partially reverted, so I wanted to make it clear why I pulled out the links that have been restored.

The primary problem is with ELNO#1: "Any site that does not provide a unique resource beyond what the article would contain if it became a featured article," which I would say describes all the remaining links. I don't think any of them contain anything special that would be unavailable if someone took the time to flesh out this article. Some of them would definitely be worth using to add or cite material, though, which would improve the article and have the nice side effect of retaining the links.

Some of the links have problems beyond ELNO#1, though:

5. Links to web pages that primarily exist to sell products or services, or to web pages with objectionable amounts of advertising. For example, the mobile phone article does not link to web pages that mostly promote or advertise cell-phone products or services.
11. Links to blogs, personal web pages and most fansites, except those written by a recognized authority. (This exception is meant to be very limited; as a minimum standard, recognized authorities always meet Wikipedia's notability criteria for biographies.)
8. Direct links to documents that require external applications or plugins (such as Flash or Java) to view the content, unless the article is about such file formats. See rich media for more details.
13. Sites that are only indirectly related to the article's subject: the link should be directly related to the subject of the article. A general site that has information about a variety of subjects should usually not be linked to from an article on a more specific subject. Similarly, a website on a specific subject should usually not be linked from an article about a general subject. If a section of a general website is devoted to the subject of the article, and meets the other criteria for linking, then that part of the site could be deep linked.

There is one notable exception, the Benford Online Bibliography. The front page, which is what we linked to, doesn't have very much information, but clicking through provides some great resources. I should have been more careful to keep it the first time.

Thoughts? — Bdb484 (talk) 21:14, 4 February 2010 (UTC)

I agree with Johnuniq. All of the current links are very useful to the reader (myself included). In fact, I would be in favour of adding a few more, as long as they help to further illustrate this peculiar ratio. It pops up all over the place. A list of places (with associated links) would further this article. --Thorwald (talk) 01:54, 5 February 2010 (UTC)
I'm 100 percent with you on the primacy of common sense over Wikipedia "rules," but this isn't a case where the two conflict. If the links are that helpful, then the information can be pulled into the article and cited. As it stands, the external link section is no more useful than googling Benford and seeing what pops up. Thanks to the Benford Bibliography link, we already have a collection of high-quality links that easily surpasses what we have here.
The benefit from working out which links should stay and which should go is simple: improving the article. If you find something particularly useful in one of these links, then go ahead and add it to the article with a citation. Then the reader has access to more information that is better organized -- all without losing links to the pages in question.
That way, everybody gets what they want, no? — Bdb484 (talk) 02:10, 5 February 2010 (UTC)
Sounds good. I agree that is a better approach. Just don't delete the links until we have time to include the relevant information in the article (with citations). --Thorwald (talk) 02:26, 5 February 2010 (UTC)
My instinct is to agree that useful content from an external link should be incorporated into the article, with the link used as a reference – that is one of the first things we say to people who add links to their website on fifty different articles. However, I think this topic has rather unusual attributes that make that procedure unworkable since most of the external links have too much detail for a general article here, yet they each have something useful to say. The Benford Online Bibliography link that you moved to the top of the list is only of interest to a serious researcher (I would be inclined to restore it to the "More mathematical" section). Johnuniq (talk) 02:35, 5 February 2010 (UTC)
I'm not in any rush to pull any of the links. They were initially pulled down as part of WP:BRD cycle, so I'm happy to give anyone plenty of time to work it out.
From my review of the bibliography, it seems that it has plenty for both the lay reader and the experienced mathematician. For example, the first link they offer is to the Radio Lab segment, which presented a very accessible introduction to Benford. It also provides a lot of links to plain-language news coverage from the Wall Street Journal, the New York Times, Washington Post. I'd be inclined to leave it where it is, but I wouldn't object if you feel strongly about it. — Bdb484 (talk) 02:45, 5 February 2010 (UTC)
I could be mistaken, but it seems like there hasn't been anything done in the way of clean-up in the two months-plus since we talked about this. Is anybody working on this? — Bdb484 (talk) 20:03, 6 April 2010 (UTC)
Above I have explained that your suggestion, while admirable in general, is difficult to implement (and possibly unhelpful) in this particular case. You could try getting more opinions at WP:ELN. Johnuniq (talk) 03:17, 7 April 2010 (UTC)
I do remember you offering that opinion, though I don't remember you offering anything to substantiate it. Just the same, when it was requested that I allow some time for the relevant material to be included in the article, I was happy to do so. After more than two months, no apparent effort has been made to that end.
The folks at ELN generally support abiding by WP:ELNO, so I'm not sure what more they would have to offer to the discussion. But if you think they'd believe there's a reason to disregard the rules that are good enough for every other article, you should probably take that route yourself, as the burden for establishing a reason to keep each of these links falls on you. If not, though, I'll be happy to handle the EL clean-up myself. — Bdb484 (talk) 04:16, 7 April 2010 (UTC)

I compared the current external links list with what existed one year ago. In the last 12 months:

• Eight links are the same (with some tweaking, but same site).
• Four links have been removed ([2], [3], [4], [5]).
• Three links have been added:
1. Benford Online Bibliography, an online bibliographic database on Benford's Law.
2. Benford’s law, Zipf’s law, and the Pareto distribution by Terence Tao
3. From Benford to Erdös, WNYC radio segment

The first new link seems essential, the second is by Terence Tao which automatically qualifies it, and the third seems hard to disagree with (although I have not heard it). Given that eight links have been considered satisfactory for at least a year, and the three new ones seem totally suitable, I do not see why any should be pruned. I notice that Sbyrnes321 reverted your removal of the links, and Thorwald posted in agreement with keeping the links above (although the second post supports conversion to cited material). I think it fair to conclude that consensus favors keeping the links. Johnuniq (talk) 08:52, 7 April 2010 (UTC)

## Example #2

Here is an example using factorials. From OEIS A008905 ref, Noe has a list of the first 1000 or so leading digits of the consecutive factorials. Here is the distribution that is seen from that sample vs Benford:

| digit | Benford | n! |
|---|---|---|
| 1 | 0.30 | 0.29 |
| 2 | 0.18 | 0.18 |
| 3 | 0.13 | 0.12 |
| 4 | 0.10 | 0.10 |
| 5 | 0.08 | 0.07 |
| 6 | 0.07 | 0.09 |
| 7 | 0.06 | 0.05 |
| 8 | 0.05 | 0.05 |
| 9 | 0.05 | 0.05 |

--Billymac00 (talk) 04:53, 25 April 2010 (UTC)
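A tally like the one above can be regenerated without the OEIS list. The sketch below is an independent recomputation, not the method used for the table. It assumes the sample is n! for n = 1 to 1000, which may differ slightly from the "first 1000 or so" entries used above.

```python
import math
from collections import Counter

def factorial_leading_digits(n_max):
    """Leading decimal digit of n! for n = 1..n_max, using exact integers."""
    digits = []
    f = 1
    for n in range(1, n_max + 1):
        f *= n
        digits.append(int(str(f)[0]))
    return digits

digits = factorial_leading_digits(1000)
counts = Counter(digits)
freqs = {d: counts[d] / len(digits) for d in range(1, 10)}

print("digit  Benford  n!")
for d in range(1, 10):
    print(d, f"{math.log10(1 + 1 / d):.2f}", f"{freqs[d]:.2f}")
```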

## Bizarre crashing problems

This is really weird, but I've found that, for some reason (at least on my computer, an iMac running Mac OS X Snow Leopard), this page crashes when viewed in Chrome, but the talk page loads fine. Meanwhile, Safari can load the page itself, but not the talk page. I haven't tried any other browsers or operating systems, but I'd still like to know if anyone else has encountered this and if any explanation is known. 75.69.192.96 (talk) 05:00, 23 May 2010 (UTC)

Best to post this at WP:Village pump (technical) (with a link to Benford's law). Johnuniq (talk) 05:08, 23 May 2010 (UTC)

This crashing issue is happening for me, too. Only the Benford's Law page, and the page loads fine, but the moment you attempt to scroll down the thread crashes. It happens with the current Chrome and Chrome-dev release, as well as the current daily build of Chromium. Oh, and Safari and Firefox, too. apraetor —Preceding undated comment added 01:43, 16 June 2010 (UTC).

• I'm in Safari and Chrome 5 on 10.6.4 and I don't see a problem with either page. But I would post to VPT if you are sure that this problem is unique to this page and that browser setup. 184.59.8.54 (talk) 18:52, 16 June 2010 (UTC)
I was going to post on VPT on behalf of the two people with the problem, but when I searched that page for "crash" I saw that similar complaints have been made, with the reply that it is a browser bug and the supplier of the browser should be notified. Johnuniq (talk) 00:29, 17 June 2010 (UTC)

## Explanation 2

I just read in a post the following: "Benford’s law arises naturally if the data under consideration span several orders of magnitude—for example, the first digits of the powers of two obey Benford's law" – this seems to me much more intuitive than the current "This distribution of first digits arises whenever a set of values has logarithms that are distributed uniformly, as is approximately the case with many measurements of real-world values". Are they equivalent, or perhaps complementary? For me, as a layman, the "spanning orders of magnitude" thing made much more sense and instantly gave me a vague but intuitive grasp of why the law works. Do you think it could be integrated in the lead? --Waldir talk 00:55, 15 December 2010 (UTC)

Phrasing like that is in Benford's law#Limitations, but not in the lead right now. I agree, it should be. --Steve (talk) 01:10, 15 December 2010 (UTC)
I'm not very familiar with this topic, so I'd rather not add it myself. Do you think you could do that? --Waldir talk 15:57, 17 December 2010 (UTC)
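The powers-of-two example quoted at the start of this section is easy to check directly. A minimal sketch, assuming the first 1000 powers (2^0 to 2^999) as the sample:

```python
import math
from collections import Counter

N = 1000  # number of powers of two to examine (an arbitrary choice)

# Exact integer arithmetic, so the leading digit is read off without rounding.
counts = Counter(int(str(2**n)[0]) for n in range(N))
freqs = {d: counts[d] / N for d in range(1, 10)}
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"powers of two {freqs[d]:.3f}", f"Benford {benford[d]:.3f}")
```

The powers of two span about 300 orders of magnitude here, and their logarithms (multiples of log10 2) are spread evenly modulo 1, so both phrasings discussed above apply to this example.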

## Graphic caption wrong?

A logarithmic scale bar. Picking a random x position on this number line, roughly 30% of the time the first digit of the number will be 1.

I understand this (the 1 to 2 zone is bigger), but isn't that only due to the scale of the graph? If an x is chosen at random visually by distance, then I agree, but if it is a random x variable, it should over time hit all the digits equally, no? —Preceding unsigned comment added by 216.7.125.201 (talk) 17:09, 28 January 2011 (UTC)

The caption should be reworded to be clearer. Maybe "Roughly 30% of this line consists of numbers that begin with the digit 1: Numbers between 0.1 and 0.2, between 1 and 2, between 10 and 20, etc."? What do you think?
Also the picture is pretty hard to read. There doesn't seem to be a better option: [6]. It should really be SVG... --Steve (talk) 20:04, 28 January 2011 (UTC)

## Scale invariance argument

It appears to be based on the paper of R.S. Pinkham. On the distribution of first significant digits. Ann. Math. Statist., 32:1223–1230, 1961. (open access) But someone should check the details before adding it as source. Tijfo098 (talk) 02:30, 23 March 2011 (UTC)

I was right, see slide #26 here. Probably a more citable secondary ref exists somewhere. Tijfo098 (talk) 22:28, 23 March 2011 (UTC)

## Election fraud

Quite amusing to see that exact same "magic box" claimed to prove fraud in Iranian elections can be used to prove that Obama "stole" the US pres. elections. Deckert, Myagkov and Ordeshook go over this in quite some detail. The basic error in using BL as a "magic box" is the assumption that voters are iids; especially not true in small precincts. Heck, they even show that using that analysis one concludes that elections have been stolen in some US precincts for decades in a row. Tijfo098 (talk) 03:50, 23 March 2011 (UTC)

Amusing...? I tend to find that thing quite depressing... blatant misunderstandings like that. JaeDyWolf ~ Baka-San (talk) 15:47, 23 March 2011 (UTC)

## Explanations

I think that sections 4.1 and 4.2 should be reversed in order. The reason is that scale invariance is the primary reason for this phenomenon. It's true that exponential growth processes display the phenomenon, but the reason is that exponential growth processes are one of a number of ways that scale invariance can arise. Putting exponential growth processes in first place overemphasizes this particular way of obtaining the phenomenon at the expense of the more fundamental underlying reason.

I was led to look at the article because of a recent entry on Andrew Gelman's statistics blog where Benford's law was discussed. One of the comments indicated that it was exponential growth that was the cause of the phenomenon (I responded just below it about the more general reason). It may be that the person who made that comment had read part, but not all, of the Wikipedia entry (which had been cited by Andrew Gelman), and came away thinking that exponential growth was "the" explanation. Reversing the order of the two entries would put the early emphasis where it belongs: on the scale invariance. Bill Jefferys (talk) 13:18, 13 October 2011 (UTC)

I disagree that scale invariance is "the primary reason for this phenomenon". The height of human adults does not satisfy Benford's law when the heights are measured in feet, and it also does not satisfy Benford's law when the heights are measured in meters. I don't see anything within the "scale invariance" argument that would explain why Benford's law does apply to the lengths of rivers but does not apply to the heights of human adults. If something deserves to be called "the primary reason for this phenomenon", it should of course explain both why it works when it works, and why it doesn't work when it doesn't work. I think the real "primary explanation" is the one in the "Limitations" section (poor organization there), although of course I'm biased. :-) --Steve (talk) 17:21, 20 October 2011 (UTC)
Scale invariance obviously only applies when the phenomenon being measured involves numbers varying by at least several orders of magnitude. This is the case with rivers, and not the case with human heights, which don't vary even over one order of magnitude, considering newborn infants and basketball players at the extremes.
My point is that exponential growth processes are only one mechanism that produces scale invariance, so putting them first is misleading (see the Gelman blog entry). They should be second (and your comment doesn't refute this). Bill Jefferys (talk) 17:54, 20 October 2011 (UTC)
"Scale invariance obviously only applies when the phenomenon being measured involves numbers varying by at least several orders of magnitude." This is not "obvious" based on the wikipedia article explanation as currently written. (At least, not obvious to me!) The current explanation just says, "if it is indeed true that the first digits have a particular distribution, it must be independent of the measuring units used", but does not say a word about why or whether "it is indeed true that the first digits have a particular distribution". You seem to have a deeper understanding of the "scale invariance argument" than just what's written here now, and I hope you take some time to improve that part of the article. :-) --Steve (talk) 21:42, 24 October 2011 (UTC)
Actually, the "limitations" section you pointed to gives a reasonably good discussion of why the phenomenon has to vary over several orders of magnitude.
I'm thinking that a (perhaps) retitled "limitations" section might come before the section we're discussing, and then reorder the two entries in that section to put the exponential growth processes second last. Again, my motivation here is that the person who commented on Andrew Gelman's blog may well have read that section, halfway through, decided that exponential growth was the explanation, when the whole issue is much more than that since exponential growth is only one way that the phenomenon can arise, and there are more fundamental considerations, as both you and I point out. Bill Jefferys (talk) 23:09, 24 October 2011 (UTC)
I agree with your understanding that the primary reason is scale invariant data. However, exponential growth is a stand-alone reason for B's law to apply even when scale variant data is being used. Ankh.Morpork 18:52, 23 December 2012 (UTC)

The article states that normal distributions "can't span several orders of magnitude." This is not true: the support of a normal distribution is the entire real line, and hence every normal distribution spans every order of magnitude. There are also normal distributions that span several orders of magnitude with high probability: just consider the normal distribution with mean 0 and variance 10^10. — Preceding unsigned comment added by Bhglaser (talk • contribs) 17:14, 5 November 2013 (UTC)

I would say scale invariance is a characteristic of Benford data, but not an explanation.--Jack Upland (talk) 06:54, 23 May 2014 (UTC)

## Scale invariance partly circular

The scale invariance section says, "The law can alternatively be explained by the fact that, if it is indeed true that the first digits have a particular distribution, it must be independent of the measuring units used (otherwise the law would be an effect of the units, not the data). "

It's a proof by contradiction, assuming that "the law is an effect of the units" is false, but it never proves this step. We clearly need better sources for this section. Superm401 - Talk 23:28, 31 January 2012 (UTC)

I would not agree. It essentially says that when scale invariance (i.e. not depending on units) of the distribution of first digits happens, this implies Benford's law, so that step does not need to be proved. --Rumping (talk) 20:56, 31 May 2013 (UTC)

The article states that the law holds only for data sets whose logarithms are uniformly and randomly distributed. It further states that data sets following normal distributions would not follow the law. That makes sense, but the text does not elaborate on what kind of data sets would have their logarithms uniformly and randomly distributed. Do I understand correctly that a set of completely random numbers would not follow this law? If so, what properties should a set of data have in order for the logarithms of those data to be uniformly and randomly distributed? In other words, what is it exactly that makes real-life financial data, for example, not distributed in a uniform manner? I think a short sentence addressing this would do this otherwise helpful article a lot of good. Any takers?—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); March 29, 2012; 17:32 (UTC)

A simple set that satisfies Benford's and trivially has uniformly distributed logs is just a_n = e^(x_n), where x_n is drawn uniformly at random from the range [0, 10]. The a_n then span several decades and follow Benford's closely. I need to check this, but I'm pretty sure normally distributed data do follow Benford's -- but only if the width of the distribution spans several decades, which is unlikely for a single stock to do. This is kind of a tricky issue but I'll see what I can do on it later. a13ean (talk) 14:47, 30 March 2012 (UTC)
Thanks, this helps somewhat and I'll be looking forward to your addition. By the way, "e" in "e^(x_n)" above is Euler's number, correct? If so, can't it be any other arbitrarily selected constant?—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); March 30, 2012; 15:01 (UTC)
It is Euler's number, and it's an arbitrary choice here just because it plays nicely with the natural log. There are lots of other functions that are mostly log-distributed across several decades and similarly follow Benford's. a13ean (talk) 15:45, 30 March 2012 (UTC)
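A rough numerical check of the example above may help. The sketch below draws x_n uniformly from [0, 10], forms a_n = e^(x_n), and compares the leading-digit frequencies with log10(1 + 1/d). The function names, sample size and seed are illustrative assumptions, not anything from the article. Because [0, 10] corresponds to about 4.34 decades rather than a whole number of decades, the match should be close but not exact; choosing the upper limit as 4·ln(10) (exactly four decades) should remove that residual mismatch up to sampling noise.

```python
import math
import random


def leading_digit(x: float) -> int:
    """First significant decimal digit of a positive float."""
    # Scientific notation always starts with the leading digit.
    return int(f"{x:.16e}"[0])


def benford(d: int) -> float:
    """Benford's law probability for leading digit d in base 10."""
    return math.log10(1 + 1 / d)


def simulate(upper: float = 10.0, n_samples: int = 200_000, seed: int = 0) -> dict:
    """Leading-digit frequencies of a_n = exp(x_n), x_n uniform on [0, upper]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * 10
    for _ in range(n_samples):
        a = math.exp(rng.uniform(0.0, upper))
        counts[leading_digit(a)] += 1
    return {d: counts[d] / n_samples for d in range(1, 10)}


if __name__ == "__main__":
    approx = simulate(upper=10.0)               # about 4.34 decades
    whole = simulate(upper=4 * math.log(10))    # exactly 4 decades
    for d in range(1, 10):
        print(d, round(approx[d], 4), round(whole[d], 4), round(benford(d), 4))
```

The base e is indeed arbitrary here: any base works as long as the exponent is uniform over a range that covers the decades evenly.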

MORE QUESTIONING ON THE BENFORD'S LAW DISTRIBUTION

Hi, the article says that the mathematical constants shown in the chart also follow the Benford's law pattern, but it says the chart uses "the first significant number" of each constant. Does that mean it is excluding the number to the left of the decimal point? I don't know of any constants higher than 5, yet the chart implies there are constants that start with 6, 7, 8, and even 9, and I do not know of any constant that starts with that high a number. Is the article saying the chart excludes the number to the left of the decimal point, or am I reading the chart wrong?

173.238.43.211 (talk) 05:26, 25 July 2012 (UTC)

## Digital signal processing

An editor (128.178.7.131) suggested the following reference:

• The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith, chapter 34.
"Digital Signal Processing usually involves signals with either time or space as the independent parameter, such as audio and images, respectively. However, the power of DSP can also be applied to signals represented in other domains. This chapter provides an example of this, where the independent parameter is the number line. The particular example we will use is Benford's Law, a mathematical puzzle that has caused people to scratch their heads for decades. The techniques of signal processing provide an elegant solution to this problem, succeeding where other mathematical approaches have failed."

I am restoring the IP's suggestion (which they removed, possibly due to a format problem) as the content appears very interesting, and I hope to read it one day... Johnuniq (talk) 09:17, 14 December 2012 (UTC)

It's Ref 17 in the article already...the footnote is not properly templated though. --Steve (talk) 23:18, 14 December 2012 (UTC)
Ouch, and it's in external links as well. I'll try to pay more attention to this topic, and might get around to fixing the ref. Johnuniq (talk) 23:28, 14 December 2012 (UTC)
I (former 128.178.7.131) removed the comment because the book chapter is already referenced; however, I think the content of the chapter should be reflected in the text since it provides a very interesting explanation, a statistical test and insights into why the law applies to some distributions and not to others. Furthermore, it is relatively easy to understand (e.g. for an engineer). Moutonnette (talk) 16:10, 17 December 2012 (UTC)
I agree. The reason I did not get around to fixing the reference formatting (now fixed by Steve) is that I read chapter 34 from the reference and my first reading did not find a page to verify the points in the article where it is used. Yet, the DSP approach should be mentioned (and its observation that finding the leading digit is equivalent to multiplying by powers of ten), although unfortunately there would be a long wait before I have time to do that myself. Johnuniq (talk) 22:33, 17 December 2012 (UTC)

The DSP book is discussing everything in terms of DSP procedures because the point of the book is to teach DSP and practice using it. Here, the only goal is to teach Benford's law. There is no ulterior motive. Therefore we can make things much much simpler than the DSP book does.

A broad probability distribution on a log scale. The total area of red divided by the total area of blue is approximately the same as the width of a red bar divided by the width of a blue bar. Therefore this distribution satisfies Benford's law to high accuracy.

The discussion in the DSP book corresponds to the Benford's law#Limitations section. The DSP book is much much wordier and harder to follow (because it is delving into extraneous topics), but there is virtually no understanding of Benford's law that you get from the DSP book that you would not get from the Limitations section. Please refer to the figure on the right...

• "Ones scaling test" section: Multiplying all the numbers in the distribution by 1.01 shifts the curve a bit to the right. If you keep shifting the curve to the right, the total red area goes up a bit, then down a bit, then up a bit, then down a bit. It's obviously periodic because the pattern of vertical bars is periodic. The DSP book is pointing out this fact to lead up to the Fourier series discussion later...
• "Writing Benford's law as a convolution" section: Basically a description of the figure on the right.
• "Solving in the frequency domain" and "Solving Mystery #2" sections: The distribution shown on the right is very broad and smooth over many orders of magnitude. Therefore the areas and widths are related, as described in the figure caption. At Benford's law#Limitations, there is also a plot of the opposite case, a narrow distribution that does not satisfy Benford's law. In the DSP book, this same fact -- the relation between distribution width and Benford's law accuracy -- is "explained" by invoking properties of Fourier series. But I don't see any benefit to mentioning Fourier series. All you need to do is look at the two side-by-side graphs in the article, and the fact becomes abundantly obvious. The Fourier series discussion adds nothing. (Well, it could be used to quantify the relationship between width and Benford's law accuracy, subject to certain mathematical assumptions ... but such an extreme level of detail is not appropriate for the Wikipedia article.)
• "Solving Mystery #1" section: The pattern of the bars is related to logarithms. It sounds obvious, and it is obvious...
• Rest of the chapter -- various examples and details most of which are already discussed in the wikipedia article.

Therefore I don't think any new content has to be added from the DSP book beyond what's already in the article. (But I'm biased...) --Steve (talk) 03:29, 18 December 2012 (UTC)

OK, I see you have been working on this article a long time, and have thought about the topic a lot more than me. For now, I just want to add one thing I found a couple of days ago. Ref 18 is Fewster, R. M. (2009). "A simple explanation of Benford's Law". There is an overview here which includes a link to a pdf of the full paper. Johnuniq (talk) 06:41, 18 December 2012 (UTC)

## Outcomes of exponential growth processes

When this quantity reaches a value of 100, the value will have a leading digit of 1 for a year, reaching 200 at the end of the year... Early in the fourth year, the leading digit will pass through 8 and 9. The leading digit returns to 1 when the value reaches 1000, and the process starts again, taking a year to double from 1000 to 2000.

Can someone explain the meaning of the end of the paragraph, that "the process starts again, taking a year to double from 1000 to 2000"? During year 4, the quantity will increase from 800 to 1600, and in year 5 it will change from 1600 to 3200. Don't the time intervals for each of the initial digits vary throughout the exponential growth? E.g. 4 will be the leading digit in the third year as q increases from 400 to 800, for a shorter time than in year 6 when q increases from 3200 to 6400. Ankh.Morpork 18:29, 23 December 2012 (UTC)

I believe you are asking whether increasing from 4000 to 5000 will take exactly the same amount of time as increasing from 400 to 500. Because 500/400 = 5000/4000, log2(500/400) = log2(5000/4000), and that equals 0.32192809488736235. So it will take 0.32192809488736235 years for increasing from 400 to 500, or from 4000 to 5000, or even from 5000 to 6250. Nolancapehart (talk) 23:49, 1 May 2013 (UTC)
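The arithmetic in the reply above can be checked with a few lines. This is only a sketch, assuming the quantity doubles once per year as in the article's example; `years_between` is a hypothetical helper name. It computes how long the quantity spends on each leading digit within one decade and compares that share of the decade with log10(1 + 1/d).

```python
import math


def years_between(start: float, end: float, doubling_time: float = 1.0) -> float:
    """Years for an exponentially growing quantity to go from start to end."""
    return doubling_time * math.log2(end / start)


if __name__ == "__main__":
    # One full decade (100 -> 1000) takes log2(10) years at one doubling per year.
    decade = years_between(100, 1000)
    for d in range(1, 10):
        t = years_between(100 * d, 100 * (d + 1))
        # Columns: digit, years spent, share of the decade, Benford probability.
        print(d, round(t, 4), round(t / decade, 4), round(math.log10(1 + 1 / d), 4))
```

Because only the ratio end/start enters the formula, the time spent on a given leading digit is the same in every decade, which is the point made in the reply.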

## fallacious explanation

normal distributions can't span several orders of magnitude

That may be true for the examples given (IQ, human heights) but it's not true of normal distributions in general, which have nonzero probability distribution over the entire real line. Nor do I believe it to be even approximately true for all normal distributions since the variance could be extremely wide.

However, if one "mixes" numbers from those distributions, for example by taking numbers from newspaper articles, Benford's law reappears.

Wouldn't the central limit theorem suggest that mixing distributions would produce something normal? DAVilla (talk) 11:00, 24 December 2012 (UTC)

More generally, if there is any cut-off which excludes a portion of the underlying data above a maximum value or below a minimum value, then the law will not apply.

This is an extremely strong claim, so much so that counter-examples are trivial. DAVilla (talk) 11:04, 24 December 2012 (UTC)

Sold on all three. This could use some cleanup. a13ean (talk) 17:54, 25 December 2012 (UTC)

## Change in base

Does Benford's law apply to prime numbers? http://primes.utm.edu/notes/faq/BenfordsLaw.html From this, I am thinking about how changing the base to each prime changes the distribution. So...

In Benford's_law#Mathematical_statement is the statement the general form:

${\displaystyle P(d)=\log _{b}(d+1)-\log _{b}(d)=\log _{b}\left(1+{\frac {1}{d}}\right).}$

If there is a change in base, say from one prime number to the next prime number, how would one calculate the change in P(d)? John W. Nicholson (talk) 21:01, 24 January 2013 (UTC)
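One way to look at the question numerically is to evaluate the general formula directly. Since log_b(x) = ln(x) / ln(b), moving from base b to base b' multiplies P(d) by ln(b) / ln(b') for any digit d that exists in both bases. The sketch below assumes nothing beyond the formula quoted above; `benford_prob` and `rescale` are hypothetical helper names.

```python
import math


def benford_prob(d: int, base: int = 10) -> float:
    """P(d) = log_base(1 + 1/d) for a leading digit d with 1 <= d < base."""
    if base < 2 or not 1 <= d < base:
        raise ValueError("need base >= 2 and 1 <= d < base")
    return math.log(1 + 1 / d, base)


def rescale(prob_old: float, old_base: int, new_base: int) -> float:
    """Convert P(d) from old_base to new_base using ln(old_base) / ln(new_base)."""
    return prob_old * math.log(old_base) / math.log(new_base)


if __name__ == "__main__":
    for base in (2, 3, 5, 7, 11):
        probs = [benford_prob(d, base) for d in range(1, base)]
        print(base, [round(p, 4) for p in probs])
```

In every base the probabilities over d = 1, ..., b - 1 sum to log_b(b) = 1, so a larger base spreads the same total over more digits.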

## Primes

Here is a paper which is dealing with prime numbers and zeros following Benford's law:

http://arxiv.org/PS_cache/arxiv/pdf/0811/0811.3302v1.pdf

I hope it is useful. John W. Nicholson (talk) 01:46, 8 March 2013 (UTC)

That paper has statements like "Note in figure 1 that primes seem however to approximate uniformity in its first digit. Indeed, the more we increase the interval under study, the more we approach uniformity (in the sense that all integers 1, ..., 9 tend to be equally likely as a first digit)" which suggests to me that prime numbers (like positive integers) do not follow Benford's law.--Rumping (talk) 21:13, 31 May 2013 (UTC)
When you say "that paper" and there have been two papers mentioned (one in this section and one in the prior section), I am unsure which you are referring to without looking at the article again. But, without looking and knowing how the graph in the first article shows what they mean by "approach uniformity", I would highly suggest that you look at it again. Log graphs tend to have a large 1 digit range and a smaller 9 digit range even with an "approach to uniformity". This means that there is a grouping like what Benford's law requires. John W. Nicholson (talk) 02:44, 2 June 2013 (UTC)
"That paper" was referring to http://arxiv.org/PS_cache/arxiv/pdf/0811/0811.3302v1.pdf and the quotes come from there. To give an example, if you look at primes smaller than 100,000, then the numbers starting with each digit are: 1: 1193, 2: 1129, 3: 1097, 4: 1069, 5: 1055, 6: 1013, 7: 1027, 8: 1003, and 9: 1006. This is almost uniform, and a long way from Benford's law. Go up to a higher power of 10 and the distribution will tend to be even closer to uniform, and in the limit it is uniform. --Rumping (talk) 10:51, 21 June 2013 (UTC)
So you are saying that for regular numbers less than 100,000, the count of numbers which start with 1 is different from the counts for 2, 3, 4, 5, 6, 7, 8, and 9? John W. Nicholson (talk) 04:45, 23 June 2013 (UTC)
No, there are 11,111 of each, which is uniform. And for primes the distribution is almost uniform, and closer to uniform as you increase the limit. --Rumping (talk) 09:42, 31 July 2013 (UTC)
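The counts quoted in this thread can be reproduced with a short script. The following is a minimal sketch using a plain sieve of Eratosthenes; the function names are illustrative. It tallies leading digits for the primes below 100,000 and, for comparison, for all integers from 1 to 99,999.

```python
from collections import Counter


def primes_below(limit: int) -> list:
    """All primes p < limit, via a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross out multiples of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [n for n in range(limit) if sieve[n]]


def leading_digit_counts(numbers) -> Counter:
    """Count how many of the given positive integers start with each digit."""
    return Counter(int(str(n)[0]) for n in numbers)


if __name__ == "__main__":
    prime_counts = leading_digit_counts(primes_below(100_000))
    integer_counts = leading_digit_counts(range(1, 100_000))
    for d in range(1, 10):
        print(d, prime_counts[d], integer_counts[d])
```

Raising the limit to a higher power of 10 lets one watch the prime counts drift toward the uniform pattern of the integers, as described above.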

## "Benford's Law has been explained in various ways."

"Benford's Law has been explained in various ways."

This is kind of pitiful. Unfortunately, this is what I've seen with many wikipedia articles in general. There are a lot of "experts" who unfortunately don't know much. This is especially the case for many math articles I see, but not limited to that. The basic thrust of the article is totally wrong (case in point the bogus article 0.999999 = 1), or the explanation is muddled and confused. Typical of such math articles is to write down a bunch of impressive looking equations someone copied from a book and doesn't really understand.

I would suggest that wikipedia is more a general encyclopedia for the general reader, not an advanced technical book. It fails on both counts in this case. The explanation is not comprehensible to the general reader, and is quite inadequate for the expert.

This "explained in various ways" reminds me of the famous "five proofs" or whatever number of God's existence. You don't need five proofs, you need just one. The reason five proofs are put forward is kind of desperation. Same here.

If someone who really understands math (or some other real subject beyond "Buffy the Vampire Slayer" plots or such, which is mostly what wikipedia is good for) tries to provide high quality information, teeming multitudes of "experts" shout him or her down. No insult intended, it's just the reality. Would you have 100 random people off the street wielding scalpels to do your brain surgery? No. Did a committee of 100 compose Beethoven's Fifth Symphony? No. Does a random committee of 100 adjudicate Supreme Court decisions? No.

Anyway, off my soap box, back to the problem at hand, here is some help. This is not just a diatribe against wikipedia, but a genuine desire to help the editors who are trying to produce something worthwhile.

Take the numbers 1 to 9. Call them measurements in yards. Each first digit occurs exactly once. No "Benford's law" so far. Then multiply them all by something, say 3. Aha! You will see "Benford's Law" emerge.

Measurement numbers (distances, times, etc.) are arbitrary, based on the units and the base. Suppose you convert yards to feet, as suggested above. 1 -> 3; 2 -> 6; 3 -> 9; 4 -> 12; 5 -> 15; 6 -> 18; 7 -> 21; 8 -> 24; 9 -> 27. "Benford's law" rears its ugly head. If you just look a little, it's obvious that it's easier to "get to 1" than to "get to 9" by multiplying. Big deal. :)

Or suppose you are counting sheep for your insomnia. If the count is in the range 90 to 99 before nodding off, that's 10 slots possible. Moving on to 100-199, that's 100 slots, so ones will tend to "crowd out" the nines. Suppose you go to sleep on sheep 166. Leading 1 wins! But even if you dozed off at sheep 966, leading 1 would STILL win out. Nine has zero chance to "win", can only tie at best, such as 9 or 99 sheep.

"Benford's law" intuitively boils down to the basic facts: *** In incrementing numbers 1 comes before 2, so 1 "has more chances" to be the first digit. *** In multiplying numbers 1 is smaller than 2, so the chance that the product of X with a number starting with 1 results in a number starting with 1 is greater than the chance that the product of X with a number starting with 2 results in a number starting with 2.

To show the somewhat arbitrary nature of "Benford's law", suppose you started with 999,999,999 sheep, and counted down, somewhat in the vein of "99 bottles of beer on the wall". :) Then all the numbers would start with 9, assuming your insomnia was not permanent. "Benford's law" would be totally reversed.

I normally do not spend time on wikipedia. It's a waste of time for me. It does pain me to see so much mediocrity. But I do appreciate the editors are trying their best. I wish wikipedia well and hope this comment helps. 71.212.104.23 (talk) 08:25, 22 August 2013 (UTC)

Okay, first of all, just because you think 0.999... = 1 is "bogus" doesn't detract from the fact that it's true. It looks to me like you're blaming the article itself for something that you don't understand. All of the examples you've given ([1...9], counting sheep, or counting backwards from a value) are of arbitrary numbers which wouldn't follow Benford's Law. Things HAVE been explained in various ways and, from what you've said, I can see no problems beyond what you've perceived to be there. JaeDyWolf ~ Baka-San (talk) 10:18, 22 August 2013 (UTC)
So, 71.212.104.23, what is the best case for Benford's law? Can prime numbers be shown to follow Benford's? John W. Nicholson (talk) 13:43, 22 August 2013 (UTC)
Benford's law is an empirical statement about real-world datasets--a statement which is sometimes but not always true. Since it is not a mathematical theorem, it does not possess a mathematical proof. Different real-world datasets may satisfy Benford's law for different reasons. That's why one explanation is not necessarily enough.
(However, even for real mathematical theorems, there is still often a benefit to showing more than one proof, as different proofs may offer different insights and ways of thinking about things. That is why multiple proofs of the same thing can commonly be found in textbooks and courses ... and wikipedia.)
71.212.104.23's arguments for Benford's law are pretty shallow: None of those examples actually follows, or is expected to follow, Benford's law.
Of course, the article, like most articles, could be clearer and better and easier to read by non-experts. It would help to point out specific things that are especially confusing. --Steve (talk) 17:28, 10 March 2014 (UTC)
It doesn't follow to say that because it's empirical there's no mathematical proof. There must be some explanation, but it hasn't been found.--Jack Upland (talk) 02:57, 24 May 2014 (UTC)
Jack -- You can prove a mathematical statement like
"If a data-set is generated by process X, then it will follow Benford's law to accuracy Y."
For example...
"If a data-set is generated by an exponential growth process, terminated at a random time, then it will follow Benford's law to accuracy Y, where Y is blah blah blah (a formula relating to the growth rate, the probability distribution of termination time, etc.)"
A probability distribution on a log scale. In this example, the ratio of the red to blue width is very close to the ratio of the red to blue area. So it follows Benford's law, pretty accurately, but not perfectly.
I can prove that statement, and I can prove 10 other statements just like that. But none of those is a "mathematical proof of Benford's law". Because (1) "Process X" will be different for different data-sets (lengths of rivers, stock-market prices, etc.), (2) To make a rigorous mathematical proof, I have to idealize process X, which means that it won't (strictly speaking) apply to any real-world data-set. For example, bacterial growth under the right conditions is approximately exponential, but not exactly exponential. No real-world data-set is generated by an exactly exponential growth process. And as soon as you start speaking loosely and making approximations, you no longer have a rigorous mathematical proof.
The only way to "prove" in a totally rigorous way that the lengths of rivers follow Benford's law to 1% accuracy, is to measure all the rivers, or else to build a rigorously-accurate geological model of river formation and evolution. Who knows, maybe there is some geological process that makes it unusually likely for a river to have a length between 350km and 700km.
If you look at the figure to the right, it's very easy to prove that the more broad a probability distribution is (on a log scale), the more closely it follows the Benford's law distribution. But different data-sets are broad on a log-scale to different extents, and for different reasons! :-D --Steve (talk) 06:22, 28 May 2014 (UTC)
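The claim about breadth on a log scale can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a log-normal distribution as a stand-in for "a smooth distribution on a log scale", with its width given in decades, and the centre, sample size and seed are arbitrary choices. It reports the largest gap between the observed leading-digit frequencies and log10(1 + 1/d).

```python
import math
import random


def leading_digit(x: float) -> int:
    """First significant decimal digit of a positive float."""
    return int(f"{x:.16e}"[0])


def benford(d: int) -> float:
    return math.log10(1 + 1 / d)


def max_deviation(sigma_decades: float, center_decades: float = 0.5,
                  n_samples: int = 100_000, seed: int = 1) -> float:
    """Largest |observed - Benford| over digits 1..9 for log-normal samples.

    log10(x) is drawn from a normal distribution with the given centre
    and standard deviation, both measured in decades.
    """
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n_samples):
        x = 10 ** rng.gauss(center_decades, sigma_decades)
        counts[leading_digit(x)] += 1
    return max(abs(counts[d] / n_samples - benford(d)) for d in range(1, 10))


if __name__ == "__main__":
    for sigma in (0.1, 0.3, 0.5, 1.0, 1.5):
        print(sigma, round(max_deviation(sigma), 4))
```

A narrow distribution (a fraction of a decade wide, like human heights) should give a large gap, while one spanning several decades should agree with Benford's law up to sampling noise.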
I think that's ludicrous. Benford's Law is provable in practice. What's proved impossible is a mathematical explanation of why it's true. There is no explanation of why the lengths of rivers (in metres, yards, or cubits) should obey Benford's Law. Apparently the Law holds even if the figures are inaccurate.--Jack Upland (talk) 11:19, 29 August 2014 (UTC)
There is an explanation! It has two parts. The first part of the explanation is a geophysical explanation of why the lengths of rivers has a broad distribution on a log scale. The second part is the mathematical explanation of why anything with a broad distribution on a log scale will always obey Benford's law, as explained in the article, Section 4.1. --Steve (talk) 14:24, 1 September 2014 (UTC)

Rubbish. Singing in the choir is not an interpretation. I can imitate phrases in foreign languages but don't claim to understand them by that fact. Dressing up the phenomenon in mathematical jargon is not an explanation. It is merely reproducing the problem in a logical-numerical manner. If you can't explain it, don't buck-pass to Napier.--Jack Upland (talk) 08:52, 30 November 2014 (UTC)

Hi Jack -- Sometimes I see an alleged explanation of something that invokes a math concept I'm not familiar with, let's say the Levi-Civita connection for example. In that circumstance, I have no way to know whether it's actually the explanation I'm looking for, or whether it is merely a restatement of the original question in a more jargon-y way. Both are possible, and I've seen both plenty of times. The only way that I could possibly figure out whether it's a real explanation or not is to put in some effort to learn more about the Levi-Civita connection and build up my intuition about the Levi-Civita connection. After that I can decide whether the conceptual framework of the Levi-Civita connection really does lead me to the clear beautiful explanation I was looking for. And if so, whether that explanation actually requires understanding the Levi-Civita connection, or whether the explanation can be extracted and presented in a way that does not require this background knowledge.
The point of this example is: Just because it seems to you that an alleged explanation is not really an explanation but a jargon-y restatement, doesn't mean that that's really the case. Maybe it is a real explanation and you just need to spend a bit more time learning the necessary background. Maybe not. There's no easy way to tell.
To me, the explanations in this article do not seem jargon-y at all, and I made the graphic above specifically to help make the explanation accessible to as broad an audience as possible. I am not trying to intimidate anybody. If you cannot understand what the text is saying, you should say more specifically what you find confusing. I'm sure the writing can be improved.
If you really understand the text at a technical level and still disagree with it, either because you think that it is not an explanation at all, or because you think that it is unnecessary jargon and it can be described in a simpler way, then I am happy to hear your technical argument to that effect.
I don't see how you can ever hope to explain Benford's law without discussing logarithms and probability distributions, since the words "log" and "distribution" are right there in the statement of Benford's law. :-P --Steve (talk) 16:58, 30 November 2014 (UTC)
My maths is a bit rusty, but I'm not objecting to the explanation because it's too technical to understand. I came to this page hoping for some explanation of Benford's Law and didn't find one. A description is not an explanation. The article's "explanation" describes Benford's Law as a logarithmic distribution. But it does not explain why this distribution occurs so widely. I agree with Arno Berger and Ted Hill who said, "The widely known phenomenon called Benford’s Law continues to defy attempts at an easy derivation".[1] I think they understand the maths!--Jack Upland (talk) 05:21, 2 December 2014 (UTC)

References

Not having a simple derivation does not mean that it can't have a simple explanation. There is some cleanup to be done (specifically in the Explanations section), but the general idea is there (in the Mathematical background section). a13ean (talk) 16:11, 2 December 2014 (UTC)
OK, that makes no sense, but dream the dream and live your fantasy.--Jack Upland (talk) 10:15, 15 December 2014 (UTC)

## Positional Number System

It's simply an artifact of the positional number system. Of the first 20 numbers, 50% start with a 1. Of the first 30 numbers, 33% start with 1. Of the first 100 numbers, 10% start with 1. Of the first 200 numbers, 50% start with a 1. Compounding this is that for the vast majority of distributions in nature, the frequency of a given value diminishes as the value approaches the upper and lower limits of the range. In the direction of the upper limit, the number of possible values starting with 9 compared to 8 is only a tenth of the number of possible values starting with a 10, 11, 12, ... 19 compared to 9. See this symmetrical range as an example:

Value: Frequency
5: 2
6: 3
7: 5
8: 8
9: 8
10: 5
11: 3
12: 2

i.e. 10 values beginning with a 1, purely due to the decimal numbering system.

If every number had a unique symbol, the effect would disappear. — Preceding unsigned comment added by 217.13.157.77 (talk) 15:00, 20 January 2014 (UTC)

For each positive integer n, this graph shows the probability that a random integer between 1 and n starts with each of the nine possible digits. For any particular value of n, the probabilities do not precisely satisfy Benford's Law; however, looking at a variety of different values of n and averaging the probabilities for each, the resulting probabilities do exactly satisfy Benford's Law.[citation needed]
Your first few sentences ("of the first 20 numbers, 50% start with a 1", etc.) are more-or-less the same as this picture and description in the article...
It's odd that your example demonstrating Benford's law is actually a dataset which is very far from satisfying Benford's law. None of the numbers starts with a 2! In fact, approximately-normal distributions never satisfy Benford's law.
I don't understand what you're getting at with "positional number system". It sounds like you are saying something very obvious: If we didn't write numbers using digits, then there would be no such thing as a first digit, and therefore there would be no Benford's law. :-P --Steve (talk) 17:41, 10 March 2014 (UTC)

I understand the positional number system argument, but it is simply false. Why stop at 200? Go up to 999 (for the argument's sake). At that point the frequency of first digits is evenly distributed at 1/9 - 111 each.--Jack Upland (talk) 07:05, 23 May 2014 (UTC)
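
The counts debated in this section can be checked by tallying leading digits directly. The following is a minimal sketch; the helper name `leading_digit_counts` is an assumption made for this illustration and does not come from the article. The exact tallies differ slightly from the round figures quoted above (for instance, 11 of the first 20 integers start with 1), and stopping at 999 gives every digit the same count.

```python
from collections import Counter

def leading_digit_counts(numbers):
    """Return a list of nine counts: how many of `numbers` start with 1, 2, ..., 9."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    return [counts[d] for d in range(1, 10)]

# Share of the integers 1..n that start with a 1, for the cut-offs mentioned above.
for n in (20, 30, 100, 200, 999):
    counts = leading_digit_counts(range(1, n + 1))
    print(f"n={n:4d}  share starting with 1: {counts[0] / n:.3f}  counts: {counts}")
```

The share starting with 1 swings up and down as n grows, which is why any single cut-off can be made to look either Benford-like or uniform.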

## 1/f noise

I am surprised there is no mention of 1/f noise.

It struck me that the probability of a number with lead digit 1 is 0.301 which is log10(2). This makes sense because we are talking of the range 1.0 to just under 2.0. That represents a doubling (ie factor of 2).

Likewise if you add up the probabilities for lead digits 2 and 3 you get another 0.301. And similarly for the sum of the probabilities for lead digits 4 through 7. In other words, corresponding to 2 further doublings.

So this is like looking at a noise spectrum from electromagnetic noise in the environment. Consider an arbitrary frequency F. From F to 2F we find a certain amount of energy incident on some area within some period of time. We also find the same energy for the range 2F to 4F. And similarly for 4F to 8F. So this is in fact 1/f noise. It makes sense, because a bunch of photons at frequency nF requires n times the energy of an equal-numbered bunch at frequency F. So as we go up in frequency, we might expect the amplitude to go down.

Extrapolating this to other things such as earthquake incidences etc sounds viable. Probably quite a lot of things work this way in fact.

My point in all of this is that there is no simple explanation of the simple origin of what otherwise seems like some sort of magic. — Preceding unsigned comment added by 86.30.114.58 (talk) 22:32, 24 May 2014 (UTC)
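
The "doubling" observation above can be checked numerically. This is a small sketch, assuming the standard statement of Benford's law, P(d) = log10(1 + 1/d); the function name `benford_probability` is chosen only for this illustration.

```python
import math

def benford_probability(d):
    """Benford's law probability that the leading digit equals d (1-9)."""
    return math.log10(1 + 1 / d)

p1 = benford_probability(1)                                   # range 1 to 2
p2_3 = sum(benford_probability(d) for d in (2, 3))            # range 2 to 4
p4_7 = sum(benford_probability(d) for d in (4, 5, 6, 7))      # range 4 to 8
p8_9 = sum(benford_probability(d) for d in (8, 9))            # range 8 to 10

print(f"digit 1:     {p1:.4f}")
print(f"digits 2-3:  {p2_3:.4f}")
print(f"digits 4-7:  {p4_7:.4f}")
print(f"digits 8-9:  {p8_9:.4f}")
print(f"log10(2):    {math.log10(2):.4f}")
```

Each of the first three groups spans one doubling and so carries the same probability, log10(2); the last group (8 to 10) is less than a doubling and carries the remainder.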

A probability distribution on a log scale. In this example, the ratio of the red to blue width is very close to the ratio of the red to blue area. So it follows Benford's law, pretty accurately, but not perfectly.
See plot on the right. In pure 1/f noise, the plot would be exactly flat. So it would indeed follow Benford's law perfectly! Oh, except that if it's exactly flat, then the area under the curve would be infinite ... so it's not a probability distribution!
More realistically, the plot might be approximately flat over a certain range, maybe many orders of magnitude, and then tail off on both sides. Kinda like the example on the right, actually. I mean, if I saw the curve on the right when measuring a noise power spectrum, I might call it "approximately 1/f noise, at least in the frequency range 10-1000". But is this really 1/f noise? Maybe a clearer description of this curve would be "a probability distribution that (on a log scale) is relatively smooth and flat over several orders of magnitude". So that's the kind of wording which is already in the article.
1/f noise (in a sufficiently broad but not infinite bandwidth) is undoubtedly an explanation of one way that these kinds of probability distributions can appear (i.e. probability distributions that are pretty wide and flat on a log scale). However, (1) Calling it an "explanation" is too charitable, because you still need to explain why the 1/f noise happens in the first place, (2) It is kinda an unusual case in practice -- I mean, if you look at the real-world datasets that exemplify Benford's laws, only a very small fraction of them are related to 1/f noise. Street addresses, tax documents, lengths of rivers, populations of cities, physical constants, etc. etc.: None of these are related to 1/f noise, as far as I can tell. The 1/f noise related examples are kinda rare in practice. Maybe some earthquake statistics (as you suggest).
I'm not opposed to mentioning 1/f noise in the article, as a category of processes from which you can get probability distributions that (on a log scale) are relatively smooth and wide. It could be mentioned alongside various other such processes, like exponential growth processes. --Steve (talk) 18:02, 28 May 2014 (UTC)

## Phone numbers as a non-Benford's Law distribution

The article currently gives "the 1974 Vancouver, Canada telephone book" as an example of a distribution that does not obey Benford's Law since "no number [in it] began with the digit 1". However, in the North American Numbering Plan (obeyed by Canada), telephone numbers are never allowed to begin with 1. I feel this example should be removed or, at the very least, this fact should be noted in a footnote. Admiral.Mercurial (talk) 12:47, 30 August 2014 (UTC)

• I agree that's an error in Raimi (the original paper notes 0 incidence of 1s but doesn't investigate it) and we should probably remove the claim. Protonk (talk) 13:42, 30 August 2014 (UTC)
• Thanks for catching this. If it's an error in the original paper (and not an introduced Wikipedia error), it would be better to explain the error, with a Wikilink to the North American Numbering Plan. Simply removing the erroneous claim leaves open the possibility of it being re-added, or of it misleading a reader who follows the footnote to the original paper. Reify-tech (talk) 13:57, 31 August 2014 (UTC)
I'm not sure it's an error. Rather it's a trivial example with an obvious explanation.--Jack Upland (talk) 00:21, 1 September 2014 (UTC)
It may be obvious to us here and now, but evidently it wasn't obvious and trivial to the authors of the paper at the time of writing, since they noticed it but offered no explanation for the anomaly. I think it is better to explain it as an example of inadvertent bias, rather than to omit it as "obvious", when it was clearly not obvious to the authors of the paper. It still remains non-obvious to readers unfamiliar with the North American Numbering Plan. Reify-tech (talk) 13:29, 1 September 2014 (UTC)
I agree with Jack. It is not an error, it is a trivial example with an obvious explanation. The authors of the paper apparently found it too obvious to even mention. But there's nothing wrong with saying it explicitly, so I just added it. :-D --Steve (talk) 14:01, 1 September 2014 (UTC)

## capitalization

The current article text refers to the law as "Benford's law" 28 times and as "Benford's Law" 52 times. I don't know which is correct, but it should be consistent. 2605:6000:EE4A:2900:6250:C93B:E4D4:B4BC (talk) 05:19, 15 January 2015 (UTC)

Good catch. The "hobgoblin of little minds" {http://www.goodreads.com/quotes/353571-a-foolish-consistency-is-the-hobgoblin-of-little-minds-adored} is hard to maintain sometimes! The usage should follow the article title. - DavidWBrooks (talk) 13:09, 15 January 2015 (UTC)
The current article title is no more definitive than the current text. Both usage and title in this article should follow the standard set by the referenced sources, if they're in general agreement on a styling. On cursory examination, this standard seems to be "Benford's Law," but I leave it to someone more familiar with the topic to make this determination. 2605:6000:EE4A:2900:6250:C93B:E4D4:B4BC (talk) 19:23, 16 January 2015 (UTC)

## Wrong explanation

"Benford's law applies most accurately to data that are distributed smoothly across many orders of magnitude" is just false and should be removed. Consider a variable uniformly distributed between 1 and 10^12; I believe this qualifies as "distributed smoothly across many orders of magnitude." The probability of the first digit of this variable being 1 is 1/9 (about 0.11), the same as for every other digit, not the roughly 0.30 that Benford's law predicts.

Moreover, the picture showing the probability in a semi-log plot is extremely misleading. The shaded area corresponds to the probability only if this graph is P(log(x)), not P(x) graphed on a semi-log scale. If it is P(log(x)), and it is expected to be almost uniform so that you can apply the area-proportional-to-width argument, then the probability P(x) itself has to fall off as 1/x (or at least look like 1/x over one order of magnitude at every point), so that P(log(x)) = P(x) dx/d(log(x)) = x P(x) looks uniform. I noticed that there was a note on the picture explaining that this graph is not a plain probability graphed in semi-log scale, but given the explanation above, you can see how a non-expert audience can be misled, just by reading the section and looking at the picture, into thinking that a uniform distribution follows Benford's law. Sprlzrd (talk) 16:35, 22 April 2015 (UTC)

A variable uniformly distributed between 1 and 10^12 has a 90% chance of being between 10^11 and 10^12. This is not an example of a variable that is "distributed smoothly across many orders of magnitude", it is a variable that is almost entirely restricted to a single order of magnitude. Indeed, it has a 50% chance of being in a mere 0.3-order-of-magnitude window.
I have the impression that your complaint about P(log x) is a combination of two things: (1) You acknowledge that it is technically correct (because of Note 8) but think that Note 8 is too easy for readers to miss? (2) As you say, such probability distributions correspond to a P(x) graph that approximately follows 1/x over a certain range ... and you think that such distributions are unexpected and weird? (I'm not quite sure what your point is, sorry.)
For (1), it should be easy to fix. We can put references to Note 8 in more places, we can introduce separate labeling for footnotes versus references (like in United_States_dollar and many other articles, see how there is "[6]" vs "[Note 6]"), we can even move Note 8 into the main text. Do you think something like that would help? Or do you have any other suggestions?
For (2), such distributions (where P(log x) is slowly-varying over many orders of magnitude, i.e. P(x) kinda follows 1/x over many orders of magnitude) are really common in the world and in science and in math. Just look at any dataset following Benford's law! Many log-normal distributions are in this category for example.
If I misunderstand your complaints I'm sorry and I hope you can re-state them to clarify :-D --Steve (talk) 12:17, 23 April 2015 (UTC)
Okay, I see. What is written is not the same as what you are trying to say. The variables in my example are uniformly distributed, which is as smooth as any distribution gets, and they are distributed across many orders of magnitude. You are trying to say that the log of your variables is almost uniformly distributed over many orders of magnitude (it doesn't even have to be over many orders of magnitude). If that's the case, just say that. I read that sentence to some of my physicist friends and they all interpreted it the way I did.
About the graph, I have the same complaint: it shows the probability distribution of the log of the variables, not the semi-log scale of the probability distribution of the variable, and it should be explicitly stated; no one is going to understand the significance of the footnote unless they already know enough about the subject.
The fact that anything with uniformly distributed log follows Benford's distribution is correct, and everyone who reads the article should know that, and the picture, if explained properly, does a good job of explaining that. Does it explain the ubiquity of Benford's law? No, it doesn't. And the combination of the first sentence and the ambiguous explanation of the graph implies it does. I suggest the following: (i) replace the word smooth by uniform, since smooth means something totally different (ii) be very explicit that the distribution of the log has to be uniform, not the distribution of the variable (we could mention what that means in terms of the distribution of the variable itself: P(x) follows 1/x) (iii) be explicit about this doesn't explain why Benford's law is found in real data, since we don't have any reason to expect 1/x distributions everywhere. (iv) remove anything about over many order of magnitude. You can get Benford's distribution for something with uniform log over one decade. Sprlzrd (talk) 21:40, 23 April 2015 (UTC)
Thanks, this is really helpful feedback :-D
I made some changes along the lines you suggested just now. I agree with "smooth" --> "uniform", and I tried to improve the wording from "log-scale probability distribution" without making it sound too convoluted for non-mathematicians to read. Did I succeed?
Your suggestion to "be explicit about this doesn't explain why Benford's law is found in real data, since we don't have any reason to expect 1/x distributions everywhere" has already been taken care of a while ago, it's the last paragraph of the section ... (Random thought: Maybe there should also be a link to Zipf's law? I'm not sure what the exact relation is.)
I disagree with your categorical objection to "many order of magnitude". It is generally true that real-world probability distributions that cover many orders of magnitude have smaller discrepancies from Benford's law than probability distributions that are contained within fewer orders of magnitude. This is obvious I think, and explicitly stated by three references in the section.
Now, it is true that you can concoct an example (log-uniform in one order of magnitude with sharp cut-offs) that exactly satisfies Benford's law despite being restricted to one order of magnitude. But surely you recognize that this is an artificial and atypical example. But nevertheless I tried to soften the language so that readers don't get the wrong idea. What do you think? --Steve (talk) 12:21, 24 April 2015 (UTC)
I liked the changes you made very much. I agree that the data from the cited references seem to show that empirically the broader the distribution, the better the agreement with Benford's law. I added a little explanation for a more mathematical audience in the parentheses. Is that okay? --Sprlzrd (talk) 18:25, 24 April 2015 (UTC)
Sure! But I changed the wording ... it's only the fractional part of the logarithm that is supposed to be uniformly distributed. Do you agree? --Steve (talk) 22:00, 25 April 2015 (UTC)
It took me a minute staring at it to understand what that sentence means, but sure, it seems technically correct. It's your call; if you think this adds significant information keep it as it is. It would be nice if you could provide a citation for that sentence. I am happy with the overall changes made in this section; my main objection is definitely resolved :) -Sprlzrd (talk) 14:44, 27 April 2015 (UTC)
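
The distinction argued over in this thread (uniform in x versus uniform in log x) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sample size, the fixed seed, and the helper names `first_digit` and `digit_frequencies` are all choices made for this illustration, not anything taken from the article or its references.

```python
import math
import random

def first_digit(x):
    """Leading decimal digit of a positive float, read off its scientific notation."""
    return int(f"{x:.17e}"[0])

def digit_frequencies(samples):
    """Observed share of samples whose leading digit is 1, 2, ..., 9."""
    counts = [0] * 9
    for x in samples:
        counts[first_digit(x) - 1] += 1
    return [c / len(samples) for c in counts]

rng = random.Random(0)  # fixed seed so the run is repeatable (illustrative choice)
N = 200_000             # sample size (illustrative choice)

uniform = [rng.uniform(1, 1e12) for _ in range(N)]      # x uniform on [1, 10^12]
log_uniform = [10 ** rng.uniform(0, 12) for _ in range(N)]  # log10(x) uniform on [0, 12]
one_decade = [10 ** rng.uniform(0, 1) for _ in range(N)]    # log10(x) uniform on [0, 1]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
uniform_freq = digit_frequencies(uniform)
log_uniform_freq = digit_frequencies(log_uniform)
one_decade_freq = digit_frequencies(one_decade)
share_top_decade = sum(x >= 1e11 for x in uniform) / N

print("share of uniform samples in [10^11, 10^12]:", round(share_top_decade, 3))
print("digit  Benford  uniform  log-uniform  one decade")
for d in range(1, 10):
    print(f"{d}      {benford[d-1]:.3f}    {uniform_freq[d-1]:.3f}    "
          f"{log_uniform_freq[d-1]:.3f}        {one_decade_freq[d-1]:.3f}")
```

In this setup the uniform sample sits almost entirely in the top decade and its leading digits come out close to 1/9 each, while both log-uniform samples track Benford's law, including the one confined to a single decade with sharp cut-offs.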

## Terminal digits in pathology - an irrelevant example

The authors give the following as a non-example of Benford's law: "The terminal digits in pathology reports violate Benford's law due to rounding, and the fact that terminal digits are never expected to follow Benford's law in the first place." Clearly, this is not false, but it strikes me as irrelevant, and may confuse some readers. One would *never* expect the terminal digit to follow Benford's distribution (it should be fairly uniform for other reasons). So why bring up this specific instance? Jakub Konieczny 20:55, 27 April 2015 (UTC) — Preceding unsigned comment added by Jakub.konieczny (talk • contribs)

## Benford's Law in Nuclear Physics

Benford's law is applicable to evaluated nuclear physics quantities. It works reasonably well for large samples (>400) and performs poorly for a small sample (~12). These results have been published in Journal of Physics G [1]. It is a first application of Benford's law in nuclear physics.

1. ^ B Pritychenko 2015 J. Phys. G: Nucl. Part. Phys. 42 075103. doi:10.1088/0954-3899/42/7/075103

## Ignored reference

I think this reference could clarify some points and should at least be cited http://www.sciencedirect.com/science/article/pii/S0378437100006336 — Preceding unsigned comment added by Vitelot (talk • contribs) 13:08, 16 July 2015 (UTC)

Thanks, I had long been looking for something like that in the literature. I used it as the basis for a new section "Multiplicative fluctuations" [7]. (I know there are other aspects to the paper too, but this is a start.) :-D --Steve (talk) 15:40, 16 July 2015 (UTC)

## The example is wrong

If you look at the tallest 60 buildings in the world, none of them start with a 1 in metres (see the wikipedia page). You can't use Benford on just the tail of the distribution. — Preceding unsigned comment added by 74.64.57.82 (talk) 17:38, 10 March 2016 (UTC)

I thought it was referring to "tallest structures" - List_of_tallest_buildings_and_structures_in_the_world#Tallest_structure_by_category, right?
I hate to be defending that section, it's not a very good example of Benford's law. I personally think it should be deleted altogether. The figure with populations of all countries is a much better example of Benford's law, and more concise too! As long as we have that figure, I don't see the point in having the "tallest structures" section. --Steve (talk) 01:23, 23 March 2016 (UTC)

## Bad Example? - Distributions not Following Benford's Law

I'm not an idiot when it comes to stats, but hardly an expert. This is given as an example of a distribution that would not follow Benford's Law:

"Where numbers are assigned sequentially: e.g. check numbers, invoice numbers"

I think the example is not necessarily wrong, but might require more explanation. Sequentially assigned numbers are generally a good example of Benford's Law in action, e.g. street numbers. All streets have a 100 block, most a 200 and 300 block, and then less and less likely. But those with a 900 block could easily have a 1000 block, and then maybe a 2000 block, and so on. And those are sequentially assigned. I'm not sure the best way to describe the difference between a more limited set of sequential numbers, like check numbers, and more open-ended sets like street numbers, but it seems like a worthwhile distinction to prevent apparent contradictions in the article. Just my handful of change. Sdr (talk) 19:54, 31 July 2016 (UTC)

If you take any one street by itself, and look at the house numbers on that street, they will be pretty far from Benford's law. If you look at many different streets as a group, the house numbers will probably be pretty close to Benford's law. Do you agree? If so, the only question is how to reword that bullet point to make it clearer...
For example, on my own home street, the numbers go up to ~185, so 1's are very overrepresented and 2's are very underrepresented compared to Benford's law. --Steve (talk) 18:09, 2 August 2016 (UTC)
I’m not an expert, but it takes more than a simple monotonic relationship between digit frequencies to be Benford’s Law which has a specific logarithmic probability law. So, it may be that check numbers and street numbers are not examples of Benford’s Law for this reason. Constant314 (talk) 19:06, 2 August 2016 (UTC)
An anecdote: A long time ago I took a short course from Richard Hamming. As an aside one day, he told us that he used to make a little money by walking into a lab and making an even money bet that the next measurement made by anybody in the lab would have a first digit of three or less. Constant314 (talk) 19:19, 2 August 2016 (UTC)
The lead does give "street addresses" as an example of Benford's Law. It also says that Benford's Law "tends to be most accurate when values are distributed across multiple orders of magnitude". This is not generally true of street numbers. I'm not sure about cheques, but with invoices, in my experience, the numbers have a set number of digits, e.g. from 0001 to 9999. In this case, I think the initial digits would be evenly distributed, and not follow Benford's Law.--Jack Upland (talk) 22:30, 2 August 2016 (UTC)
Where I live, I very commonly come across 1-digit, 2-digit, 3-digit, and (in the city) 4-digit street addresses. (Not all on the same street!) This is in Massachusetts, USA. Is it different in other regions?
Invoices or checks from a single company will probably not follow Benford's law, but the set of all invoices I receive (which come from many different companies) probably will. I might get a 7-digit invoice number from my phone company, a 3-digit invoice number from my local electrician, etc. --Steve (talk) 14:29, 3 August 2016 (UTC)
As before I am not an expert, so this is just my speculation. To be truly Benford, the numbers must be distributed uniformly on a logarithmic scale. Street addresses, check numbers and invoice numbers are distributed uniformly on a linear scale. The frequency of the most significant digit is a logarithmic function of the digit value. One would expect that changing units would not change the distribution of most significant digits. For instance, if something measured in meters showed a Benford distribution then converting the measurements to feet should still yield a Benford distribution.
Suppose I open a checking account and start with check number 100. When I close the account I’m up to 2699. Number of checks beginning with one: 1100, number beginning with two: 800. Number of checks beginning with three: 100, four: 100, five: 100, six: 100, seven: 100, eight: 100, nine: 100. It is non-increasing, but not Benford.
The total number of checks written over the lifetime of an account for all closed accounts: maybe Benford.
Average sound power I experience in watts per square meter: probably Benford.
Average sound power I experience in dBm per square meter: probably not Benford.
A manufacturer of opamps produces many types and grades from jelly-bean to ultraprecision. The noise specifications vary by a factor of 1000. The noise of each opamp is measured. The measurements are probably Benford.
Number of bits I uploaded each minute at my URL, not counting minutes with zero bits uploaded: maybe Benford. Constant314 (talk) 16:22, 7 August 2016 (UTC)
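
The check-number example above (checks 100 through 2699) can be tallied directly and set beside what Benford's law would predict for the same number of items. This is a minimal sketch; the helper name `leading_digit_counts` is an assumption made for this illustration.

```python
import math
from collections import Counter

def leading_digit_counts(numbers):
    """Return a list of nine counts: how many of `numbers` start with 1, 2, ..., 9."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    return [counts[d] for d in range(1, 10)]

checks = range(100, 2700)  # check numbers 100..2699 inclusive
observed = leading_digit_counts(checks)
total = len(checks)

for d, count in enumerate(observed, start=1):
    expected = total * math.log10(1 + 1 / d)  # count Benford's law would predict
    print(f"digit {d}: observed {count:5d}, Benford would predict about {expected:7.1f}")
```

The observed counts are non-increasing in the digit, but the sharp drop after 2 and the flat tail from 3 to 9 look nothing like the smooth logarithmic decline of Benford's law.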