Talk:Benford's law/Archive 2

From Wikipedia, the free encyclopedia

Error

I'm not convinced the log10 should change to log100 just because we look at two digits instead of one. The number is still base10.

Here's an example: Numbers that start with 1 should comprise 30.1% of the numbers. If we subdivide all the numbers beginning with 1 into 10,11,12,...19, we should expect these ten sub-numbers to add up to the 30.1% expectation of all numbers beginning with 1. Using Log10 (and NOT Log100) yields:

10 - 4.14%

11 - 3.78%

12 - 3.48%

13 - 3.22%

14 - 3.00%

15 - 2.80%

16 - 2.63%

17 - 2.48%

18 - 2.35%

19 - 2.23%


(summing the distributions)

yields 30.1%

. . .which is exactly what we would expect.

Using Log100, on the other hand, will yield only half of the expected value. You can duplicate this result for all the ranges 1-9.

Caleb B caleb@tcad.net —Preceding unsigned comment added by 69.29.42.173 (talk) 21:21, 10 June 2008 (UTC)

You are right. If you take log 100, then the cumulative probability is log_{100} (100) - log_{100} (10) = 1/2, whereas it should be 1. The right probability of a group of digits n (=10,...,99) is log_{10} (n+1) - log_{10} (n), which is also the probability mentioned in Hill's paper. 131.155.15.29 (talk) 09:46, 9 July 2008 (UTC)
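The arithmetic in this thread can be checked with a short script. This is only a sketch: the function name `two_digit_prob` is made up for illustration, and the formula is the log_{10}(n+1) - log_{10}(n) expression quoted in the reply above.

```python
import math

def two_digit_prob(n: int) -> float:
    """Benford probability that the first two digits are n (n = 10..99)."""
    return math.log10(n + 1) - math.log10(n)

# Probabilities for the leading pairs 10, 11, ..., 19
teens = {n: two_digit_prob(n) for n in range(10, 20)}

# Their sum telescopes to log10(20) - log10(10) = log10(2), about 30.1%
total_teens = sum(teens.values())

# Over all pairs 10..99 the base-10 version sums to 1 ...
total_all = sum(two_digit_prob(n) for n in range(10, 100))

# ... whereas the base-100 version sums to only 1/2
total_all_log100 = sum(
    math.log(n + 1, 100) - math.log(n, 100) for n in range(10, 100)
)

for n, prob in teens.items():
    print(f"{n} - {prob:.2%}")
print(f"sum for 10..19: {total_teens:.1%}")
```

The telescoping sum is why the ten sub-probabilities must add up to the single-digit value for a leading 1.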

Paper on the arXiv

I have added a link to a paper on the arXiv, which discusses Shannon entropy and Benford's law and whatnot. Perhaps someone might wish to add something to the page from that, and move the link from "external links" to "references". —Preceding unsigned comment added by Paul Murray (talk • contribs) 00:08, 23 January 2009 (UTC)

Nonsense in the text

It is simply not true that the probability of something is just the area under a curve when drawn in logarithmic scale.

Making a substitution x = 10^y (y is, hence, the coordinate on the logarithmic axis) yields

$$\int p(x)\,dx = \int p(10^y)\,\ln(10)\,10^y\,dy$$

I.e. the probability is the area under the curve f(y) = ln(10) p(10^y) 10^y, which is completely different —Preceding unsigned comment added by 147.231.27.150 (talk) 14:19, 11 February 2009 (UTC)

Right, the transformation needs to be accounted for in the integrand. Where is the problem in the article? Baccyak4H (Yak!) 14:37, 11 February 2009 (UTC) improved OP math markup Baccyak4H (Yak!) 14:41, 11 February 2009 (UTC)
This is Footnote [5] in the article:
"Note that if you have a regular probability distribution (on a linear scale), you have to multiply it by a certain function to get a proper probability distribution on a log scale: The log scale distorts the horizontal distances, so the height has to be changed also, in order for the area under each section of the curve to remain true to the original distribution. See, for example, [1]"
--Steve (talk) 09:23, 12 February 2009 (UTC)
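A numerical sanity check of the change of variables discussed in this thread may help. It is a sketch under assumptions: the exponential density `p` and the interval [0.5, 3] are arbitrary choices made for illustration, not taken from the article.

```python
import math

def p(x: float) -> float:
    """Example density on the linear scale (assumed): exponential, rate 1."""
    return math.exp(-x)

def f(y: float) -> float:
    """Density of y = log10(x); includes the Jacobian factor ln(10) * 10**y."""
    x = 10.0 ** y
    return math.log(10) * p(x) * x

def midpoint_integral(g, lo: float, hi: float, steps: int = 100_000) -> float:
    """Plain midpoint-rule quadrature of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

a, b = 0.5, 3.0
exact = math.exp(-a) - math.exp(-b)                      # P(a <= x <= b)
on_log_axis = midpoint_integral(f, math.log10(a), math.log10(b))
naive = midpoint_integral(lambda y: p(10.0 ** y), math.log10(a), math.log10(b))

print(exact, on_log_axis, naive)
```

Here `on_log_axis` should agree with `exact`, while `naive` (the area under p plotted against a log axis with no rescaling) does not, which is the point made both by the original poster and by the footnote.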

"Mathematical statement"

Benford's law is a loosely-formulated implication: it says that if you consider numbers drawn from certain natural sources, then their first digits will conform to a certain specific distribution. The section called "mathematical statement" gives a precise formulation of the conclusion. Making the hypothesis mathematically precise is much more problematic. It's claimed that Hill was the first to give a precise mathematical proof of Benford's law, but in fact what he did was to give a mathematical proof of a mathematical statement, which one may or may not agree captures the essence of Benford's law.

By the way, within the article it's hard to figure out what earlier material is cited by the claim that "Ted Hill proved the result about mixed distributions mentioned above." The word "mixed" doesn't appear elsewhere in the article, and I'm guessing that the reference is to the section on multiple probability distributions. Ishboyfay (talk) 03:44, 20 February 2009 (UTC)

I agree, if we say that Benford's law is an approximate empirical statement, then it can't be mathematically proven as such. This calls for more careful phrasing than we have now. For your second question, yes it's the multiple probability distributions, I put in a link to clarify. :-) --Steve (talk) 07:23, 20 February 2009 (UTC)
(I'm basically agreeing with the above.) The first sentence of the misnamed section Mathematical statement reads:
"More precisely, Benford's law states that the leading digit d (d ∈ {1, …, b − 1}) in base b (b ≥ 2) occurs with probability P(d) = log_b(d + 1) − log_b(d) = log_b((d + 1)/d)."
But as much as this "kind of" states it "correctly", this is not a mathematical statement at all, since it is extremely unclear what the word "probability" means here. Alas, the word "probability" has no meaning at all here.
I'm not saying that Benford's law *cannot* be stated mathematically, but that is a very slippery endeavor.
I strongly recommend that Wikipedia stick to true statements and avoid false, and -- like the above sentence -- meaningless ones.Daqu (talk) 18:26, 11 May 2009 (UTC)

Income differences?

Can you really count income to "distributions that cover many orders of magnitude rather smoothly"? Is there a significant portion of the population that has ten times more, and ten times less, income than the average?Mumiemonstret (talk) 12:59, 25 September 2009 (UTC)

Well in the US, see here, 80% earn 15,000 to 150,000 dollars. So yeah, I guess that's mostly within one order of magnitude. Can you think of a better example? Or we could soften the wording: "distributions that span several orders of magnitude rather smoothly" :-) --Steve (talk) 03:02, 26 September 2009 (UTC)

Prime numbers may follow benfords law

http://www.physorg.com/news160994102.html —Preceding unsigned comment added by 208.71.237.254 (talk) 17:58, 11 May 2009 (UTC)

Not really. The first digits of primes up to 10^n are fairly evenly distributed for large n. The picture is different for primes up to 2×10^n for large n or 3×10^n etc. but even then it does not approach Benford's law. --Rumping (talk) 12:07, 20 November 2009 (UTC)
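The claim about primes is easy to probe empirically. The following is a rough sketch, with the cutoff 10^5 chosen arbitrarily for speed and the helper name `primes_up_to` invented for illustration.

```python
import math
from collections import Counter

def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]

primes = primes_up_to(10 ** 5)          # cutoff is an assumption
counts = Counter(int(str(q)[0]) for q in primes)
shares = {d: counts[d] / len(primes) for d in range(1, 10)}
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"primes {shares[d]:.3f}", f"Benford {benford[d]:.3f}")
```

With a cutoff that is a power of ten, the share of primes starting with 1 stays far below the Benford value of about 0.301.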

Linkfarm cleanup

I saw my removal of the WP:LINKFARM was partially reverted, so I wanted to make it clear why I pulled out the links that have been restored.

The primary problem is with ELNO#1: "Any site that does not provide a unique resource beyond what the article would contain if it became a featured article," which I would say describes all the remaining links. I don't think any of them contain anything special that would be unavailable if someone took the time to flesh out this article. Some of them would definitely be worth using to add or cite material, though, which would improve the article and have the nice side effect of retaining the links.

Some of the links have problems beyond ELNO#1, though:

5. Links to web pages that primarily exist to sell products or services, or to web pages with objectionable amounts of advertising. For example, the mobile phone article does not link to web pages that mostly promote or advertise cell-phone products or services.
11. Links to blogs, personal web pages and most fansites, except those written by a recognized authority. (This exception is meant to be very limited; as a minimum standard, recognized authorities always meet Wikipedia's notability criteria for biographies.)
(Additionally, it seems likely that this link violates the policy's directive not to link to pages that violate copyright law.)
8. Direct links to documents that require external applications or plugins (such as Flash or Java) to view the content, unless the article is about such file formats. See rich media for more details.
13. Sites that are only indirectly related to the article's subject: the link should be directly related to the subject of the article. A general site that has information about a variety of subjects should usually not be linked to from an article on a more specific subject. Similarly, a website on a specific subject should usually not be linked from an article about a general subject. If a section of a general website is devoted to the subject of the article, and meets the other criteria for linking, then that part of the site could be deep linked.

There is one notable exception, the Benford Online Bibliography. The front page, which is what we linked to, doesn't have very much information, but clicking through provides some great resources. I should have been more careful to keep it the first time.

Thoughts? — Bdb484 (talk) 21:14, 4 February 2010 (UTC)

I have done quite a lot of spam cleanup, and in general I would be inclined to agree with you. But for this particular article, the links are helpful and each has a very high content-to-noise ratio. Benford's law is a very strange observation and each of the external links provides some useful insight. We could spend a couple of hours here and reluctantly agree to remove maybe two or three of the links – what particular benefit would arise from that? I know about WP:OTHERSTUFF, but the time spent rearranging entirely innocent links here would have much better effect by cleaning some real spam, for example, WT:WikiProject Spam. For an example of a true linkfarm, see here (now cleaned up). In summary, all Wikipedia's procedures involve the application of common sense (with very few exceptions, see WP:5P), and the current external links are not linkspam and they each have different but useful information that assists the reader, so my opinion is that none should be removed. If you really want, the couple of links requiring Java or whatever can be flagged, however none of the links go directly to a page that requires some application to see what the page offers. Johnuniq (talk) 00:49, 5 February 2010 (UTC)
I agree with Johnuniq. All of the current links are very useful to the reader (myself included). In fact, I would be in favour of adding a few more, as long as they help to further illustrate this peculiar ratio. It pops up all over the place. A list of places (with associated links) would further this article. --Thorwald (talk) 01:54, 5 February 2010 (UTC)
I'm 100 percent with you on the primacy of common sense over Wikipedia "rules," but this isn't a case where the two conflict. If the links are that helpful, then the information can be pulled into the article and cited. As it stands, the external link section is no more useful than googling Benford and seeing what pops up. Thanks to the Benford Bibliography link, we already have a collection of high-quality links that easily surpasses what we have here.
The benefit from working out which links should stay and which should go is simple: improving the article. If you find something particularly useful in one of these links, then go ahead and add it to the article with a citation. Then the reader has access to more information that is better organized -- all without losing links to the pages in question.
That way, everybody gets what they want, no? — Bdb484 (talk) 02:10, 5 February 2010 (UTC)
Sounds good. I agree that is a better approach. Just don't delete the links until we have time to include the relevant information in the article (with citations). --Thorwald (talk) 02:26, 5 February 2010 (UTC)
My instinct is to agree that useful content from an external link should be incorporated into the article, with the link used as a reference – that is one of the first things we say to people who add links to their website on fifty different articles. However, I think this topic has rather unusual attributes that make that procedure unworkable since most of the external links have too much detail for a general article here, yet they each have something useful to say. The Benford Online Bibliography link that you moved to the top of the list is only of interest to a serious researcher (I would be inclined to restore it to the "More mathematical" section). Johnuniq (talk) 02:35, 5 February 2010 (UTC)
I'm not in any rush to pull any of the links. They were initially pulled down as part of WP:BRD cycle, so I'm happy to give anyone plenty of time to work it out.
From my review of the bibliography, it seems that it has plenty for both the lay reader and the experienced mathematician. For example, the first link they offer is to the Radio Lab segment, which presented a very accessible introduction to Benford. It also provides a lot of links to plain-language news coverage from the Wall Street Journal, the New York Times, Washington Post. I'd be inclined to leave it where it is, but I wouldn't object if you feel strongly about it. — Bdb484 (talk) 02:45, 5 February 2010 (UTC)
I could be mistaken, but it seems like there hasn't been anything done in the way of clean-up in the two months-plus since we talked about this. Is anybody working on this? — Bdb484 (talk) 20:03, 6 April 2010 (UTC)
Above I have explained that your suggestion, while admirable in general, is difficult to implement (and possibly unhelpful) in this particular case. You could try getting more opinions at WP:ELN. Johnuniq (talk) 03:17, 7 April 2010 (UTC)
I do remember you offering that opinion, though I don't remember you offering anything to substantiate it. Just the same, when it was requested that I allow some time for the relevant material to be included in the article, I was happy to do so. After more than two months, no apparent effort has been made to that end.
The folks at ELN generally support abiding by WP:ELNO, so I'm not sure what more they would have to offer to the discussion. But if you think they'd believe there's a reason to disregard the rules that are good enough for every other article, you should probably take that route yourself, as the burden for establishing a reason to keep each of these links falls on you. If not, though, I'll be happy to handle the EL clean-up myself. — Bdb484 (talk) 04:16, 7 April 2010 (UTC)

I compared the current external links list with what existed one year ago. In the last 12 months:

  • Eight links are the same (with some tweaking, but same site).
  • Four links have been removed ([2], [3], [4], [5]).
  • Three links have been added:
  1. Benford Online Bibliography, an online bibliographic database on Benford's Law.
  2. Benford’s law, Zipf’s law, and the Pareto distribution by Terence Tao
  3. From Benford to Erdös, WNYC radio segment

The first new link seems essential, the second is by Terence Tao which automatically qualifies it, and the third seems hard to disagree with (although I have not heard it). Given that eight links have been considered satisfactory for at least a year, and the three new ones seem totally suitable, I do not see why any should be pruned. I notice that Sbyrnes321 reverted your removal of the links, and Thorwald posted in agreement with keeping the links above (although the second post supports conversion to cited material). I think it fair to conclude that consensus favors keeping the links. Johnuniq (talk) 08:52, 7 April 2010 (UTC)

Example #2

Here is an example using factorials. From OEIS A008905 ref, Noe has a list of the first 1000 or so leading digits of the consecutive factorials. Here is the distribution that is seen from that sample vs Benford:

digit Benford n!
1 0.30 0.29
2 0.18 0.18
3 0.13 0.12
4 0.10 0.10
5 0.08 0.07
6 0.07 0.09
7 0.06 0.05
8 0.05 0.05
9 0.05 0.05

--Billymac00 (talk) 04:53, 25 April 2010 (UTC)
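The table above can be reproduced approximately without the OEIS list. This sketch assumes the sample is n = 1 to 1000 (the comment only says "the first 1000 or so"), so small differences from the posted figures are possible.

```python
import math
from collections import Counter

N = 1000  # assumed sample size: factorials 1!, 2!, ..., 1000!

# Leading decimal digit of each factorial, read from its string form
counts = Counter(int(str(math.factorial(n))[0]) for n in range(1, N + 1))

observed = {d: counts[d] / N for d in range(1, 10)}
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

print("digit Benford n!")
for d in range(1, 10):
    print(f"{d} {benford[d]:.2f} {observed[d]:.2f}")
```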

Bizarre crashing problems

This is really weird, but I've found that, for some reason (at least on my computer, an iMac running Mac OS X Snow Leopard), this page crashes when viewed in Chrome, but the talk page loads fine. Meanwhile, Safari can load the page itself, but not the talk page. I haven't tried any other browsers or operating systems, but I'd still like to know if anyone else has encountered this and if any explanation is known. 75.69.192.96 (talk) 05:00, 23 May 2010 (UTC)

Best to post this at WP:Village pump (technical) (with a link to Benford's law). Johnuniq (talk) 05:08, 23 May 2010 (UTC)

This crashing issue is happening for me, too. Only the Benford's Law page, and the page loads fine, but the moment you attempt to scroll down the thread crashes. It happens with the current Chrome and Chrome-dev release, as well as the current daily build of Chromium. Oh, and Safari and Firefox, too. apraetor —Preceding undated comment added 01:43, 16 June 2010 (UTC).

  • I'm in Safari and Chrome 5 on 10.6.4 and I don't see a problem with either page. But I would post to VPT if you are sure that this problem is unique to this page and that browser setup. 184.59.8.54 (talk) 18:52, 16 June 2010 (UTC)
    I was going to post on VPT on behalf of the two people with the problem, but when I searched that page for "crash" I saw that similar complaints have been made, with the reply that it is a browser bug and the supplier of the browser should be notified. Johnuniq (talk) 00:29, 17 June 2010 (UTC)

Explanation 2

see also discussion above

I just read in a post the following: "Benford’s law arises naturally if the data under consideration span several orders of magnitude—for example, the first digits of the powers of two obey Benford's law" – this seems to me much more intuitive than the current "This distribution of first digits arises whenever a set of values has logarithms that are distributed uniformly, as is approximately the case with many measurements of real-world values". Are they equivalent, or perhaps complementary? For me, as a layman, the "spanning orders of magnitude" thing made much more sense and instantly gave me a vague but intuitive grasp of why the law works. Do you think it could be integrated in the lead? --Waldir talk 00:55, 15 December 2010 (UTC)

Phrasing like that is in Benford's law#Limitations, but not in the lead right now. I agree, it should be. --Steve (talk) 01:10, 15 December 2010 (UTC)
I'm not very familiar with this topic, so I'd rather not add it myself. Do you think you could do that? --Waldir talk 15:57, 17 December 2010 (UTC)
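The powers-of-two example in the quoted post ties the two phrasings together: the values span many orders of magnitude, and the fractional parts of n·log10(2) spread evenly over [0, 1), which is the "uniform logarithms" condition. A small sketch, with the range n = 0 to 999 chosen arbitrarily:

```python
import math
from collections import Counter

N = 1000  # assumed range: 2**0, 2**1, ..., 2**999

# Leading decimal digit of each power of two
counts = Counter(int(str(2 ** n)[0]) for n in range(N))
observed = {d: counts[d] / N for d in range(1, 10)}
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"powers of two {observed[d]:.3f}", f"Benford {benford[d]:.3f}")
```

A leading 1 occurs exactly once for each digit length, because doubling a number that starts with 1 always gives one that starts with 2 or 3; since 2**999 has 301 digits, the count for digit 1 over this range is 301.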

Graphic caption wrong?

A logarithmic scale bar. Picking a random x position on this number line, roughly 30% of the time the first digit of the number will be 1.

I understand this (the 1 to 2 zone is bigger), but isn't that only due to the scale of the graph? If an x is chosen at random visually by distance, then I agree, but if it is a random x variable, it should over time hit all the digits equally, no? —Preceding unsigned comment added by 216.7.125.201 (talk) 17:09, 28 January 2011 (UTC)

The caption should be reworded to be clearer. Maybe "Roughly 30% of this line consists of numbers that begin with the digit 1: Numbers between 0.1 and 0.2, between 1 and 2, between 10 and 20, etc."? What do you think?
Also the picture is pretty hard to read. There doesn't seem to be a better option: [6]. It should really be SVG... --Steve (talk) 20:04, 28 January 2011 (UTC)

scale invariance

I'm not sure the example with feet and yards is correct. I'm not a statistician, but what I understand based on Harold Jeffreys' book (1939; Chap. 3) is that the problem arises when you are dealing with scales that have different numbers of dimensions. There is no reason to think that a measurement squared should have a different distribution than a measurement cubed. Imagine that you have cubes of random sizes. There is no reason to think that the distribution of the volumes of the cubes should look different from the distribution of the areas. If the measurements gave a uniform distribution for the area (equal numbers of each digit), it would give a non-uniform distribution for the volume and vice versa. The only distribution that doesn't change when you transform from the area to the volume (or take any other power) is to assign the uniform probability to the logarithm. This also illustrates why Benford's law only applies to scales that can't be negative. A cube can't have a negative volume or a negative area. There is an asymmetry in things that can only be positive because when you go towards the negatives you hit a wall at zero whereas you can go to infinity towards the more positive values. Measurements that can be negative don't have this asymmetry. Going towards the negatives is the mirror of going towards the more positives and thus things tend to get distributed uniformly.--BenE (talk) 04:50, 6 December 2008 (UTC)

Strictly, cubes can have negative volume, if they can have negative sides. But the whole thing is related to scale invariance which happens to be associated with power laws and so in turn follows the solution to the dimension problem. Whether you prefer
log(2)-log(1) = log(6)-log(3), or
log(2)-log(1) = log(sqrt(2))-log(sqrt(1))+log(sqrt(20))-log(sqrt(10)) = log(cbrt(2))-log(cbrt(1))+log(cbrt(20))-log(cbrt(10))+log(cbrt(200))-log(cbrt(100))
as an illustration is a matter of personal choice, though I think the former is easier to understand in the context of first digits.--164.36.38.240 (talk) 10:26, 8 January 2009 (UTC)

The point about scale invariance should be trivially obvious. For things like river length or income there is no obvious natural unit to count in. We could measure river length in miles, kilometres, or any historical measure and the law still applies. Similarly income will usually be measured in the local currency but the law will hold true whether we use USD, Indian Rupees or gold ounces (for the same data). This is in contrast to things like population where the obvious unit to count in is people.

The "why" link looks like some individual is just showing their mathematical naivete. Either put an explanation there or leave it alone but asking "why" is basically saying "I don't get it" - which is fine but not something anyone else needs to know. —Preceding unsigned comment added by 88.104.110.246 (talk) 17:28, 1 March 2011 (UTC)
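The unit-independence point in this thread can be illustrated with a simulation. This is a sketch under stated assumptions: the "lengths" are synthetic values with logarithms uniform over six decades, and the yards-to-feet factor of 3 stands in for any change of unit.

```python
import math
import random
from collections import Counter

def first_digit(x: float) -> int:
    """Leading significant digit of a positive number, via scientific notation."""
    return int(f"{x:.15e}"[0])

def digit_shares(values: list[float]) -> dict[int, float]:
    """Share of values starting with each digit 1..9."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic lengths in yards: log10 uniform on [0, 6) (assumed data)
lengths_yards = [10 ** rng.uniform(0, 6) for _ in range(200_000)]
lengths_feet = [3 * y for y in lengths_yards]  # 1 yard = 3 feet

benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
shares_yards = digit_shares(lengths_yards)
shares_feet = digit_shares(lengths_feet)

for d in range(1, 10):
    print(d, f"{benford[d]:.3f}", f"{shares_yards[d]:.3f}", f"{shares_feet[d]:.3f}")
```

Multiplying by a constant only shifts the logarithms, so a distribution whose logarithms are uniform over whole decades keeps the same first-digit shares in either unit.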

Election fraud

Quite amusing to see that exact same "magic box" claimed to prove fraud in Iranian elections can be used to prove that Obama "stole" the US pres. elections. Deckert, Myagkov and Ordeshook go over this in quite some detail. The basic error in using BL as a "magic box" is the assumption that voters are iids; especially not true in small precincts. Heck, they even show that using that analysis one concludes that elections have been stolen in some US precincts for decades in a row. Tijfo098 (talk) 03:50, 23 March 2011 (UTC)

Amusing...? I tend to find that thing quite depressing... blatant misunderstandings like that. JaeDyWolf ~ Baka-San (talk) 15:47, 23 March 2011 (UTC)

Scale invariance argument

It appears to be based on the paper of R.S. Pinkham. On the distribution of first significant digits. Ann. Math. Statist., 32:1223–1230, 1961. (open access) But someone should check the details before adding it as source. Tijfo098 (talk) 02:30, 23 March 2011 (UTC)

I was right, see slide #26 here. Probably a more citable secondary ref exists somewhere. Tijfo098 (talk) 22:28, 23 March 2011 (UTC)

Question about distributions

The article states that the law holds only for data sets whose logarithms are uniformly and randomly distributed. It further states that data sets following normal distributions would not follow the law. That makes sense, but the text does not elaborate on what kind of data sets would have their logarithms uniformly and randomly distributed. Do I understand correctly that a set of completely random numbers would not follow this law? If so, what properties should a set of data have in order for the logarithms of those data to be uniformly and randomly distributed? In other words, what is it exactly that makes real-life financial data, for example, not distributed in a uniform manner? I think a short sentence addressing this would do this otherwise helpful article a lot of good. Any takers?—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); March 29, 2012; 17:32 (UTC)

A simple set that satisfies Benford's and trivially has uniformly distributed logs is just a_n = e^{x_n}, where x_n is randomly distributed in the range [0, 10]. The a_n then span several decades and satisfy Benford's. I need to check this, but I'm pretty sure normally distributed data does follow Benford's -- but only if the width of the distribution spans several decades, which is unlikely for a single stock to do. This is kinda a tricky issue but I'll see what I can do on it later. a13ean (talk) 14:47, 30 March 2012 (UTC)
Thanks, this helps somewhat and I'll be looking forward to your addition. By the way, "e" in "e^{x_n}" above is the Euler's number, correct? If so, can't it be any other arbitrarily selected constant?—Ëzhiki (Igels Hérissonovich Ïzhakoff-Amursky) • (yo?); March 30, 2012; 15:01 (UTC)
It is Euler's number, and it's an arbitrary choice here just because it plays nicely with the natural log. There's lots of other functions that are mostly log-distributed across several decades and similarly follow Benford's. a13ean (talk) 15:45, 30 March 2012 (UTC)
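The exponential construction from this thread can be tried directly. The sketch below is illustrative only: sample size and seed are arbitrary, and the second data set (exponents uniform on [0, 4·ln 10]) is an added comparison, not something proposed above. With exponents on [0, 10] the values cover about 10/ln 10 ≈ 4.34 decades, which is not a whole number, so the agreement with Benford is approximate rather than exact.

```python
import math
import random
from collections import Counter

def first_digit(x: float) -> int:
    """Leading significant digit of a positive number, via scientific notation."""
    return int(f"{x:.15e}"[0])

def digit_shares(values: list[float]) -> dict[int, float]:
    """Share of values starting with each digit 1..9."""
    counts = Counter(first_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

rng = random.Random(1)  # fixed seed so the run is repeatable
n_samples = 200_000

# a_n = e**x_n with x_n uniform on [0, 10]: about 4.34 decades
shares_ten = digit_shares([math.exp(rng.uniform(0, 10)) for _ in range(n_samples)])

# Comparison (assumption): x_n uniform on [0, 4*ln(10)], exactly 4 decades
upper = 4 * math.log(10)
shares_whole = digit_shares([math.exp(rng.uniform(0, upper)) for _ in range(n_samples)])

benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
for d in range(1, 10):
    print(d, f"{benford[d]:.3f}", f"{shares_ten[d]:.3f}", f"{shares_whole[d]:.3f}")
```

In this setup the [0, 10] case overweights the digit 1 (roughly 0.35 instead of 0.30) because of the partial fifth decade, while the whole-decade case matches Benford up to sampling noise. Replacing e by another base only rescales the exponent range, which is consistent with the answer that the choice of e is arbitrary.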


MORE QUESTIONING ON THE BENFORD'S LAW DISTRIBUTION

Hi, the article says that the mathematical constants shown in the chart also follow the Benford's law pattern, BUT it says the chart uses "the first significant number" of each constant. Does that mean it is excluding "the number to the left of the decimal" of the constant? I don't know of any constants higher than 5, yet the chart implies there are constants that start with 6, 7, 8, and even 9, and I do not know of any constant that starts with that high a number. Is the article saying the chart is excluding the number to the left of the decimal in the constant, or am I reading the chart wrong?

173.238.43.211 (talk) 05:26, 25 July 2012 (UTC)

Digital signal processing

An editor (128.178.7.131) suggested the following reference:

  • The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith, chapter 34.
    "Digital Signal Processing usually involves signals with either time or space as the independent parameter, such as audio and images, respectively. However, the power of DSP can also be applied to signals represented in other domains. This chapter provides an example of this, where the independent parameter is the number line. The particular example we will use is Benford's Law, a mathematical puzzle that has caused people to scratch their heads for decades. The techniques of signal processing provide an elegant solution to this problem, succeeding where other mathematical approaches have failed."

I am restoring the IP's suggestion (which they removed, possibly due to a format problem) as the content appears very interesting, and I hope to read it one day... Johnuniq (talk) 09:17, 14 December 2012 (UTC)

It's Ref 17 in the article already...the footnote is not properly templated though. --Steve (talk) 23:18, 14 December 2012 (UTC)
Ouch, and it's in external links as well. I'll try to pay more attention to this topic, and might get around to fixing the ref. Johnuniq (talk) 23:28, 14 December 2012 (UTC)
I (former 128.178.7.131) removed the comment because the book chapter is already referenced; however, I think the content of the chapter should be reflected in the text since it provides a very interesting explanation, a statistical test and insights into why the law applies to some distributions and not to others. Furthermore, it is relatively easy to understand (e.g. for an engineer). Moutonnette (talk) 16:10, 17 December 2012 (UTC)
I agree. The reason I did not get around to fixing the reference formatting (now fixed by Steve) is that I read chapter 34 from the reference and my first reading did not find a page to verify the points in the article where it is used. Yet, the DSP approach should be mentioned (and its observation that finding the leading digit is equivalent to multiplying by powers of ten), although unfortunately there would be a long wait before I have time to do that myself. Johnuniq (talk) 22:33, 17 December 2012 (UTC)

The DSP book is discussing everything in terms of DSP procedures because the point of the book is to teach DSP and practice using it. Here, the only goal is to teach Benford's law. There is no ulterior motive. Therefore we can make things much much simpler than the DSP book does.

A broad probability distribution on a log scale. The total area of red divided by the total area of blue is approximately the same as the width of a red bar divided by the width of a blue bar. Therefore this distribution satisfies Benford's law to high accuracy.

The discussion in the DSP book corresponds to the Benford's law#Limitations section. The DSP book is much much wordier and harder to follow (because it is delving into extraneous topics), but there is virtually no understanding of Benford's law that you get from the DSP book that you would not get from the Limitations section. Please refer to the figure on the right...

  • "Ones scaling test" section: Multiplying all the numbers in the distribution by 1.01 shifts the curve a bit to the right. If you keep shifting the curve to the right, the total red area goes up a bit, then down a bit, then up a bit, then down a bit. It's obviously periodic because the pattern of vertical bars is periodic. The DSP book is pointing out this fact to lead up to the Fourier series discussion later...
  • "Writing Benford's law as a convolution" section: Basically a description of the figure on the right.
  • "Solving in the frequency domain" and "Solving Mystery #2" sections: The distribution shown on the right is very broad and smooth over many orders of magnitude. Therefore the areas and widths are related, as described in the figure caption. At Benford's law#Limitations, there is also a plot of the opposite case, a narrow distribution that does not satisfy Benford's law. In the DSP book, this same fact -- the relation between distribution width and Benford's law accuracy -- is "explained" by invoking properties of Fourier series. But I don't see any benefit to mentioning Fourier series. All you need to do is look at the two side-by-side graphs in the article, and the fact becomes abundantly obvious. The Fourier series discussion adds nothing. (Well, it could be used to quantify the relationship between width and Benford's law accuracy, subject to certain mathematical assumptions ... but such extreme level of detail is not appropriate for the Wikipedia article.)
  • "Solving Mystery #1" section: The pattern of the bars is related to logarithms. It sounds obvious, and it is obvious...
  • Rest of the chapter -- various examples and details most of which are already discussed in the Wikipedia article.

Therefore I don't think any new content has to be added from the DSP book beyond what's already in the article. (But I'm biased...) --Steve (talk) 03:29, 18 December 2012 (UTC)

OK, I see you have been working on this article a long time, and have thought about the topic a lot more than me. For now, I just want to add one thing I found a couple of days ago. Ref 18 is Fewster, R. M. (2009). "A simple explanation of Benford's Law". There is an overview here which includes a link to a pdf of the full paper. Johnuniq (talk) 06:41, 18 December 2012 (UTC)

fallacious explanation

normal distributions can't span several orders of magnitude

That may be true for the examples given (IQ, human heights) but it's not true of normal distributions in general, which have nonzero probability distribution over the entire real line. Nor do I believe it to be even approximately true for all normal distributions since the variance could be extremely wide.

However, if one "mixes" numbers from those distributions, for example by taking numbers from newspaper articles, Benford's law reappears.

Wouldn't the central limit theorem suggest that mixing distributions would produce something normal? DAVilla (talk) 11:00, 24 December 2012 (UTC)

More generally, if there is any cut-off which excludes a portion of the underlying data above a maximum value or below a minimum value, then the law will not apply.

This is an extremely strong claim, so much so that counter-examples are trivial. DAVilla (talk) 11:04, 24 December 2012 (UTC)

Sold on all three. This could use some cleanup. a13ean (talk) 17:54, 25 December 2012 (UTC)

Change in base

Does Benford's law apply to prime numbers? http://primes.utm.edu/notes/faq/BenfordsLaw.html From this, I am thinking about how a change of base to each prime changes the distribution. So...

Benford's_law#Mathematical_statement gives the general form for base b:

P(d) = log_b(d + 1) - log_b(d) = log_b(1 + 1/d)

If there is a change in base, say from one prime number to the next prime number, how would one calculate the change in P(d)? John W. Nicholson (talk) 21:01, 24 January 2013 (UTC)
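A rough sketch of how the change could be computed, assuming the usual base-b generalization P(d) = log_b(1 + 1/d) for leading digits d = 1, ..., b-1. The function name `benford_probabilities` is only illustrative, and the choice of bases 7, 10 and 11 is arbitrary:

```python
import math

def benford_probabilities(base):
    """Return {d: P(d)} for leading digits d = 1 .. base-1 in the given base.

    Assumes the base-b form P(d) = log_b(1 + 1/d).
    """
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}

# Compare a few bases; the probabilities in each base sum to 1.
for base in (7, 10, 11):
    probs = benford_probabilities(base)
    print(base, [round(p, 4) for p in probs.values()])
```

Because log_b(x) = ln(x) / ln(b), moving from base b to base c rescales every P(d) for the digits they share by the same factor ln(b) / ln(c); the larger base then has extra digits that take up the remaining probability.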

Outcomes of exponential growth processes

When this quantity reaches a value of 100, the value will have a leading digit of 1 for a year, reaching 200 at the end of the year... Early in the fourth year, the leading digit will pass through 8 and 9. The leading digit returns to 1 when the value reaches 1000, and the process starts again, taking a year to double from 1000 to 2000.

Can someone explain the meaning of the end of the paragraph that "the process starts again, taking a year to double from 1000 to 2000" since during year 4, the quantity will increase from 800 to 1600 and in year 5, it will change from 1600 to 3200. Don't the time intervals for each of the initial digits vary throughout the exponential growth? e.g. 4 will be the leading digit in the third year as q increases from 400 to 800, for a shorter time than in year 6 when q increases from 3200 to 6400. Ankh.Morpork 18:29, 23 December 2012 (UTC)

I believe you are asking whether increasing from 4000 to 5000 will take exactly the same amount of time as increasing from 400 to 500. Because 500/400 = 5000/4000, log2(500/400) = log2(5000/4000), and that equals 0.32192809488736235. So it will take 0.32192809488736235 years to increase from 400 to 500, or from 4000 to 5000, or even from 5000 to 6250. Nolancapehart (talk) 23:49, 1 May 2013 (UTC)
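The arithmetic above can be checked with a few lines of Python. This is only a sketch that assumes the quantity doubles once per year, as in the article's example; `years_to_grow` is an illustrative helper name:

```python
import math

def years_to_grow(start, end, doubling_time_years=1.0):
    """Time for an exponentially growing quantity to go from start to end."""
    return doubling_time_years * math.log2(end / start)

# The elapsed time depends only on the ratio end/start, not on the magnitude.
for start, end in [(400, 500), (4000, 5000), (5000, 6250), (100, 200), (1000, 2000)]:
    print(start, end, years_to_grow(start, end))
```

Under the same assumption, each leading digit occupies the same stretch of time in every decade, since the intervals 400 to 500 and 4000 to 5000 have the same ratio.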

Scale invariance partly circular

The scale invariance section says, "The law can alternatively be explained by the fact that, if it is indeed true that the first digits have a particular distribution, it must be independent of the measuring units used (otherwise the law would be an effect of the units, not the data). "

It's a proof by contradiction, assuming that "the law is an effect of the units" is false, but it never proves this step. We clearly need better sources for this section. Superm401 - Talk 23:28, 31 January 2012 (UTC)

I would not agree. It essentially says that when scale invariance (i.e. not depending on units) of the distribution of first digits happens, this implies Benford's law, so that step does not need to be proved. --Rumping (talk) 20:56, 31 May 2013 (UTC)

Primes

Here is a paper which is dealing with prime numbers and zeros following Benford's law:

http://arxiv.org/PS_cache/arxiv/pdf/0811/0811.3302v1.pdf

I hope it is useful. John W. Nicholson (talk) 01:46, 8 March 2013 (UTC)

That paper has statements like "Note in figure 1 that primes seem however to approximate uniformity in its first digit. Indeed, the more we increase the interval under study, the more we approach uniformity (in the sense that all integers 1, ..., 9 tend to be equally likely as a first digit)" which suggests to me that prime numbers (like positive integers) do not follow Benford's law.--Rumping (talk) 21:13, 31 May 2013 (UTC)
When you say "that paper" and two papers have been mentioned (one in this section and one in the prior section), I am unsure which you are referring to without looking at the article again. But, without looking, and knowing how the graph in the first article shows what they mean by "approach uniformity", I would strongly suggest that you look at it again. Log graphs tend to have a large range for digit 1 and a smaller range for digit 9, even with an "approach [to] uniformity". This means that there is a grouping like what Benford's law requires. John W. Nicholson (talk) 02:44, 2 June 2013 (UTC)
"That paper" was referring to http://arxiv.org/PS_cache/arxiv/pdf/0811/0811.3302v1.pdf and the quotes come from there. To give an example, if you look at primes smaller than 100,000, then the number which start with 1 is 1193, with 2 is 1129, with 3 is 1097, with 4 is 1069, with 5 is 1055, with 6 is 1013, with 7 is 1027, with 8 is 1003, and the number which start with 9 is 1006. This is almost uniform, and a long way from Benford's law. Go up to a higher power of 10 and the distribution will tend to be even closer to uniform, and in the limit it is uniform. --Rumping (talk) 10:51, 21 June 2013 (UTC)
So you are saying that for regular numbers less than 100,000 the count of numbers which start with 1 is different from the counts for 2, 3, 4, 5, 6, 7, 8, and 9? John W. Nicholson (talk) 04:45, 23 June 2013 (UTC)
No, there are 11,111 of each, which is uniform. And for primes the distribution is almost uniform, and closer to uniform as you increase the limit. --Rumping (talk) 09:42, 31 July 2013 (UTC)
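For anyone who wants to reproduce the counts discussed in this thread, here is a minimal sketch using a simple sieve of Eratosthenes. The helper names `primes_below` and `leading_digit_counts` are illustrative only:

```python
def leading_digit_counts(numbers):
    """Count how many of the given positive integers start with each digit 1..9."""
    counts = {d: 0 for d in range(1, 10)}
    for n in numbers:
        counts[int(str(n)[0])] += 1
    return counts

def primes_below(limit):
    """Sieve of Eratosthenes: return all primes p with p < limit."""
    is_prime = bytearray([1]) * limit
    is_prime[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit, p)))
    return [n for n in range(limit) if is_prime[n]]

LIMIT = 100_000
print("integers:", leading_digit_counts(range(1, LIMIT)))
print("primes:  ", leading_digit_counts(primes_below(LIMIT)))
```

Raising `LIMIT` to a higher power of 10 shows how the prime counts behave as the range grows.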

Explanations

I think that sections 4.1 and 4.2 should be reversed in order. The reason is that it is the scale invariance that is the primary reason for this phenomenon. It's true that exponential growth processes display the phenomenon, but the reason is that exponential growth processes are one of a number of ways that scale invariance can arise. Putting exponential growth processes in first place overemphasizes this particular way of obtaining the phenomenon at the expense of the more primary underlying reason.

I was led to look at the article because of a recent entry on Andrew Gelman's statistics blog where Benford's law was discussed. One of the comments indicated that it was exponential growth that was the cause of the phenomenon (I responded just below it about the more general reason). It may be that the person who made that comment had read part, but not all of the WikiPedia entry (which had been cited by Andrew Gelman), and came away thinking that exponential growth was "the" explanation. Reversing the order of the two entries would put the early emphasis where it belongs: On the scale invariance. Bill Jefferys (talk) 13:18, 13 October 2011 (UTC)

I disagree that scale invariance is "the primary reason for this phenomenon". The height of human adults does not satisfy Benford's law when the heights are measured in feet, and it also does not satisfy Benford's law when the heights are measured in meters. I don't see anything within the "scale invariance" argument that would explain why Benford's law does apply to the lengths of rivers but does not apply to the heights of human adults. If something deserves to be called "the primary reason for this phenomenon", it should of course explain both why it works when it works, and why it doesn't work when it doesn't work. I think the real "primary explanation" is the one in the "Limitations" section (poor organization there), although of course I'm biased. :-) --Steve (talk) 17:21, 20 October 2011 (UTC)
Scale invariance obviously only applies when the phenomenon being measured involves numbers varying by at least several orders of magnitude. This is the case with rivers, and not the case with human heights, which don't vary even over one order of magnitude, considering newborn infants and basketball players at the extremes.
My point is that exponential growth processes are only one mechanism that produces scale invariance, so that putting it first is misleading (see the Gelman blog entry). It should be second (and your comment doesn't refute this). Bill Jefferys (talk) 17:54, 20 October 2011 (UTC)
"Scale invariance obviously only applies when the phenomenon being measured involves numbers varying by at least several orders of magnitude." This is not "obvious" based on the wikipedia article explanation as currently written. (At least, not obvious to me!) The current explanation just says, "if it is indeed true that the first digits have a particular distribution, it must be independent of the measuring units used", but does not say a word about why or whether "it is indeed true that the first digits have a particular distribution". You seem to have a deeper understanding of the "scale invariance argument" than just what's written here now, and I hope you take some time to improve that part of the article. :-) --Steve (talk) 21:42, 24 October 2011 (UTC)
Actually, the "limitations" section you pointed to gives a reasonably good discussion of why the phenomenon has to vary over several orders of magnitude.
I'm thinking that a (perhaps) retitled "limitations" section might come before the section we're discussing, and then reorder the two entries in that section to put the exponential growth processes second last. Again, my motivation here is that the person who commented on Andrew Gelman's blog may well have read that section, halfway through, decided that exponential growth was the explanation, when the whole issue is much more than that since exponential growth is only one way that the phenomenon can arise, and there are more fundamental considerations, as both you and I point out. Bill Jefferys (talk) 23:09, 24 October 2011 (UTC)
I agree with your understanding that the primary reason is scale invariant data. However, exponential growth is a stand-alone reason for B's law to apply even when scale variant data is being used. Ankh.Morpork 18:52, 23 December 2012 (UTC)


The article states that normal distributions "can't span several orders of magnitude." This is not true: the support of a normal distribution is the entire real line, and hence every normal distribution spans every order of magnitude. There are also normal distributions that span several orders of magnitude with a high probability: just consider the normal distribution with mean 0 and variance 10^10. — Preceding unsigned comment added by Bhglaser (talkcontribs) 17:14, 5 November 2013 (UTC)

I would say scale invariance is a characteristic of Benford data, but not an explanation.--Jack Upland (talk) 06:54, 23 May 2014 (UTC)

Positional Number System

It's simply an artifact of the positional number system. Of the first 20 numbers, 50% start with a 1. Of the first 30 numbers, 33% start with 1. Of the first 100 numbers, 10% start with 1. Of the first 200 numbers, 50% start with a 1. Compounding this is that for the vast majority of distributions in nature, the frequency of a given value diminishes as the value approaches the upper and lower limits of the range. In the direction of the upper limit, the number of possible values starting with 9 compared to 8 is only a tenth of the number of possible values starting with a 10, 11, 12, ... 19 compared to 9. See this symmetrical range as an example:

Value: Frequency
5: 2
6: 3
7: 5
8: 8
9: 8
10: 5
11: 3
12: 2

i.e. 10 values beginning with a 1, purely due to the decimal numbering system.

If every number had a unique symbol, the effect would disappear. — Preceding unsigned comment added by 217.13.157.77 (talk) 15:00, 20 January 2014 (UTC)

For each positive integer n, this graph shows the probability that a random integer between 1 and n starts with each of the nine possible digits. For any particular value of n, the probabilities do not precisely satisfy Benford's Law; however, looking at a variety of different values of n and averaging the probabilities for each, the resulting probabilities do exactly satisfy Benford's Law.[citation needed]
Your first few sentences ("of the first 20 numbers, 50% start with a 1", etc.) are more-or-less the same as this picture and description in the article...
It's odd that your example demonstrating Benford's law is actually a dataset which is very far from satisfying Benford's law. None of the numbers starts with a 2! In fact, approximately-normal distributions never satisfy Benford's law.
I don't understand what you're getting at with "positional number system". It sounds like you are saying something very obvious: If we didn't write numbers using digits, then there would be no such thing as a first digit, and therefore there would be no Benford's law. :-P --Steve (talk) 17:41, 10 March 2014 (UTC)

I understand the positional number system argument, but it is simply false. Why stop at 200? Go up to 999 (for the argument's sake). At that point the frequency of first digits is evenly distributed at 1/9 - 111 each.--Jack Upland (talk) 07:05, 23 May 2014 (UTC)
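The fractions discussed in this section can be computed exactly with a short script. This is just a sketch, and `leading_one_fraction` is an illustrative name; it counts the integers from 1 to n whose first digit is 1:

```python
from fractions import Fraction

def leading_one_fraction(n):
    """Exact fraction of the integers 1..n whose leading digit is 1."""
    ones = sum(1 for k in range(1, n + 1) if str(k)[0] == "1")
    return Fraction(ones, n)

# The fraction swings up and down as n grows, and is exactly 1/9 at n = 999.
for n in (20, 30, 100, 200, 999):
    frac = leading_one_fraction(n)
    print(n, frac, float(frac))
```

The exact values differ slightly from the rounded percentages quoted above (for example, 11 of the first 20 numbers start with 1), but the up-and-down pattern is the same.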

1/f noise

I am surprised there is no mention of 1/f noise.

It struck me that the probability of a number with lead digit 1 is 0.301 which is log10(2). This makes sense because we are talking of the range 1.0 to just under 2.0. That represents a doubling (ie factor of 2).

Likewise if you add up the probabilities for lead digits 2 and 3 you get another 0.301. And similarly for the sum of the probabilities for lead digits 4 through 7. In other words, corresponding to 2 further doublings.

So this is like looking at a noise spectrum from electromagnetic noise in the environment. Consider an arbitrary frequency F. From F to 2F we find a certain amount of energy incident on some area within some period of time. We also find the same energy for the range 2F to 4F. And similarly for 4F to 8F. So this is in fact 1/f noise. It makes sense, because a bunch of photons at frequency nF requires n times the energy of an equal-numbered bunch at frequency F. So as we go up in frequency, we might expect the amplitude to go down.

Extrapolating this to other things such as earthquake incidences etc sounds viable. Probably quite a lot of things work this way in fact.

So there is nothing magical about the distribution of leading digits.

My point in all of this is that there is no simple explanation of the simple origin of what otherwise seems like some sort of magic. — Preceding unsigned comment added by 86.30.114.58 (talk) 22:32, 24 May 2014 (UTC)
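The "doubling" observation above is easy to verify numerically. A minimal sketch, assuming the base-10 form P(d) = log10(1 + 1/d); the grouping into octaves follows the comment, and the variable names are illustrative:

```python
import math

def benford(d):
    """Benford probability of leading digit d in base 10."""
    return math.log10(1 + 1 / d)

# Each group of digits covers one doubling: [1,2), [2,4), [4,8).
octaves = {"1": [1], "2-3": [2, 3], "4-7": [4, 5, 6, 7]}
octave_sums = {name: sum(benford(d) for d in digits) for name, digits in octaves.items()}

for name, total in octave_sums.items():
    print(name, round(total, 4))
print("log10(2) =", round(math.log10(2), 4))
```

The sums telescope: for instance, the group 4-7 adds up to log10(8/4), which is log10(2). Digits 8 and 9 cover only part of the next doubling, so their sum is smaller.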

A probability distribution on a log scale. In this example, the ratio of the red to blue width is very close to the ratio of the red to blue area. So it follows Benford's law, pretty accurately, but not perfectly.
See plot on the right. In pure 1/f noise, the plot would be exactly flat. So it would indeed follow Benford's law perfectly! Oh, except that if it's exactly flat, then the area under the curve would be infinite ... so it's not a probability distribution!
More realistically, the plot might be approximately flat over a certain range, maybe many orders of magnitude, and then tail off on both sides. Kinda like the example on the right, actually. I mean, if I saw the curve on the right when measuring a noise power spectrum, I might call it "approximately 1/f noise, at least in the frequency range 10-1000". But is this really 1/f noise? Maybe a clearer description of this curve would be "a probability distribution that (on a log scale) is relatively smooth and flat over several orders of magnitude". So that's the kind of wording which is already in the article.
1/f noise (in a sufficiently broad but not infinite bandwidth) is undoubtedly an explanation of one way that these kinds of probability distributions can appear (i.e. probability distributions that are pretty wide and flat on a log scale). However, (1) Calling it an "explanation" is too charitable, because you still need to explain why the 1/f noise happens in the first place, (2) It is kinda an unusual case in practice -- I mean, if you look at the real-world datasets that exemplify Benford's laws, only a very small fraction of them are related to 1/f noise. Street addresses, tax documents, lengths of rivers, populations of cities, physical constants, etc. etc.: None of these are related to 1/f noise, as far as I can tell. The 1/f noise related examples are kinda rare in practice. Maybe some earthquake statistics (as you suggest).
I'm not opposed to mentioning 1/f noise in the article, as a category of processes from which you can get probability distributions that (on a log scale) are relatively smooth and wide. It could be mentioned alongside various other such processes, like exponential growth processes. --Steve (talk) 18:02, 28 May 2014 (UTC)

Explanation

The section entitled "Explanation" is difficult to understand. Please remember that Wikipedia is a general encyclopædia and is equally likely to be viewed by individuals of average mathematical knowledge as it is to be viewed by specialists. 68.49.208.76 06:30, 2 September 2007 (UTC)

You are correct - but accurate, easy-to-understand explanations of technical issues are very, very hard to write. Wikipedia is full of excellent articles that are useless to 99% of humanity for that very reason. And Benford's Law is particularly tricky to explain to laymen; it's so counter-intuitive. - DavidWBrooks 11:53, 2 September 2007 (UTC)

Please take a look into: http://www.dspguide.com/ch34.htm. In short, it explains the "law" as an artifact of the manipulation of the data. Pretty well written and easy to understand. Marco 12:51, 14 September 2008 (UTC)

I dunno. This is the heart of its argument, quoted:
In answer to our question, the logarithmic pattern of leading digits derives solely from sf(g) and the convolution, and not at all from pdf(g).
Putting that into layman's terms will be - well, interesting. Feel free to take a shot at it, though. - DavidWBrooks (talk) 12:54, 14 September 2008 (UTC)
When you get past all the making-a-lot-out-of-quantitative-details, he's just saying that if a probability distribution is broad and reasonably flat on a log-scale, then Benford's law is expected to hold. That's an easy point to explain and get across. Just put a broad, smooth distribution on a properly-labeled log scale. Maybe, for comparison, put a really sharp distribution on a log scale. Readers will be able to look at the log scale and see how Benford's law should hold in the first case but not the second.
The problem is that a "properly-labeled log scale" includes labels for 1,2,3,4,5,6,7,8,9,10,20,30,etc., and it's a bit tricky to do that in the programs I have. I'll try though.... --Steve (talk) 19:22, 14 September 2008 (UTC)
The logarithmic-scale probability density function for an exponential decay process. The area under the curve between two points is proportional to the probability that the function has a value between those two points. Note that this looks different from the conventional depiction of exponential decay. This is because the x-axis is distorted by the logarithmic scaling, so the height also has to be distorted for the area under the curve to be correct.
Here's a start. There's a published paper that says that exponential-decay probability functions satisfy Benford's Law to within a few percent. This matches up with the fact that the probability-density function is reasonably smooth over about two orders of magnitude, as shown in this pic. The text would describe how, since 30.1% of the horizontal (logarithmic) number line lies between 1's and 2's, it's not surprising that around 30.1% of the area under this particular curve lies between 1's and 2's. --Steve (talk) 20:50, 14 September 2008 (UTC)
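As a rough numerical check of the "within a few percent" statement, the leading-digit probabilities of an exponential distribution can be computed directly from its cumulative distribution function. This sketch assumes rate 1 and sums over the decades 10^-20 to 10^20, which is an arbitrary but more than sufficient truncation; the function name is illustrative and this is not taken from the cited paper:

```python
import math

def exponential_leading_digit_prob(d, rate=1.0, decades=range(-20, 21)):
    """P(leading digit of X is d) for X ~ Exponential(rate).

    Sums P(d*10^k <= X < (d+1)*10^k) over the given decades k.
    expm1 is used so that the tiny terms for small k keep their precision.
    """
    total = 0.0
    for k in decades:
        low = rate * d * 10.0 ** k
        high = rate * (d + 1) * 10.0 ** k
        # exp(-low) - exp(-high), written with expm1 for accuracy
        total += math.expm1(-low) - math.expm1(-high)
    return total

for d in range(1, 10):
    exact = exponential_leading_digit_prob(d)
    benford = math.log10(1 + 1 / d)
    print(d, round(exact, 4), round(benford, 4))
```

For rate 1, each digit's probability lands within a few percentage points of the Benford value, though not exactly on it.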
OK, I decided against that particular picture but added the appropriate explanation and citations with diagrams. Hope it's helpful. --Steve (talk) 05:36, 9 October 2008 (UTC)
The new material looks good. Derek farn (talk) 11:12, 9 October 2008 (UTC)
Unfortunately, the new material is original research. We therefore must forget it. Wikipedia is an advanced technology for collecting conservative garbage! --Javalenok (talk) 12:23, 24 October 2011 (UTC)
Not original research: The argument is the same as the cited article in "The American Statistician". [Well, the discussion above was in 2008, and the "American Statistician" article was in 2009. So if you had objected in 2008, you might have had a point... But right now it is definitely not original research.] --Steve (talk) 21:42, 24 October 2011 (UTC)

Berger and Hill sum this up well by saying, "A simple explanation? Are you sure?"--Jack Upland (talk) 02:02, 25 May 2014 (UTC)

A probability distribution on a log scale. In this example, the ratio of the red to blue width is very close to the ratio of the red to blue area. So it follows Benford's law, pretty accurately, but not perfectly.
Jack -- To the right is what we're talking about. I personally think that this is a "simple explanation". Your comment implies that you disagree. Do you think that it's "not simple", or "not an explanation", or both? Why? Could you elaborate? --Steve (talk) 06:30, 28 May 2014 (UTC)
I was referring to this article - [1] - which I cited on the page. They know far more than I do, but personally I find it mysterious that I can pick up an atlas and demonstrate Benford's Law, and I have not found an explanation that I understand or even believe.--Jack Upland (talk) 10:19, 28 May 2014 (UTC)

References

Ah, gotcha. In the notation of that article, I have been emphasizing (4) (or (3) with an improved wording that negates their complaints), following the Fewster article. I.e., "Benford's law occurs when, if you plot the probability distribution on a log scale, it's spread out pretty smoothly over many orders of magnitude."
They are retorting that this is not a complete explanation, because I have not said why real-world data-sets often have probability distributions that span several orders of magnitude relatively smoothly. Well, I agree with that! The answer is, indeed, not simple. The answer for "stock market prices" is certainly different than the answer for "area of islands".
For stock market prices, they tend to change proportionally, like a $1 stock would go up or down by 1% ($0.01) with a comparable frequency as a $1000 stock would go up or down by 1% ($10). When you have that kind of dynamics, acting over a long period of time, the probability distribution (on a log scale) tends to get more and more smooth and spread out.
For the population of cities and towns, actually the same explanation still works I think. A town of 200 people might grow to 250 people or shrink to 150, with a similar frequency as a town of 200,000 people might grow to 250,000 people or shrink to 150,000. So again, you get the dynamic where the probability distribution (on a log scale) tends to get more and more smooth and spread-out over time.
For the area of islands, on the other hand, I don't know why the probability distribution would be pretty broad and smooth on a log scale. I don't know enough about the geological processes that cause island formation.
So, in that respect, I tend to agree with that paper. We should not expect to find a single simple explanation that applies to every real-world situation where you find Benford's law.
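The "proportional change" dynamic described above for stock prices and town populations can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: every item starts at 1, each step multiplies it by a random log-normal factor of roughly 10%, and the parameter values and function name are illustrative rather than taken from any source:

```python
import math
import random

def simulate_leading_digits(n_items=2000, n_steps=400, step_sigma=0.1, seed=0):
    """Leading-digit frequencies after repeated random proportional changes.

    Each step multiplies the value by exp(N(0, step_sigma)), which is the same
    as adding Gaussian noise to the logarithm of the value.
    """
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n_items):
        log10_value = 0.0  # every item starts at the value 1
        for _ in range(n_steps):
            log10_value += rng.gauss(0.0, step_sigma) / math.log(10)
        # The fractional part of log10(value) determines the leading digit.
        frac = log10_value - math.floor(log10_value)
        digit = min(9, int(10 ** frac))  # guard against rounding up to 10
        counts[digit] += 1
    return [c / n_items for c in counts[1:]]

freqs = simulate_leading_digits()
for d, f in enumerate(freqs, start=1):
    print(d, round(f, 3), round(math.log10(1 + 1 / d), 3))
```

With these settings the values end up spread over several orders of magnitude, and the simulated frequencies come out close to the Benford values. With far fewer steps the spread on a log scale stays narrow and the match is poor.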

The conclusion of the paper is: "Although many facets of BL now rest on solid ground, there is currently no unified approach that simultaneously explains its appearance in dynamical systems, number theory, statistics, and real-world data. In that sense, most experts seem to agree with Fewster that the ubiquity of BL, especially in real-life data, remains mysterious." I don't think you agree with that. (And incidentally I can't find that in Fewster's paper.)--Jack Upland (talk) 00:45, 30 May 2014 (UTC)

I would say that "there is no unified approach that explains why you often find probability distributions that, on a log scale, are spread out pretty smoothly over many orders of magnitude, in many different fields like dynamical systems, number theory, statistics, and real-world data". I think it can happen for many different reasons.
It's quite similar to saying "there is no unified approach that explains why multiplication occurs in many different fields like dynamical systems, number theory, ..." It occurs when calculating the area of a rectangle, and when calculating compound interest, and when writing down laws of physics, etc. etc. What a coincidence! Or is there a "unified approach" that explains why multiplication occurs everywhere? Not really. You can try, but it would be so extremely vague that there's no real point!
(Loosely speaking, whenever you are doing primarily multiplication and division, rather than addition and subtraction, you can easily wind up with data that is spread out pretty smoothly over many orders of magnitude.)
I agree with Berger and Hill that there is "currently no unified approach", but I would drop the word "currently". :-P
I agree with you that the reference to Fewster makes no sense. Maybe it's from personal communication, rather than what Fewster wrote in her paper. --Steve (talk) 18:18, 29 May 2014 (UTC)
Update: I rewrote a bit, mainly merging the Limitations section with the Explanations section. I added a footnote with quotes from the Berger and Hill paper that I think make it very clear that they accept the argument in the Limitations section, and don't consider it a "simple explanation" because it does not explain why we so often come across data-sets that, when plotted as a probability distribution on a log-scale, vary smoothly over several orders of magnitude. --Steve (talk) 18:44, 30 May 2014 (UTC)