Wikipedia:Reference desk/Archives/Mathematics/2007 September 20

Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.

September 20

Factoring a Binomial

Hey, I was studying together with an older friend and I was trying to factor this equation: ax^2+bx+c=0, right? You know, trying to set it to (x-p)(x-q)=0 to find the roots.

So I go about trying to solve it the way they taught me in school, you know, find two numbers whose product=ac and their sum=b. I ask my friend how he does this problem, and he shows me that he uses this equation: "(-b±√(b^2-4ac))/2a". (That means it's all divided by "2a".)

I look into it, and I find out that the 2 numbers he gets aren't the p and q roots I'm looking for, but r1 and r2: 0=a[(x-r1)(x-r2)]. So I say to him he's wrong, but he just shows me this problem sheet he has titled "Signdiagrams" (which has the answers on the back), and it tells him I'm wrong. What's going on? -- 01:45, 20 September 2007 (UTC)

a*x^2 + b*x + c = 0
We also have
(x-p)*(x-q) = 0
x^2 - (p+q)*x + (p*q) = 0
Multiply both sides by A
A*x^2 - A*(p+q)*x + (A*p*q) = A*0
Now all we have to do is match the following
Find p and q such that
(1) A = a (really easy task)
(2) A*p*q = c
(3) -A * (p + q) = b 02:04, 20 September 2007 (UTC)

Umm, what's the difference? If 0=a(x-r1)(x-r2), then 0=(x-r1)(x-r2). i.e. r1 and r2 are p and q. --Spoon! 02:28, 20 September 2007 (UTC)
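Well, try substituting your equations into his. I'll assume a = 1 for simplicity, but you can put it back in:
x^2 + b*x + c = (x-p)*(x-q) = x^2 - (p+q)*x + p*q
Therefore: b = -(p+q) and c = p*q, and also x = (-b ± √(b^2 - 4c)) / 2
So you have: x = ((p+q) ± √((p+q)^2 - 4*p*q)) / 2
Looking at the square root term: (p+q)^2 - 4*p*q = p^2 - 2*p*q + q^2 = (p-q)^2
This means: x = ((p+q) ± (p-q)) / 2, that is, x = p or x = q
So... what's the difference between your two formulae? - Rainwarrior 02:57, 20 September 2007 (UTC)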
The two of you may be solving different problems. Consider the quadratic polynomial 5x^2+5x−60. When x is three the polynomial is zero, which is what your friend finds using the quadratic formula,
x = (−5 ± √(5^2 − 4·5·(−60))) / (2·5) = (−5 ± 35) / 10, so x = 3 or x = −4.
But when we consider factors, we notice that every term of the polynomial is a multiple of five; in fact,
5x^2+5x−60 = 5(x^2+x−12) = 5(x−3)(x+4).
We will have this same problem whenever the coefficient of the x^2 term is not one, as we can easily verify by multiplying out (x−p)(x−q). For his root formula a common nonzero constant factor, λ, cancels out.
Before we try to find p and q, we need to divide the polynomial by a. Thus
x^2 + (b/a)x + (c/a) = (x−p)(x−q).
In other words, we should have pq = c/a and p+q = −b/a. --KSmrqT 02:51, 20 September 2007 (UTC)

ahh.. thanks. I don't really understand what happened at

(x-p)*(x-q) = 0
x^2 - (p+q)*x + (p*q) = 0

but I'm sure I'll figure it out. -- 04:28, 20 September 2007 (UTC)

It's multiplication of polynomials using the law a(b+c)=ab+ac. Thus: (x-p)*(x-q) = (x-p)*x-(x-p)*q = x*x-p*x-x*q+p*q = x^2 - (p+q)x + (p*q) . Bo Jacoby 10:46, 20 September 2007 (UTC).
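The relationship discussed in this thread can be checked numerically. The following is a minimal Python sketch, not anything posted by the participants; the helper name `quadratic_roots` is made up for illustration, and it assumes a nonzero leading coefficient and real roots. It applies the quadratic formula to the example 5x^2+5x−60 and compares the product and sum of the roots with c/a and −b/a.

```python
import math

def quadratic_roots(a, b, c):
    """Return both real roots of a*x^2 + b*x + c = 0 via the quadratic formula.

    Hypothetical helper for illustration; assumes a != 0 and b^2 - 4ac >= 0.
    """
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        raise ValueError("expects a != 0 and real roots")
    root = math.sqrt(disc)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

a, b, c = 5, 5, -60
p, q = quadratic_roots(a, b, c)

print(p, q)           # the two roots
print(p * q, c / a)   # product of the roots next to c/a
print(p + q, -b / a)  # sum of the roots next to -b/a
```

The roots satisfy pq = c/a and p+q = −b/a, not pq = c and p+q = −b, which is why the leading coefficient has to be divided out before matching against (x−p)(x−q).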

packing in S^3

How much is known about sphere packing (or for that matter the Thomson problem) in S^3, the surface of the four-dimensional sphere? If N>>120, are there regions of face-centered cubic packing, or what? What are the flaws like? —Tamfang 04:34, 20 September 2007 (UTC)

I think in four dimensions an optimal configuration is known. The article I think mentions that it's known exactly for 8 and 24, but for a lot of them it's up in the air, as far as I know. We have upper and lower bounds, and for some dimensions we know that the optimal packing is not a lattice, though most of the time the best known packing is a lattice. There's a really neat article linked in the references in Kissing number here: "Kissing numbers, sphere packings, and some unexpected proofs." PDF - Rainwarrior 16:26, 20 September 2007 (UTC)
Thanks but I think you misunderstood the question. —Tamfang 08:15, 21 September 2007 (UTC)
I'm sure I did, but I'm curious as to what it is exactly. Are you trying to pack 120 spheres onto a four dimensional sphere? - Rainwarrior 08:44, 21 September 2007 (UTC)
I'd bet that the best such packing for N=120 is to inscribe the spheres in the dodecahedra of the 120-cell. – In S^2, the surface of the 3-ball, when N>>12 you typically get regions of mildly distorted "kissing pennies" lattice, with 12 flaws where a disk has 5 neighbors rather than 6; in other words, the Voronoi cells are usually 12 pentagons and N-12 hexagons, increasingly regular as N increases. – Since face-centred cubic is the densest packing of balls in flat 3-space, I'm guessing that it shows up in curved 3-space, mildly distorted, when N is big enough. —Tamfang 19:19, 21 September 2007 (UTC)
Well, if your suggested arrangement is optimal, I like the comparison between your 120-cell cell-centres, and the known optimal dodecahedron's face-centres in S^2. I've never looked at the arrangement in terms of "flaws" before; that's an interesting idea. I notice you say "usually" with your 12 pentagon rule for N > 12 in S^2, for what N does it not hold up? - Rainwarrior 03:27, 22 September 2007 (UTC)
Sometimes the convex hull shows a square instead of two triangles; for example the best solution for N=24 is (I believe) a snub cube, 6 squares and 32 triangles, rather than the expected 44 triangles, and every node has 5 neighbors [1] – in other words the Voronoi cells are congruent irregular pentagons. —Tamfang 00:30, 24 September 2007 (UTC)
There is no way to form a (topological) sphere of 12 pentagons and 1 hexagon, so for N=13 the Voronoi regions include a square. [2] Tamfang 17:45, 25 September 2007 (UTC)

convergence/divergence problem

For what values of "a" is the integral ∫_0^∞ e^(ax) cos(x) dx convergent?

My answer was none, because no matter what, the cosine factor's oscillation will prevent any convergence.

Yes,no?--Mostargue 08:03, 20 September 2007 (UTC)

Also, using integration by parts several times (and also verifying it with Mathematica):

I found the indefinite integral to be: e^(ax)(a cos(x) + sin(x)) / (a^2 + 1) + C

If that is evaluated at infinity, the sin and the cosine will cause it to diverge, no matter what a is. But even if a=0, it is still divergent because then the integral just becomes sine, and that is also divergent.--Mostargue 08:14, 20 September 2007 (UTC)

I'm assuming here that "a" is a constant, but even if it was another function, I still can't think of how it could prevent divergence.--Mostargue 08:16, 20 September 2007 (UTC)

Are you assuming that a ≥ 0? I don't see that stated anywhere in the problem. What happens if a is negative? Don't rely on your idea that oscillation prevents convergence - this is incorrect. For a counterexample, consider the series
1 − 1/2 + 1/3 − 1/4 + ...
- the partial sums of this series oscillate, but it still converges to ln(2). Gandalf61 08:28, 20 September 2007 (UTC)

Right, but that series is different because the individual terms approach zero. Looking at the graph of the original function, its terms get bigger and bigger. Not a good sign.--Mostargue 08:40, 20 September 2007 (UTC)

Only if a is positive. Think about what happens if a is negative. Gandalf61 08:48, 20 September 2007 (UTC)

If a is negative, then the terms will get smaller and smaller. But that doesn't prove convergence. Hmm.. But how would sin(b) be evaluated then?--Mostargue 08:55, 20 September 2007 (UTC)

Where b is what? If you mean b=infinity, remember you don't (strictly speaking) evaluate at infinity, but rather at finite values and take a limit. Algebraist 11:44, 20 September 2007 (UTC)
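When a is negative, lim_{x→∞} e^(ax) = 0, so lim_{x→∞} e^(ax)(a cos(x) + sin(x)) / (a^2 + 1) = 0, since |a cos(x) + sin(x)| ≤ |a| + 1, a finite constant, and a^2 + 1 is also constant. So ∫_0^∞ e^(ax) cos(x) dx = −a / (a^2 + 1). --Spoon! 13:07, 20 September 2007 (UTC)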
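For readers following the argument, here is a worked version of the limit step. It is a sketch under an assumption: the formulas in this thread were lost in archiving, so the integrand e^{ax} cos x and the lower limit 0 are inferred from the replies ("cosine factor", "if a=0 the integral just becomes sine") and are not confirmed by the original post.

```latex
\begin{aligned}
\int e^{ax}\cos x\,dx &= \frac{e^{ax}\,(a\cos x+\sin x)}{a^{2}+1}+C,\\[4pt]
\int_{0}^{b} e^{ax}\cos x\,dx &= \frac{e^{ab}\,(a\cos b+\sin b)-a}{a^{2}+1},\\[4pt]
\bigl|e^{ab}\,(a\cos b+\sin b)\bigr| &\le e^{ab}\,(|a|+1)\;\longrightarrow\;0
  \quad\text{as } b\to\infty,\ \text{when } a<0,\\[4pt]
\int_{0}^{\infty} e^{ax}\cos x\,dx &= -\frac{a}{a^{2}+1}\qquad(a<0).
\end{aligned}
```

For a ≥ 0 the boundary term e^{ab}(a cos b + sin b) has no limit as b grows, so under the same assumption the integral converges exactly when a is negative.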

standard deviation?

Why is standard deviation calculated with squaring?? Is it just a trick to convert negatives to positives? Wouldn't it amplify the extremely high values to even higher? I'm trying to learn stats and it seems like using absolute difference from the median would make more sense to me to give an indication of spread.

For the data: 2 4 6 7

How far from the median: 3 1 1 2 (reordered as 1 1 2 3)

Now the median of those numbers is 1.5.

Wouldn't that give more info about spread? Why is standard deviation more popular? It seems unintuitive to me.

--Sonjaaa 08:33, 20 September 2007 (UTC)

Yes, the squaring amplifies high values to even higher, but the square root de-amplifies it again. Yes, the definition of standard deviation is unintuitive. But there is a formula for the standard deviation! There is no formula for your intuitive measure of spread - it does not depend on data in an analytical way. (See analytic function). Mathematicians want formulas. Bo Jacoby 10:37, 20 September 2007 (UTC).
Sonjaaa - the measure of dispersion that you describe is called the median absolute deviation or MAD. I can think of two problems with it. The first problem is that it is difficult to calculate for large samples - I think you have to keep track of all the separate sample observations to calculate the MAD, whereas the standard deviation can be calculated just from knowing the number of observations, their sum, and the sum of their squares. The second problem is that the MAD is insensitive to the values of outliers. In your example, look at what happens if we increase the largest observation, 7. If we replace 7 with 8, the MAD increases from 1.5 to 2. But if we then replace 8 by 10 or 20 or even 100, the MAD stays at 2. Intuitively, you would expect 2 4 6 100 to be more dispersed than 2 4 6 8, but with the MAD measure they have the same dispersion. Gandalf61 10:56, 20 September 2007 (UTC)
This might be slightly technical, but the main reason the SD is computed with squaring is that in reality the SD is a derived statistic from a (function of a) more primitive one, the second sample moment, AKA the sample variance. I say primitive because the population variance is an expectation, while the population SD is not. And the variance is computed from squaring, by definition. The SD is, however, in the same units as the original data, so is easier to interpret.
Ironically, what G61 mentions as the second drawback of the MAD is actually a benefit in those circumstances when it is often used. Baccyak4H (Yak!) 13:53, 20 September 2007 (UTC)
A further alternative is mean absolute deviation (from the mean), also MAD. This resembles the SD in that it is sensitive to outliers, and in fact for a population reasonably Normal, MAD ≈ 0.8 SD. There is a link on the page for this MAD to an article identifying some advantages compared with the SD. — 17:14, 20 September 2007 (UTC)
Another reason: the sum of squares is additive, i.e. the sum of squares from error can be added to the sum of squares from the "real effect" to give the total sum of squares. I don't think that works if you don't square it but I'm too tired to work it out. Gzuckier 13:55, 21 September 2007 (UTC)
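The outlier behaviour described in this thread is easy to reproduce. Below is a small Python sketch using only the standard library; the helper names `median_abs_deviation` and `sd_from_sums` are made up for this illustration, and the population form of the standard deviation (dividing by n) is an arbitrary choice. It computes the median absolute deviation and the SD for the three samples mentioned above, and also shows the SD being recovered from just the count, the sum, and the sum of squares.

```python
import math
import statistics

def median_abs_deviation(data):
    """Median of the absolute deviations from the median (hypothetical helper)."""
    m = statistics.median(data)
    return statistics.median([abs(x - m) for x in data])

def sd_from_sums(n, total, total_sq):
    """Population SD from the count, the sum, and the sum of squares only."""
    mean = total / n
    return math.sqrt(total_sq / n - mean * mean)

for sample in ([2, 4, 6, 7], [2, 4, 6, 8], [2, 4, 6, 100]):
    mad = median_abs_deviation(sample)
    sd = statistics.pstdev(sample)
    sd_again = sd_from_sums(len(sample), sum(sample), sum(x * x for x in sample))
    print(sample, mad, round(sd, 3), round(sd_again, 3))
```

The median-based measure stops changing once the largest value moves past 8, while the SD keeps growing. Note that computing the SD from running sums can lose precision for large, tightly clustered values, so this form is meant only to illustrate the point about bookkeeping.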


I am trying to understand a relatively simple concept in statistics. It probably relates more to semantics and interpretation (than to mathematical calculations per se). Nonetheless, I would like to know what the normal / standard / conventional approach would be in the field of statistics. Thanks. Consider the data below, representing the "Student of the Month" at XYZ High School. What would be the proper / appropriate / mathematically correct methods to calculate and report the following statistics to describe this data set or this group of people? The question really boils down to (I think?) ... how does the mathematician / statistician deal with the duplication or repetition of certain group members?

Month      Name           Gender  Race      Class      Age
September  Ann            F       Hispanic  Sophomore  80
October    Bob            M       White     Junior     22
November   Carl           M       White     Senior     53
December   Dave           M       Black     Junior     41
January    Ellen          F       Hispanic  Senior     10
February   Ellen (again)  F       Hispanic  Senior     10
March      Ellen (again)  F       Hispanic  Senior     10
April      Frank          M       Black     Freshman   39
May        Ann (again)    F       Hispanic  Sophomore  80
June       George         M       White     Senior     71

So, what are the correct Statistics for the XYZ High School Student of the Month Program?

  • (01) Number of Students: 7 or 10
  • (02) Number of Males (as a percent): 5 out of 10 ... or 5 out of 7
  • (03) Number of Females: 2 or 5
  • (04) Number of Females (as a percent): 2 out of 7 ... or 2 out of 10 ... or 5 out of 10 ... or what

etc. etc. etc. for:

  • (05) Number and Percent of Whites or Blacks or Hispanics
  • (06) Number and Percent of Freshmen or Sophomores or Juniors or Seniors
  • (07) Average Age of Students: Method A = (sum of all of the ages divided by 10) ... or Method B = (sum of all the distinct people's ages divided by 7)
    • Average Age of Students: Method A = 41.6 years old ... or Method B = 45.14286 years old

I guess the question boils down to this: does Ann (and all of her characteristics) "count" only one time or two times? Similarly, does Ellen (and all of her characteristics) "count" only one, or two, or three times? And why? And does the choice of single (versus multiple) counting depend on which statistic / characteristic is involved (race versus age versus gender, etc.)?

I can see arguments going both ways. Take Ellen, for example. Point: She (her data) should count only one time, because she is only one person / one human being. Counterpoint: She (her data) should count three times, because each of her three wins/awards/titles generically represents a different Student of the Month honoree (January honoree, February honoree, March honoree) (with the irrelevant coincidence that those three honorees just happen all to be Ellen). So, clearly, the distinction is: the number and characteristics of (a) honorees/winners versus (b) distinct honorees/winners. (I think?)

Of course, I want to understand the general concept -- so the above is just a fake / easy / hypothetical data set for illustration. That is, all the specifics and "details" are irrelevant -- that it is a High School, that it is a Student of the Month program, their ages, their races, etc. How does math / statistics properly handle this? Thanks. (Joseph A. Spadaro 13:19, 20 September 2007 (UTC))

This seems like a question about the basic assumptions behind Data modeling.
For example, consider your examination of Ellen. Both distinctions [(a) versus (b)] are relevant -- a statistician should not be forced to choose between them. In instance (a) we are examining attributes of the data set itself ... (i.e., how many "rows" in our table relate to Ellen [this is not precisely correct, but close enough]). In instance (b) we are examining attributes of the entities that our data set happens to be talking about (i.e., How many individual human beings are recorded in our data, and is Ellen one of those human beings).
In a properly normalized data set, you would not have entries such as "Ellen" and "Ellen (again)" [just like we don't have "Male" and "Male (again)"] ... they would all just be "Ellen". Moreover, there would be a separate mechanism to properly distinguish whether multiple instances of "Ellen" actually referred to the same physical human being, or whether there were multiple students who just happen to have the same name.
All of these lines of inquiry are valid, and the issue is just a matter of the way you structure your data. You should not be forced to "pick and choose" which interpretations are valid. dr.ef.tymac 15:52, 20 September 2007 (UTC)
In other words, what kind of an analysis you do depends on what questions you are seeking an answer to. If you are interested in the number of individuals, as opposed to the number of awards, then you count the individual persons. Ultimately, it depends on what you want to do with the answer. It's not any more complicated than that. 19:43, 20 September 2007 (UTC)
(Edit Conflict) Yeah. I would say the main problem is that both the questions you give and their answers are too vague. They have, as you pointed out, more than one interpretation. For instance, on (02), number of males as a percent of what? 5 males out of 7 students? 5 distinct male winners out of 10 months? 5 months with a male winner out of 10 months? 5 male winners in 1 year? 0 males out of 2 females? 5 males out of 5 males? 1 Male out of 2 genders? Each of these is a more specific and more useful question, and is much easier to answer. Each could be an answer to your original question, depending on the circumstances. This illustrates, as it happens, an important concept in statistics - numbers don't lie, but they don't have to. As you can see, you can get wildly different answers to the same question by interpreting it differently. For instance, given identical data regarding aircraft and highway fatalities, you can conclude that it is both more and less safe to take a plane. You could measure deaths per hour of travel, which might very reasonably indicate that flying is dangerous (on an hourly basis). On the other hand, taken on a per-mile basis, flying could be safer. Flying, see, is much much faster than driving, so it takes many fewer hours to travel the same number of miles. Which is the right answer, then? The right answer is that the world is complicated, and any attempt to simplify it will leave out information. If you want a simple answer, accept that it will be wrong. If you want the full answer, accept that it will have to be specific and detailed. Black Carrot 19:47, 20 September 2007 (UTC)
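The two counting conventions discussed in this thread (per award versus per distinct person) can be computed side by side. This is only an illustrative Python sketch: the rows are copied from the table in the question, and it assumes that equal names refer to the same person, which is exactly the kind of identity assumption a normalized data set would make explicit.

```python
# Rows from the question's table: (month, name, gender, age).
# Assumption: equal names mean the same person, so "Ellen (again)" is stored as "Ellen".
awards = [
    ("September", "Ann", "F", 80),
    ("October", "Bob", "M", 22),
    ("November", "Carl", "M", 53),
    ("December", "Dave", "M", 41),
    ("January", "Ellen", "F", 10),
    ("February", "Ellen", "F", 10),
    ("March", "Ellen", "F", 10),
    ("April", "Frank", "M", 39),
    ("May", "Ann", "F", 80),
    ("June", "George", "M", 71),
]

# One entry per distinct person; repeated rows overwrite identical data.
people = {name: (gender, age) for _, name, gender, age in awards}

# Method A: one observation per award. Method B: one observation per person.
mean_age_per_award = sum(age for *_, age in awards) / len(awards)
mean_age_per_person = sum(age for _, age in people.values()) / len(people)

female_awards = sum(1 for _, _, gender, _ in awards if gender == "F")
female_people = sum(1 for gender, _ in people.values() if gender == "F")

print(len(awards), len(people))
print(mean_age_per_award, mean_age_per_person)
print(female_awards, female_people)
```

Neither set of numbers is "the" correct one; the first describes the awards and the second describes the people who received them.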

Development of the decimal system

Hi there. Can anyone give me some basic information or point me towards a relevant website detailing the development of the decimal number system? I'm a teacher who wants to use this information at primary level. Nothing too advanced, just a general timeline of key milestones such as Roman numerals, Arabic digits, invention of the zero symbol, place value etc. Thanks in advance for any help. Kirk UK —Preceding unsigned comment added 14:09, 20 September 2007 (UTC)

History of the Hindu-Arabic numeral system contains some references / links. (Joseph A. Spadaro 14:18, 20 September 2007 (UTC))

domain, range, co-domain, image

Alphonse claims that, in defining functions, mathematicians used to use the term "range" instead of the term "co-domain" but they changed to the latter term because "range" is not precise enough. He also claims that the term "range" is no longer favored, and the preferred term is "image".

Is Alphonse correct? Please explain your answer with simple clarification, please also note this is not homework, just an attempt to get these nits picked. dr.ef.tymac 15:04, 20 September 2007 (UTC)

Um, did you look at our article on range? In short, the usual terminology is that an image of a set is the result of mapping all elements of that set through a function. The co-domain is the set that the outputs of a function are in (in most elementary mathematics, we're looking at functions with a domain and co-domain of the real numbers or, less frequently, the complex numbers). The range is the image of the domain, which is a subset of the co-domain, which depending on the function, may or may not be the entire co-domain. Who is this Alphonse? Donald Hosek 17:13, 20 September 2007 (UTC)

[Figure: graph of the example function f(x) = (4x^3 − 6x^2 + 1)√(x+1) / (3 − x)]
Codomain, range, and image are all precise and in current use. Each has a different meaning, though with some overlap. Consider the example plot (shown right) that opens our function article. We have taken the domain to be the real interval from −1 to 1.5 (including both endpoints), and the codomain to be the same. We can see that all the function values do indeed lie between −1 and 1.5, as they must by definition of codomain. However, the values do not go below approximately −0.72322333715 nor above (1/3)√10 ≈ 1.05409255339, so the range is smaller than the full codomain. The image of the full domain is synonymous with the range, but we can also speak of the image of a portion of the domain; for example, the image of the interval of the domain between 1/2 and 1 is the interval of the codomain between −(1/2)√2 and 0. Another useful term is "inverse image"; for example, the inverse image of 0 consists of all the points in the domain that map to 0, here −1, 1/2, and (1/2)(1±√3). --KSmrqT 17:25, 20 September 2007 (UTC)
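To make the distinction between codomain and range concrete, the range can be estimated by sampling. The Python sketch below rests on an assumption: the formula f(x) = (4x^3 − 6x^2 + 1)√(x+1)/(3 − x) is not stated in the thread text and is inferred here because it reproduces every value quoted above (the zeros, the maximum, and the image of the interval from 1/2 to 1). The grid size is an arbitrary choice.

```python
import math

def f(x):
    """Assumed example function; the formula is inferred, not given in the thread."""
    return (4 * x**3 - 6 * x**2 + 1) * math.sqrt(x + 1) / (3 - x)

# Sample the domain [-1, 1.5] on an even grid, endpoints included.
n = 25_000
xs = [-1 + 2.5 * i / n for i in range(n + 1)]
ys = [f(x) for x in xs]

range_low, range_high = min(ys), max(ys)
print(range_low, range_high)   # approximate range, a subset of the codomain [-1, 1.5]

# Image of the sub-interval [1/2, 1] of the domain.
sub = [f(x) for x in xs if 0.5 <= x <= 1]
print(min(sub), max(sub))

# Inverse image of 0: the points quoted in the reply.
for z in (-1, 0.5, (1 + math.sqrt(3)) / 2, (1 - math.sqrt(3)) / 2):
    print(z, f(z))
```

Sampling only approximates the extremes, so the printed range is an estimate; the exact maximum under this assumed formula is attained at the right endpoint x = 1.5.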

calculating probability of two events occurring at same time


I have a dataset that looks like:

 Event1 occurred at Time1
 Event2 occurred at Time2
 Event1 occurred at Time3
 Event3 occurred at Time4
 Event2 occurred at Time4

I'd like to know how often (how predictably?) different Events occur at or near the same time. But I have no clue which algorithm or technique or analysis to use.

I think I'm looking for an analysis that will assess each pair of events and provide a probability of having them occur within a specific timeframe of each other.

Example Data:

 Doorbell rings at 2:00:00pm
 Dog barks at 2:00:03pm
 Dog barks at 2:10:12pm
 Fridge door opens at 2:11:12pm
 Dog barks at 2:11:45pm
 Doorbell rings at 2:15:00pm
 Dog barks at 2:15:07pm
 Fridge door opens at 2:22:11pm

Sample/Expected results - making up numbers, of course:

 If the doorbell rings, you can expect the dog to bark within one minute 97% of the time.
 If the doorbell rings, you can expect the fridge door to open within 20 minutes 22% of the time. 
 If the fridge door opens, you can expect the dog to bark within one minute 67% of the time.  

What is this kind of analysis called? Can I 'create' it by using 'canned' stuff in something like SQL Server? Other advice?

Many thanks!

JCade 18:27, 20 September 2007 (UTC) Jennifer

(NAE-Not an expert) One option, depending on the computing resources you have available, would be to store and update the entire distribution for each pair or collection of events you want to know about. Say you want to know how soon after a door opens the dog will bark, provided the dog does so within twenty minutes. You could store an array of twenty integers (starting them all at 0), and add one to the proper minute-marks (possibly several if the door keeps opening) every time the dog barks. Once you have the distribution, you can do anything you want with it. Black Carrot 19:29, 20 September 2007 (UTC)
Question: calculating probability of two events occurring at same time
Answer: Please define the term "the same time".
May I suggest that you pick an atomic time interval and stick to it. Say an atomic time interval of 5 seconds. If two events occur in the same time interval then they are simultaneous events. If two events occur in adjacent time intervals then they are near events. Now you can calculate the probability that event A and event B are simultaneous events and/or near events. 00:33, 21 September 2007 (UTC)

NICE! I can easily define my interval for "same time" - and the concept of having "same" and "near" times is valuable. Thank you! But I have to be sheepish here - I still don't know how to calculate the probability of each pair of events being "same" or "near." My dataset is in SQL; is there a SQL tool/utility/function I can harness for this? Is there a term I can google to help me get to the equation part? Thanks again! Ooops! The newbie didn't sign.... JCade 03:21, 21 September 2007 (UTC)

This is a maths reference desk. We can only help you with maths questions and not SQL questions. 04:03, 21 September 2007 (UTC)
Assuming a time interval of 5 seconds. In a 24 hour period there are 17280 time intervals. If event A occurs 172 times in a 24 hour period (ie 172 time intervals which have event A) then the probability of event A in a time interval is 172/17280 = 0.0099537
if event B occurs 400 times (in a 24 hour period) the probability of event B in a time interval is 400/17280 = 0.023148
Next if event B is completely independent of event A then you would expect the probability of both event A and event B occurring in the same time interval to be (172/17280)*(400/17280) = (43/186624) = 2.3E-4
You would expect to find 17280 * 2.3E-4 = 3.98 such events (a time interval that contains both event A and event B) in a day.
I hope this helps. 04:18, 21 September 2007 (UTC)
Er? I'm assuming that event A (or event B) is NOT something that occurs more often in some parts of the day than others. 04:27, 21 September 2007 (UTC)
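The arithmetic in this reply can be reproduced in a few lines. This Python sketch simply restates the calculation above with the same figures (5-second intervals, 172 and 400 occurrences per day); the variable names are made up, and it carries the same assumptions as the reply: the two events are independent, each is equally likely at any time of day, and no interval contains the same event twice.

```python
interval_seconds = 5
intervals_per_day = 24 * 60 * 60 // interval_seconds   # number of 5-second slots in a day

count_a = 172   # intervals per day containing event A
count_b = 400   # intervals per day containing event B

p_a = count_a / intervals_per_day
p_b = count_b / intervals_per_day

# Under independence, the chance that one interval holds both events is the product.
p_both = p_a * p_b
expected_coincidences = intervals_per_day * p_both

print(intervals_per_day)
print(p_a, p_b, p_both)
print(expected_coincidences)
```

If the observed number of shared intervals is far above this expected count, that is a hint that the events are not independent.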

Okay - I see where you're going but I'm not sure it's where I want to go. The calculations you've shown (THANK YOU!) tell me how to calculate probability for this kind of situation. The final result of (ProbA)*(ProbB) gives me the probability of both events occurring at the same time. I'm getting the basic equations - whew! That makes sense, and now..... Is there a way we can remove some of the stiffness of the timeframe? Right now, we've divided a day into 5-second intervals to give us our definition of "concurrent." Can we define "same time" as "within 5 seconds after an event" or "plus-or-minus 5 seconds of an event"? I am trying to see if certain events (or pairs, actually) are not truly independent of each other, or even when two events are duplicates. I want to seek patterns where we can say things like "these two events almost always occur within 5 minutes of each other." I was assuming (bad word, I know) that we would look at the duration of time between pairs of events and analyse those. Perhaps probability isn't the right thing to be calculating? Should I be using another term like "correlation" or "association" or "causality" or something? Or maybe your answer is what I need, and I'm just making this much more complex than it really needs to be. You can tell me if that's the case! Thanks for everything so far!! JCade 16:17, 21 September 2007 (UTC)

Yes, you can create a correlation function, which will tell you (a) whether there is any kind of correlation, and (b) when it happens. If we have the following definitions:

a_i = 1 if event A occurs in time interval i, and 0 otherwise
b_i = 1 if event B occurs in time interval i, and 0 otherwise

Then the correlation function between a and b would be

c(k) = r(a_i, b_(i+k)), the sample correlation between a and the sequence b shifted by k intervals

(See Correlation#Sample correlation for the formula for r). Then, if there is a particular value of k for which the function is significantly large (I can't think of the right way to identify what "significantly large" would be right now, sorry) then you can surmise that, for example, about 5k seconds after the doorbell rings, the dog is likely to bark. Confusing Manifestation 13:26, 22 September 2007 (UTC)

Thank you so much! Some excellent concepts and good references -- now I'll go digest it all. Much appreciated! 14:44, 24 September 2007 (UTC)
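One simple way to get statements of the form "if the doorbell rings, the dog barks within one minute X% of the time" is to count, for each occurrence of the first event, whether the second event follows inside a chosen window. The Python sketch below does this for the example data in the question. It is an illustration only, not the correlation function described above; the helper names `to_seconds` and `follow_rate` are made up, and with so few observations the resulting fractions are not meaningful estimates.

```python
def to_seconds(clock):
    """Convert a string such as '2:11:45pm' to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in clock[:-2].split(":"))
    suffix = clock[-2:].lower()
    if suffix == "pm" and hours != 12:
        hours += 12
    elif suffix == "am" and hours == 12:
        hours = 0
    return 3600 * hours + 60 * minutes + seconds

def follow_rate(events, first, second, window):
    """Fraction of `first` events followed by a `second` event within `window` seconds."""
    first_times = [t for name, t in events if name == first]
    second_times = [t for name, t in events if name == second]
    if not first_times:
        return None  # the first event never occurred, so no rate is defined
    hits = sum(
        any(0 <= later - t <= window for later in second_times)
        for t in first_times
    )
    return hits / len(first_times)

# Example data from the question.
log = [
    ("doorbell", "2:00:00pm"), ("dog", "2:00:03pm"), ("dog", "2:10:12pm"),
    ("fridge", "2:11:12pm"), ("dog", "2:11:45pm"), ("doorbell", "2:15:00pm"),
    ("dog", "2:15:07pm"), ("fridge", "2:22:11pm"),
]
events = [(name, to_seconds(clock)) for name, clock in log]

print(follow_rate(events, "doorbell", "dog", 60))
print(follow_rate(events, "doorbell", "fridge", 20 * 60))
print(follow_rate(events, "fridge", "dog", 60))
```

To judge whether a rate is higher than chance, it would need to be compared with the rate expected if the second event were unrelated to the first, for example using the independence calculation discussed earlier in the thread.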


pi is 3.14 —Preceding unsigned comment added 21:58, 20 September 2007 (UTC)

If that's a question, then the answer is no, since pi is an irrational and transcendental number. Splintercellguy 22:48, 20 September 2007 (UTC)
More specifically, 3.14 < 22/7 – 1/630 < π. See Proof that 22 over 7 exceeds π#Quick upper and lower bounds.  --Lambiam 08:14, 21 September 2007 (UTC)
Here at the cafeteria, pie is only 1.75 (for one slice). Gzuckier 13:57, 21 September 2007 (UTC)
Your comment makes me wonder whether some form of "Blueberry π" appears on menus in Greece. In Brazil, an amusing thing you often see on menus is "X-Burger". This only makes sense when you know that the Portuguese pronunciation of "X" is "cheese", so they make a little cross-language joke. :) --Sean 16:33, 21 September 2007 (UTC)
Probably not, since the Greek pronunciation of π is "pee". Donald Hosek 17:20, 21 September 2007 (UTC)
This is parallel in a sense (but orthogonal in another) to the common English abbreviations "X-mas", "X-tian", etc., which all rely on the fact that the English letter X looks like the Greek letter χ (the first letter of Χριστός, Christ). Tesseran 19:38, 21 September 2007 (UTC)
Pi = circumference / diameter. -- 01:58, 26 September 2007 (UTC)