# Wikipedia:Reference desk/Archives/Mathematics/2007 July 3

Mathematics desk
< July 2 << Jun | July | Aug >> July 4 >
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.

# July 3

## Splitting an ordered set while minimizing group sum standard deviation

Let there be an ordered set of figures, for instance, {1,2,3,4,5}. I must split into N groups, each consisting of a continuous subset of figures. (That is, I may split the above set into 2 groups like this: {{1,2},{3,4,5}}, but not like this: {{1,5},{2,3,4}}.) In splitting, my goal is to minimize the standard deviation of the sums of the figures in each group.

An example: say I am required to split the set {1,1,200,200} into two groups. The correct way to split it, in line with the above rule, is into {{1,1,200},{200}}.

I don't need an answer (although that would be nice) as much as a hint as to where to start looking. (How are such algorithms called, to begin with.) — Itai (talk) 12:00, 3 July 2007 (UTC)

This smells like one of those nasty NP-complete problems, and if that is the case, there isn't really a "good" answer. A partial answer might be possible if you elaborate on what is the application, what is a typical size for the set, whether you would accept approximate solutions, and so on. -- Meni Rosenfeld (talk) 12:28, 3 July 2007 (UTC)
It's worth noting that what you're looking for is a (particular) weak composition of the cardinality of your sequence n, because of the substring constraint. You also know the mean of the sums (useful in calculating their standard deviation), of course. Then, as an approximate solution (or perhaps initial guess) you can start with all N-1 boundaries at position 0 (that is, before the first element, so that all but one substring are empty and the other is the whole sequence), and then for every boundary move it to the right if that reduces the difference between the sums of its left and right substring neighbors. Repeat this adjustment until no moves are possible (this obviously cannot take more than ${\displaystyle O(N^{2}n)}$ time, including the summing). I'm not sure if there's a polynomial-time way to finish the job, but if ${\displaystyle O(2^{N}n)\ll O(2^{n}n)}$ is good enough, you can then consider moving each subset of the boundaries by one element to see if that improves things. I'll let you know if anything else occurs to me; it might be useful to consider how to simply test whether a given choice is optimal. --Tardis 13:41, 3 July 2007 (UTC)
You do not know the mean of the sums, you only know their weighted mean which, if I understand the problem correctly, is irrelevant. You have reminded me that there is no immediately obvious way to test a solution, so despite what I have said before, I'm not even sure this problem is in NP. -- Meni Rosenfeld (talk) 16:46, 3 July 2007 (UTC)
Are you sure we don't know that mean? That mean is just ${\displaystyle {\frac {1}{N}}\sum _{g=1}^{N}\sum _{i=1}^{n_{g}}x_{n_{gi}}={\frac {1}{N}}\sum _{i=1}^{n}x_{i}}$ (with liberally inventive notation), regardless of the sizes of the groups, right? What we don't know is the mean of the averages of the groups. --Tardis 22:24, 3 July 2007 (UTC)
You are of course correct, I've been thinking about this backwards. I do feel mortified for erring while trying to correct another, and I apologize. -- Meni Rosenfeld (talk) 22:36, 3 July 2007 (UTC)
No harm at all done! On the subject of NP membership, though, is there an easy way to test the optimality of, say, a TSP solution? Even if there's not, that's not necessarily news, of course, since we can easily test the decision problem "Does there exist a path no longer than x?" with given path and x, but I feel like the function problem deserves an NP-like test too. --Tardis 23:15, 3 July 2007 (UTC)
Let's call it n numbers in g groups with sums s_1, s_2, ..., s_g, and a mean of m. Is there a simple way to find the best with s_1 <= s_2 <= ... <= s_g? If so, I'd find that and its opposite, then check everything in between. If not, I'd find s_1 <= s_2 <= ... <= s_g with the highest s_1, then the lowest s_g, then the opposite of those and check everything between them. This is just for finding the best solution. If all you need is a good solution, I'd split them up so each group has about the same number of elements, then move the splits between adjacent groups so both groups are closer to the mean until I can't do that any more, then I'd do it so it just decreases the standard deviation, then, if that's not good enough, I'd jiggling the borders a bit and trying again a few times. I'd program the computer to do this. I wouldn't just to it by hand unless it was only about a dozen elements in three or four groups (two groups is trivial). — Daniel 18:34, 3 July 2007 (UTC)
Just remember, If you calculate ${\displaystyle \textstyle \sum _{1}^{N}x_{i}}$ first, finding the sums ${\displaystyle \textstyle \sum _{1}^{S}x_{i}=s_{lower}}$,${\displaystyle \textstyle \sum _{S+1}^{N}x_{i}=s_{upper}}$ are pretty trivial since by finding one of the smaller sums, you find the other by subtraction from ${\displaystyle \textstyle \sum _{1}^{N}x_{i}}$. For the number of resulting groups equaling two, you could find the minimal group sum standard deviation in just 2 passes over the entire dataset. For M > 2 where M is the number of resulting groups, the simplest way to handle this is divide and conquer; Iterate over all possible splittings of the given data set into two sets {A,B} and then try to conquer B by splitting it into M-1 groups. The runtime complexity will be directly proportional to number of possible partitions given your conditions. The number of partitions is given by N-1 Choose M-1 or ${\displaystyle \textstyle {\binom {N-1}{M-1}}}$ if I'm not mistaken. (Say you have 12 numbers... that means there's 11 split points possible... As M is the number of groups that art to be chosen , M-1 is the number of split points that must be chosen as a result). Take this with a grain of salt. Root4(one) 05:15, 4 July 2007 (UTC)
I came back to check if anybody had any additional comments, and the person who had additional comments turns out to be me! For every one of the partitions, you still have to sum over M sums (and generate them, although we did cover a trick for that earlier), so the complexity of my solution is more like ${\displaystyle \textstyle O\left({\binom {N-1}{M-1}}M\right)}$, and of course I'm not even tackling adding complexity of adding numbers of significant size (like 1 MB binary representation anybody?) Again, I still say take with a grain of salt, as without actually writing the program, I can only give what I think the answer should be. Writing the program may reveal something I had not considered. Root4(one) 12:51, 4 July 2007 (UTC)
OP once more. The problem, for those who were wondering, is one of text layout - splitting indivisible paragraphs over an arbitrary number of columns. I am much obliged for all your kind help. Having consulted certain knowledgeable individuals, however, I realized that standard deviation is probably less important to me than minimizing the difference between the shortest and the longest column (in other words, the smallest and largest group sum), which has the added advantage of being much easier to calculate. Furthermore, I have been offered an algorithm which insures, within what seems like a reasonable time (I think with what you'll call ${\displaystyle \textstyle O(n)}$ complexity), that this difference is no more than twice the largest figure in the initial set. (I'll be happy to elaborate and provide a proof if anybody's interested, but here's a quick overview: sum the entire initial set, divide this sum into as many groups as you require, and find within which those ideal division points fall. For instance, were we to divide the set {1,3,1} - whose sum is 5 - into 2 groups, our single ideal division point would fall within the second figure, 3, at a "distance" of 2.5 from the beginning of the data set. Now, add all such disputed figures to the group to their left or right - doesn't matter which, but it must be consistent for all disputed figures - and you'll have wound up with as many groups as you required, the maximum difference between the sums of any two of which is twice the largest figure in the initial set, as said above.)
Anyway, thanks for all your suggestions, guys. I'll keep watching for answers to this question, just for curiosity's sake - and of course, should I amazingly come up with an answer of my own, I'll be sure to post it. — Itai (talk) 00:42, 6 July 2007 (UTC)

## Geodesic distance

At the science desk, I was given the advice to pose my question here: I am trying to find out, what the geodesic distance in a Robertson-Walker-metric is. I came across comoving distance and proper length and I wonder if one of them is the thing I am looking for. The proper length looks quite nice but unfortunately the definition does not seem to be restricted to geodesics. (I wonder if such an arbitrary looking definition makes sense anyway.) I just found that it says "For instance, if one measured the distance along a straight line or geodesic between the two points, one would not be correctly measuring comoving distance." in comoving distance. So it looks as if both of them do not help me, do they? Anyway, can I use Riemannian normal coordinates (exponential coordinates in maths jargon, I think) to calculate the geodesic distance? Or would you propose another ansatz? -- 217.232.40.21 16:43, 3 July 2007 (UTC)

Would that be the Friedmann-Lemaitre-Robertson-Walker metric? If so, it's apparently tied to general relativity, which makes Geodesic (general relativity) promising. Black Carrot 06:38, 4 July 2007 (UTC)
Depending on the metric, computing geodesics — much less geodesic distance — can be a challenge; but this metric is deliberately simplified, and of cosmological interest. Follow the links in the article, search the web and standard physics sources. You can find a discussion in Gravitation (p. 722, Exercise 27.4); I can't promise they compute geodesic distance, but they may help answer the question behind the question. Usually computing a geodesic means solving a differential equation. If we require a geodesic between two given points, this becomes a boundary-value problem, more difficult than an initial-value problem. Computing the length of the geodesic may be the easier part. However, some geometries allow multiple geodesic paths between distant points, which can complicate the determination of "distance". --KSmrqT 09:30, 4 July 2007 (UTC)
Well, for the easiest, spatially flat case, I have the system of ODEs, they are 2nd order nonlinear, coupled with nonconstant coefficents:
${\displaystyle {\ddot {\gamma }}^{0}+H(t)a^{2}(t)\sum _{i}\left({\dot {\gamma }}^{i}\right)^{2}=0}$
${\displaystyle {\ddot {\gamma }}^{i}+2H(t){\dot {\gamma }}^{0}{\dot {\gamma }}^{i}=0}$
At first glance I also thought a highly symmetric manifold as RW-spacetime would admit quite simple geodesics, but it doesn't. What worries me most ist the fact that there do not seem to be geodesics with constant time, as constant ${\displaystyle \gamma ^{0}}$ leads to constant ${\displaystyle \gamma ^{i}}$ as well, through the first equation. Unfortunately Friedmann-Lemaître-Robertson-Walker metric#External Links is not quite helpful. Searching the web is what I did for several days before asking here, so that did not bring enlightenment. Seems like I have to look out for a "bible". Thanks for the hint. I hope it helps me. -- 217.232.42.35 18:05, 4 July 2007 (UTC)
Did you look at the article today? After your first post I glanced at it and filled out missing seminal citations. That should make it much easier to find contemporary papers, because they will cite these. Web searching is a skill worth honing; so much is now on line, the challenge is to know how and where to look. My first impulse in seeking advice for this would be to ask physicists, not mathematicians. More specifically, search and post at the newsgroup sci.physics.relativity. Keep in mind that this metric goes by several different names and abbreviations. --KSmrqT 21:51, 4 July 2007 (UTC)

## Sums and factorials

I've passed some exams lately (and did quite badly), and I couldn't figure out a few questions, especially the proof of this equality :

${\displaystyle {\frac {1}{\sqrt {1-4x}}}=1+\sum _{n=1}^{\infty }{\frac {(2n)!}{(n!)^{2}}}x^{n}}$

I really don't know where this comes from. And I must say I just don't know what to do in such cases, as I haven't had much training with sums. Could you just give some general indications on what to do with such sums (and others)? Because here I just don't know where to start, and it must be so for many others (and even if this sum just simplifies in a special way, could you still give me more advice ?).

Thanks. Xedi 18:10, 3 July 2007 (UTC)

I think the binomial expansion of (1-4x)^(-1/2) might be involved. 80.169.64.22 19:30, 3 July 2007 (UTC)
Yes, indeed. An alternative way is to use Taylor series, though that will be harder in this case. -- Meni Rosenfeld (talk) 19:33, 3 July 2007 (UTC)

## Abstract Algabra

I want to prove the following

Show that if G is nonabelian, then the factor group G/Z(G is not cyclic. I think I need to use the contrapositive, but I am stuck.

Thanks

Laura

Yes, a good way to solve it is assuming for contradiction that G/Z(G) is cyclic, and reach a contradiction by showing that G is in this case abelian. So you have to show that every two elements of G commute. Now, if you have some element of G, how can you represent it using the factor group? -- Meni Rosenfeld (talk) 20:19, 3 July 2007 (UTC)
Just a quick nit here contrapositive and contradiction are not the same thing. Contradiction is that you assume that the statement is false and it leads to a contradiction. For example, proving that ${\displaystyle {\sqrt {2}}}$ is irrational by assuming that it is rational and showing that this leads to a logical contradiction. Proving the contrapositive, on the other hand, relies on the equivalence of ${\displaystyle A\to B}$ and ${\displaystyle \lnot B\to \lnot A}$. What Meni has described here is in fact the contrapositive and not a proof by contradiction. Donald Hosek 22:03, 3 July 2007 (UTC)
It's a matter of how you call things, really. You can say that I have proved the contrapositive of the original statement, that if G/Z(G) is cyclic then G is abelian, and you can say (which I did) that I assumed otherwise for the sake of contradiction, and reached the conclusion that G is both nonabelian (given) and abelian (which we show), which is the contradiction we have sought. -- Meni Rosenfeld (talk) 22:17, 3 July 2007 (UTC)
You can call a rock a loaf of bread, but that doesn't make it edible. Read Reductio ad absurdam and Contraposition. Proof by contradiction requires that you assume the negation of your proposition (which in this case is an implication, so we're actually talking about ${\displaystyle \lnot (A\to B)}$ and not merely ${\displaystyle \lnot B}$) and then get a conclusion which is both true and false. It's a common mistake. I've seen grad students make it. But it's still a mistake. Donald Hosek 23:00, 3 July 2007 (UTC)
You are correct; however, in my experience the distinction is blurry not only in the words of grad students, but of proffesors as well (in fact, I don't recall them ever using the term "contrapositive"; they would use "proof by contradiction" or variations thereof whenever they assume anything to be false, in any part of the proof). The reason I find for this is that the difference is trivial; I think we can agree that the heart of the proof here is the part where I take a group G such that G/Z(G) is cyclic and show that it is abelian. It then doesn't really matter how I embed it in the entire proof; A real proof by reductio ad absurdum would go something like: "We want to prove that for every group G, if G is nonabelian then G/Z(G) is not cyclic. Assume otherwise. Then there exists a group G such that (G nonabelian -> G/Z(G) not cyclic) is false, that is, G is nonabelian and G/Z(G) is cyclic. From this and from the lemma that (G/Z(G) cyclic -> G abelian), it follows that there exists a group of G such that G is abelian and nonabelian, QEA". None of the differences between this proof and my version of it are, in my opinion, deep enough to deserve much attention. -- Meni Rosenfeld (talk) 23:23, 3 July 2007 (UTC)
Just to add to the fun, it's worth mentioning that to prove a statement by contradiction and to prove its contrapositive by contradiction are logically identical. That is, to show A ==> B by contradiction means assuming A and ~B, and then arriving at a contradiction. Same for proving ~B ==> ~A. J Elliot 23:52, 3 July 2007 (UTC)
It's not just a matter of a couple syllables. The value of mathematics lies in the precision of its language. As for professors who mis-speak about contrapositive vs contradiction, I'd like to have a conversation with their PhD advisors some time. Donald Hosek 02:56, 4 July 2007 (UTC)
I also find the distinction between the by contrapositive and by contradiction methods uninformative. In reality, when we do a proof, the conditional we are proving is A AND [All known theorems] ==> B. This has contrapositive ~B ==> ~A OR [Some known theorem is false]. Proof by contradiction is ~B AND A ==> [some known theorem is false]. These are logically equivalent statements. And I've also had professors who called both contradiction proofs. nadav (talk) 04:56, 4 July 2007 (UTC)
While I'm sympathetic, I lean towards Donald's side. There are problems where considering the contrapositive and contradiction alternatives give you two different avenues of attack. It's worthwhile to maintain the distinction if for no other reason than to make sure both these avenues are considered. I think it's important that students understand the difference from a logical perspective. But, I have no interest in a fight on this matter and I'm willing to admit it's personal preference. J Elliot 05:25, 4 July 2007 (UTC)
The difference is quite obviously one of method. Care needs to be taken with proof. Your description of proof by contradiction is not quite correct. In Euclid's proof of the infinitude of primes, for example, we start by the premise that there exists a largest prime ${\displaystyle p}$, and then demonstrate that there exists some natural number ${\displaystyle q}$ such that ${\displaystyle q}$ has a prime divisor greater than ${\displaystyle p}$. Logical equivalency of the boolean statements does not mean that the methods are the same. By that reasoning, there's no point in making the difference between proving the contrapositive and making a direct proof, a claim that I would hope you won't care to assert. Your professors are not always right. I'm not always right. But in this case, I am. That's my last word on this. Donald Hosek 05:33, 4 July 2007 (UTC)
Do you promise? Then, if you're not too tired out from that, if you've finished wallowing in the afterglow of a deliciously pointless and overblown argument, perhaps you could contribute an answer the poster's question. I assume you do know the answer, as familiar as you are with the ins and outs of Math in all its bewitching subtlety. They did, after all, say that they already knew what format the proof should be in, and asked for help taking it from there. Black Carrot 06:30, 4 July 2007 (UTC)
Whoa.
Black Carrot: You already know that we do not hand out homework answers here (which this question very well may be). I have already provided enough of a hint for the solution for now.
Donald: No need to get angry. No one is trying to say you are wrong, or that our professors are right. We are merely describing a phenomenon that happens in practice, and explaining why, in our opinion, it is so common. -- Meni Rosenfeld (talk) 10:52, 4 July 2007 (UTC)