Mathematics desk

Welcome to the Wikipedia Mathematics Reference Desk Archives

June 15

Representing a big data set by a small data set having the same cumulants?

Consider a data set X = (X₁, X₂, . . . , X_I) having mean value μ and standard deviation σ.

The one element data set (μ) has the same mean value as the big data set X.

The two element data set (μ-σ, μ+σ) has the same mean value and the same standard deviation as the big data set X.
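As a quick check of the two-element case, here is a minimal Python sketch. It assumes σ is the population standard deviation (dividing by the number of elements, not by one less); the function name `two_point_summary` and the sample data, borrowed from the J examples later in this thread, are only illustrative.

```python
import statistics


def two_point_summary(data):
    """Return (mu - sigma, mu + sigma) for the given data set.

    Assumption: sigma is the population standard deviation (divide by n).
    """
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return (mu - sigma, mu + sigma)


data = [1, 2, 2, 2, 3]
small = two_point_summary(data)

# Both data sets should report the same mean and population SD.
print(statistics.fmean(data), statistics.pstdev(data))
print(statistics.fmean(small), statistics.pstdev(small))
```

With the sample standard deviation (dividing by n-1) the two-element set would no longer reproduce σ exactly, so the choice of convention matters here.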

I want to generalize this.

What is the three-element data set (A, B, C) having the same mean value and the same standard deviation and the same skewness as X?

What is the four-element data set (A, B, C, D) having the same mean value and the same standard deviation and the same skewness and the same kurtosis as X?

and so on.

Bo Jacoby (talk) 22:05, 15 June 2015 (UTC).

I think the way to proceed is this: first restate the problem in terms of moments. So you want A, B, C, ... so that A+B+C, A²+B²+C², A³+B³+C³, ... have given values. These are power sums, and you can use Newton's identities to convert them into elementary symmetric polynomials. Using these as coefficients, write down a polynomial. The roots of this polynomial are then the values A, B, C, ... that you want. For example, for two elements you want A, B so that P₁=A+B=(2/n)Σ_i X_i and P₂=A²+B²=(2/n)Σ_i X_i², where n is the number of elements of X. Then let E₁=P₁ and E₂=(E₁P₁-P₂)/2. The values A, B are now the roots of x²-E₁x+E₂=0. In general you have to solve an equation with degree equal to the size of the small set.

I think this is analogous to Chebyshev quadrature but for arbitrary moments. (This is like Gaussian quadrature but with equal weights. Not sure if we cover this, but see [1].) With Chebyshev quadrature you start to get complex roots for large n, so the same thing will probably happen here as well. I'm more (but still not very) familiar with Gaussian-type quadrature because it applies the theory of orthogonal polynomials; in that case you're guaranteed to get real roots and you get more moments with the same number of data points, you just have to allow arbitrary weights. Not sure if there is a similar theory for Chebyshev. (There are Chebyshev polynomials but those are different afaik.)--RDBury (talk) 06:36, 16 June 2015 (UTC)
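The recipe above (power sums, then Newton's identities, then polynomial roots) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it relies on NumPy, the name `simplify` and its argument order are chosen here for convenience, and the target power sums are taken as P_k = (m/n)Σ_i X_i^k for a small set of m elements.

```python
import numpy as np


def simplify(m, data):
    """Return m values whose first m power sums match the scaled power sums of data.

    The result may contain complex numbers.
    """
    data = np.asarray(data, dtype=float)
    # Target power sums P_k = (m/n) * sum_i X_i**k for k = 1..m.
    p = [m * np.mean(data ** k) for k in range(1, m + 1)]
    # Newton's identities: k*e_k = sum_{i=1..k} (-1)**(i-1) * e_{k-i} * P_i, with e_0 = 1.
    e = [1.0]
    for k in range(1, m + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1))
        e.append(s / k)
    # The wanted values are the roots of x^m - e_1 x^(m-1) + e_2 x^(m-2) - ...
    coeffs = [(-1) ** k * e[k] for k in range(m + 1)]
    return np.sort_complex(np.roots(coeffs))


print(simplify(2, [1, 2, 2, 2, 3]))
print(simplify(3, [1, 2, 2, 2, 3]))
```

For the data 1 2 2 2 3 and m = 2 this gives E₁ = 4 and E₂ = 3.6, so the roots are 2 ± √0.4, roughly 1.36754 and 2.63246. For larger m nothing guarantees real roots, so the sketch returns a complex array throughout.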

You can also consider the polynomial

$$p(x)=\prod_{j}(1-xX_{j})$$

where the $X_{j}$ are the unknown data elements of the small data set (denoted by A, B, C, etc. by Bo above). The series expansion of the logarithm is then given by:

$$\log\left[p(x)\right]=-\sum_{k=1}^{\infty}\frac{M_{k}}{k}x^{k}$$

where $M_{k}=\sum_{j}X_{j}^{k}$ are the moments that are known. So, you can directly write down the logarithm of the polynomial using the known moments; exponentiation is easy using most computer algebra systems (I'm sure Bo can write a compact J program for this :) ) and then the $X_{j}$ can be extracted from the zeros (and I think there is a simple J routine for that too.) So, I wouldn't be surprised if Bo can come up with a one line J program that will do the job. Count Iblis (talk) 15:17, 16 June 2015 (UTC)
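Without a computer algebra system, the exponentiation step can be done with a short recurrence. The Python sketch below is one possible reading of the idea, not necessarily what was intended: it uses NumPy, the function name `small_set_via_log_series` is made up for illustration, and it takes the $M_k$ to be the power sums of the small set, already rescaled from the big data set. Differentiating $\log p(x)$ gives $p'(x) = (\log p)'(x)\,p(x)$, which yields $n c_n = -\sum_{k=1}^{n} M_k c_{n-k}$ for the coefficients $c_n$ of $p(x)$. Since $p$ has degree m, only the first m terms of the series are needed.

```python
import numpy as np


def small_set_via_log_series(moments):
    """moments[k-1] = M_k = sum_j X_j**k for k = 1..m; returns the m values X_j."""
    m = len(moments)
    c = [1.0]  # c[n] is the coefficient of x**n in p(x); p(0) = 1
    for n in range(1, m + 1):
        # From p'(x) = (log p)'(x) * p(x):  n*c[n] = -sum_{k=1..n} M_k * c[n-k]
        c.append(-sum(moments[k - 1] * c[n - k] for k in range(1, n + 1)) / n)
    # p(x) = prod_j (1 - x*X_j) vanishes at x = 1/X_j, so the reversed polynomial
    # x**m * p(1/x) = prod_j (x - X_j) has the X_j themselves as roots.
    # np.roots expects the highest-degree coefficient first, which is c[0] here.
    return np.roots(c)


# Two-element example for the data 1 2 2 2 3: M_1 = 2*2 = 4, M_2 = 2*4.4 = 8.8.
print(np.sort(small_set_via_log_series([4.0, 8.8]).real))
```

The recurrence is Newton's identities in another guise, so this should agree with the elementary-symmetric-polynomial route; for M₁ = 4 and M₂ = 8.8 it leads to x² - 4x + 3.6 = 0 again.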

Thanks gentlemen! I think I am on track now. Bo Jacoby (talk) 07:04, 17 June 2015 (UTC).

This is a one line J program implementing RDBury's method for two elements. (Oops: the double apostrophes around p q are changed to italics by the WP editor!)

   simplify=. 3 : '|.>{:p.(-:q-*:p),p,_1[''p q''=.2*}.(%{.)+/y^/i.3'
   simplify 1 2 2 2 3
1.36754 2.63246
   simplify simplify 1 2 2 2 3
1.36754 2.63246

Bo Jacoby (talk) 10:38, 17 June 2015 (UTC).

I took the liberty of fixing your apostrophe issue. -- Meni Rosenfeld (talk) 23:00, 17 June 2015 (UTC)

Thank you Meni! Bo Jacoby (talk) 04:33, 18 June 2015 (UTC).

This 6-liner implements RDBury's method for three elements, four elements, and so on. As predicted, the roots are sometimes complex.

simplify=. 4 : 0
y=.x*}.(%{.)+/y^/i.>:x
x=.1
for.y do.x=.((-/x*(#x){.y)%#x),x end.
-|.>{:p.x
)

Examples:

  1 simplify 1 2 2 2 3
2
  2 simplify 1 2 2 2 3
1.36754 2.63246
  3 simplify 1 2 2 2 3
1.2254 2 2.7746
  4 simplify 1 2 2 2 3
1.05666 2j0.29983 2j_0.29983 2.94334
  5 simplify 1 2 2 2 3
1 2 2 2 3
  10 simplify 1 2 2 2 3
1 1 2 2 2 2 2 2 3 3
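The four-element result contains a complex-conjugate pair, so it is worth checking that the moments still come out real and equal to those of the original data. Here is a small Python sketch of that check, assuming NumPy; the helper name `moment_gap` is made up here, and the small set is typed in from the rounded output above, so agreement is only expected to about three decimal places.

```python
import numpy as np

data = np.array([1, 2, 2, 2, 3], dtype=float)
# Rounded values copied from the output of "4 simplify 1 2 2 2 3".
small = np.array([1.05666, 2 + 0.29983j, 2 - 0.29983j, 2.94334])


def moment_gap(k):
    """Absolute difference between the k-th raw moments of the two data sets."""
    return abs(np.mean(small ** k) - np.mean(data ** k))


for k in range(1, 5):
    # The imaginary parts cancel within each conjugate pair, so the mean is real.
    print(k, np.mean(data ** k), np.mean(small ** k), moment_gap(k))
```

The raw moments of 1 2 2 2 3 are 2, 4.4, 10.4 and 26 for k = 1 to 4. The complex "data set" reproduces them up to the rounding of the printed roots, although it can no longer be read as a set of observations.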

Thank you everybody. The problem is solved. -- Bo Jacoby (talk) 20:37, 18 June 2015 (UTC).