Jump to content

Weighted median

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Sytelus (talk | contribs) at 06:07, 2 September 2015 (Added properties, example and algorithm pseudocode.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The top chart shows a list of elements with values indicated by height and the median element shown in red. The lower chart shows the same elements with weights as indicated by the width of the boxes. The weighted median is shown in red and is different than the ordinary median.

In statistics, a weighted median of a sample is the 50% weighted percentile.[1][2][3] It was first proposed by F. Y. Edgeworth in 1888.[4][5] Like the median, it is useful as an estimator of central tendency, robust against outliers. It allows for non-uniform statistical weights related to, e.g., varying precision measurements in the sample.

Properties

Weighted median of a set partitions that set in two halves such that sum of weights in each partition is as equal as possible.

If weights of all numbers in the set are equal then median is same as weighted median.

When set has even number of elements, we will have lower weighted median and upper weighted median instead of a single element as median.

Example

Consider the set of numbers with each number having weights respectively. The median is 0.1 and the weighted median is 0.2.

Algorithm

Weighted median can be computed by sorting the set of numbers and finding the smallest numbers which sums to half the weight of total weight. This algorithm takes time. There is more optimal approach to find weighted median as using modified selection algorithm.[1]

//Main call is WeightedMedian(a, 1, n)
//Returns lower median
WeightedMedian(a[1..n], p, r)
  //base case for single element
  if r = p
    return a[p]
  //base case for two elements
  //make sure we return lower median
  if r-p = 1
    if a[p].w >= a[r].w
      return a[p]
    else return a[r]
  //partition around pivot r
  q = partition(a, p, r)
  wl, wg = sum weights of partitions (p, q-1), (q+1, r)
  //if partitions are balanced then we are done
  if wl and wg both < 1/2
    return a[q]
  else 
    //increase pivot weight by the amount of partition we eliminate
    if wl > wg
      a[q].w += wg
      //recurse on pivot inclusively 
      WeightedMedian(a, p, q)
    else
      a[q].w += wl
      WeightedMedian(a, q, r)

As the recursive call in inclusive of median, recurrence relation is,

Which yields runtime.

See also

References

  1. ^ a b Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Introduction to Algorithms". ISBN 9780262032933. {{cite journal}}: Cite journal requires |journal= (help)
  2. ^ Horowitz, Ellis; Sahni, Sartaj; Rajasekaran, Sanguthevar (1996-12-15). "Computer Algorithms C++: C++ and Pseudocode Versions". ISBN 9780716783152. {{cite journal}}: Cite journal requires |journal= (help)
  3. ^ Bovik, Alan C (2010-07-21). "Handbook of Image and Video Processing". ISBN 9780080533612. {{cite journal}}: Cite journal requires |journal= (help)
  4. ^ Edgeworth, F. Y. (1888). "On a New Method of Reducing Observations Relating to Several Quantities". Philosophical Magazine. 25 (154): 184. doi:10.1080/14786448808628170.
  5. ^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi: 10.2307/23036355 , please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi= 10.2307/23036355 instead.