Weissman score

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search

The Weissman score is an efficiency metric for lossless compression applications, which was developed for fictional use. It compares both required time and compression ratio of measured applications, with those of a de facto standard according to the data type. It was developed by Tsachy Weissman, a professor at Stanford, and Vinith Misra, a graduate student, at the request of producers for HBO's television series Silicon Valley, about a fictional tech start-up.[1][2][3][4]

The formula is the following; where r is the compression ratio, T is the time required to compress, the overlined ones are the same metrics for a standard compressor, and alpha is a scaling constant.[1]

Weissman score was used in Dropbox Tech Blog to explain real-world work on lossless compression.[5]

Example[edit]

This example shows the score for the data of Hutter Prize,[6] using the paq8f as a standard and 1 as the scaling constant.

Application Compression ratio Compression time [min] Weissman score
paq8f 5.467600 300 1.000000
raq8g 5.514990 420 0.720477
paq8hkcc 5.682593 300 1.039321
paq8hp1 5.692566 300 1.041145
paq8hp2 5.750279 300 1.051701
paq8hp3 5.800033 300 1.060801
paq8hp4 5.868829 300 1.073383
paq8hp5 5.917719 300 1.082325
paq8hp6 5.976643 300 1.093102
paq8hp12 6.104276 540 0.620247
decomp8 6.261574 540 0.63623
decomp8 6.276295 540 0.637726

Limitations[edit]

Although the value is relative to the standards against which it is compared, the unit used to measure the times changes the score (see examples 1 and 2), and the multiplier also can't have a numeric value of 1 or less, because the logarithm of 1 is 0 (examples 3 and 4), and the logarithm of any value less than 1 is negative (examples 5 and 6); that would result in scores of value 0 (even with changes), undefined, or negative (even if better than positive).

Examples[edit]

# standard compressor scored compressor Weissman score Observations
compression ratio compression time log(compression time) compression ratio compression time log(compression time)
1 2.1 2 min 0.30103 3.4 3 min 0.477121 1*(3.4/2.1)*(0.30103/0.477121)=1.021506 Change in unit or scale, changes the result.
2 2.1 120 s 2.079181 3.4 180 s 2.255273 1*(3.4/2.1)*(2.079181/2.255273)=1.492632
3 2.2 1 min 0 3.3 1.5 min 0.176091 1*(3.3/2.2)*(0/0.176091)=0 If time is 1, its log is 0; then the score can be 0 or infinity.
4 2.2 0.667 min -0.176091 3.3 1 min 0 1*(3.3/2.2)*(-0.176091/0)=infinity
5 1.6 0.5 h -0.30103 2.9 1.1 h 0.041393 1*(2.9/1.6)*(-0.30103/0.041393)=-13.18138 If time is less than 1, its log is negative; then the score can be negative.
6 1.6 1.1 h 0.041393 1.6 0.9 h -0.045757 1*(1.6/1.6)*(0.041393/-0.045757)=-0.904627

See also[edit]

References[edit]

  1. ^ a b Perry, Tekla (July 28, 2014). "A Fictional Compression Metric Moves Into the Real World". Retrieved January 25, 2016. 
  2. ^ Perry, Tekla (July 25, 2014). "A Made-For-TV Compression Algorithm". Retrieved January 25, 2016. 
  3. ^ Sandberg, Elise (April 12, 2014). "HBO's 'Silicon Valley' Tech Advisor on Realism, Possible Elon Musk Cameo". The Hollywood Reporter. Retrieved June 10, 2014. 
  4. ^ Jurgensen, John; Rusli, Evelyn M. (April 3, 2014). "There's a New Geek in Town: HBO's 'Silicon Valley'". The Wall Street Journal. Retrieved June 10, 2014. 
  5. ^ "Lossless compression with Brotli in Rust for a bit of Pied Piper on the backend". Dropbox Tech Blog. Retrieved 2017-06-24. 
  6. ^ Hutter, Marcus (July 2016). "Contestants". Retrieved January 25, 2016.