Weissman score
The Weissman score is an efficiency metric for lossless compression applications, which was developed for fictional use. It compares both required time and compression ratio of measured application, with those of a de facto standard according to the data type. It was developed by Tsachy Weissman, Stanford Professor, and Vinith Misra, a graduate student, at the request of producers for HBO's television series Silicon Valley, about a fictional tech start-up. [1] [2] [3] [4]
The formula is the following; where r
is the compression ratio, T
is the time required to compress, the overlined ones are the same metrics for a standard compressor, and alpha is a scaling constant.[1]
Weissman score was used in Dropbox Tech Blog to explain real-world work on lossless compression. [1]
Example
This example shows the score for the data of Hutter Prize,[5] using the paq8f as a standard and 1 as the scaling constant.
Application | Compression ratio | Compression time [min] | Weissman score |
---|---|---|---|
paq8f | 5.467600 | 300 | 1.000000 |
raq8g | 5.514990 | 420 | 0.720477 |
paq8hkcc | 5.682593 | 300 | 1.039321 |
paq8hp1 | 5.692566 | 300 | 1.041145 |
paq8hp2 | 5.750279 | 300 | 1.051701 |
paq8hp3 | 5.800033 | 300 | 1.060801 |
paq8hp4 | 5.868829 | 300 | 1.073383 |
paq8hp5 | 5.917719 | 300 | 1.082325 |
paq8hp6 | 5.976643 | 300 | 1.093102 |
paq8hp12 | 6.104276 | 540 | 0.620247 |
decomp8 | 6.261574 | 540 | 0.63623 |
decomp8 | 6.276295 | 540 | 0.637726 |
Limitations
Although the value is relative to the standards against which it is compared, the unit used to measure the times changes the score (see examples 1 and 2). And the times also can't have a numeric value of 1 or less, because the logarithm of 1 is 0 (examples 3 and 4), and the logarithm of any value less than 1 is negative (examples 5 and 6); that would result in scores of value 0 (even with changes), undefined, or negative (even if better than positive).
Examples
# | standard compressor | scored compressor | Weissman score | Observations | ||||
---|---|---|---|---|---|---|---|---|
compression ratio | compression time | log(compression time) | compression ratio | compression time | log(compression time) | |||
1 | 2.1 | 2 min | 0.30103 | 3.4 | 3 min | 0.477121 | 1*(3.4/2.1)*(0.30103/0.477121)=1.021506 | Change in unit or scale, changes the result. |
2 | 2.1 | 120 s | 2.079181 | 3.4 | 180 s | 2.255273 | 1*(3.4/2.1)*(2.079181/2.255273)=1.492632 | |
3 | 2.2 | 1 min | 0 | 3.3 | 1.5 min | 0.176091 | 1*(3.3/2.2)*(0/0.176091)=0 | If time is 1, its log is 0; then the score can be 0 or undefined. |
4 | 2.2 | 0.667 min | -0.176091 | 3.3 | 1 min | 0 | 1*(3.3/2.2)*(-0.176091/0)=undefined | |
5 | 1.6 | 0.5 h | -0.30103 | 2.9 | 1.1 h | 0.041393 | 1*(2.9/1.6)*(-0.30103/0.041393)=-13.18138 | If time is less than 1, its log is negative; then the score can be negative. |
6 | 1.6 | 1.1 h | 0.041393 | 1.6 | 0.9 h | -0.045757 | 1*(1.6/1.6)*(0.041393/-0.045757)=-0.904627 |
See also
References
- ^ a b Perry, Tekla (July 28, 2014). "A Fictional Compression Metric Moves Into the Real World". Retrieved January 25, 2016.
- ^ Perry, Tekla (July 25, 2014). "A Made-For-TV Compression Algorithm". Retrieved January 25, 2016.
- ^ Sandberg, Elise (April 12, 2014). "HBO's 'Silicon Valley' Tech Advisor on Realism, Possible Elon Musk Cameo". The Hollywood Reporter. Retrieved June 10, 2014.
- ^ Jurgensen, John; Rusli, Evelyn M. (April 3, 2014). "There's a New Geek in Town: HBO's 'Silicon Valley'". The Wall Street Journal. Retrieved June 10, 2014.
- ^ Hutter, Marcus (July 2016). "Contestants". Retrieved January 25, 2016.