Codec listening test

From Wikipedia, the free encyclopedia

A codec listening test is a scientific study designed to compare two or more lossy audio codecs, usually with respect to perceived fidelity or compression efficiency.

Most tests take the form of a double-blind comparison. Commonly used methods include ABX, ABC/HR, and MUSHRA. Various software packages allow individuals to run this type of test themselves with minimal assistance.

Testing methods

ABX test

In an ABX test, the listener has to identify an unknown sample X as being A or B, with A (usually the original) and B (usually the encoded version) available for reference. For the outcome to be meaningful, it must be statistically significant. Because the listener does not know which sample X is, expectations cannot bias the answers, and repeating the trial a predetermined number of times makes it unlikely that a good score is the result of chance. If the listener cannot identify X reliably, that is, with a low p-value over those trials, then the null hypothesis cannot be rejected and the test provides no evidence of a perceptible difference between samples A and B. This usually indicates that the encoded version is transparent to that listener.
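
As an illustration of the significance requirement, the p-value of an ABX session can be computed from the binomial distribution: under the null hypothesis the listener is guessing, so each trial is correct with probability 1/2. The following is a minimal sketch; the function names and the 0.05 threshold are assumptions chosen for the example, not part of any particular ABX tool.

```python
from math import comb
from typing import Optional


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: probability of getting at least
    `correct` answers right in `trials` trials by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def min_correct_for_significance(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest number of correct answers whose p-value is <= alpha,
    or None if even a perfect score is not significant."""
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) <= alpha:
            return correct
    return None


if __name__ == "__main__":
    # 12 correct answers out of 16 trials
    print(round(abx_p_value(12, 16), 4))
    print(min_correct_for_significance(16))
```

With too few trials, even a perfect score may not reach the chosen threshold, which is one reason the number of trials is fixed in advance.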

ABC/HR test

In an ABC/HR test, C is the original, which is always available for reference. A and B are the original and the encoded version in randomized order. The listener must first distinguish the encoded version from the original (the hidden reference that the "HR" in ABC/HR stands for) before assigning a score as a subjective judgment of quality. Different encoded versions can then be compared against each other using these scores.
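
The trial structure described above can be sketched as follows. This is only an illustrative model: the `AbcHrTrial` class, the function names, and the choice to discard a rating when the listener picks the hidden reference are assumptions made for the example, not the behaviour of any specific ABC/HR program.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AbcHrTrial:
    a: str  # what slot A really contains: "reference" or "encoded"
    b: str  # what slot B really contains


def make_trial(rng: random.Random) -> AbcHrTrial:
    """Place the hidden reference and the encoded version in A and B
    in random order. C (the labelled original) is implicit."""
    contents = ["reference", "encoded"]
    rng.shuffle(contents)
    return AbcHrTrial(a=contents[0], b=contents[1])


def accept_rating(trial: AbcHrTrial, picked_as_encoded: str,
                  rating: float) -> Optional[float]:
    """Return the rating only if the listener picked the slot that
    really holds the encoded version; otherwise return None
    (hypothetical handling of a misidentified trial)."""
    if picked_as_encoded not in ("A", "B"):
        raise ValueError("picked_as_encoded must be 'A' or 'B'")
    actual = trial.a if picked_as_encoded == "A" else trial.b
    return rating if actual == "encoded" else None


if __name__ == "__main__":
    trial = make_trial(random.Random())
    print(accept_rating(trial, "A", 3.5))
```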

MUSHRA

In MUSHRA (MUltiple Stimuli with Hidden Reference and Anchor), the listener is presented with the reference (labeled as such), a certain number of test samples, a hidden version of the reference, and one or more anchors. The purpose of the anchors is to bring the scale closer to an "absolute scale", so that minor artifacts are not rated as very bad quality.
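
A rough sketch of how the stimuli for one MUSHRA trial might be assembled is shown below. The function name and the anonymous "stimulus N" labels are hypothetical; the only properties taken from the description above are that the labeled reference is shown separately and that the hidden reference and anchors are mixed in among the test samples.

```python
import random
from typing import Dict, List


def build_mushra_trial(codecs: List[str], anchors: List[str],
                       rng: random.Random) -> Dict[str, str]:
    """Return a mapping from anonymous labels to the real identity of
    each stimulus. The labelled reference is presented separately and
    is therefore not part of this mapping."""
    hidden = ["hidden reference", *anchors, *codecs]
    rng.shuffle(hidden)  # the listener must not be able to infer identity from order
    return {f"stimulus {i}": name for i, name in enumerate(hidden, start=1)}


if __name__ == "__main__":
    trial = build_mushra_trial(["codec X", "codec Y"], ["low anchor"],
                               random.Random())
    for label, identity in trial.items():
        print(label, "->", identity)
```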

Results

Many double-blind music listening tests have been carried out. The following table lists the results of several listening tests that have been published online. To obtain meaningful results, listening tests must compare codecs' performance at similar or identical bitrates, since the audio quality produced by any lossy encoder will be trivially improved by increasing the bitrate. If listeners cannot consistently distinguish a lossy encoder's output from the uncompressed original audio, then it may be concluded that the codec has achieved transparency.
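
Because variable-bitrate encoders do not hit a nominal bitrate exactly, bitrates are often checked after encoding. The sketch below estimates the average bitrate of an encoded file from its size and duration and checks whether a set of encodes is close enough to be compared. The function names and the 10% tolerance are arbitrary assumptions for illustration, and the size-based estimate includes any container overhead.

```python
from typing import Iterable


def average_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate in kbit/s (1 kbit = 1000 bits), estimated from
    file size and playing time."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return size_bytes * 8 / duration_s / 1000


def bitrates_comparable(bitrates_kbps: Iterable[float],
                        tolerance: float = 0.10) -> bool:
    """True if the highest bitrate exceeds the lowest by no more than
    `tolerance` (expressed as a fraction of the lowest)."""
    rates = list(bitrates_kbps)
    if not rates:
        raise ValueError("at least one bitrate is required")
    return max(rates) <= min(rates) * (1 + tolerance)


if __name__ == "__main__":
    # A 480,000-byte file that plays for 30 seconds
    print(average_bitrate_kbps(480_000, 30.0))
    print(bitrates_comparable([128, 134, 130]))
    print(bitrates_comparable([64, 128]))
```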

Popular formats compared in these tests include MP3, AAC (and extensions), Vorbis, Musepack, and WMA. The RealAudio Gecko, ATRAC3, QDesign, and mp3PRO formats appear in some tests, despite much lower adoption as of 2007. Many encoder and decoder implementations (both proprietary and open source) exist for some formats, such as MP3, which is the oldest and best-known format still in widespread use today.

Source Dates Formats Bitrate (kbit/s) Codecs Musical genres Samples Listeners Best Result Comments
ff123 2001 multiple ~128
  • MP3: Lame 3.89beta --abr 134 -h --nspsytune --athtype 2 --lowpass 16 --ns-bass -8
  • MP3: Xing within AudioCatalyst 2.1 128 kbit/s, high frequency mode disabled, simple stereo disabled
  • AAC: Liquifier Pro 5.0.0 Beta 2, Build 24 streaming 128, equalization disabled, dynamics disabled, dual mono encoding disabled, audio bandwidth overridden by the program, set at 17995 Hz
  • MPC: mppenc.exe version 1.7.9c -radio -ltq_gain 10 -tmn 12 -nmt 4.8
  • WMAv8: Windows Media Player 7.1 (version 7.01.00.3055); Wmadmoe.dll version 8.0.0.0371 128 kbit/s
  • Ogg Vorbis: Oggdrop RC2 for Windows 32 128 kbit/s
1 16 Musepack and AAC
ff123 2001 October - 2002 January multiple ~128
  • MP3: Lame 3.89beta --abr 134 -h --nspsytune --athtype 2 --ns-bass -8
  • MP3: Xing within AudioCatalyst 2.1 128 kbit/s, high frequency mode disabled, simple stereo disabled
  • AAC: Liquifier Pro 5.0.0 Beta 2, Build 24 streaming 128, equalization disabled, dynamics disabled, dual mono encoding disabled, audio bandwidth overridden by the program, set at 17995 Hz
  • MPC: mppenc.exe version 1.7.9c -radio -ltq_gain 10 -tmn 12 -nmt 4.8
  • WMAv8: Windows Media Player 7.1 (version 7.01.00.3055); Wmadmoe.dll version 8.0.0.0371 128 kbit/s
  • Vorbis: Oggdrop pre-RC3 for Windows 32; from CVS (10/26/01) 128 kbit/s
Various 3 25-28 Musepack or Vorbis
ff123 2002 July multiple ~64
  • Ogg Vorbis 1.0 -b 64 --managed
  • Ogg Vorbis 1.0 -q 0
  • MMJB 7.2 mp3PRO 64
  • WMA8 at 64 kbit/s (using WMP 7.1 to encode)
  • QuickTime 6.0 MPEG-4 AAC Low complexity at 64 kbit/s
Various 12 24-41 mp3PRO Both Vorbis variants were a close second.
Roberto Amorim 2003 June AAC 128 CBR
  • Psytel AAC-enc 2.15 -br 128
  • Ahead/Nero 5.5.10.35 128 kbit/s CBR, high quality
  • Sorenson Squeeze 3.5 (FhG Pro) 128 kbit/s
  • Apple QuickTime 6.3 (Apple/Dolby) 128 kbit/s high quality
  • FAAC 1.17b -a 64 (64 kbit/s/channel, ABR)
Various 10 11-18 QuickTime
Roberto Amorim 2003 July multiple ~128
  • Apple QuickTime 6.3 MP4 encoder 128 kbit/s high quality
  • LAME MP3 Encoder 3.90.3 --alt-preset 128
  • Musepack 1.14 --quality 4 --xlevel
  • Ogg Vorbis post-1.0 CVS -q 4.25
  • Windows Media Audio v9 PRO bitrate-managed 2-pass VBR 128 kbit/s
Various 12 14-24 Musepack AAC, WMA, and Vorbis tied for close second
Roberto Amorim 2003 September multiple ~64
  • Ahead/Nero 6.0.0.15 HE AAC VBR profile Streaming :: Medium, high quality
  • Ogg Vorbis post-1.0 CVS -q 0
  • mp3PRO (from Adobe Audition 1.0) VBR quality 40, Current Codec, allow M/S and IS, allow narrowing, no CRC
  • Real Audio Gecko (from Real Producer 9.0.1) 64 kbit/s
  • Windows Media Audio v9 VBR quality 50
  • QuickTime 6.3 AAC LC 64 kbit/s, Best Quality
Various 12 30-43 Nero HE-AAC
This test showed that listeners preferred 128 kbit/s MP3 audio encoded by LAME to all the tested codecs at 64 kbit/s, with greater than 99% confidence:

"No codec delivers the marketing plot [sic] of same quality as MP3 at half the bitrates."

Roberto Amorim 2004 January MP3 ~128
  • LAME encoder 3.95 --preset 128
  • FhG MP3 encoder from Adobe Audition 1.0 VBR quality 40, "Current - Best" codec.
  • Apple iTunes 4.2 MP3 112 kbit/s VBR, Highest quality, joint stereo, smart encoding
  • GOGO-no-coda 3.12 -b 128 -a -q 0
  • Audioactive Encoder 2.04 128 kbit/s High Quality
  • Xing MP3 Encoder 1.5 VBR quality normal
Various 12 11-22 LAME The author noted that the results may have been affected by the use of an outdated version of the Xing encoder and non-optimal settings for iTunes.
Roberto Amorim 2004 February AAC ~128
  • Ahead/Nero AAC-enc v 2.6.2.0 -internet profile, high quality, LC
  • Apple iTunes 4.2 (Apple/QuickTime) 128 kbit/s
  • Compaact! 1.2beta3 (zPlane/HHI) VBR 5, high quality, LC
  • FAAC 1.23.5 -q 115
  • Real Producer 10 beta (CodingTechnologies) 128 kbit/s
Various 12 19-29 iTunes Open-source FAAC codec improved greatly since previous test
Roberto Amorim 2004 May multiple ~128
  • LAME encoder 3.96 -V5 --athaa-sensitivity 1
  • Apple iTunes 4.2 128 kbit/s AAC
  • Ogg Vorbis aoTuV tuning b2 -q 4.35
  • Musepack 1.14b --quality 4.15 --xlevel
  • Sony Atrac3 132 kbit/s
  • Microsoft WMA9 Std Bitrate VBR 128 kbit/s
Various 18 12-27 aoTuV (Vorbis) and Musepack
Roberto Amorim 2004 June multiple 32 CBR
  • LAME encoder 3.96 -b 32
  • Nero Ahead HE AAC+PS 32 kbit/s CBR High Quality
  • Ogg Vorbis post-1.0.1CVS --managed -b 32 resampled with SSRC
  • Real Audio 32 kbit/s stereo music codec in Helix Producer 10
  • QDesign Music Codec 2 Pro 32 kbit/s at 32 kHz, Quality mode
  • Microsoft WMA9 Std 32 kbit/s at 32 kHz
  • mp3PRO 32 kbit/s at 32 kHz, in Adobe Audition 1.5
Various 18 47-77 Nero HE-AAC
HydrogenAudio user "guruboolez" 2004 July multiple ~175
  • MPC: musepack -standard
  • MP3: LAME 3.97 alpha -V 3; -V 2
  • Ogg Vorbis: megamix -q 6,00; -q 6,99; -q 5,50
Classical 18 1 Musepack
HydrogenAudio user "guruboolez" 2005 August multiple ~180
  • AAC: Faac 1.24.1. Release date: end 2004 (?). Setting: -q175
  • AAC: Nero Digital aacenc32 v.3.2.0.15. Release date: June 2005. Setting: -streaming (high/default encoder).
  • MP3: LAME 3.97 alpha 11. Release date: July 2005. Setting: -V2 --vbr-new
  • MPC: mppenc 1.15v. Release date: March 2005. Setting: --quality 5
  • Ogg Vorbis: aoTuV beta 4 based on 1.1.1. Release date: July 2005. Setting: -q6,00
Classical 18 1 aoTuV (Vorbis) The author reflects on substantial improvements in Vorbis encoding since his previous test (above):

"Vorbis is now –thanks to Aoyumi [creator of aoTuV]– an excellent audio format for 180 kbit/s encodings (and classical music)."

gURuBoOleZZ (in French) 2005 August multiple ~96
  • AAC-LC: iTunes 4.9 / QuickTime 7.02 CBR 96
  • MP3: LAME 3.97 alpha 11 --abr 99
  • MPC: mppenc 1.14 --xlevel --quality 3 (or --thumb)
  • Ogg Vorbis: aoTuV / LANCER beta 4 based on SVN 1.1.1 -q2,00
  • WMA Standard: WMA 9.1 CBR 96
Classical, various 150 classical, 35 various 1 aoTuV and AAC tied (classical), aoTuV (various) The author selected each participating encoder by pitting multiple encoders against one another in an initial "Darwinian phase." For example, LAME was chosen as the representative MP3 encoder because it clearly outperformed four other MP3 encoders on a subset of the full sample corpus.
Sebastian Mares 2005 December multiple ~140 (nominal 128)
  • Nero AAC 3.1.0.2 VBR/Stereo - Streaming, 100-120 kbit/s [LC AAC]
  • iTunes AAC 6.0.1.3 128 kbit/s, VBR
  • LAME 3.97 Beta 2 -V5 --vbr-new
  • Ogg Vorbis AoTuV 4.51 Beta -q 4.25
  • WMA Professional 9.1 Quality-Based VBR, Q50
  • Shine 0.1.4 (Low Anchor) -b 128
Various 18 18-30 4-way tie (all except Shine) "I think this test shows that with the current encoders, the quality at 128 kbit/s is very good... It's time to move to bitrates like 96 kbit/s or even lower (64 kbit/s)."
Mp3-tech.org 2006 March AAC 48
  • 3gpp 6.3.0 48 kbit/s CBR
  • Coding Technologies - Winamp 5.2 beta 393 48 kbit/s CBR HE-AAC
  • Coding Technologies - Winamp 5.2 beta 393 48 kbit/s CBR HEv2-AAC
  • Nero Digital 4.9.9.95 48 kbit/s ABR HE-AAC
  • Nero Digital 4.9.9.96 48 kbit/s ABR HEv2-AAC
  • iTunes 6.0.2 (Low Anchor) 48 kbit/s CBR
  • LAME 3.97b2 (High Anchor) -V5
Various 18 10-20 5-way tie (all except anchors) "... it seems that overall, plain HE-AAC might be better than HE-AAC v2 at this bitrate, but a lot more samples would be needed to be able to draw definitive conclusions regarding this."
Sebastian Mares 2006 November multiple ~48
  • Ogg Vorbis AoTuV 5 Beta -q -1
  • WMA Professional 10 1-pass CBR, 48 kbit/s
  • Nero HE-AAC May 26, 2006 -q 0.2
  • WMA Standard 9.2 Quality-Based VBR, Q10
  • iTunes AAC 7.0.2.16 48 kbit/s, CBR
Various 20 22-34 Nero HE-AAC WMA Professional and aoTuV tied for second.
Sebastian Mares 2007 July multiple ~64
  • Ogg Vorbis AoTuV 5 Beta -q 0
  • WMA Professional 10 1-pass CBR, 64 kbit/s
  • Nero HE-AAC Jul 20 2007 -q 0.24
Various 18 21-33 Nero Digital and WMA Professional
Sebastian Mares 2008 October MP3 ~128
  • LAME 3.98.2 -V5.7
  • LAME 3.97 -V5 --vbr-new
  • iTunes 8.0.1.11 112 kbit/s, VBR, highest quality, joint stereo, smart encoding, filter below 10 Hz
  • Fraunhofer IIS mp3surround CL encoder v1.5 -br 0 -m 4 -q 1 -vbri -ofl
  • Helix v5.1 2005.08.09 -X2 -U2 -V60
  • l3enc 0.99a (Low Anchor) -br 128000 -mod 1
Various 14 26-39 5-way tie (all except l3enc) "The quality at 128 kbps is very good and MP3 encoders improved a lot since the last test." The author also notes that the Fraunhofer and Helix encoders are several times faster than LAME, while being virtually identical to it in perceived audio quality.
HydrogenAudio user IgorC (March/April 2011) 2011 March multiple ~64
  • Ogg Vorbis AoTuV 6.02 Beta -q 0.1
  • Apple HE-AAC constrained VBR, high quality, 64 kbit/s
  • CELT complexity 10, VBR 67.5 kbit/s
  • Nero HE-AAC -q 0.245
Various 30 25-13 CELT / Opus In the results, CELT is referred to as Opus, the name under which it was later standardized.
HydrogenAudio user IgorC (July - August 2011) 2011 July/August LC-AAC ~96
  • Nero 1.5.4.0 -q 0.345
  • Apple QuickTime 7.6.9 true VBR, high quality, 96 kbit/s
  • Apple QuickTime 7.6.9 constrained VBR, high quality, 96 kbit/s
  • Fraunhofer IIS (via Winamp 5.62) VBR 3
  • Coding Technologies (via Winamp 5.61) CBR 100 kbps
Various 20 25 Apple QuickTime
HydrogenAudio user "Kamedo2" 2013 May MP3 ~224
  • Lame3100i -V2+
  • LAME 3.99.5 -V1
  • LAME 3.98.4 -q 0 -b 224
  • Helix v5.1 -X2 -U2 -V146
  • BladeEnc (Low Anchor) -quit -nocfg -224
Various 25 1 4-way tie (all except the BladeEnc low anchor)
Most impairment grades rated between 4 (perceptible but not annoying) and 5 (imperceptible). Both speech samples transparent (p<0.02) except for the low anchor.
HydrogenAudio user Kamedo2 (July/September 2014) 2014 July - September multiple ~96
  • AAC Apple QuickTime iTunes 11.2.2 (qaac 2.4.1) constrained VBR, high quality, 96 kbit/s
  • Opus 1.1 VBR, 96 kbit/s
  • Ogg Vorbis aoTuV Beta6.03 -q 2.2 (~96 kbps)
  • MP3 LAME 3.99.5 VBR, -V 5 (~130 kbps, a well-known comparison but at higher bitrate)
  • AAC FAAC v1.28 (Mid-low Anchor) -b 96
  • AAC FAAC v1.28 (Low Anchor) -q 30 (~52 kbps)
Various 40 33 Opus In the results, Opus is the clear winner, Apple AAC is second, and Ogg Vorbis and the higher-bitrate LAME MP3 are statistically tied for third place. FAAC, known in advance to be inferior, was used to discard bad results and as a quality-scale anchor.


Cunningham and McGregor 2019 February multiple 192 - 1411
  • Uncompressed WAV
  • MP3 CBR 192 kbps
  • AAC 192 kbps CBR
  • ACER low quality ~1023 kbps VBR
  • ACER medium quality ~1130 kbps VBR
  • ACER high quality ~1233 kbps VBR
Pop 10 100 5-way tie (WAV, MP3, AAC, ACER HQ, ACER MQ) Participants reported no perceived differences between the uncompressed, MP3, AAC, ACER high quality, and ACER medium quality audio in terms of noise and distortions, but perceived the ACER low quality format as being of lower quality. In terms of participants’ perceptions of the stereo field, however, all formats under test performed equally well, with no statistically significant differences.[1]


References

  1. Cunningham, Stuart; McGregor, Iain (2019). "Subjective Evaluation of Music Compressed with the ACER Codec Compared to AAC, MP3, and Uncompressed PCM". International Journal of Digital Multimedia Broadcasting. 2019: 1–16. doi:10.1155/2019/8265301. Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.