ABX test

From Wikipedia, the free encyclopedia

An ABX test is a method of comparing two sensory stimuli to determine whether a detectable difference exists between them. A subject is presented with two known samples (sample A, the first reference, and sample B, the second reference), followed by one unknown sample X that is randomly chosen to be either A or B. The subject is then asked to identify X as either A or B. If X cannot be identified reliably, that is, with a low p-value over a predetermined number of trials, then the null hypothesis cannot be rejected and the test provides no evidence of a perceptible difference between A and B.
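The trial procedure can be sketched as a small simulation. This is only an illustration: the function name run_abx_session and the listener model (a hypothetical probability p_detect of genuinely hearing the difference, with a coin-flip guess otherwise) are assumptions introduced here, not part of any ABX standard.

```python
import random

def run_abx_session(n_trials, p_detect, rng):
    """Simulate one ABX session and return the number of correct answers.

    p_detect is a hypothetical probability that the listener genuinely
    hears the difference on a trial; otherwise the listener guesses.
    """
    correct = 0
    for _ in range(n_trials):
        x = rng.choice("AB")            # X is secretly A or B, chosen at random
        if rng.random() < p_detect:
            answer = x                  # difference heard: identification is right
        else:
            answer = rng.choice("AB")   # nothing heard: a coin-flip guess
        correct += (answer == x)
    return correct

rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(run_abx_session(16, 0.0, rng), "of 16 correct for a pure guesser")
print(run_abx_session(16, 0.8, rng), "of 16 correct for a listener who usually hears it")
```

A pure guesser scores around half of the trials on average, which is why a single session score has to be judged against the binomial distribution rather than taken at face value.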

ABX tests can easily be performed as double-blind trials, eliminating any unconscious influence from the researcher or the test supervisor. Because samples A and B are presented just before sample X, the subject does not have to rely on long-term memory or past experience to discern the difference. The ABX test therefore answers whether a perceptual difference can be found even under ideal circumstances.

ABX tests are commonly used in evaluations of digital audio data compression methods; sample A is typically an uncompressed sample, and sample B is a compressed version of A. Audible compression artifacts that indicate a shortcoming in the compression algorithm can be identified with subsequent testing. ABX tests can also be used to compare the different degrees of fidelity loss between two different audio formats at a given bitrate.

ABX tests can be used to audition input, processing, and output components as well as cabling: virtually any audio product or prototype design.


Hardware tests

[Figure: Two QSC ABX Comparators in a traveling rack]

ABX test equipment utilizing relays to switch between two different hardware paths can help determine if there are perceptual differences in cables and components. Video, audio and digital transmission paths can be compared. If the switching is microprocessor controlled, double-blind tests are possible.

Loudspeaker level and line level audio comparisons could be performed on an ABX test device offered for sale as the "ABX Comparator" by QSC Audio Products from 1998 to 2004. Other hardware solutions have been fabricated privately by individuals or organizations for internal testing.

Confidence

If only one ABX trial were performed, random guessing would have a 50% chance of choosing the correct answer, the same as flipping a coin. To make a statement with some degree of confidence, many trials must be performed. Increasing the number of trials makes the test more likely to demonstrate, at a given confidence level, a genuine ability to distinguish A from B. A 95% confidence level is commonly considered statistically significant.[1] QSC, in the ABX Comparator user manual, recommended a minimum of ten listening trials in each round of tests.[2]

Results required for a 95% confidence level:[3][4]

Number of trials         10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25
Minimum number correct    9   9  10  10  11  12  12  13  13  14  15  15  16  16  17  18
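The thresholds in the table can be reproduced from the binomial distribution. The sketch below assumes that "95% confidence" means a one-sided test in which the probability of reaching the score by pure guessing is at most 0.05; the function names p_value and min_correct are illustrative only.

```python
from math import comb

def p_value(n_trials, n_correct):
    """Chance of getting at least n_correct right out of n_trials by guessing."""
    tail = sum(comb(n_trials, k) for k in range(n_correct, n_trials + 1))
    return tail / 2 ** n_trials

def min_correct(n_trials, alpha=0.05):
    """Smallest score whose guessing probability is at most alpha."""
    for k in range(n_trials + 1):
        if p_value(n_trials, k) <= alpha:
            return k
    return None  # only reached if alpha is smaller than 2 ** -n_trials

for n in range(10, 26):
    k = min_correct(n)
    print(f"{n:2d} trials: {k:2d} correct (p = {p_value(n, k):.4f})")
```

Under that assumption the loop yields the same minimum scores as the table, for example 9 of 10 and 12 of 16.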

QSC recommended that no more than 25 trials be performed, as listener fatigue can set in, making the test less sensitive (less likely to reveal one's actual ability to discern the difference between A and B).[2] However, a more sensitive test can be obtained by pooling the results from a number of such tests, using either separate individuals or sessions by the same listener separated by rest breaks. For a large number of total trials N, a significant result (one with roughly 95% confidence) can be claimed if the number of correct responses exceeds N/2 + √N, which is two standard deviations above the N/2 expected from guessing (the standard deviation of the number of correct guesses is √N/2). Important decisions are normally based on a higher level of confidence, since an erroneous "significant result" would be claimed in one in 20 such tests simply by chance.
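A short sketch of pooling, applying the large-N rule from the paragraph above next to the exact binomial tail. The session scores are made-up illustrative numbers, not measured data, and the function names are assumptions introduced here.

```python
from math import comb, sqrt

def p_value(n_trials, n_correct):
    """Exact chance of scoring at least n_correct out of n_trials by guessing."""
    tail = sum(comb(n_trials, k) for k in range(n_correct, n_trials + 1))
    return tail / 2 ** n_trials

def exceeds_rule(n_trials, n_correct):
    """Large-N rule of thumb: correct responses must exceed N/2 + sqrt(N)."""
    return n_correct > n_trials / 2 + sqrt(n_trials)

# Hypothetical sessions as (trials, correct) pairs, e.g. separated by rest breaks.
sessions = [(16, 11), (16, 10), (16, 12), (16, 11)]

total_trials = sum(n for n, _ in sessions)
total_correct = sum(c for _, c in sessions)

for n, c in sessions:
    print(f"session {c}/{n}: p = {p_value(n, c):.3f}")
print(f"pooled {total_correct}/{total_trials}: "
      f"p = {p_value(total_trials, total_correct):.4f}, "
      f"exceeds N/2 + sqrt(N): {exceeds_rule(total_trials, total_correct)}")
```

In this made-up example most individual sessions fall short of significance on their own, while the pooled score clears the threshold, which is the gain in sensitivity that pooling is meant to provide.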

Software tests

The foobar2000 and Amarok audio players support software-based ABX testing, the latter through a third-party script. aveX is open-source software, developed mainly for Linux, which also allows a test to be monitored from a remote computer. ABX patcher is an ABX implementation for Max/MSP. More ABX software can be found at the archived PCABX website.

Alternatives

Algorithmic Audio Compression Evaluation

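Because ABX testing relies on human listeners to evaluate lossy audio codecs, it is time-consuming and costly. Cheaper algorithmic approaches have therefore been developed, e.g. PEAQ (Perceptual Evaluation of Audio Quality), which computes an objective difference grade (ODG) for a processed signal relative to its reference.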

MUSHRA

In MUSHRA, the listener is presented with the reference (labeled as such), a certain number of test samples, a hidden version of the reference, and one or more anchors. A 0–100 rating scale makes it possible to rate very small differences.

Discrimination testing

Alternative general methods are used in discrimination testing, such as paired comparison, duo–trio, and triangle testing. Of these, duo–trio and triangle testing are particularly close to ABX testing. Schematically:

ABX
    ABX: two knowns and one unknown. The task is to say which known the unknown matches: X = A or X = B.
Duo–trio
    AXY: one known and two unknowns (one equals A, the other equals B). The task is to say which unknown matches the known: X = A (and Y = B), or Y = A (and X = B).
Triangle
    XXY: three unknowns (two are A and one is B, or one is A and two are B). The task is to pick the odd one out, which may be the first, second, or third sample.

In this context, ABX testing is also known as "duo–trio" in "balanced reference" mode – both knowns are presented as references, rather than one alone.[5]
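One practical consequence of these designs is that the chance of a correct answer by pure guessing differs: one in two for ABX and duo–trio, one in three for the triangle test. The sketch below assumes forced-choice answers and a one-sided 5% criterion; the names GUESS_RATE, p_value and min_correct are illustrative, not taken from the cited reference.

```python
from fractions import Fraction
from math import comb

# Probability of a correct answer by guessing, assuming a forced choice.
GUESS_RATE = {
    "ABX": Fraction(1, 2),       # X is A or B
    "duo-trio": Fraction(1, 2),  # X or Y matches the known
    "triangle": Fraction(1, 3),  # odd sample is in position 1, 2 or 3
}

def p_value(n_trials, n_correct, p_guess):
    """Exact chance of at least n_correct right answers from guessing alone."""
    return sum(
        comb(n_trials, i) * p_guess ** i * (1 - p_guess) ** (n_trials - i)
        for i in range(n_correct, n_trials + 1)
    )

def min_correct(n_trials, p_guess, alpha=Fraction(1, 20)):
    """Smallest score whose guessing probability is at most alpha."""
    for k in range(n_trials + 1):
        if p_value(n_trials, k, p_guess) <= alpha:
            return k
    return None  # alpha too strict for this number of trials

for design, p_guess in GUESS_RATE.items():
    print(f"{design}: {min_correct(12, p_guess)} of 12 correct needed")
```

Because guessing succeeds less often in a triangle test, fewer correct answers are needed there than in an ABX or duo–trio test with the same number of trials.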

Notes

  1. AES Journal, Vol. 30, No. 5, 1982. David Clark. Double-Blind Comparator.
  2. QSC ABX Comparator user manual (1998), p. 10.
  3. David Carlstrom. "Probability of Experimental Result Being the Same as Random Guesses". ABX Web Page. http://home.provide.net/~djcarlst/abx_bino.htm. Retrieved 2011-12-14.
  4. P-value
  5. Meilgaard, Morten; Gail Vance Civille, B. Thomas Carr (1999). Sensory evaluation techniques (3 ed.). CRC Press. pp. 68–70. ISBN 0-8493-0276-5. http://books.google.com/books?id=XX9xwk9G0EUC&pg=RA1-PA68.