Copy testing
Copy testing is a specialized field of marketing research in which members of a target audience, often in focus groups, evaluate an advertisement before it airs. It determines an ad’s effectiveness based on consumers’ responses during pre-testing, and it covers all media channels, including print, TV, radio, and the Internet. Also known as pre-testing, it is considered the most accurate way to predict how an ad will perform, based on the analysis of feedback gathered from a target audience. Each test will either qualify the ad as strong enough to meet company action standards for airing, or identify opportunities to improve the performance of the ad through editing. (Young, p.213)
Pre-testing is also used to identify weak spots within an ad so that it can be edited more effectively, to streamline the selection of images for an integrated campaign’s print ads, to pull out the key moments for use in ad tracking, and to identify branding moments. [1]
Features of a Good Copy Testing System
In 1982, a consortium of 21 leading advertising agencies, including N. W. Ayer, D’Arcy, Grey, McCann-Erickson, Needham Harper & Steers, Ogilvy & Mather, J. Walter Thompson, and Young & Rubicam, released a public document laying out the PACT (Positioning Advertising Copy Testing) Principles on what constitutes a good copy testing system. PACT states that a good copy testing system must meet the following criteria:
- Provides measurements which are relevant to the objectives of the advertising.
- Requires agreement about how the results will be used in advance of each specific test.
- Provides multiple measurements, because single measurements are generally inadequate to assess the performance of an advertisement.
- Is based on a model of human response to communications: the reception of a stimulus, the comprehension of the stimulus, and the response to the stimulus.
- Allows for consideration of whether the advertising stimulus should be exposed more than once.
- Recognizes that the more finished a piece of copy is, the more soundly it can be evaluated, and requires, as a minimum, that alternative executions be tested in the same degree of finish.
- Provides controls to avoid the biasing effects of the exposure context.
- Takes into account basic considerations of sample definition.
- Demonstrates reliability and validity.
Four Types of Copy Testing Scores
There are four general themes woven into the last century of copy testing. To understand how the different types of measures relate to one another, see the heuristic advertising models listed in the references: the Ameritest TV Ad Model and the Copymetrics Attention, Emotion and Memory Model.
Report Card Measures
The first theme is the quest for a valid, single-number statistic to capture the overall performance of the advertising creative. This search has spawned the creation of various report card measures. These measures are used to filter commercial executions and help management make the go/no-go decision about which ads to air. (Young, p. 7) The predominant copy testing measure of the 1950s and 1960s, Day-After Recall (DAR), was interpreted to measure an ad’s ability to “break through” into the mind of the consumer and register a message from the brand in long-term memory. (Honomichl) Once this measure was adopted by Procter and Gamble, it became a research staple. (Honomichl)
In the 1970s and 1980s, after DAR was determined to be a poor predictor of sales, the research industry began to depend on the measure of persuasion as an accurate predictor of sales. This shift was led, in part, by researcher Horace Schwerin, who pointed out, “the obvious truth is that a claim can be well remembered but completely unimportant to the prospective buyer of the product – the solution the marketer offers is addressed to the wrong need.” (Honomichl) As with DAR, it was Procter and Gamble’s acceptance of the persuasion measure (also known as motivation) that made it an industry standard. Recall scores were still provided in copy testing reports, with the understanding that persuasion was the measure that mattered. (Honomichl)
The 1970s also saw a re-examination of the “breakthrough” measure. As a result, an important distinction was made between the attention-getting power of the creative execution and how well “branded” the ad was. Thus, the separate measures of attention and branding were born. (Young, p.12)
Obstacles
In the 1970s, 1980s, and 1990s, tests were conducted to validate a link between the recall score and actual sales. For example, Procter and Gamble reviewed 10 years’ worth of split-cable tests (100 in total) and found no significant relationship between recall scores and sales. (Young, pp. 3-30) In addition, Leonard Lodish, a marketing professor at the Wharton School, conducted an even more extensive review of test market results and also failed to find a relationship between recall and sales. (Lodish, pp. 125-139) Harold Ross of Mapes & Ross found that persuasion was a better predictor of sales than recall. (Ross, pp. 13-16)
Diagnostic Measures
The second theme is the development of diagnostic copy testing, the main purpose of which is optimization. Understanding why diagnostic measures such as attention, brand linkage, and motivation are high or low can help advertisers identify creative opportunities to improve executions. (Young, p.7)
Obstacles
Different approaches have been developed by research companies to determine the report card measures of attention, brand linkage, and motivation. For example, Unilever analyzed a database of commercials “triple-tested” using the three leading approaches to the measure of branding (Ameritest, ASI, and Millward Brown). The analysis shows that each of the three is measuring something uncorrelated with, and therefore different from, the other two. (Kastenholz, Kerr & Young)
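As an illustration of what "uncorrelated" means in a comparison of this kind, the sketch below computes pairwise Pearson correlations between the scores that three measurement systems assign to the same set of commercials. It is a minimal sketch under stated assumptions: the source does not say which statistic was used, the system names (`system_a`, `system_b`, `system_c`) are placeholders, and all scores are invented for demonstration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores_by_system):
    """Map each pair of system names to the correlation of their scores."""
    return {
        (a, b): pearson(scores_by_system[a], scores_by_system[b])
        for a, b in combinations(sorted(scores_by_system), 2)
    }


# Hypothetical branding scores for the same six commercials (invented data).
branding_scores = {
    "system_a": [62, 48, 71, 55, 80, 43],
    "system_b": [51, 66, 58, 49, 60, 70],
    "system_c": [40, 45, 77, 69, 52, 58],
}

for pair, r in pairwise_correlations(branding_scores).items():
    print(pair, round(r, 2))
```

Coefficients near zero for every pair would suggest that the systems rank the same commercials differently, which is the pattern the triple-test analysis is reported to have found.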
Non-Verbal Measures
The third theme is the development of non-verbal measures in response to the belief of many advertising professionals that much of a commercial’s effects – e.g. the emotional impact – may be difficult for respondents to put into words or scale on verbal rating statements. In fact, many believe the commercial’s effects may be operating below the level of consciousness. (Young, p.7) According to researcher Chuck Young, “There is something in the lovely sounds of our favorite music that we cannot verbalize – and it moves us in ways we cannot express.” (Young, p.22)
Obstacles
In the 1970s, researchers such as Herbert Krugman sought to capture these non-verbal responses biologically by tracking brain wave activity as respondents watched commercials. (Krugman) Others experimented with galvanic skin response, voice pitch analysis, and eye-tracking. (Young, p.22) These efforts were not popularly adopted, in part because of the limitations of the technology and in part because of the poor cost-effectiveness of what was widely perceived as academic, not actionable, research.
Solutions
In the 1990s, Picture Sorts were created as a method of deconstructing a viewer’s dynamic response to the film on multiple levels. A Flow of Attention graph, as one example of a Picture Sort, measures how the eye pre-consciously filters the visual information in an ad and serves both as a gatekeeper for human consciousness and as an interactive search engine. More mainstream than the biological measures, Picture Sorts have been used extensively for on-line ad testing and, because they are not language-dependent, have been used around the world by major advertisers as diverse as IBM and Unilever. (Young, p.24)
More recently, research companies have started to use psychological tests, such as the Stroop effect, to measure the emotional impact of copy. These techniques exploit the notion that viewers do not know why they react to a product, image, or ad in a certain way (or that they reacted at all) because such reactions occur outside of awareness, through changes in networks of thoughts, ideas, and images.
Moment-by-Moment Measures
The fourth theme, which is a variation on the previous two, is the development of moment-by-moment measures to describe the internal dynamic structure of the viewer’s experience of the commercial, as a diagnostic counterpoint to the various gestalt measures of commercial performance or predicted impact. (Young, p.7)
In the early 1980s, the shift in analytical perspective from thinking of a commercial as the fundamental unit of measurement, to be rated in its entirety, to thinking of it as a structured flow of experience gave rise to experimentation with moment-by-moment systems. The most popular of these was the dial-a-meter response, which required respondents to turn a meter, in degrees, toward one end of a scale or the other to reflect their opinion of what was on screen at that moment.
Obstacles
First, unless the dial-a-meter is calibrated by normalizing the data to each individual’s reaction time, the aggregate sample data will be spread across many measurement intervals. Second, because respondents differ in their response times, dial-a-meters carry an uncertainty range around which moment is actually being measured. Third, relatively little has been published to validate dial-a-meter diagnostics against traditional measures of overall ad performance such as recall and persuasion.
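The calibration problem can be illustrated with a short sketch. It assumes, purely for demonstration, that each respondent’s dial trace is sampled at fixed intervals and that each respondent’s reaction lag (in samples) is already known, for example from a separate calibration task. The function names, the lag values, and the ratings are hypothetical and do not describe any vendor’s actual procedure.

```python
def align_trace(trace, lag):
    """Shift a trace earlier by `lag` samples so that each rating lines up
    with the on-screen moment that prompted it; the first `lag` samples,
    which precede any reaction to the ad, are dropped."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return trace[lag:]


def mean_aligned_trace(traces, lags):
    """Average lag-corrected traces moment by moment.

    Each moment is averaged over the respondents who still have data
    at that position after alignment."""
    aligned = [align_trace(t, lag) for t, lag in zip(traces, lags)]
    length = max(len(t) for t in aligned)
    means = []
    for i in range(length):
        values = [t[i] for t in aligned if i < len(t)]
        means.append(sum(values) / len(values))
    return means


# Invented data: both respondents react to the same on-screen moment
# (index 2), but one responds one sample late and the other two samples late.
respondent_1 = [50, 50, 50, 90, 50, 50]  # assumed lag: 1 sample
respondent_2 = [50, 50, 50, 50, 90, 50]  # assumed lag: 2 samples

raw_mean = [(a + b) / 2 for a, b in zip(respondent_1, respondent_2)]
aligned_mean = mean_aligned_trace([respondent_1, respondent_2], [1, 2])

print("raw:    ", raw_mean)
print("aligned:", aligned_mean)
```

In the raw average, the single peak is smeared across two intervals at reduced height. After alignment, it reappears at one moment at full height, which is the effect the calibration is meant to achieve.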
Solutions
In the 1990s, the Ameritest Picture Sorts shifted the frame of measurement from clock time (the dial-a-meter approach) to the “subjective time” of experience, which is tied to the rate of information flow in the film, or the ad’s visual complexity. Instead of providing a rating whenever the alarm rings, respondents rate a Picture Sort image only when the mood, message, or image changes significantly. The data results are clear, easy to understand, and visually appealing. (Young, p. 23) Examples of an Ameritest Flow of Emotion graph can be seen in The Advertising Research Handbook (Young, p. 202) and in Exhibit 2 of [2].
In addition, the dial-a-meter’s single-scale limitations are overcome with a set of moment-by-moment measures in three dimensions: Flow of Attention, which measures the memorability of each moment; Flow of Emotion, which measures the positive or negative emotional response to each moment; and Flow of Meaning, which measures how well the brand’s strategic values are being communicated in each moment.
Copy Testing in Action
Copy testing is used in an array of fields ranging from commercial development to presidential elections. In 2007, CNN employed this form of market testing throughout the primary and general election. Professor Rita Kirk and Dan Schill from Southern Methodist University worked with CNN to gauge voters’ reactions to debates between presidential hopefuls. [3]
Relevant Terms
- advertising
- aesthetic emotion
- attention
- awareness
- brand
- brand linkage
- brand stretch
- branding
- branding moment
- copy sort
- day-after recall (DAR)
- Flow of Attention
- Flow of Emotion
- Flow of Meaning
- motivation
- persuasion
- Picture Sorts
- pre-test
- program engagement
- selling-edge analysis
- semantic information
- stickiness
- stopping power
References
- Ameritest TV Ad Model. http://www.ameritest.net/products/tv.php
- http://www.ameritest.net/choose/
- Example of Ameritest Flow of Attention Graph. http://www.ameritest.net/products/tv.php
- Copymetrics copy test: an approach to testing the effectiveness of ads using cognitive sciences, evaluating the effect on Attention, Emotion and Memory. http://www.copymetrics.com
- Article: A Short History of Advertising (PDF, 196 KiB).
- Understanding Copy Pretesting. New York: Advertising Research Foundation, 1994.
- Foreman, Tom. "Focus Group's Satisfaction Grows for GOP Field during Debate." CNN.com. CNN. Web. 20 Jan. 2012.
- Honomichl, J. J., Honomichl on Marketing Research, Lincolnwood, IL: NTC Business Books, 1986.
- Kastenholz, J., Kerr, G., & Young, C., Focus and Fit: Advertising and Branding Join Forces to Create a Star. Marketing Research, Spring 2004, 16-21.
- Krugman, H., Memory Without Recall, Exposure Without Perception. Journal of Advertising Research, July/August 1977.
- Lodish, L. M., Abraham, M., Kalmenson, S., Livelsberger, J., Lubetkin, B., Richardson, B., & Stevens, M. E., How TV Advertising Works: A Meta-Analysis of 389 Real World Split Cable TV Advertising Experiments. Journal of Marketing Research, May 1995, 125-139.
- Ross, H., Recall vs. Persuasion: An Answer. Journal of Advertising Research, 1982, 22(1), 13-16.
- Young, Charles E., The Advertising Research Handbook, Ideas in Flight, Seattle, WA, April 2005.
- Zilberstein, Shirley. "CNN to Track Debate Viewers' Responses in Real Time." CNN, 13 Dec. 2007. Web. 20 Jan. 2012.