Linguistic profiling

From Wikipedia, the free encyclopedia

Linguistic profiling is the practice of identifying the social characteristics of an individual based on auditory cues, in particular dialect and accent. The theory was first developed by Professor John Baugh to explain discriminatory practices in the housing market based on the auditory redlining of prospective clientele by housing administrators. Linguistic profiling extends to issues of legal proceedings, employment opportunities, and education. The theory is frequently described as the auditory equivalent of racial profiling. The bulk of the research and evidence in support of the theory pertains to racial and ethnic distinctions, though it also applies to distinctions within racial or ethnic groups, to perceived gender and sexual orientation, and to geographic origin.

Baugh's theory is distinct from linguistic profiling as defined by Hans van Halteren from the University of Nijmegen in the Netherlands. Van Halteren's theory deals with the categorization of linguistic features for the purposes of author identification and verification from a text, and does not necessarily address the socially defined categories within which those features are included.[1]


An important distinction exists between the many uses of linguistic profiling and the potential for discriminatory treatment. The power to determine origin or racial identity based on speech can be utilized without overt discrimination, as argued in several court cases where voice was used in the prosecution of a suspect. The negative effects of linguistic profiling are seen in the practice of denying housing or employment based on stereotypes associated with dialect and/or accent. Further negative practices are associated with education and general treatment of individuals speaking stigmatized dialects.[2] A more positive view of the practice is found in Baugh's description of expressions of ethnic pride.[3] Though average people have been shown to be well equipped in measuring social characteristics by means of speech, the failings of those unfamiliar with a speech community and the capability of manipulation of speech should be taken into account when determining the unbiased use of linguistic profiling.[4]

Ascribed categories


Between racial groups

The primary research done on linguistic profiling was a result of linguist John Baugh's experience searching for housing as an African American. Baugh found a discrepancy between the proclaimed availability of an apartment in a phone interview, in which he utilized Standard American English, and its apparent unavailability upon a face-to-face meeting with the landlord. The changed conception of the housing administrator between auditory and visual cues pointed to overt discrimination based on race.[3]

Baugh, Purnell, and Idsardi completed a set of four experiments based on the identification of dialects in American English. The resulting findings were as follows:

  • Discrimination based on dialect does occur.
  • It is possible for naïve listeners to identify ethnicity through speech.
  • Very little speech is required to make an accurate identification.

Discrimination based on American English dialect

The first experiment involved a series of telephone surveys in which a single speaker requested housing in three dialects: Chicano English, African American Vernacular English, and Standard American English. Each landlord selected received three requests, one in each dialect, and the resulting rates of call-back appointments favored speakers of Standard American English.[5] The percentages of call-backs for the two cities of Palo Alto and Woodside, whose African American and Hispanic American populations were less than 5%, were as follows:

Geographic location   Standard American English   Chicano English   African American Vernacular English
Palo Alto             63.1%                       31.9%             48.3%
Woodside              70.1%                       21.8%             28.7%

Of the four geographical locations chosen in the study, those with the lowest populations of African Americans and Hispanic Americans were shown to have the greatest bias towards the non-standard dialects.
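
The size of the bias in the two cities listed above can be made concrete with a little arithmetic. The following is a minimal sketch in Python, not part of the original study: the dictionary layout, the dialect abbreviations, and the function name gap_to_standard are illustrative assumptions, and only the percentages come from the table.

```python
# Call-back rates (percent) copied from the table for Palo Alto and Woodside.
# The abbreviations SAE, ChE and AAVE are labels chosen for this sketch.
callback_rates = {
    "Palo Alto": {"SAE": 63.1, "ChE": 31.9, "AAVE": 48.3},
    "Woodside": {"SAE": 70.1, "ChE": 21.8, "AAVE": 28.7},
}


def gap_to_standard(rates):
    """Return the percentage-point gap between SAE and each other dialect."""
    return {
        dialect: round(rates["SAE"] - rate, 1)
        for dialect, rate in rates.items()
        if dialect != "SAE"
    }


if __name__ == "__main__":
    for city, rates in callback_rates.items():
        print(city, gap_to_standard(rates))
```

Under these figures, the gap between Standard American English and each of the other two dialects is larger in Woodside than in Palo Alto.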

Distinguishing dialect

To determine how well people can distinguish dialects, a separate experiment was conducted. Fifty undergraduate students, all Caucasian speakers of Standard American English, were asked to identify the ethnicity behind a recording of the word "hello" spoken in Chicano English, African American Vernacular English, or Standard American English.

Respondents were able to identify the correct dialect more than 70% of the time. Chicano English was found to be more easily identifiable than African American Vernacular English.[5]

Within racial groups

While much evidence has been collected describing linguistic profiling between racial groups within a speech community, linguistic profiling also extends to members within a racial or ethnic group. This is evidenced by a study conducted by Jaquelyn Rahman describing the attitudes of middle-class African Americans toward African American Vernacular English, or AAVE, and Standard American English. She found that subjects associated AAVE with their heritage, while perceiving African Americans who used Standard English as "acting white".[6]

Chinese American and Korean American English speech

An intra-racial distinction was researched by Newman and Wu, who conducted a study in which subjects were asked to identify various speakers based on race; the speakers included Latinos, African Americans, Chinese Americans, Korean Americans, and white speakers. Listeners tended to successfully categorize speakers as Latino, African American, white, or Asian, but they often could not distinguish between Chinese American and Korean American English speakers, although phonetic differences exist.[7]

Voice onset time

It has been found that Korean American and Chinese American English speakers tend to have a longer voice onset time (VOT), the length of time between the release of a plosive and the onset of voicing, than other speakers of Standard American English. Furthermore, Korean American speakers tend to have a longer VOT than Chinese American speakers. This distinction is apparent when considering the VOT of the aspirated plosives [pʰ], [kʰ] and [tʰ].[7]

Phoneme   Standard VOT   Chinese American VOT   Korean American VOT
[pʰ]      58 ms          77 ms                  91 ms
[kʰ]      70 ms          75 ms                  94 ms
[tʰ]      80 ms          87 ms                  126 ms
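
The ordering described above (Standard, then Chinese American, then Korean American) can be checked directly against the tabulated values. The following is a minimal sketch in Python; the dictionary keys and the function names vot_differences and ordering_holds are assumptions made for illustration, and only the millisecond values come from the table.

```python
# Voice onset times in milliseconds, copied from the table.
# Keys "p", "k", "t" stand for the three aspirated plosives.
vot_ms = {
    "p": {"standard": 58, "chinese_american": 77, "korean_american": 91},
    "k": {"standard": 70, "chinese_american": 75, "korean_american": 94},
    "t": {"standard": 80, "chinese_american": 87, "korean_american": 126},
}


def vot_differences(row):
    """Return how much longer each group's VOT is than the standard value."""
    return {
        group: value - row["standard"]
        for group, value in row.items()
        if group != "standard"
    }


def ordering_holds(row):
    """Check that standard < Chinese American < Korean American for one plosive."""
    return row["standard"] < row["chinese_american"] < row["korean_american"]


if __name__ == "__main__":
    for plosive, row in vot_ms.items():
        print(plosive, vot_differences(row), ordering_holds(row))
```

For all three plosives in the table the ordering holds, with the largest difference from the standard value found for Korean American speakers on [th].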



Another distinction between Korean American and Chinese American English speakers can be found in the timing of spoken syllables, or rhythm. Chinese American speakers (in particular, males) tended to speak with a more regular timing of syllables than Korean American speakers.[7]

Gender and sexual orientation

Linguistic profiling also applies to gender and sexual orientation. Munson conducted a study in which naïve listeners were asked to distinguish between heterosexual male and female speakers, and gay male and bisexual or lesbian female speakers. He found that listeners tended to classify male and female speakers by masculinity and femininity, respectively; male speakers were perceived as gay if they sounded less masculine, while female speakers were identified as bisexual or lesbian if they sounded less feminine.[8]

Perceived femininity

Linguistic features of perceived femininity include the following:

Female speakers perceived as bisexual or lesbian exhibited opposite characteristics. Furthermore, speakers who are identified as bisexual or lesbian are not necessarily perceived as masculine.[8]

Perceived masculinity

Linguistic features of perceived masculinity include the following:

  • Low vowels are produced with a higher F1 formant frequency
  • Back vowels are produced with a lower F2 formant frequency
  • A negative /s/ skew, or a skew towards the first harmonic frequency F1

Male speakers perceived as gay tended to exhibit opposite characteristics.

In addition, male speakers who were perceived as gay exhibited greater breathiness and hyperarticulation of stressed syllables than male speakers who were perceived as heterosexual. It is important to note that speakers who are identified as gay are not necessarily perceived as feminine.[8]

Geographic origin

Linguistic profiling occurs beyond the spheres of race and ethnicity in the identification of geographic origin. Indeed, evidence suggests that listeners may successfully categorize speakers based upon dialect. Clopper and Pisoni (2003) found that naïve (or inexperienced) listeners could successfully categorize speakers as hailing from New England, the South, or the West, but had greater difficulty discerning geographic origin when a larger number of dialects were provided: New England, North, North Midland, South Midland, South, West, New York City, or Army Brat. Listeners were only able to identify speakers correctly 30% of the time. They also found evidence suggesting that residential history of the listener affected speaker categorization, and that listeners tended to use a small set of phonetic cues to make these distinctions.[9]

Utah English

Baker et al. had similar findings in a study in which Utah residents and non-Utah residents were asked to discern the degree of residency of a sample of speakers. Perhaps unsurprisingly, they found that Utah residents and western non-Utah residents tended to correctly identify speakers as being from Utah; the difficulty of other non-Utah residents in identifying Utah speakers was attributed to lack of expertise. However, the western non-Utah residents tended to use more stereotypical phonetic cues to identify speakers than Utah residents. Such findings point to the importance of experience when correctly identifying dialect or region of origin.[10]


Speakers of Utah English tend to use more mergers than speakers of Western American English; that is, speakers of Utah English pronounce in the same way certain phonemes that are distinct in Western American English. Some examples include fail-fell, pool-pull, card-cord, pin-pen and heel-hill. Such mergers are used more by older speakers.[10]

In institutions

Legal system

O. J. Simpson murder trial

A well-known example of the identification of race based on auditory sample in a legal setting occurred during the prosecution of O.J. Simpson. A witness testified against Simpson based on his memory of hearing a "male Black" voice. The objection of Simpson’s lawyer, Mr. Cochran, was overruled by the presiding judge.[5]

Sanchez v. People

A major precedent was formed on the use of linguistic profiling in the case of Sanchez v. People. A witness testified against a suspect based on his overhearing of an argument between two apparent Spanish speakers where the killer was identified as having a Dominican rather than a Puerto Rican accent. The New York Superior Court ruled that distinguishing between accents was permissible based on the fact that "human experience has taught us to discern the variation in the mode of speech of certain individuals." The court found that a certain degree of familiarity with the accents and dialects of a region or ethnic group qualified an individual to identify ethnicity or race in a court based on auditory evidence.[4]

Clifford v. Kentucky

A similar justification was used in the later case of Clifford v. Kentucky. A white police officer testified against Charles Clifford, an African American appellant at the Kentucky Supreme Court, based on his evaluation of race from spoken language. The presiding judge cited the findings of Sanchez v. People in justifying the officer's claim of identifying the suspect based on overheard speech. A similar case is that of Clifford v. Commonwealth, where testimony based on linguistic profiling was allowed with the caveat that "the witness is personally familiar with the general characteristics, accents, or speech patterns of the race or nationality in question, i.e. so long as the opinion is 'rationally based on the perception of the witness'".[4]

Guidelines for use

Linguist Dennis Preston has presented an expansion of the rulings set down on the use of linguistic profiling in legal contexts. Preston argues that "personal familiarity" with a dialect should be defined more narrowly, as membership in the speech community within which the identification is taking place. The person identified must be an authentic speaker with no perceived imitation of other dialects within the language. Further, there should be no evidence of overt stereotypes connecting the speaker to a particular style of language.[4]


United States v. Ferril

Linguistic profiling is very apparent in employment, as evidenced by the Supreme Court case United States v. Ferril. Shirley Ferril, a former employee of the telemarketing firm TPG, filed suit against the firm after being fired on the basis of her race. Ferril was hired by TPG, a firm that generates 60% of its revenue from providing pre-election "get-out-the-vote" phone calls to prospective voters, for the November 1994 election. She was fired after the election was over. The particular controversy about the case was TPG’s practice of matching callers to voters based upon race, with the rationale that voters would respond best when the caller was perceived to be a member of their own racial group. This was done with the particular belief that white voters would respond negatively to black callers. Indeed, African American employees would be given a "black" script to read to voters, while white employees read from a "white" script. Ferril, an African American, primarily called African American voters. Though the suit clearly showed that Ferril's work was assigned primarily on the basis of her race, the court allowed TPG to continue to assign callers to voters based upon dialect, accent, or speech pattern, while acknowledging that the practice relied on racial stereotypes.[11]

Perceived race and wages

There is also evidence of a relationship between wages and perceived race. Jeffrey Grogger conducted a study in which listeners were asked to categorize English speakers based upon race; listeners would then give opinions regarding the speakers’ level of education, region of origin, and native language. Listeners could correctly perceive race, but not level of education. Furthermore, there was a correlation between the perceived race of the speaker and the speaker’s total earnings: African American workers who could be identified as black in the study based upon speech earned 12% less than African American workers who were not identified as black, while those African American workers who could not be identified by phonetic cues earned as much as white workers.[12]


Primary education

Linguistic profiling is also evident in education. Michael Shepherd’s study on teachers' perceptions of student responses compares how favorably teachers from the Los Angeles area viewed a response with the race and gender of the student speaker. Students were grouped as white or minority and as male or female. Teachers of various racial and ethnic backgrounds tended to view responses attributed to white girls as being most favorable, followed by white boys, then minority girls. Students who were perceived as minority boys were ranked least favorably. Particularly noteworthy is the fact that Black and Hispanic teachers tended to rank responses given by minority boys, minority girls, and white boys significantly lower than other teachers did. While indicative of an overall stigmatization of boys, the study also provides evidence that the negative associations with minority students (who are identified through linguistic profiling) are held by members of all racial groups.[13]

Higher education

In higher education, linguistic profiling has been found to impede student comprehension. In a 1992 study, D. Rubin found that undergraduate university students would comprehend material more poorly if they heard a non-accented lecture presented with a picture of an Asian female. When the same non-accented lecture was presented with a European American teaching assistant, students had a greater ability to comprehend the material. This suggests that face identification may be enough to make students believe that language performance will be accented, which corresponded with a belief that comprehension would be reduced.[4]


Much of the research regarding the effects of linguistic profiling relates to housing. A study at the University of Pennsylvania found that, when applying for housing, discrepancies existed not only between white speakers of Standard American English and black speakers of African American Vernacular English, but also between females and males and between speakers of Black Accented English and African American Vernacular English. African Americans as a whole were also more likely to be told about problems of creditworthiness when applying for a lease. An explanation offered by the researchers links African American Vernacular English with low socio-economic backgrounds, while Black Accented English was associated with higher middle class status. Speech closer to the standard form yielded greater acceptance.[14]

The many discrimination suits have failed to form a major precedent relating to this issue. Examples of individual cases include Alexander v. Riga, involving the refusal of calls to African American applicants, and United States v. Lorantffy Care Center, in which African Americans were denied admittance to nursing homes.[14]

The Fair Housing Act makes explicit the unlawfulness of discrimination against any member of a protected class, including religion, age, disability, gender, and race.[15] Refusal of housing based on the profiling of linguistic traits is clearly illegal, yet evidence must be found that the housing authority in question could indeed effectively determine the race or ethnicity of the applicant. In this way, linguistic studies on the ability of lay persons to correctly identify racial or ethnic groups based on auditory cues prove helpful to anti-discrimination law.

Outside the U.S.

This practice occurs in regions outside the United States, as evidenced in a 2009 study done in Athens, Greece. A telephone field experiment showed the increased difficulty for Albanians, in particular female Albanians, in securing housing. This study also showed a tendency for segregation based on discriminatory housing practices.[16]

References


  1. Van Halteren, Hans (2004). "Linguistic profiling for author recognition and verification". Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics – ACL '04. p. 199. doi:10.3115/1218955.1218981.
  2. Alim, Samy H (2005). "Critical Language Awareness in the United States: Revisiting Issues and Revising Pedagogies in a Resegregated Society". Educational Researcher. 34 (7): 24–31. doi:10.3102/0013189X034007024.
  3. Baugh, John, Linguistic Profiling, in Black Linguistics: Language, Society, and Politics in Africa and the Americas 155, 155-63 (Sinfree Makoni et al. eds., 2003).
  4. Salaberry, M. Rafael. Language allegiances and bilingualism in the US. Clevedon: Multilingual Matters, 2009. ISBN 978-1847691774
  5. Purnell, Thomas; Idsardi, William; Baugh, John (2016). "Perceptual and Phonetic Experiments on American English Dialect Identification" (PDF). Journal of Language and Social Psychology. 18: 10. doi:10.1177/0261927X99018001002.
  6. Rahman, J (2008). "Middle-Class African Americans: Reactions and Attitudes Toward African American English". American Speech. 83 (2): 141. doi:10.1215/00031283-2008-009.
  7. Newman, M; Wu, A (2011). "'Do You Sound Asian when You Speak English?' Racial Identification and Voice in Chinese and Korean Americans' English". American Speech. 86 (2): 152. doi:10.1215/00031283-1336992.
  8. Munson, Benjamin (2016). "The Acoustic Correlates of Perceived Masculinity, Perceived Femininity, and Perceived Sexual Orientation". Language and Speech. 50 (Pt 1): 125–42. doi:10.1177/00238309070500010601. PMID 17518106.
  9. Clopper, Cynthia G.; Pisoni, David (2004). "Some acoustic cues for the perceptual categorization of American English regional dialects". Journal of Phonetics. 32 (1): 111–140. doi:10.1016/S0095-4470(03)00009-3. PMC 3065110. PMID 21451736.
  10. Baker, W; Eddington, D; Nay, L (2009). "Dialect Identification: The Effects of Region of Origin and Amount of Experience". American Speech. 84: 48. doi:10.1215/00031283-2009-004.
  11. Smalls, DL (2004). "Linguistic Profiling and the law" (PDF). Stanford Law & Policy Review. 15.
  12. Grogger, Jeffrey (2011). "Speech Patterns and Racial Wage Inequality". Journal of Human Resources. 46: 1. doi:10.3368/jhr.46.1.1.
  13. Shepherd, Michael A (2011). "Effects of Ethnicity and Gender on Teachers' Evaluation of Students' Spoken Responses". Urban Education. 46 (5): 1011. doi:10.1177/0042085911400325.
  14. Massey, Douglas S; Lundy, Garvey (2016). "Use of Black English and Racial Discrimination in Urban Housing Markets". Urban Affairs Review. 36 (4): 452. doi:10.1177/10780870122184957.
  15. Fair Housing Act, 42 U.S.C. § 3601 et seq. (1968)
  16. Drydakis, Nick (2010). "Ethnic Differences in Housing Opportunities in Athens". Urban Studies. 47 (12): 2573–2596. doi:10.1177/0042098009359955.