The four temperaments as illustrated by Johann Kaspar Lavater
A personality test is a questionnaire or other standardized instrument designed to reveal aspects of an individual's character or psychological makeup.
The first personality tests were developed in the 1920s and were intended to ease the process of personnel selection, particularly in the armed forces. Since these early efforts, a wide variety of personality tests have been developed, notably the Myers-Briggs Type Indicator (MBTI), the Minnesota Multiphasic Personality Inventory (MMPI), and a number of tests based on the Five Factor Model of personality.
Today, personality tests have become a $400 million-a-year industry and are used in a range of contexts, including individual and relationship counseling, career planning, and employee selection and development.
There are many different types of personality tests. The most common type is the self-report inventory, also commonly referred to as an objective personality test. Self-report inventories involve administering many questions or items to test-takers, who respond by rating the degree to which each item reflects their behaviour; the responses can then be scored objectively. The term 'item' is used because many test questions are not actually questions; they are typically statements on questionnaires that allow respondents to indicate their level of agreement (using a Likert scale or, more accurately, a Likert-type scale).
A sample item on a personality test, for example, might ask test-takers to rate the degree to which they agree with the statement "I talk to a lot of different people at parties" by using a scale of 1 ("strongly disagree") to 5 ("strongly agree"). The most widely used objective test of personality is the Minnesota Multiphasic Personality Inventory (MMPI), which was originally designed to distinguish individuals with different psychological problems. Since then, it has become popular as a means of attempting to identify personality characteristics of people in many everyday settings. In addition to self-report inventories, there are many other methods for assessing personality, including observational measures, peer-report studies, and projective tests (e.g. the Thematic Apperception Test and the Rorschach inkblot test).
Personality test topics 
The meaning of a personality test score is difficult to interpret in a direct sense. For this reason, producers of personality tests devote substantial effort to producing norms that provide a comparative basis for interpreting a respondent's test scores. Common formats for these norms include percentile ranks, z scores, sten scores, and other forms of standardised scores.
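As an illustration of how such norms might be applied, the following minimal sketch converts a raw score into a z score, a sten score and a percentile rank. The function name `standardise` and the small norm sample are hypothetical, and the percentile rank is computed empirically as the share of the norm sample scoring at or below the raw score, which is only one of several conventions.

```python
from statistics import mean, pstdev


def standardise(raw_score, norm_scores):
    """Return (z, sten, percentile) for a raw score against a norm sample."""
    mu = mean(norm_scores)
    sigma = pstdev(norm_scores)  # population SD of the norm sample
    z = (raw_score - mu) / sigma
    # Sten ("standard ten") scores have mean 5.5 and SD 2, limited to 1..10.
    sten = min(10, max(1, round(5.5 + 2 * z)))
    # Empirical percentile rank: percentage of the norm sample at or below the score.
    at_or_below = sum(score <= raw_score for score in norm_scores)
    percentile = 100 * at_or_below / len(norm_scores)
    return z, sten, percentile


# Hypothetical norm sample of raw scale scores (illustrative data only).
norm_sample = [12, 15, 18, 20, 20, 22, 25, 28, 30, 30]
z, sten, percentile = standardise(25, norm_sample)
print(round(z, 2), sten, percentile)
```

In practice, published norms are based on much larger and carefully sampled groups, and the norm group should match the population to which the respondent belongs.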
Test development 
A substantial amount of research and thinking has gone into the topic of personality test development. Development of personality tests tends to be an iterative process whereby a test is progressively refined. Test development can proceed on theoretical or statistical grounds. There are three commonly used general strategies: inductive, deductive, and empirical. Scales created today will often incorporate elements of all three methods.
Deductive or theoretical strategies can involve taking a previously established psychological or other theory to define the content domain and then developing test items that should in principle measure the domain of interest. This can then be accompanied by expert assessment of how well the developed items match the defined construct. Test items are then selected or eliminated based upon which will result in the strongest internal validity for the scale. Advantages of this method include clearly defined and face-valid questions for each measure. Measures are also more likely to apply across populations. Additionally, the method requires less statistical methodology for initial development, and will often outperform other methods while requiring fewer items. However, the construct of interest must be well understood to create a thorough measure, and it may be difficult to prevent or detect faking on the measure.
Statistical strategies are varied. Common strategies involve the use of exploratory factor analysis and confirmatory factor analysis to verify that items that are proposed to group together into factors actually do group together empirically. Reliability analysis and Item Response Theory are additional complementary approaches. Inductive approaches are reliant on statistical strategies. Inductive strategies begin by constructing a wide group of items with no theoretical connection and administering them to a large group of participants. This allows researchers to analyze natural relationships among the questions and label components of the scale based upon how the questions group together.
The Five Factor Model of personality was developed using this method. Advantages of statistical methods include the opportunity to discover previously unidentified or unexpected relationships between items or constructs. They may also allow for the development of subtle items that prevent test takers from knowing what is being measured, and may represent the actual structure of a construct better than a pre-developed theory. Criticisms include a vulnerability to finding item relationships that do not apply to a broader population, difficulty identifying what is being measured in each component because of confusing item relationships, and constructs that were not fully addressed by the originally created questions.
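The reliability analysis mentioned above is often summarised with Cronbach's alpha, a coefficient of internal consistency. The sketch below is a minimal illustration under that assumption; the text does not say which reliability statistic a given test developer uses, and the function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items on the scale
    # Variance of each item across respondents.
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of the summed scale score across respondents.
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five people to a three-item scale (1-5 ratings).
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together, which is usually read as evidence that they measure a common construct.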
Empirical strategies are similarly reliant on statistical methods. Empirical test construction attempts to create a measure that differentiates between different established groups. For example, this may include depressed and non-depressed individuals, or individuals high or low in levels of aggression. The goal of item creation is to find items that will be answered differently by the groups of interest. The Minnesota Multiphasic Personality Inventory was initially developed using this method.
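The item-selection step of an empirical strategy can be sketched as follows. This is a simplified illustration rather than the procedure used for any particular test: it assumes true/false items coded 1 and 0, and the function name `select_discriminating_items`, the `min_gap` threshold and the response data are all hypothetical.

```python
def select_discriminating_items(criterion_group, comparison_group, min_gap=0.3):
    """Return indices of items whose endorsement rates differ between two groups."""
    n_items = len(criterion_group[0])
    selected = []
    for i in range(n_items):
        # Proportion of each group answering "true" (coded 1) to item i.
        rate_criterion = sum(row[i] for row in criterion_group) / len(criterion_group)
        rate_comparison = sum(row[i] for row in comparison_group) / len(comparison_group)
        if abs(rate_criterion - rate_comparison) >= min_gap:
            selected.append(i)
    return selected


# Hypothetical true/false answers (1 = true) from two established groups.
criterion = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1], [1, 1, 0, 0]]
comparison = [[0, 1, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0]]
kept = select_discriminating_items(criterion, comparison)
print(kept)
```

A real study would use far larger groups and a statistical test rather than a fixed gap, and would cross-validate the selected items on new samples.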
Test evaluation 
A respondent's responses are used to compute the analysis, and analysing the data is a long process. Two major theories are used here: classical test theory (CTT), which is used for the observed score, and item response theory (IRT), "a family of models for persons' responses to items". The two theories focus upon different 'levels' of responses, and researchers are encouraged to use both in order to fully appreciate their results.
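To make the contrast concrete: CTT works at the level of the whole test, modelling the observed score as a true score plus error, whereas IRT models the response to each individual item. The sketch below shows one widely used member of the IRT family, the two-parameter logistic model for items with two response options. It is offered only as an illustration; the function name and parameter values are hypothetical, and rating-scale personality items usually call for polytomous IRT models instead.

```python
import math


def two_pl_probability(theta, discrimination, difficulty):
    """Probability of endorsing an item under the two-parameter logistic model.

    theta          -- the person's standing on the latent trait
    discrimination -- how sharply the item separates low from high trait levels
    difficulty     -- the trait level at which endorsement probability is 0.5
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# A person at the item's difficulty level versus a person two units above it.
p_average = two_pl_probability(theta=0.0, discrimination=1.5, difficulty=0.0)
p_high = two_pl_probability(theta=2.0, discrimination=1.5, difficulty=0.0)
print(p_average, round(p_high, 3))
```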
First, non-response needs to be addressed. Non-response can be either 'unit' non-response, where a person gave no response to any of the n items, or 'item' non-response, where an individual question was left unanswered. Unit non-response is generally dealt with by exclusion. Item non-response should be handled by imputation; the method used can vary between test and questionnaire items. The literature discusses which imputation method is most appropriate and when.
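The two kinds of non-response can be handled along the following lines. This is a minimal sketch that assumes missing answers are stored as `None` and uses person-mean imputation, which is only one of many possible imputation methods and is chosen here purely for simplicity; the function name `handle_non_response` and the data are hypothetical.

```python
def handle_non_response(responses):
    """Drop unit non-respondents and impute item non-response with the person mean."""
    cleaned = []
    for row in responses:
        answered = [value for value in row if value is not None]
        if not answered:
            # Unit non-response: no item was answered, so exclude the respondent.
            continue
        person_mean = sum(answered) / len(answered)
        # Item non-response: replace each missing answer with the person's mean.
        cleaned.append([person_mean if value is None else value for value in row])
    return cleaned


# Hypothetical data: the second respondent answered nothing at all.
raw = [[4, None, 2], [None, None, None], [5, 3, 4]]
cleaned = handle_non_response(raw)
print(cleaned)
```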
The conventional method of scoring items is to assign '0' for an incorrect answer and '1' for a correct answer. When tests have more response options (e.g. ordinal-polytomous items), '0' is assigned for an incorrect answer, '1' for a partly correct answer and '2' for a correct answer. Personality tests can also be scored using a dimensional (normative) or a typological (ipsative) approach. Dimensional approaches such as the Big Five describe personality as a set of continuous dimensions on which individuals differ. From the item scores, an 'observed' score is computed. This is generally found by summing the unweighted item scores.
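Computing an observed score by summing unweighted item scores can be sketched as follows. The handling of reverse-keyed items (statements worded in the opposite direction of the trait) is an assumption added for illustration and is not described in the text above; the function name `observed_score` and the 1 to 5 rating scale are likewise hypothetical choices.

```python
def observed_score(item_scores, reverse_keyed=(), scale_min=1, scale_max=5):
    """Sum unweighted item scores, flipping any reverse-keyed items first."""
    total = 0
    for index, score in enumerate(item_scores):
        if index in reverse_keyed:
            # On a 1-5 scale a reverse-keyed 2 becomes 4, a 1 becomes 5, and so on.
            score = scale_min + scale_max - score
        total += score
    return total


# Four ratings on a 1-5 agreement scale; items 1 and 3 are assumed reverse-keyed.
ratings = [4, 2, 5, 1]
plain_total = observed_score(ratings)
keyed_total = observed_score(ratings, reverse_keyed={1, 3})
print(plain_total, keyed_total)
```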
Criticism and controversy 
Biased test taker interpretation 
One problem with personality tests is that users may find a test accurate only because of subjective validation, in which a person acknowledges only the information that applies to him or her.
Application to non-clinical samples 
Critics have raised issues about the ethics of administering personality tests, especially for non-clinical uses. By the 1960s, tests like the MMPI were being given by companies to employees and applicants as often as to psychiatric patients. Sociologist William H. Whyte was among those who saw the tests as helping to create and perpetuate the oppressive groupthink of the mid-20th-century corporate mentality he described in "The Organization Man".
In the 1960s and 1970s some psychologists dismissed the whole idea of personality, considering much behaviour to be context-specific. This idea was supported by the fact that personality often does not predict behaviour in specific contexts. However, more extensive research has shown that when behaviour is aggregated across contexts, personality can be a modest to good predictor of behaviour. Almost all psychologists now acknowledge that both social and individual difference factors (i.e., personality) influence behaviour. The debate is currently more about the relative importance of each of these factors and how they interact.
Respondent faking 
One problem with self-report measures of personality is that respondents are often able to distort their responses. Emotive tests in particular could in theory produce unreliable results because people strive to pick the answer they feel best fits an ideal character rather than the answer that reflects their true response. This is particularly problematic in employment contexts and other contexts where important decisions are being made and there is an incentive to present oneself in a favourable manner.
Work in experimental settings has also shown that when student samples are asked to deliberately fake on a personality test, they are clearly capable of doing so. Hogan, Barrett and Hogan (2007) analyzed data from 5,266 applicants who took a personality test based on the Big Five. At the first application the applicants were rejected. After six months the applicants reapplied and completed the same personality test. The answers on the two tests were compared and there was no significant difference between them.
So in practice, most people do not significantly distort their responses. Nevertheless, a researcher has to be prepared for such possibilities. Also, participants sometimes think that test results are more valid than they really are because they like the results that they get. People want to believe that the positive traits that the test results say they possess are in fact present in their personality. This distorts people's judgments of the validity of such tests.
Several strategies have been adopted for reducing respondent faking. One strategy involves providing a warning on the test that methods exist for detecting faking and that detection will result in negative consequences for the respondent (e.g., not being considered for the job). Forced choice item formats (ipsative testing) have been adopted which require respondents to choose between alternatives of equal social desirability. Social desirability and lie scales are often included which detect certain patterns of responses, although these are often confounded by true variability in social desirability.
More recently, Item Response Theory approaches have been adopted with some success in identifying item response profiles that flag fakers. Other researchers are looking at the timing of responses on electronically administered tests to assess faking. While people can fake, in practice they seldom do so to any significant level. To fake successfully means knowing what the ideal answer would be. Even with something as simple as assertiveness, people who are unassertive and try to appear assertive often endorse the wrong items. This is because unassertive people confuse assertion with aggression, anger, oppositional behavior, etc.
Psychological research 
Personality testing is frequently used in psychological research to test various theories of personality.
Research published by David Dunning of Cornell University, Chip Heath of Stanford University and Jerry M. Suls of the University of Iowa reveals that observers who are not involved in any type of relationship with an individual are better judges of the individual's relationships and abilities. These workers have studied a large body of investigations into self-evaluation, indicating that individuals may have flawed views about themselves and their social relationships, sometimes leading to decisions that can impact negatively on other persons' lives and/or their own.
Additional applications 
A study by the American Management Association reveals that 39 percent of companies surveyed use personality testing as part of their hiring process. However, ipsative personality tests are often misused in recruitment and selection, where they are mistakenly treated as if they were normative measures.
More people are using personality testing to evaluate their business partners, their dates and their spouses. Salespeople are using personality testing to better understand the needs of their customers and to gain a competitive edge in the closing of deals. College students have started to use personality testing to evaluate their roommates. Lawyers are beginning to use personality testing for criminal behavior analysis, litigation profiling, witness examination and jury selection.
Personality tests have been around for a long time, but their widespread use began only after it became illegal for employers to use polygraphs. The idea behind these personality tests is that employers can reduce their turnover rates and prevent economic losses caused by people prone to thievery, drug abuse, emotional disorders or violence in the workplace.
Employers also see employment tests as a more accurate assessment of someone's behavioral characteristics than an employment reference, who may respond neutrally or favorably for fear of a defamation lawsuit. But the problem with using personality tests as a hiring tool is the assumption that a person's job performance in one environment will be the same in every environment. In reality, one's environment plays a crucial role in determining job performance, and not all environments are created equal. One danger of using personality tests is that the results may be skewed by a person's mood, so good candidates may be screened out because of unfavorable responses that reflect that mood.
Another danger of personality tests is that they can create false-negative results (i.e. honest people being labeled as dishonest), especially when the applicant is under stress. Privacy is also a concern, since applicants are forced to reveal private thoughts and feelings through their responses as what seems to be a condition of employment. A further danger is illegal discrimination against certain groups under the guise of a personality test.
It is easy for personality test participants to become complacent about their own personal uniqueness and instead become dependent on the description associated with them. This can be potentially dangerous for persons who are already suffering from a form of identity disorder, or may be a catalyst for particular behaviors in a person who was previously believed to be of sound mental health. The severity of the damage that individuals can sustain to their personal identity was made clear during the case Wilson v Johnson&Johnson, in which the plaintiff (Wilson) sued his former employer (Johnson&Johnson) for irreparable damages that resulted from the overabundance of personality tests being administered in the workplace. Wilson argued that repeated questioning and scrutiny of his personality was a cause of strain and eventually breakdown. In this historic case, Wilson was awarded $4.7 million after jurors agreed that excessive testing caused strain and led to unnecessary scrutiny resulting in personal grief. Similar cases have been tried since and won, but none with such magnitude as this first monumental case that won mental health rights for employees.
Examples of personality tests 
- The first modern personality test was the Woodworth Personal Data Sheet, which was first used in 1919. It was designed to help the United States Army screen out recruits who might be susceptible to shell shock.
- The Rorschach inkblot test was introduced in 1921 as a way to determine personality by the interpretation of abstract inkblots.
- The Thematic Apperception Test was commissioned by the Office of Strategic Services (O.S.S.) in the 1930s to identify personalities that might be susceptible to being turned by enemy intelligence.
- The Minnesota Multiphasic Personality Inventory was published in 1942 as a way to aid in assessing psychopathology in a clinical setting. It can also be used to assess the Personality Psychopathology Five (PSY-5), which are similar to the Five Factor Model (FFM; or Big Five personality traits). These five scales on the MMPI-2 include aggressiveness, psychoticism, disconstraint, negative emotionality/neuroticism, and introversion/low positive emotionality.
- The Myers-Briggs Type Indicator (MBTI) is a psychometric questionnaire designed to measure psychological preferences in how people perceive the world and make decisions. This 16-type indicator test is based on Carl Jung's Psychological Types and was developed during World War II by Isabel Myers and Katharine Briggs. The 16 types are combinations of Extraversion-Introversion, Sensing-Intuition, Thinking-Feeling and Judging-Perceiving. The MBTI uses two opposing behavioral divisions on each of four scales, which together yield a "personality type".
- The Keirsey Temperament Sorter, developed by David Keirsey, is influenced by Isabel Myers's sixteen types and Ernst Kretschmer's four types.
- The True Colors (personality) Test developed by Don Lowry in 1978 is based on the work of David Keirsey in his book, "Please Understand Me" as well as the Myers-Briggs Type Indicator and provides a model for understanding personality types using the colors blue, gold, orange and green to represent four basic personality temperaments.
- The 16PF Questionnaire (16PF) was developed by Raymond Cattell and his colleagues in the 1940s and 1950s in a search to try to discover the basic traits of human personality using scientific methodology. The test was first published in 1949, and is now in its 5th edition, published in 1994. It is used in a wide variety of settings for individual and marital counseling, career counseling and employee development, in educational settings, and for basic research.
- The EQSQ Test developed by Professor Simon Baron-Cohen, Sally Wheelwright, and their team at the University of Cambridge, England, centers on the empathizing-systemizing theory of the male versus the female brain types.
- The Personal Style Indicator (PSI) classifies four aspects of innate behavior by testing a person's preferences in word associations.
- The Personality and Preference Inventory (PAPI), originally designed by Dr Max Kostick, Professor of Industrial Psychology at Boston State College, in Massachusetts, USA, in the early 1960s evaluates the behaviour and preferred work styles of individuals.
- The Strength Deployment Inventory was developed by Elias Porter, Ph.D., in 1971 and is based on his theory of Relationship Awareness. Porter was the first known psychometrician to use colors (Red, Green and Blue) as shortcuts to communicate the results of a personality test.
- The Newcastle Personality Assessor (NPA), created by Daniel Nettle, is a short questionnaire designed to quantify personality on five dimensions: Extraversion, Neuroticism, Conscientiousness, Agreeableness, and Openness.
- The DISC assessment is based on the research of William Moulton Marston and later work by John Geier, and identifies four personality types: Dominance, Influence, Steadiness and Conscientiousness. It is used widely in Fortune 500 companies and in for-profit and non-profit organizations.
- The Winslow Personality Profile measures 24 traits on a decile scale. It was mentioned in the Disney movie Miracle because it was used to help select the members of the team. It has been used in the National Football League, the National Basketball Association, the National Hockey League and for every draft choice in Major League Baseball for the last 30 years, and can be taken online for personal development.
- Other personality tests include Forté Profile, Millon Clinical Multiaxial Inventory, Eysenck Personality Questionnaire, Swedish Universities Scales of Personality, and Enneagram of Personality.
- The HEXACO Personality Inventory – Revised (HEXACO PI-R) is based on the HEXACO model of personality structure, which consists of six domains, the five domains of the Big Five model, as well as the domain of Honesty-Humility.
- The Pro Development assessment (PRO-D) is a professional development instrument for leaders that measures the convergence of an individual's Missions (motivations and interests that excite an individual to action), Competencies (abilities and aptitudes that enable action) and Styles (personality and behaviors that make an individual unique). PRO-D also reports how the different dimensions of mission, style and competency interact for the Person, Role and Organization. It was developed in Princeton under the advisement of George Gallup and Win Manning (ETS).
- The Personality Inventory for DSM-5 (PID-5) was developed in September of 2012 by the DSM-5 Personality and Personality Disorders Workgroup with regard to a personality trait model proposed for DSM-5. The PID-5 includes 25 maladaptive personality traits as determined by Krueger, Derringer, Markon, Watson, and Skodol.
- The "Clifton Strengths Finder" is the internet-based personality test at the heart of the popular book "Now, Discover Your Strengths".
Personality tests of the Five Factor Model 
Several instruments measure the Big Five personality traits:
- The NEO PI-R, or the Revised NEO Personality Inventory, is one of the most significant measures of the Five Factor Model (FFM). The measure was created by Costa and McCrae and contains 240 items in the form of sentences. Costa and McCrae divided each of the five domains into six facets, 30 facets in total, which changed the way the FFM is measured.
- The Five-Factor Model Rating Form (FFMRF) was developed by Lynam and Widiger in 2001 as a shorter alternative to the NEO PI-R. The form consists of 30 facets, 6 facets for each of the Big Five factors. The form can be obtained at http://samppl.psych.purdue.edu/~dbsamuel/research.html.
- The Five Factor Personality Inventory — Children (FFPI-C) was developed to measure personality traits in children based upon the Five Factor Model (FFM).
- The Big Five Inventory (BFI), developed by John, Donahue, and Kentle, is a 44-item self-report questionnaire consisting of adjectives that assess the domains of the Five Factor Model (FFM). The 10-Item Big Five Inventory (BFI-10) is a simplified version of the well-established BFI. It was developed to provide a personality inventory for use under time constraints. The BFI-10 assesses the five dimensions of the BFI using only two items each to cut down on the length of the BFI.
- The Semi-structured Interview for the Assessment of the Five-Factor Model (SIFFM) is the only semi-structured interview intended to measure a personality model or personality disorder. The interview assesses the five domains and 30 facets as presented by the NEO PI-R, and it additionally assesses both normal and abnormal extremities of each facet.
See also 
- Employment testing
- Forer effect
- Learning styles
- Objective test
- Personality psychology
- Projective test
- Psychological testing
- Sexological testing
References 
- Kaplan, R., Saccuzzo, D. (2010). Psychological Testing: Principle, Applications, and Issues. (Eighth Edition) Belmont, CA: Wadsworth Cengage Learning.
- Stabile, Susan J. "The Use of Personality Tests as a Hiring Tool: Is the Benefit Worth the Cost?". U.PA. Journal of Labor and Employment Law.
- Carlson [et al.], Neil, R. (2010). Psychology: the Science of Behaviour. United States of America: Pearson Education. p. 464. ISBN 978-0-205-64524-4.
- Burisch, Matthias (March 1984). "Approaches to personality inventory construction: A comparison of merits". American Psychologist 39 (3): 214–227. doi:10.1037/0003-066X.39.3.214.
- Burisch, Matthias (1978). "Construction Strategies for Multiscale Personality Inventories". Applied Psychological Measurement 2 (1): 97–101. doi:10.1177/014662167800200110.
- McCrae, Robert; Oliver John (1992). "An Introduction to the Five-Factor Model and Its Applications". Journal of personality 60 (2): 175–215.
- Smith, Greggory; Sarah Fischer, Suzannah Fister (December 2003). "Incremental Validity Principles in Test Construction". Psychological Assessment 15 (4): 467–477.
- Ryan Joseph; Shane Lopez, Scott Sumerall (2001). In William Dorfman, Michel Hersen. Understanding Psychological Assessment: Perspective on Individual Differences (1 ed.). Springer. pp. 1–15.
- Hathaway, S. R., & McKinley, J. C. (1940). A multiphasic personality schedule (Minnesota): I. Construction of the schedule. Journal of Psychology, 10, 249-254.
- (see Lord and Novick, 1968)
- Herman J. Adèr, Gideon J. Mellenbergh (2008) Advising on Research Methods: a consultant's companion. Johannes van Kessel Publ. p. 244.
- See Hambleton and Swaminathan (1985) for a full summary of IRT
- (Mellenbergh, 2008)
- (Ader, Mellenbergh & Hand, 2008)
- (Mellenbergh, 2008)
- Arendasy, M.; Sommer, Herle, Schutzhofer, Inwanschitz (2011). "Modeling effects of faking on an objective personality test.". Journal of Individual Differences 32 (4): 210–218.
- (e.g., Viswesvaran & Ones, 1999; Martin, Bowen & Hunt, 2002)
- Hogan, Joyce. "Personality Measurement, Faking, and Employment Selection". American Psychological Association.
- Blinkhorn, S., Johnson, C., & Wood, R. (1988). Spuriouser and spuriouser: The use of ipsative personality tests. Journal of Occupational Psychology, 61, 153-162.
- Harkness, A. R., & McNulty, J. L. (1994). The Personality Psychopathology Five (PSY-5): Issue from the pages of a diagnostic manual instead of a dictionary. In S. Strack & M. Lorr (Eds.), Differentiating normal and abnormal personality. New York: Springer.
- "International True Colors Association". Retrieved 2013-01-03.
- Porter, Elias H. (1971) Strength Deployment Inventory, Pacific Palisades, CA: Personal Strengths Assessment Service.
- Nettle, Daniel (2009-03-07). "A test of character". The Guardian (London).
- "How to Build the Perfect Batter". GQ Magazine. Retrieved 2012-07-26.
- "Winslow Online Personality Assessment". Retrieved 2012-07-26.
- Ashton, M. C., & Lee, K. (2008). The prediction of Honesty-Humility-related criteria by the HEXACO and Five-Factor models of personality. Journal of Research in Personality, 42, 1216-1228.
- "Pro-D Online Assessment". Retrieved 2012-12-18.
- Krueger, R. F., Derringer, J., Markon, K. E., Watson, D., Skodol, A. E. (2012). Initial construction of a maladaptive personality trait model and inventory for DSM-5. Psychological Medicine, 42, 1879-1890.
- Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.
- Lynam, D. R., & Widiger, T. A. (2001). Using the five-factor model to represent the DSM-IV personality disorders: An expert consensus approach. Journal of Abnormal Psychology, 110, 401-412.
- McGhee, R.L., Ehrler, D. & Buckhalt, J. (2008). Manual for the Five Factor Personality Inventory — Children Austin, TX (PRO ED, INC).
- John, O. P., Donahue, E. M., & Kentle, R. L. (1991). The Big Five Inventory – Versions 4a and 54. Berkeley: University of California, Berkeley, Institute of Personality and Social Research.
- Beatrice Rammstedt (2007). The 10-Item Big Five Inventory: Norm Values and Investigation of Sociodemographic Effects Based on a German Population Representative Sample. European Journal of Psychological Assessment (July 2007), 23 (3), pg. 193-201
- Trull, T. J., & Widiger, T. A. (1997). Structured Interview for the Five-Factor Model of Personality. Odessa, FL: Psychological Assessment Resources.
- International Personality Item Pool - public domain list of items and scales used in personality tests.
- A number of implementations of IPIP tests