A personality test is a method of assessing human personality constructs. Most personality assessment instruments (despite being loosely referred to as "personality tests") are in fact introspective (i.e., subjective) self-report questionnaire measures (Q-data, in terms of LOTS data) or reports from life records (L-data) such as rating scales. Attempts to construct actual performance tests of personality have been very limited, even though Raymond Cattell and his colleague Frank Warburton compiled a list of over 2000 separate objective tests that could be used in constructing objective personality tests. One exception, however, was the Objective-Analytic Test Battery, a performance test designed to quantitatively measure 10 factor-analytically discerned personality trait dimensions. A major problem with both L-data and Q-data methods is that, because of item transparency, rating scales and self-report questionnaires are highly susceptible to motivational and response distortion, ranging from lack of adequate self-insight (or biased perceptions of others) to downright dissimulation (faking good/faking bad), depending on the reason for the assessment.
The first personality assessment measures were developed in the 1920s and were intended to ease the process of personnel selection, particularly in the armed forces. Since these early efforts, a wide variety of personality scales and questionnaires have been developed, including the Minnesota Multiphasic Personality Inventory (MMPI), the Sixteen Personality Factor Questionnaire (16PF), and the Comrey Personality Scales (CPS), among many others. Although popular, especially among personnel consultants, the Myers–Briggs Type Indicator (MBTI) has numerous psychometric deficiencies. More recently, a number of instruments based on the Five Factor Model of personality have been constructed, such as the Revised NEO Personality Inventory. However, the Big Five and the related Five Factor Model have been challenged for accounting for less than two-thirds of the known trait variance in the normal personality sphere alone.
Estimates of how much the personality assessment industry in the US is worth range from $2 billion to $4 billion a year (as of 2013). Personality assessment is used in a wide range of contexts, including individual and relationship counseling, clinical psychology, forensic psychology, school psychology, career counseling, employment testing, occupational health and safety and customer relationship management.
The origins of personality assessment date back to the 18th and 19th centuries, when personality was assessed through phrenology, the measurement of bumps on the human skull, and physiognomy, which assessed personality based on a person's outer appearances. Sir Francis Galton took another approach to assessing personality late in the 19th century. Based on the lexical hypothesis, Galton estimated the number of adjectives that described personality in the English dictionary. Galton's list was eventually refined by Louis Leon Thurstone to 60 words that were commonly used for describing personality at the time. By factor analyzing responses from 1300 participants, Thurstone was able to reduce this severely restricted pool of 60 adjectives to seven common factors. This procedure of factor analyzing common adjectives was later used by Raymond Cattell (the 7th most highly cited psychologist of the 20th century, based on the peer-reviewed journal literature), who subsequently used a data set of over 4000 affect terms from the English dictionary, which eventually resulted in construction of the Sixteen Personality Factor Questionnaire (16PF); the 16PF also measured up to eight second-stratum personality factors. Of the many introspective (i.e., subjective) self-report instruments constructed to measure the putative Big Five personality dimensions, perhaps the most popular has been the Revised NEO Personality Inventory (NEO-PI-R). However, the psychometric properties of the NEO-PI-R (including its factor analytic/construct validity) have been severely criticized.
There are many different types of personality assessment measures. The self-report inventory involves administration of many items requiring respondents to introspectively assess their own personality characteristics. This is highly subjective, and because of item transparency, such Q-data measures are highly susceptible to motivational and response distortion. Respondents are required to indicate their level of agreement with each item using a Likert scale or, more accurately, a Likert-type scale. An item on a personality questionnaire, for example, might ask respondents to rate the degree to which they agree with the statement "I talk to a lot of different people at parties" on a scale from 1 ("strongly disagree") to 5 ("strongly agree").
Historically, the most widely used multidimensional personality instrument is the Minnesota Multiphasic Personality Inventory (MMPI), a psychopathology instrument originally designed to assess archaic psychiatric nosology.
In addition to subjective/introspective self-report inventories, there are several other methods for assessing human personality, including observational measures, ratings of others, projective tests (e.g., the TAT and Ink Blots), and actual objective performance tests (T-data).
The meaning of personality test scores is difficult to interpret in a direct sense. For this reason, producers of personality tests make a substantial effort to produce norms that provide a comparative basis for interpreting a respondent's test scores. Common formats for these norms include percentile ranks, z scores, sten scores, and other forms of standardised scores.
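As an illustration of how a raw score can be expressed against norms, the following minimal sketch converts a raw score into a z score, a sten score and a percentile rank. The function names and the tiny norm sample are hypothetical, the sten conversion assumes the usual definition (mean 5.5, standard deviation 2, limited to 1 through 10), and the percentile rank uses the mid-rank convention for ties, which is only one of several conventions in use.

```python
from math import floor
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Standardise a raw score against the norm sample (population SD)."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def sten_score(z):
    """Map a z score onto the 1-10 sten scale (mean 5.5, SD 2).

    Each sten covers half a standard deviation; z = 0 is the boundary
    between sten 5 and sten 6 and is assigned to sten 6 here (assumption).
    """
    return int(min(10, max(1, floor(2 * z + 6))))


def percentile_rank(raw, norm_scores):
    """Percentage of the norm sample scoring below raw (ties count half)."""
    below = sum(s < raw for s in norm_scores)
    equal = sum(s == raw for s in norm_scores)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


# Hypothetical norm sample of raw scale scores.
norms = [10, 12, 14, 16, 18]
raw = 18
z = z_score(raw, norms)
summary = (round(z, 2), sten_score(z), percentile_rank(raw, norms))
```

Real instruments publish norm tables based on large, representative samples; the five-person sample above only serves to make the arithmetic easy to follow.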
A substantial amount of research and thinking has gone into the topic of personality test development. Development of personality tests tends to be an iterative process whereby a test is progressively refined. Test development can proceed on theoretical or statistical grounds. There are three commonly used general strategies: Inductive, Deductive, and Empirical. Scales created today will often incorporate elements of all three methods.
Deductive assessment construction begins by selecting a domain or construct to measure. The construct is thoroughly defined by experts and items are created which fully represent all the attributes of the construct definition. Test items are then selected or eliminated based upon which will result in the strongest internal validity for the scale. Measures created through deductive methodology are equally valid and take significantly less time to construct compared to inductive and empirical measures. The clearly defined and face valid questions that result from this process make them easy for the person taking the assessment to understand. Although subtle items can be created through the deductive process, these measures are often not as capable of detecting lying as other methods of personality assessment construction.
Inductive assessment construction begins with the creation of a multitude of diverse items. The items created for an inductive measure are not intended to represent any theory or construct in particular. Once the items have been created they are administered to a large group of participants. This allows researchers to analyze natural relationships among the questions and label components of the scale based upon how the questions group together. Several statistical techniques can be used to determine the constructs assessed by the measure. Exploratory Factor Analysis and Confirmatory Factor Analysis are two of the most common data reduction techniques that allow researchers to create scales from responses on the initial items.
The Five Factor Model of personality was developed using this method. Advantages of this method include the opportunity to discover previously unidentified or unexpected relationships between items or constructs. It may also allow for the development of subtle items that prevent test takers from knowing what is being measured, and it may represent the actual structure of a construct better than a pre-developed theory. Criticisms include a vulnerability to finding item relationships that do not apply to a broader population, difficulty identifying what may be measured in each component because of confusing item relationships, and constructs that were not fully addressed by the originally created questions.
Empirically derived personality assessments require statistical techniques. One of the central goals of empirical personality assessment is to create a test that validly discriminates between two distinct dimensions of personality. Empirical tests can take a great deal of time to construct. In order to ensure that the test is measuring what it is purported to measure, psychologists first collect data through self- or observer reports, ideally from a large number of participants.
Self- vs. observer-reports
A personality test can be administered directly to the person being evaluated or to an observer. In a self-report, the individual responds to personality items as they pertain to the person himself/herself. Self-reports are commonly used. In an observer-report, a person responds to the personality items as those items pertain to someone else. To produce the most accurate results, the observer needs to know the individual being evaluated. Combining the scores of a self-report and an observer report can reduce error, providing a more accurate depiction of the person being evaluated. Self- and observer-reports tend to yield similar results, supporting their validity.
Direct observation reports
Direct observation involves a second party directly observing and evaluating someone else. The second party observes how the target of the observation behaves in certain situations (e.g., how a child behaves in a schoolyard during recess). The observations can take place in a natural setting (e.g., a schoolyard) or an artificial setting (e.g., a social psychology laboratory). Direct observation can help identify job applicants who are likely to be successful (e.g., through work samples) or assess maternal attachment in young children (e.g., Mary Ainsworth's strange situation). The object of the method is to directly observe genuine behaviors in the target. A limitation of direct observation is that the target persons may change their behavior because they know that they are being observed. A second limitation is that some behavioral traits are more difficult to observe (e.g., sincerity) than others (e.g., sociability). A third limitation is that direct observation is more expensive and time-consuming than a number of other methods (e.g., self-report).
Personality tests in the workplace
Personality tests can predict something about how a job applicant will act in some workplace situations. Conscientiousness is one of the Big Five personality traits. People who are higher in the agreeableness trait tend to be less likely to fight or argue with other employees. There is a chance that an applicant may fake responses to personality test items in order to make the applicant appear more attractive to the employing organization than the individual actually is.
There are several criteria for evaluating a personality test. For a test to be successful, users need to be sure that (a) test results are replicable and (b) the test measures what its creators purport it to measure. Fundamentally, a personality test is expected to demonstrate reliability and validity. Reliability refers to the extent to which test scores, if a test were administered to a sample twice within a short period of time, would be similar in both administrations. Test validity refers to evidence that a test measures the construct (e.g., neuroticism) that it is supposed to measure.
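The description of reliability above corresponds to what is often called test-retest reliability, which can be estimated as the correlation between scores from two administrations. The sketch below computes a Pearson correlation by hand; the function name and the two score lists are hypothetical illustration data, not results from any real instrument.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scale scores for the same five people, tested twice.
time_1 = [20, 25, 30, 35, 40]
time_2 = [22, 24, 31, 33, 41]
retest_reliability = pearson_r(time_1, time_2)
```

A coefficient close to 1 indicates that people keep roughly the same rank order across the two administrations; it says nothing by itself about validity.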
Responses are analyzed once they have been collected, and the analysis can be a long process. Two major theories are used here: classical test theory (CTT), which works at the level of the observed score, and item response theory (IRT), "a family of models for persons' responses to items". The two theories focus upon different 'levels' of responses, and researchers are encouraged to use both in order to fully appreciate their results.
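To make the difference in levels concrete: classical test theory treats the observed score as a true score plus error, while item response theory models the response to each individual item. The sketch below uses the two-parameter logistic (2PL) model, which is only one member of the IRT family and is chosen here as an assumption for illustration; the parameter names `theta` (trait level), `a` (discrimination) and `b` (location) follow common textbook usage, and the item values are hypothetical.

```python
from math import exp


def p_endorse(theta, a, b):
    """2PL model: probability that a person with trait level theta
    endorses an item with discrimination a and location b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


# Hypothetical item: moderately discriminating, located at the trait mean.
item_a, item_b = 1.5, 0.0
low_trait = p_endorse(-1.0, item_a, item_b)
average_trait = p_endorse(0.0, item_a, item_b)
high_trait = p_endorse(1.0, item_a, item_b)
```

Under this model the endorsement probability rises smoothly with the trait level and equals 0.5 when the trait level matches the item location.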
Firstly, non-response needs to be addressed. Non-response can be either unit non-response, where a person gave no response to any of the n items, or item non-response, where individual questions were left unanswered. Unit non-response is generally dealt with by exclusion. Item non-response should be handled by imputation; the method used can vary between test and questionnaire items.
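The two cases can be sketched as follows. Missing answers are represented as `None`, unit non-respondents are dropped, and item non-response is filled in with the respondent's own mean across answered items. Person-mean imputation is an assumption made here for simplicity, since the text does not prescribe a particular imputation method, and the function name is hypothetical.

```python
def handle_nonresponse(responses):
    """Drop unit non-respondents and impute missing items.

    responses: list of rows, one per person, with None for a missing answer.
    Returns a new list of rows without missing values.
    """
    cleaned = []
    for row in responses:
        answered = [value for value in row if value is not None]
        if not answered:
            # Unit non-response: no item answered, so exclude the person.
            continue
        # Item non-response: replace gaps with the person's mean (assumption).
        person_mean = sum(answered) / len(answered)
        cleaned.append([person_mean if value is None else value for value in row])
    return cleaned


raw_data = [
    [1, None, 3],        # item non-response on the second item
    [None, None, None],  # unit non-response
    [4, 5, 5],           # complete record
]
cleaned_data = handle_nonresponse(raw_data)
```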
The conventional method of scoring items is to assign '0' for an incorrect answer and '1' for a correct answer. When tests have more response options (e.g., multiple-choice items), items can be scored '0' for an incorrect answer, '1' for a partly correct answer and '2' for a correct answer. Personality tests can also be scored using a dimensional (normative) or a typological (ipsative) approach. Dimensional approaches such as the Big 5 describe personality as a set of continuous dimensions on which individuals differ. From the item scores, an 'observed' score is computed. This is generally found by summing the unweighted item scores.
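Computing an observed score by unweighted summation can be sketched in a few lines. The scale names and the mapping from scales to item positions below are hypothetical; a real questionnaire defines its own scoring key.

```python
def observed_scores(item_scores, scale_key):
    """Sum the unweighted item scores belonging to each scale.

    item_scores: list of scored responses for one respondent.
    scale_key: dict mapping a scale name to the item positions it uses.
    """
    return {
        scale: sum(item_scores[i] for i in positions)
        for scale, positions in scale_key.items()
    }


# Hypothetical four-item questionnaire answered on a 1-5 scale.
answers = [4, 2, 5, 1]
key = {"extraversion": [0, 2], "neuroticism": [1, 3]}
scores = observed_scores(answers, key)
```

Here `scores` holds one observed score per scale, which could then be compared against norms.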
Criticism and controversy
In the 1960s and 1970s some psychologists dismissed the whole idea of personality, considering much behaviour to be context-specific. This idea was supported by the fact that personality often does not predict behaviour in specific contexts. However, more extensive research has shown that when behaviour is aggregated across contexts, personality can be a reasonably good predictor of behaviour. Almost all psychologists now acknowledge that both social and individual difference factors (i.e., personality) influence behaviour. The debate is currently more around the relative importance of each of these factors and how these factors interact.
One problem with self-report measures of personality is that respondents are often able to distort their responses. Emotive tests in particular could in theory yield unreliable results because people strive to pick the answer they feel best fits an ideal character rather than giving their true response. This is particularly problematic in employment contexts and other contexts where important decisions are being made and there is an incentive to present oneself in a favourable manner.
Work in experimental settings has also shown that student samples asked to deliberately fake on a personality test are clearly capable of doing so. Hogan, Barrett and Hogan (2007) analyzed data from 5,266 applicants who completed a personality test based on the Big Five. At the first application the applicants were rejected. After six months the applicants reapplied and completed the same personality test. The answers on the personality tests were compared and there was no significant difference between the answers.
So in practice, most people do not significantly distort. Nevertheless, a researcher has to be prepared for such possibilities. Also, participants sometimes think that test results are more valid than they really are because they like the results that they get. People want to believe that the positive traits that the test results say they possess are in fact present in their personality. This distorts people's judgments about the validity of such tests.
Several strategies have been adopted for reducing respondent faking. One strategy involves providing a warning on the test that methods exist for detecting faking and that detection will result in negative consequences for the respondent (e.g., not being considered for the job). Forced choice item formats (ipsative testing) have been adopted which require respondents to choose between alternatives of equal social desirability. Social desirability and lie scales are often included which detect certain patterns of responses, although these are often confounded by true variability in social desirability.
More recently, Item Response Theory approaches have been adopted with some success in identifying item response profiles that flag fakers. Other researchers are looking at the timing of responses on electronically administered tests to assess faking. While people can fake, in practice they seldom do so to any significant level. To successfully fake means knowing what the ideal answer would be. Even with something as simple as assertiveness, people who are unassertive and try to appear assertive often endorse the wrong items. This is because unassertive people confuse assertion with aggression, anger, oppositional behavior, etc.
Research on the importance of personality and intelligence in education shows evidence that when others provide the personality rating, rather than providing a self-rating, the outcome is nearly four times more accurate for predicting grades.
A study by American Management Association reveals that 39 percent of companies surveyed use personality testing as part of their hiring process. However, ipsative personality tests are often misused in recruitment and selection, where they are mistakenly treated as if they were normative measures.
More people are using personality testing to evaluate their business partners, their dates and their spouses. Salespeople are using personality testing to better understand the needs of their customers and to gain a competitive edge in the closing of deals. College students have started to use personality testing to evaluate their roommates. Lawyers are beginning to use personality testing for criminal behavior analysis, litigation profiling, witness examination and jury selection.
Personality tests have been around for a long time, but their use became widespread only after 1988, when it became illegal in the United States for employers to use polygraphs. The idea behind these personality tests is that employers can reduce their turnover rates and prevent economic losses in the form of people prone to thievery, drug abuse, emotional disorders or violence in the workplace.
Employers may also view personality tests as a more accurate assessment of a candidate's behavioral characteristics than an employment reference. But the problem with using personality tests as a hiring tool is the notion that a person's job performance in one environment will carry over to another work environment. In reality, one's environment plays a crucial role in determining job performance, and not all environments are created equal. One danger of using personality tests is that the results may be skewed by a person's mood, so good candidates may be screened out because of unfavorable responses that reflect that mood.
Another danger of personality tests is that they can create false-negative results (i.e., honest people being labeled as dishonest), especially when the applicant is under stress. Privacy is also a concern: applicants are forced to reveal private thoughts and feelings through their responses, which in effect becomes a condition of employment. Another danger of personality tests is illegal discrimination against certain groups under the guise of a personality test.
- The first modern personality test was the Woodworth Personal Data Sheet, which was first used in 1919. It was designed to help the United States Army screen out recruits who might be susceptible to shell shock.
- The Rorschach inkblot test was introduced in 1921 as a way to determine personality by the interpretation of inkblots.
- The Thematic Apperception Test was commissioned by the Office of Strategic Services (O.S.S.) in the 1930s to identify personalities that might be susceptible to being turned by enemy intelligence.
- The Minnesota Multiphasic Personality Inventory was published in 1942 as a way to aid in assessing psychopathology in a clinical setting. It can also be used to assess the Personality Psychopathology Five (PSY-5), which are similar to the Five Factor Model (FFM; or Big Five personality traits). These five scales on the MMPI-2 include aggressiveness, psychoticism, disconstraint, negative emotionality/neuroticism, and introversion/low positive emotionality.
- Myers–Briggs Type Indicator (MBTI) is a questionnaire designed to measure psychological preferences in how people perceive the world and make decisions. This 16-type indicator test is based on Carl Jung's Psychological Types and was developed during World War II by Isabel Myers and Katharine Briggs. The 16-type indicator includes a combination of Extroversion-Introversion, Sensing-Intuition, Thinking-Feeling and Judging-Perceiving. The MBTI uses two opposing behavioral divisions on each of four scales, which together yield a "personality type".
- OAD Survey is an adjective word list designed to measure seven work-related personality traits and job behaviors: Assertiveness-Compliance, Extroversion-Introversion, Patience-Impatience, Detail-Broad, High Versatility-Low Versatility, Low Emotional IQ-High Emotional IQ, Low Creativity-High Creativity. It was first published in 1990 with periodic norm revisions to assure scale validity, reliability, and non-bias.
- Keirsey Temperament Sorter, developed by David Keirsey, is influenced by Isabel Myers's sixteen types and Ernst Kretschmer's four types.
- The True Colors Test developed by Don Lowry in 1978 is based on the work of David Keirsey in his book, Please Understand Me, as well as the Myers-Briggs Type Indicator and provides a model for understanding personality types using the colors blue, gold, orange and green to represent four basic personality temperaments.
- The 16PF Questionnaire (16PF) was developed by Raymond Cattell and his colleagues in the 1940s and 1950s in a search to try to discover the basic traits of human personality using scientific methodology. The test was first published in 1949, and is now in its 5th edition, published in 1994. It is used in a wide variety of settings for individual and marital counseling, career counseling and employee development, in educational settings, and for basic research.
- The EQSQ Test, developed by Simon Baron-Cohen and Sally Wheelwright, centers on the empathizing-systemizing theory of the male versus the female brain types.
- The Personality and Preference Inventory (PAPI), originally designed by Dr Max Kostick, Professor of Industrial Psychology at Boston State College, in Massachusetts, USA, in the early 1960s evaluates the behaviour and preferred work styles of individuals.
- The Strength Deployment Inventory, developed by Elias Porter in 1971, is based on his theory of Relationship Awareness. Porter was the first known psychometrician to use colors (Red, Green and Blue) as shortcuts to communicate the results of a personality test.
- The Newcastle Personality Assessor (NPA), created by Daniel Nettle, is a short questionnaire designed to quantify personality on five dimensions: Extraversion, Neuroticism, Conscientiousness, Agreeableness, and Openness.
- The DISC assessment is based on the research of William Moulton Marston and later work by John Grier, and identifies four personality types: Dominance; Influence; Steadiness and Conscientiousness. It is used widely in Fortune 500 companies, for-profit and non-profit organizations.
- The Winslow Personality Profile measures 24 traits on a decile scale. It has been used in the National Football League, the National Basketball Association, the National Hockey League and every draft choice for Major League Baseball for the last 30 years and can be taken online for personal development.
- Other personality tests include Forté Profile, Millon Clinical Multiaxial Inventory, Eysenck Personality Questionnaire, Swedish Universities Scales of Personality, Edwin E. Wagner's The Hand Test, and Enneagram of Personality.
- The HEXACO Personality Inventory – Revised (HEXACO PI-R) is based on the HEXACO model of personality structure, which consists of six domains, the five domains of the Big Five model, as well as the domain of Honesty-Humility.
- The Personality Inventory for DSM-5 (PID-5) was developed in September 2012 by the DSM-5 Personality and Personality Disorders Workgroup with regard to a personality trait model proposed for DSM-5. The PID-5 includes 25 maladaptive personality traits as determined by Krueger, Derringer, Markon, Watson, and Skodol.
- The Process Communication Model (PCM), developed by Taibi Kahler with NASA funding, was used to assist with shuttle astronaut selection. It is now a non-clinical personality assessment, communication and management methodology that is applied to corporate management, interpersonal communications, education, and real-time analysis of call centre interactions, among other uses.
- The Birkman Method (TBM) was developed by Roger W. Birkman in the late 1940s. The instrument consists of ten scales describing "occupational preferences" (Interests), 11 scales describing "effective behaviors" (Usual behavior) and 11 scales describing interpersonal and environmental expectations (Needs). A corresponding set of 11 scale values was derived to describe "less than effective behaviors" (Stress behavior). TBM was created empirically. The psychological model is most closely associated with the work of Kurt Lewin. Occupational profiling consists of 22 job families with over 200 associated job titles connected to O*Net.
- The International Personality Item Pool (IPIP) is a public domain set of more than 2000 personality items which can be used to measure many personality variables, including the Five Factor Model.
- The Guilford-Zimmerman Temperament Survey examined 10 factors that represented normal personality, and was used in both longitudinal studies and to examine the personality profiles of Italian pilots.
Personality tests of the five factor model
Several instruments measure the Big Five personality traits:
- The NEO PI-R, or the Revised NEO Personality Inventory, is one of the most significant measures of the Five Factor Model (FFM). The measure was created by Costa and McCrae and contains 240 items in the forms of sentences. Costa and McCrae had divided each of the five domains into six facets each, 30 facets total, and changed the way the FFM is measured.
- The Five-Factor Model Rating Form (FFMRF) was developed by Lynam and Widiger in 2001 as a shorter alternative to the NEO PI-R. The form consists of 30 facets, 6 facets for each of the Big Five factors.
- The Ten-Item Personality Inventory (TIPI) and the Five Item Personality Inventory (FIPI) are very abbreviated rating forms of the Big Five personality traits.
- The Five Factor Personality Inventory — Children (FFPI-C) was developed to measure personality traits in children based upon the Five Factor Model (FFM).
- The Big Five Inventory (BFI), developed by John, Donahue, and Kentle, is a 44-item self-report questionnaire consisting of adjectives that assess the domains of the Five Factor Model (FFM). The 10-Item Big Five Inventory (BFI-10) is a simplified version of the well-established BFI. It was developed to provide a personality inventory for use under time constraints. The BFI-10 assesses the five dimensions of the BFI using only two items each to cut down on the length of the BFI.
- The Semi-structured Interview for the Assessment of the Five-Factor Model (SIFFM) is the only semi-structured interview intended to measure a personality model or personality disorder. The interview assesses the five domains and 30 facets as presented by the NEO PI-R, and it additionally assesses both normal and abnormal extremities of each facet.
- Cattell R.B. (1973). Personality and Mood by Questionnaire. San Francisco, CA: Jossey-Bass. ISBN 0-87589-181-0
- Cattell, R.B., & Kline, P. (1977). The Scientific Analysis of Personality and Motivation. New York: Academic Press.
- Cattell, R.B., & Warburton, F.W. (1967). Objective Personality and Motivation Tests: A Theoretical Introduction and practical Compendium. Champaign, IL: University of Illinois Press.
- Cattell, R.B., & Schuerger, J.M. (1978). Personality Theory in Action: Handbook for the O-A (Objective-Analytic) Test Kit. Champaign, Illinois: Institute for Personality and Ability Testing. ISBN 0-918296-11-0
- Schuerger, J.M. (2008). The Objective-Analytic Test Battery. In G.J. Boyle, G. Matthews, & D.H. Saklofske. (Eds.), The SAGE Handbook of Personality Theory and Assessment: Vol. 2 – Personality Measurement and Testing (pp. 529-546). Los Angeles, CA: Sage Publishers. ISBN 9-781412-946520
- Boyle, G.J. (1985). Self report measures of depression: Some psychometric considerations. British Journal of Clinical Psychology, 24, 45-59.
- Boyle, G.J., & Helmes, E. (2009). Methods of personality assessment. In P.J. Corr & G. Matthews (Eds.), The Cambridge Handbook of Personality Psychology (pp. 110-126). Cambridge, UK: Cambridge University Press. ISBN 978-0-521-86218-9
- Boyle, G.J., Saklofske, D.H., & Matthews, G. (2015). (Eds.), Measures of Personality and Social Psychological Constructs. Amsterdam: Elsevier/Academic Press. ISBN 9-780123-869159 doi.org/10.1016/B978-0-12-386915-9.00001-2
- Saccuzzo, Dennis P.; Kaplan, Robert M. (2009). Psychological Testing: Principles, Applications, and Issues (7th ed.). Belmont, CA: Wadsworth Cengage Learning. ISBN 978-0495095552.
- Boyle, G.J., Matthews, G., & Saklofske, D.H. (2008). (Eds.), The SAGE Handbook of Personality Theory and Assessment: Vol. 1 - Personality Theories and Models. Los Angeles, CA: Sage Publishers. ISBN 9-781412-946513
- Boyle, G.J., Matthews, G., & Saklofske, D.H. (2008). (Eds.), The SAGE Handbook of Personality Theory and Assessment: Vol. 2 - Personality Measurement and Testing. Los Angeles, CA: Sage Publishers. ISBN 9-781412-946520
- Boyle, G.J. (1995). Myers-Briggs Type Indicator (MBTI): Some psychometric limitations. Australian Psychologist, 30, 71-74.
- Costa, P.T., & McCrae, R.R. (1985). The NEO Personality Inventory Manual. Odessa, FL: Psychological Assessment Resources.
- Boyle, G.J. (2008). Critique of Five-Factor Model (FFM). In G.J. Boyle, G. Matthews, & D.H. Saklofske. (Eds.), The SAGE Handbook of Personality Theory and Assessment: Vol. 1 - Personality Theories and Models. Los Angeles, CA: Sage Publishers. ISBN 9-781412-946513
- Cattell, R.B. (1995). The fallacy of five factors in the personality sphere. The Psychologist, 8, 207-208.
- Eysenck, H.J. (1992). Four ways five factors are not basic. Personality and Individual Differences, 13, 667-673.
- "Personality Testing at Work: Emotional Breakdown". The Economist.
- Elahe Nezami; James N. Butcher (16 February 2000). G. Goldstein; Michel Hersen (eds.). Handbook of Psychological Assessment. Elsevier. p. 415. ISBN 978-0-08-054002-3.
- Goldberg, L.R. (1993). "The structure of phenotypic personality traits". American Psychologist. 48 (1): 26–34. doi:10.1037/0003-066x.48.1.26. PMID 8427480.
- Thurstone, L. L. (1947). Multiple Factor Analysis. Chicago, IL: University of Chicago Press.
- Haggbloom, S.J., Warnick, R., Warnick, J.E., Jones, V.K., Yarbrough, G.L., Russell, T.M., Borecky, C.M., McGahhey, R., Powell III, J.L., Beavers, J., & Monte, E. (2002). The 100 most eminent psychologists of the 20th century. Review of General Psychology, 6, 139-152. doi: 10.1037//1089-2618.104.22.168
- Cattell, R.B., & Nichols, K.E. (1972). An improved definition, from 10 researches, of second order personality factors in Q data (with cross-cultural checks). Journal of Social Psychology, 86, 187-203.
- Boyle, G.J., Stankov, L., & Cattell, R.B. (1995). Measurement and statistical models in the study of personality and intelligence. In D.H. Saklofske & M. Zeidner (Eds.), International Handbook of Personality and Intelligence (pp. 417-446). New York: Plenum. ISBN 0-306-44749-5
- Boyle, G.J. (1985). Self-report measures of depression: Some psychometric considerations. British Journal of Clinical Psychology, 24, 45-59.
- Helmes, E., & Reddon, J.R. (1993). A perspective on developments in assessing psychopathology: A critical review of the MMPI and MMPI-2. Psychological Bulletin, 113, 453-471.
- Carlson, Neil R.; et al. (2010). Psychology: the Science of Behaviour. United States of America: Pearson Education. p. 464. ISBN 978-0-205-64524-4.
- Burisch, Matthias (March 1984). "Approaches to personality inventory construction: A comparison of merits". American Psychologist. 39 (3): 214–227. doi:10.1037/0003-066X.39.3.214.
- Jackson, D. N. (1971). "The dynamics of structured personality tests: 1971". Psychological Review. 78 (3): 229–248. doi:10.1037/h0030852.
- McCrae, Robert; Oliver John (1992). "An Introduction to the Five-Factor Model and Its Applications". Journal of Personality. 60 (2): 175–215. CiteSeerX 10.1.1.470.4858. doi:10.1111/j.1467-6494.1992.tb00970.x. PMID 1635039.
- Smith, Gregory; Sarah Fischer; Suzannah Fister (December 2003). "Incremental Validity Principles in Test Construction". Psychological Assessment. 15 (4): 467–477. doi:10.1037/1040-3590.15.4.467. PMID 14692843.
- Ryan Joseph; Shane Lopez; Scott Sumerall (2001). William Dorfman, Michel Hersen (ed.). Understanding Psychological Assessment: Perspective on Individual Differences (1 ed.). Springer. pp. 1–15.
- Ashton, Michael C. (2017-06-13). Individual Differences and Personality (3rd ed.). ISBN 9780128098455. OCLC 987583452.
- "Human Resources". Archived from the original on 2018-04-09. Retrieved 2018-04-08.
- Schonfeld, I.S., & Mazzola, J.J. (2013). Strengths and limitations of qualitative approaches to research in occupational health psychology. In R. Sinclair, M. Wang, & L. Tetrick (Eds.), Research methods in occupational health psychology: State of the art in measurement, design, and data analysis (pp. 268-289). New York: Routledge.
- Ones, D.S. (2005). Personality at work: Raising awareness and correcting misconceptions. Human Performance, 18, 389-404. doi:10.1207/s15327043hup1804_5
- Urbina, Susana (2014-06-30). Essentials of Psychological Testing (Second ed.). Hoboken, New Jersey: John Wiley & Sons, Incorporated. pp. 127–128, 165–167. ISBN 978-1-118-70725-8. Retrieved 2018.
- (see Lord and Novick, 1968)
- Herman J. Adèr, Gideon J. Mellenbergh (2008) Advising on Research Methods: a consultant's companion. Johannes van Kessel Publ. p. 244.
- See Hambleton and Swaminathan (1985) for a full summary of IRT
- (Mellenbergh, 2008)
- Doll, Edgar Arnold (1953). The measurement of social competence: a manual for the Vineland social maturity scale. Educational Test Bureau, Educational Publishers. doi:10.1037/11349-000.
- Arendasy, M.; Sommer, M.; Herle, M.; Schützhofer, B.; Inwanschitz, D. (2011). "Modeling effects of faking on an objective personality test". Journal of Individual Differences. 32 (4): 210–218. doi:10.1027/1614-0001/a000053.
- (e.g., Viswesvaran & Ones, 1999; Martin, Bowen & Hunt, 2002)
- Hogan, Joyce (2007). "Personality Measurement, Faking, and Employment Selection" (PDF). The Journal of Applied Psychology. American Psychological Association. 92 (5): 1270–85. doi:10.1037/0021-9010.92.5.1270. PMID 17845085. Archived from the original (PDF) on 2013-06-05.
- Poropat, Arthur E. (2014-08-01). "Other-rated personality and academic performance: Evidence and implications". Learning and Individual Differences. 34: 24–32. doi:10.1016/j.lindif.2014.05.013.
- Blinkhorn, S.; Johnson, C.; Wood, R. (1988). "Spuriouser and spuriouser: The use of ipsative personality tests". Journal of Occupational Psychology. 61 (2): 153–162. doi:10.1111/j.2044-8325.1988.tb00279.x.
- "State Laws on Polygraphs and Lie Detector Tests". www.nolo.com.
- Stabile, Susan J. "The Use of Personality Tests as a Hiring Tool: Is the Benefit Worth the Cost?" (PDF). U.PA. Journal of Labor and Employment Law.
- Harkness, A. R., & McNulty, J. L. (1994). The Personality Psychopathology Five (PSY-5): Issue from the pages of a diagnostic manual instead of a dictionary. In S. Strack & M. Lorr (Eds.), Differentiating normal and abnormal personality. New York: Springer.
- "International True Colors Association". Archived from the original on 2012-03-20. Retrieved 2013-01-03.
- Porter, Elias H. (1971) Strength Deployment Inventory, Pacific Palisades, CA: Personal Strengths Assessment Service.
- Nettle, Daniel (2009-03-07). "A test of character". The Guardian. London.
- "How to Build the Perfect Batter". GQ Magazine. Retrieved 2012-07-26.
- "Winslow Online Personality Assessment". Retrieved 2012-07-26.
- Ashton, M. C.; Lee, K. (2008). "The prediction of Honesty-Humility-related criteria by the HEXACO and Five-Factor models of personality". Journal of Research in Personality. 42 (5): 1216–1228. doi:10.1016/j.jrp.2008.03.006.
- Krueger, R. F.; Derringer, J.; Markon, K. E.; Watson, D.; Skodol, A. E. (2012). "Initial construction of a maladaptive personality trait model and inventory for DSM-5". Psychological Medicine. 42 (9): 1879–1890. doi:10.1017/s0033291711002674. PMC 3413381. PMID 22153017.
- Spenser, Scott. "The History of the Process Communication Model in Astronaut Selection" Archived 2013-10-27 at the Wayback Machine, Cornell University, December 2000. Retrieved 19 June 2013
- Conway, Kelly (2008). "Methods and systems for determining customer hang-up during a telephonic communication between a customer and a contact center". US Patent Office.
- Steiner, Christopher (2012). "Automate This: How Algorithms Came to Rule Our World". Penguin Group (USA) Inc., New York. ISBN 9781101572153.
- Goldberg, L. R.; Johnson, J. A.; Eber, H. W.; Hogan, R.; Ashton, M. C.; Cloninger, C. R.; Gough, H. G. (2006). "The International Personality Item Pool and the future of public-domain personality measures". Journal of Research in Personality. 40: 84–96. doi:10.1016/j.jrp.2005.08.007.
- Terracciano, Antonio; McCrae, Robert R.; Costa, Paul T. (2006). "Longitudinal trajectories in Guilford-Zimmerman temperament survey data: results from the Baltimore longitudinal study of aging". The Journals of Gerontology. Series B, Psychological Sciences and Social Sciences. 61 (2): P108–116. doi:10.1093/geronb/61.2.p108. ISSN 1079-5014. PMC 2754731. PMID 16497954.
- Giambelluca, A.; Zizolfi, S. (1985). "[The Guilford Zimmerman Temperament Survey (GZTS): concurrent criterion validity. Study of a sample of 150 pilot cadets of the Aeronautics Academy of Pozzuoli]". Rivista di Medicina Aeronautica e Spaziale. 52 (2): 139–149. ISSN 0035-631X. PMID 3880032.
- Giambelluca, A.; Zizolfi, S. (1985). "[The Guilford-Zimmerman Temperament Survey (GZTS). The results of its first use in military aeronautics: descriptive statistics, intercorrelation matrix and competitive validity with the MMPI. A study on a sample of 150 student officer pilots of the Pozzuoli Aeronautics Academy]". Rivista di Medicina Aeronautica e Spaziale. 52 (1): 29–46. ISSN 0035-631X. PMID 3880382.
- Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources.
- Lynam, D. R.; Widiger, T. A. (2001). "Using the five-factor model to represent the DSM-IV personality disorders: An expert consensus approach". Journal of Abnormal Psychology. 110 (3): 401–412. doi:10.1037/0021-843x.110.3.401. PMID 11502083. S2CID 17468718.
- Gosling, Samuel D; Rentfrow, Peter J; Swann, William B (2003). "A very brief measure of the Big-Five personality domains". Journal of Research in Personality. 37 (6): 504–528. doi:10.1016/S0092-6566(03)00046-1. ISSN 0092-6566.
- McGhee, R.L., Ehrler, D., & Buckhalt, J. (2008). Manual for the Five Factor Personality Inventory - Children. Austin, TX: PRO-ED, Inc.
- John, O. P., Donahue, E. M., & Kentle, R. L. (1991). The Big Five Inventory – Versions 4a and 54. Berkeley: University of California, Berkeley, Institute of Personality and Social Research.
- Rammstedt, Beatrice (2007). The 10-Item Big Five Inventory: Norm Values and Investigation of Sociodemographic Effects Based on a German Population Representative Sample. European Journal of Psychological Assessment, 23 (3), 193-201.
- Trull, T. J., & Widiger, T. A. (1997). Structured Interview for the Five-Factor Model of Personality. Odessa, FL: Psychological Assessment Resources.