
Coding (social sciences)

From Wikipedia, the free encyclopedia

In the social sciences, coding is an analytical process in which data, in either quantitative form (such as questionnaire results) or qualitative form (such as interview transcripts), are categorized to facilitate analysis.

One purpose of coding is to transform the data into a form suitable for computer-aided analysis, for example with statistical software. Prior to coding, an annotation scheme is defined, consisting of codes or tags. During coding, coders manually add codes to the data wherever the required features are identified. The coding scheme ensures that the codes are added consistently across the data set and allows for verification of previously tagged data.[1]
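The verification role of a coding scheme can be illustrated with a minimal sketch. The scheme, the code names and the tagged utterances below are hypothetical and are not taken from any cited source; the check only confirms that every code in previously tagged data is defined in the scheme.

```python
# Hypothetical annotation scheme: code -> short definition for coders.
SCHEME = {
    "GREET": "opening or closing pleasantry",
    "QUESTION": "request for information",
    "COMPLAINT": "expression of dissatisfaction",
}


def undefined_codes(tagged, scheme):
    """Return (index, code) pairs whose code is not defined in the scheme."""
    return [(i, code) for i, (_, code) in enumerate(tagged) if code not in scheme]


# Previously tagged data as (text, code) pairs; the last code is misspelled.
tagged = [
    ("Hello there", "GREET"),
    ("Why was I charged twice?", "QUESTION"),
    ("This is unacceptable", "COMPLIANT"),
]

problems = undefined_codes(tagged, SCHEME)
print(problems)
```

A check of this kind catches only codes that fall outside the scheme; whether a valid code was applied to the right passage still requires human review.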

Some studies employ multiple coders working independently on the same data. This reduces the chance of coding errors and is believed to increase the reliability of the data.
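When two coders have coded the same items, their agreement can be quantified. The sketch below computes simple percent agreement and Cohen's kappa, a widely used chance-corrected agreement statistic that the text above does not name; it is offered as one possible measure, and the two label sequences are invented for illustration.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability that both coders pick the same code
    # if each assigned codes at random with their own marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in set(coder_a) | set(coder_b)) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six items.
coder_a = ["A", "A", "B", "B", "A", "B"]
coder_b = ["A", "A", "B", "A", "A", "B"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(round(agreement, 3), round(kappa, 3))
```

Here the coders agree on five of six items, and chance agreement is 0.5, so kappa is lower than raw agreement.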


One code should apply to only one category and categories should be comprehensive. There should be clear guidelines for coders (individuals who do the coding) so that coding is consistent.

Quantitative approach

For quantitative analysis, data is usually coded into variables that are measured and recorded as nominal or ordinal.
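The difference between nominal and ordinal coding can be shown with a small sketch. The two codebooks and the responses are hypothetical; the point is only that nominal codes are arbitrary labels, while ordinal codes preserve rank order, which determines which summary statistics are meaningful.

```python
from statistics import median, mode

# Nominal variable: the numbers are labels only; their order means nothing.
EMPLOYMENT = {"employed": 1, "unemployed": 2, "student": 3, "retired": 4}

# Ordinal variable: larger codes mean stronger agreement.
AGREEMENT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_responses(responses, codebook):
    """Translate raw answers into numeric codes using the given codebook."""
    return [codebook[r.strip().lower()] for r in responses]


employment = code_responses(["Employed", "student", "employed", "Unemployed"], EMPLOYMENT)
agreement = code_responses(
    ["Agree", "Strongly agree", "Disagree", "agree", "Neutral"], AGREEMENT
)

# The mode is meaningful for nominal codes; the median additionally
# requires an ordering, so it is used here only for the ordinal variable.
most_common_status = mode(employment)
typical_agreement = median(agreement)
print(employment, most_common_status)
print(agreement, typical_agreement)
```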

Questionnaire data can be pre-coded (codes are assigned to expected answers when the questionnaire is designed), field-coded (codes are assigned as soon as data is available, usually during fieldwork), post-coded (open questions are coded on completed questionnaires) or office-coded (coding is done after fieldwork). Note that some of the above are not mutually exclusive.
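As a rough illustration of how pre-coding and post-coding can interact, the sketch below applies a pre-coded scheme to raw answers and sets aside anything the scheme did not anticipate for later post-coding. The scheme, the `NO_ANSWER` code and the answers are all hypothetical.

```python
# Hypothetical pre-coded scheme, fixed when the questionnaire was designed.
PRECODED = {"yes": 1, "no": 2, "don't know": 8}
NO_ANSWER = 9  # hypothetical code for a blank response


def apply_precodes(answers):
    """Split answers into pre-coded values and answers awaiting post-coding.

    Returns two dicts keyed by respondent index: numeric codes for answers
    the scheme anticipated, and raw text for answers it did not.
    """
    coded, pending = {}, {}
    for i, raw in enumerate(answers):
        answer = raw.strip().lower()
        if not answer:
            coded[i] = NO_ANSWER
        elif answer in PRECODED:
            coded[i] = PRECODED[answer]
        else:
            pending[i] = raw  # left for a human coder to post-code
    return coded, pending


answers = ["Yes", "no", "Only on weekends", "", "Don't know"]
coded, pending = apply_precodes(answers)
print(coded)
print(pending)
```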

In social sciences, spreadsheets such as Excel and more advanced software packages such as R, Matlab, PSPP/SPSS, DAP/SAS, MiniTab and Stata are often used.

Qualitative approach

For disciplines in which a qualitative format is preferred, including ethnography, humanistic geography and phenomenological psychology, a varied approach to coding can be applied. Iain Hay (2005) outlines a two-step process beginning with basic coding to distinguish overall themes, followed by more in-depth, interpretive coding in which more specific trends and patterns can be interpreted.[2]

Much of qualitative coding can be attributed to either grounded or a priori coding.[3] Grounded coding refers to allowing notable themes and patterns to emerge from the documents themselves, whereas a priori coding requires the researcher to apply pre-existing theoretical frameworks to analyze the documents. As coding methods are applied across various texts, the researcher is able to apply axial coding, which is the process of selecting core thematic categories present in several documents to discover common patterns and relations.[4]
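Axial coding is an interpretive task, but one mechanical part of it, finding codes that recur across several documents, can be sketched in code. The document names, the codes and the `min_documents` threshold below are hypothetical, and the result is only a list of candidates for the researcher to evaluate.

```python
from collections import defaultdict


def recurring_codes(codes_by_document, min_documents=2):
    """Map each code found in at least `min_documents` documents to those documents."""
    documents_per_code = defaultdict(set)
    for document, codes in codes_by_document.items():
        for code in codes:
            documents_per_code[code].add(document)
    return {
        code: sorted(documents)
        for code, documents in documents_per_code.items()
        if len(documents) >= min_documents
    }


# Hypothetical codes already assigned to three interview transcripts.
codes_by_document = {
    "interview_1": {"isolation", "workload"},
    "interview_2": {"workload", "family support"},
    "interview_3": {"isolation", "workload"},
}

candidates = recurring_codes(codes_by_document)
print(candidates)
```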

Coding is considered a process of discovery and is done in cycles. Prior to constructing categories, a researcher might apply first and second cycle coding methods.[3] A multitude of methods is available, and a researcher will want to pick one that is suited to the format and nature of their documents. Not all methods can be applied to every type of document. Some examples of first cycle coding methods include:

  • In Vivo Coding: codes terms and phrases used by the participants themselves. The objective is to attempt to give the participants a voice in the research.
  • Process Coding: uses only gerunds ("-ing" words) to describe and display actions throughout the document. It is useful for examining processes, emotional phases and rituals.
  • Versus Coding: uses binary terms to describe groups and processes. The goal is to see which processes and organizations are in conflict with each other throughout the document. These can be both conceptual and grounded objects.
  • Values Coding: codes that attempt to exhibit the inferred values, attitudes and beliefs of participants. In doing so, the researcher may discern patterns in world views.
  • Sub-coding: also known as embedded coding, nested coding or joint coding. This involves assigning primary and second-order codes to a word or phrase, which adds detail to a code. The primary and secondary codes are often called parent and child codes.[5]
  • Simultaneous Coding: applies two or more codes to the same part of the data when that part carries more than one meaning.[3]
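Several of the methods listed above can be represented with one simple data structure. The following sketch is an assumption about how coded segments might be stored, not the format of any particular software package; the transcript sentence, the code names and the helper functions are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Code:
    name: str
    method: str                      # e.g. "in vivo", "process", "sub-coding"
    parent: Optional["Code"] = None  # set for child codes (sub-coding)

    def path(self) -> str:
        """Full name of the code, including any parent codes."""
        return self.name if self.parent is None else f"{self.parent.path()} > {self.name}"


@dataclass
class Segment:
    text: str
    start: int  # character offsets into the transcript
    end: int
    codes: List[Code] = field(default_factory=list)


def tag(transcript: str, phrase: str, *codes: Code) -> Segment:
    """Attach one or more codes to the first occurrence of a phrase."""
    start = transcript.index(phrase)
    return Segment(phrase, start, start + len(phrase), list(codes))


def is_simultaneous(segment: Segment) -> bool:
    """True when two or more codes are applied to the same segment."""
    return len(segment.codes) >= 2


transcript = "I kept waiting for someone to tell me it was okay to leave."

in_vivo = Code('"okay to leave"', "in vivo")   # the participant's own words
process = Code("waiting", "process")           # a gerund describing an action
permission = Code("permission", "sub-coding")  # parent code
seeking = Code("seeking approval", "sub-coding", parent=permission)  # child code

waiting_segment = tag(transcript, "kept waiting", process)
leave_segment = tag(transcript, "okay to leave", in_vivo, seeking)

print(seeking.path())
print(is_simultaneous(leave_segment), is_simultaneous(waiting_segment))
```

Storing character offsets rather than copies of the text alone makes it possible to locate each coded passage in the original transcript later.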

Coding can be done manually, which can be as simple as highlighting different concepts with different colours, or with a software package. Some examples of qualitative software packages include Atlas.ti, MAXQDA, NVivo, QDA Miner, and RQDA.

Once codes have been assembled, they are organized into broader themes and categories. The process generally involves identifying themes from the existing codes, reducing the themes to a manageable number, creating hierarchies within the themes and then linking themes together through theoretical modeling.[6]
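The first three of these steps can be sketched as follows, assuming the researcher has already decided which theme each code belongs to and how themes nest. The codes, the two mappings and the cut-off are hypothetical, and the final step, linking themes through theoretical modeling, is interpretive and is not covered here.

```python
from collections import Counter

# Hypothetical codes applied across a set of transcripts.
applied_codes = [
    "long hours", "long hours", "unpaid overtime", "supportive manager",
    "peer mentoring", "long hours", "commuting", "peer mentoring",
]

# Researcher-assigned mappings: code -> theme, and theme -> broader theme.
CODE_TO_THEME = {
    "long hours": "workload",
    "unpaid overtime": "workload",
    "supportive manager": "support",
    "peer mentoring": "support",
    "commuting": "logistics",
}
THEME_PARENT = {
    "workload": "working conditions",
    "logistics": "working conditions",
    "support": "relationships",
}


def count_by(items, mapping):
    """Count items after translating each one through the given mapping."""
    return Counter(mapping[item] for item in items)


def most_frequent(counts, k):
    """Keep the k most frequent entries, as one way to reduce their number."""
    return [name for name, _ in counts.most_common(k)]


theme_counts = count_by(applied_codes, CODE_TO_THEME)
kept_themes = most_frequent(theme_counts, 2)
# Roll theme counts up one level of the hierarchy.
parent_counts = Counter()
for theme, n in theme_counts.items():
    parent_counts[THEME_PARENT[theme]] += n

print(dict(theme_counts))
print(kept_themes)
print(dict(parent_counts))
```

Frequency is only one possible criterion for reducing themes; a rarely mentioned theme may still be analytically important.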


Creating memos during the coding process is integral to both grounded and a priori coding approaches. Qualitative research is inherently reflexive; as the researcher delves deeper into their subject, it is important to chronicle their own thought processes through reflective or methodological memos, as doing so may highlight their own subjective interpretations of data.[7] It is crucial to begin memoing at the onset of research. Regardless of the type of memo produced, what is important is that the process initiates critical thinking and productivity in the research. Doing so will facilitate easier and more coherent analyses as the project draws on.[8] Memos can be used to map research activities, uncover meaning from data, maintain research momentum and engagement, and open communication.[9]

References


  1. ^ "Coding Schemes | CAWSE" (PDF). General Coding and Annotation Conventions. 2018-07-02. Retrieved 2019-03-10.
  2. ^ Hay, I. (2005). Qualitative research methods in human geography (2nd ed.). Oxford: Oxford University Press.
  3. ^ a b c Saldaña, Johnny. (2015). "The Coding Manual for Qualitative Researchers" (3rd ed.). SAGE Publications Ltd.
  4. ^ Grbich, Carol. (2013). "Qualitative Data Analysis" (2nd ed.). The Flinders University of South Australia: SAGE Publications Ltd.
  5. ^ Saldaña, Johnny (2012). The Coding Manual for Qualitative Researchers. London: Sage. pp. 58–181. ISBN 9781446247372.
  6. ^ Ryan, Gery and H. Bernard. (2003). "Techniques to Identify Themes." Field Methods. Vol. 15(1), pp. 85–109.
  7. ^ Primeau, Loree A. (2003). "Reflections on Self in Qualitative Research: Stories of Family." The American Journal of Occupational Therapy. Vol. 57, pp. 9–16.
  8. ^ Charmaz, Kathy. (2006). "Constructing Grounded Theory: A Practical Guide through Qualitative Analysis." SAGE Publications.
  9. ^ Birks et al. (2008). "Memoing in qualitative research" Journal of Research in Nursing. SAGE Publications. Vol. 13

