Thematic analysis

From Wikipedia, the free encyclopedia

Thematic analysis is one of the most common forms of analysis within qualitative research.[1][2] It emphasizes identifying, analysing and interpreting patterns of meaning (or "themes") within qualitative data.[1] Thematic analysis is often understood as a method or technique in contrast to most other qualitative analytic approaches - such as grounded theory, discourse analysis, narrative analysis and interpretative phenomenological analysis - which can be described as methodologies or theoretically informed frameworks for research (they specify guiding theory, appropriate research questions and methods of data collection, as well as procedures for conducting analysis). Thematic analysis is best thought of as an umbrella term for a variety of different approaches, rather than a singular method. Different versions of thematic analysis are underpinned by different philosophical and conceptual assumptions and are divergent in terms of procedure. Leading thematic analysis proponents, psychologists Virginia Braun and Victoria Clarke,[3] distinguish between three main types of thematic analysis: coding reliability approaches (examples include the approaches developed by Richard Boyatzis[4] and Greg Guest and colleagues[2]), code book approaches (these include approaches like framework analysis,[5] template analysis[6] and matrix analysis[7]) and reflexive approaches.[8][9] They describe their own widely used approach, first outlined in 2006 in the journal Qualitative Research in Psychology,[1] as reflexive thematic analysis.[10] Their 2006 paper has over 120,000 Google Scholar citations and according to Google Scholar is the most cited academic paper published in 2006. The popularity of this paper exemplifies the growing interest in thematic analysis as a distinct method (although some have questioned whether it is a distinct method or simply a generic set of analytic procedures[11]).


Thematic analysis is used in qualitative research and focuses on examining themes or patterns of meaning within data.[12] This method can emphasize both organization and rich description of the data set and theoretically informed interpretation of meaning.[1] Thematic analysis goes beyond simply counting phrases or words in a text (as in content analysis) and explores explicit and implicit meanings within the data.[2] Coding is the primary process for developing themes by identifying items of analytic interest in the data and tagging these with a coding label.[4] In some thematic analysis approaches coding follows theme development and is a deductive process of allocating data to pre-identified themes (this approach is common in coding reliability and code book approaches); in other approaches - notably Braun and Clarke's reflexive approach - coding precedes theme development and themes are built from codes.[3] One of the hallmarks of thematic analysis is its flexibility - flexibility with regard to framing theory, research questions and research design.[1] Thematic analysis can be used to explore questions about participants' lived experiences, perspectives, behaviour and practices, the factors and social processes that influence and shape particular phenomena, the explicit and implicit norms and 'rules' governing particular practices, as well as the social construction of meaning and the representation of social objects in particular texts and contexts.[13]

Thematic analysis can be used to analyse most types of qualitative data including qualitative data collected from interviews, focus groups, surveys, solicited diaries, visual methods, observation and field research, action research, memory work, vignettes, story completion and secondary sources. Data-sets can range from short, perfunctory responses to an open-ended survey question to hundreds of pages of interview transcripts.[14] Thematic analysis can be used to analyse both small and large data-sets.[1] Thematic analysis is often used in mixed-method designs - the theoretical flexibility of TA makes it a more straightforward choice than approaches with specific embedded theoretical assumptions.

Thematic analysis is sometimes claimed to be compatible with phenomenology in that it can focus on participants' subjective experiences and sense-making;[2] there is a long tradition of using thematic analysis in phenomenological research.[15] A phenomenological approach emphasizes the participants' perceptions, feelings and experiences as the paramount object of study. Rooted in humanistic psychology, phenomenology notes giving voice to the "other" as a key component in qualitative research in general. This approach allows the respondents to discuss the topic in their own words, free of constraints from fixed-response questions found in quantitative studies.

Thematic analysis is sometimes erroneously assumed to be only compatible with phenomenology or experiential approaches to qualitative research. Braun and Clarke argue that their reflexive approach is equally compatible with social constructionist, poststructuralist and critical approaches to qualitative research.[16] They emphasise the theoretical flexibility of thematic analysis and its use within realist, critical realist and relativist ontologies and positivist, contextualist and constructionist epistemologies.

Like most research methods, the process of thematic analysis of data can occur either inductively or deductively.[1] In an inductive approach, the themes identified are strongly linked to the data.[4] This means that the process of coding occurs without trying to fit the data into pre-existing theory or framework. But inductive learning processes in practice are rarely 'purely bottom up'; it is not possible for the researchers and their communities to free themselves completely from ontological (theory of reality), epistemological (theory of knowledge) and paradigmatic (habitual) assumptions - coding will always to some extent reflect the researcher's philosophical standpoint, and individual/communal values with respect to knowledge and learning.[1] Deductive approaches, on the other hand, are more theory-driven.[17] This form of analysis tends to be more interpretative because analysis is explicitly shaped and informed by pre-existing theory and concepts (ideally cited for transparency in the shared learning). Deductive approaches can involve seeking to identify themes identified in other research in the data-set or using existing theory as a lens through which to organise, code and interpret the data. Sometimes deductive approaches are misunderstood as coding driven by a research question or the data collection questions. A thematic analysis can also combine inductive and deductive approaches, for example in foregrounding interplay between a priori ideas from clinician-led qualitative data analysis teams and those emerging from study participants and the field observations.[18]

Different approaches to thematic analysis

Coding reliability[4][2] approaches have the longest history and are often little different from qualitative content analysis. As the name suggests they prioritise the measurement of coding reliability through the use of structured and fixed code books, the use of multiple coders who work independently to apply the code book to the data, the measurement of inter-rater reliability or inter-coder agreement (typically using Cohen's Kappa) and the determination of final coding through consensus or agreement between coders. These approaches are a form of qualitative positivism or small q qualitative research,[19] which combine the use of qualitative data with data analysis processes and procedures based on the research values and assumptions of (quantitative) positivism - emphasising the importance of establishing coding reliability and viewing researcher subjectivity or 'bias' as a potential threat to coding reliability that must be contained and 'controlled for' to avoid confounding the 'results' (with the presence and active influence of the researcher). Boyatzis[4] presents his approach as one that can 'bridge the divide' between quantitative (positivist) and qualitative (interpretivist) paradigms. Some qualitative researchers are critical of the use of structured code books, multiple independent coders and inter-rater reliability measures. Janice Morse argues that such coding is necessarily coarse and superficial to facilitate coding agreement.[20] Braun and Clarke (citing Yardley[21]) argue that all coding agreement demonstrates is that coders have been trained to code in the same way, not that coding is 'reliable' or 'accurate' with respect to the underlying phenomena that are coded and described.[13]
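
As an illustration of the inter-coder agreement measure mentioned above, the following minimal sketch computes Cohen's Kappa for two coders as (observed agreement - chance agreement) / (1 - chance agreement). The function name, the code labels and the six segments are hypothetical and chosen only for demonstration; real studies would normally rely on an established statistics package.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes applied by two coders to the same six data segments.
coder_a = ["support", "support", "barrier", "barrier", "support", "barrier"]
coder_b = ["support", "barrier", "barrier", "barrier", "support", "support"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

In this invented example the coders agree on four of six segments, but half of that agreement would be expected by chance, so Kappa is considerably lower than the raw agreement rate.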

Code book approaches like framework analysis,[5] template analysis[6] and matrix analysis[7] centre on the use of structured code books but - unlike coding reliability approaches - emphasise to a greater or lesser extent qualitative research values. Both coding reliability and code book approaches typically involve early theme development - with all or some themes developed prior to coding, often following some data familiarisation (reading and re-reading data to become intimately familiar with its contents). Once themes have been developed the code book is created - this might involve some initial analysis of a portion of or all of the data. The data is then coded. Coding involves allocating data to the pre-determined themes using the code book as a guide. The code book can also be used to map and display the occurrence of codes and themes in each data item. Themes are often of the shared topic type discussed by Braun and Clarke.[3]
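
The idea of using a code book to map the occurrence of codes and themes in each data item can be sketched as a simple matrix. This is only an illustrative sketch, not the procedure of framework, template or matrix analysis; the theme names, codes and interview identifiers are all invented.

```python
# Hypothetical code book: each pre-determined theme lists the codes allocated to it.
code_book = {
    "Barriers to access": ["cost", "distance"],
    "Sources of support": ["family", "peers"],
}

# Hypothetical coded data: data item -> codes applied to it.
coded_items = {
    "interview_01": ["cost", "family"],
    "interview_02": ["distance", "cost"],
    "interview_03": ["peers"],
}

def theme_matrix(code_book, coded_items):
    """Return {data item: {theme: number of applied codes belonging to that theme}}."""
    matrix = {}
    for item, codes in coded_items.items():
        matrix[item] = {
            theme: sum(1 for code in codes if code in theme_codes)
            for theme, theme_codes in code_book.items()
        }
    return matrix

matrix = theme_matrix(code_book, coded_items)
for item, row in matrix.items():
    print(item, row)
```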

Reflexive approaches centre organic and flexible coding processes. There is no code book, and coding can be undertaken by one researcher; if multiple researchers are involved in coding, this is conceptualised as a collaborative process rather than one that should lead to consensus. Individual codes are not fixed - they can evolve throughout the coding process, the boundaries of the code can be redrawn, codes can be split into two or more codes, collapsed with other codes and even promoted to themes.[13] Reflexive approaches typically involve later theme development - with themes created from clustering together similar codes. Themes should capture shared meaning organised around a central concept or idea.[22]

Braun and Clarke and colleagues have been critical of a tendency to overlook the diversity within thematic analysis and the failure to recognise the differences between the various approaches they have mapped out.[23] They argue that this failure leads to unthinking 'mash-ups' of their approach with incompatible techniques and approaches such as code books, consensus coding and measurement of inter-rater reliability.


There is no one definition or conceptualisation of a theme in thematic analysis.[24] For some thematic analysis proponents, including Braun and Clarke, themes are conceptualised as patterns of shared meaning across data items, underpinned or united by a central concept, which are important to the understanding of a phenomenon and are relevant to the research question.[3] For others (including most coding reliability and code book proponents), themes are simply summaries of information related to a particular topic or data domain; there is no requirement for shared meaning organised around a central concept, just a shared topic.[3] Although these two conceptualisations are associated with particular approaches to thematic analysis, they are often confused and conflated. What Braun and Clarke call domain summary or topic summary themes often have one word theme titles (e.g. Gender, Support) or titles like 'Benefits of...', 'Barriers to...' signalling the focus on summarising everything participants said, or the main points raised, in relation to a particular topic or data domain.[3] Topic summary themes are typically developed prior to data coding and often reflect data collection questions. Shared meaning themes that are underpinned by a central concept or idea[22] cannot be developed prior to coding (because they are built from codes), so are the output of a thorough and systematic coding process. Braun and Clarke have been critical of the confusion of topic summary themes with their conceptualisation of themes as capturing shared meaning underpinned by a central concept.[25] Some qualitative researchers have argued that topic summaries represent an under-developed analysis or analytic foreclosure.[26][27]

There is controversy around the notion that 'themes emerge' from data. Braun and Clarke are critical of this language because they argue it positions themes as entities that exist fully formed in data - the researcher is simply a passive witness to the themes 'emerging' from the data.[1] Instead they argue that the researcher plays an active role in the creation of themes - so themes are constructed, created, generated rather than simply emerging. Others use the term deliberately to capture the inductive (emergent) creation of themes. However, it is not always clear how the term is being used.

Prevalence or recurrence is not necessarily the most important criterion in determining what constitutes a theme; themes can be considered important if they are highly relevant to the research question and significant in understanding the phenomena of interest.[1] Theme prevalence does not necessarily mean the frequency at which a theme occurs (i.e. the number of data items in which it occurs); it can also mean how much data a theme captures within each data item and across the data-set. Themes are typically evident across the data set, but a higher frequency does not necessarily mean that the theme is more important to understanding the data. A researcher's judgement is the key tool in determining which themes are more crucial.[1]
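
The two senses of prevalence described above (the number of data items in which a theme occurs, and how much data it captures) can diverge. The sketch below, using invented extracts and word counts, shows two hypothetical themes that occur in the same number of interviews but capture very different shares of the coded data. Neither figure replaces the researcher's judgement about a theme's importance.

```python
# Hypothetical coded extracts: (data item, theme, number of words in the extract).
extracts = [
    ("interview_01", "theme_a", 120),
    ("interview_01", "theme_b", 40),
    ("interview_02", "theme_a", 30),
    ("interview_03", "theme_b", 300),
    ("interview_03", "theme_b", 110),
]

def prevalence(extracts, theme):
    """Return (number of data items containing the theme, share of coded words it captures)."""
    items = {item for item, t, _ in extracts if t == theme}
    theme_words = sum(words for _, t, words in extracts if t == theme)
    total_words = sum(words for _, _, words in extracts)
    return len(items), theme_words / total_words

items_a, share_a = prevalence(extracts, "theme_a")
items_b, share_b = prevalence(extracts, "theme_b")
print(items_a, share_a)
print(items_b, share_b)
```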

There are also different levels at which data can be coded and themes can be identified—semantic and latent.[4][1] A thematic analysis can focus on one of these levels or both. Semantic codes and themes identify the explicit and surface meanings of the data. The researcher does not look beyond what the participant said or wrote. Conversely, latent codes or themes capture underlying ideas, patterns, and assumptions. This requires a more interpretative and conceptual orientation to the data.

For Braun and Clarke, there is a clear (but not absolute) distinction between a theme and a code - a code captures one (or more) insights about the data and a theme encompasses numerous insights organised around a central concept or idea. They often use the analogy of a brick and tile house - the code is an individual brick or tile, and themes are the walls or roof panels, each made up of numerous codes. Other approaches to thematic analysis don't make such a clear distinction between codes and themes - several texts recommend that researchers "code for themes".[28] This can be confusing because for Braun and Clarke, and others, the theme is considered the outcome or result of coding, not that which is coded. In approaches that make a clear distinction between codes and themes, the code is the label that is given to particular pieces of the data that contribute to a theme. For example, "SECURITY can be a code, but A FALSE SENSE OF SECURITY can be a theme."[28]
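
The brick-and-wall analogy can be pictured as a small data structure in which a theme is built from several codes, each of which tags extracts of data. This is only an illustrative sketch: the class names, the second code, the central concept and the extracts are invented, and only the SECURITY / A FALSE SENSE OF SECURITY pair comes from the example quoted above.

```python
from dataclasses import dataclass, field

@dataclass
class Code:
    label: str
    extracts: list = field(default_factory=list)  # data segments tagged with this code

@dataclass
class Theme:
    name: str
    central_concept: str
    codes: list = field(default_factory=list)  # the "bricks" that make up this "wall"

    def extracts(self):
        """All data extracts gathered under the theme via its codes."""
        return [e for code in self.codes for e in code.extracts]

# Invented codes and extracts for illustration.
security = Code("SECURITY", ["I lock everything twice."])
alarms = Code("RELIANCE ON ALARMS", ["The alarm means I never worry.", "It has never gone off."])
theme = Theme(
    name="A FALSE SENSE OF SECURITY",
    central_concept="feeling safe without being safe",
    codes=[security, alarms],
)
print(theme.name, len(theme.extracts()))
```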

Methodological issues

Reflexivity journals

Given that qualitative work is inherently interpretive research, the positionings, values, and judgments of the researchers need to be explicitly acknowledged so they are taken into account in making sense of the final report and judging its quality.[29] This type of openness and reflection is considered to be positive in the qualitative community.[30] Researchers shape the work that they do and are the instrument for collecting and analyzing data. In order to acknowledge the researcher as the tool of analysis, it is useful to create and maintain a reflexivity journal.[31]

The reflexivity process can be described as the researcher reflecting on and documenting how their values, positionings, choices and research practices influenced and shaped the study and the final analysis of the data. Reflexivity journals are somewhat similar to the use of analytic memos or memo writing in grounded theory, which can be useful for reflecting on the developing analysis and potential patterns, themes and concepts.[14] Throughout the coding process researchers should have detailed records of the development of each of their codes and potential themes. In addition, changes made to themes and connections between themes can be discussed in the final report to assist the reader in understanding decisions that were made throughout the coding process.[32]

Once data collection is complete and researchers begin the data analysis phases, they should make notes on their initial impressions of the data. The logging of ideas for future analysis can aid in getting thoughts and reflections written down and may serve as a reference for potential coding ideas as one progresses from one phase to the next in the thematic analysis process.[14]

Coding practice

Questions to consider whilst coding may include:[14]

  • What are people doing? What are they trying to accomplish?
  • How exactly do they do this? What specific means or strategies are used?
  • How do people talk about and understand what is going on?
  • What assumptions are they making?
  • What do I see going on here? What did I learn from note taking?
  • Why did I include them?

Such questions are generally asked throughout all cycles of the coding process and the data analysis. A reflexivity journal is often used to identify potential codes that were not initially pertinent to the study.[14]

Sample size considerations

There is no straightforward answer to questions of sample size in thematic analysis; just as there is no straightforward answer to sample size in qualitative research more broadly (the classic answer is 'it depends' - on the scope of the study, the research question and topic, the method or methods of data collection, the richness of individual data items, the analytic approach[33]). Some coding reliability and code book proponents provide guidance for determining sample size in advance of data analysis - focusing on the concept of saturation or information redundancy (no new information, codes or themes are evident in the data). These attempts to 'operationalise' saturation suggest that code saturation (often defined as identifying one instance of a code) can be achieved in as few as 12 or even 6 interviews in some circumstances.[34] Meaning saturation - developing a "richly textured" understanding of issues - is thought to require larger samples (at least 24 interviews).[35] There are numerous critiques of the concept of data saturation - many argue it is embedded within a realist conception of fixed meaning and in a qualitative paradigm there is always potential for new understandings because of the researcher's role in interpreting meaning.[36] Some quantitative researchers have offered statistical models for determining sample size in advance of data collection in thematic analysis. For example, Fugard and Potts offered a prospective, quantitative tool to support thinking on sample size by analogy to quantitative sample size estimation methods.[37] Lowe and colleagues proposed quantitative, probabilistic measures of degree of saturation that can be calculated from an initial sample and used to estimate the sample size required to achieve a specified level of saturation.[38] Their analysis indicates that commonly-used binomial sample size estimation methods may significantly underestimate the sample size required for saturation. All of these tools have been criticised by qualitative researchers (including Braun and Clarke[39]) for relying on assumptions about qualitative research, thematic analysis and themes that are antithetical to approaches that prioritise qualitative research values.[40][41][42]
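
To make the binomial reasoning mentioned above concrete, the sketch below finds the smallest sample in which a theme with an assumed population prevalence would be observed at least a given number of times with a chosen probability. It is a generic textbook calculation under strong assumptions (independent participants, a fixed and known prevalence) and is not claimed to reproduce the tools of Fugard and Potts or of Lowe and colleagues. The function names and parameter values are hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

def min_sample_size(prevalence, instances=1, power=0.8, max_n=10_000):
    """Smallest n giving at least `power` probability of `instances` or more occurrences."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    for n in range(instances, max_n + 1):
        if prob_at_least(instances, n, prevalence) >= power:
            return n
    raise ValueError("no sample size up to max_n reaches the requested power")

# Assumed inputs: a theme held by 10% of the population, to be seen at least once
# with 80% probability.
n_needed = min_sample_size(prevalence=0.10, instances=1, power=0.8)
print(n_needed)
```

With these illustrative inputs the function returns 16, because 0.9 raised to the 16th power is the first value to fall below 0.2. As the paragraph above notes, calculations of this kind have been criticised both for underestimating the sample needed for saturation and for resting on assumptions at odds with qualitative research values.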

Braun and Clarke's six phases of thematic analysis

| Phase[1] | Process | Result | Reflexivity journal entries[1] |
|---|---|---|---|
| Phase 1 | Read and re-read data in order to become familiar with what the data entails, paying specific attention to patterns that occur. | Preliminary "start" codes and detailed notes. | List start codes in journal, along with a description of what each code means and the source of the code. |
| Phase 2 | Generate the initial codes by documenting where and how patterns occur. This happens through data reduction where the researcher collapses data into labels in order to create categories for more efficient analysis. Data complication is also completed here. This involves the researcher making inferences about what the codes mean. | Comprehensive codes of how data answers research question. | Provide detailed information as to how and why codes were combined, what questions the researcher is asking of the data, and how codes are related. |
| Phase 3 | Combine codes into overarching themes that accurately depict the data. It is important in developing themes that the researcher describes exactly what the themes mean, even if the theme does not seem to "fit". The researcher should also describe what is missing from the analysis. | List of candidate themes for further analysis. | Reflexivity journals need to note how the codes were interpreted and combined to form themes. |
| Phase 4 | In this stage, the researcher looks at how the themes support the data and the overarching theoretical perspective. If the analysis seems incomplete, the researcher needs to go back and find what is missing. | Coherent recognition of how themes are patterned to tell an accurate story about the data. | Notes need to include the process of understanding themes and how they fit together with the given codes. Answers to the research questions and data-driven questions need to be abundantly complex and well-supported by the data. |
| Phase 5 | The researcher needs to define what each theme is, which aspects of data are being captured, and what is interesting about the themes. | A comprehensive analysis of what the themes contribute to understanding the data. | The researcher should describe each theme within a few sentences. |
| Phase 6 | When the researchers write the report, they must decide which themes make meaningful contributions to understanding what is going on within the data. Researchers should also conduct "member checking". This is where the researchers go back to the sample at hand to see if their description is an accurate representation. | A thick description of the results. | Note why particular themes are more useful at making contributions and understanding what is going on within the data set. Describe the process of choosing the way in which the results would be reported. |

Phase 1: Becoming familiar with the data

This six-phase process for thematic analysis is based on the work of Braun and Clarke and their reflexive approach to thematic analysis.[1][43] This six phase cyclical process involves going back and forth between phases of data analysis as needed until you are satisfied with the final themes.[1] Researchers conducting thematic analysis should attempt to go beyond surface meanings of the data to make sense of the data and tell a rich and compelling story about what the data means.[1] The procedures associated with other thematic analysis approaches are rather different. This description of Braun and Clarke's six phase process also includes some discussion of the contrasting insights provided by other thematic analysis proponents. The initial phase in reflexive thematic analysis is common to most approaches - that of data familiarisation. This is where researchers familiarize themselves with the content of their data - both the detail of each data item and the 'bigger picture'. In other approaches, prior to reading the data, researchers may create a "start list" of potential codes.[44] As Braun and Clarke's approach is intended to focus on the data and not the researcher's prior conceptions they only recommend developing codes prior to familiarisation in deductive approaches where coding is guided by pre-existing theory. For Miles and Huberman, in their matrix approach, "start codes" should be included in a reflexivity journal with a description of representations of each code and where the code is established.[44] Analyzing data in an active way will assist researchers in searching for meanings and patterns in the data set. At this stage, it is tempting to rush this phase of familiarisation and immediately start generating codes and themes; however, this process of immersion will aid researchers in identifying possible themes and patterns. Reading and re-reading the material until the researcher is comfortable is crucial to the initial phase of analysis. 
While becoming familiar with the material, note-taking is a crucial part of this step in order to begin developing potential codes.[1]


After completing data collection, the researcher may need to transcribe their data into written form (e.g. audio recorded data such as interviews).[1] Braun and Clarke provide a transcription notation system for use with their approach in their textbook Successful Qualitative Research. Quality transcription of the data is imperative to the dependability of analysis. Criteria for transcription of data must be established before the transcription phase is initiated to ensure that dependability is high.[2]

Some thematic analysis proponents - particularly those with a foothold in positivism - express concern about the accuracy of transcription.[2] Inconsistencies in transcription can produce 'biases' in data analysis that will be difficult to identify later in the analysis process.[2] For others, including Braun and Clarke, transcription is viewed as an interpretative and theoretically embedded process and therefore cannot be 'accurate' in a straightforward sense, as the researcher always makes choices about how to translate spoken into written text.[1] However, this does not mean that researchers shouldn't strive for thoroughness in their transcripts and use a systematic approach to transcription. Authors should ideally provide a key for their system of transcription notation so it's readily apparent what particular notations mean. Inserting comments like "*voice lowered*" will signal a change in the speech. As a rough guideline when planning time for transcribing, allow 15 minutes of transcription for every 5 minutes of dialog. Transcription can form part of the familiarisation process.[1][13]
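
The rough planning guideline of 15 minutes of transcription per 5 minutes of dialog amounts to a 3:1 ratio. The short sketch below applies it to a hypothetical study of ten 45-minute interviews; the function name and the study figures are assumptions for illustration, and actual transcription speed will vary with audio quality and the level of notational detail.

```python
def transcription_minutes(recording_minutes, ratio=15 / 5):
    """Estimate transcription time from recording length using the rough 15:5 guideline."""
    if recording_minutes < 0:
        raise ValueError("recording length cannot be negative")
    return recording_minutes * ratio

# Hypothetical study: ten interviews of about 45 minutes each.
total_hours = transcription_minutes(10 * 45) / 60
print(total_hours)
```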

After this stage, the researcher should feel familiar with the content of the data and should be able to start to identify overt patterns or repeating issues in the data. These patterns should be recorded in a reflexivity journal where they will be of use when coding data. Other TA proponents conceptualise coding as the researcher beginning to gain control over the data. They view it as important to mark data that addresses the research question. For them, this is the beginning of the coding process.[2]

Phase 2: Generating codes

The second step in reflexive thematic analysis is tagging items of interest in the data with a label (a few words or a short phrase). This label should clearly evoke the relevant features of the data - this is important for later stages of theme development. This systematic way of organizing and identifying meaningful parts of data as it relates to the research question is called coding. The coding process evolves through the researcher's immersion in their data and is not considered to be a linear process, but a cyclical process in which codes are developed and refined.

The coding process is rarely completed from one sweep through the data. Saldaña recommends that each time researchers work through the data set, they should strive to refine codes by adding, subtracting, combining or splitting potential codes.[14] For Miles and Huberman, "start codes" are produced through terminology used by participants during the interview and can be used as a reference point of their experiences during the interview.[44] For more positivist-inclined thematic analysis proponents, dependability increases when the researcher uses concrete codes that are based on dialogue and are descriptive in nature.[2] These codes will facilitate the researcher's ability to locate pieces of data later in the process and identify why they included them. However, Braun and Clarke urge researchers to look beyond a sole focus on description and summary and engage interpretatively with data - exploring both overt (semantic) and implicit (latent) meaning.[1] Coding sets the stage for detailed analysis later by allowing the researcher to reorganize the data according to the ideas that have been obtained throughout the process. Reflexivity journal entries for new codes serve as a reference point to the participant and their data section, reminding the researcher to understand why and where they will include these codes in the final analysis.[2] Throughout the coding process, full and equal attention needs to be paid to each data item because it will help in the identification of otherwise unnoticed repeated patterns. Coding as inclusively as possible is important - coding individual aspects of the data that may seem irrelevant can potentially be crucial later in the analysis process.[1]
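
The refinement of codes across successive passes (combining and splitting) can be sketched with two small helper functions operating on a mapping from code labels to extracts. The helper names, code labels and extracts are hypothetical, and in practice the decisions about what to merge or split rest on the researcher's interpretative judgement rather than on a rule such as the keyword test used here.

```python
def merge_codes(coding, sources, new_label):
    """Collapse several codes into one, pooling their extracts."""
    merged = [extract for label in sources for extract in coding[label]]
    result = {label: list(ex) for label, ex in coding.items() if label not in sources}
    result[new_label] = result.get(new_label, []) + merged
    return result

def split_code(coding, source, assign):
    """Split one code; `assign` maps each extract to the label it should move to."""
    result = {label: list(ex) for label, ex in coding.items() if label != source}
    for extract in coding[source]:
        result.setdefault(assign(extract), []).append(extract)
    return result

# Hypothetical first-pass coding: code label -> extracts tagged with it.
pass_one = {
    "tired": ["I can barely get up."],
    "exhausted": ["I am worn out by noon."],
    "money": ["rent is high", "wages are low"],
}
# Second pass: two overlapping codes are combined.
pass_two = merge_codes(pass_one, ["tired", "exhausted"], "fatigue")
# Third pass: a broad code is split into two narrower ones.
pass_three = split_code(
    pass_two, "money", lambda e: "housing costs" if "rent" in e else "low pay"
)
print(sorted(pass_three))
```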

For sociologists Coffey and Atkinson, coding also involves the process of data reduction and complication.[45] Reduction of codes is initiated by assigning tags or labels to the data set based on the research question(s). In this stage, condensing large data sets into smaller units permits further analysis of the data by creating useful categories. In-vivo codes are also produced by applying references and terminology from the participants in their interviews. Coding aids in development, transformation and re-conceptualization of the data and helps to find more possibilities for analysis. Researchers should ask questions related to the data and generate theories from the data, extending beyond what has been reported in previous research.[45]

Data reduction (Coffey and Atkinson[45])

For some thematic analysis proponents, coding can be thought of as a means of reduction of data or data simplification (this is not the case for Braun and Clarke who view coding as both data reduction and interpretation). For Coffey and Atkinson, simple but broad analytic codes make it possible to reduce the data to a more manageable size. In this stage of data analysis the analyst must focus on identifying a simpler way of organizing data. As part of data reduction, researchers should index the data texts, which could include field notes, interview transcripts, or other documents. Data at this stage are reduced to classes or categories in which the researcher is able to identify segments of the data that share a common category or code.[45] Seidel and Kelle suggested three ways to aid with the process of data reduction and coding: (a) noticing relevant phenomena, (b) collecting examples of the phenomena, and (c) analyzing phenomena to find similarities, differences, patterns and overlying structures. This aspect of data coding is important because during this stage researchers should be attaching codes to the data to allow the researcher to think about the data in different ways.[45] Coding cannot be viewed strictly as data reduction; data complication can also be used to open up the data for further examination.[45] The section below addresses Coffey and Atkinson's process of data complication and its significance to data analysis in qualitative analysis.[45]

Data complication (Coffey and Atkinson[45])

For Coffey and Atkinson, the process of creating codes can be described as both data reduction and data complication. Data complication can be described as going beyond the data and asking questions about the data to generate frameworks and theories. The complication of data is used to expand on data to create new questions and interpretation of the data. Researchers should make certain that the coding process does not lose more information than is gained.[45] Tesch defined data complication as the process of reconceptualizing the data giving new contexts for the data segments. Data complication serves as a means of providing new contexts for the way data is viewed and analyzed.[45]

Coding is a process of breaking data up analytically in order to produce questions about the data and to provide provisional answers about relationships within and among the data.[45] Decontextualizing and recontextualizing help to reduce and expand the data in new ways with new theories.[45]

Phase 3: Generating initial themes

Searching for themes and considering what works and what does not work within themes enables the researcher to begin the analysis of potential codes. In this phase, it is important to begin by examining how codes combine to form overarching themes in the data. At this point, researchers have a list of themes and begin to focus on broader patterns in the data, combining coded data with proposed themes. Researchers also begin considering how relationships are formed between codes and themes and between different levels of existing themes. It may be helpful to use visual models to sort codes into the potential themes.[1]
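Sorting codes into potential themes can be sketched as a simple mapping. This is a minimal illustration, not a procedure prescribed by Braun and Clarke; the code labels, theme names and helper functions below are hypothetical.

```python
# Hypothetical code index: code label -> list of coded extracts.
code_index = {
    "asking for help": ["I never know who to ask.", "I emailed three people."],
    "manager check-ins": ["My manager checks in weekly."],
    "unclear rota": ["Nobody explained the new rota."],
    "feeling watched": ["It feels like surveillance."],
}

# Hypothetical candidate themes: each is a provisional grouping of codes.
candidate_themes = {
    "Navigating support": ["asking for help", "manager check-ins"],
    "Living with ambiguity": ["unclear rota"],
}

def collate_extracts(themes, index):
    """Gather all coded extracts under each candidate theme."""
    return {
        theme: [extract for code in codes for extract in index.get(code, [])]
        for theme, codes in themes.items()
    }

def unassigned_codes(themes, index):
    """List codes not yet placed in any candidate theme (kept, not discarded)."""
    assigned = {code for codes in themes.values() for code in codes}
    return sorted(set(index) - assigned)

collated = collate_extracts(candidate_themes, code_index)
leftover = unassigned_codes(candidate_themes, code_index)
```

Keeping a list of unassigned codes, rather than dropping them, leaves room for them to join a theme later in the analysis.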

Themes differ from codes in that themes are phrases or sentences that identify what the data mean. They describe an outcome of coding for analytic reflection. Themes consist of ideas and descriptions within a culture that can be used to explain causal events, statements, and morals derived from the participants' stories. In subsequent phases, it is important to narrow down the potential themes to provide an overarching theme. Thematic analysis allows categories or themes to emerge from the data, such as the following: repeating ideas; indigenous terms, metaphors and analogies; shifts in topic; and similarities and differences in participants' linguistic expression. It is important at this point to address not only what is present in the data, but also what is missing from the data.[14] The conclusion of this phase should yield many candidate themes collected throughout the data process. It is crucial to avoid discarding themes even if they initially seem insignificant, as they may become important later in the analysis process.[1]

Phase 4: Reviewing themes

This phase requires researchers to check their initial themes against the coded data and the entire data-set - this is to ensure the analysis hasn't drifted too far from the data and provides a compelling account of the data relevant to the research question. This process of review also allows for further expansion on and revision of themes as they develop. At this point, researchers should have a set of potential themes, as this phase is where the reworking of initial themes takes place. Some existing themes may collapse into each other, while other themes may need to be condensed into smaller units or let go of altogether.[1]

Specifically, this phase involves two levels of refining and reviewing themes. Connections between overlapping themes may serve as important sources of information and can alert researchers to the possibility of new patterns and issues in the data. For Guest and colleagues, deviations from coded material can notify the researcher that a theme may not actually be useful for making sense of the data and should be discarded. Both of these observations, along with the absence of themes, should be noted in the researcher's reflexivity journal.[2] Codes serve as a way to relate data to a person's conception of that concept. At this point, the researcher should focus on interesting aspects of the codes and why they fit together.[2]

Level 1 (Reviewing the themes against the coded data)

Reviewing coded data extracts allows researchers to identify whether themes form coherent patterns. If this is the case, researchers should move on to Level 2. If themes do not form coherent patterns, consideration of the potentially problematic themes is necessary.[1] If themes are problematic, it is important to rework them, and during this process new themes may develop.[1] For example, it is problematic when themes do not appear to 'work' (capture something compelling about the data) or when there is a significant amount of overlap between themes. This can result in a weak or unconvincing analysis of the data. If this occurs, data may need to be reorganized in order to create cohesive, mutually exclusive themes.[1]
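One way to spot heavy overlap between themes is to compare the sets of extracts collated under each. The sketch below is an assumption-laden aid and not part of any published thematic analysis procedure: it uses the Jaccard index as the overlap measure, the 0.5 threshold is arbitrary, and the theme names and extract IDs are hypothetical. A flagged pair is only a prompt for the researcher's own judgement.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of extracts two themes have in common (0 = none, 1 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def flag_overlapping_themes(theme_extracts, threshold=0.5):
    """Return theme pairs whose extract overlap meets an (arbitrary) threshold."""
    flagged = []
    pairs = combinations(sorted(theme_extracts.items()), 2)
    for (name_a, extracts_a), (name_b, extracts_b) in pairs:
        score = jaccard(extracts_a, extracts_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged

# Hypothetical themes mapped to the IDs of extracts collated under them.
theme_extracts = {
    "Navigating support": {"e1", "e2", "e3", "e4"},
    "Seeking help": {"e2", "e3", "e4"},
    "Living with ambiguity": {"e5", "e6"},
}

overlaps = flag_overlapping_themes(theme_extracts)
```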

Level 2 (Reviewing the themes against the entire data-set)

Considering the validity of individual themes and how they connect to the data set as a whole is the next stage of review. It is imperative to assess whether the potential thematic map meaningfully captures the important information in the data relevant to the research question. Once again, at this stage it is important to read and re-read the data to determine whether current themes relate back to the data set. To assist in this process it is imperative to code any additional items that may have been missed in the initial coding stage. If the potential map 'works' to meaningfully capture and tell a coherent story about the data, then the researcher should progress to the next phase of analysis. If the map does not work, it is crucial to return to the data in order to continue to review and refine existing themes and perhaps even undertake further coding. Mismatches between data and analytic claims reduce the amount of support that can be provided by the data. This can be avoided if the researcher is certain that their interpretations of the data and analytic insights correspond.[1] Researchers repeat this process until they are satisfied with the thematic map. By the end of this phase, researchers have an idea of what the themes are and how they fit together so that they convey a story about the data set.[1]
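A simple bookkeeping check can support this re-reading by showing which data items contribute no extracts to any theme and how widely each theme is spread across the data set. This is a rough sketch under stated assumptions: the function name `review_against_dataset`, the data item IDs and the extracts are hypothetical, and the output points to items worth re-reading rather than judging whether the thematic map 'works'.

```python
def review_against_dataset(data_items, theme_extracts):
    """Summarise how candidate themes relate to the whole data set.

    data_items: list of data item IDs (e.g. interviews).
    theme_extracts: theme name -> list of (data item ID, extract text).
    """
    covered = {item for extracts in theme_extracts.values() for item, _ in extracts}
    spread = {
        theme: sorted({item for item, _ in extracts})
        for theme, extracts in theme_extracts.items()
    }
    return {
        "items_without_extracts": sorted(set(data_items) - covered),
        "items_per_theme": spread,
    }

# Hypothetical data set and candidate themes.
data_items = ["interview_01", "interview_02", "interview_03"]
theme_extracts = {
    "Navigating support": [
        ("interview_01", "I never know who to ask."),
        ("interview_02", "My manager checks in weekly."),
    ],
    "Living with ambiguity": [("interview_01", "Nobody explained the new rota.")],
}

summary = review_against_dataset(data_items, theme_extracts)
```

Here `interview_03` would be listed as having no extracts, which may suggest that it needs further coding or that the current themes do not speak to it.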

Phase 5: Defining and naming themes

Defining and refining existing themes that will be presented in the final analysis assists the researcher in analyzing the data within each theme. At this phase, identification of the themes' essences relates to how each specific theme forms part of the entire picture of the data. Analysis at this stage is characterized by identifying which aspects of the data are being captured, what is interesting about the themes, and how the themes fit together to tell a coherent and compelling story about the data.

In order to identify whether current themes contain sub-themes and to discover further depth of themes, it is important to consider themes within the whole picture and also as autonomous themes. Braun and Clarke recommend caution about developing many sub-themes and many levels of themes as this may lead to an overly fragmented analysis.[46] Researchers must then conduct and write a detailed analysis to identify the story of each theme and its significance.[1] By the end of this phase, researchers can (1) define what current themes consist of, and (2) explain each theme in a few sentences. At this point researchers also begin thinking about names for themes that will give the reader a full sense of each theme and its importance.[1] Failure to fully analyze the data occurs when researchers do not use the data to support their analysis beyond simply describing or paraphrasing the content of the data. Researchers conducting thematic analysis should attempt to go beyond surface meanings of the data to make sense of the data and tell an accurate story of what the data mean.[1]

Phase 6: Producing the report

After final themes have been reviewed, researchers begin the process of writing the final report. While writing the final report, researchers should decide on themes that make meaningful contributions to answering the research questions; these are later refined into final themes. For coding reliability proponents Guest and colleagues, researchers present the dialogue connected with each theme in support of increasing dependability through a thick description of the results.[2] The goal of this phase is to write the thematic analysis to convey the complicated story of the data in a manner that convinces the reader of the validity and merit of the analysis.[1] A clear, concise, and straightforward logical account of the story across and within themes is important for readers to understand the final report. The write-up of the report should contain enough evidence that themes within the data are relevant to the data set. Extracts should be included in the narrative to capture the full meaning of the points in the analysis. The argument should be in support of the research question. For some thematic analysis proponents, the final step in producing the report is to include member checking as a means to establish credibility: researchers should consider taking final themes and supporting dialogue to participants to elicit feedback.[2] However, Braun and Clarke are critical of the practice of member checking and do not generally view it as a desirable practice in their reflexive approach to thematic analysis.[13] As well as highlighting numerous practical concerns around member checking, they argue that it is only theoretically coherent with approaches that seek to describe and summarise participants' accounts in ways that would be recognisable to them.[13] Given that their reflexive thematic analysis approach centres the active, interpretive role of the researcher, this may not apply to analyses generated using their approach.

Advantages and disadvantages

A technical or pragmatic view of research design centres researchers conducting qualitative analysis using the most appropriate method for the research question.[13] However, there is rarely only one ideal or suitable method, so other criteria for selecting methods of analysis are often used, such as the researcher's theoretical commitments and their familiarity with particular methods. Thematic analysis provides a flexible method of data analysis and allows researchers with various methodological backgrounds to engage in this type of analysis.[1] For positivists, 'reliability' is a concern because of the numerous potential interpretations of data possible and the potential for researcher subjectivity to 'bias' or distort the analysis. For those committed to qualitative research values, researcher subjectivity is viewed as a resource (rather than a threat to credibility), and so concerns about reliability do not hold. There is no one correct or accurate interpretation of data; interpretations are inevitably subjective and reflect the positioning of the researcher. Quality is achieved through a systematic and rigorous approach and through the researcher continually reflecting on how they are shaping the developing analysis. Braun and Clarke have developed a 15-point quality checklist for their reflexive approach. For coding reliability thematic analysis proponents, the use of multiple coders and the measurement of coding agreement are vital.[2]
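The text above does not say how coding agreement is measured. As an illustration only, the sketch below assumes Cohen's kappa, a commonly used agreement statistic for two coders, defined as (observed agreement - expected agreement) / (1 - expected agreement). The code labels and coder data are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")
    n = len(coder_a)
    # Proportion of segments on which the two coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's own code frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used a single identical code throughout.
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same six segments.
coder_a = ["support", "support", "uncertainty", "uncertainty", "support", "other"]
coder_b = ["support", "uncertainty", "uncertainty", "uncertainty", "support", "other"]

kappa = cohens_kappa(coder_a, coder_b)
```

A statistic of this kind fits the coding reliability approaches described above; for reflexive approaches, which treat researcher subjectivity as a resource, agreement measures are not a quality criterion.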

Thematic analysis has several advantages and disadvantages; it is up to the researchers to decide whether this method of analysis is suitable for their research design.


Advantages

  • The theoretical and research design flexibility it allows researchers - multiple theories can be applied to this process across a variety of epistemologies.[1]
  • Well suited to large data sets.[2][1]
  • Code book and coding reliability approaches are designed for use with research teams.
  • Interpretation of themes supported by data.[2]
  • Applicable to research questions that go beyond an individual's experience.[2]
  • Allows for inductive development of codes and themes from data.[14]


Disadvantages

  • Thematic analysis may miss nuanced data if the researcher is not careful and uses thematic analysis in a theoretical vacuum.[2][1]
  • Flexibility can make it difficult for novice researchers to decide what aspects of the data to focus on.[1]
  • Limited interpretive power if analysis is not grounded in a theoretical framework.[1]
  • Difficult to maintain sense of continuity of data in individual accounts because of the focus on identifying themes across data items.[1]
  • Does not allow researchers to make technical claims about language usage (unlike discourse analysis and narrative analysis).[1]


See also


  1. Braun, Virginia; Clarke, Victoria (2006). "Using thematic analysis in psychology". Qualitative Research in Psychology. 3 (2): 77–101. doi:10.1191/1478088706qp063oa. hdl:10125/42031. S2CID 10075179.
  2. Guest, Greg; MacQueen, Kathleen; Namey, Emily (2012). Applied thematic analysis. Thousand Oaks, California: SAGE Publications. p. 11.
  3. Braun, Virginia; Clarke, Victoria (2019). "Thematic Analysis". Handbook of Research Methods in Health Social Sciences. Hoboken, New Jersey: Springer. pp. 843–860. doi:10.1007/978-981-10-5251-4_103. ISBN 978-981-10-5250-7. S2CID 239210796.
  4. Boyatzis, Richard (1998). Transforming qualitative information: Thematic analysis and code development. Thousand Oaks, CA: Sage.
  5. Gale, Nicola; Heath, Gemma (2013). "Using the framework method for the analysis of qualitative data in multi-disciplinary health research". BMC Medical Research Methodology. 13: 117. doi:10.1186/1471-2288-13-117. PMC 3848812. PMID 24047204.
  6. King, Nigel; Brooks, Joanna (2016). Template analysis for business and management students. Sage.
  7. Groenland, Edward (2014). "Employing the Matrix Method as a Tool for the Analysis of Qualitative Research Data in the Business Domain". SSRN. doi:10.2139/ssrn.2495330. S2CID 59826786. SSRN 2495330.
  8. Langdridge, Darren (2004). Introduction to research methods and data analysis in psychology. The Open University.
  9. Hayes, Nicky (2000). Doing psychological research. Open University Press.
  10. Braun, Virginia; Clarke, Victoria (2019). "Reflecting on reflexive thematic analysis". Qualitative Research in Sport, Exercise and Health. 11 (4): 589–597. doi:10.1080/2159676x.2019.1628806. S2CID 197748828.
  11. Willig, Carla (2013). Introducing qualitative research in psychology. Open University Press.
  12. Daly, Jeanne; Kellehear, Allan; Gliksman, Michael (1997). The public health researcher: A methodological approach. Melbourne, Australia: Oxford University Press. pp. 611–618. ISBN 978-0195540758.
  13. Braun, Virginia; Clarke, Victoria (2013). Successful qualitative research: A practical guide for beginners. Sage.
  14. Saldana, Johnny (2009). The Coding Manual for Qualitative Researchers. Thousand Oaks, California: Sage.
  15. Dapkus, Marilyn (1985). "A thematic analysis of the experience of time". Journal of Personality and Social Psychology. 49 (2): 408–419. doi:10.1037/0022-3514.49.2.408. PMID 4032226.
  16. Clarke, Victoria; Braun, Virginia (2014). "Thematic Analysis". Encyclopedia of Critical Psychology. Springer. pp. 1947–1952. doi:10.1007/978-1-4614-5583-7_311. ISBN 978-1-4614-5582-0.
  17. Crabtree, B (1999). Doing Qualitative Research. Newbury Park, CA: Sage.
  18. Huang, H.; Jefferson, E. R.; Gotink, M.; Sinclair, C.; Mercer, S. W.; Guthrie, B. (2021). "Collaborative improvement in Scottish GP clusters after the Quality and Outcomes Framework: a qualitative study". British Journal of General Practice. 71 (710): e719–e727.
  19. Kidder, Louise; Fine, Michelle (1987). "Qualitative and quantitative methods: When stories converge". New Directions for Program Evaluation. 1987 (Fall) (35): 57–75. doi:10.1002/ev.1459.
  20. Morse, Janice (1997). "'Perfectly Healthy, but Dead': The Myth of Inter-Rater Reliability". Qualitative Health Research. 7 (4): 445–447. doi:10.1177/104973239700700401.
  21. Yardley, Lucy (2008). "Demonstrating validity in qualitative psychology". Qualitative Psychology: A Practical Guide to Research Methods. Sage: 235–251.
  22. Braun, Virginia; Clarke, Victoria (2014). "How to use thematic analysis with interview data". The Counselling and Psychotherapy Research Handbook: 183–197.
  23. Terry, Gareth; Hayfield, Nikki; Clarke, Victoria; Braun, Virginia (2017). "Thematic analysis". The Sage Handbook of Qualitative Research in Psychology: 17–36. doi:10.4135/9781526405555. ISBN 9781473925212.
  24. DeSantis, Lydia; Ugarriza, Doris (2000). "The concept of theme as used in qualitative nursing research". Western Journal of Nursing Research. 22 (3): 351–372. doi:10.1177/019394590002200308. PMID 10804897. S2CID 37545647.
  25. Clarke, Victoria; Braun, Virginia (2018). "Using thematic analysis in counselling and psychotherapy research: A critical reflection". Counselling and Psychotherapy Research. 18 (2): 107–110. doi:10.1002/capr.12165.
  26. Connelly, Lynne; Peltzer, Jill (2016). "Underdeveloped Themes in Qualitative Research: Relationship With Interviews and Analysis". Clinical Nurse Specialist. 30 (1): 52–57. doi:10.1097/nur.0000000000000173. PMID 26626748. S2CID 5942773.
  27. Sandelowski; Leeman, Jennifer (2012). "Writing using qualitative health research findings". Qualitative Health Research. 22 (10): 1404–1413. doi:10.1177/1049732312450368. PMID 22745362. S2CID 26196750.
  28. Saldana, Johnny (2009). The Coding Manual for Qualitative Researchers. Thousand Oaks, California: Sage. p. 13.
  29. Creswell, John (1994). Research Design: Qualitative & Quantitative Approaches. Thousand Oaks, CA: Sage Publications, Inc. p. 147.
  30. Locke, L.F. (1987). Proposals that work: A guide for planning dissertations and grant proposals. Newbury Park, CA: Sage Publications, Inc.
  31. Creswell, John (2007). Qualitative Inquiry & Research Design: Choosing Among Five Approaches. Thousand Oaks, CA: Sage Publications, Inc. pp. 178–180.
  32. Lincoln; Guba (1995). "Criteria For Rigor in Qualitative research".
  33. Malterud, Kirsti (2016). "Sample Size in Qualitative Interview Studies: Guided by Information Power". Qualitative Health Research. 26 (13): 1753–1760. doi:10.1177/1049732315617444. PMID 26613970. S2CID 34180494.
  34. Guest, Greg; Bunce, Arwen; Johnson, Laura (2006). "How Many Interviews Are Enough?: An Experiment with Data Saturation and Variability". Field Methods. 18 (1): 59–82. doi:10.1177/1525822x05279903. S2CID 62237589.
  35. Hennink, Monique; Kaiser, Bonnie (2016). "Code Saturation Versus Meaning Saturation: How Many Interviews Are Enough?". Qualitative Health Research. 27 (4): 591–608. doi:10.1177/1049732316665344. PMC 9359070. PMID 27670770. S2CID 4904155.
  36. Low, Jacqueline (2019). "A Pragmatic Definition of the Concept of Theoretical Saturation". Sociological Focus. 52 (2): 131–139. doi:10.1080/00380237.2018.1544514. S2CID 149641663.
  37. Fugard AJ, Potts HW (10 February 2015). "Supporting thinking on sample sizes for thematic analyses: A quantitative tool". International Journal of Social Research Methodology. 18 (6): 669–684. doi:10.1080/13645579.2015.1005453.
  38. Lowe, Andrew; Norris, Anthony C.; Farris, A. Jane; Babbage, Duncan R. (2018). "Quantifying Thematic Saturation in Qualitative Data Analysis". Field Methods. 30 (3): 191–207. doi:10.1177/1525822X17749386. ISSN 1525-822X. S2CID 148824883.
  39. Braun, Virginia; Clarke, Victoria (2016). "(Mis)conceptualising themes, thematic analysis, and other problems with Fugard and Potts' (2015) sample-size tool for thematic analysis" (PDF). International Journal of Social Research Methodology. 19 (6): 739–743. doi:10.1080/13645579.2016.1195588. S2CID 148370177.
  40. Hammersley, Martyn (2015). "Sampling and thematic analysis: a response to Fugard and Potts". International Journal of Social Research Methodology. 18 (6): 687–688. doi:10.1080/13645579.2015.1005456. S2CID 143933992.
  41. Byrne, David (2015). "Response to Fugard and Potts: supporting thinking on sample sizes for thematic analyses: a quantitative tool". International Journal of Social Research Methodology. 18 (6): 689–691. doi:10.1080/13645579.2015.1005455. S2CID 144817485.
  42. Emmel, Nick (2015). "Themes, variables, and the limits to calculating sample size in qualitative research: a response to Fugard and Potts" (PDF). International Journal of Social Research Methodology. 18 (6): 685–686. doi:10.1080/13645579.2015.1005457. S2CID 55615136.
  43. Braun, Virginia; Clarke, Victoria (2012). "Thematic analysis". APA Handbook of Research Methods in Psychology. Vol. 2. pp. 57–71. doi:10.1037/13620-004. ISBN 978-1-4338-1005-3.
  44. Miles, M.B. (1994). Qualitative data analysis: An expanded sourcebook. Thousand Oaks, California: Sage. ISBN 9780803955400.
  45. Coffey, Amanda; Atkinson, Paul (1996). Making Sense of Qualitative Data. Sage. p. 30.
  46. Clarke, Victoria; Braun, Virginia (2016). "Thematic analysis". Analysing Qualitative Data in Psychology. Sage: 84–103.