Benjamin L. Stewart
ELT Cast
Analyzing Qualitative Data

An overview for the English language teacher trainer and novice researcher

Notes to the talk

This briefing summarizes the main themes and important ideas discussed in the provided sources: an audio recording ("Data Analysis_042025.mp3") of a thesis seminar session and excerpts from a document titled "Making Sense of Stories: Analyzing Qualitative Data in ELT Teacher Training." The primary focus is on data analysis techniques, particularly qualitative coding, triangulation, and the potential for incorporating quantitative elements.

I. Key Themes and Important Ideas:

A. Importance of Completing Data Collection Before Analysis:

  • The seminar leader emphasizes that data analysis should only begin after all data collection is complete. "Today's discussion is about data analysis. All of you have collected or very close to having completed uh collecting all of your data and this is an important requirement to continue the process of data analysis... If you are still trying to collect some information, know that what we talk about today uh you need to wait."

  • Starting analysis prematurely, before all data is gathered, is considered a "mistake."

B. Understanding the Purpose of Data Analysis:

  • Data analysis is crucial for understanding the collected data and determining what is relevant and significant to report in the results and discussion sections of the thesis.

  • It helps researchers move from a large amount of raw data to focused and insightful findings. "Think of it like this. All of you are at this point, you've collected, if not all, most of your data. So you have all this data that you've collected... Now, with all this information, which data is not relevant... So you're going to then include this circle represents now only the information that relates to your research questions... Now from your data analysis... you're going to then figure out, OK, of all this information that now is relevant to my study, what is worth including in my discussion?"

  • Not all relevant data needs to be reported; the analysis helps identify the most "important, surprising, insightful, interesting" findings.

C. The Concept and Importance of Triangulation:

  • Triangulation involves bringing together different data sources (e.g., interviews, observations, documents) to gain a more comprehensive understanding of the research topic.

  • It allows for comparison between what participants say they do/believe, what they actually do (observed), and their planning/reflection processes. "Think of this if it helps to look at it like this. Your um your information here is allowing you to compare different things. For example, what people say they do or believe... What do they actually do? Well, to know that, what do we have to do? Have to observe."

  • The seminar leader stresses the importance of having sufficient data to triangulate and encourages participants to address any concerns about this. "If anybody today right now has concerns about whether or not you have the types of data to allow you to triangulate, we need to have a discussion today."

  • The "Making Sense of Stories" document provides specific examples of triangulation in ELT teacher training research, such as comparing planned instructions in lesson plans with delivered instructions observed in the classroom. "Compare the planned instructions (document) with the delivered instructions (observation). Were planned ICQs actually used?"
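The lesson-plan versus observation comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ICQ wordings, the variable names, and the function `compare_planned_and_delivered` are all hypothetical, and exact string matching stands in for the researcher's own judgment about whether a delivered instruction corresponds to a planned one.

```python
# Hypothetical ICQs taken from a lesson plan (document data source).
PLANNED_ICQS = [
    "Are you working alone or in pairs?",
    "How many minutes do you have?",
    "Do you write or speak?",
]

# Hypothetical ICQs noted during the classroom observation.
OBSERVED_ICQS = [
    "How many minutes do you have?",
    "Are you working alone or in pairs?",
    "Who starts?",
]


def compare_planned_and_delivered(planned, delivered):
    """Compare two data sources and report where they agree and differ."""
    planned_set, delivered_set = set(planned), set(delivered)
    return {
        "planned_and_delivered": sorted(planned_set & delivered_set),
        "planned_not_delivered": sorted(planned_set - delivered_set),
        "delivered_not_planned": sorted(delivered_set - planned_set),
    }


comparison = compare_planned_and_delivered(PLANNED_ICQS, OBSERVED_ICQS)
for label, items in comparison.items():
    print(label)
    for item in items:
        print("  " + item)
```

An interview transcript could be added as a third source in the same way, for example by listing what the teacher says they do and comparing it with the observed list.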

D. Introduction to Qualitative Coding:

  • Qualitative coding is defined as a systematic process of labeling and organizing segments of text data (transcriptions, observation notes, documents) to identify patterns, themes, and concepts relevant to the research questions. "The process of coding is the process of labeling text. Coding is a systematic way to make sense of rich, complex, and often messy reality of language."

  • All audio and video data must be transcribed into text before coding. Microsoft Word Online's transcription feature is suggested as a tool.

  • The coding process involves identifying text segments (words, phrases, sentences, paragraphs) that relate to the research questions and assigning specific labels or "codes" to them. "You're coding things that relate to your research questions... Because in this process we are distinguishing; we have to distinguish what is useful for our study, and what is not useful we leave out."

E. Levels of Qualitative Coding:

  • The seminar introduces a three-level inductive coding approach:

  • Level One (Initial Codes): Creating very specific labels directly from the text, the literature review, or using in vivo codes (participant's exact words). "The first you create... the code, the label comes from your literature review... Using a label a code directly. If [the participant] said that... You can select this phrase. What label can you put? Anxious, anxiety."

  • Level Two (Categories): Grouping the initial, specific codes into broader, more conceptual categories. "When we finish, you should have a long list of codes. And so I would do it in something like Excel... The second level is to sort. This group of initial codes goes here, and I'm going to create another code. It can be like a category that represents all of its more specific codes."

  • Level Three (Themes): Grouping the categories into overarching themes that provide a higher level of understanding and relate directly to the research questions. "Level three, I'm going to call these themes. We'll already have categories, right? Each category will have its initial codes. What do we do at this level? Categorize: group these categories into these [themes]."

  • The "Making Sense of Stories" document also describes a similar iterative coding process, including immersion, initial/open coding, developing a codebook, focused/axial coding, and identifying themes/selective coding.
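The three levels described above can be pictured as two lookup tables that roll specific codes up into categories and categories up into themes. The following Python sketch is a minimal illustration, not the seminar's own procedure: every code, category, and theme name is hypothetical, and the seminar itself suggests doing this sorting in something like Excel.

```python
from collections import defaultdict

# Level one to level two: hypothetical initial codes, each assigned to a category.
CODE_TO_CATEGORY = {
    "anxiety": "learner emotions",
    "fear of mistakes": "learner emotions",
    "delayed correction": "feedback practices",
    "peer correction": "feedback practices",
}

# Level two to level three: hypothetical categories, each assigned to a theme.
CATEGORY_TO_THEME = {
    "learner emotions": "affective factors in speaking",
    "feedback practices": "teacher responses to error",
}


def build_hierarchy(code_to_category, category_to_theme):
    """Group initial codes under their categories, and categories under themes."""
    hierarchy = defaultdict(lambda: defaultdict(list))
    for code, category in code_to_category.items():
        theme = category_to_theme[category]
        hierarchy[theme][category].append(code)
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {theme: dict(categories) for theme, categories in hierarchy.items()}


hierarchy = build_hierarchy(CODE_TO_CATEGORY, CATEGORY_TO_THEME)
for theme, categories in hierarchy.items():
    print(theme)
    for category, codes in categories.items():
        print("  " + category)
        for code in codes:
            print("    " + code)
```

Starting from very specific codes is what makes the roll-up possible: a code that is already as broad as a category leaves nothing to group at level two.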

F. The Codebook:

  • The outcome of the coding process is a codebook, which is a crucial part of the methodology section of the thesis.

  • The codebook will list all the codes used, potentially organized by categories and themes, and may include definitions and examples. "When you finish, you're going to have a codebook... you're going to include your codebook that's going to include all of the codes that you used and it's going to be an outline, like a schema in Word: categories, initial codes."

  • The methodology section will describe the coding process and reference the codebook in the appendix.
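As a rough illustration of what such a codebook outline might contain, the Python sketch below stores each code with its category, theme, definition, and an example excerpt, then prints an indented outline. The `CodebookEntry` fields, the entries, and the excerpts are all hypothetical; a real codebook would be built from the researcher's own data and could just as well be kept in Word or Excel.

```python
from dataclasses import dataclass


@dataclass
class CodebookEntry:
    code: str        # level-one label
    category: str    # level-two grouping
    theme: str       # level-three grouping
    definition: str  # when the code applies
    example: str     # illustrative excerpt (hypothetical here)


ENTRIES = [
    CodebookEntry(
        code="delayed correction",
        category="feedback practices",
        theme="teacher responses to error",
        definition="Teacher corrects an error after the activity ends",
        example="I wait until they finish and then go over the mistakes",
    ),
    CodebookEntry(
        code="anxiety",
        category="learner emotions",
        theme="affective factors",
        definition="Participant reports feeling nervous about speaking",
        example="I get anxious when I have to speak",
    ),
]


def render_outline(entries):
    """Return the codebook as an indented outline: theme, category, code."""
    lines = []
    current_theme = None
    current_category = None
    for e in sorted(entries, key=lambda e: (e.theme, e.category, e.code)):
        if e.theme != current_theme:
            lines.append(f"Theme: {e.theme}")
            current_theme = e.theme
            current_category = None
        if e.category != current_category:
            lines.append(f"  Category: {e.category}")
            current_category = e.category
        lines.append(f"    Code: {e.code}: {e.definition}")
        lines.append(f'      Example: "{e.example}"')
    return "\n".join(lines)


outline = render_outline(ENTRIES)
print(outline)
```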

G. Incorporating Frequencies and Duration (Quantitative Elements):

  • The seminar leader emphasizes that qualitative data can be converted into quantitative data (frequencies, duration) for analysis. "How many of you think you'll need to analyze because we can convert qualitative information into quantitative information..."

  • This involves counting the occurrences of specific codes or measuring the length of certain events (e.g., teacher-student exchanges, use of relaxation techniques).

  • Examples discussed include tracking the frequency of positive/negative reinforcement, scaffolding, relaxation techniques, and the duration of collaborative work or interactions with specific students.

  • The "Making Sense of Stories" document provides detailed examples of how to quantify qualitative data by defining observable behaviors, developing coding rules, and using presence/absence or frequency counts in spreadsheets.
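Counting occurrences and adding up durations, as described above, is simple arithmetic once each coded segment has a label and timestamps. The Python sketch below assumes a hypothetical list of coded observation segments, each recorded as a code with start and end times in seconds; the codes and times are invented for illustration.

```python
from collections import Counter

# Hypothetical coded observation segments: (code, start_second, end_second).
SEGMENTS = [
    ("positive reinforcement", 30, 35),
    ("collaborative work", 60, 240),
    ("positive reinforcement", 250, 254),
    ("negative reinforcement", 300, 306),
    ("collaborative work", 400, 520),
]


def code_frequencies(segments):
    """Count how many times each code was applied."""
    return Counter(code for code, _start, _end in segments)


def code_durations(segments):
    """Sum the length in seconds of all segments carrying each code."""
    totals = Counter()
    for code, start, end in segments:
        totals[code] += end - start
    return totals


frequencies = code_frequencies(SEGMENTS)
durations = code_durations(SEGMENTS)
for code in frequencies:
    print(f"{code}: {frequencies[code]} occurrence(s), {durations[code]} s total")
```

With this sample data, "collaborative work" occurs twice for a total of 300 seconds, while "positive reinforcement" also occurs twice but lasts only 9 seconds, which shows why frequency and duration can tell different stories about the same lesson.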

H. Relationship Between Analysis and Reporting:

  • The analysis process directly informs what will be reported in the results and discussion sections. "We don't know what to write in the results and discussion until we understand the data. To understand the data, we need to analyze the data."

  • The evidence presented in the results section will often consist of direct quotes from the data that have been coded.

  • The analysis (coding, identifying themes, considering frequencies) helps determine the structure and content of the results and discussion.

I. Openness to Modifying Research Questions:

  • Based on the initial findings during data analysis, it may be necessary to slightly modify the research questions to better align with the emerging answers. "It's very common at this point as you are analyzing your data and when you come back on May 5th that in some cases we may need to modify slightly your research question."

  • However, any modifications should remain within the scope of the literature review.

J. Timeline and Expectations:

  • Participants are expected to begin the data analysis process (coding, considering frequencies) during the break before the next group session on May 5th.

  • This analysis is considered a crucial step that will significantly impact the quality of the thesis.

  • The final thesis paper is due on May 22nd, followed by mock presentations starting on May 26th and oral defenses.

K. Utilizing Large Language Models (LLMs) as Research Assistants:

  • The "Making Sense of Stories" document introduces the potential of using LLMs to assist with qualitative data analysis.

  • LLMs can help with generating initial coding ideas, applying preliminary coding schemes, calculating frequencies of codes, and analyzing Likert scale questionnaires.

  • However, it is strongly emphasized that researchers must critically assess, validate, and cross-reference the output from LLMs to avoid bias and inaccuracies. LLMs should be seen as tools for augmentation, not replacements for rigorous methodological practices.
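One possible way to cross-reference LLM output, offered here only as an assumption and not as a method named in the sources, is to code a sample of segments by hand and check how often the LLM's suggested code matches. The Python sketch below computes simple percent agreement and lists the disagreements for review; the segments, codes, and function names are hypothetical.

```python
# Hypothetical transcript segments with the researcher's codes and LLM-suggested codes.
SEGMENTS = [
    "I get anxious when I have to speak",
    "I wait until they finish and then correct",
    "They help each other fix mistakes",
    "Very good, well done!",
]
RESEARCHER_CODES = ["anxiety", "delayed correction", "peer correction", "positive reinforcement"]
LLM_CODES = ["anxiety", "delayed correction", "collaborative work", "positive reinforcement"]


def percent_agreement(researcher_codes, llm_codes):
    """Share of segments where both coders assigned the same code."""
    if len(researcher_codes) != len(llm_codes):
        raise ValueError("Both lists must cover the same segments")
    if not researcher_codes:
        raise ValueError("At least one coded segment is required")
    matches = sum(r == m for r, m in zip(researcher_codes, llm_codes))
    return matches / len(researcher_codes)


def disagreements(segments, researcher_codes, llm_codes):
    """Return the segments the researcher should re-read, with both codes."""
    return [
        (segment, r, m)
        for segment, r, m in zip(segments, researcher_codes, llm_codes)
        if r != m
    ]


agreement = percent_agreement(RESEARCHER_CODES, LLM_CODES)
to_review = disagreements(SEGMENTS, RESEARCHER_CODES, LLM_CODES)
print(f"Agreement: {agreement:.0%}")
for segment, r, m in to_review:
    print(f'Review: "{segment}" (researcher: {r}; LLM: {m})')
```

A high agreement figure does not settle the matter on its own; the disagreements still need to be read and resolved by the researcher against the codebook definitions.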

II. Notable Quotes:

  • "Today's discussion is about data analysis. All of you have collected or very close to having completed uh collecting all of your data and this is an important requirement to continue the process of data analysis..."

  • "Please don't make that mistake. Okay. Today what we're going to be talking about is a process of analyzing qualitative information, but it's also a way to for you to start thinking about what you're going to report."

  • "This concept of triangulation is going to be very important in today's discussion for data analysis. Think of this if it helps to look at it like this. Your um your information here is allowing you to compare different things."

  • "Qualitative coding is the process of systematically identifying, labeling, and organizing segments of your data to discover patterns, themes, concepts, and relationships relevant to your research questions."

  • "Coding is simply labeling. It's giving a name to the text that you have."

  • "I repeat, the codes have to be super specific. If we start too general, we don't have any place to go."

  • "All qualitative can be converted to quantitative data and vice versa. When conducting qualitative data, you might find it useful to convert data to quantitative data and then analyze it."

  • "Correlation does NOT imply causation!"

III. Implications for Thesis Work:

  • Participants need to prioritize transcribing their audio/video data and engaging in the initial levels of qualitative coding.

  • They should actively think about how triangulation will be achieved in their studies using their collected data sources.

  • Considering potential quantitative analysis (frequencies, duration) can add another layer of insight to their findings.

  • Developing a detailed and well-defined codebook is essential for a rigorous and transparent analysis process.

  • Researchers should remain flexible and open to refining their research questions based on the initial insights from the data analysis.

  • While LLMs can be helpful tools, they should be used judiciously and with critical evaluation.

This briefing document provides a comprehensive overview of the key aspects of data analysis discussed in the provided sources, highlighting the importance of systematic qualitative methods and the potential for integrating quantitative elements in ELT teacher training research. Participants are encouraged to begin their analysis promptly and seek clarification on any doubts.

Review

Quiz

  1. According to the speaker, what is the primary focus of today's session? Why is it being addressed at this particular point in the semester?

  2. Explain the significance of triangulation in qualitative data analysis as described in the audio. Provide an example of how triangulation could be applied using different data sources mentioned.

  3. Summarize the three levels of coding for qualitative data analysis discussed in the audio. What is the purpose of moving through these levels?

  4. Describe what a codebook is and when it should be developed in the data analysis process. What key information does it contain?

  5. Explain the difference between creating codes and using in vivo codes. Provide an example of each based on the provided material.

  6. Why does the speaker emphasize the importance of transcribing all audio and video data to text before beginning the coding process?

  7. What is the speaker's advice regarding modifying research questions at this stage of the thesis process? What important caveat do they mention?

  8. Describe at least three examples from the audio of how qualitative data can be converted and analyzed using frequencies or duration.

  9. According to the speaker, what constitutes the "results and discussion" section of the thesis paper in relation to the analyzed data? How does this differ from the literature review?

  10. What reminders were given regarding the assessment components and attendance policy for the thesis seminar?

Quiz Answer Key

  1. The primary focus of today's session is data analysis, specifically for qualitative information. This is being addressed now because students have either completed or are very close to completing their data collection, which is a necessary prerequisite for starting the analysis process.

  2. Triangulation is the process of bringing together and comparing information from different data sources (e.g., interviews, observations, documents) to gain a more nuanced and credible understanding of the research topic. For example, a researcher might compare a teacher's stated beliefs about differentiated instruction in an interview with their observed teaching practices and relevant lesson plans to see if these different sources of information align.

  3. The three levels of coding are: (1) Initial/Level One Coding, which involves assigning specific labels or codes to segments of text; (2) Level Two Coding, where initial codes are grouped into broader categories; and (3) Level Three Coding, where categories are further grouped into overarching themes. The purpose of moving through these levels is to move from specific data points to more general analytical insights and patterns.

  4. A codebook is a central document that is developed as the researcher codes their data. It lists all the codes being used, provides a clear definition for each code, outlines inclusion and exclusion criteria for applying the code, and often includes example snippets from the data that illustrate the code. It ensures consistency in the coding process.

  5. Creating codes involves the researcher developing labels for segments of text based on their understanding of the data and research questions, potentially drawing from the literature review. Using in vivo codes involves using the exact words or phrases spoken by the participants as the codes themselves. For example, in the teacher interview snippet, "grammar mistake" is an in vivo code, while "delayed correction" is a created code.

  6. The speaker emphasizes transcribing all audio and video data to text because the process of coding, which involves identifying and labeling segments of data, is primarily applied to text. Therefore, to analyze non-textual data in this way, it must first be converted into a textual format.

  7. The speaker advises students to be open to slightly modifying their research questions based on the initial findings from the data analysis. However, they caution that any modifications should still align with the original literature review and the overall purpose of the research.

  8. Examples of converting qualitative data to quantitative for analysis include: tracking the frequency of positive and negative reinforcement used by a teacher during a lesson; measuring the duration of student-teacher interactions; and counting the number of times a specific vocabulary strategy is implemented in a classroom.

  9. The "results and discussion" section of the thesis paper primarily consists of the analyzed data, presented as evidence (results), and the researcher's interpretation and explanation of these findings in relation to the research questions and existing literature (discussion). This differs from the literature review, which presents findings from previous studies to provide context for the current research.

  10. The speaker reminded students that their tutoring grade only makes up 40% of their final thesis seminar grade, with the oral defense and written thesis evaluation contributing the remaining 60%. They also reiterated the attendance policy, where missing a tutoring session equates to five absences, and exceeding three missed sessions may require taking an extraordinary exam.

Essay Format Questions

  1. Discuss the role of data analysis as a crucial bridge between data collection and the reporting of findings in qualitative research. Using examples from the provided audio, explain why skipping the data analysis stage can lead to significant challenges in the thesis writing process.

  2. Critically evaluate the concept of triangulation in qualitative research, drawing on the examples and explanations provided in the sources. Discuss the strengths and potential limitations of using multiple data sources to enhance the credibility and depth of research findings in ELT teacher training.

  3. Explain the three-level coding process for qualitative data analysis presented in the audio, emphasizing the importance of specificity in initial coding and the subsequent development of categories and themes. How does this systematic approach contribute to making sense of complex qualitative data?

  4. Considering the information provided on analyzing frequencies and duration in qualitative data, discuss the value of incorporating quantitative elements into a primarily qualitative study. Provide specific examples from the audio of how this mixed-methods approach can enrich the analysis and provide additional insights in ELT research.

  5. Reflect on the advice given regarding the iterative nature of qualitative research, including the potential need to modify research questions after initial data analysis. Discuss the importance of maintaining an open mind and flexibility throughout the research process while ensuring alignment with the existing literature review and overall research focus.

Glossary of Key Terms

  • Coding (Qualitative): The process of systematically identifying, labeling, and organizing segments of qualitative data (text, audio transcripts, observation notes) to discover patterns, themes, concepts, and relationships relevant to the research questions.

  • Triangulation: The use of multiple data sources, methods, investigators, or theories to provide a more comprehensive and nuanced understanding of a research phenomenon, enhancing the credibility and validity of the findings.

  • Codebook: A central document that lists all the codes used in a qualitative study, along with their definitions, inclusion and exclusion criteria, and sometimes example data excerpts. It serves as a guide for consistent coding.

  • In Vivo Code: A type of code that uses the exact words or phrases spoken by the participants as the label for a segment of data.

  • Initial Coding (Open Coding/Level One Coding): The first stage of qualitative data analysis where researchers go through the data and assign preliminary, descriptive codes to segments of text, often staying close to the data itself.

  • Focused Coding (Axial Coding/Level Two Coding): A later stage of qualitative data analysis where initial codes are reviewed, refined, combined, and grouped into broader categories based on their relationships and patterns.

  • Thematic Analysis (Selective Coding/Level Three Coding): The process of identifying overarching themes or central ideas that emerge from the categories developed during focused coding, which help to answer the research questions.

  • Frequency Analysis: A method of quantitative data analysis that involves counting how often specific codes, behaviors, or events occur within the data.

  • Duration Analysis: A method of quantitative data analysis that involves measuring the length of time that specific events or interactions last within the data.

  • Transcription: The process of converting audio or video recordings into written text.

  • Likert Scale: A psychometric scale commonly used in questionnaire-based research. It is the most widely used approach to scaling responses in survey research, so the term (or equivalently, Likert-type scale) is often used interchangeably with rating scale, although there are other types of rating scales.

  • Inductive Approach: A research approach that starts with specific observations and data, then moves towards identifying broader patterns, themes, and theories. The coding process described is largely inductive.

  • Deductive Approach: A research approach that starts with a general theory or hypothesis and then gathers data to test or confirm it.

  • Research Question: A specific inquiry that the research aims to answer. It guides the data collection and analysis processes.

  • Literature Review: A comprehensive summary and analysis of existing scholarly literature relevant to the research topic, providing context and identifying gaps in knowledge.

  • Methodology: The section of a research paper that describes the methods used to collect and analyze data. The coding process and codebook would be described in this section.

  • Results and Discussion: The section of a research paper where the findings of the data analysis are presented (results) and interpreted in relation to the research questions and existing literature (discussion).

  • Assessment (Thesis Seminar): The evaluation of a student's work in the thesis seminar, which includes the tutor's grading (40%), the oral defense (20%), and the evaluation of the written thesis (40%).

  • Oral Defense: A formal presentation of the completed thesis to a panel of examiners, who then ask questions about the research.
