DR. BOHRNSTEDT: I really think this is a superb example of how multiple methods can be fruitfully applied within a single study. I really like the interplay that was discussed between the qualitative and the quantitative data.
The emphasis of this paper on the New Hope study really is on the importance of the ethnographic substudy that was done, the ways in which it really enriched the data collected as part of a larger randomized field trial. There are more benefits to it than I can possibly summarize. When we talked last Friday, we agreed that I would try to give some take-away points from this study. The other thing I want to do is talk about two examples in which I, personally, have been involved, other studies that are employing multiple methods as well, to try to reinforce the importance of how quantitative and qualitative methods can work together.
The first take-away from this presentation by Greg today is that qualitative data can help us understand the quantitative data. Recall the finding that boys in the experimental group were rated by their teachers as better behaved and performing better than those in the control group. The ethnographic data provided a plausible explanation for that finding, one that likely would have gone unnoticed or undiscovered without the qualitative data: the study participants were concerned about the role that gangs and drugs might play in their sons' lives, and they were able to use the resources of the program intervention to try to help offset that risk. So, I think that is one important take-away, the way it really informed the quantitative analysis.
The second is that the qualitative data can be used to generate hypotheses to be tested either in the data at hand or in future studies. In the New Hope study the qualitative data suggested that the number of employment-related barriers at the onset of the program might be a predictor of effects on earnings. The researchers then used this insight to construct a measure of such barriers and showed, as you saw, that employment-related barriers were indeed related to differences in earnings between the experimental and the control groups.
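As a rough illustration of that second point, here is a minimal sketch of how a count of employment-related barriers might be built and then used to compare experimental and control earnings within each barrier group. The barrier names, the record layout, and the sample records are all hypothetical; the actual New Hope items and analysis are not described in these remarks.

```python
from statistics import mean

# Hypothetical barrier indicators; the real New Hope measure is not listed here.
BARRIERS = ("no_diploma", "no_recent_work", "young_child", "no_car")


def barrier_count(person):
    """Number of hypothetical barriers flagged for one participant."""
    return sum(1 for b in BARRIERS if person.get(b, False))


def impact_by_barrier_count(people):
    """Experimental-minus-control mean earnings within each barrier-count group."""
    groups = {}
    for p in people:
        arm = "experimental" if p["experimental"] else "control"
        bucket = groups.setdefault(
            barrier_count(p), {"experimental": [], "control": []}
        )
        bucket[arm].append(p["earnings"])
    # Only report groups that contain members of both arms.
    return {
        count: mean(g["experimental"]) - mean(g["control"])
        for count, g in sorted(groups.items())
        if g["experimental"] and g["control"]
    }


# Made-up records, purely to exercise the functions.
sample = [
    {"experimental": True, "earnings": 9000, "no_diploma": True},
    {"experimental": False, "earnings": 7000, "no_diploma": True},
    {"experimental": True, "earnings": 8000},
    {"experimental": False, "earnings": 8000},
]
impacts = impact_by_barrier_count(sample)
print(impacts)
```

Because assignment was random, a within-group difference of this kind can be read as a subgroup impact estimate, although small groups would give noisy estimates.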
I think there is a kind of depth of understanding one gets from talking to respondents at length and across many subject areas that is simply impossible to get, for example, from the collection of surveys.
I thought it was important and noteworthy that the directors of the ethnographic study provided a set of topics, again as Greg gave us, as a way to ensure both breadth and depth for the kind of data that were collected.
This made the ethnographic data hypothesis-rich, if you will. Finally, as Greg points out, the data from the ethnographic study were also useful for constructing new items for follow-ups, allowing them to test hypotheses that otherwise probably would not have been tested. So, I think this hypothesis-generating function of the ethnographic study, and the way it contributed, is a second kind of take-away point.
Third, qualitative data provide useful anecdotes or exemplars for illustrating quantitative findings. They really allow us to better understand, if you will, the quantitative data. The authors provided one such example at the beginning of the paper, where they document the importance of the child care subsidy for a young mother named Maria, which allowed her to go out and seek and secure employment. If you had a chance to look at the longer report, it is replete with such examples showing how the program benefited those in the experimental group as compared to the control group.
In fact, when I looked at the longer report, Greg, I was just struck by those little boxes and how they drew me to it as a way to sort of really understand and comprehend in interesting ways what we were observing in the quantitative data.
The fourth thing is, again, that I want to reinforce the idea, when doing this kind of ethnographic work, of choosing a random sample of respondents rather than using some sort of purposeful sampling. While, as Greg and his colleagues point out, the confidence intervals around these estimates are large, it nonetheless allows you at least the opportunity to get some idea of whether what you are observing is sensible or not, and with an even larger group the confidence intervals obviously would have been smaller.
So, in short, those were four take-aways I had from this, and I think that they are useful for all of us. You know, one question I wrestled with was: was this ethnographic study essential or merely useful? I don't know the answer to that, and you may want to think about it, but certainly when you look at what you were able to do, for example, in constructing this index of barriers to employment, I think that without those data you probably would not have been able to do this analysis. Would the study have suffered without those kinds of analyses? That is a useful thing to ponder, and I think it is in the spirit of Tom Cook's question as well.
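To make the point about sample size concrete, here is a small sketch using the standard normal approximation for a proportion estimated from a simple random sample. The sample sizes below are hypothetical and are not the New Hope ethnographic sample sizes; the function name is also just for illustration.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Approximate 95% CI half-width for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sample sizes; p = 0.5 gives the widest interval for a given n.
for n in (40, 160, 640):
    print(n, round(ci_half_width(0.5, n), 3))
```

Under this approximation the half-width shrinks with the square root of the sample size, so quadrupling the number of respondents only halves the width of the interval.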
With the remainder of my time I just want to talk about two other examples quickly that I was involved with, one where we used qualitative data to provide validity for some quantitative measures and a second study where we actually used the qualitative data as the primary measure.
The first example is from a large non-experimental evaluation of class size reduction, which Jim mentioned and with which I was involved. California, as I think a lot of you know, in the mid-nineties provided sizable incentives to districts to reduce their K-3 classrooms to 20 students. Part of the rationale for doing so actually came from the Tennessee STAR Project, a randomized field trial that showed both short- and long-term effects for kids who were in smaller versus larger classrooms. But as part of the work we were doing in California we also thought it was important to look at instructional practices, and we included them because we were concerned that structural changes alone, that is, moving to a smaller class size, might not be sufficient to improve student achievement. That is, we weren't certain that small class size without concomitant changes in instructional practices was likely to make a difference in achievement gains with kids.
We, in fact, found only a very weak association between student achievement and California's class size reduction program, but we used a couple of different methods to try to look at instructional practices. Our primary method was a set of surveys which were done both cross-sectionally and longitudinally and secondarily we videotaped a sub-sample of the teachers in both the reduced and non-reduced size classrooms and recorded the type of instructional practices that these teachers used.
What did we find out? We found out that teachers in smaller classes in fact did use smaller groupings of students and they did a bit more one-on-one instruction with readers who were having some difficulty but the overall conclusion we came to was that there was really no fundamental way in which instruction as delivered was very different in the reduced versus the non-reduced size classes. Importantly the videotape data confirmed what we found in the surveys; that is that when we looked at those videos we couldn't see those differences either. Well, was the videotaping essential for this study? In retrospect you could say no, but without question I think from our perspective those videotape data really confirmed what we had found in the survey data and for us it was viewed as an important component of the study.
The second study done by AIR along with Florida State University is now in the process of testing the efficacy of four commercially available reading programs to help struggling third and fifth graders in Allegheny County, Pennsylvania.
By the way, this study is being funded by IES as well as a consortium of foundations. In this study schools were randomly assigned to one of four programs, and then within schools sample classrooms were assigned to either an experimental or a control condition.
This study uses administrative records, document analyses, teachers' logs, a variety of surveys as well as videotaping.
Our task in this study was to study implementation of the four programs, with a central focus on assessing the fidelity with which each of them was being delivered in classrooms. I have actually been buoyed by the discussion today about the importance of implementation, because somehow I haven't heard as much said about implementation in previous conversations of this type.
To accomplish this task we videotaped all of the teachers twice during the year, once in the fall and once in the spring, then created a running record of each class. What we did is we put what we call time stamps into these running records, noting what topics were covered, when, how much time was spent on them, when student errors occurred, how student errors were dealt with, and so on. Because the programs actually varied quite widely with respect to the degree to which they are scripted, we needed a separate set of codes and had to develop them for each intervention, but we also wanted to be able to compare fidelity of implementation across the various programs. So, we developed some higher-order sets of codes which were more global, and what we are just now literally in the process of doing is taking these qualitative data and converting them to a set of quantitative measures of the degree to which fidelity occurred within each of these programs.
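One way to picture that conversion from time-stamped running records to a number is the sketch below. It assumes each coded stretch of a lesson has a start time, an end time, and a global code, and it scores fidelity as the share of observed class time spent on components a program prescribes. The code names, the `Segment` layout, and the scoring rule are all hypothetical; the coding scheme and measures actually used in the study are not described in these remarks.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_sec: int  # time stamp where the coded activity begins
    end_sec: int    # time stamp where the coded activity ends
    code: str       # hypothetical global code, e.g. "phonics"


def time_share_by_code(segments):
    """Fraction of total coded time spent under each code."""
    total = sum(s.end_sec - s.start_sec for s in segments)
    if total == 0:
        return {}
    seconds = {}
    for s in segments:
        seconds[s.code] = seconds.get(s.code, 0) + (s.end_sec - s.start_sec)
    return {code: secs / total for code, secs in seconds.items()}


def fidelity_score(segments, prescribed_codes):
    """Share of observed class time spent on components the program prescribes."""
    shares = time_share_by_code(segments)
    return sum(share for code, share in shares.items() if code in prescribed_codes)


# A made-up 20-minute running record, purely for illustration.
lesson = [
    Segment(0, 600, "phonics"),
    Segment(600, 900, "off_program"),
    Segment(900, 1200, "error_correction"),
]
score = fidelity_score(lesson, {"phonics", "error_correction"})
print(score)
```

Scoring against global codes rather than program-specific ones is what would let a single number of this kind be compared across programs that differ in how tightly they are scripted.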
Obviously, without adequate implementation one would not expect to find much in terms of effects associated with these programs. So, in this case there are two things about the qualitative data. One is that they are the primary way in which we are looking at implementation. Secondly, notice, and this was referred to I think by Tom Cook today, that we took those qualitative data and we are now converting them to a quantitative measure.
We also have ratings from the program provider staff for Allegheny County instructional leaders, and we had the program providers themselves come in and do observations twice and give ratings on how well they thought the programs were being delivered. I can say that, while we haven't released the results of the study, we were really buoyed by the degree of implementation fidelity, and it actually came through with all three sets of measures.
I was struck by Sue's comments this morning, and let me end with this: I think that there might be a useful distinction between research designs and research methods, and I think the two are kind of getting mixed up, if you will.
For me, anyway, a research design is a framework used for the collection of data to answer research questions, and designs range from true experiments on the one hand to observational studies on the other. When I think of research designs I think of Campbell and Stanley's classic 1966 paper, where they discussed rival plausible hypotheses that can or cannot be ruled out as a function of the kind of research design you use.
Research methods by contrast describe the way in which data are collected, whether they be interviews, questionnaires, videotapes, field observations, case studies and so on.
Some of them are designed to produce quantitative data; we saw an example in your work, Greg, of how you used survey data to construct an index. In some cases they are not. I mean, we are simply interested in case studies.
So, when I think of this section we are talking about right now, I think of it as looking at how multiple methods can be used within the context of a single study, and I think that is very, very important. But I think there is a larger question out there as well, and I think it is part of what is driving the discussion today: there is a sense that certain kinds of research designs are being privileged in educational research, and that is causing some heartburn among some people at least out there.