MS. WINTERS: Thank you all once again for coming. The purpose of these concurrent sessions is to illustrate some of the issues raised in the earlier session through discussion of investigations that have employed multiple methods.
In this room we are going to hear from Mary Ann Huntley and Jere Confrey whom I will introduce momentarily. Mary Ann is going to talk about some of the work she has done evaluating traditional and reform math curricula. After hearing from her Jere will offer her reactions to the presentation and provide us with her own insights on employing multiple methods in education research.
Mary Ann Huntley is an assistant professor in the Department of Mathematical Sciences at the University of Delaware with a joint appointment in the School of Education. She is also a research associate with the Center for the Study of Mathematical Curriculum funded by the National Science Foundation.
Jere Confrey is a professor of education at the University of Texas where she is also co-director of the UT Secondary Preparation Program.
She also directs the Systemic Research Collaborative for Education in Mathematics, Science and Technology. She is the Vice Chair of the NRC's Mathematical Sciences Education Board and she recently chaired the authoring committee for this report on evaluating curricular effectiveness, judging the quality of mathematical evaluations. You can read more about Mary Ann and Jere in the bio sketches included with the participant materials and with that I will turn it over to you.
DR. HUNTLEY: I will begin by providing some context about reform math curricula. Then I will discuss a comparative study of traditional and reform curricula which I worked on while I was a graduate student. Throughout my talk I will be using the acronym CPMP to refer to the high school math curriculum developed by the Core-Plus Mathematics Project. The focus of my talk today will be on the methodological aspects of this evaluation study.
So, a little bit of context. The Curriculum and Evaluation Standards for School Mathematics published in 1989 by the National Council of Teachers of Mathematics set forth a broad vision of mathematical content and pedagogy for K-12 in the United States but at that time there were no curricula that aligned particularly well with this vision. So, the NSF issued a series of RFPs for teams of authors consisting of teachers working together with professors including math professors as well as math education professors and working with publishers to produce commercially published materials to be used by teachers in classrooms with students and the resulting curricula are often referred to as “reform math curricula.” This is what I mean by that term and they differ in substantive ways from more traditional curricula in terms of both content and processes for school mathematics.
Math education researchers are studying many aspects of reform math curricula. People are analyzing the content: Is it correct? Is it what we value? Is it presented in a developmentally appropriate fashion?
People are examining the effects of reform curricula on students’ attitudes, persistence in course taking and achievement. Notice that all of these students are happy [referring to slide]. They all have smiling faces because they are learning mathematics. People are also studying the effects of reform curricula on teachers, on their knowledge and beliefs and so on.
So, these curricula are being studied by many people from many different avenues. Today I will be focusing on evaluation of the effects of one reform math curriculum on students' learning.
So, what does it mean to evaluate the effects of a math curriculum? Lay persons probably think that curriculum evaluation involves just giving kids tests of their mathematical knowledge. This is a naive answer. One of my goals this morning is to problematize this issue for you. The goal of any evaluation study is to obtain information about what works.
All we have to do is look at the results of the tests and look at the one where the kids score the highest, right? Again, I believe that this view is too simplistic.
Now, let me discuss the evaluation study on which I am lead author. First of all, I would like to point your attention to the composition of the team that was assembled to conduct this study. It included a mix of faculty and graduate students bringing complementary strengths. There were math educators, PhD mathematicians. We brought in statistical consultants as needed.
Also, we brought various cultural perspectives to this project and together we formed a strong team. The purpose of our research was to compare the effects on students of the Core-Plus approach to algebra with the effects of more conventional high school math curricula.
This was our main research question: What is the algebraic understanding, skill, and problem-solving ability of Core-Plus and control students ending the third year of high school mathematics?
Now, the focus of my talk, the methodology employed. Given the purpose of the study and the research questions we decided to use a mixed methods approach and this involved obtaining quantitative and qualitative data. Our main objective in the study was to assess students' algebraic understanding. So, this was on student assessment, but we also knew that valid evaluations of any curriculum require studying the effects of true or faithful implementation of the curriculum. So, we developed interview protocols to help us understand how teachers in the study were using their curriculum.
Using this framework from the recent Educational Researcher article by Johnson et al., the Core-Plus evaluation falls into the lower left quadrant. Our primary design was quantitative but we did collect some qualitative data and we did not have two distinct phases of data collection separated in time.
There were two main findings from our study: first, that Core-Plus students are better at applied problem solving, and second, that control students are stronger at symbol manipulation. There was value added to our using the mixed methods approach.
In the paper you can read about two sites where we obtained contrary findings. For example, the student data at site 4 showed the Core-Plus students' performance matched the performance of control students on problems involving algebraic symbol manipulation and this was contrary to the general finding that control students fare better than Core-Plus students on these problems.
The interview data was important in helping us understand this discrepant finding. In particular our interviews with the Core-Plus teachers led us to understand that they had supplemented the Core-Plus curriculum with materials giving students more practice on traditional algebraic skills. Without this interview data we could only have speculated about possible factors that contributed to this anomalous finding.
So, we used the mixed methods approach, a battery of paper and pencil assessments together with teacher interviews.
Irrespective of any study design researchers face challenges and difficult decisions concerning data collection and analysis options and today I will outline three of the challenges that we faced and it is important to note that these are not challenges we encountered because we used mixed methodologies. Rather these are challenges anyone faces who does curriculum evaluation and I hope this will give you some insight into the complexity of this type of work.
Three challenges we faced are ensuring students' opportunity to learn the tested content, ensuring that the assessments capture students' thinking — what they know and are able to do — and ensuring that we tested true or faithful curricular implementation.
I would like to point out that this is just the tip of the iceberg. There are many other issues with which we struggled that I won't have time to discuss in my presentation including questions about participants, test administration, and data analysis. This is not a complete list either and perhaps I can talk more about these issues if you are interested over lunch.
For the first challenge, opportunity to learn, of course it makes sense to assess students on mathematical topics that they have had equal opportunities to learn but this is easier said than done especially when we are talking about curricula that have such very different approaches to algebra. The Core-Plus approach to algebra differs in many substantive ways from more traditional algebra curricula. The Core-Plus approach emphasizes mathematical modeling and use of graphing calculators while students work together on real-world problems.
In Core-Plus, algebra is taught together with other content strands. Topics are organized in a different fashion and there is much less attention in the curriculum to formal manipulation procedures which is at the heart of traditional algebra. So, we were faced with a dilemma of developing assessment items that allowed students to be successful irrespective of curricular approach.
This is the schema of mathematical modeling that guided our thinking about designing assessment items [referring to slide]. In the paper you will find a general description of this model, but I would like to give you a concrete example from my experience as an applied mathematician. When I worked at the Eastman Kodak Research Laboratories one of my main tasks was to work with a team of five mathematicians, including myself, to develop a mathematical model of the serum development process. So, the mathematicians met with scientists, engineers, and chemists to learn about the physical situation, what was involved in serum development, and then we went back to our offices and developed a mathematical model. We developed numerical methods for solving the equations. We generated results, then went back to the scientists and learned more about the physical situation and had them help us interpret the results, which led to a further refinement of the model and so on. So, there is a mathematical model in this cyclic process and it very much guided our thinking in the development of the assessment items.
For this schema we identified three main components of effective algebraic thinking: using algebraic ideas and techniques to mathematize quantitative problem situations; using algebraic principles and procedures, like solution of equations and inequalities, to produce results beyond information given in the original situation; and interpreting results of mathematical reasoning and calculations in the problem situation.
Using this theory we developed three types of assessments. First we thought that each component of algebraic problem solving and reasoning activity requires a variety of constituent understandings and skills. So, we assessed these components separately using problems quite typical of Core-Plus and other high school reform math curricula and we thought that Core-Plus students would fare better than students using more traditional curricula on this part of the assessment.
Second, because many high stakes college admissions and placement tests still require skill in algebraic symbol manipulation without the use of technology we developed make it algebra items, problems that call for transformation of algebraic expressions and solutions of equations in systems devoid of context. On these problems we hypothesized the control students would fare better than Core-Plus students.
Third, we thought that students who have effective command of algebra and functions can execute the complete process outlined in the schema of mathematical modeling I presented earlier. So, the third assessment consisted of comprehensive problems involving all three phases. By developing three types of assessments we honored both curricular perspectives, traditional and reform, and we thought it was important that all participating students felt successful on at least some assessment items.
So, let me emphasize that the development of assessment items was not haphazard. It was guided by theory. Another challenge of curriculum evaluation concerns the format of the test. Specifically we wrestled with how to design assessments that elicit student thinking and give us information about that thinking. Also, we puzzled over the question about whether to study a few students through intensive individual interviews or to study a large number of students to give us insight for performance measures.
We decided to use a battery of paper and pencil assessments but of course any form of assessment has its trade-offs. Constraints of using paper and pencil assessments largely involve issues of efficiency. The low cost in terms of test administration meant that we were able to test a relatively large number of students, nearly 900 altogether, but when we looked at students' completed test papers we were disappointed at how poorly students explained their reasoning strategies. This was true for both Core-Plus and control students. Additionally we did not know how students used their calculators in solving the problems. The latter is especially important because reform curricula like Core-Plus make extensive use of calculators, but recall the purpose of the research, to compare the effects on students of the Core-Plus approach to algebra with the effects of more conventional high school math curricula.
We did not set out to study in depth students' reasoning processes and calculator use as they solve algebra problems, but we did study this issue in a smaller scale clinical interview study that we conducted to clarify and extend our findings from the evaluation study. Our main research question for the study was: How do 11th grade students solve algebra problems? What are the reasoning processes and how do they use calculators?
The clinical interview study involved 44 pairs of 11th grade students. We had two sources of data. First and most importantly we conducted task-based interviews with semi-structured probes. Each pair of students talked aloud as they solved the algebra problems and we audiotaped these conversations.
Secondarily we administered surveys to teachers and students regarding frequency and type of use of graphing calculators in each of the classes from which student participants were drawn. To recap, to learn about students and algebra in the Core-Plus evaluation study our primary focus was to understand what Core-Plus students had learned about algebra compared with students who had used a more traditional curricular approach and guided by a schema of mathematical modeling we developed various tests to assess students' understanding, skill and problem-solving ability in algebra and functions.
Our decision to test a large number of students using a battery of paper and pencil assessments meant that we necessarily obtained limited information about students' reasoning processes and how they used their calculators to solve the problems.
We then designed a study in which we focused on these things. Note that this clinical interview study has a different purpose and different methodology and different research questions. Certainly these two studies do not give us complete information about students and algebra. No one study does it all.
A third challenge for curriculum evaluators concerns curriculum implementation. Valid evaluation of the effects of any curricular reform necessitates studying true or faithful implementation of the curriculum. For our evaluation study of Core-Plus this involved determining whether the Core-Plus teachers were actually using the Core-Plus materials and whether they were faithful to the Core-Plus model of instruction, and similarly for teachers of the more traditional materials. This is a critical issue. What if we had found that the teachers using a more traditional curriculum were frequently supplementing with Core-Plus units and were using the Core-Plus instructional model? Then we would not have had a fair comparison because we would have been comparing CPMP versus CPMP.
So, to get information from participating teachers about their instructional practices we conducted interviews with each of them. We asked them about additions and omissions from the intended curriculum, their typical classroom instructional practices, their use of calculators, reactions to the Core-Plus curriculum and assessment of practice. These audiotaped interviews were very helpful to us as I explained earlier, but there were real trade-offs in our decision to use interviews.
On the positive side, teacher interviews are relatively inexpensive, especially compared with classroom observations, and interviews are less intimidating to teachers, again compared with classroom observations.
On the other hand, using interviews has its drawbacks. In my direct experience teachers tend to say what they think researchers want to hear rather than what they really think. Also there is trust that must be established between researchers and teachers, and in this study we met many of the participating teachers for the first time when we were collecting data. So, there was no time to generate that trust.
So, in the evaluation study our decision to use teacher interviews as proxies for classroom observations meant that we did not have an independent view of the classroom practice. This is a serious limitation. In fact, although there is a growing and convergent body of evidence suggesting that in general the reform math curricula are positively impacting students and teachers, most studies, including the Core-Plus evaluation study, do not provide sufficient information about how the curricula are being used by teachers and students.
This is a serious concern that is endemic to the reform and traditional curricula.
In the United States textbooks are a major determinant of classroom practice and therefore have considerable influence as to what and how mathematics is taught, but as articulated by Ball and Cohen a written curriculum cannot capture or fully represent teachers. They say that teachers necessarily select from and adapt materials to suit their own students. This creates a gap between curriculum developers' intentions for students and what actually happens in practice.
Developers' designs turn out to be ingredients in, not determinants of, the actual curriculum. Ball and Cohen's statement was echoed by Jeremy Kilpatrick, who says that classroom instruction can turn out quite differently for teachers implementing the same curriculum. He says that two classrooms in which the same curriculum is supposedly being implemented may look very different. The activities of teacher and students in each room may be quite dissimilar with different learning opportunities available, different mathematical ideas under consideration, and different outcomes achieved.
Moreover, what does it mean to implement a reform math curriculum with high fidelity, as the authors intend, when a basic tenet of effective mathematics teaching is that teachers know and understand the materials deeply and they are able to draw upon that knowledge with flexibility during classroom instruction?
What exactly is fidelity of implementation? People talk about high fidelity of implementation but are not careful in describing what they mean by fidelity. What does it look like? How do you know it when you see it? Does it vary by content and in what way does implementation of one reform curriculum differ from the implementation of another?
This last fact is a real problem because people tend to run together the five middle school NSF funded materials as if they were all the same but there are important differences between them and we need to understand those differences.
With funding from the Spencer Foundation through the National Academy of Education Spencer Postdoctoral Fellowship that I currently hold, I am investigating this issue of fidelity of implementation for two middle grades reform math curricula, the two curricula that have greatest market penetration: Math Thematics and Connected Mathematics. In this study the research question looks at the issue of what it means to implement Connected Mathematics and Math Thematics according to the authors' intent. In other words, what are the essential characteristics and acceptable adaptations of these two curricula? My goal is to generate operational definitions for fidelity of implementation of each of these curricula.
In my study of fidelity of implementation I have two major sources of data. First and foremost I am videotaping classroom observations and second of all I am conducting interviews with authors, teachers, and students of those curricula.
So, let me recap. In our Core-Plus evaluation study, in order to study student learning we needed information about how the curricula were being implemented, that is, we needed information about teachers' instructional practices. So, we interviewed each of the participating teachers, which served as a proxy for classroom observation, but these are just reports about teachers' practices, not direct information about their actual practice. To better understand how reform curricula are being implemented in classrooms I have designed another study where my primary method for collecting data involves videotaped classroom observations. Certainly these two studies do not give us complete information about curriculum implementation. Further investigation is warranted.
I have just outlined three challenges we faced when designing the evaluation study: what mathematics should be tested, in what format do we present the assessments, and how to make sure that we are testing true effects of the curricula. In discussing these three challenges I explained that one research study does not and cannot tell the whole story about the effectiveness of a curriculum.
So, let us return to the question I posed at the beginning about what it means to evaluate the effects of the math curriculum on students' learning.
As I hope I have shown you discussing the effects of a curriculum is not a simple matter. It requires multiple studies, better a body of work implementing multiple methodologies.
And you remember this question I posed about which math curriculum worked best. I believe this question is ill posed. It is like asking which car works best. As Professor Cook just said, "Do you go for a Cadillac or for a Yugo?" Certain cars are better depending on one's purpose for driving and the conditions under which the car will be driven, and one's choice of car is also a function of one's values: fuel efficiency, status and so on. Answering the question about which car works best only makes sense in a given context. In other words it is an opening for further conversation. Likewise the question about which curriculum works best is complex. There are important differences between reform and traditional curricula, and any curriculum is likely to work better for certain goals than for others and under some conditions better than others.
Given the challenges that researchers face and the multiple methods required to meet those challenges, the results can be expected to be more complicated than whether a particular curriculum works or whether one curriculum is the best. Ergo, we need to develop contrasting profiles of effects that might be expected from use of different programs.
This completes my talk.
Thank you.
(Applause.)