Workshop on Understanding and Promoting Knowledge Accumulation in Education:
Tools and Strategies for Education Research
Day 1 – June 30, 2003
Remarks by Dr. Barbara Schneider
DR. BARBARA SCHNEIDER: I would like to start by talking about what I think are some of the factors that bear on - and I am going to change this now to “knowledge accumulation,” rather than “accumulation in educational research.”
So the field of educational research has become increasingly diverse and fragmented, resulting in different understandings of what constitutes scientific inquiry, evidence and interpretation.
It is unrealistic to expect all educational researchers to share identical perspectives on researchable problems, given their dissimilar disciplinary training and methodological approaches.
However, the lack of a self-regulating community that appreciates diversity, yet upholds a set of principles for quality research, undermines both the work and the professionals in the field. So, then, how do you create scientific norms that unite a community of diverse paradigms and methods? You can understand why, as a sociologist, I am immediately drawn to this whole question.
Replication presents us with an important, unifying concept, and a starting point, it seems to me, for constructive dialogue. Unless there is some commonality of understanding for why we need replication, the mechanisms for ensuring the replication of research are probably not worth pursuing.
I am going to assume that even those with the most constructivist beliefs about science would agree that replication is important. Replication begins with data sharing. For it is the sharing of information about studies, including the actual data upon which findings are based, that makes replication possible.
Data sharing is essential to replication, and it is a vehicle through which investigators can build upon designs, create and revise measures and study different populations for purposes of developing new theories.
Replication involves both applying the same conditions to multiple cases and replicating the designs, including cases that are sufficiently different to justify the generalization of results in theories.
Without convergence of results from multiple studies, the objectivity, neutrality and generalizability of research are questionable.
I would like to begin by looking at what some of the mechanisms are that facilitate replication.
In some fields, there is a tradition of replication and data sharing. For example, in the natural and physical sciences, there are norms about data sharing, which are, in part, a response to changes in how research is conducted in these fields.
At one time, investigations in fields such as astronomy were based on natural observations. Today, natural observations have been transformed through the collection and processing of images of phenomena, images and information that are available to multiple investigators simultaneously.
Consortiums of scientists work on data sets housed in centralized facilities. Similarly, in physics, there are projects where researchers, through various technologies, work out problems at the same time in locations throughout the world. Organized in teams and using the same databases, scientists race to be the first in making new discoveries.
Attentive to advances in the field, researchers in the natural and physical sciences are quick to replicate experiments or reanalyze data from simulations reported at scientific meetings and in journals.
The more significant a study’s findings, the greater the likelihood that other investigators will attempt to replicate the results.
Looking at the natural and physical sciences and some of the social sciences, it appears that there are at least three mechanisms that are especially useful for encouraging and facilitating replication.
Now, I am not going to go big like David. I am going to go very small, and these are kind of baby steps, it seems to me, but, at the same time, they would make a very big difference with respect to some of these kinds of issues.
The first is upholding ethical standards for replication and data sharing in the professional associations.
The second is reinforcing these standards by requiring researchers to abide by them when publishing findings in professional journals.
And the third is maintaining institutional infrastructures that assist researchers in data sharing and provide opportunities for study replication.
So let’s start with the - one on professional ethics. The matters of replication and data sharing are so fundamental to the social and physical sciences that many of the professional associations have enacted ethical codes for the sharing of data and pertinent documentation.
For example, in the ethical standards of the American Sociological Association, there are six key elements regarding data sharing, and when these were written, they were written in conjunction with APA. I don’t want to read through every one of these. There are basically five different slides. Let’s just start with the first one: “Sociologists make their data available after completion of the project or its major publications, except where proprietary agreements with employers, contractors or clients preclude such accessibility.” As you go down this list, you’ll see how, in fact, data sharing is explicitly identified within ASA.
Now are we - yes, but I want people to have a chance to at least see them and read them to themselves.
The ethical standards of ASA, as I had said, were written in conjunction with APA. Similar ethics policies regarding data sharing are not included in the AERA ethics codes. Last revised in 2000, the AERA ethics codes cover few of those items related to data sharing and they have a much more abbreviated way of dealing with issues of confidentiality.
Now, let’s move on to the journals. In the physical and social sciences, norms for replication are reinforced in professional and disciplinary journals. In journals such as Science and Nature, once an article is published, the author has to make data available to those who wish to replicate the results. What this actually means is that when the manuscript is accepted, the author is required to make all materials and methods available to researchers for their own use. In both Science and Nature, there are specific procedures for data sharing that the author is expected to follow once a paper is accepted for publication.
I want to underscore that these procedures are not simply web contact addresses. Rather, the data must be arrayed in identified files that directly correspond to results reported in tables and figures in the manuscript. Nature requires that any supporting data sets for which there is no public repository must be made available to any interested reader on and after the publication date. The author is expected to provide a URL to a specific website containing the data. Researchers who encounter a persistent refusal to comply with these guidelines are instructed to contact the editor with - and they have an actual procedure for what is called a “materials complaint.”
The social sciences also have procedures for ensuring the replication of research. Although these procedures are less established than those in the physical sciences, as in the physical sciences, the replication issue is made easier by journal guidelines. In sociology, there is a substantial body of research based on national data sets that is available to researchers for reanalysis.
Demographers, for example, work primarily with secondary data sets, although the use of other types of data is now more common. Demography, the leading journal of demographers, requires authors with accepted manuscripts to preserve the data used in their analysis and to make the data available to others at a reasonable cost from six months after the publication date to a period of three years thereafter. Exceptions are possible.
Now, one of the premier ASA journals, the American Sociological Review, has a similar policy, and in the manuscript guidelines, the ethics code for data sharing is reiterated. And, finally, the third point has to do with an infrastructure for replication.
Now, there are institutional infrastructures that facilitate study replication and data sharing. The government has several centers and institutes that are designed and operated to make data accessible to researchers. One of the most helpful of these, I would say, is NCES, the National Center for Education Statistics, and somebody else is going to address this, so I won’t go on with NCES.
Instead, I will go to ICPSR. That is the Inter-University Consortium for Political and Social Research at the University of Michigan, which was established in 1962. It maintains and provides access to a vast archive of social science data for research and instruction and offers training in quantitative methods to facilitate more effective use of this data. There are over 500 member colleges and universities around the world that belong to the consortium. Faculty and students in member institutions can browse the data sets, which are identified by thematic categories, one of them, of course, being education.
Another type of institution - and I wanted to talk a little bit more extensively about this - that includes both survey data and qualitative data is the Murray Research Center at the Radcliffe Institute for Advanced Study at Harvard University. The Murray Center is dedicated to the study of lives over time and promotes the use of existing social science data to explore human development and social change. Founded in 1976, the center has established a national archive of materials from over 270 studies which are available for new research.
Now, the nice thing about the Murray Center, and how it is atypical, is that it allows new researchers to contact the subjects of some of the existing data sets in order to do follow-up data collection. Recently, John Laub and Rob Sampson wrote the award-winning book, Crime in the Making: Pathways and Turning Points Through Life, which was based on a follow-up of the Sheldon and Eleanor Glueck crime study, which began in 1940 and followed its subjects until 1963.
Similarly, we have other kinds of follow-up studies, including one by David McClelland and Carol Franz of the study that Sears, Maccoby and Levin did of patterns of child rearing.
The Murray Center houses data that were collected using a variety of methods, including longitudinal, cross-sectional, survey, case study, experimental designs and those that use video tapes.
Two important new education data sets available at the center are the Achievement Motivation among Latino Studies and Eccles’ Maryland Adolescent Growth in Context Study.
And then there are publications at the center, including the guide to data resources and the longitudinal studies inventory, which include the sample, the methods and the measures used in the studies that are archived through the center.
Now, I am going to get into a little sociology - very slight. Well, it seems to me that norms are developed through sanctions and rewards and reinforced informally through close ties among members. In the instances of data sharing in fields outside of education, there are incentives, such as ethics codes and journal submission requirements, that encourage researchers to be forthcoming with their data.
Institutions like NCES, ICPSR and the Murray Center also facilitate data sharing. However, these organizations do not have the authority to force researchers to share information. To create strong norms for data sharing and replication, there need to be incentives for doing so. So, now, what is one kind of incentive for data sharing?
Data archiving is becoming a matter of course. At universities and research centers across the U.S., data are routinely archived. Computer technology and advances in software make the storage of data of all types, from extensive field notes to video tapes, considerably less cumbersome than it was a decade ago. However, archiving data for historical reference is quite different than producing data files that can be easily read and used by others for replication purposes. What is needed may be some type of incentive that would encourage investigators to think at the outset of their work about data sharing and opportunities for replication.
There has been some discussion about requiring investigators who receive awards from either federal or other sources to provide a public-use data file at the completion of their work.
Presently, I am funded by the Alfred P. Sloan Foundation and have been for the last 15 years. Part of my funding agreement with them is that I will make public-use data sets available as soon as the data are cleaned. I follow the procedures of the IRB at my university and the IRBs at universities across the United States, and, in fact, the world, to try to work out different kinds of circumstances, so that, in fact, investigators can have access to survey, time-use and extensive interview and case materials on our 500 families.
So now where does replication occur? On replication in education, at least at a quantitative level, we have heard a lot from Helen today. She brought up Jim Coleman and talked about his study, but what David Cohen said is really the most important part of that, at least for this part of my talk: once those study findings were released and there was so much of an uproar about them, Harvard University undertook a year-long reanalysis of those findings. So this wasn’t just looking at the numbers. This was actually running the data, as David explained, because Mike Smith was one of those people who actually reran the data. Well, in 1982, when Coleman again released his findings, this time for High School and Beyond, again there was a great deal of criticism, and again there was secondary analysis. If you look at what went on at that time, Stanford University, under the direction of Hank Levin, did a similar kind of thing to what went on at Harvard, and that resulted in a two-volume book.
So this type of reanalysis, I would say, that has occurred with these old national educational studies is not occurring right now with NELS and ECLS. These are two massive national education studies. When these data sets were released, individual investigators produced relatively small-scale studies on a variety of subjects, some of which are very narrow and consequently rarely replicated.
However, some small studies have produced significant results that it seems to me should be replicated with different samples and using other types of designs.
Two other studies, both of which are controversial, Helen talked about today: the Tennessee STAR experiment and the New York Voucher experiment. However, there is one important caveat that wasn’t brought out. Those data are not accessible to all researchers across the United States, and, in fact, this is a bone of contention among a number of researchers. I know this because I was just the editor of EPA, and we have letters about some of the problems that are associated with, in fact, getting access to data.
So, now, what about replication on the qualitative side? I think that very often people worry about replication when using qualitative data. I thought that one thing I might bring to the panel and to this audience is a new study that is dedicated not only to replication, but to intensive data sharing and analysis that is ethnographic. This is the three-city ethnography funded by NICHD, and it is conducting a fine-grained assessment of how, over time, welfare reform policies influence the day-to-day lives of low-income African American, Latino, Hispanic and non-Hispanic white families. Located in Boston, Chicago and San Antonio, the study focuses on the interaction of welfare policies, family behavior and child development. Led by a team of senior researchers - Linda Burton and William Julius Wilson - who have directed a number of long-term ethnographic studies of urban economically-disadvantaged families in low-income neighborhoods, the ethnography team consists of over 210 research scientists, ethnographers, qualitative-data analysts, system programmers and staff, all of whom are committed to conducting high-quality ethnographic research.
The three-city ethnography project is important as it underscores how research with qualitative data can be shared among multiple researchers.
As Burton recently reported, the study has already generated an extensive qualitative data set consisting of over 45,000 pages of field notes and supporting data which have been organized into a consistent data-management system.
As Burton writes, “The qualitative data analysts read field notes about families and neighborhoods to which they are assigned, build organizational systems to classify the material and look for emergent and reoccurring themes. The goal is to prepare materials for several layers of analysis, a truly ongoing and collaborative process.”
Now, okay, there are several reasons for presenting these examples. One is that replication is neither comforting nor comfortable, as the controversy over the Coleman Report makes clear. However, strong training in collaborative research, where the work of investigators is routinely scrutinized and graduate students are fully engaged in the project, such as in the instance of the three-city project, helps to minimize personal risk while maximizing the opportunity for replication.
Now, I just want to speak a little bit about facilitating data sharing through secondary analysis, and I want to do this because this is a lot of the work that I have done and it, I think, is helpful in understanding these issues of knowledge accumulation.
The national education data sets are designed to represent the population of the U.S. or certain age groups within it. Data sets that are based on probability samples, such as ECLS, NELS and Baccalaureate and Beyond, are potentially useful for testing and constructing measures, contextualizing the results of smaller-scale studies and training graduate students.
So are these data sets perfect? No, of course they are not perfect, and when I’m speaking now I’m talking primarily about the longitudinal data sets. With these great big efforts, there are all kinds of issues that come up. Some critics have argued that some of the data sets do not incorporate advances in the field.
Another problem has been that investigators working with the large data sets are often under-informed about statistical advances that improve the analytical capacity of the data sets. These two things, however, I would say, are really starting to be minimized. If we look at some of the kinds of methodological work that is going on right now, procedures have been developed for addressing problems of selection bias, and imputation techniques now allow investigators to avoid problems associated with missing data.
Advances in statistical techniques are increasing the accuracy and the power of multilevel analyses of longitudinal data. Propensity analyses, which I think is what Helen was alluding to, which predict outcomes based on the probability that actors will behave in certain ways, are yielding results that are nearly as powerful as those achieved in carefully-designed random experiments.
And then I just want to get to some data that my colleague, David Stevenson, and I put together using five different data sets over time that show how educational expectations among high-school students have changed. I think this makes the case very clearly. If you look at that very last bar in the histogram, you’ll see at the very top the proportion of kids that expect to achieve only a high-school education. The rest of the people in that bar - and these are based on populations of people across the United States; it’s a weighted number - are the kids that actually think they are going to go to college. If you go to the purple and blue bar, those are the ones that think they are going to graduate, and the very bottom - I think it’s purple really - are those kids that think that they are going to get graduate degrees.
Now, if we look at the large-scale data sets, I want to say that they - in summary - that they can do three very important things. One is that - okay. I think I’m moving - The first is that they can be used to contextualize the results of smaller studies. We have done some of this cross work ourselves, and they also provide training for young scholars in quantitative analysis. In fact, you know, this is one of those places where you both can get the same answer.
Now, I have to take a little bit of time to talk about what we are doing at the University of Chicago. Yes, I see five minutes. This has to do with our IERI project, and I’m sure most of you know that IERI is a collaborative program and it is interested in looking at trying to understand - I want to - I’m moving very quickly. These projects are all focused on identifying and understanding the conditions that are essential for moving promising educational models and programs and strategies to scale.
Scale, in this sense, is seen primarily as taking educational interventions that prove to be effective in one setting to larger and more diverse populations in various sites using a variety of research designs that employ random assignment of subjects, observations, surveys and other methods. These studies are producing new information on learning, instruction and achievement.
So, basically, what you have are some of the kinds of things that David was talking about. In fact, his is an IERI project, which is a community of scholars working on a project and working on that project over time using a variety of different methods, primarily looking at achievement and with the idea that they will find promising new interventions that eventually can be brought to scale.
Now, what role do we have in that? Well, we have what is called the Data Research and Development Center. This is at the University of Chicago. Essentially, what we are trying to do is to improve the research capacity of faculty, students and scholars engaged in interdisciplinary work in the areas of learning, instruction and achievement.
And the DRDC is working with other IERI – all these acronyms - researchers to determine what programs and interventions are most effective for different students using IERI and other national and international data sets.
So over the past year, we have been working on building a community of scholars who share designs and results. We have been accomplishing this through several different activities. The first is by learning how individuals interact with each other and who they are relying on for advice in their research, training of students and their professional lives. Through a social network analysis, we anticipate learning what venues strengthen information sharing.
We have also started a process of building common measures that researchers can use across data sets. Currently, we are examining how factors such as socioeconomic status are measured across different data sets and determining how their predictive power changes, based on different cohorts. That is one of the reasons why I wanted to show you the work that David and I did, because, basically, this is a similar kind of project that we are going to be doing within the DRDC around other kinds of measures that we see in a lot of the IERI data sets.
Second, we are providing technical assistance to investigators. The types of technical assistance we are providing are project specific and build on the expertise of the investigators.
So, very quickly, we are working on questions related to study design, and, here, we are helping with sampling expertise. Then we are also working on access, cooperation and assignment, providing information on conducting scientific research in schools; on instrumentation, assessing the fidelity of implementation and sharing designs, instruments and items for measuring conditions that affect the implementation of the intervention; and, finally, on analysis, advising on what analytic procedures, such as hierarchical linear analysis, would be most useful in measuring the effectiveness of the intervention.
To further the development of a professional community, we have also developed a website and are working with our computer science department to establish virtual laboratories where investigators will be able to interact with each other through learning access grids and other technologies.
And then I want to end with a caveat.
See, you didn’t even have to show me the piece of paper.
We don’t have all of the answers. There are several centers and large-scale projects with talented individuals who are engaged in rigorous scientific work and successfully working and training students to undertake many of the core activities of the DRDC.
Some of our projects are unlikely to work as planned or produce the outcomes that we anticipated. On the other hand, some of the activities we pursue we hope will make a difference, and perhaps these projects will be imported, adopted and transformed in other areas.
In a somewhat limited, but significant way, one of our goals is to strengthen research in education that encourages the systematic study of innovation through, as Stephen Jay Gould has said, replication with difference, building the best case for generality.
Thank you.