MR. DANIELSON: I am going to provide a little bit of context first. We run a peer review process that supports more than research. So, I am going to provide some context, first, about the breadth of what we do, and then I will focus in on research.
Broadly, our R&D program has several functions. This is the goal of the program, to not only support the development of research, but also to support the movement of that research into practice.
Around this circle are the several programs that we procured: research and development; technology -- part of the technology work we do is R&D work, but we do dissemination work as well; professional development; technical assistance and dissemination; something called the state improvement grant program; parent training and information; and large-scale studies and evaluation.
So, pretty much anything we would need to support to move research into practice, we have authority to do. So, in some ways, we are lucky in that respect.
In 1997, our statute was reauthorized. In that reauthorization, Congress provided that we should use a standing panel for review.
They did not go on to provide very much detail about how we would conduct the review. There is, however, some language about membership on the standing panel.
We use about 300 reviewers a year across all the competitions we run -- research, professional development and so on.
We have constructed study sections that correspond to the substantive areas within which we conduct reviews. In some cases, the study section could be in an area like professional development but also an area within professional development.
So, a study section could reflect both the nature of the work that we are doing -- the strategy, such as training within an area of expertise -- and the substantive areas within that.
That could mean that people would actually be identified as a research reviewer, but also then there would be an additional way in which people would be identified. That is typically done by the substantive expertise on the panel.
We then construct panels for individual competitions. I will give you one example. We do a large research competition every year. From year to year, it may have in the neighborhood of, typically, 150 to 200 applications.
We establish a set of panels for that competition. In field initiated research, they typically would be three-person panels organized around substantive areas.
For example, we can predictably expect that in the high incidence disabilities areas, like learning disabilities, we might have three panels, which means that we would need about nine reviewers in that area.
In some areas, like autism, we might only have one panel and, therefore, need only three reviewers. Those panels are typically constructed each year, and they are not necessarily the same people from year to year.
In our statutory requirements regarding the membership of the standing panel, the statute actually lays out the people who need to be represented on the panel: teacher trainers, researchers, dissemination specialists, administrators, parent trainers, policy makers, parents of children with disabilities, and individuals with disabilities.
Keep in mind that the standing panel is the panel from which we draw to do all of our competitions. Sometimes the competition we are doing is just to make awards for parent training centers. You would have parent trainers on this list of people who are on the standing panel.
In selecting subpanels to actually do review, the statute directs us that we should use persons with substantive and technical expertise, and also use parents, individuals with disabilities and persons from diverse backgrounds.
Where the language says subpanel, we interpret that to mean competition. So, for every competition we run, we intend to apply those statutory requirements.
That would be, again, in the case of field initiated research. The first one is not so hard to deal with. The second one, of course, is much more challenging for us to deal with. I will talk a bit more about how we deal with it.
In selecting reviewers, one of the major criteria, in fact, the major criterion, is to ensure that the people who are selected to actually conduct the reviews have the skills to apply the selection criteria in order to make judgements.
Now, I am going to talk in a moment about the substantive criteria. The selection criteria in the case of research would necessitate that a reviewer have expertise in research methodology, because everyone on the review panel has to make judgements about the technical adequacy of the proposals. In addition, they would have to have substantive expertise in the topic that they research.
That would be true even in the case of these parents of individuals with disabilities and diverse membership on the panel.
So, in attempting to ensure that we have those people on the panel, we would look and ensure that they have research methodology expertise, as well as substantive expertise in the area.
In some areas, that is less challenging than others. For example, in the area of deafness, there is actually a very long history of people who could potentially go into that field. So, there are a lot of people at the PhD level in that field.
In other areas, that is substantially less true and it is more of a challenge for us to get a group of people in other areas.
In fact, there are people now that, when we identify that they exist, after a while they tell us to stop calling them, because there are so few people.
We, of course, need to avoid conflicts of interest. That is a very big issue. Plus, in contrast with the NIH, it is somewhat of a problem, in that if people have submitted to a competition, under our rules in the Department of Education, they cannot serve as a reviewer in that competition. So, conflict of interest is actually a very big factor for us in the big competitions.
Again, availability is a very big factor. We do struggle to get people to review. I think somebody made the comment earlier, that people who are grantees often are easier to get to review, and we do exploit that, of necessity. Sometimes, in fact, we do use guilt as a way to get people to participate.
One way in which our peer review process is different from NIH is that our staff do manage the panels. However, our staff are really not in a decision-making role on the panels. They don't participate in the discussions, nor do they typically read the proposals. They are there really to just facilitate the discussion, move it along, make sure that we get through all of the review, and do a bit of maybe qualitative review.
One of the things that is sometimes a challenge for us is to make sure that reviewers do a good job documenting their comments.
So, one of the things that I ask my staff to do is to review the documentation before the reviewers leave town, to make sure that we are not in the position of mailing these back to people and finding out there really isn't very much documentation or justification, in some cases, for scores that applicants are going to receive.
Up until 9-11, we did all of our reviews on site. Since 9-11, particularly during the period immediately after 9-11, I guess in some ways we were forced to consider off-site reviews.
We continue to permit that. It is something that we continue to be maybe less than enthusiastic about doing, but we feel that, at least for now, it is something that we may have to do in order to get enough people to be willing to review.
Applications are provided prior to the panel meeting, and the panelists independently rate and score them before they discuss. There is panel discussion and re-scoring, and each reviewer produces an individual summary of their review, but there is also a summary of the panel discussion that is submitted by the lead reviewer.
We use a 100-point scale for scoring, and I will show you in a moment how those points are allocated. There is, in addition, a qualitative recommendation: approval, disapproval, or conditional approval.
Usually, what a conditional approval means is that they believe that it should be funded, but they believe that there is some major issue that needs to be resolved.
I can tell you that one common basis for conditional approval is that they do not yet have consent of the school district in order to work and do research, presuming that they are working in a school. That would probably result in conditional approval until they provide satisfactory evidence that the schools are willing to work with them.
The other thing that the panel does when they do their discussion is that they sometimes generate, sometimes separate from submissions, what we call negotiation issues.
Sometimes they recommend an application for funding, but they see some things that they believe could be improved in the application. They may lay out recommendations for us to negotiate with the applicant, in the event that the applicant makes it into the funding range, as issues that we would clarify.
All of the summaries and individual reviews are returned to the applicant.
This is the story in research. Clearly, the criteria are different for programs other than research. High points actually go for the technical quality of the application, then the significance and importance, quality of personnel, quality of management, and adequacy of resources.
Resources often deal with kind of whether the budget is really a cost effective budget for the research you are going to do.
As I listened to other people talk about what they do, one of the things that is kind of clear is that we love numbers, and we do a lot with them when we get them.
Initially, what we find is that -- it is not surprising -- there are data entry errors. So, we do a lot of edit checking, just to make sure that we have got the data entry done well.
There are also basic descriptive analyses: just looking at score distributions, what they look like, getting a mean, a standard deviation and that kind of standard analysis.
I will go into a little bit more detail on all of these. We do analyses to actually look at overall panels and how the overall panel worked.
Then we do analyses to actually look at individual reviewer performance. Then we do analyses that actually are more focused on individual applications, and potentially identifying anomalies in those applications.
This goes into more detail. I am not actually going to talk through each of these. We actually do these four transformations of all the raw scores.
Typically, if we have some serious outliers, we do a trimming procedure before we do the Z scores, and the Z score is part of the basis for the ranking.
We will show in the final rank the raw score ranking and the Z score ranking, but the ranking for the purpose of funding is the Z score ranking.
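As an illustration only, the Z score step described here might be sketched as follows. The talk does not say how scores are standardized or trimmed, so this assumes each reviewer's raw scores are standardized against that reviewer's own mean and standard deviation, that trimming drops a fixed number of scores from each tail before those statistics are computed, and that applications are ranked by their average Z score. The function names and the sample panel are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(raw, trim=0):
    """Standardize one reviewer's raw scores.

    raw:  dict mapping application id -> raw score on the 100-point scale.
    trim: number of scores dropped from EACH tail before the mean and SD
          are computed (an assumed trimming rule, not the office's own).
    """
    ordered = sorted(raw.values())
    kept = ordered[trim:len(ordered) - trim]  # trimmed sample
    mu = mean(kept)
    sd = pstdev(kept)
    if sd == 0:
        # Reviewer gave identical scores; no spread to standardize against.
        return {app: 0.0 for app in raw}
    return {app: (score - mu) / sd for app, score in raw.items()}

def rank_by_mean_z(panel, trim=0):
    """Rank applications best-first by their mean Z score across reviewers.

    panel: dict mapping reviewer -> dict of application id -> raw score.
    Every reviewer is assumed to have scored the same applications.
    """
    per_reviewer = {r: z_scores(scores, trim) for r, scores in panel.items()}
    apps = list(next(iter(panel.values())).keys())
    mean_z = {a: mean(per_reviewer[r][a] for r in panel) for a in apps}
    order = sorted(apps, key=mean_z.get, reverse=True)
    return order, mean_z

# Hypothetical panel: R1 scores harshly, R2 leniently.
sample_panel = {
    "R1": {"A": 50, "B": 40, "C": 30},
    "R2": {"A": 95, "B": 90, "C": 70},
}
order, mean_z = rank_by_mean_z(sample_panel)
```

Standardizing within reviewer is one common way to keep a harsh or lenient scorer from dominating the ranking, which is presumably why a Z score ranking would be preferred to the raw ranking for funding.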
This actually shows everything we do in the analyses for each of our competitions. We do inter-rater correlations. Our average inter-rater correlation across our panels would be about .75.
One of the reasons why the inter-rater correlations are good is that we use inter-rater correlations as part of the decision making for whether we accept what we did.
That is, in some instances, we actually have, surprisingly, reviewers' scores that are negatively correlated with the review panel. That can be an artifact of an outlier, but we look very closely at the scores, and if we find, in fact, that we have inverse ordering of applications, we will, in the analysis we do, look very carefully at that.
In some instances, we will either redo an entire review, or replace the reviewer in that instance.
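A rough sketch of the kind of check described here, under stated assumptions: each reviewer's scores are compared, by Pearson correlation, with the average of the other reviewers on the same panel, and a reviewer is flagged when that correlation is negative. The talk does not specify which correlation is used or how "the review panel" score is formed, so both choices, along with the function names and sample scores, are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one list has no spread
    return cov / sqrt(vx * vy)

def reviewer_vs_panel(panel):
    """Correlate each reviewer with the mean of the OTHER reviewers.

    panel: dict mapping reviewer -> list of scores, with every list in
    the same application order.
    """
    result = {}
    for reviewer, scores in panel.items():
        others = [s for name, s in panel.items() if name != reviewer]
        rest_mean = [sum(col) / len(col) for col in zip(*others)]
        result[reviewer] = pearson(scores, rest_mean)
    return result

def flag_negative(panel):
    """Reviewers whose scores run opposite to the rest of the panel."""
    return [r for r, c in reviewer_vs_panel(panel).items() if c < 0]

# Hypothetical three-person panel scoring four applications.
sample_panel = {
    "R1": [90, 80, 70, 60],
    "R2": [85, 82, 65, 55],
    "R3": [70, 72, 74, 76],  # orders the applications the opposite way
}
flagged = flag_negative(sample_panel)
```

A flag like this is only a prompt for the closer look described above. With three-person panels a single outlying score can drive the correlation, so the scores themselves still have to be inspected before a review is redone.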
There is kind of a qualitative aspect of a review where we actually do talk to the person who sat in on the panel, the staff person that managed that panel, to get some sense of what happened on that panel.
We also review the documentation to see what existed, but we will, as part of our post-panel discussion, in some instances redo entire panels or redo one of the reviews on the panel, and then typically we would redo the discussion as well with the remaining two reviewers and that one reviewer.
All of the discussion that might lead to some of this happening typically takes place in the context of a pre-funding meeting at the division level, where we are looking at extensive analysis.
The evaluation form that is done by the panel manager actually evaluates each of the reviewers on each of the panels, kind of a qualitative review of each of the reviewers.
In some cases, we have had situations where it looks like the reviewer maybe didn't -- it sounds like the reviewer maybe didn't read the applications, and that is the sort of thing that, if it did happen, it probably means the person wouldn't be asked back again, ever.
There are other things in this information, though, where it matters whether there is a pattern over time. Obviously, there could be an outlier in one review, but if it happens consistently, then we are likely to look again and say, this isn't somebody we want to invite back again.
Something I didn't tell you is that I think probably everybody in a management position in an organization that does a lot of peer review gets a lot of phone calls and correspondence about peer review.
So, that motivates me very highly to look very critically at the review, and if there are problems with the reviewer, we are not going to have them back again.
I would say that probably one of the things that is perhaps the most serious is if somebody has a pattern of very low scoring with very poor documentation. When applicants get that back -- low scores with no justification -- that is the sort of thing that makes my life very difficult. So, we don't like reviewers like that.
We have a long history of field initiated research, programs that have gone on for over 40 years.
I, of course, have not been there for all of those 40 years. So, there is an element of this -- I think it is probably the position in all agencies -- that we didn't create these systems and, to some degree, we are all stuck with them and with their limitations.
I think we try to be reflective about this, even though there is a broader department directive that lays out some of this.
We have actually sought input in 1995 and 2001, and more recently there was a Presidential Commission on Special Ed that actually looked at a whole lot of things enacted in our legislation.
One of the things they commented on was peer review, and we are reflective about peer review. In 1995 and 2001, it was primarily people that we support in our peer review process who gave us input, and we have been responding to that input.
I would say that -- I know Russ presented earlier today and, since we are in the same department, it is clear that some things are changing and probably will change in the department. Things will change on peer review, no doubt, as well.
These were the recommendations that came from the Presidential Commission. A lot of recommendations were to move in a direction that would make us look very much like the NIH process, separating the programs from the review.
Probably most important was to provide feedback to applicants more along the lines of NIH. That is something, in fact, we would like very much to do, but for us it would probably necessitate contracting it out. We don't have the staff time to provide that level of detail.
It probably also would necessitate -- we review and provide for the most part the same level of feedback for every application. We get about 1,400 applications, and across all the programs a year, probably 4,500 applications.
I think that, to provide that level of review, we would be forced to make a discrimination that maybe only half the applications would get that level of feedback, and a number of things like that.
Clearly, right now we are being reflective about the review. Some of the things like that may happen, even though we would have to contract out for that kind of process.
Challenges. Funding, clearly, to contract out, is a challenge for us. We get lots of applications. We do lots of reviews.
I think if we were to do some kind of a triage process, screen out some of these applications, perhaps with less than three reviews, that would obviously be a way to help deal with this.
Frankly, I think one of the big problems we have -- I suspect it is true for most agencies -- is that, at the point where the money runs out, applications are often separated by very small point differentials.
I think maybe there hasn't been quite enough attention paid to where you need to discriminate. As for the stuff at the top, I would bet you that some of the applications that rise to the top would rise to the top whether you had three or six reviewers, or if you changed reviewers.
I think those that are at the point where the money runs out often have very small differences separating winners and losers.
In fact, I think if we put standard errors around all of these applications, what we would find is a large band of applications that are essentially the same.
I think that is where, from the point of view of applicants, there is a reason to be concerned. I think from the point of view of the public, I am pretty comfortable, in a competition that funds only the top 10 percent, that we are funding high quality research.
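To make the idea of a band concrete, here is a minimal sketch, assuming the standard error of an application's score is taken as the sample standard deviation of its reviewers' scores divided by the square root of the number of reviewers, and that an application counts as being in the band when the cut score falls within one standard error of its mean. Neither rule comes from the talk; the names and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_se(scores):
    """Mean and standard error of one application's reviewer scores.

    Needs at least two scores; with three reviewers the estimate is crude.
    """
    return mean(scores), stdev(scores) / sqrt(len(scores))

def near_cut(apps, n_funded):
    """Applications statistically indistinguishable from the cut score.

    apps:     dict mapping application id -> list of reviewer scores.
    n_funded: how many applications the budget covers.
    The cut score is taken to be the mean of the last funded application.
    """
    stats = {a: mean_and_se(s) for a, s in apps.items()}
    ranked = sorted(stats, key=lambda a: stats[a][0], reverse=True)
    cut = stats[ranked[n_funded - 1]][0]
    # Keep applications whose one-standard-error band contains the cut.
    return [a for a in ranked if abs(stats[a][0] - cut) <= stats[a][1]]

# Hypothetical competition with money for two awards.
sample_apps = {
    "A": [95, 96, 97],
    "B": [88, 90, 92],
    "C": [87, 89, 93],
    "D": [70, 72, 71],
}
band = near_cut(sample_apps, n_funded=2)
```

In this made-up case the last funded application and the first unfunded one land in the same band, while the top and bottom applications fall clearly outside it.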
I can understand that people who are a point or two below the cut point may feel that they should have ended up with something. I think that is something that we need to worry about. Thank you.