The National Academies: Advisers to the Nation on Science, Engineering, and Medicine
NATIONAL ACADEMY OF SCIENCES NATIONAL ACADEMY OF ENGINEERING INSTITUTE OF MEDICINE NATIONAL RESEARCH COUNCIL

MS. AUGUST: Before I start this talk, I just want to let you know that this report was done five years ago. Five years is a long time.

So, I really had to read through the report again, in order to prepare this presentation, and I hope that I really remember enough of what we did to be able to answer your questions. We will see how it plays out.

The report was prepared for the National Educational Research Policy and Priorities Board (NERPPB) in October of 1998. Its intent was to review the implementation of the then OERI/NERPPB standards for evaluation and peer review of grant and cooperative agreement applications.

It was designed to assist OERI and NERPPB in considering whether to make changes to the standards or their application.

Vladamir Morosky(?) and I were the authors of the report under subcontract to AIR. An expert panel oversaw our work. They helped frame the study, direct its progress, and critique the report, and they came to consensus on its recommendations.

The panel included Penny Peterson, who is here with us today, and thank you for that, Chris Kroft(?), Paul Hakel(?), Sharon Lewis, who was director of research for the Council of Great City Schools, and Judy Shumley(?), who was then assistant director for social policy and planning for the National Science Foundation.

I just want to say that this was a really excellent panel, and they were very involved in the process. They helped us, and consulted with us on the methodology that we used, and reviewed drafts of the report, and came to consensus on its recommendations.

So, what were the research questions that we framed here? What are the appropriate uses of peer reviewers with field initiated and center competitions?

Is the selection process of peer reviewers comprehensive and unbiased? Are the peer reviewers appropriate for the applications they review?

Are peer reviewers adequately trained for the review process? Is the review process for each type of competition carried out effectively?

Does the peer review process yield reviews that provide information to make funding decisions? How are funding decisions made? What changes or reforms in the peer review system will help improve the system?

Okay, now, let me tell you -- what I hope to do is really tell you what we did, briefly, then describe the results, and then Penny is going to talk to you about the recommendations.

With regard to the field-initiated studies (FIS) panels, we studied 20 review panels across five institutes in two fiscal years, and those were fiscal year 1996 and fiscal year 1997.

Just to tell you how many panels there were, just to give you an idea, in fiscal year 1997, there were 38 panels across the five institutes.

One panel from each of the five institutes was selected for more in-depth study, because the institute director identified it as successful. The remaining 15 panels were selected from a stratified random sample, ensuring that all institutes and years would be represented.

Once we identified a panel, then we selected applicants and reviewers from the panel. This approach entailed 34 interviews of applicants and 40 interviews of reviewers across the competition.

To the extent that they were available, we reviewed the material associated with the randomly selected applicants and reviewers. We had 38 applications, 100 reviews and reviewers' resumes.

For the center competitions, we conducted two in-depth case studies of competitions, one identified as highly successful by ED staff and the other identified as problematic.

We also interviewed applicants and reviewers for all the other competitions. There were seven in total.

The approach entailed 17 interviews with applicants and 14 interviews with reviewers across the center competitions.

To the extent that they were available, we also reviewed materials associated with the randomly selected applicants and reviewers, 12 applicants, 41 reviews and reviewers' resumes. A total of 61 reviewers reviewed 47 applications for the center competitions over the two years.

So, this just gives you a sense of the work that we were doing, where we got the information that was analyzed.

Now, what did we do with all this? For the interviews of reviewers and applicants that I just mentioned, we had a structured interview protocol that we used to interview the applicants and the reviewers for these randomly selected panels.

We examined reviewer credentials and the reviews themselves, and I will discuss this in detail in a few minutes.

In addition to the FIS and center reviews I just described, we also examined six of the review panels in greater depth. All these panels were panels selected by ED staff as being highly successful.

We also looked in more depth at two center competition panels, one nominated as successful and one problematic, as I mentioned.

So, we did in-depth work on those six FIS panels and the two center reviews, as well as doing a random selection of reviewers and applicants for the other panels that we randomly selected.

If you are interested in details, you can look at the report, because it was done fairly systematically, that is, which applicants and which reviewers we decided to go with.

We also analyzed the laws and regulations that govern the grants and cooperative agreements. The analysis considered the requirements of the laws, regulations, and their implications in conducting competition.

The analyses also considered any problematic elements and conflicts among requirements. We interviewed Department of Education staff involved in administering and overseeing the competitions, and staff from several professional associations. Relevant comments and insights were incorporated into this analysis.

We also examined the peer review process conducted in other offices within the Department of Education, federal agencies and private foundations.

The other offices within ED included an office in the Office of Special Education Programs and the fund for innovation in secondary education. The other federal agencies included the National Institutes of Health, the National Science Foundation and the National Endowment for the Humanities.

Because the report is extensive and the time is limited, I am not going to report on findings from the case studies or on the review process in other offices and agencies, but this information is in your book.

Before I start, I just want to say, I am going to focus on two key areas, which I think are maybe things that should be common.

One is the fit between reviewers and the applications they were reading, and the other is the quality of the review.

Before I report on these findings, I want to describe the requirements that are in the standards for peer review, that are germane to this discussion.

The standards require that individuals selected as peer reviewers have the following qualifications: demonstrated expertise, including training and experience, in the subject area of the competition; in-depth knowledge of policy or practice in education; in-depth knowledge of theoretical perspectives or methodological approaches in the subject areas of the competition.

I would like to note that a major issue that arose in practice was in determining whether the subject area defined by the standards meant research about the subject area, or just general knowledge that could be a result of practical experience.

With regard to selection, I also would like to note that there is little doubt that considerable effort is required to find individuals who meet all the qualifications of the standard.

Nonetheless, the standards appear to be an effort to raise the bar, ensuring a high quality group of peer reviewers, by requiring that all members of the panel meet these qualifications. The standards clearly indicated that creating a panel was not to be a mix and match effort, with one reviewer representing the subject area, one practice, and the other methods and theory, as someone described it.

In fact, many of the reviewers were confused about this, as were many people in education. They seemed in some cases to really take a mix and match approach, where they did just that, which resulted in two and sometimes three reviewers having no expertise in research methodology.

In addition to the individual qualifications for peer reviewers, the standards required that the secretary select, to the extent feasible, peer reviewers for each competition who represented a broad range of perspectives.

The standards also prescribed certain aspects of the competition, including when peer review was to be used, the minimum number of reviewers on an application, reviewer obligations, the review process, and evaluation criteria.

Again, with regard to the evaluation criteria, the standards specified a set of broad criteria that include national significance, quality of the project design, quality and potential contributions of personnel, adequacy of resources, and quality of the management staff.

The standards specify that reviewers evaluate and rate each proposal based on these evaluation criteria, and support the ratings for each proposal with concise written comments based on the reviewer's analysis of the strengths and weaknesses of the proposal with respect to each of the applicable technical evaluation criteria.

So, it is important to keep this in mind, because these standards are the standards that establish the procedure that OERI, now IES, has put in place to select reviewers and conduct the review process.

Now, let's see what actually happened. So, again, I am going to focus this talk on two things: the first is the fit between the reviewers and the applications they were reviewing, and the second is the quality of the reviews.

To assess the substantive fit between reviewers and applications, we looked at whether there was a match between application content, theory and method, and the background and experience of the individuals who conducted the reviews.

We did the following. We actually requested all the resumes of the reviewers from OERI, and that is what we used.

We didn't do anything more elaborate than that, because this is the information that the OERI staff actually uses to make decisions about which people should serve as reviewers.

For the FIS competitions, for each of the panels and applications selected through the stratified random selection, as I said, we requested the resumes of all three peer reviewers from OERI.

We also examined the fit between applications and the reviewer credentials for the five center competitions, for which we had the necessary information.

This is the approach we used. If an individual had a doctorate, we looked at the field of the doctorate and the individual qualification.

If the field generally required research for a doctorate, we assumed that the individual had a research background.

If the field might or might not require original research -- for example, people working in instruction and administration -- we looked at positions held and publications as well, and we did the same for individuals without doctorates.

We didn't attempt a detailed match between reviewer credentials and specific applications. Not only were the data insufficient to permit this type of analysis, but we recognized that this would have been a much stricter rule than was suggested in the standards.

Again, remember, each of the institutes had their own focus. So, it was the focus of the institute that we looked at with regard to assessing the qualifications of the reviewers to review these applications.

Reviewers were also interviewed regarding possible concerns about serving as reviewers. This is an interesting endeavor.

We asked them about lack of knowledge in the subject area and/or methods, conflicts of interest, timing of reviews.

We asked whether their concerns had been addressed satisfactorily by the Department of Education staff, the extent to which the subject area of the competition was described in sufficient detail for them to determine whether they were qualified to assess applications, and their assessment of their own and their fellow panel members' qualifications for serving as reviewers.

So, what did we find here? While most of the reviewers in the sample had conducted research in education, a sizeable minority had not.

In the fiscal year 1997 FIS reviews, the group for which we had the greatest amount of systematic information, the following was true. Of the 35 reviewers on 12 panels whose resumes were reviewed, 17 appeared to be educational researchers. An additional six may well have had research experience, although the resumes for these individuals were insufficient to make that determination.

The remaining 12 individuals -- now, this is 12 out of 35 and they were randomly selected -- about a third did not indicate any research experience or publications on their resumes.

They included persons who had served as teachers, school administrators, state officials, tribal officials, teacher trainers, and university administrators.

Most had a solid background in educational policy and practice and familiarity with the general subject being reviewed, but did not meet the criterion of having studied and conducted research in the general field in which they were reviewing applications.
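
[Editorial note: As a rough check on the resume counts just given for the fiscal year 1997 FIS reviewers, the breakdown can be tallied in a few lines. This is only an illustrative sketch. The category labels and variable names are assumptions introduced here, and the numbers are the ones stated in the talk.]

```python
# Counts as stated in the talk for the FY 1997 FIS reviewer resumes.
# The category labels are illustrative, not taken from the report.
resume_counts = {
    "educational researchers": 17,
    "possible research experience (resume unclear)": 6,
    "no research experience indicated": 12,
}

total_reviewers = sum(resume_counts.values())

# Share of the sampled reviewers falling in each category.
share_by_category = {
    label: count / total_reviewers for label, count in resume_counts.items()
}

for label, count in resume_counts.items():
    print(f"{label}: {count} of {total_reviewers} ({share_by_category[label]:.0%})")
```

[The three groups add up to the 35 resumes reviewed, and the last group is a little over a third of them, which matches the "about a third" in the talk.]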

In addition, across institutes, the most common area in which research experience appeared to be lacking was the design and conduct of evaluations, and this is evaluations for all different scenarios.

That is, we found that many reviewers did not have the methodological expertise needed to review the project design of the study that they were supposed to review.

This was a real problem because the design of the study was one of the major criteria used to rate the studies.

Of the 29 applications we matched to the reviewers, 10 were evaluation studies in whole or in part. Judging from their experience and publications, few of the individuals who reviewed these applications appeared to have conducted evaluations themselves, let alone experimental studies that required elaborate sample design.

So, what did we find out from looking at the center reviews? What did we find looking at the fit between the resumes of the people who were reviewing the center applications and those applications?

The panels for the center competitions appeared to be qualified to review them, by and large. Not all the members of each panel held doctorates or conducted research in the competition subject areas, but most of them met these criteria.

The exceptions among the five panels were the post-secondary improvement center, for which most of the reviewers had conducted research, but only a minority appeared to have studied reforms in secondary education, and the adult literacy center, for which only a minority of the reviewers had either doctorates or research experience.

We were provided with six of the nine reviewers' resumes. So, it is possible that the data of the missing reviewers would alter this.

So, what about these interviews? What did we find out from the FIS interviews? Of the 40 reviewers surveyed who had participated in the FIS competition, 34 had no concerns about serving as a reviewer at any time in the process. Now, this is interesting in light of what they say next.

Thirty-eight of 40 reviewers indicated that the subject area was described in sufficient detail for them to make a determination as to whether they were qualified to serve as part of the review.

Twenty-nine reviewers reported that their expertise was appropriate. If you are interested, I have information on why some of them thought that their expertise was not appropriate.

One expressed concern because of a lack of specific expertise in the subject area, given the range of proposals to be reviewed. She reported that her expertise was nonetheless appropriate because she was working on a panel with other reviewers who complemented her expertise.

Again, I noted that a lot of people thought that it was fine if only one person on the panel had the requisite methodological expertise.

Three were concerned because of a lack of background in research design and methodology. Three were concerned because they lacked expertise in the subject areas of the range of proposals that they had to review.

Okay, what about reviewers' impressions of the expertise of their fellow panelists? This was mixed. Across the two years, 18 of the 30 reviewers expressed satisfaction with the expertise of their fellow panelists. Twelve reviewers were more critical of the expertise. Nine cited a lack of expertise in research design and methodology as their main concern.

So, let's look at the center peer reviews. Of the 14 reviewers who participated in center competitions, nine had no concerns about serving as a reviewer at any time.

All but one said the subject area was described in sufficient detail for them to make a determination.

With regard to reviewer subject area of expertise, all reviewers reported that they had appropriate expertise.

With regard to their assessment of their colleagues' expertise, 11 reviewers were very satisfied with their expertise, and many commented positively on the variety of backgrounds or perspectives of the panel.

Now, what we found out from the interviews was slightly different from what we found out by actually looking at the match between the resumes of these reviewers and the applications they were reviewing.

MS. SCHNEIDER (Committee on Research in Education): When you talked to the reviewer -- [off microphone]. So, there are only three people on a panel?

MS. AUGUST: Yes.

MS. SCHNEIDER: For centers, it was like seven or eight?

MS. AUGUST: Seven or eight.

MS. SCHNEIDER: If we had eight people -- [off microphone].

MS. AUGUST: For the centers, I think that, except for the two center competitions that I mentioned, for secondary and adult literacy, the reviewers for the center competitions were, by and large, qualified, but I think it was very different for the other ones.

We sampled 20 panels across the two years for the FIS. Then we interviewed everybody on those panels.

MS. SCHNEIDER: So, if you only had an N of three, and two of the -- so, basically, it is two out of three. So, you would have a greater probability, you could get more people to agree it was adequate because there were a few more people on the panel, than if you only had three.

MS. AUGUST: Exactly. I think that was a problem and one of our recommendations actually stems from that as an issue.

I maybe should have explained a little bit more about these competitions. Basically, for the FIS competitions, there were three reviewers for 10 applications.

MS. SCHNEIDER: Then those never went to a full panel?

MS. AUGUST: No, they never went beyond that small group. For fiscal year 1996, actually the review was done off site, and the only time the panelists actually talked to each other was if there were some very large differences of opinion amongst the reviewers with regard to the applications.

So, nine or 10 applications were sent to three people, and these people independently rated these things and then the ratings went into the Department of Education and they convened a conference call with these people if they were very discrepant.

In fiscal year 1997, they actually brought people to Washington. Again, the applications were reviewed prior to arriving here. The meeting was held to sort of discuss the applications.

People then independently, or supposedly independently, were supposed to re-rate them, and then that is how it happened.

For the center review, it was all done on site and there were seven or eight people per competition.

MS. SCHNEIDER: I know that is not how the FIS competitions used to be run. I remember when I served on an FIS panel several times, we came to Washington and they were full panels with 10 or 12 people. When did they move to this kind of three person review?

MS. AUGUST: This was in fiscal year 1996 and 1997. They actually may have changed the procedure after that. We were actually asked to do this review because these standards suggested it, then OERI and the board.

They gave sort of the standards to competitions and then they said, why don't you look at what is going on here. So, we did.

So, it could be that the reviews that you served on were done differently because of this report. That is possible.

MS. SCHNEIDER: It was the practice that there were reviews in the field and then they were brought to panel and there was a panel review.

That was as late as probably 1992, 1993. They were full panels.

MS. AUGUST: So, what you are talking about was prior to what I am talking about. Here, there were only three people that looked at 10 applications.

MS. SCHNEIDER: I am not sure I would disagree with what you found.

MS. AUGUST: The match between reviewers and applicants, and then I am going to talk to you about the quality of the reviews.

I think the reviewers, by and large -- what is really interesting to me is that, for fiscal year 1996, actually the quality of the reviews was better and the quality of the reviewers was better.

I think that is because people actually didn't have to come to Washington to do this. That is, they could stay home and do these reviews.

The quality of the reviewers for the center competition was also, by and large, better. There were nine competitions, did I say? Two of them were really a problem that created really a lot of problems for the department because some of the applicants were quite angry about what happened. So, there was a lot of discussion about this, after a couple of these reviews I am referring to became very political.

By and large, what happened was that people in the department put a lot of pressure on solid researchers to come in and do these reviews, because they took them more seriously when they had expertise.

The center was getting a lot of money over five years, so they wanted to make sure they had decent people reviewing these things.

So, they really put pressure on academics in the field to come in and do that which is why I think, by and large, there was less trouble, at least for some proportion of those reviews.

Now, let me talk about the quality of the reviews. I just want to note that we really had to argue to do this. We thought, how can we do a report on the peer review process, if you don't make some effort to assess the quality of the reviews.

Originally, the contract that was let out by the department didn't have this as a component. I said, well, forget it, it isn't worth doing.

Put yourself on the line here and look at these reviews. It was also very labor intensive and not very entertaining.

I was very happy to go back and start doing second language acquisition research again, which is what I usually do, rather than spending a whole year interviewing applicants and reviewers on the phone.

We did over 100 interviews, and reading these resumes and reading these applications, but I thought it was really important to do that.

So, let me tell you what we found here. First, again, the standards specified that the reviewer comments should be concise.

I think what is meant by that is, does the reviewer provide brief, precise, specific and persuasive argument about the design methods, et cetera, of the proposed research.

The comments need to be related to the evaluation criteria, those five criteria I mentioned -- national significance, et cetera -- consistent with the scores.

If you make a comment that something is really excellent, then you don't give the section a score that indicates that it has a lot of problems.

The review should make clear whether the research is likely to yield valid and useful information.

Based on the expert panel's advice, we also added another criterion here, which we thought was important if we were going to assess the quality of these things, and that is, that the comments were sufficiently elaborated.

That is, is the reviewer's judgement amply and expertly justified, because that really, I think, should be in the statements anyway. That is the point of writing the review.

To determine their quality, we read 141 reviews produced for sampled applications in the fiscal year 1996 and 1997 competitions. So, 141 reviews as well as reviews for all the five center competitions.

I read at least two reviews for each of the sampled fiscal year competitions. So, for the assessment of reviews, we produced a one-page evaluation sheet listing a series of questions about the review based on the standards, including our additional standard.

In addition, applicants were interviewed to determine their assessment of the quality of the reviews they received. So, we rated all these reviews and we also interviewed the applicants themselves.

So, what did we find with regard to breadth of coverage? In terms of breadth of coverage, most reviews met the letter of the standards, in that they provided comments on each broad evaluation criterion, the comments were related to the evaluation criteria and the scores reflected the comments.

To the extent that comments were provided, they were most often found under the first criteria, national significance and design. The other evaluation criteria received much less attention, but let's look at the quality with regard to breadth.

Having noted the breadth of the reviews, it is important to note, as well, that most provided very little depth. Most comments were related to national significance and design, first of all.

With regard to both national significance and project design, most reviewer descriptions of application strengths did little more than identify or document what was included in the application, sometimes accompanied by a summary judgement of its quality, sometimes not.

So, what often happened is, the reviewer just repeated something that happened to be in the application, with no comment on that.

In addition, with regard to significance, most comments were related to the importance of the problem, the research it was addressing, rather than whether the research itself would be significant in addressing the problem area.

So, this was a problem. People were submitting an application to study science in middle grades, and the reviewer would say, this is an important problem. We need to improve the science of middle graders, but they wouldn't sort of tie the problem to the actual application they were reading.

In the area of design, comments were mostly related to whether goals, objectives, and outcomes, or the conceptual framework, were specified, rather than whether the design specifics were appropriate and rigorous.

This is a real problem, because the design was a large part of what you were supposed to rate in these applications.

Typical statements included: a set of performance objectives and a time line was presented. Research questions are stated. The objectives are achievable. The research design is rigorous. It was really pretty bad.

In describing application weaknesses, reviewers were more likely to express independent judgement in the area of design than in the area of national significance.

For example, a third of the FIS reviews included a relatively detailed discussion of design weaknesses, and most of these comments drew on the reviewer's independent and personal knowledge, not solely on the application. This is 30 of like the total sample we were looking at.

Although there was supposed to be a description of weaknesses under national significance, some reviews didn't consider this weakness in detail. I guess what I am saying is, not all reviewers were doing that.

Then, other sections of the review, like staffing, budget and management, really generated very little detailed discussion.

So, then we decided to categorize these reviews as good, bad or indifferent, based on their breadth and depth of coverage.

Of the 79 reviews conducted for fiscal year 1997, about a third were good. They were detailed assessments of an application's strengths and weaknesses, and they displayed the reviewer's knowledge as it was brought to bear on the application.

Another 20 percent of the reviews could be characterized as poor. They were so poorly written that it was impossible to know what positions they were taking or what advice they were providing, or they ignored the research components of research projects, focusing, instead, on the application's intervention as a program or a demonstration.

The indifferent reviews accounted for about half the reviews that we read. They were fragmentary. They listed elements in the design or in the needs or theory sections of applications as strengths, with little support for that designation, sometimes adding a short judgement, sometimes not. Most of the time they just sort of repeated what was in the application, rather than evaluating it.

Although far fewer applications were reviewed for 1996, the quality of those reviews appears to be somewhat better than for fiscal year 1997. This is what I was talking about earlier. However, of the 21 reviews we examined, almost half could be considered poor.

While some center competition reviews were detailed -- this is with regard to centers -- far too many fell in the indifferent category.

About half of the 41 center reviews were relatively detailed. The other half were brief and provided mainly the same types of short descriptive, or normative statements, as those described above for the FIS reviews.

What did we find out? What about our interviews with applicants? Overall, applicants' assessment of the reviews was mixed.

In terms of applicants' assessment of the usefulness and extensiveness of the reviews, eight applicants of 34 in the FIS competition gave the reviews low ratings on this criterion, 16 gave reviews mixed ratings and about 10 rated them high.

Ratings on the same criterion for center applications were nine of 17 low, five of 17 mixed, and three of 17 high.
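
[Editorial note: To put the FIS and center applicant ratings on a comparable footing, the counts just given can be turned into shares. This is a minimal sketch under stated assumptions: the names are introduced here for illustration, the FIS "high" count is given in the talk only as "about 10", and the center count of nine is read as the "low" group.]

```python
# Applicant ratings of review usefulness, as stated in the talk.
# Assumptions: FIS "high" is "about 10"; the center count of nine is the low group.
applicant_ratings = {
    "FIS": {"low": 8, "mixed": 16, "high": 10},
    "center": {"low": 9, "mixed": 5, "high": 3},
}


def rating_shares(counts):
    """Return each rating level's share of the applicants interviewed."""
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


for competition, counts in applicant_ratings.items():
    total = sum(counts.values())
    shares = rating_shares(counts)
    summary = ", ".join(
        f"{level} {n}/{total} ({shares[level]:.0%})" for level, n in counts.items()
    )
    print(f"{competition}: {summary}")
```

[On these figures, roughly a quarter of the FIS applicants interviewed rated the reviews low, against roughly half of the center applicants.]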

Actually, what these people said about these reviews was very consistent with what we were finding. The reasons for the FIS and center applicants' negative or mixed assessments were varied, including disagreement with the comments, comments that were considered superficial or irrelevant, no comments about design, lack of examples, comments that were illegible, limited explanation for comments, proposal not carefully read, large discrepancies among reviewer comments, reviewer comments too similar to each other, and summary statements that did not mesh with comments in the individual categories.

MS. SCHNEIDER: These are for everybody?

MS. AUGUST: Yes, this is both for the FIS and centers, and I think what I am reporting on here was the fiscal year 1997 FIS and center reviews.

In terms of the extent to which applicants considered that reviewers demonstrated appropriate expertise, which was the second question we asked -- the first was, you know, what did you think about the review, and the second was, what did you think about the expertise of the people reviewing your proposal --

MS. SCHNEIDER: Can I ask another question? The thing it looks to me like is that the applicants were giving slightly higher ratings on the reviews than you gave for the reviews themselves.

If you look at the FIS, you have 16 of them giving mixed and 10 of them rating them high. That is 26, and eight of them basically saying they are poor.

MS. AUGUST: The mixed weren't like center reviews either. The mixed were really mixed in terms of what people were looking for.

MS. SCHNEIDER: Given the quality of what you said the reviews sounded like, I am surprised that more of the applicants didn't give it more negative ratings.

MS. AUGUST: Maybe they didn't trust me when I told them that their names were going to be withheld. That is possible, that people just weren't comfortable being totally straightforward, even though I said, look, I am the only person that is going to know that.

Remember, people depend on the Department of Education for their research. That might account for it, that they weren't comfortable being totally straightforward.

PARTICIPANT: Did their assessment correlate with their success?

MS. AUGUST: I don't know. That is a good question. You mean, if they had been a successful applicant, if they felt better?

PARTICIPANT: Right.

MS. AUGUST: You know, we interviewed -- you know, I have to go back and look at the sampling within panels. I think we automatically interviewed applicants that had been successful.

If none had been successful for the panel, we then randomly selected two others from that panel. I think we were interviewing two people from each panel.

Not many applicants were successful. Very few of these things were funded. By and large, most applicants were unsuccessful in this endeavor.

In the book here, I have the number of people who actually received FIS grants, if you want to read it, but very few people really did.

MR. FLODEN (Committee on Research in Education): Does that mean that all the people you are referring to, all the reviews you are referring to, are for successful applications?

MS. AUGUST: Yes, to the extent that there was a successful applicant in the panel that we randomly selected. At first, we selected panelists that had been used to assess these FIS studies across the two years, to make sure that we had a representative sample from both years and from all reviews.

Within the panel, as I recall -- and I will have to look at this again in a minute -- but as I recall, we automatically included the person who had been -- if there was a successful applicant --

MR. FLODEN: So, you over-sampled.

MS. AUGUST: Exactly, which makes it actually even more troubling. Remember, you don't have to go and look and see, for these 20 panels, how many panels actually included successful applicants. As I mentioned, there were very few. You would assume it would be random, but very few of the people who submitted applications actually got funded.

MS. SCHNEIDER: Were there large numbers of applicants for FIS grants?

MS. AUGUST: The report states, actually, for each institute how many applications were submitted and how many were funded, if you want to look at the report. Also, we have very detailed information about within-panel, how applicants and reviewers were selected. So, all the information is in the report. It is really very specific about that.

It would be interesting, actually, to go back and look at the discrepancies, if there are any, between what I said and what the people I interviewed said. Remember, we gave these mixed reviews as well. It wasn't all bad.

It would be interesting to look at, for sort of the same applications, whether there was a big difference between what we said and what the people who were actually applicants said.

I think that is it, really, before Penny gets up here. Okay, just one other thing. In terms of the extent to which applicants considered that reviewers demonstrated appropriate expertise, 21 of all FIS applicants rated reviewers low or mixed on expertise. So, 21 of 34. I don't think this was that discrepant. Nine of 34 rated them as high. Among center applicants, 14 of 17 rated reviewers low on expertise, and three as high. So, that was sort of worse, actually, than when we read the applications.

Penny, remember, she is going to report on our recommendations here. In making these recommendations, remember, we interviewed staff extensively. I think I interviewed everybody in OERI who had anything to do with these competitions for each institute.

I interviewed Jerry Shruve(?) from AERA. We reviewed the processes that took place in all these other organizations.

We wanted to see, well, this isn't working, what is working. We picked places that did very different sorts of review, like NIH or NSF.

There are nice descriptions of what, at least five years ago, anyway -- who knows what they are doing now -- but five years ago, what these different groups were doing with regard to peer review.

Then we did these case studies I am not reporting on, where we looked in much more depth at panels that had been nominated as high by the department. Otherwise, I would be up here for another hour.

All this information is in the report. So, the recommendations are based on this additional information also.

MR. FLODEN: Just one question, Diane. When you talk about the reviews being good or bad, you are rating them on this amount of information and not on whether they were actually picking the best proposals.

MS. AUGUST: Exactly, we didn't rate that at all. We were just looking at the quality of reviews for a given application.
