    Is Open Access

    Review of 'An Evaluation of Course Evaluations'

    An Evaluation of Course Evaluations (Crossref)
    A nice overview of the problems with student evaluations of teaching.
    Average rating:
        Rated 4.5 of 5.
    Level of importance:
        Rated 5 of 5.
    Level of validity:
        Rated 3 of 5.
    Level of completeness:
        Rated 4 of 5.
    Level of comprehensibility:
        Rated 5 of 5.
    Competing interests:

    Reviewed article

    Is Open Access

    An Evaluation of Course Evaluations

    Student ratings of teaching have been used, studied, and debated for almost a century. This article examines student ratings of teaching from a statistical perspective. The common practice of relying on averages of student teaching evaluation scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons: There is strong evidence that student responses to questions of “effectiveness” do not measure teaching effectiveness. Response rates and response variability matter. And comparing averages of categorical responses, even if the categories are represented by numbers, makes little sense. Student ratings of teaching are valuable when they ask the right questions, report response rates and score distributions, and are balanced by a variety of other sources and methods to evaluate teaching.
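The abstract's statistical point — that comparing averages of categorical responses makes little sense — can be illustrated with a minimal sketch. The class names and rating lists below are invented for illustration and do not come from the paper: two hypothetical classes produce the same mean rating despite telling very different stories, which is why the authors argue for reporting score distributions rather than averages.

```python
# Hypothetical example: two classes whose 1-5 ratings average to the same
# value despite very different distributions. The data are invented for
# illustration only.
from statistics import mean
from collections import Counter

class_a = [3, 3, 3, 3, 3, 3]   # every student rates the course a 3
class_b = [1, 1, 1, 5, 5, 5]   # students are sharply polarized

print(mean(class_a))           # 3.0
print(mean(class_b))           # 3.0 -- identical mean, very different experiences

# Reporting the full distribution preserves the information the mean discards.
print(Counter(class_a))        # Counter({3: 6})
print(Counter(class_b))        # Counter({1: 3, 5: 3})
```

Under this (hypothetical) data, a promotion committee looking only at averages would see the two classes as identical, while the distributions show one uniformly lukewarm reception and one deeply divided one.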

      Review information

      Review text

      The paper provides a nice overview of the problems with student evaluations of teaching. These include statistical issues and problems with the overall approach when gathering and interpreting teaching evaluations. The authors recommend alternative methods, such as peer observations of teaching and creation of teaching portfolios.

      The paper makes a convincing argument for the ineffectiveness of traditional teaching evaluations, citing a large body of relevant work on their limitations. The paper is also exceptionally well-written and interesting.

      While the argument against traditional teaching evaluations is rather convincing, a few aspects of the paper somewhat undermine it. The tone is often fairly casual, and much of the content is devoted to examples and anecdotes. Some of this is fine, but it seems to have come at the cost of too little attention to the empirical work on teaching evaluations. Much of this work is cited, and all of it appears highly relevant to the paper’s thesis, but very little of it is discussed in detail. There are also some strong statements that are not supported by sound arguments or data. For example, the authors spend considerable time explaining why teaching evaluations, as currently measured, do not quite measure teaching effectiveness. Then the statement is made that students simply cannot rate effectiveness, but it is not clear why this has to be true. Perhaps it is, but such a strong claim should be explained.

      There is also an imbalance in the discussion of the limitations of teaching evaluations. A large amount of space is given to the statistical problems, but these are the same limitations that apply whenever convenience sampling or measures of central tendency are used. Those limitations do still apply here and are worth discussing, but it would have been appropriate to devote more space to issues unique to teaching evaluations. Again, there seems to be ample work to discuss in that regard, yet relatively little of that discussion takes place.

      I also think the paper could do a better job of explaining why the alternatives to traditional teaching evaluations should be pursued. Most faculty are likely aware of the limitations of teaching evaluations but continue to use them for lack of a better alternative. The authors’ recap states that it is practical and valuable for faculty to observe each other’s classes and to create and review teaching portfolios, but a case for these claims is not effectively made. The “What is better?” section offers no evidence that teaching portfolios or classroom observations are better in any way than teaching evaluations, and they certainly do not seem practical. To be fair, there is a mention of classroom observation taking about four hours, though whether that is a practical amount of time is debatable. Practicality should also include ease of evaluation. The sample letter at the end of the paper presents a case that is fairly easy to assess, but it is a rather unrealistic one. How would faculty evaluate a typical colleague’s teaching and portfolio?

