Student ratings of teaching have been used, studied, and debated for almost a century.
This article examines them from a statistical perspective.
The common practice of relying on averages of student teaching evaluation scores as
the primary measure of teaching effectiveness for promotion and tenure decisions should
be abandoned for substantive and statistical reasons: There is strong evidence that
student responses to questions of “effectiveness” do not measure teaching effectiveness.
Response rates and response variability matter. Comparing averages of categorical
responses, even if the categories are represented by numbers, makes little sense.
Student ratings of teaching are valuable when they ask the right questions, report
response rates and score distributions, and are balanced by a variety of other sources
and methods to evaluate teaching.
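
To illustrate why an average alone can mislead, the following minimal sketch summarizes two hypothetical courses rated on a 1-to-5 scale. The data, the `summarize` function, and its parameter names are invented for illustration and are not taken from any study. Both courses have the same average, yet their response rates and score distributions differ sharply.

```python
from collections import Counter
from statistics import mean


def summarize(ratings, enrolled, scale=(1, 2, 3, 4, 5)):
    """Report response rate, average, and full score distribution.

    `ratings` holds the submitted scores; `enrolled` is the class size.
    Both are hypothetical inputs for this illustration.
    """
    counts = Counter(ratings)
    return {
        "response_rate": len(ratings) / enrolled,
        "mean": mean(ratings),
        # Count of responses in every category, including empty ones.
        "distribution": {level: counts.get(level, 0) for level in scale},
    }


# Hypothetical course A: every enrolled student responded with a 4.
course_a = summarize([4] * 10, enrolled=10)

# Hypothetical course B: few students responded, and opinions are polarized.
course_b = summarize([1, 1, 5, 5, 5, 5, 5, 5], enrolled=40)

for name, summary in (("A", course_a), ("B", course_b)):
    print(name, summary)
```

Reporting the response rate and the distribution alongside the average makes the difference between the two courses visible. The average by itself treats the category labels as quantities and hides that difference.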