
    Review of 'Gender bias in student evaluation of teaching or a mirage?'

    The Uttl and Violo paper should be required reading in all methodology courses.
    Average rating:
        Rated 5 of 5.
    Level of importance:
        Rated 5 of 5.
    Level of validity:
        Rated 5 of 5.
    Level of completeness:
        Rated 5 of 5.
    Level of comprehensibility:
        Rated 5 of 5.
    Competing interests:

    Reviewed article


    Gender bias in student evaluation of teaching or a mirage?

    Bob Uttl (corresponding), Victoria Violo (2020)
    In a recent small-sample study, Khazan et al. (2020) examined SET ratings received by one female teaching assistant (TA) who assisted with teaching two sections of the same online course, one section under her true gender and one section under a false/opposite gender. Khazan et al. concluded that their study demonstrated gender bias against the female TA even though they found no statistical difference in SET ratings between the male vs. female TA (p = .73). To claim gender bias, Khazan et al. ignored their overall findings and focused on the distribution of six negative SET ratings and claimed, without reporting any statistical test results, that (a) female students gave more positive ratings to the male TA than the female TA, (b) the female TA received five times as many negative ratings as the male TA, and (c) female students gave most low scores to the female TA. We conducted the missing statistical tests and found no evidence supporting Khazan et al.'s claims. We also requested Khazan et al.'s data to formally examine them for outliers and to re-analyze the data with and without the outliers. Khazan et al. refused. We read off the data from their Figure 1 and filled in several values using a brute-force, exhaustive search constrained by the summary statistics reported by Khazan et al. Our re-analysis revealed six outliers and no evidence of gender bias. In fact, when the six outliers were removed, the female TA was rated higher than the male TA, but non-significantly so.

      Review information


      This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at


      Review text

      Although the Khazan et al. (2020) study is so problematic that no professional reader will accept their conclusion of gender differences in student ratings, the surgical precision with which Uttl and Violo analyze this article is a joy. The Uttl and Violo paper should be required reading in all methodology courses.


      In a replication of MacNell et al. (2015), Khazan et al. (2020) had students evaluate a teaching assistant (TA) who assisted in two sections of a course, once under her true gender and once under a false (i.e., opposite) gender. Thus, students who rated the performance of the TA were either under the impression that the TA was male or that the TA was female.


      Uttl and Violo start out with the obvious and fundamental problem, namely that the mean difference in the ratings of the (apparently) male and female TA did not even approach statistical significance (p = .73). This alone should negate the conclusion that there was a gender bias in students’ ratings of the TA.


      But Uttl and Violo point to other serious problems. One of the most serious, in my opinion, is that students were given only one photo and one short biography for each of the TA personas. Obviously, such photos and biographies could bias student ratings even before the TA started his/her work. There is sufficient evidence in the research on SETs that demonstrates the importance of first impressions of teachers. Thus, this material should not only have been carefully pretested, but several photos and biographies should also have been used for each gender. Furthermore, the study should have used several TAs, or at least one male and one female TA, each appearing under their own and under the opposite gender.


      Another problem of the study is the small sample size in each of the four conditions (i.e., TA gender x student raters’ gender). As Uttl and Violo point out, the study's statistical power to find a gender effect of the size reported in one of the major studies of gender differences in SETs (Ottoboni & Stark, 2016) was only 0.18.
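
      To illustrate why small groups yield such low power, here is a minimal sketch of a power calculation for a two-sided, two-sample comparison using a normal approximation. It is not the calculation Uttl and Violo performed; the function name, the effect size d = 0.3, and the group sizes of 20 are hypothetical values chosen only for illustration, not figures taken from either paper.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    d      : standardized mean difference (Cohen's d), assumed known
    n1, n2 : group sizes
    alpha  : two-sided significance level
    Uses a normal approximation, so it slightly overstates the power
    of a t-test when the groups are very small.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = d * sqrt(n1 * n2 / (n1 + n2))
    # Probability that the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical inputs, for illustration only.
print(round(two_sample_power(d=0.3, n1=20, n2=20), 2))
print(round(two_sample_power(d=0.3, n1=200, n2=200), 2))
```

      With a fixed effect size, the power rises only as the group sizes grow, which is why a study with a handful of raters per condition has little chance of detecting a modest effect even if one exists.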


      I do not want to go through all of their re-analyses, but merely repeat their damning conclusion, namely that the Khazan et al. data reveal no evidence whatsoever that the female TA was rated differently from the male TA.


      The fact that Dr. Khazan refused to provide Uttl and Violo with the raw data of the study suggests that the authors may have had little confidence that their own conclusions would stand up to closer scrutiny. Since most reputable journals these days require that data be made available, it also reflects poorly on the Journal of the North American Colleges and Teachers of Agriculture that it did not have this requirement.

