    Is Open Access

    Review of 'Small samples, unreasonable generalizations, and outliers: Gender bias in student evaluation of teaching or three unhappy students?'

    A paper that should be read by everybody interested in gender differences in SETs
    Average rating:
        Rated 5 of 5.
    Level of importance:
        Rated 5 of 5.
    Level of validity:
        Rated 5 of 5.
    Level of completeness:
        Rated 5 of 5.
    Level of comprehensibility:
        Rated 5 of 5.
    Competing interests:
    None

    Reviewed article

    Is Open Access

    Small samples, unreasonable generalizations, and outliers: Gender bias in student evaluation of teaching or three unhappy students?

    In a widely cited and widely talked about study, MacNell et al. (2015) examined SET ratings of one female and one male instructor, each teaching two sections of the same online course, one section under their true gender and the other section under false/opposite gender. MacNell et al. concluded that students rated perceived female instructors more harshly than perceived male instructors, demonstrating gender bias against perceived female instructors. Boring, Ottoboni, and Stark (2016) re-analyzed MacNell et al.'s data and confirmed their conclusions. However, the design of MacNell et al.'s study is fundamentally flawed. First, MacNell et al.'s section sample sizes were extremely small, ranging from 8 to 12 students. Second, MacNell et al. included only one female and one male instructor. Third, MacNell et al.'s findings depend on three outliers -- three unhappy students (all in perceived female conditions) who gave their instructors the lowest possible ratings on all or nearly all SET items. We re-analyzed MacNell et al.'s data with and without the three outliers. Our analyses showed that the gender bias against perceived female instructors disappeared. Instead, students rated the actual female vs. male instructor higher, regardless of perceived gender. MacNell et al.'s study is a real-life demonstration that conclusions based on extremely small sample-sized studies are unwarranted and uninterpretable.

      Review information

      10.14293/S2199-1006.1.SOR-EDU.APUTIGR.v1.RFGPQP
      This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

      student evaluation of teaching, SET, small samples, outliers, generalization, reproducibility

      Review text

      The evidence for gender differences in student evaluations of teaching (SET) is mixed. However, where a difference is found, it is usually female instructors who receive the worse ratings. One hypothesis that is not plausible, but nonetheless needs to be ruled out, is that the tendency to evaluate the teaching of female instructors somewhat more negatively is due to lower teaching ability. A highly cited study by MacNell, Driscoll and Hunt (2015) appeared to rule out this interpretation. (As of April 2020, their paper had received 154 citations.) These authors conducted an experiment in which they manipulated the perceived gender of instructors. Two assistant instructors (one male and one female) in an online class each operated under two different gender identities. Regardless of actual gender, instructors presenting a male identity received higher evaluations on professionalism, promptness, fairness, respectfulness, giving praise and enthusiasm. Uttl and Violo (2019) questioned these findings on several counts. They argued that one can hardly generalize to all male or female instructors from findings based on only two individuals. Furthermore, the sample of students in each condition was rather small, ranging from 8 to 12 individuals. Most critically, there were three outliers, all in the two perceived-female conditions, who gave the lowest possible ratings on all or nearly all SET items. Once these outliers were removed, the gender difference disappeared. Instead, students rated the actual female instructor higher than the male instructor, regardless of perceived gender. By pointing out that the conclusions of the MacNell et al. study have no empirical basis, Uttl and Violo not only render a service to the discipline; they will also encourage other researchers to conduct empirically sound studies on this issue.
