Introduction – who rules?
Based on typical h and m values found, I suggest (with large error bars) that for faculty at major research universities, h ≈ 12 might be a typical value for advancement to tenure (associate professor) and that h ≈ 18 might be a typical value for advancement to full professor. Fellowship in the American Physical Society might occur typically for h ≈ 15–20. Membership in the National Academy of Sciences of the United States of America may typically be associated with h ≈ 45 and higher, except in exceptional circumstances. (Hirsch, 2005, p.16571)
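For readers unfamiliar with the metric Hirsch describes, the h-index is simply the largest number h such that h of an author's papers have each received at least h citations. A minimal sketch of the computation, using invented citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Invented citation counts for a hypothetical author.
print(h_index([48, 33, 30, 22, 19, 15, 12, 9, 7, 4, 2, 1]))  # prints 8
```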
More than a decade on, internationally oriented higher education rankings have become a growth industry, with mergers and splits, revised methodologies, rankings responding to rankings, raters of the rankers, and vibrant colloquies among professors and pundits who, in imitation of Brecht’s Mother Courage, have opted to drive their carts high-laden into the thick of the action. As of the last count, there are at least 11 global league tables, two comprehensive citation indexes (Scopus and Web of Science), a corpus of field-specific and area-specific registries, an expanding list of book-length studies on international rankings, and lively debate in both print and electronic form over the statistical measures most appropriate for the determination of research impact (Marginson, 2014; Hazelkorn, 2015; Stack, 2016).
Perhaps inevitably for an effervescent growth field, the exploration of all things rankings has outpaced its conceptual uptake (Chirikov, 2016).1 What we encounter today is an expanding data corpus woven into a broadcloth of interests and positions. So, for example, there is an evident link between academic global rankings and globalized markets more generally (Hazelkorn, 2009, 2014; Altbach, 2012); between institutional report cards and their media distribution (Stack, 2016); and between higher education quality assessments of whatever flavor and the so-called ‘audit’ culture or society (Hacking, 1991; Power, 1999; Strathern, 2000; O’Neil, 2016). League tables are also deeply implicated in the commercial-bureaucratic covering doctrine that has oriented the western research university in the last 30 years. Their global reach pits national and regional higher education prerogatives against an imported western model, prompting challenges of articulation and giving rise to the charge of neo-imperialism après la lettre (Yudkevich et al., 2016). As Berndt Waechter (2015) has recently noted, however, much less attention has been paid to the theoretical coordinates that frame these and related concerns.2
In what follows, this paper attempts to provide such coordinates by reassessing the standard justifications for the emergence and proliferation of university rankings on a global scale. The paper begins by addressing the claim most often advanced by ranking agents themselves: that such instruments are vital sources of information for a variety of stakeholders in an evolving world higher education sector (Baty, 2017). As quality control measures, however, academic rankings do not exhibit context-neutral data sets as would be the case for government census figures or education department enrollment numbers. Instead, they select and present factor inputs and outputs in the form of quality assessment tables conferring ordinal distinction. So, technical data are recoded as prestige markers. The number of library volumes or Nobel Prize winners on campus, for example, is not recorded for its own sake, but as a proxy measure of institutional legitimacy.3 The ‘prestige auditing’ of the paper’s title, therefore, encompasses both resource assessments and positional goods competitions. Rather than treating them separately, as is conventional (Marginson and van der Wende, 2007; Chirikov, 2016), they can be more convincingly incorporated as status games of narrower and wider scope.
There are several reasons to doubt the felicity of prestige audits as trustworthy indices of academic value. Input figures might be thought reliable where factor calculations are relatively straightforward and subject to third-party verification. Output and throughput data are more difficult to isolate, either because the relevant information is unreviewable (mean entry-level salaries, for example, depend heavily upon self-reporting, which may or may not closely reflect what new graduates in fact earn) or because ‘value-added’ components resist neat quantification. The choice of factors and their relative weighting, moreover, are specific to the given metric. While not arbitrary, factor combinations are sufficiently distinguishable to support multiple ranking mechanisms. The spread of global league tables in the last decade attests to this. What emphasis is to be placed on teaching, published research, physical resources, distinguished faculty (‘stars’), student satisfaction, employment offers, and/or the number of campus swimming pools per foreign exchange student?
Status competitions introduce a separate, if affiliated, set of challenges. To the extent that academic rankings foster Matthew effects (‘for whosoever hath, to him shall be given’), the measurement of prestige is inseparable from its creation. Oxford, Cambridge, Harvard, and MIT (among a variable score of others) are on top today because they have always been on top. This is the logic of the older, increasingly discredited reputational survey that prestige audits both disguise and encourage: because one knows more about institution X, and because that information has often been confirmed as positive, confirmation bias invites the conviction that X warrants additional praise. Prestige clusters emerge that further skew the reliability of ordinal classifications.
Status competitions also induce factor gaming, by which low-hanging fruit is marked for institutional harvest. If admissions selectivity is at issue, yields can be manipulated by introducing off-season matriculation. Star faculty can be recruited in a visiting capacity to inflate short-term publication quanta. Joint authorship is emphasized not because a given project in the humanities or qualitative social sciences requires multiple investigators (up to 25 or 30 for a paper with modest data generation), but because research productivity targets foster distributed recognition. Mohamed el Naschie remains an instructive example of the incentives and pitfalls of factor gaming generally: his aggressive self-publishing and self-referencing campaign through the Elsevier journal he edited, Chaos, Solitons, and Fractals, lifted his adopted university in Egypt to temporary prominence before the fraud was detected and the institution was severed from the tables.4
Even if academic rankings were not susceptible to Matthew effects and factor gaming, however, higher education (understood as a series of status competitions) creates social welfare losses and occludes features of teaching and learning that resist ready quantification. The welfare loss is particularly problematic where public funds are redirected, however obliquely, from core education mandates to marketing strategies designed to promote a given university to a superior rank or rank tier. The greater the emphasis on the concussive logic of ‘Who rules?’, the less nuanced the appraisal of higher education’s spillover effects, which include the intrinsic satisfaction of intellectual problem-solving as well as informed civic-cultural engagement.
This paper goes on to address the genesis of academic prestige audits as responding to the need for more effective resource management in the global knowledge economy. This evolving economy is linked to the corporate-bureaucratic university’s emphasis on efficiency and competitive advantage. Global rankings face a particular, and perhaps insurmountable, challenge in providing sufficiently conclusive information where key resource factors are often difficult to quantify, and factor combinations are unevenly applicable to different university types and higher education systems. However resilient the resource management justification for academic league tables (all of the major rankings purport to base their numerical tallies on some measurable amalgam of inputs and outputs), the necessarily circumscribed nature of the data sets and their rigidly ordinal presentation create ‘data narratives’ of uneven informational value.
The paper then identifies the controlling logic of these data narratives as one of status competition. However individually committed to resource measurement, global ranking tables stage status games in which positional rank is at least as important as the factors on which it is nominally based. The argument draws upon the canonical accounts of positional goods in Bourdieu (1986) and Fred Hirsch (1977) to assess the efficacy of global academic rankings as prestige audits. Distributional skewing, although open to modulation, is endemic to league tables as a consequence of feedback distortions, investment asymmetries, and/or factor gaming.
Finally, the paper addresses the limitations of resource management and positional goods arguments even were they to satisfy their own ambitions. It reviews the argument for social welfare losses where public subsidies for higher education are channeled into programs earmarked for institutional status enhancement. It argues for a revival of the currently moribund treatment of higher education as a complex public good that unites individual quality-of-life externalities with those accruing to society as a whole. The paper’s concluding paragraphs appeal to a more nuanced and comprehensive vocabulary responsive to both the quantitative and qualitative purposes of global higher education today.
Factors and fictions
National and global rankings alike are inseparable from the emergence of what is often referred to as the ‘knowledge economy’. Originally used to designate the social and economic milieux of the so-called ‘new class’ of knowledge workers in the 1950s and 1960s (university-educated specialists in technology, communications, finance, and logistics projected to mediate between management and labor) (Mills, 1951; Galbraith, 1958; Drucker, 1970), the term evolved to cover the tech-savvy, high-skills landscape of global knowledge exchange in the 1990s.
Within this landscape, universities have acquired proportionally greater significance as talent incubators and industry pillars (Kennedy, 1997; Hazelkorn, 2015). Harvard University’s endowment wealth, to give but one example, is approximately equal to Romania’s Gross National Product (GNP); the US higher education sector is collectively one of the world’s dozen largest as measured by income generation. Universities and university systems, in turn, have developed distinctly global footprints. Campus–corporate consortia, multi-city MBA programs, international expansion (New York University in Abu Dhabi and Shanghai, for example, and the University of Nottingham in Ningbo), institutional partnerships (National University of Singapore–Yale, Duke–Wuhan), student and faculty exchanges, Massive Open Online Courses (MOOCs), digital badges, and regional policy coordination (as in the European Union) have all contributed to the erection of new international markets for knowledge work on the model of established factor and consumables markets. More variables spread across more networks, affecting an expanding number of stakeholders, create a need for informational exchange. The higher the stakes, the more urgent reliable assessment becomes. In the still inchoate global higher education sector, however, appropriate assessment mechanisms are notably absent.
The standard tools for academic auditing (regional accreditation bodies in the United States, for example, and the recently approved TEQSA in Australia) and traditional research assessment exercises (as with the REF in the United Kingdom or the Australian Research Council’s ERA) provide benchmarks for quality minima, but are too narrowly tailored to provide institutional and program-level comparisons appropriate to an emerging global higher education system.5 This is arguably still the case for cross-national assessment vehicles such as the EU’s Bologna process and the OECD’s assessment of higher education learning outcomes (AHELO). Global rankings offer a refinement and ‘topping up’ of these data sources by linking them to market efficiency objectives. The innovatory proliferation of rankings metrics in the last 15 years testifies to both the demand for such data and their stubbornly elusive application.
The deployment of academic rankings as a resource management tool, however, has also been driven by the conceptual orientation of what is variously termed the ‘enterprise’, ‘managerial’, ‘corporate’, and ‘commercial-bureaucratic’ university (Slaughter and Leslie, 1997; Marginson and Considine, 2000; Ehrenberg, 2000; Bok, 2003; Geiger, 2004; Slaughter and Rhoades, 2004). This orientation has emphasized the importance of quantifying the different facets of teaching and research in a more thoroughgoing fashion than prevailed under disciplinary professionalization in the immediate post-WWII period. This is the audit culture or society as applied to intellectual work in the university under the auspices of operational efficiency and institutional advantage.
As with the evolving global higher education system, one might view this nation-based and region-based development as responding to acute information deficits. The contemporary ‘multiversity’ is much more complex than when Clark Kerr introduced the term in 1963 to characterize, in part, the so-called ‘California model’ of institutional tiers.6 Only two years after Kerr’s (2001) Uses of the University projected the idea of the multiversity as the incipient administrated university, the Higher Education Act (HEA) introduced a loan-funding mechanism for tuition in the United States, transforming a private-payer system augmented by state and private scholarship grants into a national debt-financing scheme (Breneman, 1991; Winston, 1999; Geiger, 2004). Federal legislation, especially Title IV and Title IX of the HEA’s statutory amendments from 1972, created campus offices to monitor diversity representation and intake as a prerequisite for state funding. The enactment of the Bayh–Dole Act in 1980, vesting intellectual property rights in campus–corporate partnerships, further expanded the administrative burden on managerial offices tangentially connected to core functions of teaching and research (Mowery et al., 2004).
This shift from Jencks and Riesman’s faculty-driven university of the 1950s and early 1960s (Jencks and Riesman, 1969; Devons, 2001) to the administrated campus has been dramatic in both scale and scope. Office staff now routinely outnumber full-time faculty in prominent American universities by a factor of four or more. In 1973, the ratio ran the other way: roughly one staff member for every four faculty (Blau, 1973). In Germany, according to Brembs and Brennecke (2015), the ratio is currently slightly above 2:1; in Australia, only one of 15 universities surveyed in 2012 had a ratio of less than 1:1 (Ernst and Young, 2012). Few if any exceptions to this pattern of rule by career managers will be found among higher education systems and institutions across the developed world (Ginsburg, 2011).
Not surprisingly, bureaucratic swell has generated both an ‘avalanche of numbers’ (to invoke a term of Hacking, 1991) and a preference for quantitative modeling sometimes dismissed by its detractors as a higher numerology. James Engell and Anthony Dangerfield (2005) bring the point home, however impishly, in their outline of the ‘three monies’ for disciplinary ideoscapes.7 The cash nexus not only drives operations in payroll and the endowment management office, but also extends to what is taught, studied, transferred, and contracted through intellectual work. The academic production sequence duly resembles a plug-and-solve algorithm: student/consumers pay tuition to a service provider (university) for a product (education) designed and packaged by academic managers (faculty) in the form of a license (degree) with amortizable career benefits (earnings potential). All that counts, we are left to deduce, is indeed countable.
The two strands of our resource management argument can be joined as follows. If global academic rankings respond to information deficits in an evolving knowledge economy, the corporate-bureaucratic governance framework already in place drives us to adopt such assessment mechanisms as the kind of knowledge we need. A greater number of institutional actors with a greater number of tasks directed to a greater number of boundary-spanning relationships translates into an enhanced need for coordination and accountability. Stakeholders with financial skin in the game seek information in order to predict returns on investment. Academic managers seek out comprehensive performance measures in order to allocate resources efficiently. In their tidy formatting, international coverage, widespread dissemination, and (putative) transparency, global rankings promise to satisfy all these demands as no comparable audit can.
This otherwise happy marriage of factor analysis and the urge to quantify, however, raises two seminal questions. First, do global academic report cards provide information sufficient to the tasks of academic resource management? Second, even at their best, how accurately do such report cards describe what higher education is or attempts to do? Perhaps the readiest observation about rankings as data sources facilitating cost–benefit calculations is the inchoate nature of the data. All other things being equal, the greater the number of relevant variables factored into the table, the more reliable it will be as a quality audit. A compounding of variables, however, requires more complex cross-referencing. Variable multiplication also increases the likelihood of inadequate or ‘noisy’ source material, which is reproduced in the rankings by means of weighted and normalized algorithms whose calculation is either unavailable to the user or demands an advanced understanding of regression analysis.8 To the extent that one of the principal aims of global academic rankings is transparency across multiple stakeholders, one would expect to encounter a restricted basket of indicators; alternatively, key indicators defined restrictively.
A quick survey of the three most prominent global rankings brings this point home (see Table 1). While no one would doubt the centrality of teaching and research as institutional resources, it is not at all clear how these should be measured. Which proxy variables, for example, best capture what ‘scholarly production’ actually produces? Should quality be measured by reputational consensus among established faculty, journals, and academic presses? Should it be vested in the career publication records of affiliated faculty? Should it be restricted to a current window of research activity (typically five years)? Should it be determined by journal impact factors? By citation frequency? Perhaps by some weighted scale that integrates career achievement and citation distributions (Hirsch, 2005; Times Higher Education, 2017; Topuniversities.com, 2017)?
Table 1. Indicator weightings in three prominent global rankings

| Ranking | Teaching (%) | Research (%) | International^a (%) | Income (%) | Other (%) |
|---|---|---|---|---|---|
| THE World University Rankings | 30 | 30 | 7.5 | 2.5 | – |
| QS World University Rankings | 20–60^b | 20–60^b | 10 | – | 10^c |
| Shanghai (ARWU) | 45^d | 40 | 5 | 10 | – |

^a Measures the number of international staff and students, and/or the degree of ‘international cooperation’.
^b QS evaluates institutions on the basis of ‘academic reputation’ (40%), which in theory covers both teaching and research, but in practice foregrounds the second.
^c Employer survey on quality of graduates from the given university/program.
^d The Academic Ranking of World Universities (ARWU; ShanghaiRanking Consultancy, 2016) specifies a ‘teaching/learning’ category heading, but this is composed solely of state entrance examination scores (gaokao) and the employment rate of graduates.
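How such weights translate into an ordinal list is rarely shown to the reader. The following is a minimal sketch, not any ranker's actual algorithm: it assumes indicators have already been normalized to a 0–100 scale and applies THE-style category weights drawn from Table 1 to invented scores for hypothetical universities.

```python
# Illustrative only: weights loosely follow the THE-style categories in Table 1
# (the published methodology includes further indicators not modeled here).
WEIGHTS = {"teaching": 0.30, "research": 0.30, "international": 0.075, "income": 0.025}

# Invented, pre-normalized indicator scores (0-100) for hypothetical institutions.
universities = {
    "University A": {"teaching": 82, "research": 91, "international": 64, "income": 55},
    "University B": {"teaching": 75, "research": 88, "international": 90, "income": 70},
    "University C": {"teaching": 90, "research": 70, "international": 40, "income": 95},
}

def composite(scores, weights):
    """Weighted sum of indicator scores, rescaled by the total weight used."""
    total_weight = sum(weights.values())
    return sum(scores[key] * w for key, w in weights.items()) / total_weight

# Ordinal table: sort by composite score, highest first.
ranked = sorted(universities.items(), key=lambda kv: composite(kv[1], WEIGHTS), reverse=True)
for rank, (name, scores) in enumerate(ranked, start=1):
    print(f"{rank}. {name}: {composite(scores, WEIGHTS):.1f}")
```

Even in this toy form, the resulting order depends entirely on which indicators are included, how they are normalized, and how the weights are set, which is precisely the difficulty pressed in the paragraphs that follow.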
Teaching indicators invite a similar criticism. To what extent are reputational surveys, entrance examination percentiles, and employment rates useful measures of instructional quality? How to account for classroom performance across a range of institution types (research universities, liberal arts colleges, technical institutes, etc.), wealth profiles, and funding models in very different education systems with widely divergent admissions and staffing policies?9 What shadow prices should be allotted to student advising, counseling, mentorship, career support, peer learning, and collegiality? All of these are recognizably important to faculty teaching, yet they cannot be tallied in the way library volumes, database subscriptions, journal impact factors, and research grants can be. Nor can one confidently translate superior resources into superior classroom instruction. The opposite might also be true: better auxiliary resources and advanced student aptitude could be seen to reduce the importance of teaching as a component of subject mastery.
Even were pooling and sourcing distortions suitably minimized, a fundamental question of data adequacy would remain: can proxy measures, however refined, capture what teaching and learning, research and institutional service, mean for those engaged in and affected by them? The addition of a Nobel Prize winner to the faculty will matter more at some institutions than at others, to certain people more than others, and in this region rather than that one. Yet we might also ask – should ask – how important the number of campus Nobel Laureates is to whatever we wish to call ‘academic quality’. There are many factors – from the intrinsic satisfaction of intellectual problem-solving to the forging of lifelong relationships to the intangibles of community building – that help determine the value and purpose of academic life. That such features of the vita academica are difficult or impossible to quantify, that they are not ‘resources’ in any conventional sense, does not lessen their relevance or vitality.
To sum up. The resource management argument for global academic rankings remains a foundational justification for their viability in an emerging global higher education sector characterized by protracted information deficits. To the extent that no other measure provides the same type of information to a comparable number of stakeholders, the argument for league tables as ‘data-thick’ assessment tools is likely to remain robust. However, there are stubborn problems in data pooling and data sourcing that vitiate the utility of global rankings as management indices. Although some of these problems have been addressed, most notably in U-Multirank’s individualized comparisons, there is reason to doubt that the basket of indicators and their integration will ever become complete enough, or flexible enough, to operate as reliable information sources across the relevant spectrum of stakeholders and institutions. Even if reliability could be satisficed, a validity challenge remains: to what extent can academic quality be measured in the first place? Is the ‘quality audit’ such a seductive phrase precisely because it voices the impossibility, even the absurdity, of any attempt to measure, rank, and rate universities and university programs by means of proxy variables?
Summing and striving
If global rankings are typically defended, when they are defended, as factor-weighted tools for high-stakes investments in human capital, they are more accurately seen as prestige competitions incorporating different data narratives. What is primarily at stake in status judgments, in turn, is not factor analysis as delineated in economics textbooks – land, labor, investment, or (embedded) capital – but the circulation of these as positional goods. At issue, in other words, are legitimacy grants rather than resource inputs as such. In positional goods markets, actors are viewed less as independent, utility-maximizing agents with an infallible nose for the truth than as differentially networked and variably obligated agents with an eye to the admired.
In one form or another, of course, universities have always housed status hierarchies: among institutions, among faculty, between faculty and students, among students (as with the nationes at the medieval University of Paris), and between graduates and non-graduates. Until 1870, Harvard routinely ranked its undergraduates not by scholarly acumen, but by family status (Morison, 1986). The earliest university rankings in the United Kingdom and the United States at the start of the twentieth century were similarly conducted as a Who’s Who of eminent alumni (Cattell, 1906; Maclean, 1900). The formalization of institutional and program status through comparative rankings is recent. Only after WWII were program-level assessments widely circulated (Webster, 1986; Usher and Savino, 2007). Among these, the American Council on Education report in 1966 broke with the longstanding practice of classification by level or tranche, opting for an ordinal ranking of doctoral programs in its place (Graham and Diamond, 1997). Institutional rankings with a general readership first appeared with the US News and World Report tables in 1983. Some 20 years later, Shanghai Jiao-Tong University inaugurated its global counterpart as part of a government initiative to measure and promote top-tier Chinese universities.
As with the resource management explanation, one can trace the emergence of national and international rankings with comprehensive stakeholder audiences to sector expansion. From a positional goods perspective, however, the central problem posed by this expansion is an information glut rather than an information deficit. This glut generates what Fred Hirsch (1977) calls ‘system congestion’: the greater the number of roughly equal aspirants to recognition, the more difficult it becomes to adjudicate rival claims. Social scarcity thereby differs from economic scarcity in the sense that it names not an absence of material resources, but a perceived gap in actionable rank distinctions. Even if all boats rise, in other words, the relative status of the boat owners does not necessarily change; one simply enters a new, higher-stakes competition.
Congestion in higher education appears in the first order as an overproduction of formal awards and credentials relative to the social and attendant economic status these facilitate (Collins, 1979). Once a given award or credential – an academic degree or faculty appointment – is recognized as socially desirable, it will attract new entrants hoping to increase their social capital (Bourdieu, 1986). As the pool of degree holders expands, however, the qualification as such loses cachet. Graduates are aware that their degrees do not necessarily signal distinction. Employers are less likely to hire a candidate simply because that person is a degree holder. A PhD does not guarantee civil treatment at the tax office or the bank.10 Where degree acquisition itself no longer regulates status competitions, auxiliary forms of distinction are introduced to reinstate social status gradients. The level of the degree, for example, becomes important beyond what it contributes to subject-matter knowledge or professional competence. ‘Where-and-which’ distinctions in degree type (by rigor, employability, etc.) and institutional and/or program reputation increase in relevance. The more degrees one possesses in successively higher tranches, the greater the prestige grant. The nearer the degree orbits to Engell and Dangerfield’s (2005) ‘three monies’ star, the more likely it will be esteemed by administrators, alumni, and the general public.
From a positional goods perspective, the function of institutional rankings in congested academic markets is clear: comprehensive league tables regulate the creation and maintenance of status distinctions. They do so, moreover, to casual outsiders as well as deeply invested insiders. Scores and scales, columns, tiers, and tranches offer up the promise of exemplary clarity. We are showered with numbers informing us who’s in, who’s out, whose star is fading, and who’s recognizably on the verge: who’s who, in other words, in the real-time zoo (Marginson, 2016).
How well is this promise kept? Although the answer to this question is partly empirical, the modular idiosyncrasies of status games suggest that such competitions are partout susceptible to distributional skewing, vitiating their effectiveness as diagnostic tools. These distortions, moreover, share a common source: the self-referential nature of status competitions. Unlike factor-based assessments, what is measured in a positional good grant is itself the measurement. Something is valuable, in other words, because it is considered valuable; someone has status because that person is acknowledged to have status. While evaluative criteria (wealth, beauty, intelligence, title, etc.) can be attached for explanatory purposes, the prestige quantum narrowly conceived is tautological.
To the extent that tautological assessments reflect the arbitrary taste of their creators, they are of little use in status competitions, which require an appeal to a common standard for purposes of regulating social scarcity. In practice, therefore, what status rankings measure is already accumulated prestige. This reconfigures the tautology as a kind of multi-tiered regress. Why is HarvOxCambMIT at the top of international ranking Y? Because it was atop international ranking X. And why was it at the top of international ranking X? Because it headlined national ranking A. And why, finally, did it headline national ranking A? Because it is HarvOxCambMIT. Reintroducing resource criteria – Nobel Laureates, library holdings, h-factor scores, and so forth – might seem to offer relief from the self-ratifying cycle, but these criteria are themselves often functions of earlier prestige grants, which lured the most qualified faculty and students to campus to begin with. They would in any case be reintroduced within the frame of a prestige audit, limiting their usefulness as norm-referenced variables.
The effect of tiered status reinforcement is the creation of Merton’s (1968) Matthew effect: the richer get richer, while the poor look for a second job. Distributional scales are duly skewed at both ends. The higher one moves in the rankings table, the greater the degree inflation. Those in lower tiers, and particularly the institutions near the bottom of such tiers, will encounter a prestige vacuum. The great majority of universities beyond the pale of tables experience prestige deficits: as with Dante’s Inferno, those occupying no space in the ideoscape are worse off than those occupying disabled spaces. Prestige clotting or clustering creates feedback loops of its own. Top-end clusters will be able to engage in the exchange of symbolic goods more profitably than those elsewhere on the list. Faculty within such institutions are more likely to have extra-institutional network ties, better access to resources, and superior influence on status-granting appointments, fellowships, and awards (Bowman and Bastedo, 2011). These advantages, in turn, will reinforce the stickiness of the cluster and expand its collective authority. In contrast, ‘negative clustering’ may occur at lower tiers by virtue of shared deficits in prestige-granting exchange networks. If so, universities in lower-end clusters may put a premium on institutional loyalty, their faculty (especially in non-STEM and business fields) actively seeking to maximize organizational rather than professional capital (Marginson, 2014).
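The cumulative-advantage dynamic at work here can be illustrated with a toy simulation (a sketch only, not a model of any actual ranking): starting from equal endowments, each new prestige grant is awarded with probability proportional to prestige already held, and the distribution skews sharply over time.

```python
import random

def simulate_cumulative_advantage(n_institutions=20, rounds=50, grants_per_round=10, seed=1):
    """Toy Matthew-effect process: each grant goes to an institution with
    probability proportional to the prestige it already holds."""
    random.seed(seed)
    prestige = [1.0] * n_institutions  # every institution starts equal
    for _ in range(rounds):
        for _ in range(grants_per_round):
            winner = random.choices(range(n_institutions), weights=prestige)[0]
            prestige[winner] += 1.0
    return sorted(prestige, reverse=True)

final = simulate_cumulative_advantage()
top_share = sum(final[:4]) / sum(final)
print(f"Top 4 of 20 institutions end up holding {top_share:.0%} of all prestige")
```

The skew emerges from the allocation rule alone; no institution in the simulation is 'better' than any other, which is the self-referential point at issue.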
Incentives for factor gaming also come into play where universities are unfavorably positioned relative to their own rankings ambition. Although certain types of institution (the elite American liberal arts college is the standout example) can ignore their comparative disfavor in global league tables as irrelevant to both their education mission and funding sources, pressure to move up the food chain is widely observable elsewhere. Flagship public universities in ambitious but less reputationally secure state systems (Pennsylvania, say, or Florida) may be driven to use global rankings as a leveraging device for better placement on domestic measures. Expanding private research universities in congested markets – Northeastern University, competing in the Boston area against such behemoths as Harvard, MIT, and Boston University, is a case in point – also have sharp incentives to value the recognition that comes of elite status in global rankings. Admissions ‘re-parameterization’, hiring of star faculty on a visiting basis, promotion of multiple-author article publication in humanities and arts fields, and other inflationary devices are to be expected wherever institutional investment in academic status competitions is high.
In strongly regulated institutional climates, however, factor gaming is likely to remain soft. The egregious cases of hard gaming will predictably emerge where state oversight of universities is weak or corrupt, the national or regional sector is underrepresented in the different tables, and/or financial resources are earmarked for institutional positioning strategies. Data verification may be difficult or impossible where self-reported statistics cannot be corroborated independently; reputational surveys will fail to record prestige where respondents are either insufficiently familiar with the university in question or too closely affiliated with it to render objective appraisals. In many such cases, suitably neutral evaluators may be unavailable. In both its hard and soft forms, factor gaming further reduces the reliability of status measurements by introducing incentives to nibble and fudge – in extreme cases, to cheat – where the perceived payoffs are substantial enough to justify the risk of detection. The threat of misrepresentation, moreover, will itself contribute to the destabilizing effect; to the extent that rankings are seen to be open to manipulation, the status competitions they stage will lose credibility.
Even if the different rankings were able to eliminate or effectively neutralize these scaling problems, they would still be susceptible to the criticism that prestige audits encourage institutional arms races that are socially and economically detrimental. The launch point for this criticism is the zero-sum logic uniformly assumed to underlie positional goods calculi (Veblen, 1899; Hirsch, 1977; Frank, 1985; Heffetz and Frank, 2008; Vatiero, 2011). If economic goods are conventionally defined as rivalrous and excludable, the textbook explanation of positional goods, following Vatiero (2011), is that they are doubly so: any benefit accruing to Ego, in other words, not only pre-empts Alter from enjoying the same specific benefit, but also mandates that Alter suffers a general loss equivalent to Ego’s gain.11
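Stated compactly (the notation is illustrative, not Vatiero's own), the strict zero-sum posit holds that any prestige gain to Ego is mirrored by an equivalent loss to Alter, so that gains sum to nothing overall:

```latex
\Delta U_{\text{Alter}} = -\,\Delta U_{\text{Ego}},
\qquad \text{and more generally} \qquad
\sum_{i} \Delta U_{i} = 0 .
```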
There are good reasons to suppose that rigid zero-sum outcomes are rarely encountered in practice. Positional game investments are often asymmetric: different actors value different measures differently across different status groups at different times. The treatment of education as both a positional and a public good, moreover, introduces value-added residuals (Pagano, 2007). In each case, relaxation of the zero-sum posit will lessen the degree of welfare loss. Even under a weaker assumption of zero-sum norming, resources directed to prestige enhancement are ones that are not available, or fully available, for other purposes; in the case of higher education, the core professional and institutional functions of teaching and research. The problem becomes acute where public subsidies are involved, directly through legislative grants or indirectly through tax abatements. The use of public resources for the purpose of institutional status competitions contravenes sound public policy (Samuelson, 1954; Musgrave, 1959).
Finally, positional goods competitions in higher education unduly restrict the motives for pursuing a degree or an academic career. Although social distinction is clearly one reason for credential seeking, there are many others that are inadequately captured by conventional ranking factors and placements. Students often enroll in university, as well as in a particular university, because of its location, popularity with friends and mentors, a vaguely articulated desire for self-growth, or simply to buy time before deciding what they really want to do. Many of the same considerations are germane to graduate school application decisions.
Academic faculty typically face somewhat different opportunities and constraints. A frequently overlooked motive for intellectual work that unites teaching and learning is an intrinsic one: love of the game. Just as the designer watch also keeps accurate time, and may please in line and fit, so do thinking and writing offer something other than a display of abbreviations on a business card. Fred Hirsch’s oft-cited image of spectators tip-toeing to view a sporting match because those in front are doing likewise underscores the problem: the tip-toers here are not engaged in stretching exercises in their bedroom or at the health club, but are observers with a bespoke interest in the game before them.12 Empty the pitch and it is doubtful that the spectators will continue to stand on tiptoe to compete for unobstructed sight.
Another way to understand the love of the game is to view intrinsic motivation as a private spillover effect. In the same way an educated citizenry confers difficult-to-quantify benefits on the commonwealth, teaching and writing carry an intangible value for those meaningfully engaged in them. Were this not the case, it would be difficult to account for the chronic oversubscription of PhDs to tenure-track positions in many academic disciplines: beyond the apparent prestige of the Dr title, social networking opportunities, and a drifting hope of tenure-line appointments, motives of intellectual discovery and public service drive the pursuit of graduate study.13 Neither global rankings nor positional goods calculi have anything meaningful to tell us about these motives. Nor do they speak to communal externalities from academic collaboration or competition. Even if discrete contests for positional goods are zero-sum normed, opposing viewpoints, independent data sets, and contrasting methodologies might open up new lines of inquiry irrespective of which approach is thought superior. The benefits here could be public, private, or both. The competition itself, moreover, where it fosters careful reasoning, evidential thoroughness, and the standards of fair play, also encourages the kind of communicative rationality that Habermas (1984) and others have identified as central to a functional public sphere. The fact that such desiderata have been widely pilloried as facile mantras in the ideological stable of classical liberalism should not pre-empt a recognition of their good-faith purpose. Here again, status good measurements provide little or no guidance.
Deciding what counts
If audit promise courts audit failure in both the resource allocation and status rank approaches to quality assessments, how feasible is the reintroduction of what both elide: the purpose of the examined life for self and society? At the level of ordinal rankings, the barriers are significant. Externalities are precisely that: observables falling outside market pricing mechanisms. Therefore, ranking them encounters the same problem as quantifying their value in discrete transactions: there is no way to affix exact numbers for purposes of comparison (Stiglitz, 1999).
On an international scale, the challenges multiply. Views on satisfaction, on learning as Bildung or moral development, on item-bank relevance, and even the protocols of survey-taking, will differ by region, organization type, and intellectual culture. The complexity and expense of data collection alone would discourage investigation by any annual rankings mechanism. If such an investigation were to be attempted, the proliferation and intangibility of factor variants would predictably dissolve into a series of ad-hoc distinctions cocooned in algorithms designed to impress more than to inform. Alternatively, one might eschew a numerical calculus altogether in the name of something akin to the gross national happiness quotient promulgated by the Kingdom of Bhutan. Who is the happiest of the happy? The most self-actualized of the self-actualized?
This is not to argue for any canonical bar against data not currently part of the global league table factor set: information from student satisfaction surveys, for example, or the number of alumni in public service careers and the percentage of alumni undertaking graduate degrees in arts, humanities, and social service fields. There is also precedent for soft variable quality assessments: the United Nations human development index, or the calculations proposed by Paul Anand et al. (2005) in the capabilities approach pioneered by Amartya Sen (1999) and Martha Nussbaum (2000, 2011).14 In similar spirit, Engelbrecht (2007) has attempted to bring social wellbeing arguments into closer alignment with factor-based policy-making. Might these models chart a future reform course for the academic rankings game?
A more general, if also more oblique, appeal addresses the question of externalities in prestige audits by focusing on how we speak about higher education as a value orientation. Language matters. The jargon characteristic of the bureaucratic office, with its left-footed terms of art and clunky metaphors, is all too readily adopted by those with the task (not tasked with) of analyzing such offices and their academic environments. Bill Readings (1997, p.22) puts his finger on an earlier, and now thoroughly assimilated, nonce word in his discussion of ‘excellence’. ‘As an integrating principle’, Readings opines, ‘excellence has the singular advantage of being entirely meaningless’. As anyone who knew Readings will attest, there is no attempt at flippancy or cynicism here; the hollowing out of meaning, of language directed to statements of fact or coherent mental images, is precisely what happens where being ‘on-message’ becomes the message itself. This is the world of knowledge as commodity, and of the iron cage of bureaucratic rectitude that supports it.
For his part, Readings proposes a distinction between accounting and accountability somewhat along the lines of the apercu that ‘not everything that counts can be counted’. The distinction is a happy one: ‘count’ in the sense of measurement comes from the French conter, which means both a summation and a narrative (hence raconteur). What is added up numerically, we might say, should also add up narratively. The advantage of incorporating the second within the scope of the first is that narratives are more tolerant of nuance, more open (at least in theory) to the complexities of judgment.
‘Judgment’ is actually a term, central to the older reputational survey, that rewards inclusion in a more inviting quality assessment lexicon. Even if international reputational surveys finally prove unmanageable for rankings purposes, we avoid comforting, but ultimately counterproductive, fallacies of false precision by maintaining a rounded discourse of value. Hence, the standard marshaling of such terms as ‘measurement’, ‘assessment’, and ‘auditing’ – indeed, of ranking and rating themselves – benefits from the balancing presence of appreciation, judgment, (e)valuation, and appraisal.
This appeal to a thick discourse of assessment may seem to invite back the biases embedded in classic liberalist accounts of social agency. Who is to say that judgment and appraisal will not function as blinds for the powerful and opportunistic? It might also be thought that changes in nomenclature will be little more than a pis aller, a rearguard attempt to plug proverbial holes in the dyke while the waves crash over the barrier. Even in an elaborated form of discursive rationality, such as the one pursued by Habermas (1984), the objective often seems distantly elusive and aspirational, an occasion for fine gestures in the absence of tangible action. Can we not all just be reasonable?
Ironically, one of the received purposes of higher education, particularly in its oft-maligned humanities precincts, is an appreciation of the complexities of what we are compelled to measure. An understanding of the richness and nuance of language is one of the ingredients of education as a public good. We recognize its civilizing power and relevance to moral development, its role in creative work, and its contribution to meaningful interpersonal communication.
All of this may seem worlds away from the in-or-out, up-or-down reality of academic rankings and quality audits today. But to the extent that these do not simply measure institutions but also ourselves as their creators, it behooves us to affirm, in how and what we speak, that the discourse of numerical exactitude is neither as exact as it claims to be nor as complete as we have every reason to expect.