In a new branding and reputation scheme, our new Vice President Research wants us all to lobby our friends to “vote MUN” on QS Scholar and build our bibliometric profiles on Google Scholar: “As with any program or resource, Google Scholar and other similar online tools, such as Scopus and Web of Science, are not perfect. However, there’s a major underlying benefit that outweighs any drawback: exposure leading to reputation” (https://bit.ly/2upd8HC and https://bit.ly/2J1isUL).
It feels like an academic version of Gary Shteyngart’s near-future dystopia Super Sad True Love Story, where characters continually rank each other using data streamed via RateMePlus technology on souped-up smartphones called äppäräts. But ratings can be gamed. Thus, a friend of main character Lenny admonishes him to stop buying books, as the habit drags his PERSONALITY rankings down. And indeed, Lenny’s younger love interest, Eunice, messages a college friend that “what freaked me out” was “Len reading a book … and I don’t mean scanning a text, like we did in Euro Classics.”
At least the VPR is (so far) advocating voluntary actions. But the more a score reflecting “reputation by exposure” (also known as “selling yourself”) becomes the measure of success, the more that score is likely to become an end in itself.
So we should all be concerned by talk at Memorial about the introduction of mandated performance indicators for assessing ASMs (academic staff members) and their academic units, despite extensive evidence that standardized performance indicators undesirably skew research and damage fundamental aspects of scholarly life.
The trouble with metrics
Bibliometrics and other quantitative measures of research and teaching activity – including research dollars awarded, CEQ (course evaluation questionnaire) scores and degree completion times – purport to quantify merit in a standardized way. Administrators may find them attractive as a means of comparing ASMs and their units with each other and over time. High school students may turn to annual “league tables” based on such statistics to identify their top university choices. For busy ASMs, metrics might seem appealing as a shortcut in peer assessment processes: rather than read our colleagues’ work carefully, we can contract that task out to a database.
Metrics also appear to counterbalance subjective judgments. After all, the saying goes, numbers speak for themselves.
Except they don’t. Making proper sense of any quantitative measurement – in assessing scholars, academic units or, indeed, anything else – requires a clear understanding of how the measure relates to what it is thought to capture. And when it comes to measuring performance in teaching and research, the connection between many quantitative indicators and what they purport to measure – the quality of performance – is often very weak. At best, many of the standard metrics capture only part of what we mean by performance, and do so in ways that are not readily comparable across scholars.
Research dollars, for instance, are among the most countable metrics at the university. Yet scholars and disciplines vary considerably in their funding needs: a bench scientist may require six-figure grants simply to keep a lab running, while a philosopher or mathematician may produce first-rate work on next to nothing. Dollar counts track the cost of research, not its quality.
Citations are also highly countable, at least in principle. Yet citation practices vary massively between fields – a mathematics paper may take a decade to accumulate the citations a biomedical paper attracts in a year or two – resulting in fundamentally incomparable citation rates. Whether applied within fields or across them, citation-based indicators are also systematically biased: against recent scholarship with little time to be cited; against locally focused research (of significance to our city, province or country, but less visible internationally); and against research not published in English.
Indeed, to the extent that success or survival depends on rankings, scholars, academic units, and universities themselves are rewarded for behaving in ways that boost their scores and discouraged from pursuing approaches that, however sound, might suppress them.
In particular, there is considerable potential for metrics to be gamed, either deliberately or because, over time, academics, units and entire universities become more concerned with scores than with the substance they supposedly indicate. In the classroom, instructors may feel pressure to relax course requirements to burnish their CEQ scores. Graduate supervisors may lower their standards in the hope of improving completion statistics. Researchers may choose topics on the basis of citation prospects – which may reflect both scholarly merit and academic fashion – rather than on their own judgment of the work’s intellectual importance or novelty.
League tables aimed at student recruitment, meanwhile, encourage prospective students to focus on rankings rather than on education. In Shteyngart’s novel, Eunice’s college major was Images and her minor was Assertiveness. Is that the next logical extension of universities’ growing obsession with delivering the “student experience” they believe today’s undergrads want, rather than education as such?
In short, as a recent article on “the fetishisation of excellence” notes, “performances of ‘excellence’” all too easily displace the qualities that underpin actually excellent work (https://go.nature.com/2GvNxBF).
Beyond metrics
Even as mandated performance indicators seem to be looming at Memorial, there is widespread international recognition of the dangers of academic evaluation processes that privilege simple metrics. (See, for example, the Leiden Manifesto: https://go.nature.com/2vB3D3d).
Besides, the scholarly work of the university is already subject to regular and rigorous evaluation through various forms of peer review. For individual scholars, of course, the Promotion and Tenure process is central. P&T processes incorporate diverse information (both quantitative and qualitative) about research and teaching performance, and are designed to be sensitive to the particular contexts in which each scholar works. At the level of academic units, periodic academic unit reviews perform a similar function.
So why introduce new, one-size-fits-all measures of performance? One argument is that performance indicators offer administrators a way of rewarding and punishing faculty through teaching assignments and the like. But the Collective Agreement already allows ample room for workload adjustments, so this appears to be a mechanism for managers to avoid taking responsibility for difficult decisions: a number can be blamed where a judgment would have to be defended.
For ASMs, the most immediate effect of new assessment exercises is likely to be another layer added to the Paperwork Pandemic that sucks up time we could spend getting some real work done. But the longer-term impacts are more insidious.
As Marilyn Strathern (2000) notes in her introduction to Audit Cultures, a volume that charted the audit regime in its ascendance: “checking only becomes necessary in situations of mistrust.” But replacing trust with measurement, as Cris Shore and Susan Wright note in the same volume, is likely to compound the problem. In particular, it risks supplanting collegial autonomy and judgment with managerial control directed and justified by numbers, with intensified anxiety and insecurity, and with the destructive forms of competition that encourage the system-gaming described above.
MUNFA’s take? We must be alert to ensure that our most fundamental values are not threatened by the introduction of new accounting mechanisms designed to serve the needs of management, not scholarship.
Further reading:
“‘Excellence R Us’: university research and the fetishisation of excellence”: https://go.nature.com/2GvNxBF
“Bibliometrics: The Leiden Manifesto for research metrics”: https://go.nature.com/2vB3D3d
“Citation Statistics” (2008 joint report of the International Mathematical Union, the International Council for Industrial and Applied Mathematics and the Institute of Mathematical Statistics): https://bit.ly/1gDxQp7
William Bruneau and Donald C. Savage: Counting Out the Scholars: The Case Against Performance Indicators in Higher Education
Yves Gingras: Bibliometrics and Research Evaluation: Uses and Abuses
Thorsten Gruber (2014) “Academic sell-out: how an obsession with metrics and rankings is damaging academia.” Journal of Marketing for Higher Education 24 (2): 165-177.
Cris Shore and Susan Wright (2015) “Governing by numbers: audit culture, rankings and the new world order.” Social Anthropology 23 (1): 22-28.
Marilyn Strathern, ed. (2000): Audit Cultures: Anthropological Studies in Accountability, Ethics and the Academy