UK research can seldom have witnessed a cat placed more emphatically among the metaphorical pigeons than when the inclusion of "impact" in the 2014 research excellence framework was first mooted in 2009. So the coos of relief and triumph emanating from the sector since the REF results were announced on 18 December are doubtless eliciting purrs within the UK funding bodies.
Although the concept of assessing the impact of research on the basis of case studies had originally been developed for Australia's abortive Research Quality Framework in the mid-2000s, this was the first time that such assessment would be carried out in practice. And the fact that the annual £1.6 billion quality-related (QR) research budget would partially ride on the outcome made a lot of academics extremely nervous, if not downright hostile.
The funding bodies were swift to make clear that cultural impact would score just as well as its economic equivalent, provided it had comparable "reach" and "significance". However, hearts continued to flutter about the likely interpretation of these terms by the REF's 36 assessment subpanels, and some observers predicted that the influence of impact, which counted for 20 per cent of institutions' overall scores, would severely clip the wings of at least some established research powers.
The academics appointed to the panels felt the pressure, too. Malcolm Skingle, director of academic liaison at pharmaceutical firm GlaxoSmithKline, was one of the "research users" recruited to Main Panel A, which oversaw the life sciences. According to him, his academic colleagues were "initially like rabbits in the headlights, absolutely panicking because they had never evaluated a case study before".
However, a series of "calibration exercises" early in the process helped to identify "hawks and doves" in scoring terms and establish a consensus on the standards to be applied.
"I was quite sceptical at first but I think [the assessment of impact] was wholly transparent and fair, and I fail to see how it could have been done much better," Skingle says. "Compared with the outputs, case studies were pretty easy to review and assess. They were only four pages long, had a start and a finish, and if you weren't sure about whether they were making fair claims, you could check the audit trail back to original research, or ask for corroboration if you needed it, but for the most part you didn't."
The REF results made two things immediately apparent. First, except at the margins, the established pecking order had not been overturned by impact's influence: generally, universities that scored well for outputs also scored well for impact. Second, impact scored very highly across the disciplines, being awarded an overall grade point average of 3.24 (out of 4), compared with 2.90 for outputs.
One interpretation of the high scores is that the academics on the panels had marked leniently, lest their disciplines should be seen by funders and politicians to have lower impact than others. Willy Maley, professor of Renaissance studies at the University of Glasgow and a member of the English language and literature subpanel, admits that some academics did require a "reality check" in the calibration exercise from the research users (people from outside academia whose input he regards as invaluable) about how much impact they were really having beyond the college walls. But, according to Skingle, the opposite was true in the life sciences.
"The academics would look at something absolutely stellar and give it a 4*. So anything less, in their view, had to be marked lower than that. The users and international members eventually convinced them to imagine a 6* rating for the really stellar stuff. [That meant] you could still have case studies rated 4* that weren't quite at that level but were still 4* by anybody's reckoning."
It has been widely noted that impact scores, and hence scores overall, were particularly high in the life sciences: the overall GPA given under Main Panel A was 3.50, compared with 3.17 for Panel B (physical sciences), 3.14 for Panel C (social sciences) and 3.13 for Panel D (arts and humanities). But, according to Skingle, this is only to be expected given the amount of funding pumped into those disciplines in recent years.
"They would need their arses kicking if they couldn't get impact from that level of investment," he says.
Whether the REF results will lead to even higher QR funding levels for the life sciences, as some have predicted, will depend on the details of the funding formulas. England's formula will be announced by the Higher Education Funding Council for England towards the end of March. David Sweeney, director of research, education and knowledge exchange at Hefce, believes that the published impact scores are "fair and reasonable", but also invites "all those interested in university research" to read the case studies, which were published in January, and "form their own judgements".
Sweeney is anxious to see the evaluation of the impact assessment process currently being carried out (also scheduled to be unveiled at the end of March). But he believes panel feedback already entitles him to say that the case study approach "worked effectively, has differentiated [between universities] and produced results the community is accepting".
He adds: "The case studies confirm to me that academic research makes a vast contribution to society, and I am particularly pleased that its contribution to policy development and cultural life has been captured. It is not just about money."
Even the disapproval of such an ardent critic of impact as Philip Moriarty, professor of physics at the University of Nottingham, has been mildly assuaged by the high scores achieved by some non-commercial impacts: "I couldn't go so far as to say my opposition to impact has mellowed, but it is encouraging, at least, that public engagement seems to have been taken seriously," he says, citing one of Nottingham's case studies as an example.
However, as Dorothy Bishop, professor of developmental neuropsychology at the University of Oxford, has pointed out ("Good works", Times Higher Education, 29 January), public engagement "only really counted [in the REF] if you could point to a piece of research that changed people's behaviour".
Given the largely favourable reception, it seems inconceivable that impact will not be part of the next REF, likely in 2020. Nevertheless, the results this time around have thrown up significant concerns that need to be addressed.
One is the heavy weight, in terms of overall score, carried by each impact case study. This was because, roughly speaking, only one case study was required for every 10 academics submitted, meaning that the difference between a 4* (outstanding) and a 3* (very considerable) rating could be significant. One solution would be to require universities to submit more case studies. However, given the concerns about the workload involved in preparing them 鈥 acknowledged by Sweeney 鈥 this seems highly unlikely.
But the issue will only become more marked if the funding councils fulfil their original intention of raising the impact weighting from 20 to 25 per cent in the next REF, as they were urged to do in 2013.
According to one observer, the effective weighting of impact is already more than 25 per cent. In an analysis published today on his blog, Seb Oliver, professor of astrophysics at the University of Sussex, reveals that because the scores for impact (and, indeed, environment) typically show a wider variation than those for outputs, they in effect count for more than their nominal weighting in determining the overall scores. Only in public health, health services and primary care did impact have an effective weighting of less than 20 per cent (namely, 19.6), while in physics and sociology it reached almost 39 per cent: higher, in both cases, than the effective weighting of outputs. Overall, impact's average effective weighting in the REF was 29 per cent, while for outputs, which officially counted for 65 per cent of overall profiles, it was 47 per cent.
Oliver speculates that impact's wider spread of scores is partly because of its novelty, meaning that "some units of assessment didn't know how best to present or select their best impact". By 2020, views are likely to have crystallised around what constitutes a good case study in each discipline. But Oliver also notes that the low number of case studies compared with outputs inherently presents a larger "margin for error" in submissions.
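Oliver's effective-weighting argument is essentially statistical: an element whose scores are more spread out across institutions does more to separate them, so it counts for more than its nominal weight. A toy calculation sketches the idea (the institution scores and the spread-scaled definition of effective weighting below are invented for illustration; they are not Oliver's actual method or real REF data):

```python
import statistics

# Hypothetical GPA profiles for five institutions. Impact scores are
# given a deliberately wider spread than outputs or environment.
outputs = [2.8, 2.9, 3.0, 3.0, 3.1]
impact = [2.4, 2.8, 3.2, 3.6, 4.0]
environment = [2.9, 3.0, 3.1, 3.2, 3.3]

# Nominal REF 2014 weights: outputs 65%, impact 20%, environment 15%.
weights = {"outputs": 0.65, "impact": 0.20, "environment": 0.15}
spreads = {
    "outputs": statistics.stdev(outputs),
    "impact": statistics.stdev(impact),
    "environment": statistics.stdev(environment),
}

# One simple notion of effective weighting: scale each nominal weight
# by the spread of that element's scores, then renormalise to sum to 1.
total = sum(weights[k] * spreads[k] for k in weights)
effective = {k: weights[k] * spreads[k] / total for k in weights}

for k in weights:
    print(f"{k}: nominal {weights[k]:.0%}, effective {effective[k]:.1%}")
```

With these made-up numbers, impact's wider spread pushes its effective weighting well above its nominal 20 per cent while the tightly bunched outputs scores fall below their nominal 65 per cent; standardising each element's scores before combining them, as Oliver suggests, would pull the effective weights back towards the nominal ones.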
Another way to compensate would be to cap the amount of QR funding distributed on the basis of impact scores at 20 per cent of the total. But this would still not correct impact's disproportionate effect on the scoring itself, which also matters in reputational terms. Oliver suggests that the funders should consider standardising the scores for the REF's different elements before combining them.
"I am not anti-impact, but if it has such a high effect on their overall position in league tables, it will drive the universities to focus disproportionately on that metric," he says. "There is a danger that all the academics in the country [could] start diverting significant effort away from their research and into impact and I am not sure we want to go that far. I am not sure policymakers were intending [the effect of assessing impact] to be that significant."
Indeed, there is evidence that universities have already cottoned on to the huge significance of the quality of each impact case study they submit. That would explain the highly disproportionate number of REF submissions that contain staff numbers just below the threshold for submitting an extra case study, as highlighted by THE in January.
Moriarty says: "It was as clear as day right from the start, to all but Hefce, it seems, that this type of game-playing would happen. Researchers across the country were excluded from the REF, with the concomitant morale-sapping effect this has, so that their departments could 'play the numbers' on impact cases. That's a pretty strong distortion: it remains to be seen to what extent exclusion could affect the careers of these researchers."
Graeme Rosenberg, REF manager in the funding bodies' REF team, admits the issue needs to be "looked at".
Other issues likely to be examined by the funding bodies include whether greater "granularity" of impact grading could be attained by formally adopting Skingle's imaginary extra star categories. The impact template, in which institutions set out how their case studies fit into an overall strategy for maximising impact, is also likely to be revisited. According to Maley, many institutions "struggled" with it, and he suggests it might be better rolled into the environment section of the exercise (which counts for 15 per cent of the overall score).
"Impact takes time and institutions will have to think that through," he says. "Short-term goal-setting and expecting impact everywhere in a hurry doesn't strike me as very sensible."
Further enhancement of calibration methods is also possible. Steve Furber, chair of the computer science and informatics subpanel and ICL professor of computer engineering at the University of Manchester, would like hawkish and doveish marking tendencies to be formally corrected for statistically. Meanwhile, Dame Ann Dowling, chair of Main Panel B and professor of mechanical engineering at the University of Cambridge, advocates greater efforts to calibrate impact scoring across the main panels, although Skingle is sceptical that it makes sense to compare impact in the life sciences with that in, say, the humanities.
Then there is the workload issue. The funding bodies are likely to permit updated versions of 2014 case studies to be submitted in 2020, if by that time the impacts have become more mature. Finding evidence of impact is also likely to be made easier by the systems universities have now put in place to help them track it as it occurs. However, Jonathan Adams, chief scientist at technology company Digital Science, which is working with King's College London on analysing the case studies, speculates that assessors next time around will be less surprised by how much impact universities unearth, and will therefore be harder to impress.
One obvious way to cut the workload would be to ditch case studies and turn to metrics instead, an idea being mulled over by an independent review commissioned by Hefce. As regards impact, "altmetrics", which capture data such as social media mentions, are sometimes suggested, but no one Times Higher Education spoke to believes that they yet amount to an adequate replacement. The RQF's replacement in Australia focuses on innovation statistics, such as the number of patents registered and the volume of commercialisation income. But Claire Donovan, reader in science and technology studies at Brunel University London and part of the team that developed the RQF, warns the UK sector not to commit "metricide" and embrace a measure it knows to be flawed just because it is weary of REF returns. Even if "wonderful data" could be produced, the information "still needs to be set in some kind of context", she says. And the fact that metrics typically favour the sciences opens the way to "philistine arguments that humanities don't have any impact so why should they receive any public funds".
Adams agrees that a move to any standardised metrics would be "absolutely bonkers" because of the sheer complexity of how research actually makes an impact. And he notes that even a shift to metrics would probably not cut universities' workloads since they would "still put a wholly disproportionate amount of effort into making sure they maximise their presentation on those indicators. Academic culture is so driven by the focus on the REF that it can't self-regulate the amount of effort put in."
Despite the positive noises coming from the panels, some observers continue to regard case studies as fundamentally flawed. Patrick Dunleavy, professor of political science and public policy at the London School of Economics, has memorably dismissed them as "fairy tales of impact". And while Adams predicts that the rest of the world will be quick to follow the UK's lead, the US is pioneering an altogether different approach that involves systematically tracking the impact of university trainees (see box, opposite). According to one of its architects, Julia Lane, institute fellow at the American Institutes for Research, asking academics to track their own impact is ludicrously amateurish. "If you give me £2 million, I can tell you I have had an impact," she says. However, she adds, the measure of impact should be relative to an appropriate counterfactual about what would have happened if the money had not been spent.
"How is a biochemist, with his own little view of the world, going to figure that out? That is not science, it is storytelling. You need to unpick the process to inform the way we do research rather than saying: 'We are just really good, keep sending money', which is all case studies do.
"I am not against stories, but you want to be able to summarise it to a minister in fewer than 7,000 case studies. How many ministers are actually reading them?"
But Adams is enthusiastic about the capacity of case studies to demonstrate universities' "pervasive" impact on education, society, welfare, health, law, policy, the economy, the environment and culture.
"No other country has so much information about what research in universities is actually delivering," he says.
Skingle agrees, arguing that case studies, especially those located in their own geographical areas, are well placed to enthuse MPs and the public.
"The percentage of GDP being spent on R&D means nothing to Joe Public, but if you tell them about which engineering or medical project has come to fruition… that has got to be a good thing," he says.
He is also clear that the inclusion of impact in the REF, as well as in research council grant applications, has made universities more anxious to engage with industry.
But Barbara Pittam, director of academic services at the London-based Institute of Cancer Research, agrees with Lane that writing case studies is "always going to be painful because it is never your data" they depend on: "It is about what happens to your results externally."
And despite her institution's focus on "making a difference to patients" and its top rank for impact in THE's ranking, she still fears that the impact agenda will distort research priorities.
"Even for us, it still feels like the tail wagging the dog. We have deliberate strategies to create impact, but we are very clear we can't do so without absolutely fundamental science," she says.
"There is clearly a value to being able to tell impact stories, but whether that should be part of the assessment of actual research, I don't know. I am not sure that, politically, that is a question we have been able to ask."
'Not only straightforward but also quite interesting': the panellists' viewpoint
"We went into the exercise somewhat concerned about how easy it would be to make sensible assessments of impact case studies, and came out rather happy and a little surprised it had turned out to be not only relatively straightforward but also quite interesting."
This is the view of Steve Furber, chair of the computer science and informatics subpanel and ICL professor of computer engineering at the University of Manchester. He said the impact templates (which counted for 20 per cent of the total impact score) were read by two academics and one "research user", while the case studies were read by two users and one academic.
The academic led on assessing whether the underpinning research was of at least 2* quality (which was not true in all cases, leading to an "unclassified" grade). For this reason, the burden of impact assessment on the academics was relatively light. Furber's subpanel registered the lowest impact GPA of any subpanel, but still scored 2.99. And the "huge diversity" of case studies submitted conveyed "a strong message that there is work with impact going on right across the sector, not just where you would expect to find it in high-end institutions", he says.
Meanwhile, Willy Maley, professor of Renaissance studies at the University of Glasgow and a member of the English language and literature subpanel, also reports finding impact "more readily assessable than people might have expected beforehand". And while he laboured over impact templates, he found the case studies just as engaging to read as the outputs, and relatively straightforward to grade, too.
Experience of handling submissions referred to more than one panel (such as interdisciplinary research) also convinced him that similar standards were being applied across the board. "I approached impact as a sceptic because I am interested in older ideas of the universities being, in some cases, necessarily insulated so the real, slow work that will produce impact can take place," he says. "But, by and large, [the REF assessment] worked and if it worked for a discipline such as mine, which might not be obviously oriented in that direction, that to me is a good sign."
Bulletin sounding board: the case study writer
Chris O'Brien, communications specialist at academic consultancy Bulletin, estimates that his firm had varying degrees of involvement in between 400 and 500 impact case studies across a wide range of disciplines and universities, despite "not widely marketing ourselves as a REF consultancy at the time".
"There was definitely a level of panic among universities because it was the first time they had written case studies," he observes.
Many of the examples he saw initially failed to meet even the basics of eligibility, describing impacts that occurred outside the assessment period or that were not based on any underpinning research.
As managers and academics got their heads around the guidance, the quality improved. However, Bulletin's services remained in high demand, with some universities involving the firm in drafting or advising on virtually all their case studies.
Typically, O'Brien would receive academics' first drafts, which varied wildly in "complexity and coherence", and then liaise with the authors on improving them, beginning with an hour-long interview. One common tendency, he notes, was for academics to "undersell" their impact. Many also struggled to construct a coherent, digestible 750-word narrative bringing out the key points. The fact that humanities researchers were, in general, better at this offset the greater difficulties they typically had in "quantifying or evidencing their claims succinctly".
"I saw a lot of case studies relating to medical advancements or technological developments that were still in a very early stage of having an impact [via a marketable product]. So I don't feel that the fears from some quarters that humanities would be at a disadvantage were borne out," he says.
O'Brien's other key task was to make sure every claim was backed up with evidence. This often required prompting academics to seek testimony from the organisations they were claiming to have influenced. In a small number of cases, especially when the university's interest in the case study was motivated by a desire to recognise "the research excellence and profile of the author", the evidence proved impossible to marshal, prompting O'Brien to recommend that the institution abandon the study.
"[By doing that] we might well have saved the university from slipping a few places down the rankings," he notes.
Tracing impact's footprints: the case study sceptic
Julia Lane, institute fellow at the American Institutes for Research, is pioneering an approach to assessing impact that differs fundamentally from the REF.
For her, the key impact research makes is not via academic papers but through the training of students and postdoctoral researchers, who then move into other areas of the economy, taking their knowledge with them.
She cites the example of the range of fonts available on early Apple computers: the result, she says, not of "some calligrapher writing a paper", but of founder Steve Jobs attending a calligraphy class at university.
She also points out that while the impact of papers is not geographically constrained, since they are available all over the world, the creation of a "thoughtful, literate workforce" is likely to have more local benefit, creating a "powerful story" to tell funders and politicians.
"If you can say that 70 per cent of graduate students and postdocs went into industry and the high-wage sector, and those firms grew faster and had more exports than other firms in the economy, that is a compelling story and it is evidenced," she says.
Such information is being made available to universities in the US, on an automated basis, through Lane's federally funded STAR METRICS and the related UMetrics programmes.
Lane's next challenge is to capture research's "additional social impact". Her plan is to identify university leavers who "are going off and saving the world" by moving, for instance, into the non-profit sector. Although world-saving initiatives might be achieved by other means than employment, she notes that "in reality, in order to do anything, you have to have some kind of footprint, so we are trying to capture those footprints electronically".
She admits it will be another step again to assess success in saving the world.
"But this is difficult stuff," she says. "If this were easy it would have been done a long time ago. But, in five years, we have come a very long way down a much less burdensome path [than the UK]."
Case study examples from the 2014 REF
Engineering
University of Bath research into engines' "parasitic losses" led to an estimated 40,000-tonne reduction in carbon dioxide emissions by Ford engines in 2012, saving over £18.7 million in fuel
Mathematical sciences
Algorithms developed at Cardiff University to improve data security in printing and network environments led to the development of patented software by Hewlett-Packard
Education
University of Exeter research influenced pedagogy by pointing out that pupils with special educational needs often don't require specialist teaching
Psychology, psychiatry and neuroscience
A new international standard of loudness arose from University of Cambridge research into how sound is perceived. This is now widely used by industry
Law
A University of Birmingham academic's monograph made a major contribution to developing the law of duress in Singapore and the Commonwealth
History
An Edge Hill University academic's media appearances and campaigns improved public understanding of the Israel-Palestine conflict by "extending the range and improving the quality of argumentation and evidence"
Classics
King's College London research into the contribution of Lord Byron and Romanticism to the creation of the Greek nation state in the early 19th century challenged the modern perception of Greek national identity
Philosophy
Research by a University College London academic provided the framework for a charity's pioneering work on self-directed support in social care, which influenced the government's Putting People First strategy