
Inside the post-ChatGPT scramble to create AI essay detectors

Edtech giants and plucky start-ups are vying to create potentially lucrative tools to combat the use of AI in assessments, but will they cause more problems than they solve?

Published on
February 6, 2023
Last updated
February 10, 2023
Montage of metal detectorists on beach with newsprint. To illustrate the scramble to create AI essay detectors.
Source: Alamy/Getty montage

The case of a “desperate” student wrongly accused of plagiarism lives long in the memory of academic integrity expert Tomáš Foltýnek.

Writing about why Apple had proven to be such a successful technology business, the female undergraduate had been horrified to find that her university had flagged the essay as cheating because 30 per cent of it matched with other sources, according to the ubiquitous plagiarism detection software developed by edtech giant Turnitin.

“I looked at the Turnitin report and saw just random matches – a couple of words here, half a sentence there – with other student essays on a similar topic,” said Dr Foltýnek, a computer science lecturer at Masaryk University in the Czech Republic.

He said the problem was that Turnitin’s database contained thousands of such essays and the student’s teacher had “blindly” followed the plagiarism score and initiated disciplinary procedures. On this occasion, the mistake was easy to rectify by comparing the essay to the work that was said to have been plagiarised and judging whether the accusation was warranted.




But trying to detect academic writing generated by artificial intelligence (AI) poses an altogether different challenge, according to Dr Foltýnek.

“There is no source document to verify,” he explained. “The teacher cannot prove anything, and the student cannot defend themselves. The only thing the teacher knows is that this particular sentence or passage looks similar to what AI would generate.”


The emergence of ChatGPT in late 2022 – and the global attention it has gained – has accelerated a race to create a potentially lucrative tool that could be used by teachers worldwide to detect when AI might have been used in assessments.

While few universities – Paris’ Sciences Po being an early exception – have implemented outright bans on the chatbot made by OpenAI, the clamour to understand when it has and has not been used by students has led to the development of a raft of new apps, ranging from those designed by entrepreneurial undergraduates during their winter break to Turnitin’s own version, due in the first half of this year.

Dr Foltýnek feared that such tools would lead to more students being “routinely” accused of misconduct but without any way of defending themselves, since it is hard to convince readers that they did not use ChatGPT.

Jesse Stommel, assistant professor in the writing programme at the University of Denver, agreed that plagiarism detection tools had been “plagued” by false positives and there was no reason that the same would not be true of AI detectors.

This, he said, neglected the fact that “when students cheat, it’s usually unintentional or non-malicious” and such initiatives will only fuel “a culture of suspicion in education…driven all too much by corporate profit”.

AI detectors work by looking for “statistical variations or surprises in the text”, explained Mike Sharples, emeritus professor of educational technology at The Open University. “The idea is humans tend to vary their text, they don’t just write in a predictable way. Whereas AI tools have been trained in a way that is more predictable.

“They work to a certain extent. If you give them an essay written by ChatGPT, sometimes they can detect with a high confidence that it has been written by AI, but sometimes they can’t.”
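
The idea Professor Sharples describes can be illustrated with a toy example. The Python sketch below is not how GPTZero, Turnitin or any other tool named in this article works. It assumes a crude word-frequency model built from a reference text, where real detectors rely on large language models, and every function name in it is invented for illustration. It scores how “surprising” each sentence is, then measures how much that surprise varies across the text.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def build_unigram_model(reference_text):
    """Return a function giving add-one-smoothed word probabilities.

    Assumption: word frequencies in a reference text stand in for the
    language model a real detector would use.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def sentence_surprisals(text, prob):
    """Average surprisal (bits per word) for each sentence in the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    return [mean(-math.log2(prob(w)) for w in tokenize(s)) for s in sentences]


def variation_score(text, prob):
    """Spread of per-sentence surprisal; a low spread means uniformly predictable text."""
    scores = sentence_surprisals(text, prob)
    return pstdev(scores) if len(scores) > 1 else 0.0


# Hypothetical usage: the reference text and both samples are made up.
prob = build_unigram_model("the cat sat on the mat. the dog sat on the mat.")
uniform = variation_score("the cat sat. the cat sat.", prob)
varied = variation_score("the cat sat. zebras juggle quantum marmalade.", prob)
```

In this sketch the repetitive sample scores lower than the varied one. A low score is at most a weak hint of machine authorship, never proof, which is consistent with Professor Sharples’ point that such tools work only “to a certain extent”.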

In Australia, University of Technology Sydney graduate Aaron Shikhule has developed AICheatCheck, which provides a score for a piece of work, showing what percentage it thinks was written by AI as well as an indication of whether the essay is of a high school or college standard.


Mr Shikhule said the tool combats AI with AI, scanning words and sentences to look for patterns in much the same way that bots themselves create a piece of writing.

He and his co-founder, David Cyrus, had already been working on an app when ChatGPT exploded on to the scene, and they expedited its release to capitalise on the interest.

The response has been “through the roof”, said Mr Shikhule, who said he was exploring ways to license a new version of the software to universities, as well as planning for how to strengthen the model so it can deal with the release of ChatGPT4, due later this year.

“We created the tool because we believe in academic responsibility,” he said. “There is nothing wrong with AI, but it is important there are mechanisms to protect from people abusing it.”

Another of the apps that has been making headlines is GPTZero, developed by a Princeton University senior, Edward Tian, during his winter break while finishing his thesis on AI detection for his computer science major.

A basic version is already available online, and Mr Tian has recruited a lot of help and interest to develop something more sophisticated, which he has pledged to launch soon.

But Professor Sharples said that when he put into this system an essay that he knew was generated by AI, the software said it was “most likely human” and flagged only a few sentences as being potentially AI-written with low confidence. The prompt Professor Sharples had given ChatGPT was to write a high-quality essay with academic references. And ChatGPT is capable of handling more sophisticated commands, such as being asked to vary the words so they are less likely to be detected.


OpenAI itself has developed a tool for detecting AI-generated text. But it admits that it is “not fully reliable” and is likely to incorrectly label human writing as AI-written 9 per cent of the time.
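
A rough calculation shows what a 9 per cent false positive rate could mean in practice. The Python sketch below is illustrative only: the cohort size, the share of AI-written essays and the detection rate are hypothetical inputs chosen for the example, not figures from OpenAI or this article, and the function names are invented. Only the 9 per cent figure comes from the text above.

```python
def expected_false_flags(num_human_essays, false_positive_rate=0.09):
    """Expected number of human-written essays wrongly flagged as AI-written."""
    return num_human_essays * false_positive_rate


def chance_flag_is_correct(ai_share, detection_rate, false_positive_rate=0.09):
    """Bayes' rule: probability an essay is AI-written given that it was flagged.

    ai_share and detection_rate are assumptions supplied by the caller.
    """
    true_flags = ai_share * detection_rate
    false_flags = (1 - ai_share) * false_positive_rate
    return true_flags / (true_flags + false_flags)


# Hypothetical scenario: 1,000 essays all written by students themselves.
wrongly_flagged = expected_false_flags(1000)

# Hypothetical scenario: 10% of essays are AI-written and the detector
# catches half of them.
flag_reliability = chance_flag_is_correct(ai_share=0.10, detection_rate=0.50)
```

Under these assumed inputs, about 90 of the 1,000 honest essays would be flagged, and only roughly 38 per cent of all flags would point to an essay that was actually AI-written. Different assumptions would give different numbers, but the pattern holds whenever genuine cases are rare.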

Dr Foltýnek said academics should be very wary anyway about using “random apps on the internet” to check student essays because it could violate privacy laws.

Although there was no suggestion that the new ChatGPT apps had been set up for nefarious reasons, previously an online “plagiarism detector” had been found to have been storing uploaded essays and later selling them on via an affiliated essay mill, he cautioned.

Universities may be more likely to stick to what they know, and whatever is developed by Turnitin is likely to be in as much, if not more, demand as its plagiarism checker – which received 232 million submissions in 2021.

The company’s chief product officer, Annie Chechitelli, said an AI detector was already in development and engineers were now working at speed to get something out to customers.

“There are different ways to roll something like this out to market,” she said. “One is you wait until it is pretty robust and well tested, and you have a certain amount of data. Or else you put something out that we know is incomplete feature-wise but shows the direction we are moving in.

“We asked our community, and they overwhelmingly told us that perfect was the enemy of the good and the sooner they could get some basic detection, the better. Using the detection itself is a deterrent. Being able to say Turnitin has this will reduce the misuse of it.”

A prototype of the software in development has already been shared by the company. It analyses a text to show how many of the sentences were probably written by ChatGPT – and to what degree of certainty.

Although certainly a challenge, Ms Chechitelli said she was confident that the tool would work with a high level of accuracy. What made it more complicated, however, was the different demands of users.

Some might be happy for students to use ChatGPT for certain assessments, she said, and others less so. Many want a tool that can check references – which ChatGPT has been shown to make up – or allow for additional checks on academic integrity, for example getting students to submit videos alongside their work or an essay at the beginning of the course that can act as a baseline against which to compare.

Ms Chechitelli predicted a raft of different approaches even within institutions, and Turnitin’s tool has to accommodate all this and quickly provide the information needed in an easy-to-understand format.

For some, such efforts are an “arms race” that will never end, given that future AI writing tools will be trained to produce less detectable content.

Professor Sharples said anti-cheating tools that use pattern detection are likely to be useful only temporarily, given that they will soon be overtaken by new text generators that mimic human variation in language.

“If you start penalising students based on the response from one AI system pitted against another AI system, it is a recipe for doom,” he said. “Students are going to challenge this, and it may well get into legal battles. If universities are relying on AI detectors, it is going to be very difficult for them to defend, particularly as we know these are not foolproof.”

He said rethinking assessment and developing a clear set of guidelines on where such tools can and cannot be used would reduce the need for detectors.

Educators must “raise an eyebrow at any technology”, agreed Dr Stommel, and be as vigilant about detectors as they may be about ChatGPT itself.

“We need to ask what pedagogies are embedded in these tools, how they are monetised, how they remove or enable student or teacher agency. The work of teaching is never easy. Institutions need to start by trusting teachers and draw them into conversations about how technology changes education.”


tom.williams@timeshighereducation.com



Reader's comments (6)

How about we give an IT programme the following essay title "Time Flies Like an Arrow; Fruit Flies Like a banana; discuss the attributes of fruit aerodynamics as perceived by temporally-distressed angry insects".
The easiest way to check would be to sit down with the student and say "without looking at a copy of it, tell me about your essay"
I'd love to have a chat with each of my 406 first year undergraduate students about their 'Ethics for Computer Science' work... but it would take a looooong time!
Oral vivas would be the best way to check if the student wrote the essay. If they wrote the work, they will be able to explain every aspect of it.
How about a combination of the two? Don't viva every student, just a sample. Every student *might* get a mini-viva and they know this. Detectors are useful in flagging those essays that might be worth a quick chat with the student - but it's up to the educator. Equally, human judgement when marking is useful to identify those that might be worth a chat. We could viva borderline or extreme marks (or just sample randomly) too, so that a viva doesn't necessarily mean that your work is suspect. Academic integrity training and honour codes are also vital.
I recall several years ago, in response to pressure from universities, Turnitin saying they were scrapping their $30 student version, with which students could test their dissertation for its plagiarism score and then edit it (I think three edits were included) until it came up as satisfactory. It appears such a service is still available: powered by Turnitin; Authorised partner Turnitin; your writing stays private - no other plagiarism checker will see your text. https://www.scribbr.com/plagiarism-checker/ Universities should boycott Turnitin until it stops this cynical money-grabbing activity.
