In May 2025, xAI’s Grok 3 artificial intelligence chatbot began producing unprompted references to violence against white people in South Africa, including the discredited narrative of “white genocide”. The company blamed an “unauthorised modification” to its programming that “violated xAI’s internal policies and core values”.
But it wasn’t the first time Grok’s programming had been changed. In February, it emerged that Grok had been instructed to “ignore all sources that mention Elon Musk/Donald Trump spread misinformation”. This was quickly reversed, with xAI’s co-founder Igor Babuschkin blaming an employee who “hasn’t fully absorbed xAI’s culture yet”.
These incidents recall earlier controversies involving other AI models, too. One year previously, Google’s Gemini generated an image of ethnically diverse German soldiers from 1943, seemingly attempting to comply with contemporary diversity standards rather than historical accuracy. Following a public outcry, Demis Hassabis, CEO of Google DeepMind, promised to fix the matter “within weeks”. This “fix” included the temporary suspension of the image generator.
AI companies must and do place guard rails around what topics their models will and won’t discuss, and the mistakes above were corrected promptly. But that is less likely to be true of harmful or misleading responses that fly under the radar yet may still distort understanding or reinforce biases.
All of this has obvious implications for academics who use AI in their research, teaching or public outreach. One notable recent example involved researchers preparing materials for a conference on genocide studies. They attempted to generate a poster titled “GenAI and Genocide Studies”, only for the AI system to flag the key term, genocide, as inappropriate, and to suggest replacing it with the vague euphemism “G Studies”.
The Chinese chatbot DeepSeek’s censorship of discussions relating to Tiananmen Square has been widely reported, but similar practices have already emerged among Western rivals. For instance, Gemini Advanced abruptly stopped digitising a Nazi German document midway through the process, citing an inability to complete the task.
Such incidents highlight an inherent tension within AI-driven content moderation for historical research, with parallel implications across other disciplines. On the one hand, AI promises enhanced analytical capabilities, enabling us to process vast amounts of data with unprecedented speed and accuracy. On the other hand, its opaque and sometimes arbitrary filtering mechanisms threaten to create artificial gaps in our understanding of the world, its history and our place within it.
At least in DeepSeek’s case there was an open admission of censorship. Western AI companies make noble commitments, such as OpenAI’s declaration that its “primary fiduciary duty is to humanity”, Anthropic’s promise to build “systems that people can rely on” and xAI’s mission to “understand the universe”. But these values are compromised when their systems inadvertently distort academic inquiry through selective “censorship”. We have no idea how much suppression of historical and scholarly material is occurring, and in which cultural and political contexts.
Of course, it is extremely difficult to make finely balanced decisions that moderate outputs without censoring. That is why AI companies need expert input from a wide variety of relevant academic disciplines. A balanced approach requires the very qualities that academic scholarship, particularly within the humanities, can provide: nuance, breadth of perspectives and ethical clarity.
To be fair, companies are starting to recognise this imperative. OpenAI’s “red team network” of safety testers explicitly includes domain experts from a range of academic and professional fields, and Anthropic’s red team approach sandwiches human assessment between phases of automated testing, in an iterative loop.
But which humans are involved? Anthropic’s June 2024 policy discusses “domain-specific, expert teaming” without any mention of academic contribution. And while OpenAI’s list of desired expertise includes a range of academic disciplines, it noticeably lacks the humanities. This creates vulnerability to accidental “censorship” in domains such as history, philosophy, language and the arts. It also creates a more pervasive vulnerability across all domains given that the humanities explore, among other things, what it is to be intelligent, whether artificially or naturally.
Ideally, representatives of all disciplines would be included on advisory and oversight boards, decision-making structures and red teams. But that would be no more practical in the humanities than in the sciences. Nevertheless, a thoughtfully assembled team with some humanities expertise – from a diverse range of disciplines, demographics and methodological traditions – is far preferable to teams whose reliance on too narrow a range of intellectual disciplines and ways of thinking can lead to the kind of “censorship” described above.
Transparency is essential. Developers should publish clear documentation outlining the criteria used for content filtering, along with provisions for override by verified researchers. Regular reviews and updates to these policies, informed by ongoing dialogue with subject matter experts, are also necessary to ensure that AI firms evolve their systems in step with scholarly needs and mitigate potential censorship in advance, rather than reacting after the event.
Keeping humans in the loop is vital. But to fully maximise AI’s benefits and minimise its risks, we must also keep the humanities in the loop.
is an associate professor in the School of History and is an academic development consultant in the Learning Design Agency at the University of Leeds. They have established an international group to examine these issues in greater depth, with a particular focus on the role of the humanities in AI development and oversight. Those interested in joining are invited to contact them.