Introduction
As AI systems become increasingly capable of generating human-style prose and aggregating knowledge from vast corpora, questions of trust, accuracy, and editorial accountability have come to the fore. In this climate, Jimmy Wales, the co-founder of Wikipedia, has emerged as a prominent voice of caution: not rejecting AI outright, but insisting that relying on it for “massive” or foundational truth-claims is, for now, deeply problematic. He highlights that tools promising to replace human editors, or to serve as encyclopaedias in their own right, risk producing content that appears plausible yet contains serious errors. His critique matters especially because Wikipedia itself is both a major source of information for humans and a training substrate for many large language models (LLMs).
Wales’s key concerns
Here are the central points of Wales’s argument, collected and synthesised from his statements and writings:
- AI is prone to plausible-but-wrong content
Wales has said of generative AI models (for example, ChatGPT) that they are “a mess” in the context of writing Wikipedia articles: they “get things wrong in a plausible way” and “make up sources”. He also remarked that we are still “a long way from ChatGPT being a reliable source”.
The crux: anything that looks like an encyclopaedia (authoritative, fact-based) but is generated by AI without proper human editorial oversight will likely contain errors that are hard to detect.
- Human judgement and oversight remain essential
For Wales, the core value that Wikipedia offers lies in human editorship and community oversight: the ability to cite, to cross-check, and to flag problems. He emphasised that AI may assist (for example, flagging unsourced assertions) but cannot replace human curation.
He says: “The core of Wikipedia’s editorial process still depends on human judgment.”
- Risks of feedback loops and bias amplification
Another point is that Wikipedia content is both a source of training data for AI systems and a potential destination for AI-generated text. Wales warns this creates a risk: if AI-generated content (with errors) is integrated into Wikipedia or into training sets, the errors can amplify.
In other words: if we allow AI to write or rewrite encyclopaedic content, it may degrade the quality of the corpus that both human and machine learners rely on.
- Appropriate use-cases and caution on writing new articles
Wales does not argue that AI should never be used in the Wikipedia ecosystem — rather, he suggests lower-risk tasks such as scanning for missing citations, finding unsourced statements, or helping with translations. But he explicitly states that “writing entire articles” by AI is not currently acceptable.
He remarks: “We have tonnes of people writing about the Battle of the Bulge … what we need is help on more obscure topics, and on those, it (AI) makes a lot of mistakes.”
- The larger societal dimension: quality of knowledge and truth
He sees this discussion not merely as a technical one, but as tied to how knowledge is curated and trusted in the digital age: “I think we are still a long way from [AI] being a reliable source.”
And: “For me what’s really exciting about AI is the potential that we might find some ways to use AI to support our community in Wikipedia … but for writing entire articles I’m less convinced.”
Implications: Why the concern is significant
Wales’s concerns are especially significant for several reasons:
- Scale of impact: Wikipedia is one of the most widely consulted knowledge resources globally. If content degrades, the ripple effects across education, research, and public discourse are large.
- Intersection with AI training: Many large language models use Wikipedia content (among other sources) as part of their training data. If Wikipedia is weakened by errors, then AI systems built on it will inherit and amplify those errors.
- Trust in digital knowledge: In an era of ‘fake news’, deepfakes, and manipulated content, maintaining highly reliable reference works matters more than ever. The move to AI-generated content threatens to erode trust.
- Human value and labour: The role of volunteer editors, subject-matter experts, and communities in maintaining knowledge repositories is de-emphasised if AI is treated as a substitute. That has cultural and structural implications for how knowledge is produced and validated.
The “Grokipedia” / AI encyclopaedia angle
A context that underscores Wales’s caution is the recent emergence of AI-driven encyclopaedic initiatives such as “Grokipedia”, launched by Elon Musk’s company xAI.
If large-scale AI encyclopaedias attempt to replace human-edited ones, Wales’s objections become especially relevant:
- Can AI reliably generate, curate, and update thousands (or millions) of articles without falling into “hallucination” (i.e., generating plausible but false information)?
- Who flags errors in such systems, and what is the mechanism for accountability?
- How will the knowledge base be kept neutral, verifiable, and free of bias if heavily machine-generated?
Given his statements, Wales would likely regard such ventures as high-risk at present.
Counterarguments and nuance
It is worth noting the nuance in Wales’s position:
- He is not outright rejecting all use of AI in Wikipedia or in large-scale knowledge work more broadly. He sees useful roles for AI (e.g., identifying gaps, assisting editors).
- He acknowledges that AI could surpass human capacity in the distant future (he once said “superhuman AI could take at least 50 years”).
Thus the argument is less about fear of AI per se, and more about premature dependence on it for tasks it is not yet suited to (such as writing full encyclopaedic articles).
Actionable questions and steps for stakeholders
For those managing knowledge platforms or libraries, working in academia, or building AI-driven knowledge systems at startups, Wales’s remarks suggest several actionable insights:
- Ensure strong human editorial oversight: Even if AI generates first drafts, there must be rigorous mechanisms for human review, citation checking, error-flagging, and bias assessment.
- Start with constrained use-cases: Use AI for supporting tasks (e.g., identifying missing references, assisting translation, detecting unsourced claims) rather than for standalone article generation.
- Maintain transparency of sourcing: If AI is used, the provenance of content must be clear. Which sources were used? Are citations verifiable? What review process was applied?
- Be alert to feedback loops: If AI systems train on content that has been AI-generated (or low-quality), they risk reproducing flaws. Thus avoid circular reuse of machine-generated text without human cleaning.
- Cultivate digital literacy: For end-users, emphasise that “this looks like an encyclopaedia” does not automatically mean “this is reliable and verified”. Teach users to check citations, to cross-validate.
- Monitor and research quality impact: Platforms should track metrics such as citation completeness, error rates, and bias exposure before and after AI integration. (For example, academic studies show Wikipedia’s reference quality has improved over time, but AI generation may change those dynamics.) A minimal sketch of one such check follows this list.
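The last two items lend themselves to a concrete illustration. Below is a minimal Python sketch, offered as a hypothetical example rather than a description of any real Wikipedia tooling, of how a platform might flag unsourced statements and compute a crude citation-completeness metric over wiki-style markup; the regular expressions, function names, and sample text are all assumptions for illustration.

```python
import re

# Illustrative patterns for wiki-style citations (<ref>...</ref> or {{cite ...}}).
# These are simplifying assumptions, not a full wikitext parser.
CITATION = re.compile(r"<ref[^>]*>.*?</ref>|\{\{cite[^}]*\}\}", re.IGNORECASE | re.DOTALL)
# Split sentences at terminal punctuation, or just after a closing </ref> tag.
SENTENCES = re.compile(r"(?<=[.!?])\s+|(?<=</ref>)\s+")

def flag_unsourced(wikitext: str) -> list[str]:
    """Return sentences that carry no citation marker (a crude heuristic)."""
    flagged = []
    for sentence in SENTENCES.split(wikitext):
        sentence = sentence.strip()
        if sentence and not CITATION.search(sentence):
            flagged.append(sentence)
    return flagged

def citation_completeness(wikitext: str) -> float:
    """Fraction of sentences containing at least one citation marker."""
    sentences = [s for s in SENTENCES.split(wikitext) if s.strip()]
    if not sentences:
        return 0.0
    cited = sum(1 for s in sentences if CITATION.search(s))
    return cited / len(sentences)

if __name__ == "__main__":
    sample = (
        "The Battle of the Bulge began in December 1944.<ref>Example source</ref> "
        "It was the last major German offensive on the Western Front. "
        "Casualty estimates vary widely between sources."
    )
    print("Unsourced sentences:", flag_unsourced(sample))
    print(f"Citation completeness: {citation_completeness(sample):.0%}")
```

A production system would parse wikitext properly and treat quotations, lead sections, and list items differently, but even a rough ratio like this gives editors a comparable before-and-after signal when AI-assisted edits are introduced.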
Conclusion
In sum, Jimmy Wales’s stance can be summarised as: “AI has promise, but right now it is ill-suited for replacing human-curated knowledge in high-stakes domains of encyclopaedic truth.” He warns that embracing AI too quickly for full article writing risks undermining reliability, allows plausible but false content to proliferate, and threatens the foundations of trusted knowledge systems.
The current moment is thus one of cautious integration rather than full automation of knowledge creation. Knowledge-platform architects, editors, AI developers, and users alike would do well to heed Wales’s call: recognise the limitations of generative AI, clarify where it can help, ensure human oversight remains central, and preserve accountability, transparency and trust in our shared information ecosystems.
In a world where initiatives such as Grokipedia seek to build AI-powered encyclopaedias, his warnings, though he did not refer to that project by name, sound particularly timely: “massive errors” are not just a possibility; they are likely if human checks are absent.
