What happened
Anthropic discovered that a major AI provider had been secretly distilling its models' outputs, meaning one "independent" AI was in fact a compressed copy of another. This undermines the assumption that different models provide genuinely independent perspectives.
In early 2025, Anthropic accused a major AI provider of using Claude's outputs to train a competing model — a practice known as model distillation. The allegation: one company's AI wasn't built from scratch. It was, at least in part, a compressed copy of someone else's model, trained to mimic its behavior without the original architecture, data, or reasoning depth.
This wasn't a fringe claim. Distillation is a well-documented technique in machine learning, and there's growing evidence it's happening at scale across the industry — sometimes with permission, sometimes without, and often without disclosure to users.
The implications go far beyond intellectual property disputes. If you're using AI for anything that matters — legal analysis, medical research, financial decisions, code reviews — you're probably assuming that different AI models give you genuinely different perspectives. That assumption may be wrong.
The model identity crisis
When one model is a distilled copy of another, "comparing multiple AIs" becomes meaningless: you may be comparing the same reasoning with itself, and both models will share identical blind spots.
Here's the core problem. When you ask ChatGPT and Claude the same question, you expect two independent assessments built on different training data, different architectures, and different reasoning approaches. That independence is what makes cross-checking valuable — if two fundamentally different systems reach the same conclusion, your confidence should increase.
But if one model is a distilled copy of the other, you're not getting two opinions. You're getting one opinion wearing two masks. The answers might look different on the surface — different phrasing, different formatting — but the underlying reasoning patterns, knowledge gaps, and failure modes are shared.
This is exactly the scenario that makes cross-model verification critical, but also more nuanced than most people realize. The value of comparing models depends entirely on those models being genuinely independent. When that independence is compromised, the comparison becomes theater.
Why trusting a single model just got riskier
The distillation scandal shows that you cannot verify whether your AI provider's model is truly independent, which makes single-model trust inherently unreliable and multi-model verification essential.
Before this scandal, trusting a single AI model was already risky. We know models hallucinate. We know they fabricate information to please users. And we know the verification paradox: manually checking every output erases the speed advantage that made the AI useful in the first place.
But there was at least an implicit assumption: if you chose a reputable provider, you were getting that provider's genuine model, with its own strengths, weaknesses, and perspective. The distillation scandal breaks that assumption. As an end user, you cannot know whether the model answering your question is an original or a copy.
This changes the risk calculus. It's no longer just "AI might be wrong." It's "AI might be wrong in exactly the same way as the model I'd use to check it, because they share the same DNA."
For professionals who rely on AI for high-stakes decisions, this means:
- Single-model workflows are fundamentally untrustworthy — you can't verify independence from the outside
- Provider reputation isn't enough — even well-known models may be partially distilled from others
- Surface-level diversity is misleading — different model names don't guarantee different reasoning
Disagreement as a signal
When genuinely independent models disagree, it reveals the exact areas where AI confidence shouldn't be trusted — making disagreement the most valuable signal in AI verification.
This is where the story gets interesting. The distillation problem actually reinforces a principle we've been researching: disagreement between models is more informative than agreement.
When two models agree, you can't easily tell if they're independently confirming a fact or if they share a training lineage that makes them wrong in the same way. But when models disagree, something genuinely useful happens: you've found an area where at least one model's confidence is misplaced.
Research on cross-model verification — like the ChainPoll framework achieving 0.781 AUROC — shows that structured disagreement detection can identify hallucinations significantly better than any single model's self-assessment. The key insight: you don't need models to be right. You need them to be independently wrong in different ways.
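The polling idea behind this line of research can be sketched in a few lines: ask a judge model the same yes/no hallucination question several times and average the votes. This is a minimal illustration of repeated-judgment aggregation, not ChainPoll's actual implementation; `ask_judge` is a hypothetical stand-in for a real LLM API call.

```python
from collections import Counter

def poll_score(question, answer, ask_judge, n_polls=5):
    """Poll a judge model n_polls times and return the fraction of
    'yes, this is hallucinated' verdicts: 0.0 means no poll flagged
    the answer, 1.0 means every poll did. ask_judge is a hypothetical
    stand-in for a real model call that returns 'yes' or 'no'."""
    prompt = (
        "Does the following answer contain a hallucination? "
        "Think step by step, then reply yes or no.\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    votes = [ask_judge(prompt).strip().lower() for _ in range(n_polls)]
    counts = Counter(votes)
    return counts.get("yes", 0) / n_polls

# Toy judge that always flags the answer; real use would call an LLM API.
score = poll_score("Capital of France?", "Lyon", lambda p: "yes")
print(score)  # 1.0
```

Because each poll is an independent sample of the judge's reasoning, a fractional score carries more information than a single yes/no: answers scoring near 0.5 are exactly the ambiguous cases worth escalating.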
This is precisely why distillation is so dangerous. It homogenizes failure modes. Two models that share training lineage will tend to be wrong about the same things, in the same ways, with the same confidence. Disagreement — the most valuable signal — gets suppressed.
The solution isn't to avoid AI. It's to build verification systems that:
- Use genuinely diverse models with different architectures and training approaches
- Weight disagreement as a feature, not a bug — it's where the real information lives
- Make cross-checking automatic and invisible, so the verification doesn't destroy the speed advantage
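The three requirements above can be sketched as a minimal cross-check loop: query several models, normalize their answers, and surface disagreement instead of hiding it. The model callables here are hypothetical stand-ins for real provider clients, and the normalization step is deliberately simplistic.

```python
def cross_check(question, models, normalize=str.casefold):
    """Query several (ideally independent) models and report whether
    they agree. `models` maps a model name to a callable returning an
    answer string; real use would wrap provider API clients.
    Returns (consensus_or_None, all_answers); disagreement is the signal."""
    answers = {name: ask(question) for name, ask in models.items()}
    distinct = {normalize(a.strip()) for a in answers.values()}
    if len(distinct) == 1:
        # Unanimous: return any one answer as the consensus.
        return next(iter(answers.values())), answers
    # Disagreement: no consensus, route to human review.
    return None, answers

# Toy stand-ins; swap in real API clients in practice.
models = {
    "model_a": lambda q: "Paris",
    "model_b": lambda q: "paris",   # same answer, different casing
    "model_c": lambda q: "Lyon",    # the dissenter worth investigating
}
consensus, answers = cross_check("Capital of France?", models)
print(consensus)  # None, because the models disagree
```

Note the design choice: the function never silently picks a majority answer on disagreement. Collapsing a split vote would throw away exactly the signal the article argues is most valuable.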
What comes next
The AI industry is entering an era where model provenance matters as much as model performance — and users need verification tools that work regardless of which model is genuine and which is a copy.
The distillation scandal is likely the first of many. As AI models become more valuable, the incentive to copy them grows. As copying techniques improve, detection becomes harder. We're entering an era where you genuinely cannot trust model identity claims at face value.
This isn't theoretical. The EU AI Act, taking full effect in August 2026, includes transparency requirements that will force providers to disclose more about how their models are built. But regulation moves slowly, and the distillation problem is moving fast.
For users, the practical takeaway is straightforward: don't trust any single model, regardless of its reputation. Build verification into your workflow — not as an afterthought, but as a fundamental layer. And pay attention to disagreement. It's the signal that tells you where trust breaks down.
The AI models themselves can't tell you when they've been copied. But when you compare their outputs systematically, the patterns reveal what individual models never could.
Notes
This article discusses publicly reported allegations and industry practices around model distillation. We have no independent evidence beyond public reporting and academic research on distillation techniques. The analysis reflects our interpretation of the implications for AI verification.
Join the CrossCheck beta
First 100 users get free access. We'll share more research like this along the way.