Hallucination (幻觉)

Definition

The phenomenon where an LLM generates information that appears plausible but is actually false, fabricated, or not grounded in its input or training data. The model "makes things up" with confidence, presenting fiction as fact.

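Chinese-language documentation puts it this way: large models always answer with a straight face while actually talking nonsense. Faced with an unfamiliar domain, an LLM can do little more than write "解" ("Solution:") at the top of its answer (because its knowledge is limited to a specific dataset), and then lets loose, generating content that looks plausible but is actually wrong.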

Key Statistics

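  • Assume a single model hallucinates 20% of the time
  • If the models' errors are independent, the chance that 3 models all hallucinate on the same query is 0.2³ = 0.008 (0.8%); the chance that they produce the exact same lie is at most that
  • This mathematical property is the foundation of Consensus voting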
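
The arithmetic above can be reproduced with a minimal sketch. It assumes every model hallucinates independently with the same probability `p`, which real models that share training data may not satisfy; the function names are illustrative and not taken from any library.

```python
from math import comb


def prob_all_wrong(p: float, n: int) -> float:
    """Probability that all n independent models hallucinate on one query."""
    return p ** n


def prob_majority_wrong(p: float, n: int) -> float:
    """Probability that more than half of n independent models hallucinate."""
    threshold = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(threshold, n + 1)
    )


if __name__ == "__main__":
    p = 0.2  # assumed per-model hallucination rate from the text
    print(f"all 3 wrong:         {prob_all_wrong(p, 3):.3f}")      # 0.008
    print(f"majority of 3 wrong: {prob_majority_wrong(p, 3):.3f}")  # 0.104
```

The second figure is an upper bound on how often a majority vote is fooled: the vote fails only when the hallucinating majority also agrees on the same wrong answer.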

Causes

  • Stochastic nature of LLM token generation
  • Training data includes conflicting or incorrect information
  • The model may lack specific knowledge yet still generate plausible substitutes
  • Prompting that asks for creative or speculative content
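
To illustrate the first cause, here is a rough sketch of temperature-scaled softmax sampling. The candidate tokens and logits are hypothetical values chosen for illustration, not output from a real model. It shows that a wrong token keeps a nonzero probability, and that a higher temperature raises it.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities, scaled by a sampling temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token candidates and logits (assumptions for illustration).
tokens = ["correct", "plausible_wrong", "unlikely_wrong"]
logits = [3.0, 1.5, 0.5]

if __name__ == "__main__":
    for temperature in (0.5, 1.0, 1.5):
        probs = softmax(logits, temperature)
        print(temperature, {t: round(p, 3) for t, p in zip(tokens, probs)})
```

Because generation samples from this distribution, the same prompt can yield a correct answer on one run and a fabricated one on the next.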

Impact on Multi-Agent Systems

  • Errors propagate through agent topologies
  • Can make the entire system unreliable if not contained
  • Multiple architectures address this: Consensus, Validator, etc.
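
As a rough illustration of propagation, consider a simple sequential pipeline. The sketch assumes each agent hallucinates independently with probability `p` and that no downstream agent corrects an upstream error; both are simplifying assumptions, and `chain_error_rate` is a hypothetical helper name.

```python
def chain_error_rate(p: float, stages: int) -> float:
    """Probability that at least one of `stages` sequential agents hallucinates."""
    return 1 - (1 - p) ** stages


if __name__ == "__main__":
    p = 0.2  # assumed per-agent hallucination rate
    for stages in range(1, 6):
        print(f"{stages} agent(s): {chain_error_rate(p, stages):.3f}")
```

Under these assumptions a five-agent chain contains at least one hallucination about 67% of the time, which is why containment between stages matters.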

Mitigation

  • Multi-Agent Consensus: majority voting cancels out uncorrelated errors
  • Validator checkpoints to catch errors
  • Deterministic code validation where possible
  • Don't anthropomorphize the model: enforce correctness architecturally
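
The first three mitigations can be combined in a small sketch: a deterministic validator filters candidate answers, then a strict majority vote picks the winner. This is one possible shape under stated assumptions, not a reference implementation. The names `consensus` and `is_year` are hypothetical, and the answers are passed in as plain strings rather than fetched from real models.

```python
from collections import Counter
from typing import Callable, Optional


def is_year(answer: str) -> bool:
    """Deterministic check: accept only four-digit numeric answers."""
    return len(answer) == 4 and answer.isdigit()


def consensus(
    answers: list[str],
    validate: Callable[[str], bool] = lambda a: True,
) -> Optional[str]:
    """Return the answer backed by a strict majority of all agents, else None."""
    normalized = [a.strip().lower() for a in answers]
    valid = [a for a in normalized if validate(a)]  # validator checkpoint
    if not valid:
        return None
    winner, votes = Counter(valid).most_common(1)[0]
    # Rejected answers still count toward the total, so they act as abstentions.
    return winner if votes > len(answers) / 2 else None


if __name__ == "__main__":
    print(consensus(["Paris", "paris ", "Lyon"]))
    print(consensus(["1889", "1889", "about 1890"], validate=is_year))
    print(consensus(["a", "b", "c"]))
```

Returning `None` when no strict majority exists keeps the failure explicit, so the caller can retry or escalate instead of passing along an unverified answer.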