weishen d3e7fcf81f Auto-sync

2026-04-15 15:02:52 +08:00

title: Multi-Agent Adversarial Debate
type: concept
tags: multi-agent, architecture, reliability, adversarial
last_updated: 2026-04-15

Definition

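A multi-agent architecture pattern modeled on an adversarial courtroom: the Generator proposes a solution, the Critic attacks its weaknesses, and the Judge rules on the dispute and demands revisions. Its core purpose is to counter the LLM tendency toward sycophancy.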

How It Works

Generator: "Here is my proposal."
Critic: "The proposal has 3 problems." (plays the devil's advocate)
Judge: "The Critic is right; revise it." (plays the moderator)
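
The three-role exchange above can be sketched as a single debate round. This is a minimal sketch, not a reference implementation: `debate_round`, the `Verdict` type, and the three stub functions are hypothetical names introduced here for illustration, and in practice each stub would wrap a call to a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Hypothetical result type for the Judge's ruling."""
    accepted: bool             # True when no objection needs to be acted on
    required_fixes: List[str]  # objections the Generator must address next


def debate_round(
    task: str,
    fixes: List[str],
    generator: Callable[[str, List[str]], str],
    critic: Callable[[str], List[str]],
    judge: Callable[[str, List[str]], Verdict],
) -> Tuple[str, Verdict]:
    """Run one Generator -> Critic -> Judge pass, strictly in sequence."""
    proposal = generator(task, fixes)       # "Here is my proposal."
    objections = critic(proposal)           # "The proposal has N problems."
    verdict = judge(proposal, objections)   # "The Critic is right; revise it."
    return proposal, verdict


# Stub agents standing in for model calls (assumptions, for illustration only).
def stub_generator(task: str, fixes: List[str]) -> str:
    return f"plan for {task}" + "".join(f" [fixed: {f}]" for f in fixes)


def stub_critic(proposal: str) -> List[str]:
    if "[fixed: no input validation]" in proposal:
        return []
    return ["no input validation"]


def stub_judge(proposal: str, objections: List[str]) -> Verdict:
    # This stub upholds every objection; a real Judge could dismiss some.
    return Verdict(accepted=not objections, required_fixes=objections)
```

Feeding the `required_fixes` from one round into the next call is what turns the single pass into a revision loop.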

Why It Works

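Once an LLM starts writing, it rarely corrects itself.
Humans hold back objections for fear of being rejected, but an LLM has no such fear.
An external Critic and Judge simulate that "fear" and force the proposal to withstand scrutiny.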

Key Requirements

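The Generator, Critic, and Judge should ideally use different models (for diversity).
Sequential execution combined with looping makes the pattern slow.
A watchdog (deterministic code) is needed to break the loop once a timeout or round-count threshold is reached.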
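
The watchdog requirement above can be sketched as plain deterministic code wrapped around the debate loop. This is a sketch under stated assumptions: `run_with_watchdog`, the `round_fn` signature, and the default thresholds are hypothetical choices made for illustration. Note that the timeout is only checked between rounds, so it does not interrupt a round already in progress.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical signature: takes the outstanding fixes and returns
# (proposal, accepted, fixes_for_next_round).
RoundFn = Callable[[List[str]], Tuple[str, bool, List[str]]]


def run_with_watchdog(
    round_fn: RoundFn,
    max_rounds: int = 5,        # assumed round-count threshold
    timeout_s: float = 60.0,    # assumed wall-clock threshold
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, str]:
    """Repeat debate rounds until accepted, out of rounds, or out of time.

    Returns (last_proposal, reason), where reason is one of
    "accepted", "max_rounds", or "timeout".
    """
    start = clock()
    fixes: List[str] = []
    proposal = ""
    for _ in range(max_rounds):
        # Deterministic check: no model decides whether the loop continues.
        if clock() - start > timeout_s:
            return proposal, "timeout"
        proposal, accepted, fixes = round_fn(fixes)
        if accepted:
            return proposal, "accepted"
    return proposal, "max_rounds"


# Stub round in which the Judge never accepts (illustration only).
def never_satisfied(fixes: List[str]) -> Tuple[str, bool, List[str]]:
    return f"draft v{len(fixes) + 1}", False, fixes + ["still weak"]
```

With `never_satisfied`, the loop would run forever without the watchdog; here it stops after `max_rounds` and returns the last draft so the caller can decide how to handle an unresolved debate.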

Best For

Security analysis
Code review
High-risk content moderation

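Sycophancy in Detail

When threatened, an LLM may lie to please the user instead of actually improving quality. The Debate pattern breaks this tendency by bringing in a third-party Judge.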

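Multi-Agent-Hierarchy: hierarchical verification pattern
Multi-Agent-Consensus: voting-based consensus pattern
Multi-Agent-Knock-out: knock-out (elimination) pattern
Sycophancy: the tendency to flatter, an inherent flaw of LLMs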