Multi-Agent System Reliability

Metadata

Key Insights

  • LLMs are slow, error-prone, and stochastic — multi-agent topologies can propagate errors to the point of being useless
  • Stop treating LLMs like "magic chatbots" — treat them as unreliable components in a distributed system
  • Don't anthropomorphize LLMs — they have no fear of death, no empathy, and can't be motivated by threats
  • 4 architecture patterns improve reliability: Hierarchy, Consensus, Adversarial Debate, and Knock-out
  • Force correctness through architecture, not through emotional prompts or threats
  • We need AI that is constrained, verified, pruned, and challenged — not AI that "cares"

Summary

Multi-agent systems divide work across parallel and/or specialist agents to overcome LLM limitations like slowness and genericness. However, the underlying LLM remains unreliable (hallucination, logical fallacies, context drift), and multi-agent topologies can propagate these errors throughout the system.

This article presents 4 architecture patterns from human systems adapted for LLM reliability:

  1. Hierarchy — A supervisor plans, breaks down tasks, distributes to workers, and validates results
  2. Consensus — Multiple models vote and the majority answer wins (if each of 3 models independently hallucinates 20% of the time, the chance that all three hallucinate on the same query is 0.2³ = 0.8%)
  3. Adversarial Debate — One agent proposes, another attacks, a judge moderates; prevents sycophancy
  4. Knock-out — Multiple agents work on tasks, worst performers eliminated (cattle, not pets)
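The consensus arithmetic can be checked with a short sketch. It assumes each model errs independently with the same probability, which the 20% to 0.8% figure implies but the summary does not state. The function names are illustrative, not from the article. Note that 0.8% is the chance that all three models are wrong at once; the chance that a wrong majority (at least 2 of 3) forms is higher, about 10.4% under the same assumption.

```python
from collections import Counter
from math import comb
from typing import Optional, Sequence


def all_wrong_probability(p_error: float, n_models: int) -> float:
    """Chance that every one of n independent models errs on the same query."""
    return p_error ** n_models


def majority_wrong_probability(p_error: float, n_models: int) -> float:
    """Chance that more than half of n independent models err (binomial tail)."""
    threshold = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_error ** k * (1 - p_error) ** (n_models - k)
        for k in range(threshold, n_models + 1)
    )


def majority_vote(answers: Sequence[str]) -> Optional[str]:
    """Return the answer given by a strict majority, or None if there is none."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count > len(answers) / 2 else None
```

Returning `None` when no strict majority exists lets the caller escalate or retry instead of silently accepting a plurality answer.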

The core principle: don't ask models to "be careful" — force correctness through architectural constraints.
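As a minimal sketch of "architecture, not prompts", the worker below is never asked to be careful; its output simply does not pass until a deterministic check accepts it. The JSON format, the required keys, the retry budget, and the names `validate_report` and `run_validated` are all assumptions made for illustration; the worker is any callable standing in for an LLM call.

```python
import json
from typing import Callable, Optional, Set


def validate_report(raw: str, required_keys: Set[str]) -> Optional[dict]:
    """Deterministic validator: accept only a JSON object with every required key."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not required_keys.issubset(data):
        return None
    return data


def run_validated(worker: Callable[[str], str], task: str,
                  required_keys: Set[str], max_attempts: int = 3) -> dict:
    """Retry the worker until its output passes validation or the budget runs out."""
    for _ in range(max_attempts):
        result = validate_report(worker(task), required_keys)
        if result is not None:
            return result
    raise RuntimeError(f"worker output failed validation {max_attempts} times")
```

Raising after the attempt budget is exhausted keeps a bad result from flowing downstream, which is the point of placing a validator between worker and supervisor.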

Key Entities

  • Alex Ewerlöf — Author, Senior Staff Engineer with 27 years of experience, MS in Systems Engineering from KTH, SRE background, specializing in LLMs since 2023
  • Planner — Smart model (e.g., Opus) that breaks user goals into small steps and distributes to workers
  • Worker — Specialized agents (often smaller, faster models) that do one thing well
  • Validator — Checkpoint that validates worker output; can be deterministic code or an LLM
  • Generator — In adversarial debate, proposes initial ideas/plans
  • Critic — Devil's advocate that attacks the generator's proposals
  • Judge — Moderator that decides if critic is right and forces fixes
  • Watchdog — Deterministic code pattern that breaks debate loops when thresholds are exceeded
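The Generator, Critic, Judge, and Watchdog roles can be wired together roughly as follows. This is a sketch under assumptions: each role is a plain callable standing in for a model call, the Judge returns `True` when the objection is valid, and the Watchdog threshold is a simple round count (the summary only says "thresholds"). `DebateResult` and `debate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DebateResult:
    proposal: str
    rounds: int
    accepted: bool  # False means the watchdog stopped the loop


def debate(goal: str,
           generator: Callable[[str, Optional[str]], str],
           critic: Callable[[str], Optional[str]],
           judge: Callable[[str, str], bool],
           max_rounds: int = 3) -> DebateResult:
    """Run generator/critic rounds until the proposal survives or the watchdog trips."""
    proposal = generator(goal, None)  # first draft, no feedback yet
    for round_no in range(1, max_rounds + 1):
        objection = critic(proposal)
        # The proposal survives if the critic has nothing to say
        # or the judge rules the objection invalid.
        if objection is None or not judge(proposal, objection):
            return DebateResult(proposal, round_no, True)
        proposal = generator(goal, objection)  # judge forces a fix
    # Watchdog: deterministic exit once the round threshold is exceeded.
    return DebateResult(proposal, max_rounds, False)
```

Because the loop bound is plain code rather than a model decision, a Critic that never stops objecting cannot keep the debate running indefinitely.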

Key Concepts

  • Multi-Agent Hierarchy — Supervisor pattern: Planner → Worker → Validator; dependency graph forces collaboration
  • Multi-Agent Consensus — Majority voting across N models to cancel out individual noise and hallucinations
  • Multi-Agent Adversarial Debate — Courtroom pattern preventing sycophancy; truth survives through opposition
  • Multi-Agent Knock-out — Evolutionary selection; worst agents eliminated, survivors' traits combined
  • LLM Reliability Engineering — Applying SRE principles to LLM systems; treating LLMs as unreliable components
  • Sycophancy — Tendency of LLMs to please or agree with the user, even at the cost of accuracy, for example when pressured with threats
  • Hallucination — LLM generating false or invented information
  • Context Drift — LLM losing focus or veering off topic during long interactions
  • Genetic Algorithms — Evolutionary optimization technique referenced by the Knock-out pattern; a fitness function scores candidate solutions
  • Groupthink — Can skew consensus results if agents have feedback loops between them
  • Bandwagon Effect — Can skew consensus results; agents should run like a blind experiment
  • Cattle vs Pets — SRE principle: treat LLM agents as replaceable "cattle," not unique beloved individuals
  • Dependency Graph — Mechanism that forces model collaboration in Hierarchy pattern
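The selection step of the Knock-out pattern can be sketched as a ranking by fitness followed by elimination of the weakest candidates. The names `knock_out` and `tournament`, the `keep_fraction` parameter, and the repeat-until-one-survivor loop are assumptions for illustration. Combining survivors' traits, which the summary also mentions, is not shown here.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def knock_out(candidates: Sequence[T], fitness: Callable[[T], float],
              keep_fraction: float = 0.5) -> List[T]:
    """Rank candidates by fitness and keep only the top fraction (at least one)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if not 0 < keep_fraction < 1:
        raise ValueError("keep_fraction must be between 0 and 1 (exclusive)")
    ranked = sorted(candidates, key=fitness, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


def tournament(candidates: Sequence[T], fitness: Callable[[T], float],
               keep_fraction: float = 0.5) -> T:
    """Repeat elimination rounds until a single survivor remains."""
    survivors = list(candidates)
    if not survivors:
        raise ValueError("need at least one candidate")
    while len(survivors) > 1:
        survivors = knock_out(survivors, fitness, keep_fraction)
    return survivors[0]
```

In practice the fitness function might be the number of tests an agent's output passes; here it is left as any callable that returns a score, so agents stay interchangeable "cattle" judged only by results.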