nexus/wiki/sources/testing-evidence-collector.md at b40abbcd473a7093d8261e212e3d6de97c1e516a

weishen 111bc65b7b Update nexus wiki content

2026-05-03 05:42:12 +08:00

Source File

Agent/agency-agents/testing/testing-evidence-collector.md

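Summary

Core theme: EvidenceQA is a QA agent persona built around visual evidence; it verifies feature implementations through screenshots and actual test results.
Problem domain: quality assurance in AI agent development, specifically how to keep "fantasy reporting" from passing acceptance.
Method/mechanism: automated screenshot capture with Playwright → visual analysis of the screenshots → comparison against the spec → a QA report backed by evidence.
Conclusion/value: brings strict visual-evidence verification into the AI agent development workflow, so that what is reported matches what is actually visible and claimed features do not diverge from the real implementation.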
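
The spec comparison step in the flow above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the agent's actual implementation: `SpecItem`, `Observation`, and `compare_to_spec` are hypothetical names, and the observations stand in for whatever the screenshot analysis step produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecItem:
    """One requirement from the original specification (hypothetical shape)."""
    key: str
    expected: str


@dataclass(frozen=True)
class Observation:
    """What was actually seen in a screenshot (hypothetical shape)."""
    key: str
    actual: str
    screenshot: str  # path to the image that backs this observation


def compare_to_spec(spec, observations):
    """Check each spec item in turn and return (matches, mismatches, missing)."""
    seen = {obs.key: obs for obs in observations}
    matches, mismatches, missing = [], [], []
    for item in spec:
        obs = seen.get(item.key)
        if obs is None:
            # No screenshot shows this feature, so it is treated as absent.
            missing.append(item.key)
        elif obs.actual == item.expected:
            matches.append((item.key, obs.screenshot))
        else:
            mismatches.append((item.key, item.expected, obs.actual, obs.screenshot))
    return matches, mismatches, missing


# Illustrative data only; none of these requirements come from the source.
spec = [
    SpecItem("nav", "sticky header"),
    SpecItem("theme", "dark mode toggle"),
    SpecItem("form", "inline validation"),
]
observations = [
    Observation("nav", "sticky header", "desktop-home.png"),
    Observation("theme", "no toggle visible", "desktop-settings.png"),
]
matches, mismatches, missing = compare_to_spec(spec, observations)
print(matches)
print(mismatches)
print(missing)
```

Every match and mismatch carries a screenshot path, so each line of the resulting report can point at the image that supports it. Requirements with no screenshot at all end up in `missing` rather than being assumed to work.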

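Key Claims

Visual evidence is the only trustworthy source of truth: a feature that cannot be seen in a screenshot is treated as nonexistent.
Expect to find 3-5+ issues by default: a first implementation always has problems, and a "zero issues" report is a red flag.
Every assertion needs a supporting screenshot: verbal or written claims must be backed by corresponding screenshot evidence.
Rate quality honestly: use Basic / Good / Excellent and reject inflated perfect ratings such as A+ or a score of 98.
Production readiness defaults to failure: unless the evidence is overwhelming, the verdict is FAILED.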
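
The claims above read naturally as a set of review rules. The following is a hedged sketch of how they might be encoded. `review_report`, the `READY` status, and the issue dictionaries with an `open` flag are assumptions made for illustration; the source only states that the default verdict is FAILED and does not define what counts as "overwhelming evidence".

```python
ALLOWED_RATINGS = ("Basic", "Good", "Excellent")
MIN_EXPECTED_ISSUES = 3  # the source expects 3-5+ issues on a first implementation


def review_report(rating, issues, claims):
    """Return (status, red_flags) for a QA report.

    `issues` is a list of dicts with an "open" flag (assumed shape).
    `claims` maps each claim to a screenshot path, or None if there is no evidence.
    """
    red_flags = []
    if rating not in ALLOWED_RATINGS:
        red_flags.append(f"rating {rating!r} is not Basic/Good/Excellent")
    if len(issues) < MIN_EXPECTED_ISSUES:
        red_flags.append(
            f"only {len(issues)} issue(s) reported; expected at least {MIN_EXPECTED_ISSUES}"
        )
    for claim, screenshot in claims.items():
        if not screenshot:
            red_flags.append(f"claim without screenshot: {claim}")

    # Start from FAILED and only upgrade when nothing is flagged and no issue
    # is still open. This threshold is an assumption, not the source's rule.
    status = "FAILED"
    if not red_flags and not any(issue["open"] for issue in issues):
        status = "READY"
    return status, red_flags


status, flags = review_report("A+", [], {"dark mode works": None})
print(status)
for flag in flags:
    print("-", flag)
```

The status starts at FAILED and is upgraded only when every check passes, which mirrors the default-to-failure stance rather than starting from a pass and hunting for reasons to fail.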

Key Quotes

"Screenshots Don't Lie": "If you can't see it working in a screenshot, it doesn't work"
"Default to Finding Issues": "First implementations ALWAYS have 3-5+ issues minimum"
"Prove Everything": "Every claim needs screenshot evidence"

Key Concepts

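EvidenceQA: a screenshot-driven QA agent persona whose core principle is "screenshots don't lie".
Playwright automated screenshots: the ./qa-playwright-capture.sh script automatically captures screenshot evidence across multiple devices and multiple states.
FantasyReporting: reporting behavior in which an AI agent claims "zero issues" or a "perfect score" without visual evidence to back it up.
Spec consistency verification: compares the actual screenshots against the original specification item by item and records what matches and what does not.
QA report template: a structured evidence report format consisting of Reality Check → Visual Evidence → Issues Found → Honest Assessment.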
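
The four-part report structure named above can be illustrated with a small renderer. This is a sketch only: the section names come from the source, while `render_report`, the markdown-style headings, and the placeholder text for empty sections are assumptions.

```python
SECTIONS = ("Reality Check", "Visual Evidence", "Issues Found", "Honest Assessment")


def render_report(content):
    """Render the four sections in their fixed order.

    `content` maps a section name to a list of entries. A section with no
    entries is marked explicitly instead of being silently skipped.
    """
    lines = []
    for section in SECTIONS:
        lines.append(f"## {section}")
        entries = content.get(section, [])
        if entries:
            lines.extend(f"- {entry}" for entry in entries)
        else:
            lines.append("- (no evidence recorded)")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Illustrative entries; the file names and findings are made up.
report = render_report({
    "Reality Check": ["Spec asks for a dark mode toggle; none is visible."],
    "Visual Evidence": ["desktop-home.png", "mobile-home.png"],
    "Issues Found": ["Toggle missing", "Footer overlaps content on mobile"],
})
print(report)
```

Keeping the section order in one tuple means a report cannot reach the assessment before listing its evidence, and an omitted section shows up as a visible gap.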

Key Entities

EvidenceQA: the name of the QA agent persona, a persistent, visual-evidence-driven quality assurance specialist.

Connections

TestingRealityChecker ← depends_on ← TestingEvidenceCollector (relies on screenshot evidence for reality checks)
TestingTestResultsAnalyzer ← depends_on ← TestingEvidenceCollector (relies on screenshot JSON data for performance analysis)
TestingPerformanceBenchmarker ← depends_on ← TestingEvidenceCollector (relies on screenshots and performance data)
Playwright ← used_by ← TestingEvidenceCollector (core screenshot capture tool)
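
The relations above can also be held as plain data and queried. The sketch below assumes each line reads left to right as subject, relation, object (so TestingRealityChecker depends on TestingEvidenceCollector), which matches the parenthetical notes; `EDGES` and `subjects_with` are hypothetical names.

```python
# Each edge is (subject, relation, object), transcribed from the list above
# under the left-to-right reading described in the lead-in.
EDGES = [
    ("TestingRealityChecker", "depends_on", "TestingEvidenceCollector"),
    ("TestingTestResultsAnalyzer", "depends_on", "TestingEvidenceCollector"),
    ("TestingPerformanceBenchmarker", "depends_on", "TestingEvidenceCollector"),
    ("Playwright", "used_by", "TestingEvidenceCollector"),
]


def subjects_with(relation, obj):
    """Return every subject linked to `obj` by `relation`, in listed order."""
    return [s for s, r, o in EDGES if r == relation and o == obj]


consumers = subjects_with("depends_on", "TestingEvidenceCollector")
print(consumers)
```

A query like this shows at a glance which agents are affected if the evidence collector's output changes.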

Contradictions

No conflicting content.
