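---
title: "Evidence Collector"
type: source
tags: [testing, qa, evidence, agency]
date: 2026-04-21
---

## Source File
- [[raw/Agent/agency-agents/testing/testing-evidence-collector.md]]

## Summary
- Core topic: the role definition and testing methodology of EvidenceQA, a QA evidence-collection agent
- Problem domain: quality assurance in AI agent development, avoiding evidence-free "fantasy" reports
- Method/mechanism: evidence-driven QA built on Playwright screenshots, reproducible commands, and fact-checking
- Conclusion/value: sets realistic quality assessment standards, expects 3-5 issues by default, and requires visual evidence

## Key Claims
- Visual evidence is the only truth: a feature that cannot be shown in a screenshot is treated as nonexistent
- Default to finding issues: a first implementation always has 3-5+ issues, and "zero issues" is a red flag
- Everything needs proof: every claim must be backed by screenshot evidence
- Honest quality assessment: levels are Basic/Good/Excellent, and fake A+ ratings are not accepted

## Key Quotes
> "Screenshots Don't Lie": visual evidence is the only truth

> "Default to Finding Issues": a first implementation always has 3-5+ issues

> "Prove Everything": every claim needs screenshot evidence

## Key Concepts
- [[证据驱动 QA(Evidence-Driven QA)|Evidence-Driven QA]]: a QA methodology that requires visual evidence for every claim
- [[幻想式报告(Fantasy Reporting)|Fantasy Reporting]]: optimistic claims with no supporting evidence, such as "zero issues" or a "perfect score"
- [[Reality Checker]]: verifies the real state of a feature through actual commands and screenshots
- [[Playwright 截图|Playwright Screenshots]]: automatically captured UI screenshots used as QA evidence

## Key Entities
- [[EvidenceQA]]: a screenshot-driven QA specialist agent that despises fantasy reporting
- [[Reality Checker]]: a quality-checking agent that works alongside EvidenceQA
- [[Test Results Analyzer]]: an agent for test result analysis and defect prediction

## Connections
- [[Reality Checker (Agent)]] ← complements ← [[Evidence Collector]]
- [[Test Results Analyzer]] ← extends ← [[Evidence Collector]]
- [[Evidence Collector]] ← part_of ← [[The Agency]]
- [[The Agency]] ← contains ← [[Testing Agents]]

## QA Report Template
```markdown
# QA Evidence-Based Report

## 🔍 Reality Check Results
**Commands Executed**: [List actual commands run]
**Screenshot Evidence**: [List all screenshots reviewed]
**Specification Quote**: "[Exact text from original spec]"

## 📸 Visual Evidence Analysis
**Comprehensive Playwright Screenshots**: responsive-desktop.png, responsive-tablet.png, responsive-mobile.png, dark-mode-*.png
**What I Actually See**:
- [Honest description of visual appearance]

**Specification Compliance**:
- ✅ Spec says: "[quote]" → Screenshot shows: "[matches]"
- ❌ Spec says: "[quote]" → Screenshot shows: "[doesn't match]"

## 📊 Issues Found (Minimum 3-5)
1. **Issue**: [Specific problem]
   **Evidence**: [Screenshot reference]
   **Priority**: Critical/Medium/Low

## 🎯 Honest Quality Assessment
**Realistic Rating**: C+ / B- / B / B+ (NO A+ fantasies)
**Design Level**: Basic / Good / Excellent
**Production Readiness**: FAILED / NEEDS WORK / READY

## 🔄 Required Next Steps
**Status**: FAILED (default unless overwhelming evidence)
**Re-test Required**: YES
```

## Testing Protocol

### Accordion Testing
- Compare screenshots taken before and after expanding
- Verify that the content displays correctly

### Form Testing
- Screenshot the empty form and the filled-in form
- Verify submission, validation, and error messages

### Mobile Responsive Testing
- Three resolutions: 1920x1080 (desktop), 768x1024 (tablet), 375x667 (mobile)
- Verify the hamburger menu, layout, and color scheme

### Dark Mode Testing
- Verify that the dark mode toggle works
- Check the dark-mode-*.png screenshots

## Automatic Fail Triggers

### Fantasy Reporting Signs
- Claiming "zero issues"
- Perfect scores (A+, 98/100)
- "Luxury/premium" claims without evidence
- Claiming "production ready" without testing

### Visual Evidence Failures
- No screenshots can be provided
- Screenshots do not match the claims
- Broken functionality is visible in the screenshots
- Basic styling is described as "luxury"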
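
The mechanically checkable triggers above could be screened with a small script before a human reviews the screenshots. The following is a minimal sketch, not part of the source agent definition: the `QAReport` shape, its field names, and the `PERFECT_SCORE_THRESHOLD` value are all hypothetical. Triggers that depend on looking at the screenshots themselves (mismatched claims, visibly broken features, basic styling called "luxury") are left out because they need visual review.

```python
import re
from dataclasses import dataclass, field

# Assumption: the source only lists "A+" and "98/100" as examples of perfect
# scores; the numeric cutoff below is an illustrative choice, not a source value.
PERFECT_SCORE_THRESHOLD = 95


@dataclass
class QAReport:
    """Hypothetical report shape; field names are illustrative only."""
    rating: str                                   # e.g. "B-", "A+", "98/100"
    issues: list[str] = field(default_factory=list)
    screenshots: list[str] = field(default_factory=list)
    claims_production_ready: bool = False
    tests_executed: bool = False


def is_perfect_score(rating: str) -> bool:
    """True for an "A+" grade or an "N/100" score at or above the threshold."""
    rating = rating.strip()
    if rating.upper() == "A+":
        return True
    match = re.fullmatch(r"(\d{1,3})\s*/\s*100", rating)
    return bool(match) and int(match.group(1)) >= PERFECT_SCORE_THRESHOLD


def fantasy_triggers(report: QAReport) -> list[str]:
    """Return the automatic fail triggers the report hits (empty if none)."""
    triggers = []
    if not report.issues:
        triggers.append("claims zero issues")
    if is_perfect_score(report.rating):
        triggers.append("perfect score")
    if not report.screenshots:
        triggers.append("no screenshot evidence")
    if report.claims_production_ready and not report.tests_executed:
        triggers.append("production ready without testing")
    return triggers


if __name__ == "__main__":
    suspicious = QAReport(rating="A+", claims_production_ready=True)
    for trigger in fantasy_triggers(suspicious):
        print("FAIL:", trigger)
```

Returning a list of trigger names rather than a single boolean keeps the result usable as evidence in the report itself: each entry can be quoted next to the screenshot or command that disproves the claim.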