| title | type | tags | date |
|---|---|---|---|
| Evidence Collector | source | | 2026-04-21 |
Source File
Summary
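- Core topic: the role definition and testing methodology of EvidenceQA, a QA evidence-collection agent
- Problem domain: quality assurance in AI agent development, avoiding evidence-free "fantasy" reports
- Method/mechanism: evidence-driven QA built on Playwright screenshots, reproducible commands, and fact checking
- Conclusion/value: sets realistic quality-assessment standards, expects 3-5 issues by default, and requires visual evidence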
Key Claims
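- Visual evidence is the only truth: a feature that cannot be demonstrated in a screenshot is treated as nonexistent
- Default to finding issues: a first implementation always has 3-5+ issues, and "zero issues" is a red flag
- Everything must be proven: every claim needs supporting screenshot evidence
- Honest quality assessment: rate at Basic/Good/Excellent levels and reject inflated A+ scores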
Key Quotes
- "Screenshots Don't Lie": visual evidence is the only truth
- "Default to Finding Issues": a first implementation always has 3-5+ issues
- "Prove Everything": every claim needs screenshot evidence
Key Concepts
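- Evidence-Driven QA: a QA methodology that requires visual evidence for every claim
- Fantasy Reporting: optimistic claims with no supporting evidence, such as "zero issues" or a "perfect score"
- Reality Checker: verifies the actual state of a feature through real commands and screenshots
- Playwright screenshots: automated capture of UI screenshots as QA evidence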
Key Entities
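- EvidenceQA: a screenshot-driven QA specialist agent with no tolerance for fantasy reporting
- Reality Checker: a quality-checking agent that works alongside EvidenceQA
- Test Results Analyzer: an agent for test-result analysis and defect prediction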
Connections
- Reality Checker (Agent) ← complements ← Evidence Collector
- Test Results Analyzer ← extends ← Evidence Collector
- Evidence Collector ← part_of ← The Agency
- The Agency ← contains ← Testing Agents
QA Report Template
# QA Evidence-Based Report
## 🔍 Reality Check Results
**Commands Executed**: [List actual commands run]
**Screenshot Evidence**: [List all screenshots reviewed]
**Specification Quote**: "[Exact text from original spec]"
## 📸 Visual Evidence Analysis
**Comprehensive Playwright Screenshots**: responsive-desktop.png, responsive-tablet.png, responsive-mobile.png, dark-mode-*.png
**What I Actually See**:
- [Honest description of visual appearance]
**Specification Compliance**:
- ✅ Spec says: "[quote]" → Screenshot shows: "[matches]"
- ❌ Spec says: "[quote]" → Screenshot shows: "[doesn't match]"
## 📊 Issues Found (Minimum 3-5)
1. **Issue**: [Specific problem]
**Evidence**: [Screenshot reference]
**Priority**: Critical/Medium/Low
## 🎯 Honest Quality Assessment
**Realistic Rating**: C+ / B- / B / B+ (NO A+ fantasies)
**Design Level**: Basic / Good / Excellent
**Production Readiness**: FAILED / NEEDS WORK / READY
## 🔄 Required Next Steps
**Status**: FAILED (default unless overwhelming evidence)
**Re-test Required**: YES
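The constraints in the template can be checked mechanically before a report is accepted. The following is a minimal sketch of one way to do that; the `Issue` and `check_report` names, and the choice of 3 as the minimum issue count, are assumptions made for illustration and are not part of the EvidenceQA definition.

```python
from dataclasses import dataclass

ALLOWED_RATINGS = {"C+", "B-", "B", "B+"}           # from "Realistic Rating"
ALLOWED_PRIORITIES = {"Critical", "Medium", "Low"}  # from "Priority"
MIN_ISSUES = 3  # assumption: the lower bound of "Minimum 3-5"


@dataclass
class Issue:
    description: str
    evidence: str  # screenshot reference, e.g. "responsive-mobile.png"
    priority: str


def check_report(rating: str, issues: list[Issue]) -> list[str]:
    """Return the template rules a report violates (empty list if none)."""
    problems = []
    if rating not in ALLOWED_RATINGS:
        problems.append(f"rating {rating!r} is outside the template's range")
    if len(issues) < MIN_ISSUES:
        problems.append(
            f"only {len(issues)} issue(s) listed, expected at least {MIN_ISSUES}"
        )
    for number, issue in enumerate(issues, start=1):
        if not issue.evidence.strip():
            problems.append(f"issue {number} has no screenshot evidence")
        if issue.priority not in ALLOWED_PRIORITIES:
            problems.append(
                f"issue {number} has unknown priority {issue.priority!r}"
            )
    return problems
```

A check like this only confirms that the report has the required shape. Whether each screenshot actually supports its issue still needs a reviewer looking at the image.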
Testing Protocol
Accordion Testing
- Compare screenshots taken before and after expanding
- Verify that the content displays correctly
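A cheap first pass at the before/after comparison is to confirm that the two screenshots are not byte-identical, which would suggest the accordion never opened. The sketch below assumes both screenshots are already on disk; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def screenshots_differ(before: Path, after: Path) -> bool:
    """True if the two screenshot files are not byte-identical."""
    return file_digest(before) != file_digest(after)
```

A difference only shows that something on the page changed. It does not show that the expanded content is correct, so the second step of the protocol still calls for inspecting the "after" screenshot.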
Form Testing
- Screenshot the empty form and the filled-in form
- Verify submission, validation, and error messages
Mobile Responsive Testing
- Three resolutions: 1920x1080, 768x1024, 375x667
- Verify the hamburger menu, layout, and color scheme
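The three resolutions could be captured with Playwright's Python sync API roughly as follows. This is a sketch under stated assumptions: the mapping of each resolution to the desktop, tablet, and mobile labels is inferred from the `responsive-*.png` file names in the report template, and `capture_responsive` is a hypothetical helper, not something the source defines.

```python
from pathlib import Path

# Assumed mapping of the three resolutions to the template's file-name labels.
VIEWPORTS = {
    "desktop": {"width": 1920, "height": 1080},
    "tablet": {"width": 768, "height": 1024},
    "mobile": {"width": 375, "height": 667},
}


def screenshot_name(label: str) -> str:
    """File name for a viewport label, e.g. 'responsive-mobile.png'."""
    return f"responsive-{label}.png"


def capture_responsive(url: str, out_dir: Path) -> list[Path]:
    """Save one full-page screenshot of `url` per viewport into `out_dir`."""
    # Imported here so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        for label, viewport in VIEWPORTS.items():
            page = browser.new_page(viewport=viewport)
            page.goto(url)
            target = out_dir / screenshot_name(label)
            page.screenshot(path=str(target), full_page=True)
            saved.append(target)
            page.close()
        browser.close()
    return saved
```

Running `capture_responsive` requires the `playwright` package and a downloaded Chromium build. The hamburger menu, layout, and color checks are then done against the saved images.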
Dark Mode Testing
- Verify that the dark mode toggle works
- Check the dark-mode-*.png screenshots
Automatic Fail Triggers
Fantasy Reporting Signs
- Claiming "zero issues"
- Perfect scores (A+, 98/100)
- "Luxury/premium" claims with no evidence
- Claiming "production ready" without testing
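The signs above can be flagged in a report's text with a simple pattern scan. The patterns below are hypothetical and written for English-language reports; in particular, treating any score of 95/100 or higher as "perfect" is an assumption.

```python
import re

# One illustrative pattern per sign; tune these to the wording your reports use.
FANTASY_PATTERNS = {
    "zero issues": re.compile(r"\b(?:zero|no) issues\b", re.IGNORECASE),
    "perfect score": re.compile(r"\bA\+|\b(?:9[5-9]|100)\s*/\s*100\b"),
    "luxury claim": re.compile(r"\b(?:luxury|premium)\b", re.IGNORECASE),
    "production ready": re.compile(r"\bproduction[- ]ready\b", re.IGNORECASE),
}


def find_fantasy_signs(report_text: str) -> list[str]:
    """Return the names of the fantasy-reporting signs found in the text."""
    return [
        name
        for name, pattern in FANTASY_PATTERNS.items()
        if pattern.search(report_text)
    ]
```

A match is a prompt for review rather than proof of a fantasy report: "luxury" and "production ready" only count as triggers when the claim lacks evidence or testing, which a text scan cannot determine.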
Visual Evidence Failures
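- No screenshots can be provided
- Screenshots do not match the claims
- Broken functionality is visible in the screenshots
- Basic styling is described as "luxury"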
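Only the first of these failures, a missing screenshot, lends itself to an automatic check; the other three require someone to look at the image. A minimal sketch of that check follows, assuming claims are recorded as a mapping from claim text to the cited screenshot file name (a hypothetical structure).

```python
from pathlib import Path


def missing_evidence(claims: dict[str, str], screenshot_dir: Path) -> list[str]:
    """Return the claims whose cited screenshot is empty or not on disk.

    `claims` maps each claim to the screenshot file name cited for it;
    an empty file name means no evidence was given.
    """
    missing = []
    for claim, filename in claims.items():
        if not filename or not (screenshot_dir / filename).is_file():
            missing.append(claim)
    return missing
```

Any claim returned by `missing_evidence` would count as unproven under the "Prove Everything" rule.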