---
title: "Evidence Collector"
type: source
tags: [testing, qa, evidence, agency]
date: 2026-04-21
---

## Source File
- [[raw/Agent/agency-agents/testing/testing-evidence-collector.md]]

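## Summary
- Core topic: the role definition and testing methodology of EvidenceQA, a QA evidence-collection agent
- Problem domain: the quality-assurance process in AI agent development, avoiding evidence-free "fantasy" reports
- Method/mechanism: evidence-driven QA based on Playwright screenshots, reproducible commands, and fact checking
- Conclusion/value: establishes realistic quality-assessment standards, expects to find 3-5 issues by default, and requires visual evidence
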
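## Key Claims
- Visual evidence is the only truth: a feature that cannot be proven with a screenshot is treated as nonexistent
- Default to finding issues: a first implementation always has 3-5+ issues, and "zero issues" is a red flag
- Prove everything: every claim needs screenshot evidence to back it up
- Honest quality assessment: rate as Basic/Good/Excellent and do not accept inflated A+ ratings
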
## Key Quotes
> "Screenshots Don't Lie": visual evidence is the only truth
> "Default to Finding Issues": a first implementation always has 3-5+ issues
> "Prove Everything": every claim needs screenshot evidence

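## Key Concepts
- [[证据驱动 QA(Evidence-Driven QA)|Evidence-Driven QA]]: a QA methodology that requires visual evidence for every claim
- [[幻想式报告(Fantasy Reporting)|Fantasy Reporting]]: optimistic claims with no supporting evidence, such as "zero issues" or "perfect score"
- [[Reality Checker]]: verifies the real state of a feature through actual commands and screenshots
- [[Playwright 截图|Playwright screenshots]]: automated capture of UI screenshots as QA evidence
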
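## Key Entities
- [[EvidenceQA]]: a screenshot-driven QA specialist agent that detests fantasy reporting
- [[Reality Checker]]: a quality-checking agent that works alongside EvidenceQA
- [[Test Results Analyzer]]: an agent for test-result analysis and defect prediction
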
## Connections
- [[Reality Checker (Agent)]] ← complements ← [[Evidence Collector]]
- [[Test Results Analyzer]] ← extends ← [[Evidence Collector]]
- [[Evidence Collector]] ← part_of ← [[The Agency]]
- [[The Agency]] ← contains ← [[Testing Agents]]

## QA Report Template

```markdown
# QA Evidence-Based Report

## 🔍 Reality Check Results
**Commands Executed**: [List actual commands run]
**Screenshot Evidence**: [List all screenshots reviewed]
**Specification Quote**: "[Exact text from original spec]"

## 📸 Visual Evidence Analysis
**Comprehensive Playwright Screenshots**: responsive-desktop.png, responsive-tablet.png, responsive-mobile.png, dark-mode-*.png
**What I Actually See**:
- [Honest description of visual appearance]

**Specification Compliance**:
- ✅ Spec says: "[quote]" → Screenshot shows: "[matches]"
- ❌ Spec says: "[quote]" → Screenshot shows: "[doesn't match]"

## 📊 Issues Found (Minimum 3-5)
1. **Issue**: [Specific problem]
   **Evidence**: [Screenshot reference]
   **Priority**: Critical/Medium/Low

## 🎯 Honest Quality Assessment
**Realistic Rating**: C+ / B- / B / B+ (NO A+ fantasies)
**Design Level**: Basic / Good / Excellent
**Production Readiness**: FAILED / NEEDS WORK / READY

## 🔄 Required Next Steps
**Status**: FAILED (default unless overwhelming evidence)
**Re-test Required**: YES
```

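The template's two hard rules (a minimum number of issues, and a screenshot reference behind each issue) can be enforced in code before a report is written. The following is a minimal sketch, not part of the source agent definition: the `Issue` and `QAReport` classes, the `MIN_ISSUES` constant, and the choice to render only three of the template's sections are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

MIN_ISSUES = 3  # assumption: lower bound of the template's "Minimum 3-5"


@dataclass
class Issue:
    description: str
    evidence: str  # screenshot file name backing the issue
    priority: str  # "Critical", "Medium", or "Low"


@dataclass
class QAReport:
    commands: list[str]
    screenshots: list[str]
    issues: list[Issue] = field(default_factory=list)
    status: str = "FAILED"  # the template's default status

    def render(self) -> str:
        # Refuse to render a report that lists too few issues.
        if len(self.issues) < MIN_ISSUES:
            raise ValueError(f"need at least {MIN_ISSUES} issues, got {len(self.issues)}")
        # Every issue must point at a screenshot that was actually reviewed.
        for issue in self.issues:
            if issue.evidence not in self.screenshots:
                raise ValueError(f"no screenshot evidence for: {issue.description}")
        lines = [
            "# QA Evidence-Based Report",
            "",
            "## 🔍 Reality Check Results",
            f"**Commands Executed**: {', '.join(self.commands)}",
            f"**Screenshot Evidence**: {', '.join(self.screenshots)}",
            "",
            "## 📊 Issues Found (Minimum 3-5)",
        ]
        for number, issue in enumerate(self.issues, start=1):
            lines += [
                f"{number}. **Issue**: {issue.description}",
                f"   **Evidence**: {issue.evidence}",
                f"   **Priority**: {issue.priority}",
            ]
        lines += ["", "## 🔄 Required Next Steps", f"**Status**: {self.status}"]
        return "\n".join(lines)
```

Raising an error instead of silently padding the issue list keeps the report honest: the reviewer has to go back to the screenshots rather than invent findings.
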
## Testing Protocol

### Accordion Testing
- Compare screenshots taken before and after expanding
- Verify that the content displays correctly

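A cheap first pass on the before/after comparison is to check that the two screenshots differ at all. This is only a sketch under assumptions: the function names are hypothetical, and a byte-level difference shows that something changed on screen, not that the expanded content is correct, so the screenshots still need to be looked at.

```python
import hashlib


def digest(image_bytes: bytes) -> str:
    """SHA-256 of the raw screenshot bytes, handy for citing in a report."""
    return hashlib.sha256(image_bytes).hexdigest()


def accordion_changed(before: bytes, after: bytes) -> bool:
    """True when the before and after screenshots are not byte-identical."""
    return digest(before) != digest(after)
```

In practice the bytes would come from files, for example `Path("accordion-before.png").read_bytes()` (a hypothetical file name). Identical digests suggest the click did nothing visible, which is worth reporting as an issue.
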
### Form Testing
- Capture screenshots of the empty form and the filled-in form
- Verify submission, validation, and error messages

### Mobile Responsive Testing
- Test three viewport sizes: 1920x1080, 768x1024, and 375x667
- Verify the hamburger menu, layout, and color scheme

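The three viewport sizes could be captured with Playwright's Python sync API roughly as follows. This is a sketch, not the agent's actual script: pairing each size with the `responsive-*.png` file names from the report template is an assumption based on list order, and `capture_responsive` and the `qa-screenshots` directory are hypothetical names. It requires the `playwright` package and a browser installed via `playwright install`.

```python
from pathlib import Path

# Assumption: sizes are paired with the template's file names by list order.
VIEWPORTS = {
    "responsive-desktop.png": {"width": 1920, "height": 1080},
    "responsive-tablet.png": {"width": 768, "height": 1024},
    "responsive-mobile.png": {"width": 375, "height": 667},
}


def capture_responsive(url: str, out_dir: str = "qa-screenshots") -> list[Path]:
    """Save one full-page screenshot of `url` per viewport size."""
    # Imported lazily so the viewport table is usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        for filename, viewport in VIEWPORTS.items():
            # Resize first, then load, so the page lays out at the target size.
            page.set_viewport_size(viewport)
            page.goto(url)
            target = out / filename
            page.screenshot(path=str(target), full_page=True)
            saved.append(target)
        browser.close()
    return saved
```

Reloading the page after each resize is a deliberate choice here: some layouts only pick their breakpoint at load time, so resizing an already loaded page can hide responsive bugs.
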
### Dark Mode Testing
- Verify that the dark mode toggle works
- Review the dark-mode-*.png screenshots

## Automatic Fail Triggers

### Fantasy Reporting Signs
- Claiming "zero issues"
- Perfect scores (A+, 98/100)
- Claims of "luxury/premium" quality without evidence
- Declaring "production ready" without testing

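Several of these signs are textual and can be flagged with a simple scan of a submitted report. The sketch below is illustrative only: the pattern names, the exact wording matched, and the 95/100 cutoff for a "near-perfect" score are assumptions (the source gives only A+ and 98/100 as examples). A match is a prompt for review, since a regex cannot tell whether a claim has evidence behind it.

```python
import re

# Assumed patterns; each match is a sign to review, not proof of a bad report.
FANTASY_PATTERNS = {
    "zero issues claimed": re.compile(r"\b(zero|no)\s+issues\b", re.IGNORECASE),
    "perfect grade": re.compile(r"rating\W*A\+", re.IGNORECASE),
    # Assumption: treat 95/100 and above as suspiciously high.
    "near-perfect score": re.compile(r"\b(9[5-9]|100)\s*/\s*100\b"),
    "luxury/premium claim": re.compile(r"\b(luxury|premium)\b", re.IGNORECASE),
    "production-ready claim": re.compile(r"\bproduction[\s-]ready\b", re.IGNORECASE),
}


def fantasy_signs(report_text: str) -> list[str]:
    """Return the names of all fantasy-reporting patterns found in the text."""
    return [
        name
        for name, pattern in FANTASY_PATTERNS.items()
        if pattern.search(report_text)
    ]
```

For example, `fantasy_signs("Zero issues found. Rating: A+ (98/100)")` returns `["zero issues claimed", "perfect grade", "near-perfect score"]`, while a report stating "Found 4 issues. Rating: B-" returns an empty list.
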
### Visual Evidence Failures
- Screenshots cannot be provided
- Screenshots do not match the claims
- Broken functionality is visible in the screenshots
- Basic styling is described as "luxury"

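
Only the first of these failures lends itself to an automated check: whether each screenshot a report cites actually exists. A minimal sketch, assuming screenshots are stored as files in one directory (`missing_evidence` is a hypothetical helper, not something the source defines); the other three failures still require a person or agent to look at the images.

```python
from pathlib import Path


def missing_evidence(referenced: list[str], screenshot_dir: Path) -> list[str]:
    """Return cited screenshot names that have no non-empty file on disk."""
    missing = []
    for name in referenced:
        path = screenshot_dir / name
        # A zero-byte file is treated as missing: it proves nothing.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Any non-empty result maps directly onto the "screenshots cannot be provided" trigger, so the report can be failed before its claims are read.
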