nexus/wiki/concepts/Visual-Coherence-Engine.md
2026-04-21 08:02:52 +08:00

---
title: "Visual Coherence Engine"
type: concept
tags: [ai, image-generation, gemini, branding]
sources: [marketing-carousel-growth-engine]
last_updated: 2026-04-21
---
## Definition
A generation system that uses Gemini image-to-image generation to keep the 6 slides of a carousel visually consistent.
## Mechanism
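1. **Slide 1**: generated from a text-only prompt; it establishes the visual DNA (colors, typography, aesthetic).
2. **Slides 2-6**: generated with Gemini image-to-image, using Slide 1 as the reference image input so that each slide stays visually consistent with it.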
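
The two steps above can be sketched as a small planning function. This is a minimal sketch rather than the actual pipeline code: `SlideJob` and `plan_carousel` are hypothetical names, and the sketch only decides which slides receive a reference image without calling any model.

```python
from dataclasses import dataclass
from typing import List, Optional

SLIDE_COUNT = 6  # carousel length stated in the Definition


@dataclass(frozen=True)
class SlideJob:
    index: int                      # 1-based slide number
    prompt: str                     # text prompt for this slide
    reference_image: Optional[str]  # None for slide 1, else the path to slide 1


def plan_carousel(prompts: List[str], slide1_path: str = "slide-1.jpg") -> List[SlideJob]:
    """Return one job per slide: slide 1 is text-only, slides 2-6 reference slide 1."""
    if len(prompts) != SLIDE_COUNT:
        raise ValueError(f"expected {SLIDE_COUNT} prompts, got {len(prompts)}")
    jobs = []
    for i, prompt in enumerate(prompts, start=1):
        # Only the first slide is generated without a reference image.
        reference = None if i == 1 else slide1_path
        jobs.append(SlideJob(index=i, prompt=prompt, reference_image=reference))
    return jobs
```

Every later slide references Slide 1 rather than its immediate predecessor, which matches the mechanism described above and avoids accumulating drift from slide to slide.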
## Technical Implementation
- **Model**: `gemini-3.1-flash-image-preview`
- **Input**: `--input-image slide-1.jpg`, passed as the reference image
- **Output**: 768x1376 JPG (as required by TikTok)
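
As an illustration of how these settings might be assembled into a command line, here is a hedged sketch. Only the model name, the `--input-image` flag, and the output size come from this page; the `--model`, `--prompt`, and `--output` flag names are assumptions, and the generator executable is left to the caller because this page does not name it.

```python
from typing import List, Optional

MODEL = "gemini-3.1-flash-image-preview"  # model named on this page
OUTPUT_SIZE = (768, 1376)                 # width x height named on this page


def build_command(executable: List[str], prompt: str, output: str,
                  input_image: Optional[str] = None) -> List[str]:
    """Assemble an argv list for generating one slide.

    Only --input-image is taken from this page; --model, --prompt and
    --output are hypothetical flag names used for illustration.
    """
    if not output.lower().endswith((".jpg", ".jpeg")):
        raise ValueError("output must be a JPG file")
    cmd = [*executable, "--model", MODEL, "--prompt", prompt, "--output", output]
    if input_image is not None:
        # Slides 2-6 pass slide 1 as the reference image.
        cmd += ["--input-image", input_image]
    return cmd
```

The resulting list could be handed to `subprocess.run` once the real executable and flag names are known.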
## Brand Integration
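- CSS colors are extracted with Playwright and incorporated into the prompt
- Font style and size are kept consistent through a structured prompt
- Background scenes evolve narratively from slide to slide while staying visually unified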
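
The prompt-building side of the points above could look roughly like this. It is a sketch under stated assumptions: the extracted colors are assumed to arrive as comma-separated `rgb(...)` or `rgba(...)` strings (the form computed styles usually take), the Playwright extraction itself is not shown, and `css_color_to_hex`, `brand_prompt`, and the prompt layout are hypothetical.

```python
import re
from typing import Dict

# Matches the first three channels of "rgb(r, g, b)" or "rgba(r, g, b, a)".
_RGB = re.compile(r"rgba?\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})")


def css_color_to_hex(value: str) -> str:
    """Convert 'rgb(r, g, b)' or 'rgba(r, g, b, a)' to '#rrggbb' (alpha is dropped)."""
    match = _RGB.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported CSS color: {value!r}")
    r, g, b = (min(int(c), 255) for c in match.groups())
    return f"#{r:02x}{g:02x}{b:02x}"


def brand_prompt(scene: str, colors: Dict[str, str], font: str) -> str:
    """Build a structured prompt: the brand block is fixed, only the scene varies."""
    palette = ", ".join(f"{role} {css_color_to_hex(v)}" for role, v in colors.items())
    return (
        f"Scene: {scene}\n"
        f"Brand colors: {palette}\n"
        f"Typography: {font}, same style and size on every slide"
    )
```

Keeping the color and typography lines identical across all six prompts while changing only the scene line mirrors the idea that the background narrative evolves while the brand elements stay fixed.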
## Aliases
- 视觉一致性引擎
- Image-to-Image Pipeline