nexus/wiki/concepts/Visual-Coherence-Engine.md

---
title: "Visual Coherence Engine"
type: concept
tags: [ai, image-generation, gemini, branding]
sources: [marketing-carousel-growth-engine]
last_updated: 2026-04-21
---

## Definition
A generation system that uses Gemini image-to-image generation to keep the 6 slides of a carousel visually consistent.

## Mechanism
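1. **Slide 1**: generated from a text-only prompt; it establishes the visual DNA (colors, fonts, aesthetic).
2. **Slides 2-6**: generated with Gemini image-to-image, using Slide 1 as the reference image input, so they stay visually consistent with it.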
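
The two-phase mechanism above can be sketched as a small planning step. This is an illustrative sketch only: `SlideJob` and `plan_carousel` are hypothetical names, not part of the actual pipeline, and the default file name `slide-1.jpg` is assumed from the input example on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SlideJob:
    """One generation request (hypothetical structure for illustration)."""
    index: int
    prompt: str
    reference_image: Optional[str]  # None means text-only generation


def plan_carousel(prompts: List[str], first_slide_path: str = "slide-1.jpg") -> List[SlideJob]:
    """Slide 1 gets no reference image; slides 2-6 all reference slide 1."""
    if len(prompts) != 6:
        raise ValueError("expected exactly 6 prompts, one per slide")
    jobs = []
    for i, prompt in enumerate(prompts, start=1):
        reference = None if i == 1 else first_slide_path
        jobs.append(SlideJob(index=i, prompt=prompt, reference_image=reference))
    return jobs
```

Every later slide points back at Slide 1 rather than at its immediate predecessor, which matches the description above and avoids style drift accumulating from slide to slide.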

## Technical Implementation
- **Model**: `gemini-3.1-flash-image-preview`
- **Input**: `--input-image slide-1.jpg` as the reference image
- **Output**: 768x1376 JPG (TikTok requirement)
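
As a rough illustration of how these settings might be assembled into a command line, the sketch below builds an argument list. Only the model name, the `--input-image` flag, and the output size come from this page; the script name `generate_slide.py` and the `--model`, `--prompt`, and `--output` flags are hypothetical placeholders, not a documented CLI.

```python
from typing import List, Optional

MODEL = "gemini-3.1-flash-image-preview"  # from this page
OUTPUT_SIZE = (768, 1376)                  # (width, height) in pixels, from this page


def build_args(prompt: str, output_path: str, input_image: Optional[str] = None) -> List[str]:
    """Build an argv list for a hypothetical generation script.

    All names except --input-image are assumptions made for illustration.
    """
    args = [
        "generate_slide.py",      # hypothetical script name
        "--model", MODEL,         # hypothetical flag
        "--prompt", prompt,       # hypothetical flag
        "--output", output_path,  # hypothetical flag
    ]
    if input_image is not None:
        # Slides 2-6 pass Slide 1 as the reference image.
        args += ["--input-image", input_image]
    return args
```

Calling `build_args("...", "slide-1.jpg")` omits the reference flag for the first slide, while `build_args("...", "slide-2.jpg", "slide-1.jpg")` appends it.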

## Brand Integration
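- CSS colors are extracted with Playwright and incorporated into the prompt
- Font style and size are kept consistent through a structured prompt
- Background scenes progress narratively from slide to slide while staying visually unified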
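
One way the extracted colors could feed a structured prompt is sketched below. It assumes the colors arrive as CSS computed-style strings such as `rgb(255, 0, 128)`; the function names and the prompt field layout are hypothetical and are not taken from the actual pipeline.

```python
import re
from typing import List

# Matches the legacy comma-separated form, e.g. "rgb(255, 0, 128)" or "rgba(255, 0, 128, 0.5)".
_RGB = re.compile(r"rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)")


def css_color_to_hex(value: str) -> str:
    """Convert an rgb()/rgba() string to #rrggbb; pass other values through lowercased."""
    match = _RGB.match(value.strip())
    if match is None:
        return value.strip().lower()
    r, g, b = (int(channel) for channel in match.groups())
    return f"#{r:02x}{g:02x}{b:02x}"


def build_slide_prompt(scene: str, headline: str, css_colors: List[str], font_style: str) -> str:
    """Assemble a structured prompt (hypothetical layout) with fixed brand fields."""
    palette = ", ".join(css_color_to_hex(color) for color in css_colors)
    return "\n".join([
        f"Scene: {scene}",
        f"Headline text: {headline}",
        f"Brand palette: {palette}",
        f"Typography: {font_style}; keep the same font style and size on every slide",
    ])
```

Keeping the palette and typography lines identical across all six prompts while only the scene and headline change is one plausible way to let the background evolve without losing the brand look.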

## Aliases
- 视觉一致性引擎
- Image-to-Image Pipeline