init: video-create project with skills and accounts
.claude/skills/image-generator/SKILL.md
@@ -0,0 +1,351 @@
---
name: image-generator
description: Image generation skill. Supports two models, Gemini and Midjourney (MJ). Covers batch generation, image-to-image, style transfer, and automatic splitting of 4-in-1 grids. Trigger words: 生图, 生成图片, 批量出图, 图片素材, MJ生图, Gemini生图, 图生图, 风格转换.
---

# Image Generation

Dual-model image generation: Gemini (fast) plus MJ (premium). **Anchor every batch to reference images** so that batch output stays stylistically consistent.

---

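## Core Principle: Reference Images First

**Reference images are the key to generation quality.** Generating without one means uncontrolled style and inconsistent batches.

| With reference images | Without reference images |
|---------|---------|
| Unified style, consistent color | Style drifts randomly from image to image |
| Composition and mood are controllable | Composition and mood are left entirely to the model |
| Visual continuity across batches | Images in one batch look like they came from different accounts |
| Prompts can be shorter | Style has to be described with very long prompts |

**Execution rules**:
1. **Reference images available** → always generate from them (Gemini image-to-image / MJ --sref)
2. **No reference images** → first ask the user for 1-3 reference images, or generate one text-to-image sample and get the user's confirmation before batching
3. **Reference image location** → the `accounts/{account}/references/` directory
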
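
The execution rules above can be sketched as a small decision function. This is a minimal sketch, not the skill's actual code: `planGeneration` and the action names it returns are hypothetical, and only the `accounts/{account}/references/` location comes from the rules themselves.

```js
// Hypothetical helper: decide the next step from the reference images found.
// The action names ('use-references', 'ask-user') are illustrative assumptions.
function planGeneration(account, referenceFiles) {
  const referencesDir = `accounts/${account}/references/`
  if (referenceFiles.length > 0) {
    // Rule 1: references exist, so generation must be anchored to them.
    return { action: 'use-references', referencesDir, references: referenceFiles }
  }
  // Rule 2: no references, so ask for 1-3 images or confirm a single sample first.
  return {
    action: 'ask-user',
    referencesDir,
    options: ['provide 1-3 reference images', 'confirm one text-to-image sample before batching'],
  }
}

console.log(planGeneration('forbidden-emperor', ['ref_style_1.png']).action)
console.log(planGeneration('forbidden-emperor', []).action)
```
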
---

## Generation Workflow

Two modes: **style reference** (a style anchor) and **character lock** (character consistency).

**Before starting generation, always ask the user which mode to use:**

> Which generation mode do you want?
> 1. **Style reference**: unify the tone and texture of a batch; each image has different content but the same style (Gemini for speed / MJ for premium quality)
> 2. **Character lock**: keep the same character consistent across different scenes; supported by Gemini only

```dot
digraph image_gen {
    rankdir=TB
    node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
    edge [fontsize=10]

    start [label="User triggers generation", shape=oval, fillcolor="#e3f2fd"]
    ask [label="Ask the user\nto choose a mode", shape=diamond, fillcolor="#fff9c4"]

    read_ref [label="Read reference images (references/)\n+ style files (styles/)"]
    gen_prompt [label="Generate a prompt for each script line"]

    gemini_style [label="Gemini edit()\nstyle reference", fillcolor="#e8f5e9"]
    mj_style [label="MJ --sref\nstyle reference", fillcolor="#fff3e0"]
    gemini_char [label="Gemini edit()\ncharacter lock\ncharacter consistency", fillcolor="#e1bee7"]

    validate [label="Quality check\ncompare with reference images"]

    start -> ask
    ask -> read_ref
    read_ref -> gen_prompt
    gen_prompt -> gemini_style [label="style reference\nfast/batch"]
    gen_prompt -> mj_style [label="style reference\npremium/photorealistic"]
    gen_prompt -> gemini_char [label="character lock"]
    gemini_style -> validate
    mj_style -> validate
    gemini_char -> validate
}
```

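### Style Reference vs. Character Lock

| | Style reference | Character lock |
|---|---------|---------|
| Purpose | Unify the tone, lighting, and texture of a batch | Keep the same character consistent across different scenes |
| Reference image content | Style samples (lighting, tone, texture) | Front-facing or half-body portrait of the character |
| Output | Content may differ completely from image to image, but the style is unified | The same character in every image, with different scenes, actions, or outfits |
| Available models | Gemini + MJ | **Gemini only** (`edit` image-to-image) |
| Prompt | Describe the scene content; the style is anchored by the reference images | Describe the character's new scene, action, or outfit; the character is locked by the reference images |
| Best for | Landscape narratives, scene illustrations, background assets | Serialized characters, character stories, IP content |
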
### Character Lock Usage (Gemini Only)

```bash
# Character lock: character reference image + new scene description
node .claude/skills/video-from-script/scripts/gemini-image-generator.js edit \
  "The same woman warrior standing on a cliff overlooking a burning city, dramatic lighting" \
  -i ./references/character_front.png \
  -o ./output -r 9:16

# Multi-angle lock (front + side)
node .claude/skills/video-from-script/scripts/gemini-image-generator.js edit \
  "The same woman warrior in a dark forest, holding a torch" \
  -i ./references/character_front.png,./references/character_side.png \
  -o ./output -r 9:16
```

**Character lock essentials**:
- Reference images must be **clear photos of the same character** (front-facing or half-body preferred)
- Use "the same [character]" in the prompt to emphasize character continuity
- Multiple reference images from different angles give stronger consistency
- Only Gemini supports this mode (MJ --sref transfers style only and cannot lock character features)

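
When the `edit` command is launched from a script rather than typed by hand, the argument list for the commands above can be assembled as follows. This is a sketch under stated assumptions: `buildCharacterLockArgs` is a hypothetical helper, it uses only the flags shown in this document (`-i`, `-o`, `-r`), and the automatic "The same" prefix is an illustrative way to apply the prompt advice, not documented behavior. The resulting array could be passed to `child_process.execFileSync('node', args)`.

```js
// Hypothetical helper: assemble the argument list for the `edit` subcommand.
const SCRIPT = '.claude/skills/video-from-script/scripts/gemini-image-generator.js'

function buildCharacterLockArgs(scene, referenceImages, { outputDir = './output', ratio = '9:16' } = {}) {
  if (referenceImages.length === 0) {
    throw new Error('character lock needs at least one reference image')
  }
  // Prefix the prompt so it states that the same character continues (assumption).
  const prompt = scene.toLowerCase().startsWith('the same') ? scene : `The same ${scene}`
  // Multiple reference images are joined with commas for -i.
  return [SCRIPT, 'edit', prompt, '-i', referenceImages.join(','), '-o', outputDir, '-r', ratio]
}

const args = buildCharacterLockArgs('woman warrior in a dark forest, holding a torch', [
  './references/character_front.png',
  './references/character_side.png',
])
console.log(args.join(' '))
```
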
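---

## Model Selection

| Scenario | Model | How reference images are passed | Why |
|------|------|-----------|------|
| Fast output, batches | **Gemini** | Local image files passed directly (`-i`) | ~10s; the API returns a single image directly |
| Premium images, photorealistic/artistic | **MJ** | Public URL (`-r`, `--sref`) | High quality, pick 1 of 4, ~60s |
| Blending the style of reference images | Gemini or MJ | See the detailed instructions below | Both models support it |

---
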
## Prerequisites

```
1. node --version → >= 18
2. cd .claude/skills/video-from-script/scripts && npm install
3. Set geminiApiKey and mjApiKey in skills/config.json
4. Put reference images in accounts/{account}/references/ (at least 1)
```

---

## Reference Images in Detail

### Placing Reference Images

```
accounts/{account}/
├── account.json              # model and aspect-ratio settings
├── styles/                   # style prompt strategies
│   └── oriental-mythology-ue5.md
└── references/               # reference images (style anchors)
    ├── ref_style_1.png       # renaming to something meaningful is recommended
    ├── ref_style_2.png
    └── ref_style_3.png
```

**Criteria for choosing reference images**:

| Good reference images | Poor reference images |
|-----------|------------|
| Represent the final style you want | Random images picked off the web |
| Lighting, tone, and composition are all satisfactory | Only one dimension is satisfactory |
| 3 or fewer (too many will conflict) | A pile of 10 |
| Different scenes in the same style | A mix of different styles |

### Using Reference Images with Gemini

```bash
# Image-to-image (core usage: reference image + prompt)
node .claude/skills/video-from-script/scripts/gemini-image-generator.js edit \
  "A water deity in flowing hanfu, celestial palace background" \
  -i ./references/ref_style_1.png \
  -o ./output -r 9:16

# Multiple reference images (Gemini considers all of them at once)
node .claude/skills/video-from-script/scripts/gemini-image-generator.js edit \
  "A water deity in flowing hanfu, celestial palace background" \
  -i ./references/ref_style_1.png,./references/ref_style_2.png \
  -o ./output -r 9:16

# Batch with reference images (pipeline init + run)
node .claude/skills/video-from-script/scripts/pipeline.js init \
  --account forbidden-emperor --mode single \
  --items '[{"text":"...","imagePrompt":"...","keyword":"关键词"}]'
node .claude/skills/video-from-script/scripts/pipeline.js run \
  --manifest ./output/forbidden-emperor_XXXXXXXX_001/manifest.json \
  --phase images
```

**How Gemini reference images work**: each reference image is sent as Base64 inline data together with the text prompt, so the model sees the reference images and the prompt at the same time.

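
The inline-data mechanism described above might look like the following when a request body is built by hand. This is a sketch, not the code in `gemini-image-generator.js`: `buildEditRequest` is hypothetical, and the field names (`contents`, `parts`, `inlineData`, `mimeType`, `data`) are assumed to follow the Gemini API's JSON format; the real script may build its payload differently.

```js
// Hypothetical sketch: pack reference images and a text prompt into one request body.
function buildEditRequest(prompt, images) {
  const imageParts = images.map(({ bytes, mimeType }) => ({
    // Each image is Base64-encoded and carried inline, next to the text.
    inlineData: { mimeType, data: Buffer.from(bytes).toString('base64') },
  }))
  // Images and text share one `parts` array, so the model receives them together.
  return { contents: [{ parts: [...imageParts, { text: prompt }] }] }
}

// Stand-in bytes (the first four bytes of a PNG header); a real caller would
// read the file, for example with fs.readFileSync('./references/ref_style_1.png').
const request = buildEditRequest('A water deity in flowing hanfu', [
  { bytes: [0x89, 0x50, 0x4e, 0x47], mimeType: 'image/png' },
])
console.log(JSON.stringify(request, null, 2))
```

Keeping the payload builder separate from the network call makes it possible to inspect exactly what the model will receive before any quota is spent.
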
### Using Reference Images with MJ

```bash
# Single reference image (--sref style reference)
node .claude/skills/video-from-script/scripts/mj-image-generator.js \
  "A water deity in flowing hanfu, celestial palace background --sref https://i.ibb.co/xxx/ref.png --sw 200" \
  -o ./output -a 9:16

# Multiple reference images (space-separated URLs after --sref)
node .claude/skills/video-from-script/scripts/mj-image-generator.js \
  "prompt --sref URL1 URL2 --sw 200" \
  -o ./output -a 9:16
```

**Notes on MJ reference images**:
- A **public URL** is required (local files must be uploaded to OSS first)
- `--sref` = Style Reference
- `--sw 200` = style weight (range 0-1000; 200 is the default used by this skill, whereas Midjourney's own documented default is 100)
- Reference images are passed as a suffix of the prompt, not as a separate parameter

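
Since the reference images travel inside the prompt string, a small builder can enforce the notes above before anything is submitted. A minimal sketch: `buildMjPrompt` is a hypothetical helper, and the public-URL check is only a prefix test, not a reachability check.

```js
// Hypothetical helper: append --sref and --sw to a prompt, as in the examples above.
function buildMjPrompt(prompt, refUrls = [], styleWeight = 200) {
  if (refUrls.length === 0) return prompt
  for (const url of refUrls) {
    // MJ cannot read local files, so anything that is not http(s) is rejected.
    if (!/^https?:\/\//.test(url)) {
      throw new Error(`MJ needs a public URL, got: ${url}`)
    }
  }
  if (!Number.isInteger(styleWeight) || styleWeight < 0 || styleWeight > 1000) {
    throw new RangeError('--sw must be an integer from 0 to 1000')
  }
  // Multiple URLs after --sref are separated by spaces.
  return `${prompt} --sref ${refUrls.join(' ')} --sw ${styleWeight}`
}

console.log(buildMjPrompt('A water deity in flowing hanfu', ['https://i.ibb.co/xxx/ref.png']))
```
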
### Uploading Reference Images to a Public URL (for MJ)

```bash
# Upload a single image
node .claude/skills/video-from-script/scripts/oss-upload.js ./references/ref_style_1.png
# → https://i.ibb.co/xxx/ref_style_1.png

# Upload in bulk
for f in ./references/*.png; do
  node .claude/skills/video-from-script/scripts/oss-upload.js "$f"
done
```

---

## Gemini Full Usage

```bash
# Text-to-image (use when there is no reference image)
node .claude/skills/video-from-script/scripts/gemini-image-generator.js generate "prompt" -o ./output -r 9:16

# Image-to-image (recommended: reference image + prompt)
node .claude/skills/video-from-script/scripts/gemini-image-generator.js edit "instruction" -i ./ref.jpg -o ./output

# Batch
node .claude/skills/video-from-script/scripts/gemini-image-generator.js batch ./prompts.txt -o ./output
```

| Option | Description |
|------|------|
| `-o, --output` | Output directory |
| `-r, --ratio` | Aspect ratio: 1:1, 9:16, 16:9, 3:4, etc. |
| `-s, --size` | Resolution: 512, 1K, 2K (default), 4K |
| `-i, --input` | Input image(s) for image-to-image; separate multiple files with commas |

---

## MJ Full Usage

```bash
# Text-to-image (automatically split into 4 images)
node .claude/skills/video-from-script/scripts/mj-image-generator.js "prompt" -o ./output -a 9:16

# With a reference image (--sref)
node .claude/skills/video-from-script/scripts/mj-image-generator.js "prompt --sref URL --sw 200" -o ./output -a 9:16

# Batch
node .claude/skills/video-from-script/scripts/mj-image-generator.js batch ./prompts.txt -o ./output

# No split (keep the original 4-in-1 grid)
node .claude/skills/video-from-script/scripts/mj-image-generator.js "prompt" --no-split
```

| Option | Description |
|------|------|
| `-o, --output` | Output directory |
| `-a, --ar` | Aspect ratio (passed to MJ via --ar) |
| `-r, --ref` | Reference image URL(s), comma-separated |
| `--no-split` | Do not split the 4-in-1 grid |
| `--keep-grid` | Keep the original grid image |

MJ flow: submit imagine → poll every 5s → download the 4-in-1 grid → split it into 4 separate PNGs with sharp.

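
The final split step amounts to computing four crop rectangles. The sketch below assumes the 4-in-1 grid is a 2×2 layout of equally sized tiles; `quadrantRegions` is a hypothetical helper that does not call sharp itself, but returns rectangles in the `{ left, top, width, height }` shape that sharp's `extract()` accepts, so each one could be used as `sharp(gridPath).extract(region).toFile(outPath)`.

```js
// Hypothetical helper: crop rectangles for a 2x2 grid (equal tiles assumed).
function quadrantRegions(gridWidth, gridHeight) {
  const width = Math.floor(gridWidth / 2)
  const height = Math.floor(gridHeight / 2)
  // Order: top-left, top-right, bottom-left, bottom-right.
  return [
    { left: 0, top: 0, width, height },
    { left: width, top: 0, width, height },
    { left: 0, top: height, width, height },
    { left: width, top: height, width, height },
  ]
}

console.log(quadrantRegions(2048, 2048))
```
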
---

## Account System Integration

When the user specifies an account, three layers of resources are read from `accounts/{account}/`:

```
accounts/{account}/
├── account.json    → default model, aspect ratio, style reference image URLs
├── styles/         → style files (prompt templates + visual rules)
└── references/     → original reference image files (style anchors)
```

**Read order**:
1. `account.json` → read the public URLs in `styles.{styleName}.references` (preferred; no upload needed)
2. `references/` → scan local reference images (when no URL exists, upload to OSS to obtain a public URL)
3. `styles/*.md` → read the prompt strategy (determines the prompt structure)
4. `account.json` → read the default settings (model, aspect ratio)

### Reference Image URLs in account.json

After a reference image is uploaded to OSS, save its URL in the `styles.{styleName}.references` array of `account.json` to avoid repeated uploads:

```json
{
  "styles": {
    "oriental-mythology-ue5": {
      "references": [
        { "file": "ref_style_1.png", "url": "https://i.ibb.co/xxx/ref_style_1.png" },
        { "file": "ref_style_2.png", "url": "https://i.ibb.co/yyy/ref_style_2.png" }
      ]
    }
  }
}
```

**At generation time**: first check whether `account.json` already holds public URLs for the requested style. If it does, use them directly; if not, upload the local files under `references/` to OSS and, once the upload succeeds, write the URLs back to `account.json`.

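
The lookup described above can be sketched as a pure function over the parsed `account.json`. Assumptions: `resolveReferences` is a hypothetical helper, and matching cached URLs to local files one by one via the `file` field is an interpretation; the text only says to prefer cached URLs and to upload when none exist.

```js
// Hypothetical sketch: cached URLs in account.json first, local files second.
function resolveReferences(accountConfig, styleName, localFiles) {
  const cached = accountConfig.styles?.[styleName]?.references ?? []
  const urlByFile = new Map(cached.filter((r) => r.url).map((r) => [r.file, r.url]))
  return {
    // URLs that can be reused directly; no upload needed.
    urls: [...urlByFile.values()],
    // Local files without a cached URL; these still need an upload and a write-back.
    toUpload: localFiles.filter((file) => !urlByFile.has(file)),
  }
}

const accountConfig = {
  styles: {
    'oriental-mythology-ue5': {
      references: [
        { file: 'ref_style_1.png', url: 'https://i.ibb.co/xxx/ref_style_1.png' },
        { file: 'ref_style_2.png', url: 'https://i.ibb.co/yyy/ref_style_2.png' },
      ],
    },
  },
}
const resolved = resolveReferences(accountConfig, 'oriental-mythology-ue5', [
  'ref_style_1.png',
  'ref_style_2.png',
  'ref_style_3.png',
])
console.log(resolved)
```

Here `ref_style_3.png` has no cached URL, so it is the only file left in `toUpload`.
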
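The user can name a style, for example "use the cyberpunk-character style". When no style is specified, the first file under `styles/` is used.

---

## Quality Requirements (Video-Asset Grade)

To protect the quality of the final video, every image must satisfy the following:

- [ ] Resolution >= 1024px (short side)
- [ ] Aspect ratio matches the target video (9:16/16:9)
- [ ] No text watermarks and no subtitle overlays
- [ ] Composition leaves free space (the bottom quarter is reserved for subtitles)
- [ ] **Style matches the reference images** (tone, lighting, and texture are consistent within a batch)
- [ ] After the MJ split, check the quality of all 4 images and discard those that fail
- [ ] Compare the first image of each batch with the reference images; if the style deviates significantly, adjust the prompt and retry
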
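
Of the checklist items above, the first two are mechanical and can be verified from the image dimensions alone. A minimal sketch, assuming the width and height have already been read from the file; `checkImage` and its 1% ratio tolerance are hypothetical, and the remaining items (watermarks, free space, style match) still need a visual review.

```js
// Hypothetical helper: check short side and aspect ratio from pixel dimensions.
function checkImage({ width, height }, targetRatio, tolerance = 0.01) {
  const [ratioW, ratioH] = targetRatio.split(':').map(Number)
  const problems = []
  if (Math.min(width, height) < 1024) {
    problems.push('short side below 1024px')
  }
  // The tolerance absorbs rounding in pixel sizes; its value is an assumption.
  if (Math.abs(width / height - ratioW / ratioH) > tolerance) {
    problems.push(`aspect ratio is not ${targetRatio}`)
  }
  return problems
}

console.log(checkImage({ width: 1152, height: 2048 }, '9:16'))
console.log(checkImage({ width: 720, height: 1280 }, '9:16'))
```
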
---

## Calling as a Module

```js
// `await` needs an async context in CommonJS, so the calls are wrapped in main()
const { generate: geminiGen, edit: geminiEdit } = require('./gemini-image-generator')
const { generate: mjGen } = require('./mj-image-generator')

async function main() {
  // Gemini text-to-image
  const textResult = await geminiGen('prompt', { outputDir: './out', aspectRatio: '9:16' })

  // Gemini image-to-image (with reference images)
  const editResult = await geminiEdit('prompt', ['./ref1.png', './ref2.png'], { outputDir: './out', aspectRatio: '9:16' })

  // MJ
  const mjResult = await mjGen('prompt', { outputDir: './out', aspectRatio: '9:16' })
  // mjResult.files = ['_1.png', '_2.png', '_3.png', '_4.png']
}

main().catch(console.error)
```

---

## File Naming Rules

Files generated by the pipeline share one naming scheme. The keyword comes from the `keyword` field of the manifest item (slugified: Chinese characters and alphanumerics are kept, the length is capped at 20 characters, and everything else becomes `_`):

| Mode | File name | Example |
|------|--------|------|
| Single image, first frame | `scene_{NN}_{keyword}.jpeg` | `scene_01_崛起.jpeg` |
| First/last frame, first frame | `scene_{NN}_{keyword}.jpeg` | `scene_01_觉醒.jpeg` |
| First/last frame, last frame | `scene_{NN}_{keyword}_last.jpeg` | `scene_01_觉醒_last.jpeg` |
| MJ candidates | `scene_{NN}_{keyword}_cand{1-4}.jpeg` | `scene_01_崛起_cand1.jpeg` |

`{NN}` = two-digit scene number (01, 02, ...), corresponding to the item's position in the `items` array.

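
The naming rule can be sketched as follows. This is not the pipeline's actual implementation: `slugifyKeyword` and `sceneFileName` are hypothetical, replacing characters before truncating is an assumed order, and mapping a zero-based array index to scene number `index + 1` is inferred from the `01` examples.

```js
// Hypothetical sketch of the slugify rule: keep Chinese and alphanumerics,
// turn everything else into "_", cap the result at 20 characters.
function slugifyKeyword(keyword) {
  return Array.from(keyword)
    .map((ch) => (/[\p{Script=Han}A-Za-z0-9]/u.test(ch) ? ch : '_'))
    .slice(0, 20)
    .join('')
}

// index is the zero-based position in the items array (assumption).
function sceneFileName(index, keyword, suffix = '') {
  const sceneNumber = String(index + 1).padStart(2, '0')
  return `scene_${sceneNumber}_${slugifyKeyword(keyword)}${suffix}.jpeg`
}

console.log(sceneFileName(0, '崛起'))
console.log(sceneFileName(0, '觉醒', '_last'))
console.log(sceneFileName(0, '崛起', '_cand1'))
```
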
---

## Detailed Reference

For the full batch production workflow (accounts, copy, prompt generation, output structure), see [batch-mode.md](references/batch-mode.md)

.claude/skills/image-generator/references/batch-mode.md
@@ -0,0 +1,93 @@
# Batch Image Production

## Workflow

```dot
digraph batch_gen {
    rankdir=LR
    node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]

    refs [label="Reference images references/\n+ styles styles/*.md", shape=folder, fillcolor="#e3f2fd"]
    prompts [label="Generate prompts\neach script line → imagePrompt\n+ videoPrompt"]
    model_gemini [label="Gemini edit()\nimage-to-image (references passed directly)", fillcolor="#e8f5e9"]
    model_mj [label="MJ --sref\nstyle reference (URL)", fillcolor="#fff3e0"]
    output [label="Output images\n+ manifest.json"]
    pick [label="Manual selection\ndelete rejected variants", shape=diamond, fillcolor="#fff9c4"]

    refs -> prompts
    prompts -> model_gemini [label="fast/batch"]
    prompts -> model_mj [label="premium/photorealistic"]
    model_gemini -> output
    model_mj -> output
    output -> pick
}
```

## Prompt Generation

### Single-Image Mode (Default)

Each script line produces 2 prompts:

| Field | Purpose | Rules |
|------|------|------|
| `imagePrompt` | Image generation | English; describes the picture content |
| `videoPrompt` | Image-to-video | Describes the **motion** (zoom/pan/dolly); at most 50 words |

### First/Last-Frame Mode (When the User Requests It)

Each script line produces 3 prompts:

| Field | Purpose | Rules |
|------|------|------|
| `imagePrompt` | Start frame | Static state |
| `lastFramePrompt` | End frame | The same scene in its motion state |
| `videoPrompt` | Transition video | "from X to Y" format |

First/last-frame principles: same scene, consistent viewpoint, contrasting states, coherent lighting.

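
Before a batch is submitted, the per-mode field requirements above can be checked with a small validator. A sketch under stated assumptions: `validateItem` is hypothetical, the mode name `single` matches the `--mode single` flag used elsewhere in this document, `first-last` is an invented label for first/last-frame mode, and the 50-word limit is applied only where the tables state it.

```js
// Hypothetical validator for one manifest item.
// 'single' comes from `--mode single`; 'first-last' is an assumed name.
const REQUIRED_FIELDS = {
  single: ['imagePrompt', 'videoPrompt'],
  'first-last': ['imagePrompt', 'lastFramePrompt', 'videoPrompt'],
}

function validateItem(item, mode) {
  const required = REQUIRED_FIELDS[mode]
  if (!required) throw new Error(`unknown mode: ${mode}`)
  const errors = required
    .filter((field) => typeof item[field] !== 'string' || item[field].trim() === '')
    .map((field) => `missing ${field}`)
  // The 50-word cap on videoPrompt is stated for single-image mode only.
  if (mode === 'single') {
    const words = (item.videoPrompt ?? '').trim().split(/\s+/).filter(Boolean)
    if (words.length > 50) errors.push('videoPrompt exceeds 50 words')
  }
  return errors
}

console.log(validateItem({ imagePrompt: 'A water deity', videoPrompt: 'slow zoom in' }, 'single'))
console.log(validateItem({ imagePrompt: 'A water deity', videoPrompt: 'from calm to storm' }, 'first-last'))
```

The second call reports the missing `lastFramePrompt`, which is the field that distinguishes the two modes.
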
## Output Directory

```
output/{account}_{YYYYMMDD}_{NNN}/
├── manifest.json    # master manifest (used throughout the whole pipeline)
├── images/          # scene_{NN}_{keyword}.jpeg
├── videos/          # scene_{NN}_{keyword}.mp4
└── audio/           # seg_001.mp3
```

Naming: image `scene_01_悬浮.jpeg` → video `scene_01_悬浮.mp4` (the keyword may contain Chinese)

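
The directory pattern and the image-to-video naming can be sketched like this. `outputDirName` and `videoNameFor` are hypothetical helpers; using the local calendar date and zero-padding the sequence number to three digits are assumptions read off the `{YYYYMMDD}` and `_001` patterns.

```js
// Hypothetical helper: output/{account}_{YYYYMMDD}_{NNN}/
function outputDirName(account, date, sequence) {
  const yyyymmdd = [
    date.getFullYear(),
    String(date.getMonth() + 1).padStart(2, '0'),
    String(date.getDate()).padStart(2, '0'),
  ].join('')
  return `output/${account}_${yyyymmdd}_${String(sequence).padStart(3, '0')}/`
}

// The video keeps the image's base name and only swaps the extension.
function videoNameFor(imageName) {
  return imageName.replace(/\.jpeg$/, '.mp4')
}

console.log(outputDirName('forbidden-emperor', new Date(2025, 0, 5), 1))
console.log(videoNameFor('scene_01_悬浮.jpeg'))
```

Because the video name is derived from the image name, a scene's assets can be paired across `images/` and `videos/` without consulting the manifest.
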
## manifest.json

For the field specification, see [manifest-schema.md](../../video-from-script/references/manifest-schema.md).

## Command Cheat Sheet

```bash
# Gemini image-to-image (recommended; local images passed directly)
node scripts/gemini-image-generator.js edit "prompt" -i ./references/ref1.png -o ./output -r 9:16

# Pipeline batch generation (recommended)
node scripts/pipeline.js init \
  --account {account} --mode single \
  --items '[{"text":"...","imagePrompt":"...","keyword":"关键词"}]'
node scripts/pipeline.js run \
  --manifest ./output/{account}_XXXXXXXX_001/manifest.json \
  --phase images

# MJ with a reference image (upload to OSS first)
node scripts/oss-upload.js ./references/ref1.png
node scripts/mj-image-generator.js "prompt --sref URL --sw 200" -o ./output -a 9:16

# Gemini text-to-image only (when there is no reference image)
node scripts/gemini-image-generator.js generate "prompt" -o ./output -r 9:16
```

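## Quality Checks

- Style matches the reference images
- Aspect ratio is correct (9:16/16:9)
- No text, watermarks, or subtitle overlays
- Subject is sharp and the composition leaves free space (bottom quarter for subtitles)
- manifest.json matches the actual files one-to-one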