init: video-create project with skills and accounts

This commit is contained in:
2026-04-29 21:04:43 +08:00
commit dadddc7aec
64 changed files with 14715 additions and 0 deletions

@@ -0,0 +1,565 @@
---
name: video-from-script
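description: Asset-production router. Dispatches by user intent to the matching sub-skill, image-generator (image generation) or capcut (final assembly). Supports two video modes, single-image and first/last-frame. Trigger words are 做视频, 视频素材, 生图+成片, 图生视频, 首尾帧.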
---
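# Asset Production Router
## Mandatory Rules
1. **No skipping workflow steps**: storyboard (pure narrative) → prompt generation (storyboard + style) → pipeline execution. Review the results between every two stages.
2. **manifest.json is the single source of truth for state**: after any operation (image generation, upload, asset replacement) completes, write the result back to the manifest immediately.
3. **Never call the image/video generation APIs with curl**: always go through `pipeline.js` or the corresponding generator script.
4. **Parallel first**: independent sub-tasks must run in parallel on sub-agents; do not work through them serially in the main conversation.
**Forbidden**: skipping the storyboard / reading the style file during the storyboard stage / continuing without updating the manifest / running the whole pipeline in one go without review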
---
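**You (the main Agent) are the director of the whole process.** Sub-agents are the executors; you are responsible for understanding intent, orchestration, quality gates, user communication, and error recovery.
## Main Agent Responsibilities
| Responsibility | Description |
|------|------|
| Intent understanding | Analyze the user's request and choose the right mode, video model, and frame mode |
| Orchestration | Decide whether Agents run serially or in parallel, pass parameters, collect results |
| Quality gates | Validate results after each stage; have the sub-agent redo anything that fails |
| User communication | Report progress and ask the user for decisions (picking images, confirming style) |
| Error recovery | Retry or switch models on API failure; generate replacements when quality falls short |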
---
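## Routing Rules
| User intent | Flow | Sub-skills |
|---------|---------|--------|
| "生图" (generate images), "批量图片" (batch images) | Image generation | `image-generator` |
| "图片成片" (images to video), "图片轮播" (image slideshow) | Existing images → assembly | `capcut` |
| "图文成片" (text-and-image video), "生图+成片" (generate + assemble) | Image generation → TTS + subtitles + assembly | `image-generator` → `capcut` |
| "图生视频", "图片转视频" (image-to-video) | Image generation → AI video → assembly | `image-generator` → Grok/VEO → `capcut` |
| "首尾帧" (first/last frame), "帧动画" (frame animation), "关键帧" (keyframes) | Image generation (paired) → VEO video → assembly | `image-generator` (frame pairs) → VEO → `capcut` |
| "文案转视频" (script to video), "配音视频" (voiceover video) | Image generation → TTS + subtitles + assembly | `image-generator` → `capcut` |
| Only says "做视频" (make a video) | **Ask**: text-and-image video / image-to-video (single image / first-last frame) | (none) |
**Follow-up question for "图生视频" (image-to-video)**: when the user says "图生视频", ask which video mode they want:
- **Single-image mode**: one image → one video clip (Grok or VEO)
- **First/last-frame mode**: start frame + end frame → one transition clip (VEO only)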
---
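## Pipeline Execution Flow
After the Agent initializes manifest.json (via `pipeline.js init`), run `pipeline.js` stage by stage. **Do not run everything in one go; review the results between stages.**
### Division of Labor
| Role | Responsibility |
|------|------|
| **Agent** (you) | **Storyboard planning** (from the script alone) → read account.json + style.md → generate imagePrompt/videoPrompt from the storyboard → initialize manifest.json → review the results of each stage |
| **Pipeline** | Mechanical execution: image generation → upload → video generation → TTS → assembly. Writes to disk after each item and supports resuming from a checkpoint |
### Execution Steps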
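````
Step -1: Intent confirmation (must be completed before any other step; confirm every item, none may be skipped)
  1. Content intent: what does the user want to make?
     - Image generation / image-to-video / images-to-video assembly / voiceover video / first-last frame
     - If ambiguous, keep asking until it is clear; do not guess
  2. Asset sources:
     - Is there an existing script or images, or should the AI write the script?
     - Are there reference images or style references?
  3. Video mode (always ask when video is involved):
     - Single-image mode: 1 image → 1 video clip (Grok or VEO)
     - First/last-frame mode: 2 images → transition video (VEO only)
  4. Account confirmation:
     - Scan accounts/*/account.json for the current account list
     - Show: ID, name, style, aspect ratio
     - None specified → let the user choose
     - Specified but no match → list the available accounts and ask whether to create a new one
     - Remember the account ID once confirmed
  5. Parameter confirmation:
     - Aspect ratio (9:16 / 16:9), image model (Gemini / MJ), video model (VEO / Grok)
     - With an account, inherit defaults from account.json and only ask whether to override
  → Once all 5 items are confirmed, the agent writes out the full execution plan for the user's final confirmation:
  Example execution plan (adjust to the actual task):
    1. Read the {account} account config + style file (style.md)
    2. Generate the storyboard table from the user's script (N shots)
    3. Storyboard + style → English prompts (imagePrompt + videoPrompt)
    4. pipeline.js init → create manifest.json + output directory
    5. pipeline.js run --phase images → generate images → manual review
    6. pipeline.js run --phase upload,videos → upload + generate videos
    7. pipeline.js run --phase tts,assemble → TTS + final assembly
  User confirms "开始" (start) → go to Step 0
  User requests changes → adjust the plan and output it again
  → Never enter Step 0 before the user has confirmed the execution plan
Step 0: Pre-flight check (account + style validation)
  - Read accounts/{account}/account.json and check whether the styles field has a style file configured
  - If the account does not exist or has no style:
    → Pause the flow and create it via the CLI: `pipeline.js create-account --id <id> --name <name> --references ./ref.png`
    → Then edit `styles/*.md` to flesh out the prompt strategy
  - Validate account completeness: `pipeline.js validate-account --account <id>`
  - If a style exists, continue to Step 1
Step 1: Storyboard planning (pure narrative, style file not read)
  - Input: the user's script
  - Analyze the meaning and rhythm of the script and split it into N shots
  - For each shot plan: shot type, camera move, frame content (in Chinese), transition to the next shot
  - Output the storyboard table (see the "Storyboard Planning Rules" section)
  - The storyboard is independent of style; one storyboard can be reused with different styles
Step 2: Prompt generation + manifest initialization (storyboard + style → English prompts → pipeline.js init)
  - Input: storyboard table + style.md + account.json
  - A sub-agent combines each shot's Chinese frame description with the style file and generates:
    · imagePrompt: English frame description (for Gemini/MJ)
    · videoPrompt: English motion description (for Grok/VEO)
    · keyword, keywordColor
  - **The AI must not hand-write manifest.json**; it must be initialized by the script:
    ```bash
    node pipeline.js init --account <id> --mode <single|framePair> \
      --items '[{"text":"文案","imagePrompt":"...","videoPrompt":"...","keyword":"关键词","keywordColor":"#FF6B35"}]'
    ```
  - The script inherits from account.json automatically: imageModel, videoModel, format, references
  - The script creates directories, validates required fields, and sets status=pending
  - The AI supplies only creative content (text, imagePrompt, videoPrompt, keyword) and never touches structural fields
  - Extra requirement for first/last-frame mode: every item must have a `lastFramePrompt`; `imagePrompt` serves as the first frame, so no separate `firstFramePrompt` is needed
  - init returns the manifest path; later commands use that path
Step 3: Image generation → manual review
  Run the images stage. Afterwards review: resolution ≥ 1024, style consistency, composition, no watermark.
  If anything fails, delete it or adjust the prompt and rerun; do not move on to the next step.
Step 4: Upload + video generation (optional; skipped for text-and-image videos)
  Run the upload + videos stages. In first/last-frame mode, check that the transition is coherent.
Step 5: TTS + final assembly
  Run the tts + assemble stages. Check that subtitles are accurate and the BGM does not drown out the voiceover.
````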
> For command syntax see "CLI Reference" below; it is not repeated here.
### CLI Reference
```bash
# Create an account (Step 0, first-time use)
node pipeline.js create-account --id <id> --name <name> \
--desc <description> --video-model veo3-fast --references ./ref1.png,./ref2.png
# Validate account completeness
node pipeline.js validate-account --account <id>
# Initialize the manifest (used in Step 2; the AI supplies creative content only)
node pipeline.js init --account <id> --mode <single|framePair> \
--items '[{"text":"...","imagePrompt":"...","videoPrompt":"...","keyword":"...","keywordColor":"..."}]'
# Items can also be read from a file (better for large batches)
node pipeline.js init --account <id> --mode single --items-file ./items.json
# Validate manifest completeness
node pipeline.js validate --manifest <path>
# Run specific stages
node pipeline.js run --manifest <path> --phase images
node pipeline.js run --manifest <path> --phase upload,videos
# Resume from a checkpoint (skips completed stages and items)
node pipeline.js run --manifest <path> --resume
# Show progress
node pipeline.js status --manifest <path>
```
**Stages**: `images` → `upload` → `videos` → `tts` → `assemble`
**Manifest item status**: `pending` → `generating` → `done` / `failed`. An item with no status field is treated as pending.
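The status rule above can be sketched as a small filter. This is only an illustration of how a resumed run might decide what is left to do, not the actual logic inside `pipeline.js`; the helper name `itemsToProcess` is hypothetical, and re-running items left in `generating` is an assumption.

```javascript
// Hypothetical helper: choose the items a resumed run still has to process.
// An item without a status field counts as "pending", as stated above.
// Assumption: "generating" means the previous run was interrupted, so the
// item is picked up again together with "pending" and "failed" items.
function itemsToProcess(items) {
  return items.filter((item) => (item.status ?? "pending") !== "done");
}
```

Only `done` items are skipped, which keeps the manifest as the sole place where progress is recorded.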
---
## Video Mode Comparison
### Single-Image Mode
```dot
digraph single_image {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
img [label="One image", shape=oval]
prompt [label="videoPrompt"]
grok [label="Grok\n6s video", fillcolor="#fff3e0"]
veo [label="VEO\n6-8s video", fillcolor="#e8f5e9"]
result [label="Video output", shape=oval, fillcolor="#e3f2fd"]
img -> prompt
prompt -> grok
prompt -> veo
grok -> result
veo -> result
}
```
- Each script line yields 1 image + 1 videoPrompt
- Supported by both Grok and VEO
- The prompt describes motion: "slow zoom in on subject"
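### First/Last-Frame Mode
```dot
digraph frame_pair {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
first [label="Start frame"]
last [label="End frame"]
prompt [label="videoPrompt"]
veo [label="VEO\n6-8s transition video", fillcolor="#e8f5e9"]
result [label="Video output", shape=oval, fillcolor="#e3f2fd"]
first -> veo
last -> veo
prompt -> veo
veo -> result
}
```
- Each script line yields **2 images** (firstFrame + lastFrame) + 1 videoPrompt
- **VEO only** (the images array carries two images)
- The start and end frames must be **different states of the same scene**
- The prompt describes the transition: "transition from idle machines to active production"
| Comparison | Single-image mode | First/last-frame mode |
|------|---------|-----------|
| Image count | N | 2N |
| Image generation time | Baseline | ~2x (can run in parallel) |
| Video continuity | Motion only | Scene change (stronger) |
| Available models | Grok + VEO | VEO only |
| Best for | Landscapes, character showcases | State changes, narrative transitions |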
---
## Multi-Stage Execution Strategy
Use the Agent tool to run sub-skills serially or in parallel; **every stage transition must pass a quality gate**.
**Image generation + assembly (serial + manual gate)**
```dot
digraph image_then_assemble {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
agent1 [label="Agent 1\nimage-generator\nGenerate images into output/"]
gate1 [label="Manual gate\nUser picks images\nRejects are deleted", shape=diamond, fillcolor="#fff9c4"]
agent2 [label="Agent 2\ncapcut\nRead selected assets → assemble"]
agent1 -> gate1 -> agent2
}
```
**Voiceover + image generation (parallel + automatic validation)**
```dot
digraph parallel_image_tts {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
agent1 [label="Agent 1\nimage-generator\nImage generation", fillcolor="#e8f5e9"]
agent2 [label="Agent 2\ncapcut\nTTS voiceover", fillcolor="#e8f5e9"]
validate [label="Automatic validation\nResolution>=1024\nAspect ratio matches\nAudio duration matches", shape=diamond, fillcolor="#fff9c4"]
agent3 [label="Agent 3\ncapcut\nAssemble all assets → final video"]
agent1 -> validate
agent2 -> validate
validate -> agent3
}
```
**Image-to-video: single-image mode**
```dot
digraph single_image_video {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
agent1 [label="Agent 1\nimage-generator\nImages + videoPrompt"]
gate1 [label="Manual gate\nUser picks images", shape=diamond, fillcolor="#fff9c4"]
agent2 [label="Agent 2\nGrok / VEO\nSingle-image input, videos generated in parallel"]
agent3 [label="Agent 3\ncapcut\nVideo clips + subtitles → final video"]
agent1 -> gate1 -> agent2 -> agent3
}
```
**Image-to-video: first/last-frame mode**
```dot
digraph frame_pair_video {
rankdir=LR
node [shape=box, style=filled, fillcolor="#f5f5f5", fontsize=11]
agent1 [label="Agent 1\nimage-generator\nPaired image generation\n(firstFrame + lastFrame)\nCan run in parallel"]
gate1 [label="Manual gate\nCheck first/last-frame coherence\nSame scene / similar viewpoint", shape=diamond, fillcolor="#fff9c4"]
agent2 [label="Agent 2\nVEO\nTwo-image input\nimages:[first, last]"]
agent3 [label="Agent 3\ncapcut\nVideo clips + subtitles → final video"]
agent1 -> gate1 -> agent2 -> agent3
}
```
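**Video model selection**
| Model | Duration | Aspect ratio | Single image | First/last frame | Notes | API |
|------|------|------|------|--------|------|-----|
| Grok | 6s | Any | ✅ | ❌ | Fast, stable | yunwu.ai |
| Veo3-fast | ~8s | 16:9, 9:16 | ✅ | ✅ | Upsampling, Chinese prompt enhancement | jimmyai.cn |
| Veo3-fast-frames | ~8s | 16:9, 9:16 | ✅ | ✅ | Multi-frame, highest quality | jimmyai.cn |
Notes on image-to-video:
- **Run in parallel**: submit all tasks first (concurrency 3), then poll the results in parallel
- A single video takes 60-300 seconds to generate
- The script has 3 built-in retries and simplifies the prompt automatically on each one
- **The videoPrompt is generated together with the images, in the image stage**
- VEO only: `enhance_prompt=true` enables Chinese prompt enhancement, `enable_upsample=true` enables upsampling
- Configuration lives in `config.json`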
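### Video Size Consistency
- **One model per batch**: do not mix Grok (720P/6s) and VEO (upsampled/8s)
- The aspect ratio always follows the top-level `format` in the manifest (default `9:16`)
- When an individual item falls back to an alternate model, record `"videoModel"` on it in the manifest so it can be traced
### Fallback on Video Generation Failure
**Fallback chain**: `Grok ↔ VEO → 可灵(Kling)`
**Trigger**: the same item still fails after 5 retries → generate it separately with the alternate model
```bash
# Grok failed → fill in with VEO
node veo-video-generator.js --image <url> --prompt <prompt> -o ./videos
# VEO failed → fill in with Grok
node grok-video-generator.js --image <url> --prompt <prompt> -o ./videos
```
**Rule**: fall back item by item; never block the whole batch. Once the clip is filled in, upload it to OSS, write back `videoUrl`, and continue with `tts → assemble`.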
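A minimal sketch of the per-item fallback decision described above. The chain and the 5-attempt threshold come from this section; the lowercase model keys, the `attempts` counter, and the `triedFallbacks` field are assumptions made for illustration and are not confirmed manifest fields.

```javascript
// Fallback order from the chain above: Grok and VEO back each other up,
// and Kling is the last resort. The model keys are illustrative.
const FALLBACK_CHAIN = {
  grok: ["veo", "kling"],
  veo: ["grok", "kling"],
};
const MAX_ATTEMPTS = 5; // retry threshold stated in the trigger rule

// Returns the model to use next for one item, or null when nothing is left.
// `item.attempts` and `item.triedFallbacks` are hypothetical bookkeeping fields.
function nextVideoModel(item) {
  if (item.attempts < MAX_ATTEMPTS) return item.videoModel; // keep the primary
  const tried = item.triedFallbacks ?? [];
  const candidates = FALLBACK_CHAIN[item.videoModel] ?? [];
  return candidates.find((model) => !tried.includes(model)) ?? null;
}
```

Because the decision takes a single item, one failing clip never blocks the rest of the batch.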
---
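## Directory Layout
Every batch writes its output in the same directory structure. The full specification is in the "Directory Layout" section of [batch-mode.md](../image-generator/references/batch-mode.md).
**Core rules**
```
output/{account}_{YYYYMMDD}_{NNN}/
├── manifest.json # Master manifest (used through the whole flow)
├── prompts.txt # Archive of the original prompts
├── images/ # scene_{NN}_{keyword}.jpeg (end frames get a _last suffix)
├── videos/ # scene_{NN}_{keyword}.mp4 (matches the image name)
└── urls.json # Mapping to public OSS URLs
```
**Naming correspondence**: image `scene_01_觉醒.jpeg` → video `scene_01_觉醒.mp4`; end frame in first/last-frame mode `scene_01_觉醒_last.jpeg`; MJ candidate `scene_01_觉醒_cand1.jpeg`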
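The naming rules above can be sketched as one function. The helper name `sceneFileNames` and its options object are hypothetical; only the file-name patterns are taken from this section.

```javascript
// Hypothetical helper that derives the file names for one scene.
// index: 1-based scene number, keyword: the item's keyword.
function sceneFileNames(index, keyword, { framePair = false } = {}) {
  const base = `scene_${String(index).padStart(2, "0")}_${keyword}`;
  const names = { image: `${base}.jpeg`, video: `${base}.mp4` };
  // First/last-frame mode adds an end frame with the _last suffix.
  if (framePair) names.lastFrame = `${base}_last.jpeg`;
  return names;
}
```

Deriving the image and video names from the same base string is what keeps the two directories in step.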
---
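## manifest.json Format
The full field specification is in [manifest-schema.md](references/manifest-schema.md) (field priority P0/P1/P2, readers and writers, data flow).
**Core rules**
- If the script finds `lastFrameUrl`, it uses first/last-frame mode (passes images:[url, lastFrameUrl]); otherwise it uses single-image mode (passes images:[url])
- The top-level `format` is passed to VEO/Grok automatically as the aspect ratio
- The `account` field tells capcut_assemble which account.json to read for subtitle style settings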
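The mode-detection rule above amounts to a one-line branch. The sketch below assumes each item carries `url` and, optionally, `lastFrameUrl`, as the rule implies; the function name is hypothetical and the real generator scripts may structure this differently.

```javascript
// Hypothetical helper: build the images array sent to the video model.
// Presence of lastFrameUrl switches the item to first/last-frame mode.
function buildImagesPayload(item) {
  return item.lastFrameUrl ? [item.url, item.lastFrameUrl] : [item.url];
}
```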
---
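## Storyboard Planning Rules
**The storyboard is the Agent's pure narrative thinking and has nothing to do with visual style.** Finish the storyboard after receiving the script and before reading the style file.
In short videos the visual rhythm is decoupled from the script rhythm: the TTS voiceover flows continuously while the visuals cut underneath it. The storyboard plans **visual beats**, not sentence breaks.
### Core Principles
1. **Cut shots on visual beats**: each shot = one 6-8 second video clip. Cut by how much information a frame can carry, not by sentence boundaries
2. **Hook in the first 3 seconds**: shot 1 must have strong visual impact; it decides the completion rate
3. **Alternate shot types quickly**: adjacent shots must differ in shot type (wide → close-up, close-up → medium); never repeat the same shot type back to back
4. **Camera serves emotion**: every cameraMove maps to an emotional beat in the script; no motion without a purpose
5. **Match durations**: first compute the total duration (shot count × 6-8s), then align it with the voiceover duration
### Duration Planning
Do the arithmetic before storyboarding:
- Target short-video duration: 20-60 seconds
- Duration per shot: 6-8 seconds (set by the video model)
- Shot count = target duration ÷ 6-8 s, rounded; typically 4-8 shots
- Voiceover length ≈ shot count × 12-15 characters (at normal speaking speed)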
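The arithmetic above can be sketched as follows. The helper name `planShots` and the default of 7 seconds per shot (the midpoint of 6-8 s) are assumptions for illustration; the 12-15 characters per shot come from the list above.

```javascript
// Hypothetical planner: estimate shot count and voiceover length.
// secondsPerShot defaults to 7, the midpoint of the 6-8 s range (an assumption).
function planShots(targetSeconds, secondsPerShot = 7) {
  const shots = Math.max(1, Math.round(targetSeconds / secondsPerShot));
  return {
    shots,
    minChars: shots * 12, // lower bound on voiceover characters
    maxChars: shots * 15, // upper bound on voiceover characters
  };
}
```

For a 42-second target this gives 6 shots and a voiceover of roughly 72-90 characters.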
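### Storyboard Table Fields
| Field | Type | Description |
|------|------|------|
| `text` | string | The voiceover text this shot covers (may be less than a sentence or span sentences) |
| `shotType` | enum | `wide` / `medium` / `close-up` / `extreme-close-up` |
| `cameraMove` | enum | `static` / `zoom-in` / `zoom-out` / `pan-left` / `pan-right` / `dolly-in` / `tracking` |
| `visualDesc` | string | Frame description (in Chinese) covering only three things: **what the subject is, its state or action, and where the visual focus lies**. Mood and composition are left to the style layer |
| `hook` | boolean | true only for shot 1; marks the opening hook |
### Shot-Type Rhythm
```
shot 1 (hook): close-up or extreme-close-up (strong subject, grabs the eye)
shot 2: wide or medium (opens up the scene, gives context)
shot 3-N (alternating): close-up → wide → close-up → ...
last shot: medium or wide (wrap up, do not over-design)
```
Do not end on an extreme-close-up (too tight), and do not overuse tracking (low information density).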
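A rough sketch of how the rhythm rules above could be checked on a storyboard before prompts are generated. The function name `checkShotRhythm` is hypothetical, and it covers only the three rules that can be tested mechanically (opening shot, adjacent contrast, closing shot).

```javascript
// Hypothetical linter for the shot-type rhythm rules above.
// shots: array of objects with a shotType field, in playback order.
function checkShotRhythm(shots) {
  if (shots.length === 0) return ["storyboard has no shots"];
  const problems = [];
  const first = shots[0].shotType;
  if (first !== "close-up" && first !== "extreme-close-up") {
    problems.push("shot 1 should be a close-up or extreme-close-up");
  }
  for (let i = 1; i < shots.length; i++) {
    if (shots[i].shotType === shots[i - 1].shotType) {
      problems.push(`shots ${i} and ${i + 1} use the same shot type`);
    }
  }
  const last = shots[shots.length - 1].shotType;
  if (shots.length > 1 && last !== "medium" && last !== "wide") {
    problems.push("the last shot should be medium or wide");
  }
  return problems;
}
```

An empty result means the sequence passes; each string names one rule that was broken.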
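### Choosing a Camera Move
| cameraMove | Emotion | Typical scene |
|------------|------|---------|
| `static` | Stable, solemn | Architecture, still life, ceremony |
| `zoom-in` | Focus, pressure | Suspense, reveals, emphasis on detail |
| `zoom-out` | Revelation, awe | Pulling out from a detail to the full view, revealing the truth |
| `pan-left/right` | Looking around, flow | Showing a space, displaying objects |
| `dolly-in` | Immersion, tension | Faces, key objects |
| `tracking` | Following, energy | Motion scenes, walking (use sparingly; AI-generated tracking shots are unstable in quality) |
The default transition in short videos is a hard cut, so no separate field is needed. Special transitions (fade/dissolve) are noted in `visualDesc` only when the Agent judges that an emotional shift calls for one.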
---
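## Prompt Generation Rules
**Prompts are generated by a sub-agent**: the main Agent hands the storyboard table + style file (style.md) to a sub-agent, which turns the Chinese frame descriptions into English imagePrompt / videoPrompt. The main Agent reviews prompt quality and sends back anything that falls short.
**Precondition**: the account must have a style file. No style → remind the user to create one; do not skip this.
### Single-Image Mode Prompts
Each script line yields:
- `imagePrompt`: frame description (English, for Gemini/MJ)
- `videoPrompt`: motion description (English, for Grok/VEO)
videoPrompt rules:
- Describe **motion**, not content ("zoom in" rather than "a cat")
- Stay consistent with the frame content in imagePrompt
- Keep it short (1-2 sentences, at most 50 words)
- **Convergence principle**: build on what is already in the image and describe only camera motion and subtle movement
- **Forbidden**: large environment switches, scene changes, jumps in a character's position
- **Recommended pattern**: camera motion (slow zoom/pan/dolly) + subtle constellation/light movement + a preserved sense of stillness
- **Aspect-ratio inheritance**: the top-level `format` field in manifest.json (for example `"9:16"`) is passed to VEO automatically, so the `-a` command-line flag is not needed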
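The length rule above (at most 50 words) is easy to check mechanically before a prompt reaches the video model. The sketch below is illustrative only; the function names are hypothetical and whitespace-based word counting is an assumption about how "words" are measured.

```javascript
// Count whitespace-separated words in an English prompt.
function countWords(prompt) {
  return prompt.trim().split(/\s+/).filter(Boolean).length;
}

// Hypothetical check for the videoPrompt length rule (default limit: 50 words).
function checkVideoPrompt(prompt, maxWords = 50) {
  const words = countWords(prompt);
  return { words, ok: words > 0 && words <= maxWords };
}
```

A style file that sets a tighter limit can pass its own `maxWords` value.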
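### First/Last-Frame Mode Prompts
Each script line yields:
- `imagePrompt`: start-frame image (English; the same field as in single mode)
- `lastFramePrompt`: end-frame image (English)
- `videoPrompt`: transition description (English, for VEO)
**Design principles for first/last-frame prompts**:
| Principle | Description | Example |
|------|------|------|
| Same scene | The two frames show different states of the same place or subject | Both are the factory, not two different places |
| Consistent viewpoint | Same camera angle, height, and distance | Both are wide shots |
| State contrast | imagePrompt is "still / before"; lastFramePrompt is "moving / after" | Empty workshop → production line running |
| Natural transition | videoPrompt describes the change from first to last | "machines start up rhythmically" |
| Coherent lighting | Light direction stays the same; gradual change is fine | Cold light → warm light is fine; flipping the light source is not |
**videoPrompt rules** (first/last frame):
- Describe the **transition process**, not the state of a single frame
- Use the form "from X to Y" or "X begins, Y happens"
- Echo elements from both imagePrompt (start frame) and lastFramePrompt (end frame)
- Keep it short (1-2 sentences, at most 50 words)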
---
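## Quality Gates (Across Stages)
In a multi-stage task, validate after every stage before entering the next:
### Image Generation → Assembly Gate
| Check | Standard | On failure |
|--------|------|-----------|
| Image resolution | Short side >= 1024px | Regenerate |
| Aspect ratio | Matches the target video (9:16/16:9) | Regenerate |
| Image content | No watermark, no text, clear subject | Delete and pick a replacement manually |
| Style consistency | Uniform style within the batch | Replace images that deviate strongly |
| Count | At least 3 images (fewer than 3 cannot be assembled) | Generate more |
**Extra checks for first/last frames**:
| Check | Standard | On failure |
|--------|------|-----------|
| Scene consistency | Both frames show the same scene | Regenerate lastFrame |
| Viewpoint match | Composition, angle, and distance agree | Regenerate the mismatched frame |
| Plausible state transition | The end frame is a natural continuation of the start frame | Adjust the prompt and regenerate |
**Automatic validation script** (inserted between Agents):
```bash
node .claude/skills/video-from-script/scripts/validate_assets.js \
--dir ./output/batch_xxx \
--min-resolution 1024 \
--expected-ratio 9:16
```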
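The resolution and aspect-ratio rows of the gate can be expressed as a pure check on image dimensions. This is a sketch of the idea only, not the code of `validate_assets.js`; the function name, the option names, and the 0.02 ratio tolerance are assumptions.

```javascript
// Hypothetical gate check on one image's pixel dimensions.
// ratio is "W:H" as in the manifest format field; tolerance is an assumption.
function imagePassesGate(width, height, options = {}) {
  const { minShortSide = 1024, ratio = "9:16", tolerance = 0.02 } = options;
  const [ratioW, ratioH] = ratio.split(":").map(Number);
  const shortSideOk = Math.min(width, height) >= minShortSide;
  const ratioOk = Math.abs(width / height - ratioW / ratioH) <= tolerance;
  return shortSideOk && ratioOk;
}
```

The content and style rows still need a human or a vision model; only the first two rows reduce to arithmetic.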
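### Voiceover → Assembly Gate
| Check | Standard | On failure |
|--------|------|-----------|
| Audio duration | Close to the total footage duration (±20%) | Adjust the speaking rate or the footage duration |
| Audio quality | No silent gaps, no clipping | Regenerate |
| Audio count | Matches the number of assets | Add or trim |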
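The ±20% duration rule in the table can be sketched as below. Measuring the tolerance relative to the footage duration is an assumption, since the table does not say which side is the reference; the function name is hypothetical.

```javascript
// Hypothetical check: is the voiceover within ±20% of the footage duration?
// The tolerance is applied relative to footageSec (an assumption).
function audioDurationOk(audioSec, footageSec, tolerance = 0.2) {
  return Math.abs(audioSec - footageSec) <= footageSec * tolerance;
}
```

With six 8-second clips (48 s of footage), a 40-second voiceover passes and a 30-second one does not.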
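### AI Video → Assembly Gate
| Check | Standard | On failure |
|--------|------|-----------|
| Video duration | 6-8 seconds per clip | Normal; the model output length is fixed |
| Video quality | No obvious artifacts, no black frames | Regenerate |
| Transition coherence (first/last frame) | The video moves smoothly from the first frame to the last | Refine the prompt and retry |
| Video count | Matches the number of assets | Regenerate the failed videos |
### Final Output Gate
| Check | Standard |
|--------|------|
| Subtitle accuracy | One-to-one with the original script |
| Keyword highlight | Eye-catching color, correct position |
| Image animation | Ken Burns effect is smooth, no stutter |
| BGM volume | Does not drown out the voiceover (voiceover leads) |
| Transitions | No black frames, no dropped frames |
**If any gate fails, fix the problem before entering the next stage; gates cannot be skipped.**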
---
## Shared Resources
All sub-skills share the following resources (located in this directory):
- `scripts/`: shared scripts (gemini-image-generator.js, mj-image-generator.js, grok-video-generator.js, veo-video-generator.js, capcut_assemble.js, sync-to-jianying.js, oss-upload.js)
- `accounts/`: account configuration (see [account-system.md](references/account-system.md))
- `references/account-system.md`: description of the account system
All configuration lives in `skills/config.json` (API keys, paths)
---
## Sub-Skills
| Skill | Trigger words | Responsibility |
|------|--------|------|
| `image-generator` | 生图, 批量出图, MJ, Gemini | Image generation (two models, single image / frame pair) |
| `capcut` | 成片, 组装, 剪映, 图片轮播 | CapCut final assembly |

@@ -0,0 +1,20 @@
{
"id": "",
"name": "",
"description": "",
"defaultFormat": "9:16",
"imageModel": "gemini",
"videoModel": "",
"batchSize": 30,
"capcut": {
"effects": [],
"filter": "",
"subtitleStyle": {
"fontSize": 36,
"color": "#FFFFFF",
"highlightColor": "#FF6B35",
"bold": true
},
"defaultBGM": ""
}
}

@@ -0,0 +1,40 @@
{
"id": "forbidden-emperor",
"name": "禁忌帝王学",
"description": "禁书档案×东方密室美学×历史权谋。被删除的权力技术,历史课不教的真相。暗调古籍+烛火+朱砂,昭和大正禁书档案风格。",
"pipeline": "image-video",
"defaultFormat": "9:16",
"imageModel": "gemini",
"videoModel": "veo",
"batchSize": 10,
"styles": {
"oriental-mythology-ue5": {
"references": [
{
"file": "下载 (3).jpg",
"url": "https://i.ibb.co/GQtg388Z/6fcf1869c871.jpg"
}
]
}
},
"capcut": {
"effects": [],
"filter": "电影感:30",
"subtitleStyle": {
"font": "SourceHanSerifCN_Regular",
"fontSize": 18,
"color": "#FFFFFF",
"bold": false,
"inAnimation": "向右滑动",
"inAnimationDuration": 1000000,
"outAnimation": "向左滑动",
"outAnimationDuration": 1000000,
"alpha": 0.9,
"transformY": 350,
"hasShadow": true,
"shadowColor": "#000000",
"shadowAlpha": 0.5
},
"defaultBGM": ""
}
}

Binary file not shown (image, 331 KiB).

@@ -0,0 +1,193 @@
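# Eastern Mythology Epic: Liminal Dreamcore Edition
> Liminal Space × 80s Film Grain × VHS Aesthetic × Traditional Eastern Mythology
> Enclosed spaces + stillness + nostalgia + surrealism + the twelve zodiac signs
---
## 1. Core Principles
| Dimension | Rule |
|------|------|
| Space | Indoor or outdoor liminal spaces, with emphasis on emptiness and an abandoned silence |
| Atmosphere | empty, still, silent, frozen moment: empty but not grand, an "abandoned silence" |
| Aesthetic | 80s film grain, VHS aesthetic, analog photography, color bleeding, scan lines |
| Mood | nostalgic, uncanny, surreal, dreamlike |
| Figures | Use pure scale/position words such as `a small silhouette in the distance` / `a tiny presence at far end` to convey smallness; never describe appearance, clothing, or color |
| Lighting | dim, tungsten, single light beam, fluorescent: a fading afterglow, not radiant brilliance |
| Color | faded, warm analog tones, desaturated |
### ⚠ --sref Reference-Image Rule (Mandatory)
When --sref is used, **the prompt must not contain any word that describes a person**: figure, person, character, hanfu, and so on. MJ inherits the figure and the style from the reference image automatically.
| Mode | Prompt content | Result |
|------|-----------|------|
| With --sref | Pure scene description (architecture + lighting + atmosphere), **zero person words** | ✅ Passes; the reference image injects the figure automatically |
| With --sref | Scene + person words such as "lone figure" / "hanfu" | ❌ Triggers deepfake moderation |
| Without --sref | Scene + "a lone figure in [color] hanfu" | ✅ Passes; MJ generates the figure itself |
Parameters: `--sref [URL] --sw 50`, at most 1 reference image. Without --sref, do not add --sw.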
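The --sref rule above lends itself to a simple pre-submission check. The sketch below uses only the four example words named in the rule; the real list is open-ended, so the word list and the function name should be read as illustrative assumptions.

```javascript
// Example person words named in the rule above; the full list is not specified.
const PERSON_WORDS = ["figure", "person", "character", "hanfu"];

// Hypothetical check: returns the person words found in a prompt that uses --sref.
// Prompts without --sref are not restricted, so they return an empty list.
function srefPromptViolations(prompt) {
  if (!/--sref\b/.test(prompt)) return [];
  const lower = prompt.toLowerCase();
  return PERSON_WORDS.filter((word) => new RegExp(`\\b${word}\\b`).test(lower));
}
```

A non-empty result means the prompt should be rewritten as a pure scene description before it is sent to MJ.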
---
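## 2. Prompt Structure
**Mode A: with an --sref reference image (recommended)**
```
[space type], [architecture + surreal element]. [lighting + atmosphere details]. [zodiac element].
[color tone] palette, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
```
⚠ Zero person words. The reference image supplies the figure and the style automatically.
**Mode B: without --sref**
```
[space type], [architecture + surreal element]. A lone figure in [color] hanfu [pose], still.
[lighting + atmosphere details]. [zodiac element].
[color tone] palette, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal, deep depth of field, everything sharp --ar 9:16 --style raw --s 750
```
---
## 3. Twelve Zodiac Prompts
> The full imagePrompt + videoPrompt set is in section 6 below.
---
## 4. MJ Parameter Conventions
| Parameter | Value | Notes |
|------|--------|------|
| --style | raw | Required; preserves the film-grain texture |
| --s | 750 | stylize value |
| --sw | ≤50 | ⚠ Only together with --sref |
| --sref | At most 1 image | More than one tends to trigger moderation |
| --ar | 9:16 | Portrait |
---
## 5. Video Prompt Rules
**Convergence principle**:
- Build on what is already in the image; describe only subtle camera motion + slow changes in light or constellation
- No large environment switches, scene changes, or jumps in a figure's position
- At most 40 words per prompt
**videoPrompt template**:
```
[camera move] of [space]. [subtle light/particle motion]. [zodiac] constellation [slowly appears/brightens]. [subtle ambient motion]. Cinematic, photorealistic, 4K.
```
**Camera-move vocabulary** (pick one):
| Move | English | Best for |
|------|------|------|
| Static | Static shot | Default; corridors, halls |
| Push forward | Slow dolly forward | Corridors, tunnels |
| Pan | Slow pan | Symmetrical spaces |
| Gentle push | Slow push forward | Enclosed spaces |
| Drift | Slow gentle drift | Underwater, floating |
---
## 6. Twelve Zodiac imagePrompt + videoPrompt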
### 6.1 Aries: Empty Fire Corridor
```text
imagePrompt: Empty fire temple corridor stretching into shadow, ancient dragon carvings on stone walls, dying embers in braziers casting dim orange glow. Stone pillars receding endlessly, ember particles frozen in warm air. A small silhouette in the distance. Aries constellation faintly glowing on distant ceiling. Warm amber faded palette, tungsten dim glow, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal still atmosphere, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow dolly forward through fire temple corridor. Dying embers in braziers pulse dim orange. Faint amber glow slowly appears on distant ceiling. Ember particles drift in warm air. Cinematic, photorealistic, 4K.
```
### 6.2 Taurus: Abandoned Greenhouse
```text
imagePrompt: Abandoned celestial greenhouse, overgrown ancient trees with golden leaves breaking through cracked jade ceiling. Dust floating in single shaft of warm light. Crystal waterways dried up, bioluminescent flowers still glowing faintly in dim corners. A small silhouette among overgrown roots. Taurus constellation in faded emerald on dusty glass. Warm gold faded green palette, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal overgrown stillness, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot of abandoned celestial greenhouse. Single shaft of warm light shifts gently. Bioluminescent flowers pulse faintly in dim corners. Faded emerald glow appears on dusty glass ceiling. Dust floats in light beam. Cinematic, photorealistic, 4K.
```
### 6.3 Gemini: Mirror Gallery
```text
imagePrompt: Infinite mirror corridor reflecting twin versions of a celestial observatory interior. Floating constellation maps drifting between mirror walls, starlight reflected infinitely into darkness. Two small silhouettes on opposite ends. Gemini constellation in dual silver-gold on ceiling. Silver gold faded palette, fluorescent light hum, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal mirror infinity, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow pan across infinite mirror corridor, between twin reflections. Floating star maps drift imperceptibly. Silver-gold light shifts on ceiling. Starlight reflects softly into mirror depth. Cinematic, photorealistic, 4K.
```
### 6.4 Cancer: Empty Moon Pool
```text
imagePrompt: Empty moonlit pool room, vast jade chamber with perfectly still water reflecting crescent moon through glass ceiling. Moon jellyfish floating motionless above water surface. Ancient lotus carved into jade walls glowing faintly. A small silhouette kneeling at pool edge. Cancer constellation in silver-blue on water surface. Silver-blue moonlit palette, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal still water, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot of empty moonlit pool room. Moonlight slowly shifts across still water. Moon jellyfish drift almost imperceptibly above surface. Silver-blue light ripples faintly on water. Lotus carvings glow faintly on jade walls. Cinematic, photorealistic, 4K.
```
### 6.5 Leo: Empty Golden Hall
```text
imagePrompt: Empty golden throne room, vast silent hall with jade pillars receding into haze. Single beam of golden light from ceiling crack illuminating dust particles drifting. Dragon carvings on walls fading into shadow. A small silhouette in the distance before the empty throne. Leo constellation blazing gold on distant ceiling mural. Warm gold shadow palette, tungsten dim lighting, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal empty grandeur, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow dolly forward into empty golden throne room. Single beam of golden light illuminates drifting dust particles. Golden glow slowly brightens on distant ceiling mural. Jade pillars recede into haze. Cinematic, photorealistic, 4K.
```
### 6.6 Virgo: Crystal Sanctuary
```text
imagePrompt: Enclosed crystal sanctuary interior, translucent crystal tree trunks growing through jade floor and ceiling. Bioluminescent leaves casting pale jade glow in dim space. Crystal pathways empty, ancient stone lanterns unlit but glowing faintly within. A small silhouette at the tree base. Virgo constellation in green-gold on crystal canopy. Jade green pale gold palette, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal bioluminescent stillness, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot of crystal sanctuary interior. Bioluminescent leaves pulse pale jade rhythmically. Crystal tree trunks shimmer faintly. Green-gold light slowly appears on crystal canopy above. Stone lanterns glow softly within. Cinematic, photorealistic, 4K.
```
### 6.7 Libra: Twilight Hall
```text
imagePrompt: Twilight balance temple interior, twin massive jade doors on opposite walls, ancient balance scale mechanism filling the center. Floating ritual instruments motionless in air. Fading light from twin windows casting long symmetrical shadows. A small silhouette at the fulcrum. Libra constellation in perfect symmetry on ceiling. Twilight gold indigo palette, fading dual light, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal symmetry, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot of twilight balance temple. Twin windows cast fading light that slowly dims. Floating ritual instruments drift imperceptibly. Symmetrical twilight glow holds balance on ceiling. Long symmetrical shadows stretch slowly. Cinematic, photorealistic, 4K.
```
### 6.8 Scorpio: Underground Lava Corridor
```text
imagePrompt: Underground volcanic corridor, ancient stone tunnel with glowing mineral veins in walls. Dim ruby light pulsing slowly from cracks in floor. Guardian reliefs carved into stone walls emanating faint glow. A small silhouette in the corridor center. Scorpio constellation in garnet on tunnel ceiling. Deep ruby dark palette, geological dim glow, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal underground stillness, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow push forward into underground volcanic corridor. Mineral veins in walls pulse ruby glow slowly. Garnet light brightens on tunnel ceiling. Guardian reliefs emanate faint rhythmic glow. Dim warm light from floor cracks. Cinematic, photorealistic, 4K.
```
### 6.9 Sagittarius: Abandoned Star Pavilion
```text
imagePrompt: Abandoned celestial observatory interior, massive dome ceiling open to void sky. Ancient star-maps on walls faded and peeling, stone instruments covered in dust. Starlight streaming through single dome crack. A small silhouette at the observatory center. Sagittarius constellation in silver on dome ceiling. Midnight blue silver faded palette, single starlight beam, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal abandoned cosmos, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot inside abandoned celestial observatory. Starlight beam through dome crack shifts slowly. Silver starlight fades in on dome ceiling. Ancient star-maps peel slightly on walls. Dust drifts in single light beam. Cinematic, photorealistic, 4K.
```
### 6.10 Capricorn: Frozen Corridor
```text
imagePrompt: Frozen ice palace corridor, frost crystals covering jade walls and ceiling. Dim blue light filtering through ice-encrusted windows. Black iron chains frozen mid-swing, frozen waterfalls visible through ice walls. A small silhouette at the corridor end. Capricorn constellation in aurora blue on ice ceiling. Deep blue ice silver palette, cold dim light, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal frozen stillness, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow dolly forward into frozen ice palace corridor. Frost crystals glint on jade walls. Aurora blue glow slowly brightens on ice ceiling. Dim blue light filters gently through ice-encrusted windows. Frozen chains sway almost imperceptibly. Cinematic, photorealistic, 4K.
```
### 6.11 Aquarius: Empty Cloud Bathhouse
```text
imagePrompt: Empty cloud palace bathhouse, vast jade pool with perfectly still water reflecting nothing. Steam rising from warm water, bioluminescent lotus floating on surface. Empty jade archways receding into mist. A small silhouette at the pool edge. Aquarius constellation in soft cyan on water surface. Silver-blue warm mist palette, dim steamy light, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal empty bathhouse, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Static shot of empty cloud palace bathhouse. Steam rises slowly from still warm water. Bioluminescent lotus pulses soft cyan. Soft cyan glow appears on water surface. Jade archways recede into gentle mist. Cinematic, photorealistic, 4K.
```
### 6.12 Pisces: Sunken Water Palace
```text
imagePrompt: Submerged crystal throne room, jade architecture visible through murky bioluminescent water. Coral growing on throne, pearls scattered on floor, jellyfish floating motionless. Dragon statue coiled around dome visible through crystal ceiling, ocean depths beyond. A small silhouette floating before the throne. Pisces constellation glowing aqua through water. Jade green aqua murky palette, bioluminescent dim glow, 80s film grain, VHS aesthetic, analog photography, liminal space, surreal underwater stillness, deep depth of field, everything sharp --ar 9:16 --style raw --s 750 --sref [URL] --sw 50
videoPrompt: Slow gentle drift through submerged crystal throne room. Coral on throne sways softly in current. Aqua light glows through water above. Jellyfish float almost motionless. Bioluminescent particles drift through murky jade-green water. Cinematic, photorealistic, 4K.
```
---
*— End of Framework —*

@@ -0,0 +1,36 @@
{
"id": "military",
"name": "军事账号",
"description": "军事主题短视频账号,暗黑漫画风格,深紫焦橙双色调",
"pipeline": "image-video",
"defaultFormat": "9:16",
"imageModel": "gemini",
"videoModel": "veo3-fast",
"batchSize": 30,
"styles": {
"dark-noir-military": {
"references": [
{ "file": "grunge_br.png", "url": "https://i.ibb.co/SwfD7YM6/e3caf4ad6e8a.png" }
]
}
},
"capcut": {
"effects": ["录制边框 III"],
"filter": "电影感:40",
"subtitleStyle": {
"font": "思源黑体 Heavy",
"fontSize": 24,
"color": "#FFFFFF",
"highlightColor": "#FF6B35",
"bold": true,
"hasShadow": true,
"shadowColor": "#000000",
"shadowAlpha": 0.8,
"transformY": -380,
"alignment": 1,
"inAnimation": "淡入",
"outAnimation": "淡出"
},
"defaultBGM": ""
}
}

12 binary files not shown (images; sizes: 969 KiB, 1.3 MiB, 1.1 MiB, 1.1 MiB, 1.5 MiB, 1.5 MiB, 1.4 MiB, 1.3 MiB, 1.2 MiB, 1.4 MiB, 1.2 MiB, 1.2 MiB).

@@ -0,0 +1,169 @@
# dark-noir-military
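Dark-comic military style: deep purple and burnt orange duotone, pure black background, halftone dot texture, distressed screen-print feel, strong dramatic side lighting. Built on a storyboard-director method for talking-head scripts, it injects Implied Motion so that still images naturally carry a sense of movement.
Reference images:
- `references/ref_001_grunge_portrait.png`: purple oil-paint scratch variant
- `references/ref_002_halftone_popart.png`: pop-art halftone authority-figure variant
- `references/ref_003_warrior_manga.png`: Japanese-manga warrior variant
---
## Image Prompts
### Core Visual Elements
- The subject is a person (military / power / suspense character) with a resolute, cold, or repressed expression
- Halftone dots run across the whole frame
- Distressed screen-print texture (gritty risograph print)
- Strong dramatic side lighting (high contrast chiaroscuro) with large shadow silhouettes
- Strict three-color system: deep purple `#4B0082` + burnt orange `#E07B00` + pure black `#0A0A0A`
### Implied Motion
> In the image prompt, describe an action in progress or a moment within a trend, so that the image implies a direction of movement and the later video generation has something to pick up.
**Character action trends:**
`slowly turning head` / `eyes narrowing` / `jaw tightening` / `lips parting slightly` /
`fingers tightening on glass` / `shoulders slowly rising` / `coat swept by unseen wind` /
`exhaling deeply` / `leaning forward imperceptibly` / `gaze drifting downward`
**Scene change trends:**
`smoke curling upward` / `shadows lengthening` / `light fading at edges` /
`rain beginning to blur the glass` / `dust slowly settling` / `candle flame flickering` /
`fog creeping in from the distance` / `city lights blurring into streaks`
**Emotional tension trends:**
`tension building in stillness` / `a breath held before breaking` /
`the moment before collapse` / `silence stretching thin` /
`the last second of control` / `something about to shatter`
### Composition Patterns
- Portrait 9:16, mostly close-ups or half-body shots
- Extreme facial close-up, eye-level or slightly low angle
- Emphasize light-dark contrast; the figure emerges from shadow
- Large black background, subject centered slightly above the middle of the frame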
### 图片 Prompt 模板
**模板A — Gemini / Nanobanana中英混排最佳响应**
```
[中文主体描述][中文隐性动势][英文动势强化词][中文环境/光线],暗黑漫画风格,深紫色与焦橙色双色调,纯黑背景,半调网点纹理,做旧丝网印刷质感,强戏剧性侧光,大面积阴影剪影,都市悬疑电影构图,无文字,无水印,竖版构图 9:16dark noir illustration, deep purple and burnt orange duotone, halftone dot grain, gritty risograph print, high contrast chiaroscuro, bold black shadows, urban thriller aesthetic, editorial graphic novel style, no text, no watermark
```
**模板B — MidJourney纯英文MJ参数**
```
[情绪词] [主体描述], [隐性动势], [环境], [光线], [构图], dark noir comic style, limited color palette of deep purple and burnt orange on black background, halftone dot texture, gritty screen print effect, high contrast dramatic lighting, bold graphic shadows, cinematic close-up composition, editorial illustration, urban thriller aesthetic, no text, no watermark --ar 3:4 --style raw --q 2 --v 6.1
```
**结构层级对照:**
| 层级 | 中文写法Nanobanana | 英文写法MJ |
|---|---|---|
| 主体 | `中年西装男子` | `middle-aged suited man` |
| **隐性动势** | `缓缓抬起头,眼神从低沉转向冷峻` | `head slowly lifting, eyes shifting from hollow to cold` |
| 环境 | `背后的人影正在消散` | `silhouettes dissolving behind him` |
| 光线 | `单侧冷白边缘光从上方打入` | `cold rim light from above` |
| 构图 | `极端面部特写,低角度仰视` | `extreme close-up, low angle` |
### Negative Prompt (append to every prompt)
```
【负向】彩色背景、蓝色调、绿色调、写实照片风格、卡通可爱风、人物姓名、画面文字、logo、水印、过度曝光、模糊
```
### Examples
**Nanobanana example:**
```
中年西装男子半身特写，头缓缓低垂，目光从正视转为向下沉落，下颌微微收紧，背景中模糊的人影正在向黑暗消散，压抑氛围弥漫，冷白边缘光从头顶单侧打入，head slowly bowing, gaze sinking downward, jaw tightening, silhouettes dissolving into darkness behind him, cold rim light from above, extreme close-up, 暗黑漫画风格，深紫色与焦橙色双色调，纯黑背景，半调网点纹理，做旧丝网印刷质感，强戏剧性侧光，大面积阴影剪影，都市悬疑电影构图，无文字，无水印，竖版构图 9:16，dark noir illustration, deep purple and burnt orange duotone, halftone dot grain, gritty risograph print, high contrast chiaroscuro, editorial graphic novel style, no text, no watermark
```
**MJ example:**
```
calculating middle-aged man in a dark suit, head slowly bowing, gaze sinking downward, jaw tightening imperceptibly, blurred silhouettes of figures dissolving in the background, oppressive atmosphere closing in, single cold rim light from above, extreme close-up composition, dark noir comic style, limited color palette of deep purple and burnt orange on black background, halftone dot texture, gritty screen print effect, high contrast dramatic lighting, bold graphic shadows, cinematic close-up composition, editorial illustration, urban thriller aesthetic, no text, no watermark --ar 3:4 --style raw --q 2 --v 6.1
```
### MJ/Gemini Parameters
- MJ: `--ar 3:4 --style raw --q 2 --v 6.1`
- Gemini: no extra parameters; the aspect ratio is controlled by `竖版构图 9:16` in the prompt
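### Image Prohibitions
- Colored backgrounds, blue tones, green tones
- Photorealistic style, cute cartoon style
- Real names, on-image text, logos, watermarks
- Overexposure, overall blur
- A dominant palette that departs from the purple, orange, and black scheme
- Static descriptions (standing/sitting/looking) used alone; always add a motion word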
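The last rule can be pre-checked automatically before a prompt is sent. The following is only a minimal sketch: the function name `hasBareStaticPose` and the `MOTION_HINTS` list are illustrative assumptions sampled from the trend lists above, not part of the project.

```javascript
// Static pose verbs named in the rule above.
const STATIC_VERBS = ['standing', 'sitting', 'looking']

// Assumption: a small sample of motion words taken from this style file's trend lists.
const MOTION_HINTS = ['slowly', 'tightening', 'narrowing', 'drifting', 'rising',
  'lifting', 'bowing', 'turning', 'curling', 'fading', 'dissolving']

// Returns true when the prompt contains a static verb and none of the motion hints.
function hasBareStaticPose(prompt) {
  const text = prompt.toLowerCase()
  const hasStatic = STATIC_VERBS.some(v => new RegExp(`\\b${v}\\b`).test(text))
  const hasMotion = MOTION_HINTS.some(h => text.includes(h))
  return hasStatic && !hasMotion
}

console.log(hasBareStaticPose('a man standing in the rain'))
console.log(hasBareStaticPose('a man standing, head slowly turning'))
```

A keyword list like this is a coarse heuristic; a flagged prompt still deserves a human look rather than automatic rejection.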
---
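## Video Prompts
### Core Principles
- **The image is the anchor, the text is the soul**: the video's color tone, composition, and character state must stay consistent with the image
- **Motion inheritance**: the implicit motion in the image prompt must be picked up and amplified in the video
- **Self-contained clips**: each clip can be joined at both ends; the opening continues from the image's state and the ending leaves residual momentum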
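### Camera Movement Types
| Movement | Prompt phrase (English) | Emotional effect | Use cases |
|---|---|---|---|
| Slow push in | slow push in / creeping zoom | Rising pressure, tightening suspense | Inner monologue, approaching threat |
| Slow pull back | slow pull back / creeping zoom out | Isolation, wide overview | Lonely narration, ending reveal |
| Orbit | slow orbit / circular dolly | Sense of power, gives the character depth | Entrance of a powerful figure, confrontation |
| Subtle handheld shake | subtle handheld shake | Realism, unease | Tailing, covert observation |
| Dutch angle | dutch angle tilt | Instability, moral distortion | Conspiracy, betrayal, loss of control |
| Vertical rise | slow vertical rise / crane up | Widening scope, change of viewpoint | Moving from detail to the whole |
| Ultra slow motion | ultra slow motion | Emphasized detail, frozen time | Key actions, the moment before an emotional outburst |
| Static with micro drift | static with micro drift | Silent tension, oppressive calm | Standoff, silence, waiting |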
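### Lighting Motion
| Lighting effect | Prompt phrase (English) | Emotional effect |
|---|---|---|
| Shadows slowly swallow the frame | shadows slowly consuming the frame | Approaching danger, loss of control |
| Orange light seeps in from the edge | warm orange light bleeding from edge | A hint of hope or threat |
| Single light source sways slowly | single light source gently swaying | Instability, fragility |
| Backlit silhouette gradually sharpens | backlit silhouette slowly sharpening | Character reveal, sense of power |
| Flickering ambient light | flickering ambient light | Crisis, a sense of system collapse |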
### Character Micro-Motion
| Motion | Prompt phrase (English) |
|---|---|
| Gaze turns from hollow to cold | eyes shifting from hollow to cold |
| Exhaled breath is visible | breath visible as cold vapor exhaled slowly |
| Corner of the mouth presses down slightly | corner of mouth almost imperceptibly tightening |
| Fingers slowly tighten | fingers slowly tightening around glass |
| Head turn stops halfway | head turn arrested mid-motion |
| Eyelids slowly droop | eyelids slowly lowering with exhaustion |
### Video Prompt Template
```
Opening on [opening state/composition], camera [camera movement] — [subject motion played out]. [environment/lighting dynamics]. [closing mood]. aspect ratio 9:16, cinematic vertical frame, 24fps film grain, duration [Xs], no text overlay, no subtitles
```
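Filling the template by hand invites stray punctuation and a forgotten suffix. A minimal sketch of a helper that assembles the five slots is shown here; the function name `buildVideoPrompt` and its parameter names are hypothetical and do not come from `pipeline.js`.

```javascript
// Hypothetical helper: fills the video prompt template from its five slots.
function buildVideoPrompt({ opening, camera, motion, environment, mood, durationSec }) {
  // Fixed technical suffix copied from the template.
  const suffix = 'aspect ratio 9:16, cinematic vertical frame, 24fps film grain, ' +
    `duration ${durationSec}s, no text overlay, no subtitles`
  // Trim and drop trailing periods so joining never produces "..".
  const clean = s => String(s).trim().replace(/\.+$/, '')
  // \u2014 is the dash the template places between camera movement and subject motion.
  return `Opening on ${clean(opening)}, camera ${clean(camera)} \u2014 ${clean(motion)}. ` +
    `${clean(environment)}. ${clean(mood)}. ${suffix}`
}

console.log(buildVideoPrompt({
  opening: 'a medium close-up of a man in a dark suit',
  camera: 'slow push in',
  motion: 'his jaw tightening.',
  environment: 'shadows slowly consuming the frame',
  mood: 'silence stretching thin',
  durationSec: 4,
}))
```

Keeping the suffix in one place means the aspect ratio and the no-text constraints cannot drift between items.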
### Example
```
Opening on a medium close-up of a calculating middle-aged man in a dark suit, head already bowed, camera beginning an imperceptibly slow creep inward toward his face — a creeping push in that tightens like a closing trap. His jaw tightens by a single degree. His gaze never rises. The blurred silhouettes of figures in the background continue their silent dissolution into darkness, as if the world is emptying itself around him. A cold rim light from above traces the edge of his skull, orange warmth bleeding faintly at the frame's periphery — warmth he has long stopped reaching for. The shot holds in stillness until stillness itself becomes a statement. aspect ratio 9:16, cinematic vertical frame, 24fps film grain, duration 4s, no text overlay, no subtitles
```
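### Narrative Continuity Rules
| Clip relationship | Transition strategy |
|---|---|
| Emotional escalation (A → stronger A) | Closer shot + slower motion + darker lighting |
| Emotional turn (A → B) | Change of camera angle + shift in color temperature |
| Time jump | Reversed direction of motion, or an abrupt change in speed |
| Scene change | The previous clip ends out of focus; the next clip pulls into focus |
| Climax emphasis | Locked-off static shot + ultra slow motion + single-light close-up |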
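### Matching Duration to Motion
| Shot type | Recommended duration | Implicit motion intensity |
|---|---|---|
| Character emotional close-up | 3-4s | Micro-expressions (gaze / breathing / corner of the mouth) |
| Action + emotion | 4-5s | Body-level trends (turning / lifting the head / clenching a fist) |
| Wide scene / environmental storytelling | 5-6s | Environmental motion (smoke / light / wind) |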
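### VEO/Grok Suffix
- VEO: `enhance_prompt=true, enable_upsample=true`
- Grok: no extra parameters
### Video Prohibitions
- Large environment switches, scene changes, or jumps in character position
- Fast cuts or flash cuts
- Any text overlay or subtitles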


@@ -0,0 +1,181 @@
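# Account System Specification
> Each account independently manages its visual style, prompt strategy, and CapCut configuration.
> One account can have several visual styles; each style is a separate style file.
---
## Directory Structure
```
accounts/
├── _template/              # New-account template (copy this directory to create an account)
│   ├── account.json
│   ├── references/         # Reference images
│   │   └── .gitkeep
│   └── styles/             # Style files (one or more)
│       └── .gitkeep
└── {account_id}/           # User-created account
    ├── account.json
    ├── references/         # Reference images (shared by all styles)
    │   ├── ref_001.png
    │   └── ref_002.png
    └── styles/             # Style files (one file = one visual style)
        ├── cyberpunk-character.md
        ├── dark-archive.md
        └── neon-city.md
```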
---
## account.json Fields
```json
{
  "id": "tech-talk",
  "name": "科技解说",
  "description": "科技类短视频账号,深色背景,赛博朋克风格",
  "pipeline": "image-video",
  "imageModel": "gemini",
  "videoModel": "kling",
  "batchSize": 30,
  "capcut": {
    "effects": ["录制边框 III"],
    "filter": "电影感:40",
    "subtitleStyle": {
      "fontSize": 36,
      "color": "#FFFFFF",
      "highlightColor": "#FF6B35",
      "bold": true
    },
    "defaultBGM": "https://example.com/bgm_tech.mp3"
  }
}
```
| Field | Type | Description |
|------|------|------|
| `id` | string | Unique account identifier (must match the directory name) |
| `name` | string | Account display name |
| `description` | string | One-sentence description |
| `pipeline` | enum | `image-only` / `image-video`. Deprecated; keeping it in the file has no effect |
| `defaultFormat` | string | Default aspect ratio: 9:16 / 16:9 / 1:1 / 4:3 |
| `imageModel` | string | Default image model |
| `videoModel` | string | Default video model |
| `batchSize` | number | Default batch size |
| `capcut.effects` | string[] | List of CapCut effect names |
| `capcut.filter` | string | CapCut filter, in the form "name:intensity" |
| `capcut.subtitleStyle` | object | Subtitle style (font size, color, highlight color, bold) |
| `capcut.defaultBGM` | string | Default background music URL |
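The field rules in the table can be checked mechanically. The sketch below is not the project's `validate-account` command; it is a hypothetical `validateAccount` function, and the choice of which fields count as required is an assumption, since the table does not say.

```javascript
// Assumption: these four string fields are treated as required.
const REQUIRED_STRINGS = ['id', 'name', 'imageModel', 'videoModel']

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateAccount(account, dirName) {
  const errors = []
  for (const key of REQUIRED_STRINGS) {
    if (typeof account[key] !== 'string' || account[key] === '') {
      errors.push(`missing or empty field: ${key}`)
    }
  }
  // The id must match the directory name.
  if (typeof account.id === 'string' && account.id !== dirName) {
    errors.push(`id "${account.id}" does not match directory "${dirName}"`)
  }
  // capcut.filter is optional but, when present, must be "name:intensity".
  const filter = account.capcut && account.capcut.filter
  if (filter !== undefined && !/^[^:]+:\d+(\.\d+)?$/.test(filter)) {
    errors.push('capcut.filter must look like "name:intensity"')
  }
  return errors
}

const sample = {
  id: 'tech-talk', name: '科技解说', imageModel: 'gemini', videoModel: 'kling',
  capcut: { filter: '电影感:40' },
}
console.log(validateAccount(sample, 'tech-talk'))
```

Returning a list instead of throwing on the first problem lets a caller report every issue in one pass.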
---
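## Style Files (styles/)
One file per visual style; the file name is the style name. Each file holds the prompt strategy for both images and video.
### Style File Structure
```markdown
# Style name (English, hyphen-separated)
One-sentence description of the style.
---
## Image Prompts
### Core Visual Elements
<!-- Required visual elements -->
### Scene/Background Rules
<!-- Background requirements -->
### Color Schemes
<!-- Available color combinations -->
### Composition
<!-- Supported composition types -->
### Image Prompt Template
<!-- Fixed structure used when generating prompts -->
### Examples
<!-- 2-3 complete examples -->
### MJ/Gemini Parameters
<!-- Model-specific suffix parameters -->
### Image Prohibitions
<!-- Elements that must not appear -->
---
## Video Prompts
### Camera Movement Rules
<!-- Camera movement and pacing -->
### Dynamic Element Requirements
<!-- Light and shadow, character motion, ambient atmosphere -->
### Video Prompt Template
<!-- VEO/Grok prompt structure -->
### Examples
<!-- 2-3 complete examples -->
### VEO/Grok Suffix
<!-- Model-specific suffix -->
### Video Prohibitions
<!-- Elements that must not appear -->
```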
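### Style File Naming
Use descriptive, hyphen-separated English names:
- `cyberpunk-eastern-character.md`: cyberpunk Eastern character
- `dark-forbidden-archive.md`: dark forbidden archive
- `neon-cityscape.md`: neon cityscape
- `ink-wash-landscape.md`: ink-wash landscape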
---
## Creating a New Account
### One-Step Creation (recommended)
```bash
node scripts/pipeline.js create-account \
--id military \
--name "军事账号" \
--desc "军事主题短视频,暗黑漫画风格" \
--video-model veo3-fast \
--references ./ref1.png,./ref2.png
```
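This runs automatically: create the directory → generate account.json → copy the reference images → upload to OSS → write the URLs back → generate a style skeleton.
### Manual Creation
1. Copy the `_template/` directory and rename it to the account ID
2. Edit `account.json` and fill in the account details
3. Put the reference images in `references/` (shared by all styles)
4. Upload the reference images to OSS and write the URLs into account.json
   - `node scripts/oss-upload.js accounts/{id}/references/{image file}`
   - Write the returned URL to `styles.{styleName}.references[].url`
5. Create at least one style file in `styles/`
### Validating an Account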
```bash
node scripts/pipeline.js validate-account --account military
```
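Checks: the id matches the directory, required fields are present, reference images are complete, style files exist, and OSS URLs are valid.
## Adding a New Style
Create a new `.md` file in the account's `styles/` directory; the file name is the style ID.
Name the style when invoking Claude, for example "use the cyberpunk-eastern-character style".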

View File

@@ -0,0 +1,108 @@
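# manifest.json Specification
> Created by `pipeline.js init`, executed by the Pipeline, reviewed by the Agent.
>
> **The AI must not hand-write manifest.json**; it must be initialized through `pipeline.js init`. The script inherits the structural fields from account.json automatically, and the AI supplies only the creative content (the `text`/`imagePrompt`/`videoPrompt`/`keyword` of each item).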
---
## How to Create
```bash
# After the AI has produced the creative content, initialize through the script
node pipeline.js init --account military --mode single \
--items '[{"text":"中文文案","imagePrompt":"English prompt","videoPrompt":"motion prompt","keyword":"关键词","keywordColor":"#FF6B35"}]'
# Or read the items from a file
node pipeline.js init --account military --mode single --items-file ./items.json
# Validate an existing manifest
node pipeline.js validate --manifest <path>
```
---
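## Top-Level Fields
| Field | Description | Source | Filled by |
|------|------|------|--------|
| `account` | Account ID | account.json | **init, automatic** |
| `imageModel` | `gemini` / `mj` | account.json | **init, automatic** |
| `videoModel` | `veo3-fast` / `grok-video-3`, etc. | account.json | **init, automatic** |
| `format` | Aspect ratio: `9:16` / `16:9` | account.json | **init, automatic** |
| `mode` | `single` (one image) / `framePair` (first and last frame) | CLI argument | **init, automatic** |
| `references` | Reference image array, copied from `styles.*.references` in account.json | account.json | **init, automatic** |
| `items` | Asset array (the AI supplies the creative content) | CLI `--items` | **AI → init** |
**The AI does not need to handle the fields that init inherits automatically, which removes a source of errors.**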
---
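## The references Field
Copied from account.json; the pipeline uses it directly and does not read account.json again.
- **Gemini** → reads `file` (local path, used for image-to-image)
- **MJ** → reads `url` (public URL, used for `--sref`)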
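The per-model lookup described above can be sketched as follows. `referenceInputs` is a hypothetical name, and the shape of each reference entry (`{ file, url }`) is inferred from the two bullets rather than taken from `pipeline.js`.

```javascript
// Which key of a reference entry each image model consumes, per the list above.
const REFERENCE_KEY = new Map([
  ['gemini', 'file'], // local path, used for image-to-image
  ['mj', 'url'],      // public URL, passed to --sref
])

// Hypothetical helper: returns the reference inputs for the given model,
// skipping entries that lack the needed key.
function referenceInputs(imageModel, references) {
  const key = REFERENCE_KEY.get(imageModel)
  if (!key) throw new Error(`unsupported imageModel: ${imageModel}`)
  return references.map(ref => ref[key]).filter(Boolean)
}

const refs = [
  { file: 'references/ref_001.png', url: 'https://example.com/ref_001.png' },
  { file: 'references/ref_002.png' }, // not yet uploaded, so it has no url
]
console.log(referenceInputs('gemini', refs))
console.log(referenceInputs('mj', refs))
```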
---
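## items[] Fields
### Written by the Agent (at creation)
| Field | Description |
|------|------|
| `status` | Always `"pending"` |
| `text` | Chinese subtitle copy |
| `imagePrompt` | English image description (for Gemini/MJ) |
| `videoPrompt` | English motion description (for Grok/VEO); describes camera movement rather than content |
| `keyword` | Keyword highlighted in the subtitle |
| `keywordColor` | Highlight color |
### Written Back by the Pipeline (after execution)
| Field | Description | Phase |
|------|------|---------|
| `status` | `pending` → `generating` → `done` / `failed` | images |
| `file` | Generated image path (relative to the manifest) | images |
| `candidates` | Paths of the 4 candidate images split from the MJ output (absent for Gemini) | images |
| `url` | Public OSS URL of the image | upload |
| `video` | Generated video path | videos |
| `videoDuration` | Video duration in seconds (Grok = 6, VEO = 8) | videos |
| `videoUrl` | Public OSS URL of the video | videos |
| `audio` | TTS audio path | tts |
| `duration` | Audio duration (seconds) | tts |
### Operations Available to the Agent During Review
- Pick a different MJ candidate: `item.file = item.candidates[2]`
- Remove an unacceptable item: delete it from the `items` array, then rerun `--phase images`
- Adjust the prompt and rerun: edit `imagePrompt` and set `status` back to `pending`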
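Two of these review operations are plain edits to an item object. A minimal sketch, assuming the item shape from the tables above; the helper names `selectCandidate` and `requeue` are hypothetical, and whether stale fields such as `url` should also be cleared on a rerun is not stated in the source, so the sketch leaves them alone.

```javascript
// Swap the chosen image for another MJ candidate (index is 0-based).
function selectCandidate(item, index) {
  if (!Array.isArray(item.candidates) || item.candidates[index] === undefined) {
    throw new Error(`no candidate at index ${index}`)
  }
  item.file = item.candidates[index]
  return item
}

// Change the prompt and mark the item for regeneration.
function requeue(item, newImagePrompt) {
  item.imagePrompt = newImagePrompt
  item.status = 'pending'
  return item
}

const item = {
  status: 'done',
  imagePrompt: 'old prompt',
  file: 'images/scene_01_x_cand1.jpeg',
  candidates: [
    'images/scene_01_x_cand1.jpeg', 'images/scene_01_x_cand2.jpeg',
    'images/scene_01_x_cand3.jpeg', 'images/scene_01_x_cand4.jpeg',
  ],
}
selectCandidate(item, 2)
console.log(item.file)
```

After either edit the manifest still has to be written back to disk, since it is the single source of state.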
---
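## First/Last Frame Mode
When `mode: "framePair"`, `imagePrompt` serves as the first frame, and each item carries these additional fields:
| Field | Description | Filled by |
|------|------|--------|
| `imagePrompt` | First-frame image description (the same field as in single mode) | AI |
| `lastFramePrompt` | Last-frame image description | AI |
| `lastFrame` | Last-frame image path | **written back by pipeline images** |
| `lastFrameUrl` | Last-frame OSS URL | **written back by pipeline upload** |
**First/last frame rules**: same scene, consistent viewpoint, contrasting states. When VEO detects `lastFrameUrl`, it switches to dual-image mode automatically.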
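Because the AI must supply `lastFramePrompt` for every item in this mode, a quick pre-flight check can catch omissions before any image is generated. This is a sketch only; `missingLastFramePrompts` is a hypothetical helper, not a `pipeline.js` command.

```javascript
// Returns the 0-based indexes of items lacking lastFramePrompt.
// In any mode other than framePair there is nothing to check.
function missingLastFramePrompts(manifest) {
  if (manifest.mode !== 'framePair') return []
  const missing = []
  manifest.items.forEach((item, index) => {
    if (!item.lastFramePrompt) missing.push(index)
  })
  return missing
}

const manifest = {
  mode: 'framePair',
  items: [
    { imagePrompt: 'head raised', lastFramePrompt: 'head bowed' },
    { imagePrompt: 'door closed' }, // last frame forgotten
  ],
}
console.log(missingLastFramePrompts(manifest))
```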
---
## Directory Structure
```
output/{account}_{YYYYMMDD}_{NNN}/
├── manifest.json    # Main manifest
├── images/          # scene_{NN}_{keyword}.jpeg; first/last frame adds _last, MJ candidates add _cand{1-4}
├── videos/          # scene_{NN}_{keyword}.mp4
└── audio/           # seg_001.mp3
```
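The image naming scheme in the tree can be expressed as a small function. This is a sketch under stated assumptions: `{NN}` is read as a zero-padded two-digit scene number, and `_last` is placed before `_cand{n}` when both apply; neither detail is confirmed by the source, and `imageName` is a hypothetical name.

```javascript
// Hypothetical helper that builds an image file name following the scheme above.
function imageName(sceneNumber, keyword, { last = false, candidate = null } = {}) {
  const nn = String(sceneNumber).padStart(2, '0') // assumed two-digit padding
  let name = `scene_${nn}_${keyword}`
  if (last) name += '_last'                            // last frame of a frame pair
  if (candidate !== null) name += `_cand${candidate}`  // MJ candidate 1-4
  return `${name}.jpeg`
}

console.log(imageName(3, 'power'))
console.log(imageName(3, 'power', { last: true }))
console.log(imageName(12, 'power', { candidate: 2 }))
```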


@@ -0,0 +1,730 @@
#!/usr/bin/env node
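/**
 * CapCut assembly script
 *
 * Assembles image/video assets into a draft through the CapCut Mate API and syncs it to the local Jianying app.
 *
 * Usage:
 *   node capcut_assemble.js --input ./output/batch_xxx [options]
 *
 * Configuration:
 *   Run node setup.js to generate the config
 *   Sync method: pure Node.js (sync-to-jianying.js), no Python/uv required
 */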
const axios = require('axios')
const path = require('path')
const fs = require('fs')
const { syncDraft, registerDraft, triggerDirectoryScan } = require('./sync-to-jianying')
// ============================================================================
// 配置
// ============================================================================
let _config = null
function getConfig() {
if (_config) return _config
const configPath = path.join(__dirname, '..', '..', 'config.json')
if (!fs.existsSync(configPath)) {
console.error('缺少配置文件: skills/config.json')
console.error('请运行 node setup.js 生成配置')
process.exit(1)
}
const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'))
if (!config.jianyingDraftPath || !config.capcutMateDir || !config.capcutMateApiBase) {
console.error('config.json 需要填写 jianyingDraftPath、capcutMateDir 和 capcutMateApiBase')
process.exit(1)
}
_config = config
return _config
}
const BASE_URL = getConfig().capcutMateApiBase
const US = 1_000_000
// ============================================================================
// CapCut API 封装
// ============================================================================
async function api(endpoint, data = {}, timeout = 60000) {
const url = `${BASE_URL}/${endpoint}`
const method = endpoint === 'get_draft' ? 'get' : 'post'
try {
const res = method === 'get'
? await axios.get(url, { params: data, timeout })
: await axios.post(url, data, { timeout })
if (res.data.code !== undefined && res.data.code !== 0) {
throw new Error(`API [${endpoint}] 返回错误: ${res.data.message}`)
}
return res.data
} catch (err) {
if (err.response) {
throw new Error(`API [${endpoint}] HTTP ${err.response.status}: ${JSON.stringify(err.response.data)}`)
}
throw err
}
}
// ============================================================================
// CLI 参数
// ============================================================================
function parseArgs(argv) {
const args = {}
for (let i = 0; i < argv.length; i++) {
if (argv[i].startsWith('--')) {
const key = argv[i].slice(2)
const value = argv[i + 1]
if (value && !value.startsWith('--')) {
args[key] = value
i++
} else {
args[key] = true
}
}
}
return args
}
function getResolution(format) {
const map = {
'9:16': { width: 1080, height: 1920 },
'16:9': { width: 1920, height: 1080 },
'1:1': { width: 1080, height: 1080 },
'4:3': { width: 1440, height: 1080 },
}
return map[format] || map['9:16']
}
// ============================================================================
// OSS 上传
// ============================================================================
const ossUpload = require(path.join(__dirname, 'oss-upload'))
async function uploadToOSS(filePath) {
const { url } = await ossUpload.uploadFile(filePath)
return url
}
async function batchUploadToOSS(inputDir, files) {
const urls = {}
for (const file of files) {
const filePath = path.join(inputDir, file)
if (!fs.existsSync(filePath)) continue
try {
urls[file] = await uploadToOSS(filePath)
console.log(` 上传: ${file} -> OK`)
} catch (err) {
console.error(` 上传失败: ${file} - ${err.message}`)
}
}
return urls
}
// ============================================================================
// 主流程
// ============================================================================
function buildTimeline(items, defaultDurationUs) {
// 音频为主轴,视频适配音频(短视频行业标准)
// 有视频时长时取 max不截断音频无视频时用音频时长
let offset = 0
return items.map(item => {
const audioDur = (item.duration != null) ? item.duration * US : 0
const videoDur = (item.videoDuration != null) ? item.videoDuration * US : 0
// 有视频:保证音频不被截断;无视频(图片模式):用音频时长
const dur = videoDur > 0
? Math.max(audioDur, videoDur)
: (audioDur || defaultDurationUs)
const entry = { start: offset, end: offset + dur, duration: dur }
offset += dur
return entry
})
}
async function assemble(args) {
const {
input,
manifest: manifestPath,
mode = 'images',
subtitles = 'true',
voiceover = 'true',
bgm,
effects: effectsStr,
filter: filterStr,
format = '9:16',
apiKey = '',
duration = '4',
animation = 'kenburns-zoom',
} = args
if (!input) throw new Error('缺少 --input 参数')
const inputDir = path.resolve(input)
const manifestFile = manifestPath
? path.resolve(manifestPath)
: path.join(inputDir, 'manifest.json')
if (!fs.existsSync(manifestFile)) {
throw new Error(`找不到 manifest.json: ${manifestFile}`)
}
const manifest = JSON.parse(fs.readFileSync(manifestFile, 'utf-8'))
const { width, height } = getResolution(format)
const defaultDurationUs = parseFloat(duration) * US
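  // Keep only items whose asset actually exists
  const items = manifest.items.filter(item => {
    if (item.url) return true // video mode may use a URL
    if (!item.file) return false // no local file recorded; path.join would throw on undefined
    const filePath = path.join(inputDir, item.file)
    return fs.existsSync(filePath)
  })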
if (items.length === 0) throw new Error('没有可用的素材文件')
// 统一时间线:由 duration 驱动TTS 音频时长)或 fallback 到固定时长
const timeline = buildTimeline(items, defaultDurationUs)
const totalDurationUs = timeline.length > 0 ? timeline[timeline.length - 1].end : 0
const hasTTS = items.some(item => item.audio && item.duration != null)
console.log(`\nCapCut 成片组装`)
console.log(` 模式: ${mode} 画幅: ${format} (${width}x${height})`)
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : `固定${duration}s/段`} 总时长: ${(totalDurationUs / US).toFixed(1)}s`)
console.log(` 字幕: ${subtitles} 配音: ${voiceover} 动画: ${animation}`)
console.log(` 素材: ${items.length} 个可用\n`)
const steps = []
if (mode === 'images') steps.push('upload')
steps.push('draft', 'materials', 'voiceover', 'audio', 'subtitles', 'effects', 'filter', 'save', 'sync')
const totalSteps = steps.length
let step = 0
// -- 上传图片到 OSS优先使用 manifest 中已有的 URL --
let imgUrls = {}
if (mode === 'images') {
// 先从 manifest 收集已有 URL
const needUpload = []
for (const item of items) {
if (item.url && item.url.startsWith('http')) {
imgUrls[item.file] = item.url
} else {
needUpload.push(item.file)
}
}
if (needUpload.length > 0) {
step++; console.log(`[${step}/${totalSteps}] 上传图片到 OSS (${needUpload.length} 张需上传, ${Object.keys(imgUrls).length} 张已有URL)...`)
const uploaded = await batchUploadToOSS(inputDir, needUpload)
imgUrls = { ...imgUrls, ...uploaded }
} else {
step++; console.log(`[${step}/${totalSteps}] 所有图片已有 URL跳过上传`)
}
if (Object.keys(imgUrls).length === 0) throw new Error('所有图片上传失败')
console.log(` 成功: ${Object.keys(imgUrls).length}/${items.length}\n`)
}
// -- 创建草稿 --
step++; console.log(`[${step}/${totalSteps}] 创建草稿...`)
const draftRes = await api('create_draft', { width, height })
const draftUrl = draftRes.draft_url
const draftId = new URL(draftUrl).searchParams.get('draft_id')
console.log(` draft_id: ${draftId}\n`)
// -- 导入素材 --
step++; console.log(`[${step}/${totalSteps}] 导入素材...`)
if (mode === 'images') {
await addImages(draftUrl, items, imgUrls, timeline, width, height, animation)
} else {
// 视频模式:确保所有 item 都有 videoUrlCapCut API 需要公网 URL
const missingUrl = items.filter(it => it.video && !it.videoUrl)
if (missingUrl.length > 0) {
const { uploadFile } = require('./oss-upload')
console.log(` 上传 ${missingUrl.length} 个视频到 OSS...`)
for (const item of missingUrl) {
const videoPath = path.resolve(inputDir, item.video)
try {
const { url } = await uploadFile(videoPath)
item.videoUrl = url
          // Write the URL back to the manifest; report a failure instead of swallowing it silently
          try {
            const m = JSON.parse(fs.readFileSync(manifestFile, 'utf-8'))
            const mi = m.items.find(i => i.text === item.text)
            if (mi) { mi.videoUrl = url; fs.writeFileSync(manifestFile, JSON.stringify(m, null, 2)) }
          } catch (e) {
            console.error(`   manifest 回写失败: ${e.message}`)
          }
} catch (err) {
console.log(` 视频上传失败: ${err.message}`)
}
}
}
await addVideos(draftUrl, inputDir, items, timeline, width, height)
}
// -- 添加 TTS 配音 --
step++; console.log(`[${step}/${totalSteps}] 添加 TTS 配音...`)
if (voiceover === 'true' && hasTTS) {
await addVoiceover(draftUrl, inputDir, items, timeline)
} else {
console.log(' 跳过(无 TTS 音频或未启用)')
}
// -- 添加 BGM --
step++; console.log(`[${step}/${totalSteps}] 添加背景音乐...`)
if (bgm) {
await addBGM(draftUrl, bgm, totalDurationUs)
} else {
console.log(' 跳过(未指定 --bgm')
}
// -- 读取账号字幕风格 --
const subtitleStyle = loadSubtitleStyle(manifest)
if (Object.keys(subtitleStyle).length > 0) {
console.log(` 字幕风格: ${subtitleStyle.font || '默认'} ${subtitleStyle.inAnimation ? subtitleStyle.inAnimation + '→' + subtitleStyle.outAnimation : ''}`)
}
// -- 添加字幕 --
step++; console.log(`[${step}/${totalSteps}] 添加字幕...`)
if (subtitles === 'true' && items.some(i => i.text)) {
await addSubtitles(draftUrl, items, timeline, subtitleStyle)
} else {
console.log(' 跳过')
}
// -- 添加特效 --
step++; console.log(`[${step}/${totalSteps}] 添加特效...`)
if (effectsStr) {
await addEffects(draftUrl, effectsStr, totalDurationUs)
} else {
console.log(' 跳过(未指定 --effects')
}
// -- 添加滤镜 --
step++; console.log(`[${step}/${totalSteps}] 添加滤镜...`)
if (filterStr) {
await addFilter(draftUrl, filterStr, totalDurationUs)
} else {
console.log(' 跳过(未指定 --filter')
}
// -- 保存草稿 --
step++; console.log(`[${step}/${totalSteps}] 保存草稿...`)
await api('save_draft', { draft_url: draftUrl })
console.log(' 已保存\n')
// -- 同步到本地剪映 --
step++; console.log(`[${step}/${totalSteps}] 同步到本地剪映...`)
await syncToLocalJianying(draftUrl, draftId, totalDurationUs)
console.log(' 同步完成\n')
// -- 云渲染(可选)--
if (apiKey) {
console.log('提交云渲染...')
await api('gen_video', { draft_url: draftUrl, apiKey })
console.log('渲染已提交,使用 gen_video_status 查询进度')
}
console.log(`\n成片组装完成`)
console.log(` 草稿ID: ${draftId}`)
console.log(` 总时长: ${(totalDurationUs / US).toFixed(1)}s`)
console.log(` 素材数: ${items.length}`)
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : '固定时长'}`)
if (mode === 'videos' && subtitles === 'false') {
console.log(`\n >> 视频模式未加字幕,请在剪映中打开草稿 → 识别字幕 → 语音识别生成\n`)
}
}
// ============================================================================
// 添加图片(自动上传到 OSS
// ============================================================================
async function addImages(draftUrl, items, imgUrls, timeline, width, height, animation = '') {
const imageInfos = items.map((item, i) => {
const url = imgUrls[item.file]
if (!url) throw new Error(`图片 ${item.file} 未上传成功,无法添加`)
const tl = timeline[i]
return {
image_url: url,
width,
height,
start: tl.start,
end: tl.end,
duration: tl.duration,
animation: animation || '',
transition: i > 0 ? '溶解' : '',
transition_duration: 300000,
}
})
// 单次全量提交,所有图片在同一轨道
console.log(` 一次性添加 ${imageInfos.length} 张图片...`)
const res = await api('add_images', {
draft_url: draftUrl,
image_infos: JSON.stringify(imageInfos),
alpha: 1, scale_x: 1, scale_y: 1,
transform_x: 0, transform_y: 0,
}, 300000)
const allSegmentIds = res.segment_ids || []
console.log(` 已添加 ${items.length} 张图片`)
return allSegmentIds
}
// ============================================================================
// 添加视频(从 manifest 读取时长)
// ============================================================================
async function addVideos(draftUrl, inputDir, items, timeline, width, height) {
const videoInfos = items.map((item, i) => {
const tl = timeline[i]
return {
video_url: item.videoUrl || (item.video ? path.resolve(inputDir, item.video) : null) || item.url || path.resolve(inputDir, item.file),
width,
height,
start: tl.start,
end: tl.end,
duration: tl.duration,
mask: '',
transition: i > 0 ? '溶解' : '',
transition_duration: 300000,
      volume: item.volume ?? 1, // ?? keeps an explicit volume of 0 (muted clip)
}
})
// 先尝试全量提交
try {
const res = await api('add_videos', {
draft_url: draftUrl,
video_infos: JSON.stringify(videoInfos),
alpha: 1, scale_x: 1, scale_y: 1,
transform_x: 0, transform_y: 0,
scene_timelines: [],
})
console.log(` 已添加 ${items.length} 个视频片段(全量)`)
return res.segment_ids || []
} catch (err) {
if (!err.message.includes('504') && !err.message.includes('timeout')) throw err
console.log(` 全量提交超时,降级为分批添加...`)
}
// 504 回退:分批添加(每批 3 个,保持绝对时间不变)
const BATCH_SIZE = 3
const allSegmentIds = []
for (let i = 0; i < videoInfos.length; i += BATCH_SIZE) {
const batch = videoInfos.slice(i, i + BATCH_SIZE)
const batchNum = Math.floor(i / BATCH_SIZE) + 1
const totalBatches = Math.ceil(videoInfos.length / BATCH_SIZE)
console.log(` 分批 [${batchNum}/${totalBatches}] 添加 ${batch.length} 个片段...`)
const res = await api('add_videos', {
draft_url: draftUrl,
video_infos: JSON.stringify(batch),
alpha: 1, scale_x: 1, scale_y: 1,
transform_x: 0, transform_y: 0,
scene_timelines: [],
})
if (res.segment_ids) allSegmentIds.push(...res.segment_ids)
}
console.log(` 已添加 ${items.length} 个视频片段(分批)`)
return allSegmentIds
}
// ============================================================================
// 音频上传(本地文件 → OSS 公网 URL
// ============================================================================
async function uploadAudioToOSS(filePath) {
try {
const oss = require(path.join(__dirname, 'oss-upload'))
const { url } = await oss.uploadFile(filePath)
return url
} catch (err) {
throw new Error(`音频上传 OSS 失败: ${err.message}`)
}
}
async function batchUploadAudio(inputDir, items) {
const urls = {}
for (const item of items) {
if (!item.audio || item.audio.startsWith('http')) {
if (item.audio) urls[item.audio] = item.audio
continue
}
// audio 可以是相对路径或绝对路径
const filePath = path.isAbsolute(item.audio)
? item.audio
: path.resolve(inputDir, item.audio)
if (!fs.existsSync(filePath)) {
console.error(` 音频文件不存在: ${filePath}`)
continue
}
try {
urls[item.audio] = await uploadAudioToOSS(filePath)
console.log(` 上传: ${path.basename(filePath)} -> OK`)
} catch (err) {
console.error(` 上传失败: ${path.basename(filePath)} - ${err.message}`)
}
}
return urls
}
// ============================================================================
// 添加 TTS 配音(每段音频按时间线排列)
// ============================================================================
async function addVoiceover(draftUrl, inputDir, items, timeline) {
// 收集需要上传的音频
const audioItems = items.filter(item => item.audio)
if (audioItems.length === 0) {
console.log(' 无 TTS 音频文件,跳过')
return
}
// 上传本地音频到 OSS已有的 URL 直接通过)
console.log(' 上传 TTS 音频到 OSS...')
const audioUrls = await batchUploadAudio(inputDir, items)
const audioInfos = []
for (let i = 0; i < items.length; i++) {
const item = items[i]
if (!item.audio) continue
const audioUrl = audioUrls[item.audio]
if (!audioUrl) continue
const tl = timeline[i]
audioInfos.push({
audio_url: audioUrl,
start: tl.start,
end: tl.end,
duration: tl.duration,
volume: 1.0,
})
}
if (audioInfos.length === 0) {
console.log(' 所有音频上传失败,跳过配音')
return
}
await api('add_audios', {
draft_url: draftUrl,
audio_infos: JSON.stringify(audioInfos),
})
console.log(` 已添加 ${audioInfos.length} 段 TTS 配音`)
}
// ============================================================================
// 添加背景音乐
// ============================================================================
async function addBGM(draftUrl, bgmUrl, totalDurationUs) {
// 先获取音频实际时长
let audioDuration = totalDurationUs
try {
const durRes = await api('get_audio_duration', { mp3_url: bgmUrl })
if (durRes.duration) audioDuration = durRes.duration
} catch (_) {
// 无法获取时长就用视频总时长
}
await api('add_audios', {
draft_url: draftUrl,
audio_infos: JSON.stringify([{
audio_url: bgmUrl,
duration: audioDuration,
end: Math.min(audioDuration, totalDurationUs),
start: 0,
volume: 0.15,
}]),
})
console.log(` 已添加 BGM (${(audioDuration / US).toFixed(1)}s)`)
}
// ============================================================================
// 读取账号字幕风格配置
// ============================================================================
function loadSubtitleStyle(manifest) {
const account = manifest.account
if (!account) return {}
const scriptDir = __dirname
const accountFile = path.join(scriptDir, '..', 'accounts', account, 'account.json')
if (!fs.existsSync(accountFile)) return {}
try {
const accountData = JSON.parse(fs.readFileSync(accountFile, 'utf-8'))
return accountData.capcut?.subtitleStyle || {}
} catch { return {} }
}
// ============================================================================
// 添加字幕(支持关键词高亮 + 账号字幕风格)
// ============================================================================
async function addSubtitles(draftUrl, items, timeline, style = {}) {
const captions = []
// 从账号配置读取动画参数
const inAnimation = style.inAnimation || ''
const outAnimation = style.outAnimation || ''
const inAnimDuration = style.inAnimationDuration || null
const outAnimDuration = style.outAnimationDuration || null
for (let i = 0; i < items.length; i++) {
const item = items[i]
const text = item.text || item.caption || ''
if (!text) continue
const tl = timeline[i]
const keyword = item.keyword || ''
const keywordColor = style.highlightColor || item.keywordColor || style.color || '#FFFFFF'
const cap = {
start: tl.start,
end: tl.end,
text,
keyword,
keyword_color: keyword ? keywordColor : '',
keyword_font_size: 18,
}
// 动画参数(每条字幕都带)
if (inAnimation) cap.in_animation = inAnimation
if (outAnimation) cap.out_animation = outAnimation
if (inAnimDuration) cap.in_animation_duration = inAnimDuration
if (outAnimDuration) cap.out_animation_duration = outAnimDuration
captions.push(cap)
}
if (captions.length === 0) {
console.log(' 无字幕内容,跳过')
return
}
await api('add_captions', {
draft_url: draftUrl,
captions: JSON.stringify(captions),
font: style.font || null,
font_size: style.fontSize || 15,
text_color: style.color || '#ffffff',
alignment: 1,
bold: style.bold || false,
italic: false,
underline: false,
has_shadow: style.hasShadow || false,
shadow_info: style.shadowAlpha ? {
shadow_alpha: style.shadowAlpha,
shadow_color: style.shadowColor || '#000000',
shadow_diffuse: 15,
shadow_distance: 5,
shadow_angle: -45,
} : undefined,
letter_spacing: style.letterSpacing || 0,
line_spacing: style.lineSpacing || 0,
alpha: style.alpha || 1,
scale_x: 1, scale_y: 1,
transform_x: 0,
transform_y: style.transformY || 0,
style_text: 0,
})
console.log(` 已添加 ${captions.length} 条字幕 (字体: ${style.font || '默认'}, 动画: ${inAnimation || '无'}${outAnimation || '无'})`)
}
// ============================================================================
// 添加特效
// ============================================================================
async function addEffects(draftUrl, effectsStr, totalDurationUs) {
const effectNames = effectsStr.split(',').map(s => s.trim()).filter(Boolean)
const effectInfos = effectNames.map(name => ({
effect_title: name,
start: 0,
end: totalDurationUs,
}))
await api('add_effects', {
draft_url: draftUrl,
effect_infos: JSON.stringify(effectInfos),
})
console.log(` 已添加: ${effectNames.join(', ')}`)
}
// ============================================================================
// 添加滤镜
// ============================================================================
async function addFilter(draftUrl, filterStr, totalDurationUs) {
  const [name, intensity] = filterStr.split(':')
  // Fall back to 50 only when the intensity is missing or not a number, so an explicit 0 is respected
  const parsed = parseFloat(intensity)
  const level = Number.isNaN(parsed) ? 50 : parsed
  await api('add_filters', {
    draft_url: draftUrl,
    filter_infos: JSON.stringify([{
      filter_title: (name || '').trim(),
      start: 0,
      end: totalDurationUs,
      intensity: level,
    }]),
  })
  console.log(` 已添加: ${(name || '').trim()} 强度 ${level}`)
}
// ============================================================================
// 同步草稿到本地剪映
// ============================================================================
async function syncToLocalJianying(draftUrl, draftId, totalDurationUs) {
await syncDraft(draftUrl, { name: draftId })
registerDraft(draftId, draftId, totalDurationUs)
}
// ============================================================================
// 主入口
// ============================================================================
async function main() {
const args = parseArgs(process.argv.slice(2))
if (!args.input) {
console.log('用法: node capcut_assemble.js --input <目录> [选项]')
console.log('')
console.log('必填:')
console.log(' --input <dir> 素材目录(含 manifest.json')
console.log('')
console.log('选项:')
console.log(' --mode images|videos 素材类型(默认 images')
console.log(' --format 9:16 画幅比例')
console.log(' --duration 4 默认每段时长/秒无TTS时的fallback默认 4')
console.log(' --voiceover true|false 是否添加TTS配音轨道默认 true')
console.log(' --subtitles true|false 是否添加字幕(默认 true')
console.log(' --bgm <url> 背景音乐 URL')
console.log(' --effects "名称1,名称2" 特效名称(逗号分隔)')
console.log(' --filter "名称:强度" 滤镜(强度 0-100')
console.log(' --apiKey <key> 云渲染 API Key可选')
console.log(' --manifest <path> manifest.json 路径')
console.log('')
console.log('时间线模式:')
console.log(' manifest.json 中每段包含 audio + duration → TTS音频驱动时间线')
console.log(' 无 audio/duration → 按 --duration 固定时长')
console.log('')
console.log('manifest.json 示例TTS驱动:')
console.log(' {"items":[{"file":"1.png","text":"文案","audio":"seg_1.mp3","duration":3.5}]}')
console.log('')
console.log('配置:')
console.log(' 请运行 node setup.js 生成配置')
process.exit(0)
}
await assemble(args)
}
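// Run the CLI only when this file is executed directly, so that require() can
// import assemble without triggering the CLI entry point
if (require.main === module) {
  main().catch(err => {
    console.error(`\n错误: ${err.message}`)
    process.exit(1)
  })
}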
module.exports = { assemble }


@@ -0,0 +1,917 @@
/**
 * Gemini Image Generator
 *
 * Features:
 * - Text-to-image
 * - Image-to-image
 * - Templates for several business scenarios
 * - Batch generation
 * - Custom output directory
 *
 * Usage examples:
 *   node gemini-image-generator.js generate "A cute cat" -o ./output -r 16:9
 *   node gemini-image-generator.js edit "Add sunglasses" -i ./photo.jpg
 *   node gemini-image-generator.js template logo --text "MyBrand"
 *   node gemini-image-generator.js batch ./prompts.txt
 */
const fs = require('fs')
const path = require('path')
// ============================================================================
// 配置模块
// ============================================================================
function _loadConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
if (fs.existsSync(configPath)) {
return JSON.parse(fs.readFileSync(configPath, 'utf-8'))
}
return {}
}
const _cfg = _loadConfig()
const Config = {
api: {
baseUrl: _cfg.geminiApiBaseUrl || 'https://yunwu.ai',
model: _cfg.geminiModel || 'gemini-3.1-flash-image-preview',
endpoint: _cfg.geminiEndpoint || `/v1beta/models/${_cfg.geminiModel || 'gemini-3.1-flash-image-preview'}:generateContent`,
key: _cfg.geminiApiKey || ''
},
// 默认输出配置
output: {
defaultDir: './output',
defaultFormat: 'png'
},
// 支持的宽高比
aspectRatios: ['1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9'],
// 支持的分辨率
imageSizes: ['512', '1K', '2K', '4K'],
// 默认分辨率
defaultImageSize: '2K',
// 响应模式
responseModalities: {
textAndImage: ['TEXT', 'IMAGE'],
imageOnly: ['IMAGE'],
textOnly: ['TEXT']
},
// 超时设置(毫秒)
timeout: {
default: 120000, // 默认2分钟
max: 300000 // 最大5分钟
}
}
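/*
Example (not part of the original file). The `aspectRatios` and `imageSizes`
lists above are declared, but nothing in this file checks user input against
them. A minimal sketch of such a check follows. `validateImageOptions` is a
hypothetical helper introduced only for illustration, and the two arrays are
copied from `Config` so that the snippet runs on its own.

```javascript
// Hypothetical helper (not in the original module): reject unsupported
// ratio/size values before any request is sent.
const aspectRatios = ['1:1', '2:3', '3:2', '3:4', '4:3', '4:5', '5:4', '9:16', '16:9', '21:9']
const imageSizes = ['512', '1K', '2K', '4K']

function validateImageOptions({ aspectRatio = '1:1', imageSize = '2K' } = {}) {
  if (!aspectRatios.includes(aspectRatio)) {
    throw new Error(`Unsupported aspect ratio: ${aspectRatio}`)
  }
  if (!imageSizes.includes(imageSize)) {
    throw new Error(`Unsupported image size: ${imageSize}`)
  }
  return { aspectRatio, imageSize }
}

console.log(validateImageOptions({ aspectRatio: '16:9', imageSize: '4K' }))
```

Failing early like this would surface a typo such as `-r 16x9` locally
instead of as an opaque API error.
*/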
// ============================================================================
// 文件处理模块
// ============================================================================
const FileUtils = {
/**
* 确保目录存在
*/
ensureDir(dirPath) {
if (!fs.existsSync(dirPath)) {
fs.mkdirSync(dirPath, { recursive: true })
}
return dirPath
},
/**
* 图片转Base64
*/
imageToBase64(imagePath) {
const buffer = fs.readFileSync(imagePath)
const ext = path.extname(imagePath).toLowerCase()
const mimeTypes = {
'.png': 'image/png',
'.jpg': 'image/jpeg',
'.jpeg': 'image/jpeg',
'.gif': 'image/gif',
'.webp': 'image/webp'
}
return {
mimeType: mimeTypes[ext] || 'image/png',
data: buffer.toString('base64')
}
},
/**
* Base64保存为图片
*/
base64ToImage(base64Data, outputPath) {
const buffer = Buffer.from(base64Data, 'base64')
fs.writeFileSync(outputPath, buffer)
return outputPath
},
/**
* 生成唯一文件名
*/
generateFilename(prefix = 'image', ext = 'png') {
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const random = Math.random().toString(36).substring(2, 8)
return `${prefix}_${timestamp}_${random}.${ext}`
},
/**
* 读取提示词文件
*/
readPromptsFile(filePath) {
const content = fs.readFileSync(filePath, 'utf-8')
return content.split('\n').filter(line => line.trim()).map(line => line.trim())
}
}
// ============================================================================
// API调用模块
// ============================================================================
const GeminiAPI = {
/**
* 发送生成请求
*/
async generateContent(contents, options = {}) {
const {
aspectRatio = '1:1',
imageSize = Config.defaultImageSize,
responseModalities = Config.responseModalities.textAndImage,
timeout = Config.timeout.default
} = options
const url = `${Config.api.baseUrl}${Config.api.endpoint}?key=${Config.api.key}`
const body = {
contents: contents,
generationConfig: {
responseModalities: responseModalities,
imageConfig: {
aspectRatio: aspectRatio,
imageSize: imageSize
}
}
}
console.log(`\n📡 API请求: ${Config.api.baseUrl}${Config.api.endpoint}`)
console.log(`📋 模型: ${Config.api.model}`)
console.log(`⏱️ 超时: ${timeout / 1000}秒`)
// 使用 AbortController 实现超时
const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), timeout)
try {
const response = await fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${Config.api.key}`
},
body: JSON.stringify(body),
signal: controller.signal
})
if (!response.ok) {
const error = await response.text()
throw new Error(`API请求失败: ${response.status} - ${error}`)
}
return await response.json()
} finally {
clearTimeout(timeoutId)
}
},
/**
* 解析响应,提取图片和文本
*/
parseResponse(response) {
const result = {
text: '',
images: []
}
if (!response.candidates || !response.candidates[0]) {
return result
}
const parts = response.candidates[0].content?.parts || []
for (const part of parts) {
if (part.text) {
result.text += part.text
}
if (part.inlineData) {
result.images.push({
mimeType: part.inlineData.mimeType,
data: part.inlineData.data
})
}
}
return result
}
}
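/*
Example (not part of the original file). `parseResponse` concatenates every
`text` part and collects every `inlineData` part of the first candidate. The
sketch below runs the same logic on a hand-written object. The payload is
made up for illustration and is not real API output, and `parseParts` is a
stand-alone copy so the snippet does not depend on this module.

```javascript
// Stand-alone copy of the parsing logic, for illustration only.
function parseParts(response) {
  const result = { text: '', images: [] }
  const parts = response.candidates?.[0]?.content?.parts || []
  for (const part of parts) {
    if (part.text) result.text += part.text
    if (part.inlineData) {
      result.images.push({ mimeType: part.inlineData.mimeType, data: part.inlineData.data })
    }
  }
  return result
}

// Made-up response: one text part and one tiny base64 payload.
const mockResponse = {
  candidates: [{
    content: {
      parts: [
        { text: 'Here is your image.' },
        { inlineData: { mimeType: 'image/png', data: Buffer.from('demo').toString('base64') } }
      ]
    }
  }]
}

const parsed = parseParts(mockResponse)
console.log(parsed.text, parsed.images.length)
```

A response with no candidates yields an empty result rather than an error,
so callers should treat an empty `images` array as "nothing generated".
*/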
// ============================================================================
// 业务场景模板模块
// ============================================================================
const Templates = {
/**
* 写实照片模板
*/
photorealistic: {
name: '写实照片',
generate(subject, options = {}) {
const {
shotType = 'close-up portrait',
lighting = 'soft, natural golden hour light',
mood = 'serene',
environment = '',
cameraDetails = '85mm lens, shallow depth of field'
} = options
return `A photorealistic ${shotType} of ${subject}. ${environment ? `Set in ${environment}. ` : ''}The scene is illuminated by ${lighting}, creating a ${mood} atmosphere. Captured with ${cameraDetails}. Ultra-realistic, with sharp focus on key details.`
}
},
/**
* 贴纸/图标模板
*/
sticker: {
name: '贴纸/图标',
generate(subject, options = {}) {
const {
style = 'kawaii',
colorPalette = 'vibrant',
background = 'white'
} = options
return `A ${style}-style sticker of ${subject}. The design features bold, clean outlines, simple cel-shading, and a ${colorPalette} color palette. The background must be ${background}.`
}
},
/**
* Logo设计模板
*/
logo: {
name: 'Logo设计',
generate(text, options = {}) {
const {
style = 'modern, minimalist',
colorScheme = 'black and white',
shape = 'circle'
} = options
return `Create a ${style} logo${text ? ` with the text "${text}"` : ''}. The text should be in a clean, bold, sans-serif font. The color scheme is ${colorScheme}. Put the logo in a ${shape}.`
}
},
/**
* 产品图模板
*/
product: {
name: '产品图',
generate(product, options = {}) {
const {
surface = 'polished concrete surface',
lighting = 'three-point softbox setup',
angle = 'slightly elevated 45-degree shot',
background = 'minimalist'
} = options
return `A high-resolution, studio-lit product photograph of ${product}, presented on a ${surface}. The lighting is a ${lighting} designed to create soft, diffused highlights and eliminate harsh shadows. The camera angle is a ${angle} to showcase key features. Ultra-realistic. ${background} background.`
}
},
/**
* 极简设计模板
*/
minimalist: {
name: '极简设计',
generate(subject, options = {}) {
const {
position = 'bottom-right',
backgroundColor = 'off-white canvas',
lighting = 'soft, diffused lighting from the top left'
} = options
return `A minimalist composition featuring a single, ${subject} positioned in the ${position} of the frame. The background is a vast, empty ${backgroundColor}, creating significant negative space for text. ${lighting}.`
}
},
/**
* 漫画/故事板模板
*/
comic: {
name: '漫画/故事板',
generate(scene, options = {}) {
const {
style = 'gritty, noir',
panels = 3
} = options
return `Make a ${panels} panel comic in a ${style} art style with high-contrast black and white inks. ${scene}`
}
},
/**
* 风格转换模板
*/
styleTransfer: {
name: '风格转换',
generate(targetStyle, options = {}) {
const {
preserveElements = 'composition and key elements'
} = options
return `Transform the provided image into the artistic style of ${targetStyle}. Preserve the original ${preserveElements} but render with the new stylistic elements.`
}
},
/**
* 图像编辑模板
*/
edit: {
name: '图像编辑',
generate(instruction, options = {}) {
const {
preserve = 'Keep everything else unchanged, preserving the original style, lighting, and composition'
} = options
return `${instruction}. ${preserve}.`
}
},
/**
* 图像合成模板
*/
composite: {
name: '图像合成',
generate(description, options = {}) {
return `Create a new image by combining the elements from the provided images. ${description} Generate a realistic result with proper lighting and shadows.`
}
}
}
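/*
Example (not part of the original file). Each template is a pure function
from a subject plus options to an English prompt string, so it can be
exercised without calling the API. The sketch uses a stand-alone copy of the
`sticker` template; the subject and option values are arbitrary.

```javascript
// Stand-alone copy of the sticker template, for illustration only.
const stickerTemplate = {
  generate(subject, options = {}) {
    const { style = 'kawaii', colorPalette = 'vibrant', background = 'white' } = options
    return `A ${style}-style sticker of ${subject}. The design features bold, clean outlines, simple cel-shading, and a ${colorPalette} color palette. The background must be ${background}.`
  }
}

// Any option that is left out falls back to its default.
const stickerPrompt = stickerTemplate.generate('a red panda', { colorPalette: 'pastel' })
console.log(stickerPrompt)
```

Only `colorPalette` is overridden here, so `style` and `background` keep
their defaults.
*/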
// ============================================================================
// 核心生成器类
// ============================================================================
class GeminiImageGenerator {
constructor(options = {}) {
this.outputDir = options.outputDir || Config.output.defaultDir
this.defaultAspectRatio = options.aspectRatio || '1:1'
this.defaultImageSize = options.imageSize || Config.defaultImageSize
if (!Config.api.key) {
console.warn('警告: 未设置API密钥')
}
}
/**
* 文生图
*/
async textToImage(prompt, options = {}) {
const {
aspectRatio = this.defaultAspectRatio,
imageSize = this.defaultImageSize,
outputDir = this.outputDir,
filename = null
} = options
console.log(`\n🎨 生成图片: "${prompt.substring(0, 50)}..."`)
console.log(`📐 宽高比: ${aspectRatio}`)
console.log(`📏 分辨率: ${imageSize}`)
const contents = [{
role: 'user',
parts: [{ text: prompt }]
}]
const response = await GeminiAPI.generateContent(contents, { aspectRatio, imageSize })
const result = GeminiAPI.parseResponse(response)
if (result.text) {
console.log(`📝 模型回复: ${result.text}`)
}
const savedFiles = []
FileUtils.ensureDir(outputDir)
for (let i = 0; i < result.images.length; i++) {
const img = result.images[i]
const ext = img.mimeType.split('/')[1] || 'png'
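// A fixed filename would be overwritten when several images come back, so suffix the extras
const outputFilename = !filename
? FileUtils.generateFilename('generated', ext)
: i === 0 ? filename : `${path.parse(filename).name}_${i + 1}${path.parse(filename).ext}`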
const outputPath = path.join(outputDir, outputFilename)
FileUtils.base64ToImage(img.data, outputPath)
savedFiles.push(outputPath)
console.log(`✅ 已保存: ${outputPath}`)
}
return {
text: result.text,
images: result.images,
savedFiles
}
}
/**
* 图生图(带参考图编辑)
*/
async imageToImage(prompt, inputImages, options = {}) {
const {
aspectRatio = this.defaultAspectRatio,
imageSize = this.defaultImageSize,
outputDir = this.outputDir
} = options
console.log(`\n🖼️ 编辑图片: "${prompt.substring(0, 50)}..."`)
console.log(`📁 输入图片: ${Array.isArray(inputImages) ? inputImages.length : 1}`)
console.log(`📏 分辨率: ${imageSize}`)
const parts = [{ text: prompt }]
// 处理输入图片
const images = Array.isArray(inputImages) ? inputImages : [inputImages]
for (const imgPath of images) {
const { mimeType, data } = FileUtils.imageToBase64(imgPath)
parts.push({
inlineData: {
mime_type: mimeType,
data: data
}
})
}
const contents = [{
role: 'user',
parts: parts
}]
const response = await GeminiAPI.generateContent(contents, { aspectRatio, imageSize })
const result = GeminiAPI.parseResponse(response)
if (result.text) {
console.log(`📝 模型回复: ${result.text}`)
}
const savedFiles = []
FileUtils.ensureDir(outputDir)
for (let i = 0; i < result.images.length; i++) {
const img = result.images[i]
const ext = img.mimeType.split('/')[1] || 'png'
const outputFilename = FileUtils.generateFilename('edited', ext)
const outputPath = path.join(outputDir, outputFilename)
FileUtils.base64ToImage(img.data, outputPath)
savedFiles.push(outputPath)
console.log(`✅ 已保存: ${outputPath}`)
}
return {
text: result.text,
images: result.images,
savedFiles
}
}
/**
* 使用模板生成
*/
async generateFromTemplate(templateName, ...args) {
const template = Templates[templateName]
if (!template) {
throw new Error(`未知的模板: ${templateName}。可用模板: ${Object.keys(Templates).join(', ')}`)
}
const options = args[args.length - 1] || {}
const prompt = template.generate(...args)
console.log(`📋 使用模板: ${template.name}`)
return this.textToImage(prompt, options)
}
/**
* 批量生成
*/
async batchGenerate(prompts, options = {}) {
const results = []
const total = prompts.length
console.log(`\n🚀 开始批量生成,共 ${total} 个任务`)
for (let i = 0; i < prompts.length; i++) {
console.log(`\n[${i + 1}/${total}] 处理中...`)
try {
const result = await this.textToImage(prompts[i], {
...options,
filename: `batch_${i + 1}.png`
})
results.push({ success: true, prompt: prompts[i], result })
} catch (error) {
console.error(`❌ 失败: ${error.message}`)
results.push({ success: false, prompt: prompts[i], error: error.message })
}
}
const successCount = results.filter(r => r.success).length
console.log(`\n✨ 批量生成完成: ${successCount}/${total} 成功`)
return results
}
/**
* 多轮对话编辑
*/
createChatSession(options = {}) {
const history = []
return {
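// Arrow function: `this` inside must stay bound to the generator instance,
// otherwise `this.outputDir` below is undefined and ensureDir() throws
send: async (message, inputImages = null) => {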
const parts = [{ text: message }]
// 如果有输入图片
if (inputImages) {
const images = Array.isArray(inputImages) ? inputImages : [inputImages]
for (const imgPath of images) {
const { mimeType, data } = FileUtils.imageToBase64(imgPath)
parts.push({
inlineData: {
mime_type: mimeType,
data: data
}
})
}
}
// 添加用户消息到历史
history.push({
role: 'user',
parts: parts
})
const response = await GeminiAPI.generateContent(history, options)
const result = GeminiAPI.parseResponse(response)
// 添加模型回复到历史(需要包含图片数据以便后续编辑)
const modelParts = []
if (result.text) {
modelParts.push({ text: result.text })
}
for (const img of result.images) {
modelParts.push({
inlineData: {
mime_type: img.mimeType,
data: img.data
}
})
}
if (modelParts.length > 0) {
history.push({
role: 'model',
parts: modelParts
})
}
// 保存图片
const savedFiles = []
FileUtils.ensureDir(options.outputDir || this.outputDir)
for (const img of result.images) {
const ext = img.mimeType.split('/')[1] || 'png'
const outputFilename = FileUtils.generateFilename('chat', ext)
const outputPath = path.join(options.outputDir || this.outputDir, outputFilename)
FileUtils.base64ToImage(img.data, outputPath)
savedFiles.push(outputPath)
console.log(`✅ 已保存: ${outputPath}`)
}
return {
text: result.text,
images: result.images,
savedFiles
}
},
getHistory() {
return history
}
}
}
}
// ============================================================================
// CLI接口模块
// ============================================================================
const CLI = {
/**
* 解析命令行参数
*/
parseArgs(args) {
const result = {
command: '',
params: [],
options: {}
}
let i = 0
while (i < args.length) {
const arg = args[i]
if (arg.startsWith('--')) {
const key = arg.substring(2)
const nextArg = args[i + 1]
if (nextArg && !nextArg.startsWith('-')) {
result.options[key] = nextArg
i += 2
} else {
result.options[key] = true
i++
}
} else if (arg.startsWith('-')) {
const key = arg.substring(1)
const shortOptions = {
'o': 'output',
'r': 'ratio',
's': 'size',
'i': 'input',
't': 'template',
'h': 'help'
}
const fullKey = shortOptions[key] || key
const nextArg = args[i + 1]
if (nextArg && !nextArg.startsWith('-')) {
result.options[fullKey] = nextArg
i += 2
} else {
result.options[fullKey] = true
i++
}
} else if (!result.command) {
result.command = arg
i++
} else {
result.params.push(arg)
i++
}
}
return result
},
/**
* 显示帮助信息
*/
showHelp() {
console.log(`
🎨 Gemini Image Generator - 云雾API图片生成工具
📦 模型: ${Config.api.model}
用法:
node gemini-image-generator.js <command> [options]
命令:
generate <prompt> 文生图
edit <prompt> 图生图(需要 -i 指定输入图片)
template <name> 使用模板生成
batch <file> 批量生成(从文件读取提示词)
list-templates 列出所有可用模板
选项:
-o, --output <dir> 输出目录 (默认: ./output)
-r, --ratio <ratio> 宽高比 (1:1, 16:9, 9:16, 3:2, 2:3 等)
-s, --size <size> 分辨率 (512, 1K, 2K, 4K,默认: 2K)
-i, --input <file> 输入图片路径(用于edit命令)
-t, --template <name> 模板名称
--text <text> Logo文字(用于logo模板)
--subject <subject> 主题内容
--style <style> 风格
-h, --help 显示帮助信息
示例:
# 基础文生图 16:9 2K分辨率
node gemini-image-generator.js generate "A cute cat wearing a hat" -o ./my-images -r 16:9 -s 2K
# 高分辨率4K图片
node gemini-image-generator.js generate "A landscape photo" -r 16:9 -s 4K
# 图生图编辑
node gemini-image-generator.js edit "Add sunglasses to this person" -i ./photo.jpg
# 使用Logo模板
node gemini-image-generator.js template logo --text "MyBrand" --style minimalist
# 使用产品图模板
node gemini-image-generator.js template product --subject "a minimalist ceramic coffee mug"
# 批量生成
node gemini-image-generator.js batch ./prompts.txt -o ./batch-output
可用宽高比:
${Config.aspectRatios.join(', ')}
可用分辨率:
${Config.imageSizes.join(', ')}
可用模板:
${Object.entries(Templates).map(([k, v]) => `${k} (${v.name})`).join('\n ')}
`)
},
/**
* 列出模板
*/
listTemplates() {
console.log('\n📋 可用模板:\n')
for (const [key, template] of Object.entries(Templates)) {
console.log(` ${key.padEnd(15)} - ${template.name}`)
}
console.log('')
},
/**
* 执行命令
*/
async run(args) {
const { command, params, options } = this.parseArgs(args)
if (options.help || command === 'help' || !command) {
this.showHelp()
return
}
const generator = new GeminiImageGenerator({
outputDir: options.output || Config.output.defaultDir,
aspectRatio: options.ratio || '1:1',
imageSize: options.size || Config.defaultImageSize
})
switch (command) {
case 'generate': {
const prompt = params.join(' ')
if (!prompt) {
console.error('❌ 请提供生成提示词')
return
}
await generator.textToImage(prompt, {
aspectRatio: options.ratio,
imageSize: options.size,
outputDir: options.output
})
break
}
case 'edit': {
const prompt = params.join(' ')
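// `-i` without a value is parsed as boolean true, so only split real strings
const inputImages = typeof options.input === 'string' ? options.input.split(',').map(p => p.trim()).filter(Boolean) : []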
if (!prompt) {
console.error('❌ 请提供编辑指令')
return
}
if (!inputImages || inputImages.length === 0) {
console.error('❌ 请使用 -i 指定输入图片')
return
}
await generator.imageToImage(prompt, inputImages, {
aspectRatio: options.ratio,
imageSize: options.size,
outputDir: options.output
})
break
}
case 'template': {
const templateName = params[0] || options.template
if (!templateName) {
this.listTemplates()
return
}
const template = Templates[templateName]
if (!template) {
console.error(`❌ 未知的模板: ${templateName}`)
this.listTemplates()
return
}
// 根据模板类型处理参数
let templateOptions = { aspectRatio: options.ratio, outputDir: options.output }
switch (templateName) {
case 'logo':
await generator.generateFromTemplate('logo', options.text || '', {
style: options.style || 'modern, minimalist',
colorScheme: 'black and white'
}, templateOptions)
break
case 'product':
await generator.generateFromTemplate('product', options.subject || params.slice(1).join(' ') || 'a product', {
surface: 'polished concrete surface'
}, templateOptions)
break
case 'photorealistic':
await generator.generateFromTemplate('photorealistic', options.subject || params.slice(1).join(' ') || 'a person', {}, templateOptions)
break
case 'sticker':
await generator.generateFromTemplate('sticker', options.subject || params.slice(1).join(' ') || 'a cute character', {}, templateOptions)
break
default:
await generator.generateFromTemplate(templateName, params.slice(1).join(' ') || '', {}, templateOptions)
}
break
}
case 'batch': {
const filePath = params[0]
if (!filePath) {
console.error('❌ 请提供提示词文件路径')
return
}
const prompts = FileUtils.readPromptsFile(filePath)
await generator.batchGenerate(prompts, {
aspectRatio: options.ratio,
outputDir: options.output
})
break
}
case 'list-templates': {
this.listTemplates()
break
}
default:
console.error(`❌ 未知命令: ${command}`)
this.showHelp()
}
}
}
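/*
Example (not part of the original file). `CLI.parseArgs` maps short flags to
long names, treats a flag that is not followed by a value as boolean `true`,
takes the first bare word as the command, and collects the remaining bare
words as positional params. The sketch below condenses that behavior.
`parseCliArgs` is a simplified stand-in written for this example, not the
original method.

```javascript
const SHORT_FLAGS = { o: 'output', r: 'ratio', s: 'size', i: 'input', t: 'template', h: 'help' }

function parseCliArgs(args) {
  const result = { command: '', params: [], options: {} }
  for (let i = 0; i < args.length; i++) {
    const arg = args[i]
    if (arg.startsWith('-')) {
      const raw = arg.replace(/^--?/, '')
      const key = arg.startsWith('--') ? raw : (SHORT_FLAGS[raw] || raw)
      const next = args[i + 1]
      if (next && !next.startsWith('-')) {
        result.options[key] = next
        i++ // consume the value
      } else {
        result.options[key] = true
      }
    } else if (!result.command) {
      result.command = arg
    } else {
      result.params.push(arg)
    }
  }
  return result
}

const parsedArgs = parseCliArgs(['generate', 'A cute cat', '-r', '16:9', '--help'])
console.log(parsedArgs)
```

One consequence of this scheme is that a value starting with `-` is read as
the next flag, so the preceding option silently becomes `true`.
*/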
// ============================================================================
// 导出模块
// ============================================================================
module.exports = {
// 核心类
GeminiImageGenerator,
// 模块
Config,
FileUtils,
GeminiAPI,
Templates,
CLI,
// 便捷方法
generate: async (prompt, options) => {
const generator = new GeminiImageGenerator(options)
return generator.textToImage(prompt, options)
},
edit: async (prompt, images, options) => {
const generator = new GeminiImageGenerator(options)
return generator.imageToImage(prompt, images, options)
},
fromTemplate: async (templateName, ...args) => {
const generator = new GeminiImageGenerator(args[args.length - 1] || {})
return generator.generateFromTemplate(templateName, ...args)
}
}
// ============================================================================
// 主入口
// ============================================================================
// 如果直接运行此脚本
if (require.main === module) {
const args = process.argv.slice(2)
CLI.run(args).catch(error => {
console.error(`\n❌ 错误: ${error.message}`)
process.exit(1)
})
}


@@ -0,0 +1,580 @@
#!/usr/bin/env node
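/**
 * Grok Video Generator: image-to-video tool
 *
 * Features:
 * - Submit image-to-video tasks (Grok model)
 * - Poll until completion (typically 60-240 seconds)
 * - On failure, simplify the prompt and retry automatically (up to 3 retries)
 * - Download the resulting video
 *
 * Usage:
 * node grok-video-generator.js --image ./ref.jpg --prompt "zoom in slowly" -o ./output
 * node grok-video-generator.js batch ./tasks.json -o ./output
 */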
const fs = require('fs')
const path = require('path')
const https = require('https')
const http = require('http')
// ============================================================================
// 配置
// ============================================================================
function loadConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
if (fs.existsSync(configPath)) {
return JSON.parse(fs.readFileSync(configPath, 'utf-8'))
}
return {}
}
const cfg = loadConfig()
const Config = {
baseUrl: cfg.grokApiBaseUrl || 'https://yunwu.ai',
apiKey: cfg.grokApiKey || '',
model: cfg.grokModel || 'grok-video-3',
pollInterval: 10000, // poll every 10 seconds (Grok is slow)
maxPollTime: 300000, // wait at most 5 minutes per task
maxRetries: 3, // number of retries after the first failed attempt
}
// ============================================================================
// 提示词优化(失败时自动调整)
// ============================================================================
const PromptOptimizer = {
/**
* 根据失败原因优化提示词
*/
optimize(prompt, failReason, attempt) {
let optimized = prompt
// 第1次重试简化提示词
if (attempt === 1) {
optimized = simplifyPrompt(prompt)
console.log(` 重试策略: 简化提示词`)
}
// 第2次重试添加安全后缀
if (attempt === 2) {
optimized = `${simplifyPrompt(prompt)}, smooth motion, high quality`
console.log(` 重试策略: 简化 + 安全后缀`)
}
// 第3次重试极简提示词
if (attempt >= 3) {
optimized = extractCoreSubject(prompt)
console.log(` 重试策略: 极简提示词`)
}
return optimized
}
}
function simplifyPrompt(prompt) {
// 去掉过长描述,保留核心部分
const parts = prompt.split(',').map(s => s.trim())
return parts.slice(0, 3).join(', ')
}
function extractCoreSubject(prompt) {
// 提取第一个逗号或句号前的内容
const match = prompt.match(/^([^.!,]+)/)
return match ? match[1].trim() : 'cinematic motion'
}
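/*
Example (not part of the original file). The retry strategy shortens the
prompt a little more on each attempt. The sketch below shows the three
retry prompts for one sample input. The helpers are stand-alone copies of
`simplifyPrompt` and `extractCoreSubject`, and the sample prompt is invented.

```javascript
// Stand-alone copies of the two helpers, for illustration only.
function simplifyPrompt(prompt) {
  return prompt.split(',').map(s => s.trim()).slice(0, 3).join(', ')
}

function extractCoreSubject(prompt) {
  const match = prompt.match(/^([^.!,]+)/)
  return match ? match[1].trim() : 'cinematic motion'
}

const originalPrompt = 'slow zoom in on the lighthouse, waves crashing, golden hour, film grain, 35mm'
const retry1 = simplifyPrompt(originalPrompt)                                    // first three clauses
const retry2 = `${simplifyPrompt(originalPrompt)}, smooth motion, high quality`  // plus a generic suffix
const retry3 = extractCoreSubject(originalPrompt)                                // first clause only
console.log([retry1, retry2, retry3])
```

Note that every retry starts from the original prompt, not from the previous
retry, so the three strategies are independent of each other.
*/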
// ============================================================================
// API
// ============================================================================
const GrokApi = {
/**
* 提交图生视频任务
*/
async create(imageUrl, prompt, options = {}) {
const {
aspectRatio = '9:16',
size = '720P',
mode = 'custom',
model = Config.model,
} = options
let finalPrompt = prompt.trim()
if (mode && !finalPrompt.includes('--mode')) {
finalPrompt = `${finalPrompt} --mode=${mode}`
}
const body = {
model,
prompt: finalPrompt,
aspect_ratio: aspectRatio,
size,
images: [imageUrl],
}
console.log(`\n📡 提交 Grok 视频任务`)
console.log(` 模型: ${model}`)
console.log(` 提示词: ${finalPrompt.substring(0, 80)}...`)
console.log(` 参考图: ${imageUrl.substring(0, 60)}...`)
console.log(` 画幅: ${aspectRatio}`)
const res = await fetch(`${Config.baseUrl}/v1/video/create`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${Config.apiKey}`,
},
body: JSON.stringify(body),
})
const result = await res.json()
if (!result.id) {
throw new Error(`Grok 提交失败: ${JSON.stringify(result)}`)
}
console.log(` 任务 ID: ${result.id}`)
return result.id
},
/**
* 查询任务状态
*/
async query(taskId) {
const res = await fetch(`${Config.baseUrl}/v1/video/query?id=${taskId}`, {
headers: { 'Authorization': `Bearer ${Config.apiKey}` },
})
return await res.json()
},
/**
* 轮询直到完成
*/
async poll(taskId) {
const startTime = Date.now()
let lastProgress = 0
console.log(`\n⏳ 等待 Grok 视频生成(预计 60-240 秒)...`)
while (Date.now() - startTime < Config.maxPollTime) {
const task = await GrokApi.query(taskId)
if (task.status === 'completed') {
console.log(`\n✅ 视频生成完成!`)
console.log(` 视频: ${task.video_url}`)
return {
success: true,
videoUrl: task.video_url,
thumbnailUrl: task.thumbnail_url || '',
}
}
if (task.status === 'failed' || task.error) {
throw new Error(task.error || task.message || 'Grok 生成失败')
}
const progress = task.progress || 0
if (progress !== lastProgress) {
lastProgress = progress
const elapsed = Math.round((Date.now() - startTime) / 1000)
process.stdout.write(` 进度: ${progress}% 已等待: ${elapsed}s 状态: ${task.status}\r`)
}
await new Promise(r => setTimeout(r, Config.pollInterval))
}
throw new Error(`Grok 生成超时 (${Config.maxPollTime / 1000}s)`)
},
}
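/*
Example (not part of the original file). `GrokApi.poll` is a plain
poll-until-done loop: query, return on `completed`, throw on failure, sleep,
and give up after a deadline. The sketch below reproduces that shape with an
injected `query` function so it runs without network access. The fake
backend, the `processing` status name, and the short intervals are
assumptions made for the demo.

```javascript
// Generic poll-until-done loop. `query` is any async function returning
// { status, ... }; the fake below stands in for the real HTTP call.
async function pollUntilDone(query, { intervalMs = 10, maxWaitMs = 1000 } = {}) {
  const start = Date.now()
  while (Date.now() - start < maxWaitMs) {
    const task = await query()
    if (task.status === 'completed') return task
    if (task.status === 'failed' || task.error) {
      throw new Error(task.error || 'generation failed')
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  throw new Error(`timed out after ${maxWaitMs}ms`)
}

// Fake backend: reports "processing" twice, then "completed".
let queryCalls = 0
const fakeQuery = async () => (++queryCalls < 3
  ? { status: 'processing', progress: queryCalls * 40 }
  : { status: 'completed', video_url: 'https://example.invalid/demo.mp4' })

const pollDemo = pollUntilDone(fakeQuery).then(task => {
  console.log(`done after ${queryCalls} queries: ${task.video_url}`)
  return task
})
```

Passing the query function in as a parameter keeps the loop testable; the
real code calls `GrokApi.query(taskId)` at that point.
*/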
// ============================================================================
// 图片下载工具
// ============================================================================
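async function download(url, outputPath) {
  const protocol = url.startsWith('https') ? https : http
  return new Promise((resolve, reject) => {
    protocol.get(url, (response) => {
      const { statusCode, headers } = response
      // Follow redirects; a relative Location header is resolved against the current URL
      if (statusCode >= 300 && statusCode < 400 && headers.location) {
        response.resume()
        const nextUrl = new URL(headers.location, url).toString()
        return download(nextUrl, outputPath).then(resolve, reject)
      }
      // Reject non-2xx responses instead of saving an error body as the output file
      if (statusCode < 200 || statusCode >= 300) {
        response.resume()
        return reject(new Error(`下载失败: HTTP ${statusCode}`))
      }
      // Open the file only once there is a body worth writing
      const file = fs.createWriteStream(outputPath)
      response.on('error', reject)
      response.pipe(file)
      file.on('finish', () => resolve(outputPath))
      file.on('error', (err) => {
        // Remove the partial file, then report the write error
        fs.rm(outputPath, { force: true }, () => reject(err))
      })
    }).on('error', reject)
  })
}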
// ============================================================================
// 核心流程(带重试)
// ============================================================================
async function generate(imageUrl, prompt, options = {}) {
const { outputDir = './output', aspectRatio = '9:16', size = '720P' } = options
if (!Config.apiKey) throw new Error('未配置 grokApiKey,请在 config.json 中添加')
fs.mkdirSync(outputDir, { recursive: true })
let currentPrompt = prompt
let lastError = null
for (let attempt = 0; attempt <= Config.maxRetries; attempt++) {
try {
if (attempt > 0) {
currentPrompt = PromptOptimizer.optimize(prompt, lastError, attempt)
console.log(`\n🔄 第 ${attempt} 次重试`)
console.log(` 新提示词: ${currentPrompt}`)
}
// 1. 提交
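// Forward mode/model as well; otherwise the --mode CLI flag is silently ignored
// (undefined values fall back to the defaults inside GrokApi.create)
const taskId = await GrokApi.create(imageUrl, currentPrompt, { aspectRatio, size, mode: options.mode, model: options.model })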
// 2. 轮询
const result = await GrokApi.poll(taskId)
// 3. 下载
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const videoFile = path.join(outputDir, `${timestamp}_grok.mp4`)
await download(result.videoUrl, videoFile)
console.log(` 下载完成: ${videoFile}`)
// 下载缩略图(如有)
let thumbnailFile = null
if (result.thumbnailUrl) {
thumbnailFile = path.join(outputDir, `${timestamp}_thumb.jpg`)
try { await download(result.thumbnailUrl, thumbnailFile) } catch (_) {}
}
return {
success: true,
taskId,
prompt: currentPrompt,
originalPrompt: prompt,
attempts: attempt + 1,
files: [videoFile],
thumbnail: thumbnailFile,
}
} catch (err) {
lastError = err.message
console.error(` ❌ 第 ${attempt + 1} 次失败: ${err.message}`)
if (attempt < Config.maxRetries) {
console.log(` 等待 5 秒后重试...`)
await new Promise(r => setTimeout(r, 5000))
}
}
}
throw new Error(`Grok 视频生成失败(已重试 ${Config.maxRetries} 次): ${lastError}`)
}
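/**
 * Parallel batch generation: submit all tasks first, then poll them concurrently.
 * Five images finish in roughly 2 minutes instead of roughly 10 minutes when run serially.
 *
 * Accepted input (either form):
 * 1. A tasks array: [{ image, prompt, text, videoPrompt }]
 * 2. A manifest.json object: { items: [{ file, url, text, videoPrompt, keyword, keywordColor }] }
 *
 * videoPrompt is produced during the image generation stage and describes the
 * motion of the video (for example "slow zoom in on subject").
 */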
async function batchGenerate(tasks, options = {}) {
const { outputDir = './output', aspectRatio = '9:16', size = '720P' } = options
const concurrency = options.concurrency || 3
if (!Config.apiKey) throw new Error('未配置 grokApiKey,请在 config.json 中添加')
fs.mkdirSync(outputDir, { recursive: true })
// 如果 tasks 是 manifest 格式,转换
if (tasks.items && Array.isArray(tasks.items)) {
tasks = tasks.items.map(item => ({
image: item.url || item.image || '',
prompt: item.videoPrompt || item.prompt || 'cinematic motion',
text: item.text || item.caption || '',
keyword: item.keyword || '',
keywordColor: item.keywordColor || '',
file: item.file || '',
}))
}
// Phase 1: 并行提交所有任务(限制并发数)
console.log(`\n📡 并行提交 ${tasks.length} 个视频任务(并发: ${concurrency})...`)
const submitted = []
for (let i = 0; i < tasks.length; i += concurrency) {
const batch = tasks.slice(i, i + concurrency)
const batchResults = await Promise.allSettled(
batch.map(async (task, j) => {
const idx = i + j
const prompt = task.videoPrompt || task.prompt
console.log(` [${idx + 1}/${tasks.length}] 提交: ${prompt.substring(0, 50)}...`)
try {
const taskId = await GrokApi.create(task.image, prompt, { aspectRatio, size })
return { idx, taskId, task, error: null }
} catch (err) {
console.error(` [${idx + 1}] 提交失败: ${err.message}`)
return { idx, taskId: null, task, error: err.message }
}
})
)
submitted.push(...batchResults.map(r => r.value || r.reason))
}
const pendingTasks = submitted.filter(s => s.taskId)
if (pendingTasks.length === 0) {
console.error('\n❌ 所有任务提交失败')
return tasks.map((task, idx) => ({
success: false, ...task,
error: (submitted.find(s => s.idx === idx) || {}).error || '提交失败',
}))
}
// Phase 2: 并行轮询所有已提交任务
console.log(`\n⏳ 并行等待 ${pendingTasks.length} 个视频生成...`)
const pollResults = await Promise.allSettled(
pendingTasks.map(async ({ idx, taskId, task }) => {
const prompt = task.videoPrompt || task.prompt
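// Pass the source image too: pollWithRetry resubmits the task on failure
// and would otherwise call GrokApi.create with an empty image URL
const result = await pollWithRetry(taskId, prompt, { imageUrl: task.image, outputDir, aspectRatio, size })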
return { idx, ...result, task }
})
)
// 合并结果
const results = []
for (let i = 0; i < tasks.length; i++) {
const submittedInfo = submitted.find(s => s.idx === i)
if (!submittedInfo || !submittedInfo.taskId) {
results.push({ success: false, ...tasks[i], error: submittedInfo?.error || '提交失败' })
continue
}
// pollResults is in the same order as pendingTasks, so match by position;
// this keeps the real rejection reason instead of a generic message
const pollResult = pollResults[pendingTasks.findIndex(p => p.idx === i)]
if (pollResult && pollResult.status === 'fulfilled') {
results.push({ success: true, ...tasks[i], ...pollResult.value })
} else {
const reason = pollResult?.reason?.message || '生成失败'
results.push({ success: false, ...tasks[i], error: reason })
}
}
const ok = results.filter(r => r.success).length
console.log(`\n✨ 批量完成: ${ok}/${tasks.length} 成功`)
// 输出 manifest.json供 capcut_assemble.js 使用,文案与视频一一对应)
const manifestItems = results
.filter(r => r.success && r.files && r.files.length > 0)
.map(r => {
const item = {
file: path.basename(r.files[0]),
duration: 6, // Grok clips have a fixed length of 6 seconds
}
// Keep the original caption text (text or caption field)
if (r.text) item.text = r.text
if (r.caption) item.caption = r.caption
if (r.keyword) item.keyword = r.keyword
if (r.keywordColor) item.keywordColor = r.keywordColor
return item
})
if (manifestItems.length > 0 && !options.skipManifestWrite) {
const manifestPath = path.join(outputDir, 'manifest.json')
const manifest = { items: manifestItems }
fs.writeFileSync(manifestPath, JSON.stringify(manifest, null, 2))
console.log(` 已生成 manifest.json(${manifestItems.length} 条,文案与视频对应)`)
}
return results
}
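/*
Example (not part of the original file). Phase 1 of `batchGenerate` submits
tasks in chunks of `concurrency` and uses `Promise.allSettled` so that one
failed submission does not abort the rest. The sketch below isolates that
pattern. `runInChunks` and `fakeSubmit` are hypothetical names used only for
this demo, and the task ids are made up.

```javascript
// Run `worker` over `items` in chunks of `concurrency`. Never rejects:
// each slot holds { ok, value } or { ok, error }, in input order.
async function runInChunks(items, concurrency, worker) {
  const out = []
  for (let i = 0; i < items.length; i += concurrency) {
    const chunk = items.slice(i, i + concurrency)
    const settled = await Promise.allSettled(chunk.map((item, j) => worker(item, i + j)))
    for (const s of settled) {
      out.push(s.status === 'fulfilled'
        ? { ok: true, value: s.value }
        : { ok: false, error: s.reason.message })
    }
  }
  return out
}

// Fake submit: the third task fails, the others return a made-up task id.
const fakeSubmit = async (task, idx) => {
  if (idx === 2) throw new Error('submit rejected')
  return `task-${idx}`
}

const chunkDemo = runInChunks(['a', 'b', 'c', 'd', 'e'], 3, fakeSubmit).then(results => {
  console.log(results)
  return results
})
```

Because results are pushed chunk by chunk in input order, position `i` in
the output always corresponds to item `i`, which makes merging with the
original task list straightforward.
*/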
/**
* 轮询 + 失败重试(单任务)
*/
async function pollWithRetry(taskId, prompt, options = {}) {
let currentTaskId = taskId
let currentPrompt = prompt
let lastError = null
for (let attempt = 0; attempt <= Config.maxRetries; attempt++) {
try {
if (attempt > 0) {
currentPrompt = PromptOptimizer.optimize(prompt, lastError, attempt)
console.log(`\n 🔄 重试 (任务 ${currentTaskId.substring(0, 8)}...): ${currentPrompt.substring(0, 50)}`)
currentTaskId = await GrokApi.create(
options.imageUrl || '',
currentPrompt,
{ aspectRatio: options.aspectRatio, size: options.size }
)
}
const result = await GrokApi.poll(currentTaskId)
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const videoFile = path.join(options.outputDir || './output', `${timestamp}_grok.mp4`)
await download(result.videoUrl, videoFile)
let thumbnailFile = null
if (result.thumbnailUrl) {
thumbnailFile = path.join(options.outputDir || './output', `${timestamp}_thumb.jpg`)
try { await download(result.thumbnailUrl, thumbnailFile) } catch (_) {}
}
return {
taskId: currentTaskId,
prompt: currentPrompt,
originalPrompt: prompt,
attempts: attempt + 1,
file: videoFile,
files: [videoFile],
duration: 6,
thumbnail: thumbnailFile,
}
} catch (err) {
lastError = err.message
if (attempt < Config.maxRetries) {
await new Promise(r => setTimeout(r, 5000))
}
}
}
throw new Error(`重试 ${Config.maxRetries} 次后仍失败: ${lastError}`)
}
// ============================================================================
// CLI
// ============================================================================
function showHelp() {
console.log(`
🎬 Grok Video Generator - 图生视频工具
用法:
node grok-video-generator.js --image <url> --prompt "指令" [options]
node grok-video-generator.js batch <manifest.json|tasks.json> [options]
选项:
-o, --output <dir> 输出目录 (默认: ./output)
-a, --ar <ratio> 宽高比 (默认: 9:16)
-s, --size <size> 分辨率 (默认: 720P)
--mode <mode> 生成模式 (默认: custom)
--model <model> 模型名称 (默认: grok-video-3)
--retries <n> 失败重试次数 (默认: 3)
-h, --help 帮助
示例:
node grok-video-generator.js --image http://img.com/ref.jpg --prompt "zoom in"
node grok-video-generator.js batch ./manifest.json -o ./videos
manifest.json 格式(由生图阶段生成,含 videoPrompt):
{
"items": [
{
"file": "img_001.png",
"url": "http://...", // 图片 URL(OSS 上传后的地址)
"text": "这段视频的字幕文案", // CapCut 字幕
"keyword": "关键词", // 字幕高亮词
"videoPrompt": "slow zoom in on subject, cinematic" // 视频运动提示词
}
]
}
videoPrompt 在生图阶段由 AI 一并生成,描述视频运动而非图片内容。
批量完成后自动输出 manifest.json(含 text/duration),供 capcut_assemble.js 直接使用。
`)
}
async function main() {
const args = process.argv.slice(2)
if (args.includes('-h') || args.includes('--help') || args.length === 0) {
showHelp()
return
}
let command = 'single'
let params = []
const options = {
outputDir: './output',
aspectRatio: '9:16',
size: '720P',
mode: 'custom',
imageUrl: '',
prompt: '',
}
let i = 0
if (args[0] === 'batch') {
command = 'batch'
i = 1
}
while (i < args.length) {
const arg = args[i]
if (arg === '-o' || arg === '--output') {
options.outputDir = args[++i]
} else if (arg === '-a' || arg === '--ar') {
options.aspectRatio = args[++i]
} else if (arg === '-s' || arg === '--size') {
options.size = args[++i]
} else if (arg === '--mode') {
options.mode = args[++i]
} else if (arg === '--image') {
options.imageUrl = args[++i]
} else if (arg === '--prompt') {
options.prompt = args[++i]
} else if (arg === '--retries') {
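const retries = parseInt(args[++i], 10)
// Ignore a missing or non-numeric value instead of storing NaN
if (Number.isInteger(retries) && retries >= 0) Config.maxRetries = retries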
} else {
params.push(arg)
}
i++
}
if (command === 'batch') {
const filePath = params[0]
if (!filePath || !fs.existsSync(filePath)) {
console.error('Please provide the path to a manifest.json or tasks.json file')
process.exit(1)
}
const tasks = JSON.parse(fs.readFileSync(filePath, 'utf-8'))
await batchGenerate(tasks, options)
} else {
if (!options.imageUrl) {
console.error('Please provide --image (an image URL)')
process.exit(1)
}
if (!options.prompt) {
console.error('Please provide --prompt')
process.exit(1)
}
await generate(options.imageUrl, options.prompt, options)
}
}
// ============================================================================
// Exports
// ============================================================================
module.exports = { generate, batchGenerate, pollWithRetry, GrokApi, PromptOptimizer }
if (require.main === module) {
main().catch(err => {
console.error(`\n❌ Error: ${err.message}`)
process.exit(1)
})
}
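
The retry loop at the top of this section (catch the error, wait 5 seconds, try again up to `Config.maxRetries` times) could be factored into a small reusable helper. The following is a minimal sketch and not code from this repository: `withRetry`, `retries`, and `delayMs` are hypothetical names, and the fixed delay simply mirrors the 5000 ms wait used above.

```javascript
// Hypothetical helper (not part of grok-video-generator.js):
// run an async function, retrying on failure with a fixed delay.
async function withRetry(fn, { retries = 3, delayMs = 5000 } = {}) {
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      // Pass the attempt index so callers can adjust the request per attempt
      return await fn(attempt)
    } catch (err) {
      lastError = err
      // Wait before the next attempt, but not after the final one
      if (attempt < retries) {
        await new Promise(r => setTimeout(r, delayMs))
      }
    }
  }
  throw new Error(`Still failing after ${retries} retries: ${lastError.message}`)
}
```

A caller would wrap one submit-and-poll cycle in `fn`. Keeping the delay out of the final attempt means a permanently failing task reports its error without an extra wait.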


@@ -0,0 +1,438 @@
#!/usr/bin/env node
/**
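* MJ Image Generator - Midjourney image generation tool
*
* Features:
* - Submit an imagine task
* - Poll until it completes
* - Download the 4-in-1 grid image
* - Automatically split it into 4 separate images
* - Support reference images (sent as --sref style references)
*
* Usage: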
* node mj-image-generator.js "a cute cat" -o ./output
* node mj-image-generator.js "cyberpunk city" -o ./output -r http://example.com/ref.jpg
* node mj-image-generator.js batch ./prompts.txt -o ./output
*/
const fs = require('fs')
const path = require('path')
const https = require('https')
const http = require('http')
const sharp = require('sharp')
// ============================================================================
// Configuration
// ============================================================================
function loadConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
if (fs.existsSync(configPath)) {
return JSON.parse(fs.readFileSync(configPath, 'utf-8'))
}
return {}
}
const cfg = loadConfig()
const Config = {
baseUrl: cfg.mjApiBaseUrl,
apiKey: cfg.mjApiKey || '',
pollInterval: 5000, // poll every 5 seconds
maxPollTime: 300000, // wait at most 5 minutes
}
// ============================================================================
// API calls
// ============================================================================
const MJApi = {
/**
* Submit an imagine task
*/
async submit(prompt, options = {}) {
const { referenceImages = [], botType = 'mj', aspectRatio = '', mjParams = '', styleWeight = 100 } = options
let finalPrompt = prompt
if (referenceImages.length > 0) {
const srefSection = `--sref ${referenceImages.join(' ')} --sw ${styleWeight}`
finalPrompt = `${prompt} ${srefSection}`.trim()
}
if (aspectRatio) {
finalPrompt = `${finalPrompt} --ar ${aspectRatio}`
}
if (mjParams) {
finalPrompt = `${finalPrompt} ${mjParams}`
}
const body = {
prompt: finalPrompt,
base64Array: [],
botType,
}
console.log(`\n📡 Submitting MJ task`)
console.log(` Prompt: ${finalPrompt.substring(0, 80)}...`)
console.log(` Reference images: ${referenceImages.length}`)
const res = await fetch(`${Config.baseUrl}/mj/submit/imagine`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${Config.apiKey}`,
},
body: JSON.stringify(body),
})
const result = await res.json()
if (result.code !== 1 && result.code !== '1') {
throw new Error(`MJ submit failed: code=${result.code}, ${result.description || JSON.stringify(result)}`)
}
const taskId = result.result
console.log(` Task ID: ${taskId}`)
return taskId
},
/**
* Poll the task status until it succeeds, fails, or times out
*/
async poll(taskId) {
const startTime = Date.now()
let lastProgress = ''
console.log(`\n⏳ Waiting for MJ to finish...`)
while (Date.now() - startTime < Config.maxPollTime) {
const res = await fetch(`${Config.baseUrl}/mj/task/${taskId}/fetch`, {
headers: { 'Authorization': `Bearer ${Config.apiKey}` },
})
const task = await res.json()
const status = task.status
if (status === 'SUCCESS') {
console.log(`\n✅ Generation complete!`)
console.log(` Image URL: ${task.imageUrl}`)
return {
success: true,
imageUrl: task.imageUrl,
prompt: task.prompt || task.promptEn,
}
}
if (status === 'FAILURE') {
const errMsg = task.failReason || 'unknown reason'
throw new Error(`MJ generation failed: ${errMsg}`)
}
// Show progress
const progress = task.progress || ''
if (progress !== lastProgress) {
lastProgress = progress
process.stdout.write(` Progress: ${progress}% Status: ${status}\r`)
}
await new Promise(r => setTimeout(r, Config.pollInterval))
}
throw new Error(`MJ generation timed out (${Config.maxPollTime / 1000}s)`)
},
}
// ============================================================================
// Image processing
// ============================================================================
const ImageUtils = {
/**
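* Download an image to a local file
*/
async download(url, outputPath) {
const protocol = url.startsWith('https') ? https : http
return new Promise((resolve, reject) => {
const file = fs.createWriteStream(outputPath)
// Close the stream, delete the partial file, then reject
const fail = (err) => {
file.close()
if (fs.existsSync(outputPath)) fs.unlinkSync(outputPath)
reject(err)
}
protocol.get(url, (response) => {
// Follow redirects (the Location header may be relative)
if (response.statusCode >= 300 && response.statusCode < 400 && response.headers.location) {
response.resume()
file.close()
if (fs.existsSync(outputPath)) fs.unlinkSync(outputPath)
const nextUrl = new URL(response.headers.location, url).toString()
return ImageUtils.download(nextUrl, outputPath).then(resolve).catch(reject)
}
// Reject non-200 responses instead of saving an error page as an image
if (response.statusCode !== 200) {
response.resume()
return fail(new Error(`Download failed: HTTP ${response.statusCode}`))
}
response.pipe(file)
file.on('finish', () => {
file.close()
resolve(outputPath)
})
}).on('error', fail)
})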
},
/**
* Split a 4-in-1 grid image into 4 separate images
*/
async split4(gridImagePath, outputDir, prefix = 'mj') {
const image = sharp(gridImagePath)
const metadata = await image.metadata()
const { width, height } = metadata
// The MJ 4-in-1 image is a 2x2 grid, so each cell is about half the width and height
const halfW = Math.floor(width / 2)
const halfH = Math.floor(height / 2)
const positions = [
{ name: `${prefix}_1`, x: 0, y: 0 }, // top-left
{ name: `${prefix}_2`, x: halfW, y: 0 }, // top-right
{ name: `${prefix}_3`, x: 0, y: halfH }, // bottom-left
{ name: `${prefix}_4`, x: halfW, y: halfH }, // bottom-right
]
const files = []
for (const pos of positions) {
const outputPath = path.join(outputDir, `${pos.name}.png`)
await sharp(gridImagePath)
.extract({ left: pos.x, top: pos.y, width: halfW, height: halfH })
.toFile(outputPath)
files.push(outputPath)
}
return files
},
}
// ============================================================================
// Core flow
// ============================================================================
async function generate(prompt, options = {}) {
const {
outputDir = './output',
referenceImages = [],
aspectRatio = '',
mjParams = '',
split = true,
keepGrid = false,
} = options
if (!Config.baseUrl) {
throw new Error('mjApiBaseUrl is not configured; add it to config.json')
}
if (!Config.apiKey) {
throw new Error('mjApiKey is not configured; add it to config.json')
}
fs.mkdirSync(outputDir, { recursive: true })
// 1. Submit the task
const taskId = await MJApi.submit(prompt, { referenceImages, aspectRatio, mjParams, styleWeight: options.styleWeight })
// 2. Poll until done
const result = await MJApi.poll(taskId)
// 3. Download
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const gridFile = path.join(outputDir, `${timestamp}_grid.png`)
await ImageUtils.download(result.imageUrl, gridFile)
console.log(` Downloaded: ${gridFile}`)
// 4. Split
const allFiles = [gridFile]
if (split) {
const prefix = timestamp
const splitFiles = await ImageUtils.split4(gridFile, outputDir, prefix)
allFiles.push(...splitFiles)
console.log(` Split into ${splitFiles.length} images`)
if (!keepGrid) {
fs.unlinkSync(gridFile)
allFiles.shift()
console.log(` Removed the grid image`)
}
}
return {
success: true,
taskId,
imageUrl: result.imageUrl,
files: allFiles,
}
}
/**
* Batch generation (concurrent submission + parallel polling)
*/
async function batchGenerate(prompts, options = {}) {
const concurrency = options.concurrency || prompts.length
const results = new Array(prompts.length).fill(null)
// Submit in batches of `concurrency` parallel tasks
for (let batchStart = 0; batchStart < prompts.length; batchStart += concurrency) {
const batchEnd = Math.min(batchStart + concurrency, prompts.length)
const batchIndices = []
for (let i = batchStart; i < batchEnd; i++) batchIndices.push(i)
// Submit every task in the batch in parallel
console.log(`\n📡 Submitting batch [${batchStart + 1}-${batchEnd}/${prompts.length}]...`)
const taskIds = await Promise.all(batchIndices.map(async (i) => {
try {
const taskId = await MJApi.submit(prompts[i], {
referenceImages: options.referenceImages,
aspectRatio: options.aspectRatio,
mjParams: options.mjParams,
styleWeight: options.styleWeight,
})
return { i, taskId }
} catch (err) {
console.error(` [${i + 1}] ❌ Submit failed: ${err.message}`)
results[i] = { success: false, prompt: prompts[i], error: err.message }
return { i, taskId: null }
}
}))
// Keep only the tasks that were submitted successfully and poll them in parallel
const activeTasks = taskIds.filter(t => t.taskId)
console.log(` Submitted ${activeTasks.length} tasks, waiting for them in parallel...\n`)
const pollResults = await Promise.all(activeTasks.map(async ({ i, taskId }) => {
try {
const result = await MJApi.poll(taskId)
// Download + split. The task index is part of the file prefix so that
// tasks finishing in the same millisecond do not overwrite each other.
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const prefix = `${timestamp}_${i + 1}`
const gridFile = path.join(options.outputDir || './output', `${prefix}_grid.png`)
fs.mkdirSync(path.dirname(gridFile), { recursive: true })
await ImageUtils.download(result.imageUrl, gridFile)
const allFiles = [gridFile]
if (options.split !== false) {
const splitFiles = await ImageUtils.split4(gridFile, options.outputDir || './output', prefix)
allFiles.push(...splitFiles)
if (!options.keepGrid) {
fs.unlinkSync(gridFile)
allFiles.shift()
}
}
console.log(` [${i + 1}/${prompts.length}] ✅ Done`)
return { i, success: true, prompt: prompts[i], taskId, imageUrl: result.imageUrl, files: allFiles }
} catch (err) {
console.error(` [${i + 1}/${prompts.length}] ❌ Failed: ${err.message}`)
return { i, success: false, prompt: prompts[i], error: err.message }
}
}))
for (const r of pollResults) results[r.i] = r
}
const ok = results.filter(r => r && r.success).length
console.log(`\n✨ Batch complete: ${ok}/${prompts.length} succeeded`)
return results
}
// ============================================================================
// CLI
// ============================================================================
function showHelp() {
console.log(`
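🎨 MJ Image Generator - Midjourney image generation tool
Usage:
node mj-image-generator.js <prompt> [options]
node mj-image-generator.js batch <file> [options]
Options:
-o, --output <dir> Output directory (default: ./output)
-r, --ref <urls> Reference image URLs, comma-separated
-a, --ar <ratio> Aspect ratio (1:1, 16:9, 9:16, 3:4, 4:3, etc.)
-c, --concurrency <n> Concurrency (default: all in parallel)
--mj-params <params> Extra Midjourney parameters appended to the prompt
--sw <n> Style weight for reference images (default: 100)
--no-split Do not split the 4-in-1 grid
--keep-grid Keep the original grid image
-h, --help Show this help
Examples: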
node mj-image-generator.js "a cute cat" -o ./cats
node mj-image-generator.js "cyberpunk city" -a 16:9
node mj-image-generator.js "portrait" -r http://img.com/ref.jpg -a 9:16
node mj-image-generator.js batch ./prompts.txt -o ./batch
`)
}
async function main() {
const args = process.argv.slice(2)
if (args.includes('-h') || args.includes('--help') || args.length === 0) {
showHelp()
return
}
let command = 'generate'
let params = []
const options = { outputDir: './output', split: true, keepGrid: false, referenceImages: [], aspectRatio: '', concurrency: 0, mjParams: '', styleWeight: 100 }
let i = 0
if (args[0] === 'batch') {
command = 'batch'
i = 1
}
while (i < args.length) {
const arg = args[i]
if (arg === '-o' || arg === '--output') {
options.outputDir = args[++i]
} else if (arg === '-a' || arg === '--ar') {
options.aspectRatio = args[++i]
} else if (arg === '-r' || arg === '--ref') {
options.referenceImages = args[++i].split(',').map(s => s.trim()).filter(Boolean)
} else if (arg === '--no-split') {
options.split = false
} else if (arg === '--keep-grid') {
options.keepGrid = true
} else if (arg === '-c' || arg === '--concurrency') {
options.concurrency = parseInt(args[++i], 10) || 0
} else if (arg === '--mj-params') {
options.mjParams = args[++i]
} else if (arg === '--sw') {
options.styleWeight = parseInt(args[++i], 10) || 100
} else {
params.push(arg)
}
i++
}
if (command === 'batch') {
const filePath = params[0]
if (!filePath || !fs.existsSync(filePath)) {
console.error('Please provide the path to a prompts file')
process.exit(1)
}
const prompts = fs.readFileSync(filePath, 'utf-8')
.split('\n').filter(l => l.trim()).map(l => l.trim())
await batchGenerate(prompts, options)
} else {
const prompt = params.join(' ')
if (!prompt) {
console.error('Please provide a prompt')
process.exit(1)
}
await generate(prompt, options)
}
}
// ============================================================================
// Exports
// ============================================================================
module.exports = { generate, batchGenerate, MJApi, ImageUtils }
if (require.main === module) {
main().catch(err => {
console.error(`\n❌ Error: ${err.message}`)
process.exit(1)
})
}
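
`ImageUtils.split4` above derives four crop rectangles from the grid's width and height. As a minimal sketch of just that geometry, separated from `sharp` so it can be checked in isolation: `gridRegions` is a hypothetical name that does not exist in this file, and the result shape is chosen to match the `{ left, top, width, height }` object that `split4` passes to `extract()`.

```javascript
// Hypothetical helper (not part of mj-image-generator.js):
// compute the four crop regions of a 2x2 grid image.
function gridRegions(width, height) {
  const halfW = Math.floor(width / 2)
  const halfH = Math.floor(height / 2)
  // Order matches split4: top-left, top-right, bottom-left, bottom-right
  return [
    { left: 0, top: 0 },
    { left: halfW, top: 0 },
    { left: 0, top: halfH },
    { left: halfW, top: halfH },
  ].map(pos => ({ ...pos, width: halfW, height: halfH }))
}

// Bottom-right cell of a 2048x2048 grid
console.log(gridRegions(2048, 2048)[3])
// prints { left: 1024, top: 1024, width: 1024, height: 1024 }
```

With odd dimensions, `Math.floor` drops the last pixel row or column, which is the same behavior as `split4`.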


@@ -0,0 +1,171 @@
#!/usr/bin/env node
/**
* OSS file upload tool
*
* Uploads images/videos to Alibaba Cloud OSS and returns a signed URL.
* Supports single-file and batch uploads.
*
* Usage:
* node oss-upload.js ./image.png
* node oss-upload.js ./video.mp4 --dir videos/
* node oss-upload.js batch ./manifest.json
*/
const OSS = require('ali-oss')
const path = require('path')
const fs = require('fs')
// ============================================================================
// Configuration
// ============================================================================
function getConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
const config = JSON.parse(fs.readFileSync(configPath, 'utf-8'))
if (!config.ossRegion || !config.ossAccessKeyId || !config.ossAccessKeySecret || !config.ossBucket) {
console.error('config.json must set ossRegion, ossAccessKeyId, ossAccessKeySecret, and ossBucket')
process.exit(1)
}
return config
}
function createClient(config) {
return new OSS({
region: config.ossRegion,
accessKeyId: config.ossAccessKeyId,
accessKeySecret: config.ossAccessKeySecret,
bucket: config.ossBucket,
secure: true,
})
}
// ============================================================================
// Upload
// ============================================================================
async function uploadFile(filePath, options = {}) {
const config = getConfig()
const client = createClient(config)
if (!fs.existsSync(filePath)) {
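throw new Error(`File not found: ${filePath}`)
}
// Make sure the folder prefix ends with "/" so "--dir videos" and "--dir videos/" behave the same
const rawFolder = [options.folder, config.ossFolder].find(f => typeof f === 'string' && f) || 'tmp/'
const folder = rawFolder.endsWith('/') ? rawFolder : `${rawFolder}/`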
const basename = options.name || path.basename(filePath)
const ossPath = `${folder}${basename}`
const buffer = fs.readFileSync(filePath)
await client.put(ossPath, buffer)
const expires = config.ossExpires || 31536000
const url = client.signatureUrl(ossPath, { expires })
return { url, ossPath, size: buffer.length }
}
async function uploadBuffer(buffer, options = {}) {
const config = getConfig()
const client = createClient(config)
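// Make sure the folder prefix ends with "/" before it is joined with the file name
const rawFolder = [options.folder, config.ossFolder].find(f => typeof f === 'string' && f) || 'tmp/'
const folder = rawFolder.endsWith('/') ? rawFolder : `${rawFolder}/`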
const basename = options.name || `${Date.now()}${options.ext || '.png'}`
const ossPath = `${folder}${basename}`
await client.put(ossPath, buffer)
const expires = config.ossExpires || 31536000
const url = client.signatureUrl(ossPath, { expires })
return { url, ossPath }
}
// ============================================================================
// Batch upload (reads the file list from manifest.json)
// ============================================================================
async function batchUpload(manifestPath, baseDir) {
const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf-8'))
const dir = baseDir || path.dirname(manifestPath)
const results = {}
for (const item of manifest.items) {
const filePath = path.join(dir, item.file)
if (!fs.existsSync(filePath)) continue
const name = path.basename(item.file)
try {
const { url } = await uploadFile(filePath, { name })
results[item.file] = url
console.log(` OK: ${name}`)
} catch (err) {
console.error(` FAIL: ${name} - ${err.message}`)
}
}
return results
}
// ============================================================================
// CLI
// ============================================================================
function parseArgs(argv) {
const args = { _: [] }
for (let i = 0; i < argv.length; i++) {
if (argv[i].startsWith('--')) {
const key = argv[i].slice(2)
const val = argv[i + 1]
if (val && !val.startsWith('--')) { args[key] = val; i++ }
else args[key] = true
} else {
args._.push(argv[i])
}
}
return args
}
async function main() {
const args = parseArgs(process.argv.slice(2))
const cmd = args._[0]
if (!cmd) {
console.log('Usage: node oss-upload.js <file> [--dir folder] [--name filename]')
console.log('       node oss-upload.js batch <manifest.json> [--dir <baseDir>]')
process.exit(0)
}
if (cmd === 'batch') {
const manifest = args._[1]
if (!manifest) { console.error('Please specify a manifest.json'); process.exit(1) }
console.log(`Batch upload: ${manifest}`)
const results = await batchUpload(manifest, args.dir)
console.log(`\nDone: ${Object.keys(results).length} files`)
// Merge the new URLs into urls.json
const urlsPath = path.join(args.dir || path.dirname(manifest), 'urls.json')
const existing = fs.existsSync(urlsPath) ? JSON.parse(fs.readFileSync(urlsPath, 'utf-8')) : {}
Object.assign(existing, results)
fs.writeFileSync(urlsPath, JSON.stringify(existing, null, 2))
console.log(`URLs written to: ${urlsPath}`)
} else {
const filePath = path.resolve(cmd)
console.log(`Uploading: ${filePath}`)
const { url, ossPath, size } = await uploadFile(filePath, {
folder: args.dir,
name: args.name,
})
console.log(`\nOSS path: ${ossPath}`)
console.log(`Signed URL: ${url}`)
console.log(`File size: ${(size / 1024).toFixed(1)} KB`)
}
}
module.exports = { uploadFile, uploadBuffer, batchUpload }
if (require.main === module) {
main().catch(err => {
console.error(`Error: ${err.message}`)
process.exit(1)
})
}
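
`uploadFile` and `uploadBuffer` above both build the object key by concatenating a folder prefix and a file name. A minimal sketch of that step as a standalone function, so the slash handling can be tested without an OSS client: `buildOssKey` is a hypothetical name, and stripping leading slashes is an assumption made here on the understanding that OSS object names should not begin with one.

```javascript
// Hypothetical helper (not part of oss-upload.js):
// join a folder prefix and a file name into an OSS object key.
function buildOssKey(folder, name) {
  // Fall back to "tmp/" when no usable folder is given (e.g. a flag passed without a value)
  const base = typeof folder === 'string' && folder ? folder : 'tmp/'
  // Trim slashes on both ends, then add exactly one trailing slash if anything is left
  const trimmed = base.replace(/^\/+|\/+$/g, '')
  const prefix = trimmed ? `${trimmed}/` : ''
  return `${prefix}${name}`
}

console.log(buildOssKey('videos', 'clip.mp4'))  // prints videos/clip.mp4
console.log(buildOssKey('videos/', 'clip.mp4')) // prints videos/clip.mp4
```

Normalizing in one place keeps `--dir videos` and `--dir videos/` equivalent for every caller.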

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,7 @@
{
"dependencies": {
"ali-oss": "^6.21.0",
"axios": "^1.15.2",
"sharp": "^0.34.5"
}
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,175 @@
#!/usr/bin/env node
/**
* Alibaba Cloud Qwen-TTS batch speech synthesis script
*
* Input JSON file format:
* {
* "segments": [
* {"id": 1, "text": "First segment text"},
* {"id": 2, "text": "Second segment text"}
* ],
* "voice": "Cherry", // optional, overrides config
* "output_dir": "./audio" // optional, defaults to ./audio
* }
*
* Output JSON (stdout):
* {
* "segments": [
* {"id": 1, "text": "...", "audio": "./audio/seg_001.wav", "duration": 3.456},
* ...
* ]
* }
*
* Can also be used as a module:
* const { synthesize } = require('./qwen-tts')
* const { filePath, duration } = await synthesize('Hello, world', { voice: 'Cherry' })
*/
const axios = require('axios')
const fs = require('fs')
const path = require('path')
const CONFIG_PATH = path.join(__dirname, '..', '..', 'config.json')
function loadConfig() {
if (!fs.existsSync(CONFIG_PATH)) throw new Error(`config.json not found: ${CONFIG_PATH}`)
return JSON.parse(fs.readFileSync(CONFIG_PATH, 'utf-8'))
}
/**
* Synthesize a single segment (non-streaming)
* @param {string} text - the text to synthesize
* @param {object} options - { voice, model, language, outputDir, id }
* @returns {{ filePath: string, duration: number }}
*/
async function synthesize(text, options = {}) {
const config = loadConfig()
const apiKey = options.apiKey || config.ttsApiKey
if (!apiKey) throw new Error('ttsApiKey is not configured; set it in config.json')
const baseUrl = (options.apiBaseUrl || config.ttsApiBaseUrl || 'https://dashscope.aliyuncs.com/api/v1').replace(/\/$/, '')
const model = options.model || config.ttsModel || 'qwen-tts'
const voice = options.voice || config.ttsVoice || 'Cherry'
const language = options.language || config.ttsLanguage || 'Chinese'
const outputDir = options.outputDir || './audio'
fs.mkdirSync(outputDir, { recursive: true })
// Make sure the text ends with sentence-final punctuation so the TTS produces natural intonation and a trailing pause
text = text.trimEnd()
if (!/[。!?.!?…]$/.test(text)) text += '。'
const url = `${baseUrl}/services/aigc/multimodal-generation/generation`
let res
try {
res = await axios.post(url, {
model,
input: {
text,
voice,
language_type: language,
},
}, {
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
timeout: 60000,
})
} catch (err) {
const detail = err.response?.data
throw new Error(`TTS API error: ${err.message}${detail ? ' ' + JSON.stringify(detail) : ''}`)
}
const audioUrl = res.data?.output?.audio?.url
if (!audioUrl) {
throw new Error(`TTS API did not return an audio URL: ${JSON.stringify(res.data)}`)
}
// Download the audio to a local file
const id = options.id || 1
const fileName = `seg_${String(id).padStart(3, '0')}.wav`
const filePath = path.resolve(outputDir, fileName)
const audioRes = await axios.get(audioUrl, { responseType: 'arraybuffer', timeout: 30000 })
const wavBuffer = Buffer.from(audioRes.data)
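// Append 0.3 s of silence (a breathing gap between sentences).
// The arithmetic below assumes 24 kHz, 16-bit mono PCM with a 44-byte WAV header.
const silenceSec = options.silencePadding !== undefined ? options.silencePadding : 0.3
// Round to whole 2-byte samples so the data chunk stays aligned
const silenceBytes = Math.round(24000 * silenceSec) * 2
const silenceBuffer = Buffer.alloc(silenceBytes, 0)
const finalBuffer = Buffer.concat([wavBuffer, silenceBuffer])
// Update the sizes stored in the WAV header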
finalBuffer.writeUInt32LE(finalBuffer.length - 8, 4)
finalBuffer.writeUInt32LE(wavBuffer.length - 44 + silenceBytes, 40)
fs.writeFileSync(filePath, finalBuffer)
const duration = (finalBuffer.length - 44) / (24000 * 2)
return { filePath, duration }
}
/**
* Batch speech synthesis
* @param {Array<{id: number, text: string}>} segments
* @param {object} options - { voice, outputDir }
* @returns {Array<{id: number, text: string, audio: string, duration: number}>}
*/
async function synthesizeBatch(segments, options = {}) {
const results = []
for (const seg of segments) {
console.error(` Synthesizing #${seg.id}: ${seg.text.substring(0, 30)}...`)
const { filePath, duration } = await synthesize(seg.text, {
...options,
id: seg.id,
})
results.push({
id: seg.id,
text: seg.text,
audio: filePath,
duration: Math.round(duration * 1000) / 1000,
})
// Wait 0.5 seconds between requests to avoid rate limiting
await new Promise(r => setTimeout(r, 500))
}
return results
}
// CLI entry point
async function main() {
const inputJson = process.argv[2]
if (!inputJson) {
console.error('Usage: node qwen-tts.js <input.json>')
console.error('')
console.error('input.json format:')
console.error(JSON.stringify({
segments: [{ id: 1, text: 'text to synthesize' }],
voice: 'Cherry',
output_dir: './audio',
}, null, 2))
process.exit(1)
}
const config = JSON.parse(fs.readFileSync(inputJson, 'utf-8'))
const segments = config.segments
const options = {
voice: config.voice,
outputDir: config.output_dir || './audio',
}
const results = await synthesizeBatch(segments, options)
const output = { segments: results }
process.stdout.write(JSON.stringify(output, null, 2) + '\n')
}
if (require.main === module) {
main().catch(err => {
console.error('TTS synthesis failed:', err.message)
process.exit(1)
})
}
module.exports = { synthesize, synthesizeBatch }
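
`synthesize` above pads the audio with silence using hard-coded constants (24000 Hz, 2 bytes per sample). As a hedged sketch of the same idea that reads the rate from the file instead: `appendSilence` is a hypothetical name, and the function assumes the canonical 44-byte PCM WAV header (a 16-byte `fmt ` chunk followed directly by the `data` chunk), which may not hold for every file the API returns.

```javascript
// Hypothetical helper (not part of qwen-tts.js): append silence to a PCM WAV buffer.
// Assumes the canonical 44-byte header: RIFF, "fmt " chunk of size 16, then "data".
function appendSilence(wav, seconds) {
  const byteRate = wav.readUInt32LE(28)   // bytes of audio per second
  const blockAlign = wav.readUInt16LE(32) // bytes per sample frame
  // Round to whole sample frames so the data chunk stays aligned
  const frames = Math.round((byteRate / blockAlign) * seconds)
  const silence = Buffer.alloc(frames * blockAlign, 0)
  const out = Buffer.concat([wav, silence])
  out.writeUInt32LE(out.length - 8, 4)   // RIFF chunk size
  out.writeUInt32LE(out.length - 44, 40) // data chunk size
  return { buffer: out, duration: (out.length - 44) / byteRate }
}
```

Reading `byteRate` and `blockAlign` from offsets 28 and 32 means the duration stays correct if the service ever returns a different sample rate, at the cost of trusting the header layout.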


@@ -0,0 +1,336 @@
#!/usr/bin/env node
/**
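* Sync a CapCut Mate draft to the local JianYing app (standalone version)
*
* Fetch the draft's file list from the API → download into the local JianYing draft directory → rewrite paths → localize remote materials → register + trigger a scan
* No dependency on Electron or on the capcut-mate Python environment.
*
* Usage:
* node sync-to-jianying.js <draft_url> [--name "draft name"]
*
* draft_url format: http://xxx/openapi/capcut-mate/v1/get_draft?draft_id=xxx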
*/
const axios = require('axios')
const path = require('path')
const fs = require('fs')
const { createWriteStream } = require('fs')
const fsp = fs.promises
const { execFile } = require('child_process')
// ============================================================================
// Configuration
// ============================================================================
function getConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
return JSON.parse(fs.readFileSync(configPath, 'utf-8'))
}
// ============================================================================
// Utility functions
// ============================================================================
function isHttpUrl(value) {
if (!value || typeof value !== 'string') return false
try {
const parsed = new URL(value)
return parsed.protocol === 'http:' || parsed.protocol === 'https:'
} catch { return false }
}
function extractDraftId(url) {
const match = url.match(/draft_id=([^&]+)/)
return match ? match[1] : null
}
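function winPath(p) {
// Convert to backslashes only on Windows; on macOS the paths keep their forward slashes
return process.platform === 'win32' ? p.replace(/\//g, '\\') : p
}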
function getFileExtFromUrl(url, fallback = '.bin') {
try { return path.extname(new URL(url).pathname) || fallback }
catch { return fallback }
}
// ============================================================================
// Download
// ============================================================================
async function downloadStream(url, filePath) {
await fsp.mkdir(path.dirname(filePath), { recursive: true })
const res = await axios.get(url, { responseType: 'stream', timeout: 60000 })
if (res.status !== 200) throw new Error(`HTTP ${res.status}: ${url}`)
return new Promise((resolve, reject) => {
const writer = res.data.pipe(createWriteStream(filePath, { flags: 'w' }))
writer.on('close', resolve)
writer.on('error', reject)
res.data.on('error', reject)
})
}
// ============================================================================
// Path rewriting (core logic taken from desktop-client/download.js)
// ============================================================================
function updatePathValue(obj, key, targetDir, draftId) {
const oldVal = obj[key]
if (!oldVal || typeof oldVal !== 'string') return
const idIndex = oldVal.indexOf(draftId)
if (idIndex === -1) return
const relativePath = oldVal.substring(idIndex).replace(/\//g, path.sep)
const cleaned = relativePath.replace(draftId + path.sep, '')
obj[key] = path.join(targetDir, cleaned)
}
function recursivelyUpdatePaths(obj, targetDir, draftId) {
if (Array.isArray(obj)) { obj.forEach(item => recursivelyUpdatePaths(item, targetDir, draftId)); return }
if (obj && typeof obj === 'object') {
if (obj.path && typeof obj.path === 'string') updatePathValue(obj, 'path', targetDir, draftId)
for (const key in obj) {
if (obj.hasOwnProperty(key)) recursivelyUpdatePaths(obj[key], targetDir, draftId)
}
}
}
// ============================================================================
// Localize remote materials (download http/https URL materials to local disk)
// ============================================================================
async function localizeRemoteMaterials(materials, draftDir) {
if (!materials || typeof materials !== 'object') return
const supportedTypes = ['videos', 'audios']
const cache = new Map()
// Collect every material that needs to be downloaded
const downloadTasks = []
for (const matType of supportedTypes) {
const list = materials[matType]
if (!Array.isArray(list)) continue
for (const item of list) {
if (!item || typeof item !== 'object') continue
if (!isHttpUrl(item.path)) continue
const subDir = matType === 'videos'
? (item.type === 'photo' ? 'images' : 'videos')
: matType === 'audios' ? 'audios' : 'misc'
const ext = getFileExtFromUrl(item.path, matType === 'audios' ? '.mp3' : '.mp4')
const baseName = (item.material_name || item.name || item.id || Date.now()) + ext
const localPath = path.join(draftDir, 'assets', subDir, baseName)
if (!cache.has(item.path)) {
cache.set(item.path, localPath)
downloadTasks.push({ item, url: item.path, localPath, baseName })
}
item.path = cache.get(item.path)
}
}
if (downloadTasks.length === 0) return
// Download in parallel (at most 8 concurrent)
const CONCURRENCY = 8
console.log(` Localizing materials: ${downloadTasks.length} files, ${CONCURRENCY} concurrent...`)
for (let i = 0; i < downloadTasks.length; i += CONCURRENCY) {
const batch = downloadTasks.slice(i, i + CONCURRENCY)
await Promise.all(batch.map(async (task, j) => {
try {
await fsp.mkdir(path.dirname(task.localPath), { recursive: true })
await downloadStream(task.url, task.localPath)
console.log(` [${i + j + 1}/${downloadTasks.length}] ${task.baseName} OK`)
} catch (err) {
console.error(` [${i + j + 1}/${downloadTasks.length}] ${task.baseName} FAIL: ${err.message}`)
}
}))
}
}
// ============================================================================
// Register in root_meta_info.json
// ============================================================================
function registerDraft(draftId, draftName, totalDurationUs) {
const { jianyingDraftPath } = getConfig()
const rootMetaPath = path.join(jianyingDraftPath, 'root_meta_info.json')
const draftDir = path.join(jianyingDraftPath, draftId)
const rootMeta = JSON.parse(fs.readFileSync(rootMetaPath, 'utf-8'))
if (rootMeta.all_draft_store.some(d => d.draft_fold_path === winPath(draftDir))) {
console.log(' Already registered, skipping')
return
}
const now = Date.now() * 1000
rootMeta.all_draft_store.unshift({
cloud_draft_cover: false, cloud_draft_sync: false,
draft_cloud_last_action_download: false, draft_cloud_purchase_info: '',
draft_cloud_template_id: '', draft_cloud_tutorial_info: '',
draft_cloud_videocut_purchase_info: '',
draft_cover: winPath(path.join(draftDir, 'draft_cover.jpg')),
draft_fold_path: winPath(draftDir),
draft_id: draftId, draft_is_ai_shorts: false,
draft_is_cloud_temp_draft: false, draft_is_invisible: false,
draft_is_web_article_video: false,
draft_json_file: winPath(path.join(draftDir, 'draft_content.json')),
draft_name: draftName || draftId, draft_new_version: '',
draft_root_path: winPath(jianyingDraftPath),
draft_timeline_materials_size: 0, draft_type: '',
draft_web_article_video_enter_from: '',
streaming_edit_draft_ready: true,
tm_draft_cloud_completed: '', tm_draft_cloud_entry_id: -1,
tm_draft_cloud_modified: 0, tm_draft_cloud_parent_entry_id: -1,
tm_draft_cloud_space_id: -1, tm_draft_cloud_user_id: -1,
tm_draft_create: now, tm_draft_modified: now, tm_draft_removed: 0,
tm_duration: totalDurationUs || 0,
})
fs.writeFileSync(rootMetaPath, JSON.stringify(rootMeta, null, 4), 'utf-8')
console.log(` Registered: ${draftName || draftId}`)
}
// ============================================================================
// Trigger a JianYing directory scan (robocopy trick)
// ============================================================================
function triggerDirectoryScan(targetDir) {
if (!fs.existsSync(targetDir)) return
const tmpDir = targetDir + '.tmp'
if (process.platform === 'win32') {
execFile('robocopy', [targetDir, tmpDir, '/E', '/COPY:DAT', '/R:1', '/W:1', '/NP', '/NJH', '/NJS'],
{ windowsHide: true }, (err) => {
const code = err ? err.code : 0
if (code >= 8) console.log(` Failed to trigger scan (code ${code})`)
else console.log(' Triggered JianYing scan')
try { fs.rmSync(tmpDir, { recursive: true, force: true }) } catch {}
})
} else if (process.platform === 'darwin') {
execFile('rsync', ['-a', targetDir + '/', tmpDir], (err) => {
if (!err) console.log(' Triggered JianYing scan')
try { fs.rmSync(tmpDir, { recursive: true, force: true }) } catch {}
})
}
}
// ============================================================================
// Main flow
// ============================================================================
async function syncDraft(draftUrl, options = {}) {
const config = getConfig()
const draftId = extractDraftId(draftUrl)
if (!draftId) throw new Error('Could not extract draft_id from the URL')
const jianyingDraftPath = config.jianyingDraftPath
const draftDir = path.join(jianyingDraftPath, draftId)
console.log(`\nSyncing draft to local JianYing`)
console.log(` draft_id: ${draftId}`)
console.log(` Target directory: ${draftDir}\n`)
// 1. Fetch the file list
console.log('[1/4] Fetching file list...')
const res = await axios.get(draftUrl, { timeout: 30000 })
if (res.data.code !== undefined && res.data.code !== 0) {
throw new Error(`API error: ${res.data.message}`)
}
const fileUrls = res.data.files || []
console.log(` Got ${fileUrls.length} files\n`)
if (fileUrls.length === 0) {
console.log(' No files, skipping')
return
}
// 2. Download the files
console.log('[2/4] Downloading files...')
let success = 0, failed = 0
for (let i = 0; i < fileUrls.length; i++) {
const fileUrl = fileUrls[i]
try {
// Resolve the local path
const urlObj = new URL(fileUrl)
const idIndex = urlObj.pathname.indexOf(draftId)
if (idIndex === -1) { failed++; continue }
const relativePath = urlObj.pathname.substring(idIndex).replace(/\//g, path.sep)
const cleaned = relativePath.replace(draftId + path.sep, '')
const filePath = path.join(draftDir, cleaned)
await fsp.mkdir(path.dirname(filePath), { recursive: true })
const fileName = path.basename(filePath)
if (fileUrl.endsWith('.json')) {
// JSON file: download → rewrite paths → localize materials → write
const jsonRes = await axios.get(fileUrl, { timeout: 30000 })
const jsonData = jsonRes.data
if (jsonData?.materials) {
recursivelyUpdatePaths(jsonData.materials, draftDir, draftId)
await localizeRemoteMaterials(jsonData.materials, draftDir)
}
await fsp.writeFile(filePath, JSON.stringify(jsonData, null, 2), 'utf-8')
} else {
// Binary file: stream the download to disk
await downloadStream(fileUrl, filePath)
}
console.log(` [${i + 1}/${fileUrls.length}] OK: ${fileName}`)
success++
} catch (err) {
console.error(` [${i + 1}/${fileUrls.length}] FAIL: ${path.basename(fileUrl)} - ${err.message}`)
failed++
}
}
console.log(` Download finished: ${success}/${fileUrls.length}${failed ? `, ${failed} failed` : ''}\n`)
// 3. Register with JianYing
console.log('[3/4] Registering with JianYing...')
registerDraft(draftId, options.name)
console.log('')
// 4. Trigger a scan
console.log('[4/4] Triggering JianYing scan...')
triggerDirectoryScan(draftDir)
console.log(`\nSync complete! Open JianYing to see the draft.\n`)
}
// ============================================================================
// CLI
// ============================================================================
function parseArgs(argv) {
const args = {}
for (let i = 0; i < argv.length; i++) {
if (argv[i].startsWith('--')) {
const key = argv[i].slice(2)
const val = argv[i + 1]
if (val && !val.startsWith('--')) { args[key] = val; i++ }
else args[key] = true
} else {
if (!args.draftUrl) args.draftUrl = argv[i]
}
}
return args
}
async function main() {
const args = parseArgs(process.argv.slice(2))
if (!args.draftUrl) {
console.log('Usage: node sync-to-jianying.js <draft_url> [--name "draft name"]')
console.log('')
console.log('draft_url: http://xxx/openapi/capcut-mate/v1/get_draft?draft_id=xxx')
process.exit(0)
}
await syncDraft(args.draftUrl, { name: args.name })
}
module.exports = { syncDraft, registerDraft, triggerDirectoryScan }
if (require.main === module) {
main().catch(err => {
console.error(`\nError: ${err.message}`)
process.exit(1)
})
}
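
The download loop in `syncDraft` maps each remote file URL to a path under the local draft directory by keeping whatever follows the draft ID in the URL path. A minimal sketch of that mapping, pulled out as a standalone function: `resolveLocalPath` is a hypothetical helper that does not exist in the original file, and the example URL and directory are made up for illustration.

```javascript
const path = require('path')

// Hypothetical helper (not part of sync-to-jianying.js): mirrors the URL → local
// path steps of the download loop. Returns null when the URL does not contain
// the draft ID, which is the case the loop counts as a failure and skips.
function resolveLocalPath(fileUrl, draftId, draftDir) {
  const { pathname } = new URL(fileUrl)
  const idIndex = pathname.indexOf(draftId)
  if (idIndex === -1) return null
  // Keep only the segments after "<draftId>/" and let path.join pick the OS separator
  const segments = pathname
    .substring(idIndex + draftId.length)
    .split('/')
    .filter(Boolean)
  return path.join(draftDir, ...segments)
}

// Example with assumed values
console.log(
  resolveLocalPath(
    'http://example.com/drafts/abc123/materials/video/clip.mp4',
    'abc123',
    path.join('drafts', 'abc123')
  )
)
```

Splitting on `/` and joining with `path.join` avoids the manual separator replacement, so the same code behaves consistently on Windows and POSIX systems.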

@@ -0,0 +1,594 @@
#!/usr/bin/env node
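/**
* VEO Video Generator - image-to-video tool (Google Veo models)
*
* Features:
* - Submits image-to-video tasks (Veo2/Veo3 models)
* - Supports enhance_prompt (Chinese prompts are automatically translated to English)
* - Supports enable_upsample (super-resolution)
* - Polls until completion (60-300 seconds)
* - On failure, automatically optimizes the prompt and retries (up to 3 times)
* - Parallel batch generation, with captions passed through manifest.json
*
* Usage (--image takes an image URL, not a local path):
* node veo-video-generator.js --image http://img.com/ref.jpg --prompt "zoom in slowly" -o ./output
* node veo-video-generator.js batch ./manifest.json -o ./output
*/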
const fs = require('fs')
const path = require('path')
const https = require('https')
const http = require('http')
// ============================================================================
// Configuration
// ============================================================================
function loadConfig() {
const configPath = path.join(__dirname, '..', '..', 'config.json')
if (fs.existsSync(configPath)) {
return JSON.parse(fs.readFileSync(configPath, 'utf-8'))
}
return {}
}
const cfg = loadConfig()
const Config = {
baseUrl: cfg.veoApiBaseUrl,
apiKey: cfg.veoApiKey || '',
model: cfg.veoModel || 'veo3-fast-frames',
enhancePrompt: cfg.veoEnhancePrompt !== undefined ? cfg.veoEnhancePrompt : true,
enableUpsample: cfg.veoEnableUpsample !== undefined ? cfg.veoEnableUpsample : true,
pollInterval: 10000,
maxPollTime: 600000, // maximum wait per task: 10 minutes (Veo3 may be slower)
maxRetries: 3,
}
// Per-model image count limits
const MODEL_IMAGE_LIMIT = {
'veo2': 1,
'veo2-fast': 1,
'veo3-fast': 1,
'veo3-fast-frames': 1,
}
// veo3 supports only 16:9 and 9:16
const VEO3_RATIOS = ['16:9', '9:16']
// ============================================================================
// Prompt optimization (automatic adjustment on failure)
// ============================================================================
const PromptOptimizer = {
optimize(prompt, failReason, attempt) {
let optimized = prompt
if (attempt === 1) {
optimized = simplifyPrompt(prompt)
console.log(` Retry strategy: simplify the prompt`)
}
if (attempt === 2) {
optimized = `${simplifyPrompt(prompt)}, smooth motion, high quality`
console.log(` Retry strategy: simplify + safe suffix`)
}
if (attempt >= 3) {
optimized = extractCoreSubject(prompt)
console.log(` Retry strategy: minimal prompt`)
}
return optimized
}
}
function simplifyPrompt(prompt) {
const parts = prompt.split(',').map(s => s.trim())
return parts.slice(0, 3).join(', ')
}
function extractCoreSubject(prompt) {
const match = prompt.match(/^([^.!,]+)/)
return match ? match[1].trim() : 'cinematic motion'
}
// ============================================================================
// API
// ============================================================================
const VeoApi = {
async create(imageUrl, prompt, options = {}) {
const {
aspectRatio = '9:16',
model = Config.model,
enhancePrompt = Config.enhancePrompt,
enableUpsample = Config.enableUpsample,
lastFrameUrl = '', // first/last-frame mode: URL of the last frame
} = options
// veo3 aspect ratio validation
if (model.includes('veo3') && !VEO3_RATIOS.includes(aspectRatio)) {
throw new Error(`veo3 models support only the ${VEO3_RATIOS.join('/')} aspect ratios`)
}
// Single-image mode: [imageUrl]; first/last-frame mode: [firstFrame, lastFrame]
const images = []
if (imageUrl) images.push(imageUrl)
if (lastFrameUrl) images.push(lastFrameUrl)
const mode = lastFrameUrl ? 'first/last frame' : 'single image'
const body = {
model,
prompt,
images,
enhance_prompt: enhancePrompt,
enable_upsample: enableUpsample,
aspect_ratio: aspectRatio,
}
console.log(`\n📡 Submitting VEO video task [${mode}]`)
console.log(` Model: ${model}`)
console.log(` Prompt: ${prompt.substring(0, 80)}...`)
if (lastFrameUrl) {
console.log(` First frame: ${imageUrl.substring(0, 60)}...`)
console.log(` Last frame: ${lastFrameUrl.substring(0, 60)}...`)
} else {
console.log(` Reference image: ${imageUrl ? imageUrl.substring(0, 60) + '...' : 'none'}`)
}
console.log(` Aspect ratio: ${aspectRatio}`)
console.log(` Prompt enhancement: ${enhancePrompt}`)
console.log(` Upsampling: ${enableUpsample}`)
const res = await fetch(`${Config.baseUrl}/v1/video/create`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': `Bearer ${Config.apiKey}`,
},
body: JSON.stringify(body),
})
const result = await res.json()
if (!result.id) {
throw new Error(`VEO submission failed: ${JSON.stringify(result)}`)
}
console.log(` Task ID: ${result.id}`)
return result.id
},
async query(taskId) {
const res = await fetch(`${Config.baseUrl}/v1/video/query?id=${encodeURIComponent(taskId)}`, {
headers: {
'Authorization': `Bearer ${Config.apiKey}`,
'Accept': 'application/json',
},
})
return await res.json()
},
async poll(taskId) {
const startTime = Date.now()
let lastProgress = 0
console.log(`\n⏳ Waiting for VEO video generation (estimated 60-300 seconds)...`)
while (Date.now() - startTime < Config.maxPollTime) {
const task = await VeoApi.query(taskId)
if (task.status === 'completed') {
console.log(`\n✅ Video generation complete!`)
console.log(` Video: ${task.video_url}`)
return {
success: true,
videoUrl: task.video_url,
}
}
if (task.status === 'failed') {
throw new Error(task.error || task.message || 'VEO generation failed')
}
const progress = task.progress || 0
if (progress !== lastProgress) {
lastProgress = progress
const elapsed = Math.round((Date.now() - startTime) / 1000)
process.stdout.write(` Progress: ${progress}% elapsed: ${elapsed}s status: ${task.status}\r`)
}
await new Promise(r => setTimeout(r, Config.pollInterval))
}
throw new Error(`VEO generation timed out (${Config.maxPollTime / 1000}s)`)
},
}
// ============================================================================
// Download helper (used for the generated video files)
// ============================================================================
async function download(url, outputPath) {
const protocol = url.startsWith('https') ? https : http
return new Promise((resolve, reject) => {
const file = fs.createWriteStream(outputPath)
protocol.get(url, (response) => {
if (response.statusCode >= 300 && response.statusCode < 400 && response.headers.location) {
file.close()
fs.unlinkSync(outputPath)
return download(response.headers.location, outputPath).then(resolve).catch(reject)
}
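// Reject on non-200 responses so an error page is never saved as a video file
if (response.statusCode !== 200) {
file.close()
if (fs.existsSync(outputPath)) fs.unlinkSync(outputPath)
response.resume()
return reject(new Error(`Download failed: HTTP ${response.statusCode}`))
}
response.pipe(file)
file.on('finish', () => { file.close(); resolve(outputPath) })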
}).on('error', (err) => {
file.close()
if (fs.existsSync(outputPath)) fs.unlinkSync(outputPath)
reject(err)
})
})
}
// ============================================================================
// Core flow (single task with retries)
// ============================================================================
async function generate(imageUrl, prompt, options = {}) {
const { outputDir = './output', aspectRatio = '16:9' } = options
if (!Config.apiKey) throw new Error('veoApiKey is not configured; add it to config.json')
fs.mkdirSync(outputDir, { recursive: true })
let currentPrompt = prompt
let lastError = null
for (let attempt = 0; attempt <= Config.maxRetries; attempt++) {
try {
if (attempt > 0) {
currentPrompt = PromptOptimizer.optimize(prompt, lastError, attempt)
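console.log(`\n🔄 Retry ${attempt}`)
console.log(` New prompt: ${currentPrompt}`)
}
// Forward the CLI overrides (--no-enhance / --no-upsample);
// undefined values fall back to the Config defaults inside VeoApi.create
const taskId = await VeoApi.create(imageUrl, currentPrompt, {
aspectRatio,
lastFrameUrl: options.lastFrameUrl,
enhancePrompt: options.enhancePrompt,
enableUpsample: options.enableUpsample,
})
const result = await VeoApi.poll(taskId)
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const videoFile = path.join(outputDir, `${timestamp}_veo.mp4`)
await download(result.videoUrl, videoFile)
console.log(` Download finished: ${videoFile}`)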
return {
success: true,
taskId,
prompt: currentPrompt,
originalPrompt: prompt,
attempts: attempt + 1,
files: [videoFile],
}
} catch (err) {
lastError = err.message
console.error(` ❌ Attempt ${attempt + 1} failed: ${err.message}`)
if (attempt < Config.maxRetries) {
console.log(` Waiting 5 seconds before retrying...`)
await new Promise(r => setTimeout(r, 5000))
}
}
}
throw new Error(`VEO video generation failed (after ${Config.maxRetries} retries): ${lastError}`)
}
// ============================================================================
// Parallel batch generation (supports manifest.json input and output)
// ============================================================================
async function batchGenerate(tasks, options = {}) {
const { outputDir = './output' } = options
let aspectRatio = options.aspectRatio || '16:9'
const concurrency = options.concurrency || 2
if (!Config.apiKey) throw new Error('veoApiKey is not configured; add it to config.json')
fs.mkdirSync(outputDir, { recursive: true })
// Support the manifest format
if (tasks.items && Array.isArray(tasks.items)) {
// Manifest-level aspect ratio: format > defaultFormat > command-line default
if (tasks.format || tasks.defaultFormat) {
aspectRatio = tasks.format || tasks.defaultFormat || aspectRatio
}
tasks = tasks.items.map(item => ({
image: item.url || item.image || '',
prompt: item.videoPrompt || item.prompt || 'cinematic motion',
text: item.text || item.caption || '',
keyword: item.keyword || '',
keywordColor: item.keywordColor || '',
file: item.file || '',
lastFrameUrl: item.lastFrameUrl || '', // first/last-frame mode
}))
}
// Phase 1: parallel submission
const mode = tasks.some(t => t.lastFrameUrl) ? 'first/last frame' : 'single image'
console.log(`\n📡 Submitting ${tasks.length} VEO video tasks in parallel (concurrency: ${concurrency}, mode: ${mode})...`)
const submitted = []
for (let i = 0; i < tasks.length; i += concurrency) {
const batch = tasks.slice(i, i + concurrency)
const batchResults = await Promise.allSettled(
batch.map(async (task, j) => {
const idx = i + j
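// Fall back to the same default as the manifest path so a task without a prompt cannot crash on substring()
const prompt = task.videoPrompt || task.prompt || 'cinematic motion'
console.log(` [${idx + 1}/${tasks.length}] Submitting: ${prompt.substring(0, 50)}...`)
try {
const taskId = await VeoApi.create(task.image, prompt, {
aspectRatio,
lastFrameUrl: task.lastFrameUrl,
enhancePrompt: options.enhancePrompt,
enableUpsample: options.enableUpsample,
})
return { idx, taskId, task, error: null }
} catch (err) {
console.error(` [${idx + 1}] Submission failed: ${err.message}`)
return { idx, taskId: null, task, error: err.message }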
}
})
)
submitted.push(...batchResults.map(r => r.value || r.reason))
}
const pendingTasks = submitted.filter(s => s.taskId)
if (pendingTasks.length === 0) {
console.error('\n❌ All task submissions failed')
return tasks.map((task, idx) => ({
success: false, ...task,
error: (submitted.find(s => s.idx === idx) || {}).error || 'Submission failed',
}))
}
// Phase 2: parallel polling
console.log(`\n⏳ Waiting for ${pendingTasks.length} videos in parallel...`)
const pollResults = await Promise.allSettled(
pendingTasks.map(async ({ idx, taskId, task }) => {
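const prompt = task.videoPrompt || task.prompt || 'cinematic motion'
const result = await pollWithRetry(taskId, prompt, {
outputDir,
aspectRatio,
imageUrl: task.image,
lastFrameUrl: task.lastFrameUrl,
enhancePrompt: options.enhancePrompt,
enableUpsample: options.enableUpsample,
})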
return { idx, ...result, task }
})
)
// Merge results
const results = []
for (let i = 0; i < tasks.length; i++) {
const submittedInfo = submitted.find(s => s.idx === i)
if (!submittedInfo || !submittedInfo.taskId) {
results.push({ success: false, ...tasks[i], error: submittedInfo?.error || 'Submission failed' })
continue
}
// Promise.allSettled preserves input order, so look the result up by its position
// in pendingTasks. Rejected entries carry no idx, so matching on value.idx would
// never find them and their error message would be lost.
const pollResult = pollResults[pendingTasks.findIndex(p => p.idx === i)]
if (pollResult && pollResult.status === 'fulfilled') {
results.push({ success: true, ...tasks[i], ...pollResult.value })
} else {
const reason = pollResult?.reason?.message || 'Generation failed'
results.push({ success: false, ...tasks[i], error: reason })
}
}
const ok = results.filter(r => r.success).length
console.log(`\n✨ Batch finished: ${ok}/${tasks.length} succeeded`)
// Write manifest.json
const manifestItems = results
.filter(r => r.success && r.files && r.files.length > 0)
.map(r => {
const item = {
file: path.basename(r.files[0]),
duration: 8, // Veo clips default to roughly 8 seconds
}
if (r.text) item.text = r.text
if (r.caption) item.caption = r.caption
if (r.keyword) item.keyword = r.keyword
if (r.keywordColor) item.keywordColor = r.keywordColor
return item
})
if (manifestItems.length > 0 && !options.skipManifestWrite) {
const manifestPath = path.join(outputDir, 'manifest.json')
fs.writeFileSync(manifestPath, JSON.stringify({ items: manifestItems }, null, 2))
console.log(` Wrote manifest.json (${manifestItems.length} items, captions matched to videos)`)
}
return results
}
/**
* Poll, retrying on failure (single task)
*/
async function pollWithRetry(taskId, prompt, options = {}) {
let currentTaskId = taskId
let currentPrompt = prompt
let lastError = null
for (let attempt = 0; attempt <= Config.maxRetries; attempt++) {
try {
if (attempt > 0) {
currentPrompt = PromptOptimizer.optimize(prompt, lastError, attempt)
console.log(`\n 🔄 Retry (task ${currentTaskId.substring(0, 8)}...): ${currentPrompt.substring(0, 50)}`)
currentTaskId = await VeoApi.create(
options.imageUrl || '',
currentPrompt,
{
aspectRatio: options.aspectRatio,
lastFrameUrl: options.lastFrameUrl || '',
enhancePrompt: options.enhancePrompt,
enableUpsample: options.enableUpsample,
}
)
}
const result = await VeoApi.poll(currentTaskId)
const timestamp = new Date().toISOString().replace(/[:.]/g, '-')
const videoFile = path.join(options.outputDir || './output', `${timestamp}_veo.mp4`)
await download(result.videoUrl, videoFile)
return {
taskId: currentTaskId,
prompt: currentPrompt,
originalPrompt: prompt,
attempts: attempt + 1,
file: videoFile,
files: [videoFile],
duration: 8,
}
} catch (err) {
lastError = err.message
if (attempt < Config.maxRetries) {
await new Promise(r => setTimeout(r, 5000))
}
}
}
throw new Error(`Still failing after ${Config.maxRetries} retries: ${lastError}`)
}
// ============================================================================
// CLI
// ============================================================================
function showHelp() {
console.log(`
🎬 VEO Video Generator - image-to-video tool (Google Veo models)
Usage:
node veo-video-generator.js --image <url> --prompt "instruction" [options]
node veo-video-generator.js --image <url> --last-frame <url> --prompt "transition" [options]
node veo-video-generator.js batch <manifest.json> [options]
Options:
-o, --output <dir> Output directory (default: ./output)
-a, --ar <ratio> Aspect ratio (veo3 supports only 16:9/9:16; CLI default: 9:16)
--model <model> Model: veo2/veo2-fast/veo3-fast/veo3-fast-frames (default: veo3-fast-frames)
--last-frame <url> Last-frame URL (first/last-frame mode)
--no-enhance Disable Chinese prompt enhancement
--no-upsample Disable super-resolution
--retries <n> Number of retries on failure (default: 3)
-h, --help Show help
Modes:
Single image: --image <url> --prompt "motion description"
First/last frame: --image <first-frame url> --last-frame <last-frame url> --prompt "transition description"
Examples:
# Single image (English or Chinese prompt)
node veo-video-generator.js --image http://img.com/ref.jpg --prompt "zoom in" -a 16:9
node veo-video-generator.js --image http://img.com/ref.jpg --prompt "缓慢放大" -a 9:16
# First/last frame
node veo-video-generator.js --image http://img.com/first.jpg --last-frame http://img.com/last.jpg --prompt "from stillness to motion" -a 16:9
# Batch (single-image vs. first/last-frame mode is detected automatically)
node veo-video-generator.js batch ./manifest.json -o ./videos
manifest.json (first/last-frame mode, produced by the image generation stage):
{
"mode": "framePair",
"items": [
{
"file": "scene_01_first.png",
"url": "http://...",
"lastFrame": "scene_01_last.png",
"lastFrameUrl": "http://...",
"text": "subtitle text",
"videoPrompt": "machines start up, cinematic transition"
}
]
}
`)
}
async function main() {
const args = process.argv.slice(2)
if (args.includes('-h') || args.includes('--help') || args.length === 0) {
showHelp()
return
}
let command = 'single'
let params = []
const options = {
outputDir: './output',
aspectRatio: '9:16',
imageUrl: '',
lastFrameUrl: '',
prompt: '',
enhancePrompt: Config.enhancePrompt,
enableUpsample: Config.enableUpsample,
}
let i = 0
if (args[0] === 'batch') {
command = 'batch'
i = 1
}
while (i < args.length) {
const arg = args[i]
if (arg === '-o' || arg === '--output') {
options.outputDir = args[++i]
} else if (arg === '-a' || arg === '--ar') {
options.aspectRatio = args[++i]
} else if (arg === '--model') {
Config.model = args[++i]
} else if (arg === '--no-enhance') {
options.enhancePrompt = false
} else if (arg === '--no-upsample') {
options.enableUpsample = false
} else if (arg === '--image') {
options.imageUrl = args[++i]
} else if (arg === '--last-frame') {
options.lastFrameUrl = args[++i]
} else if (arg === '--prompt') {
options.prompt = args[++i]
} else if (arg === '--retries') {
Config.maxRetries = parseInt(args[++i], 10)
} else {
params.push(arg)
}
i++
}
if (command === 'batch') {
const filePath = params[0]
if (!filePath || !fs.existsSync(filePath)) {
console.error('Please provide the path to manifest.json')
process.exit(1)
}
const tasks = JSON.parse(fs.readFileSync(filePath, 'utf-8'))
await batchGenerate(tasks, options)
} else {
if (!options.imageUrl) {
console.error('Please provide the --image argument (image URL)')
process.exit(1)
}
if (!options.prompt) {
console.error('Please provide the --prompt argument')
process.exit(1)
}
await generate(options.imageUrl, options.prompt, options)
}
}
// ============================================================================
// Exports
// ============================================================================
module.exports = { generate, batchGenerate, pollWithRetry, VeoApi, PromptOptimizer }
if (require.main === module) {
main().catch(err => {
console.error(`\n❌ Error: ${err.message}`)
process.exit(1)
})
}