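feat(video-pipeline): refactor the video pipeline; tighten final-cut timeline rules and state management

- Introduce manifest.json as the single source of state; every sub-Agent operation writes back to the manifest
- Refactor the timebuilder logic to support four video adaptation strategies (speed up / trim / slow down / freeze frame)
- Unify the TTS stage output structure: single-sentence and multi-sentence scripts both write to segments[]
- Rewrite subtitle and voiceover generation to sync audio and picture using the exact segment durations
- Extend the confirm command to confirm by id range; separate images and videos in the upload stage
- Require intermediate artifacts to be written under the output/ directory; remove deprecated config parameters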
@@ -1,6 +1,6 @@
 {
-  "jianyingDraftPath": "/Users/lc/Movies/JianyingPro/User Data/Projects/com.lveditor.draft",
-  "capcutMateDir": "/Users/lc/capcut-mate",
+  "jianyingDraftPath": "C:/Users/45070/AppData/Local/JianyingPro/User Data/Projects/com.lveditor.draft",
+  "capcutMateDir": "C:/Users/45070/capcut-mate",
   "capcutMateApiBase": "http://capcut.muyetools.cn/openapi/capcut-mate/v1",
   "imgbbApiKey": "deprecated",
   "geminiApiBaseUrl": "https://yunwu.ai",

@@ -35,14 +35,15 @@ B 模式又分两种:**单图模式**(1 图 → 1 段视频)/ **首尾帧
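 ### Core constraints

 1. **No skipping steps**:
-   - A (slideshow): storyboard → image prompts → image generation → TTS + final cut. No video stage
-   - B (AI video): storyboard → image prompts → image generation → video prompts → video generation → TTS + final cut
+   - A (slideshow): storyboard → manifest init → image prompts → image generation → TTS + final cut. No video stage
+   - B (AI video): storyboard → manifest init → image prompts → image generation → video prompts → video generation → TTS + final cut
    - A review is required between stages
-2. **manifest.json is the single source of state**: write back immediately after any operation completes
+2. **manifest.json is the single source of state**: `pipeline.js init` runs immediately after the storyboard is confirmed and creates the `output/{name}/` directory and the initial manifest. All later sub-Agent output is written back to this manifest; bare JSON is no longer passed around
 3. **No calling APIs with curl**: image and video generation must go through `pipeline.js` or the corresponding generator script
 4. **Parallel first**: run independent subtasks in parallel sub-Agents
 5. **The storyboard is the backbone contract**: once the user confirms the storyboard, downstream sub-Agents may only add fields; changing the shot count, order, or field values is forbidden. Each time the main Agent receives sub-Agent output, its first check is that the shot count matches
 6. **prompts/*.md are read only by sub-Agents**: the main Agent reads account.json and does not read the sub-Agent prompt templates
+7. **Intermediate artifacts go under output**: all intermediate files (items JSON, urls caches, sub-Agent output) must be written to the `output/{name}/` directory and must not be scattered across the project root

 ### Step -1: Intent confirmation (confirm every item; none may be missing)
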
@@ -79,26 +80,27 @@ B 模式又分两种:**单图模式**(1 图 → 1 段视频)/ **首尾帧

 → Show it to the user for confirmation. Once confirmed, the **storyboard is locked as the backbone contract**; downstream steps must not add or remove shots.

+### Step 2-0: Manifest initialization
+
+```bash
+node scripts/pipeline.js init --account <id> --mode <single|framePair> \
+  --items '[{"id":1,"shotDesc":"...","script":"...","duration":5,"directorRef":"tarantino","keyword":"权力"}]'
+```
+
+- Run immediately after the storyboard is confirmed; creates the `output/{name}/` directory and the initial `manifest.json`
+- The script inherits from account.json: imageModel, videoModel, format, references
+- `imagePrompt` is empty for now and is filled in by Step 2-A; `videoPrompt` is empty for now and is filled in by Step 3-A
+- The output path is printed to the console; all later operations use it as the working directory
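+
 ### Step 2-A: Image prompts (run by a sub-Agent)

-- The main Agent passes the **complete storyboard JSON** (not the original copy) plus the image prompt template path to the sub-Agent
-- The sub-Agent appends an `imagePrompt` field to each shot:
-  - Input (from the storyboard): shotDesc + script + directorRef + keyword
-  - Output: storyboard JSON + imagePrompt
+- The main Agent passes the **manifest path + image prompt template path** to the sub-Agent
+- The sub-Agent reads manifest.items, appends an `imagePrompt` field to each shot, and writes the manifest back
 - **Hard constraint: output shot count == input shot count**

 **Main Agent review**: ① Does the count match? ② Is the shotDesc content fully preserved? ③ Does the lighting strategy match directorRef?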
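
The first two review checks are mechanical and could be scripted before the main Agent looks at content. The following is only a sketch: `checkBackboneContract` is a hypothetical helper, not part of `pipeline.js`, and it assumes each item has a stable `id` and that fields initialized as empty (such as `imagePrompt`) are the only ones a sub-Agent may fill in.

```javascript
// Hypothetical helper: verify that a sub-Agent only added fields to the items.
// `before` is manifest.items as confirmed; `after` is what the sub-Agent wrote back.
function checkBackboneContract(before, after) {
  const issues = []
  if (before.length !== after.length) {
    issues.push(`shot count changed: ${before.length} -> ${after.length}`)
    return issues // positions no longer line up, so stop here
  }
  before.forEach((orig, i) => {
    const next = after[i]
    if (orig.id !== next.id) {
      issues.push(`position ${i}: id ${orig.id} became ${next.id} (order changed)`)
      return
    }
    for (const key of Object.keys(orig)) {
      // Fields that were still empty may be filled in later (e.g. imagePrompt).
      if (orig[key] == null || orig[key] === '') continue
      // Compare by JSON text so nested values (arrays, objects) are covered too.
      if (JSON.stringify(orig[key]) !== JSON.stringify(next[key])) {
        issues.push(`item ${orig.id}: field "${key}" was modified`)
      }
    }
  })
  return issues
}

const confirmed = [
  { id: 1, shotDesc: 'a', script: 'x', imagePrompt: '' },
  { id: 2, shotDesc: 'b', script: 'y', imagePrompt: '' },
]
const withPrompts = confirmed.map(it => ({ ...it, imagePrompt: `prompt for ${it.shotDesc}` }))
console.log(checkBackboneContract(confirmed, withPrompts)) // []
```

An empty result means the count, order, and existing field values all survived; the third check (lighting strategy versus directorRef) still needs a human or model judgment.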
|
||||
|
||||
-### Step 2-B: Image generation + Manifest initialization
-
-```bash
-node scripts/pipeline.js init --account <id> --mode <single|framePair> \
-  --items '[{"shotDesc":"...","script":"...","duration":5,"imagePrompt":"...","directorRef":"tarantino","keyword":"权力"}]'
-```
-
-- items do not include videoPrompt; Step 3-A adds it later
-- The script inherits from account.json: imageModel, videoModel, format, references
-- First-and-last-frame mode: every item must have a `lastFramePrompt`
+### Step 2-B: Image generation

 ```bash
 node scripts/pipeline.js run --manifest <path> --phase images
@@ -111,12 +113,9 @@ node scripts/pipeline.js run --manifest <path> --phase images
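
 ### Step 3-A: Video prompts (mode B only, run by a sub-Agent)

-- The main Agent passes the storyboard JSON (including the confirmed storyboard image paths) plus the video prompt template path to the sub-Agent
-- The sub-Agent generates a `videoPrompt` for each shot:
-  - Input: shotDesc + directorRef + confirmed storyboard image + target model
-  - Output: videoPrompt (describes camera movement, not picture content)
+- The main Agent passes the **manifest path + video prompt template path** to the sub-Agent
+- The sub-Agent reads manifest.items (including the confirmed storyboard image paths), generates a `videoPrompt` for each shot, and writes the manifest back
 - **Hard constraint: output count == storyboard shot count**
-- The Agent writes back to manifest.json, aligned by id

 **Main Agent review**: ① Does the count match? ② Does it describe movement rather than content? ③ Is the character count ≤ 50?
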
@@ -9,9 +9,9 @@
 ## How to create

 ```bash
-# After Step 2-A generates imagePrompt, initialize through the script (without videoPrompt)
+# Step 2-0: initialize immediately after the storyboard is confirmed (imagePrompt/videoPrompt are added later)
 node scripts/pipeline.js init --account 军事账号 --mode single \
-  --items '[{"shotDesc":"English scene description","script":"Chinese voiceover script","duration":5,"imagePrompt":"English prompt","directorRef":"tarantino","keyword":"权力"}]'
+  --items '[{"shotDesc":"English scene description","script":"Chinese voiceover script","duration":5,"directorRef":"tarantino","keyword":"权力"}]'

 # Or read the items from a file
 node scripts/pipeline.js init --account 军事账号 --mode single --items-file ./items.json
@@ -193,7 +193,7 @@ node scripts/pipeline.js run --manifest <path> --retry-failed
 ## Directory structure

 ```
-output/{account}_{YYYYMMDD}_{NNN}/
+output/{name}_{YYYYMMDD}_{NNN}/
 ├── manifest.json     # main manifest
 ├── images/           # scene_{NN}_{slug}.jpeg (first/last-frame mode adds _last; MJ candidates add _cand{1-4})
 ├── videos/           # scene_{NN}_{slug}.mp4
@@ -206,7 +206,7 @@ slug 从 `shotDesc` 派生(slugify: 保留中文和字母数字,最多 20
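
 ## segments[] fields (TTS sentence segments)

-Generated automatically by the TTS stage. Written only when `script` is split into 2 or more sentences; a single-sentence script writes no segments.
+Always generated by the TTS stage: the array has 1 element for a single sentence and N elements for N sentences. The assemble stage aligns subtitles directly to each segment's actual audio duration.

 | Field | Description |
 |------|------|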
@@ -214,4 +214,26 @@ TTS 阶段自动生成。仅当 `script` 被切分为 2 句及以上时才写入
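 | `audio` | Audio path for this sentence (relative to the manifest) |
 | `duration` | Audio duration of this sentence (seconds) |

-`item.audio` points to the complete audio merged from all segments, and `item.audioDuration` is the cumulative duration of the segments. The assemble stage prefers the exact `segments` durations to align subtitles and falls back to a character-count-weighted estimate when there are no segments.
+`item.audio` points to `segments[0].audio`, and `item.audioDuration` is the cumulative duration of the segments. The assemble stage iterates over segments, adding audio and subtitles one segment at a time using the actual file durations (not proportional allocation), so audio and subtitles stay precisely in sync and no gaps are left.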
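
To make the timing rule concrete, here is a minimal sketch of how consecutive segments could be laid onto one item's timeline slot. `layoutSegments` is a hypothetical name and not a function in the repository; the sketch assumes segment durations in seconds, timeline positions in microseconds (`US` is assumed to be 1,000,000), and that the last segment is stretched to the slot end to absorb rounding drift.

```javascript
const US = 1000000 // assumed: microseconds per second

// Hypothetical helper: place each segment back to back using its own duration.
function layoutSegments(slot, segments) {
  const out = []
  let cursor = slot.start
  // Drop zero-length segments first so "last" means the last segment actually placed.
  const usable = segments.filter(s => (s.duration || 0) > 0)
  usable.forEach((seg, i) => {
    const isLast = i === usable.length - 1
    const end = isLast ? slot.end : cursor + Math.round(seg.duration * US)
    out.push({ text: seg.text, audio: seg.audio, start: cursor, end, duration: end - cursor })
    cursor = end
  })
  return out
}

const segments = [
  { text: 'first', audio: 'a1.mp3', duration: 1.2 },
  { text: 'second', audio: 'a2.mp3', duration: 2.3 },
]
const slot = { start: 0, end: 3.5 * US }
console.log(layoutSegments(slot, segments).map(s => [s.start, s.end]))
// [ [ 0, 1200000 ], [ 1200000, 3500000 ] ]
```

Because the same start/end pairs can feed both the audio track and the captions, the two cannot drift apart. Filtering zero-length segments before picking the last one is a design choice of this sketch, made so that the slot is always filled to its end.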
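+
+---
+
+## Final-cut timeline rules
+
+### Image mode (images)
+
+Images have no duration of their own. TTS audio duration = on-screen duration. An item without TTS audio has a duration of 0 (skipped, not shown).
+
+### Video mode (videos)
+
+TTS audio is the master track; the video is adapted to the audio duration by one of the following strategies:
+
+| ratio = videoDur/audioDur | Strategy | Description |
+|---------------------------|------|------|
+| 0.9 ~ 1.1 | none | Close match, no adjustment needed |
+| > 1.1, ≤ 2 | speed_up | Speed up (setpts compresses time) |
+| > 2 | trim | Trim (cut to the audio duration) |
+| < 0.9, ≥ 0.5 | slow_down | Slow down (setpts stretches time) |
+| < 0.5 | freeze | Freeze frame (video at original speed + last frame frozen to fill the remaining time) |
+
+If every strategy fails, fall back to cutting to the target duration.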
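
The table can be read as a small decision function. The sketch below is illustrative only: `pickStrategy` is a hypothetical name, durations are assumed to be in seconds, and the return shape is an assumption loosely modeled on the `strategy` / `speed` / `freezeExtra` fields that appear in the timeline entries of this commit.

```javascript
// Hypothetical helper mirroring the strategy table (durations in seconds).
function pickStrategy(videoDur, audioDur) {
  // Without both durations there is nothing to adapt.
  if (!(audioDur > 0) || !(videoDur > 0)) return { strategy: 'none', speed: 1 }
  const ratio = videoDur / audioDur
  if (ratio > 2) return { strategy: 'trim', speed: 1 }
  if (ratio > 1.1) return { strategy: 'speed_up', speed: ratio }
  if (ratio < 0.5) return { strategy: 'freeze', speed: 1, freezeExtra: audioDur - videoDur }
  if (ratio < 0.9) return { strategy: 'slow_down', speed: ratio }
  return { strategy: 'none', speed: 1 } // 0.9 ~ 1.1: close enough
}

for (const [video, audio] of [[5, 5.2], [8, 5], [12, 5], [4, 5], [2, 5]]) {
  console.log(video, audio, pickStrategy(video, audio).strategy)
}
// 5 5.2 none
// 8 5 speed_up
// 12 5 trim
// 4 5 slow_down
// 2 5 freeze
```

The boundary values follow the table: a ratio of exactly 2 still speeds up, and a ratio of exactly 0.5 still slows down rather than freezing.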
@@ -215,28 +215,89 @@ function getAudioDurationSec(filePath) {
 // Main flow
 // ============================================================================

-function buildTimeline(items, defaultDurationUs) {
-  // Audio is the master track; the video is retimed to fit (≤2x: speed up, >2x: cut)
+function buildTimeline(items) {
+  // Core rules:
+  //   Image mode: images have no duration of their own; TTS audio duration = on-screen duration. No audio = 0 duration (skipped)
+  //   Video mode: TTS is the master track; the video is adapted by trim / speed up / slow down / freeze
+  //     Video longer than audio (ratio > 1.1):
+  //       ≤ 2x → speed up (setpts compresses time)
+  //       > 2x → trim (cut to the audio duration)
+  //     Video shorter than audio (ratio < 0.9):
+  //       ≥ 0.5x → slow down (setpts stretches time, at most 2x slower)
+  //       < 0.5x → freeze frame (video plays normally + last frame frozen to fill the remaining time)
   let offset = 0
   return items.map(item => {
-    const audioDur = (item.audioDuration != null) ? item.audioDuration * US : 0
+    // With segments, use the sum of the per-segment durations (matches the audio files exactly)
+    let audioDur
+    if (item.segments && item.segments.length > 0) {
+      audioDur = item.segments.reduce((sum, s) => sum + (s.duration || 0), 0) * US
+    } else {
+      audioDur = (item.audioDuration != null) ? item.audioDuration * US : 0
+    }
     const videoDur = (item.videoDuration != null) ? item.videoDuration * US : 0
-    // No TTS: use the video duration or the fixed duration
+    const hasVideo = !!(item.video || item.videoUrl || item.url)
+
+    // No TTS audio
     if (audioDur <= 0) {
-      const dur = videoDur || defaultDurationUs
-      const entry = { start: offset, end: offset + dur, duration: dur, speed: 1 }
+      if (hasVideo && videoDur > 0) {
+        // Video mode without audio: use the video's original duration
+        const entry = { start: offset, end: offset + videoDur, duration: videoDur, speed: 1, strategy: 'none' }
+        offset += videoDur
+        return entry
+      }
+      // Image mode without audio: 0 duration, marked as skipped
+      const entry = { start: offset, end: offset, duration: 0, speed: 1, strategy: 'none', skip: true }
+      return entry
+    }
+
+    // With TTS: the audio duration is the master track
+    const dur = audioDur
+
+    if (!hasVideo || videoDur <= 0) {
+      // Image mode: use the audio duration directly
+      const entry = { start: offset, end: offset + dur, duration: dur, speed: 1, strategy: 'none' }
+      offset += dur
+      return entry
+    }
+
+    // Video mode: match the video duration against the audio duration
+    const ratio = videoDur / audioDur
+
+    if (ratio > 1.1) {
+      // Video is longer than the audio
+      if (ratio <= 2) {
+        // Speed-up strategy
+        const entry = { start: offset, end: offset + dur, duration: dur, speed: ratio, strategy: 'speed_up' }
+        offset += dur
+        return entry
+      } else {
+        // Trim strategy
+        const entry = { start: offset, end: offset + dur, duration: dur, speed: 1, strategy: 'trim' }
+        offset += dur
+        return entry
+      }
+    } else if (ratio < 0.9) {
+      // Video is shorter than the audio
+      if (ratio >= 0.5) {
+        // Slow-down strategy (at most 2x slower)
+        const entry = { start: offset, end: offset + dur, duration: dur, speed: ratio, strategy: 'slow_down' }
+        offset += dur
+        return entry
+      } else {
+        // Freeze-frame strategy (video at original speed + last frame frozen to fill the remaining time)
+        const entry = {
+          start: offset, end: offset + dur, duration: dur, speed: 1,
+          strategy: 'freeze', freezeExtra: dur - videoDur,
+        }
+        offset += dur
+        return entry
+      }
+    } else {
+      // Close match (0.9 ~ 1.1), no adjustment needed
+      const entry = { start: offset, end: offset + dur, duration: dur, speed: 1, strategy: 'none' }
       offset += dur
       return entry
     }
-    // With TTS: the audio duration is the master track
-    const dur = audioDur
-    const ratio = videoDur > 0 ? videoDur / audioDur : 1
-    // ≤2x: speed up to the audio duration; >2x: cut (keep only the first audioDur of the video)
-    const speed = ratio <= 2 ? ratio : 1
-    const needAdjust = videoDur > audioDur + 100000 // adjust only when the video is more than 0.1s longer than the audio
-    const entry = { start: offset, end: offset + dur, duration: dur, speed, needAdjust }
-    offset += dur
-    return entry
   })
 }

@@ -253,7 +314,6 @@ async function assemble(args) {
|
||||
filter: filterStr,
|
||||
format = '9:16',
|
||||
apiKey = '',
|
||||
duration = '4',
|
||||
animation = '轻微放大',
|
||||
} = args
|
||||
|
||||
@@ -284,22 +344,44 @@ async function assemble(args) {
|
||||
}
|
||||
|
||||
const { width, height } = getResolution(format)
|
||||
const defaultDurationUs = parseFloat(duration) * US
|
||||
|
||||
// 过滤出实际存在的文件
|
||||
const missingFileItems = []
|
||||
const items = manifest.items.filter(item => {
|
||||
if (item.url) return true // 视频模式可能用 URL
|
||||
if (item.video) return true // 视频模式本地文件
|
||||
if (!item.file) {
|
||||
missingFileItems.push(item.id || '?')
|
||||
return false
|
||||
}
|
||||
const filePath = path.join(inputDir, item.file)
|
||||
return fs.existsSync(filePath)
|
||||
})
|
||||
|
||||
if (items.length === 0) {
|
||||
if (missingFileItems.length > 0) {
|
||||
throw new Error(`没有可用的素材文件 — ${missingFileItems.length} 个 item 缺少 file 字段(id: ${missingFileItems.join(', ')}),请先运行 images 阶段`)
|
||||
}
|
||||
throw new Error('没有可用的素材文件')
|
||||
}
|
||||
|
||||
if (items.length === 0) throw new Error('没有可用的素材文件')
|
||||
|
||||
// 用 ffprobe 测量实际音频/视频时长,替代 manifest 中的估计值
|
||||
let audioMeasured = 0, videoMeasured = 0
|
||||
for (const item of items) {
|
||||
// 测量 TTS 音频实际时长(有 segments 时跳过,audioDuration 已是精确累计值)
|
||||
if (item.audio && !item.audio.startsWith('http') && !item.segments) {
|
||||
// 测量各 segment 音频文件实际时长
|
||||
if (item.segments && item.segments.length > 0) {
|
||||
for (const seg of item.segments) {
|
||||
if (!seg.audio || seg.audio.startsWith('http')) continue
|
||||
const audioPath = path.isAbsolute(seg.audio)
|
||||
? seg.audio
|
||||
: path.resolve(inputDir, seg.audio)
|
||||
if (!fs.existsSync(audioPath)) continue
|
||||
const actualDur = await getAudioDurationSec(audioPath)
|
||||
if (actualDur != null) { seg.duration = actualDur; audioMeasured++ }
|
||||
}
|
||||
} else if (item.audio && !item.audio.startsWith('http')) {
|
||||
const audioPath = path.isAbsolute(item.audio)
|
||||
? item.audio
|
||||
: path.resolve(inputDir, item.audio)
|
||||
@@ -323,16 +405,32 @@ async function assemble(args) {
|
||||
console.log(` 实际时长测量: 音频 ${audioMeasured} 个, 视频 ${videoMeasured} 个`)
|
||||
}
|
||||
|
||||
const timeline = buildTimeline(items, defaultDurationUs)
|
||||
const timeline = buildTimeline(items)
|
||||
const totalDurationUs = timeline.length > 0 ? timeline[timeline.length - 1].end : 0
|
||||
const hasTTS = items.some(item => item.audio && item.audioDuration != null)
|
||||
|
||||
// 时间轴诊断
|
||||
for (let i = 0; i < items.length; i++) {
|
||||
const item = items[i]
|
||||
const tl = timeline[i]
|
||||
if (tl.skip) { console.log(` [${i + 1}] 跳过(无音频)`); continue }
|
||||
const audioDur = item.segments
|
||||
? item.segments.reduce((s, seg) => s + (seg.duration || 0), 0)
|
||||
: (item.audioDuration || 0)
|
||||
const slotDur = tl.duration / US
|
||||
const diff = slotDur - audioDur
|
||||
const videoDur = (item.videoDuration || 0)
|
||||
const stratInfo = tl.strategy && tl.strategy !== 'none' ? ` 策略=${tl.strategy}` : ''
|
||||
const marker = Math.abs(diff) > 0.05 ? ' ⚠️ 不对齐' : ''
|
||||
console.log(` [${i + 1}] 画面=${slotDur.toFixed(2)}s 音频=${audioDur.toFixed(2)}s 视频=${videoDur.toFixed(2)}s${stratInfo}${marker}`)
|
||||
}
|
||||
|
||||
// -- 读取转场策略(在 addImages/addVideos 之前) --
|
||||
const transitionConfig = loadTransitions(manifest)
|
||||
|
||||
console.log(`\nCapCut 成片组装`)
|
||||
console.log(` 模式: ${mode} 画幅: ${format} (${width}x${height})`)
|
||||
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : `固定${duration}s/段`} 总时长: ${(totalDurationUs / US).toFixed(1)}s`)
|
||||
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : '视频原始时长'} 总时长: ${(totalDurationUs / US).toFixed(1)}s`)
|
||||
console.log(` 字幕: ${subtitles} 配音: ${voiceover} 动画: ${animation}`)
|
||||
if (finalEffects) console.log(` 特效: ${finalEffects}`)
|
||||
if (finalFilter) console.log(` 滤镜: ${finalFilter}`)
|
||||
@@ -386,10 +484,10 @@ async function assemble(args) {
|
||||
for (let i = 0; i < items.length; i++) {
|
||||
const item = items[i]
|
||||
const tl = timeline[i]
|
||||
if (tl.needAdjust && item.video) {
|
||||
if (tl.strategy && tl.strategy !== 'none' && item.video) {
|
||||
const videoPath = path.resolve(inputDir, item.video)
|
||||
const audioDur = tl.duration / US
|
||||
const adjustedPath = await adjustVideoSpeed(videoPath, audioDur)
|
||||
const adjustedPath = await adjustVideoSpeed(videoPath, audioDur, tl.strategy, tl.speed, tl.freezeExtra || 0)
|
||||
if (adjustedPath !== videoPath) {
|
||||
item.video = path.relative(inputDir, adjustedPath)
|
||||
item.videoDuration = audioDur
|
||||
@@ -398,7 +496,7 @@ async function assemble(args) {
|
||||
}
|
||||
}
|
||||
if (adjustedCount > 0) {
|
||||
console.log(` 视频调速: ${adjustedCount}/${items.length} 个`)
|
||||
console.log(` 视频调整: ${adjustedCount}/${items.length} 个`)
|
||||
}
|
||||
|
||||
// Step 2: 上传(已调速的)视频到 OSS
|
||||
@@ -547,7 +645,7 @@ async function assemble(args) {
|
||||
console.log(` 草稿ID: ${draftId}`)
|
||||
console.log(` 总时长: ${(totalDurationUs / US).toFixed(1)}s`)
|
||||
console.log(` 素材数: ${items.length}`)
|
||||
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : '固定时长'}`)
|
||||
console.log(` 时间线: ${hasTTS ? 'TTS音频驱动' : '视频原始时长'}`)
|
||||
if (mode === 'videos' && subtitles === 'false') {
|
||||
console.log(`\n >> 视频模式未加字幕,请在剪映中打开草稿 → 识别字幕 → 语音识别生成\n`)
|
||||
}
|
||||
@@ -713,54 +811,142 @@ async function addKenBurns(draftUrl, segmentIds, items, timeline, manifest) {
|
||||
// ============================================================================
|
||||
|
||||
/**
|
||||
* ffmpeg 调速:将视频调整为指定时长
|
||||
* ratio <= 2x: 加速;ratio > 2x: 截断
|
||||
* 返回调整后的文件路径(调整失败则返回原路径)
|
||||
* ffmpeg 视频调整:根据策略适配音频时长
|
||||
*
|
||||
* 策略(按 ratio = videoDur / audioDur 选择):
|
||||
* speed_up (ratio > 1.1, ≤2x) → setpts 压缩时间(加速)
|
||||
* trim (ratio > 2x) → 截断到目标时长
|
||||
* slow_down (ratio < 0.9, ≥0.5x) → setpts 拉长时间(慢放)
|
||||
* freeze (ratio < 0.5x) → 视频原速 + 最后一帧冻结补时长
|
||||
* none (0.9~1.1) → 无需调整
|
||||
*
|
||||
* 所有策略失败后兜底:截断到目标时长
|
||||
*
|
||||
* 返回调整后的文件路径(失败则返回原路径)
|
||||
*/
|
||||
async function adjustVideoSpeed(videoPath, targetDurationSec) {
|
||||
async function adjustVideoSpeed(videoPath, targetDurationSec, strategy = 'none', speed = 1, freezeExtraUs = 0) {
|
||||
if (!fs.existsSync(videoPath)) return videoPath
|
||||
if (strategy === 'none') return videoPath
|
||||
|
||||
// 兜底截断:所有策略失败后的最终回退
|
||||
function fallbackTrim(cb) {
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-t', String(targetDurationSec),
|
||||
'-c', 'copy',
|
||||
videoPath.replace(/(\.\w+)$/, '_adj$1')
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) { cb(videoPath); return }
|
||||
cb(videoPath.replace(/(\.\w+)$/, '_adj$1'))
|
||||
})
|
||||
}
|
||||
|
||||
return new Promise((resolve) => {
|
||||
// 先获取视频时长
|
||||
execFile('ffprobe', [
|
||||
'-v', 'quiet', '-show_entries', 'format=duration',
|
||||
'-of', 'csv=p=0', videoPath
|
||||
], (err, stdout) => {
|
||||
if (err) { resolve(videoPath); return }
|
||||
if (err) { fallbackTrim(resolve); return }
|
||||
const videoDur = parseFloat(stdout.trim())
|
||||
if (!videoDur || videoDur <= 0 || videoDur <= targetDurationSec + 0.1) {
|
||||
resolve(videoPath); return
|
||||
}
|
||||
if (!videoDur || videoDur <= 0) { fallbackTrim(resolve); return }
|
||||
|
||||
const ratio = videoDur / targetDurationSec
|
||||
const outPath = videoPath.replace(/(\.\w+)$/, '_adj$1')
|
||||
|
||||
if (ratio <= 2) {
|
||||
// 加速:setpts=PTS/speed, atempo=speed (音频变速)
|
||||
const speed = ratio.toFixed(3)
|
||||
const atempo = Math.min(speed, 2.0) // atempo 单次上限 2.0
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-filter_complex', `setpts=PTS/${speed}`,
|
||||
'-an',
|
||||
outPath
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) { console.log(` 调速失败,使用原始视频: ${err.message}`); resolve(videoPath); return }
|
||||
console.log(` 调速: ${videoDur.toFixed(1)}s → ${targetDurationSec.toFixed(1)}s (${speed}x)`)
|
||||
resolve(outPath)
|
||||
})
|
||||
} else {
|
||||
// 截断:取前 targetDuration 秒
|
||||
if (strategy === 'trim') {
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-t', String(targetDurationSec),
|
||||
'-c', 'copy',
|
||||
outPath
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) { console.log(` 截断失败,使用原始视频: ${err.message}`); resolve(videoPath); return }
|
||||
if (err) { console.log(` 截断失败: ${err.message}`); resolve(videoPath); return }
|
||||
console.log(` 截断: ${videoDur.toFixed(1)}s → ${targetDurationSec.toFixed(1)}s`)
|
||||
resolve(outPath)
|
||||
})
|
||||
} else if (strategy === 'speed_up') {
|
||||
const speedVal = speed.toFixed(3)
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-filter_complex', `setpts=PTS/${speedVal}`,
|
||||
'-an',
|
||||
outPath
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) {
|
||||
console.log(` 加速失败,兜底截断: ${err.message}`)
|
||||
fallbackTrim(resolve)
|
||||
return
|
||||
}
|
||||
console.log(` 加速: ${videoDur.toFixed(1)}s → ${targetDurationSec.toFixed(1)}s (${speedVal}x)`)
|
||||
resolve(outPath)
|
||||
})
|
||||
} else if (strategy === 'slow_down') {
|
||||
const factor = (1 / speed).toFixed(3)
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-filter_complex', `setpts=PTS*${factor}`,
|
||||
'-an',
|
||||
outPath
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) {
|
||||
console.log(` 放缓失败,兜底截断: ${err.message}`)
|
||||
fallbackTrim(resolve)
|
||||
return
|
||||
}
|
||||
console.log(` 放缓: ${videoDur.toFixed(1)}s → ${targetDurationSec.toFixed(1)}s (${speed.toFixed(2)}x speed)`)
|
||||
resolve(outPath)
|
||||
})
|
||||
} else if (strategy === 'freeze') {
|
||||
// 画面停顿:原速播放 + 最后一帧冻结补时长
|
||||
const freezeSec = freezeExtraUs / US
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-i', videoPath,
|
||||
'-filter_complex', `tpad=stop=-1:stop_duration=${freezeSec.toFixed(3)}`,
|
||||
'-an',
|
||||
outPath
|
||||
], { timeout: 30000 }, (err) => {
|
||||
if (err) {
|
||||
// 回退方案:截取最后一帧 → 生成冻结帧视频 → concat 拼接
|
||||
console.log(` tpad freeze 失败,尝试 concat 方案: ${err.message}`)
|
||||
const lastFrame = videoPath.replace(/(\.\w+)$/, '_lastframe.png')
|
||||
const frozenVideo = videoPath.replace(/(\.\w+)$/, '_frozen.mp4')
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-sseof', '-0.1', '-i', videoPath,
|
||||
'-frames:v', '1', lastFrame
|
||||
], { timeout: 10000 }, (err2) => {
|
||||
if (err2) { console.log(` concat 方案也失败,兜底截断`); fallbackTrim(resolve); return }
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-loop', '1', '-i', lastFrame,
|
||||
'-t', String(freezeSec.toFixed(3)),
|
||||
'-pix_fmt', 'yuv420p',
|
||||
'-vf', 'scale=trunc(iw/2)*2:trunc(ih/2)*2',
|
||||
frozenVideo
|
||||
], { timeout: 15000 }, (err3) => {
|
||||
if (err3) {
|
||||
try { fs.unlinkSync(lastFrame) } catch (_) {}
|
||||
console.log(` 冻结帧视频生成失败,兜底截断`)
|
||||
fallbackTrim(resolve)
|
||||
return
|
||||
}
|
||||
const concatList = path.join(path.dirname(videoPath), '_freeze_concat.txt')
|
||||
fs.writeFileSync(concatList, `file '${videoPath}'\nfile '${frozenVideo}'\n`)
|
||||
execFile('ffmpeg', [
|
||||
'-y', '-f', 'concat', '-safe', '0', '-i', concatList,
|
||||
'-c', 'copy', outPath
|
||||
], { timeout: 30000 }, (err4) => {
|
||||
try { fs.unlinkSync(lastFrame); fs.unlinkSync(frozenVideo); fs.unlinkSync(concatList) } catch (_) {}
|
||||
if (err4) { console.log(` 拼接失败,兜底截断`); fallbackTrim(resolve); return }
|
||||
console.log(` 画面停顿: ${videoDur.toFixed(1)}s + 冻结 ${freezeSec.toFixed(1)}s = ${targetDurationSec.toFixed(1)}s`)
|
||||
resolve(outPath)
|
||||
})
|
||||
})
|
||||
})
|
||||
return
|
||||
}
|
||||
console.log(` 画面停顿: ${videoDur.toFixed(1)}s + 冻结 ${freezeSec.toFixed(1)}s = ${targetDurationSec.toFixed(1)}s`)
|
||||
resolve(outPath)
|
||||
})
|
||||
} else {
|
||||
resolve(videoPath)
|
||||
}
|
||||
})
|
||||
})
|
||||
@@ -829,8 +1015,8 @@ async function addVideos(draftUrl, inputDir, items, timeline, width, height, tra
|
||||
async function batchUploadAudio(inputDir, items) {
|
||||
const urls = {}
|
||||
for (const item of items) {
|
||||
// 上传 segments 中的每段音频
|
||||
if (item.segments && item.segments.length > 1) {
|
||||
// 上传所有 segment 音频文件
|
||||
if (item.segments && item.segments.length > 0) {
|
||||
for (const seg of item.segments) {
|
||||
if (!seg.audio || seg.audio.startsWith('http') || urls[seg.audio]) continue
|
||||
const filePath = path.isAbsolute(seg.audio)
|
||||
@@ -848,7 +1034,7 @@ async function batchUploadAudio(inputDir, items) {
|
||||
}
|
||||
}
|
||||
}
|
||||
// 上传 item.audio(单段或 segments 的第一段)
|
||||
// 上传 item.audio(向后兼容,segments[0].audio 通常等于此值)
|
||||
if (!item.audio || item.audio.startsWith('http')) {
|
||||
if (item.audio) urls[item.audio] = item.audio
|
||||
continue
|
||||
@@ -893,24 +1079,29 @@ async function addVoiceover(draftUrl, inputDir, items, timeline, audioUrls = {})
|
||||
for (let i = 0; i < items.length; i++) {
|
||||
const item = items[i]
|
||||
const tl = timeline[i]
|
||||
const segments = item.segments && item.segments.length > 1 ? item.segments : null
|
||||
|
||||
if (segments) {
|
||||
// 多段音频:按 segment 逐段添加,使用精确时长
|
||||
const slots = distributeSegments(tl, segments)
|
||||
|
||||
for (const slot of slots) {
|
||||
const audioUrl = resolveAudio(slot.audio)
|
||||
if (item.segments && item.segments.length > 0) {
|
||||
// 逐段添加,每段使用实际音频文件时长(不做比例分配,消除留白)
|
||||
let currentTime = tl.start
|
||||
for (let si = 0; si < item.segments.length; si++) {
|
||||
const seg = item.segments[si]
|
||||
const audioUrl = resolveAudio(seg.audio)
|
||||
const segDurUs = (seg.duration || 0) * US
|
||||
if (segDurUs <= 0) continue
|
||||
// 最后一段对齐 timeline 末尾,吃掉浮点误差
|
||||
const isLast = si === item.segments.length - 1
|
||||
const endTime = isLast ? tl.end : currentTime + segDurUs
|
||||
audioInfos.push({
|
||||
audio_url: audioUrl,
|
||||
start: slot.start,
|
||||
end: slot.end,
|
||||
duration: slot.duration,
|
||||
start: currentTime,
|
||||
end: endTime,
|
||||
duration: endTime - currentTime,
|
||||
volume: 1.0,
|
||||
})
|
||||
currentTime = endTime
|
||||
}
|
||||
} else if (item.audio) {
|
||||
// 单段音频:用实际音频时长,不超过 timeline 时长
|
||||
// 无 segments:用实际音频时长
|
||||
const audioUrl = resolveAudio(item.audio)
|
||||
const audioDurUs = item.audioDuration ? item.audioDuration * US : tl.duration
|
||||
|
||||
@@ -981,23 +1172,6 @@ function applyAnimationProps(cap, style = {}) {
|
||||
if (style.outAnimDuration) cap.out_animation_duration = style.outAnimDuration
|
||||
}
|
||||
|
||||
// segments 按比例分配到时间线(DRY helper)
|
||||
function distributeSegments(tl, segments) {
|
||||
const totalSegDur = segments.reduce((sum, s) => sum + (s.duration || 0) * US, 0)
|
||||
if (totalSegDur <= 0) return []
|
||||
const tlDuration = tl.end - tl.start
|
||||
let currentTime = tl.start
|
||||
return segments.map((seg, idx) => {
|
||||
const segDurUs = Math.round((seg.duration || 0) * US)
|
||||
let duration = Math.round(tlDuration * (segDurUs / totalSegDur))
|
||||
if (idx === segments.length - 1) duration = tl.end - currentTime
|
||||
duration = Math.max(duration, 100000)
|
||||
const entry = { start: currentTime, end: currentTime + duration, duration, text: seg.text, audio: seg.audio }
|
||||
currentTime += duration
|
||||
return entry
|
||||
})
|
||||
}
|
||||
|
||||
function loadAccountConfig(manifest) {
|
||||
const account = manifest.account
|
||||
if (!account) return {}
|
||||
@@ -1093,17 +1267,19 @@ async function addSubtitles(draftUrl, items, timeline, style = {}, split = false
|
||||
const tl = timeline[i]
|
||||
|
||||
if (split) {
|
||||
// 分句模式:优先用 segments(TTS 逐句生成的精确时长),回退到字数估算
|
||||
const segments = item.segments && item.segments.length > 1 ? item.segments : null
|
||||
|
||||
if (segments) {
|
||||
// 精确模式:用 segments 的实际音频时长
|
||||
const slots = distributeSegments(tl, segments)
|
||||
|
||||
for (const slot of slots) {
|
||||
const cap = { start: slot.start, end: slot.end, text: slot.text }
|
||||
// 分句模式:优先用 segments 精确时长(与 addVoiceover 同步),回退到字数估算
|
||||
if (item.segments && item.segments.length > 0) {
|
||||
let currentTime = tl.start
|
||||
for (let si = 0; si < item.segments.length; si++) {
|
||||
const seg = item.segments[si]
|
||||
const segDurUs = (seg.duration || 0) * US
|
||||
if (segDurUs <= 0) continue
|
||||
const isLast = si === item.segments.length - 1
|
||||
const endTime = isLast ? tl.end : currentTime + segDurUs
|
||||
const cap = { start: currentTime, end: endTime, text: seg.text }
|
||||
applyAnimationProps(cap, animStyle)
|
||||
captions.push(cap)
|
||||
currentTime = endTime
|
||||
}
|
||||
} else {
|
||||
// 回退:字数权重估算
|
||||
@@ -1246,7 +1422,6 @@ async function main() {
|
||||
console.log('选项:')
|
||||
console.log(' --mode images|videos 素材类型(默认 images)')
|
||||
console.log(' --format 9:16 画幅比例')
|
||||
console.log(' --duration 4 默认每段时长/秒(无TTS时的fallback,默认 4)')
|
||||
console.log(' --voiceover true|false 是否添加TTS配音轨道(默认 true)')
|
||||
console.log(' --subtitles true|false 是否添加字幕(默认 true)')
|
||||
console.log(' --split-captions true|false 分句字幕模式(默认 true,按标点切分)')
|
||||
@@ -1256,12 +1431,12 @@ async function main() {
|
||||
console.log(' --apiKey <key> 云渲染 API Key(可选)')
|
||||
console.log(' --manifest <path> manifest.json 路径')
|
||||
console.log('')
|
||||
console.log('时间线模式:')
|
||||
console.log(' manifest.json 中每段包含 audio + duration → TTS音频驱动时间线')
|
||||
console.log(' 无 audio/duration → 按 --duration 固定时长')
|
||||
console.log('')
|
||||
console.log('manifest.json 示例(TTS驱动):')
|
||||
console.log(' {"items":[{"file":"1.png","text":"文案","audio":"seg_1.mp3","duration":3.5}]}')
|
||||
console.log('时间线规则:')
|
||||
console.log(' 图片模式: TTS 音频时长 = 画面时长,无音频则跳过')
|
||||
console.log(' 视频模式: TTS 为主轴,视频通过以下策略适配:')
|
||||
console.log(' 视频比音频长 → 加速(≤2x) 或 裁剪(>2x)')
|
||||
console.log(' 视频比音频短 → 放缓(≥0.5x) 或 画面停顿(<0.5x)')
|
||||
console.log(' 所有策略失败 → 兜底截断')
|
||||
console.log('')
|
||||
console.log('配置:')
|
||||
console.log(' 请运行 node setup.js 生成配置')
|
||||
|
||||
@@ -5,21 +5,26 @@
 const { loadManifest, saveManifest } = require('./pipeline-utils')

 function confirmManifest(options) {
-  const { manifest: manifestPath, all } = options
+  const { manifest: manifestPath, all, items: itemsStr } = options

   if (!manifestPath) {
     console.error('用法: pipeline.js confirm --manifest <path> --all')
+    console.error(' pipeline.js confirm --manifest <path> --items 1,3,5')
     process.exit(1)
   }
-  if (!all) {
-    console.error('错误: 必须指定 --all')
+  if (!all && !itemsStr) {
+    console.error('错误: 必须指定 --all 或 --items <id列表>')
     process.exit(1)
   }

   const manifest = loadManifest(manifestPath)
+  const targetIds = itemsStr
+    ? new Set(itemsStr.split(',').map(s => parseInt(s.trim(), 10)).filter(n => !isNaN(n)))
+    : null

   let count = 0
   for (const item of manifest.items) {
+    if (targetIds && !targetIds.has(item.id)) continue
     if (item.file && item.status === 'done' && !item.confirmed) {
       item.confirmed = true
       count++
@@ -30,7 +35,8 @@ function confirmManifest(options) {

   const total = manifest.items.length
   const confirmed = manifest.items.filter(it => it.confirmed).length
-  console.log(`已确认: ${count} items(共 ${confirmed}/${total} 已确认)`)
+  const scope = targetIds ? `${Array.from(targetIds).join(',')}` : '全部'
+  console.log(`已确认: ${count} items(范围: ${scope},共 ${confirmed}/${total} 已确认)`)
 }

 module.exports = { confirmManifest }
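
The `--items` parsing in this change relies on `parseInt` and silently drops anything that is not a number, so a token such as `3x` is read as `3` and a list with no valid ids confirms nothing without saying so. A stricter variant might look like the sketch below; `parseIdList` is a hypothetical helper and not part of the commit.

```javascript
// Hypothetical helper: turn "1, 3,5" into a Set of numeric ids and report bad tokens.
function parseIdList(str) {
  const ids = new Set()
  const rejected = []
  for (const token of String(str).split(',')) {
    const t = token.trim()
    if (t === '') continue // tolerate trailing commas and doubled separators
    if (/^\d+$/.test(t)) ids.add(Number(t))
    else rejected.push(t)
  }
  return { ids, rejected }
}

const { ids, rejected } = parseIdList('1, 3,5,abc,')
console.log([...ids], rejected) // [ 1, 3, 5 ] [ 'abc' ]
```

A caller could then refuse to continue when `rejected` is non-empty or `ids` is empty, instead of reporting that 0 items were confirmed.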
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
const fs = require('fs')
|
||||
const path = require('path')
|
||||
const { loadAccountConfig, saveManifest, ensureDir, ACCOUNTS_DIR, SKILLS_DIR } = require('./pipeline-utils')
|
||||
const { loadAccountConfig, saveManifest, ensureDir, slugify, ACCOUNTS_DIR, SKILLS_DIR } = require('./pipeline-utils')
|
||||
|
||||
function initManifest(options) {
|
||||
const { account: accountId, mode, items: itemsJson, itemsFile } = options
|
||||
@@ -40,7 +40,8 @@ function initManifest(options) {
|
||||
}
|
||||
|
||||
// 校验必填字段
|
||||
const requiredFields = ['shotDesc', 'script', 'imagePrompt']
|
||||
const requiredFields = ['shotDesc', 'script']
|
||||
const optionalFields = ['imagePrompt', 'videoPrompt', 'lastFramePrompt']
|
||||
const resolvedMode = mode || 'single'
|
||||
|
||||
for (let i = 0; i < rawItems.length; i++) {
|
||||
@@ -52,8 +53,7 @@ function initManifest(options) {
|
||||
}
|
||||
}
|
||||
if (resolvedMode === 'framePair' && !item.lastFramePrompt) {
|
||||
console.error(`错误: 首尾帧模式 items[${i}] 缺少 "lastFramePrompt"(imagePrompt 作为第一帧)`)
|
||||
process.exit(1)
|
||||
delete item.lastFramePrompt // 首尾帧模式 Step 2-A 补充
|
||||
}
|
||||
}
|
||||
|
||||
@@ -68,9 +68,11 @@ function initManifest(options) {
|
||||
|
||||
// 构建 items
|
||||
const items = rawItems.map((raw, i) => {
|
||||
const slug = slugify(raw.shotDesc || raw.script || `scene_${i + 1}`)
|
||||
const item = {
|
||||
id: i + 1,
|
||||
status: 'pending',
|
||||
file: `images/scene_${String(i + 1).padStart(2, '0')}_${slug}.jpeg`,
|
||||
shotDesc: raw.shotDesc || '',
|
||||
script: raw.script || '',
|
||||
duration: raw.duration || 5,
|
||||
@@ -129,7 +131,13 @@ function initManifest(options) {
   console.log(` Format: ${manifest.format}, mode: ${manifest.mode}`)
   console.log(` Items: ${items.length}`)
   console.log(` References: ${references.length}`)
-  if (items.some(it => !it.videoPrompt)) {
+  if (items.some(it => !it.imagePrompt)) {
+    console.log(` ⚠ ${items.filter(it => !it.imagePrompt).length} item(s) missing imagePrompt; run Step 2-A (image prompts) to fill them in`)
+  }
+  if (resolvedMode === 'framePair' && items.some(it => !it.lastFramePrompt)) {
+    console.log(` ⚠ ${items.filter(it => !it.lastFramePrompt).length} item(s) missing lastFramePrompt; run Step 2-A to fill them in`)
+  }
+  if (items.some(it => !it.videoPrompt && resolvedMode !== 'framePair')) {
     console.log(` ⚠ ${items.filter(it => !it.videoPrompt).length} item(s) missing videoPrompt; the video generation phase will skip them`)
   }
   console.log()
@@ -41,6 +41,9 @@ function validateManifest(manifestPath) {
       if (item.status && !['pending', 'generating', 'done', 'failed'].includes(item.status)) {
         issues.push(`${prefix} invalid status: ${item.status}`)
       }
+      if (item.status === 'done' && !item.file && !item.video && !item.url) {
+        issues.push(`${prefix} status=done but file/video/url (asset path) is missing`)
+      }
     })
   }

@@ -15,6 +15,14 @@ async function phaseAssemble(manifest, manifestPath, options) {
   const hasVideos = videoItems.length > 0
   const mode = hasVideos ? 'videos' : 'images'

+  // Pre-check: in images mode every item needs a file field
+  if (mode === 'images') {
+    const missingFile = manifest.items.filter(it => !it.file)
+    if (missingFile.length > 0) {
+      throw new Error(`${missingFile.length} item(s) are missing the file field (id: ${missingFile.map(it => it.id).join(', ')}); run the images phase first to generate the images`)
+    }
+  }
+
   const assembleArgs = {
     input: dir,
     manifest: manifestPath,
@@ -22,7 +30,6 @@ async function phaseAssemble(manifest, manifestPath, options) {
     format: manifest.format || accountConfig.defaultFormat || '9:16',
     subtitles: mode === 'images' ? 'true' : 'false',
     voiceover: manifest.items.some(it => it.audio) ? 'true' : 'false',
-    duration: '4',
     animation: capcutConfig.animation || '渐显+放大',
   }

@@ -17,7 +17,8 @@ async function phaseImages(manifest, manifestPath, options) {
   ensureDir(imagesDir)

   const items = manifest.items.filter(it =>
-    (!it.status || it.status === 'pending' || it.status === 'generating') && it.imagePrompt
+    ((!it.status || it.status === 'pending' || it.status === 'generating') && it.imagePrompt) ||
+    (it.status === 'done' && manifest.mode === 'framePair' && it.file && it.lastFramePrompt && !it.lastFrame)
   )
   if (items.length === 0) { log('images', 'no pending items, skipping'); return }

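The widened filter above selects two kinds of items: those that still need a full image run, and `done` items in `framePair` mode that only lack a last frame. A small sketch of that classification, assuming the same field names as the diff; `imageWorkFor` and its return labels are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper (not in the commit): classify what the images phase still owes an item.
function imageWorkFor(item, mode) {
  const unfinished = !item.status || item.status === 'pending' || item.status === 'generating'
  if (unfinished && item.imagePrompt) return 'full'
  // A finished first frame with a last-frame prompt but no last frame yet.
  const lastFrameMissing =
    item.status === 'done' && mode === 'framePair' &&
    Boolean(item.file) && Boolean(item.lastFramePrompt) && !item.lastFrame
  return lastFrameMissing ? 'lastFrameOnly' : 'none'
}

const samples = [
  { id: 1, status: 'pending', imagePrompt: 'a quiet street' },
  { id: 2, status: 'done', file: 'images/scene_02.jpeg', lastFramePrompt: 'street at dusk' },
  { id: 3, status: 'done', file: 'images/scene_03.jpeg', lastFramePrompt: 'x', lastFrame: 'images/scene_03_last.jpeg' },
]
for (const it of samples) console.log(it.id, imageWorkFor(it, 'framePair'))
```

Naming the two cases makes it easier to see why the later hunk can return early after generating only the last frame.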
@@ -45,6 +46,14 @@ async function phaseImages(manifest, manifestPath, options) {
     item.status = 'generating'
     saveManifest(manifestPath, manifest)

+    // lastFrame only: the first frame already exists, so skip first-frame generation
+    if (item.file && manifest.mode === 'framePair' && item.lastFramePrompt && !item.lastFrame) {
+      log('images', `[${idx}] generating missing lastFrame (first frame exists: ${item.file})`)
+      await generateLastFrame(item, idx, manifest, dir, imagesDir, model, ratio, manifestPath)
+      saveManifest(manifestPath, manifest)
+      return { ok: true }
+    }
+
     let result
     if (model === 'gemini') {
       result = await generateGemini(item, idx, dir, imagesDir, ratio, refs)
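@@ -2,7 +2,8 @@
  * Phase: tts - speech synthesis (generated sentence by sentence)
  *
  * Splits each item's script into short sentences at punctuation and synthesizes TTS audio for each sentence separately.
- * Results are written to item.segments[] so that subtitles align precisely with the audio.
+ * Always writes item.segments[]; a single sentence yields a one-element array.
+ * item.audio points to the first segment; item.audioDuration is the cumulative duration.
  */

 const path = require('path')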
@@ -29,47 +30,32 @@ async function phaseTts(manifest, manifestPath, options = {}) {

     try {
       const sentences = splitTextIntoSentences(fullText)
+      const segments = []
+      let totalDuration = 0

-      if (sentences.length <= 1) {
-        // Single sentence: no segments needed, use the original logic
-        const { filePath, duration } = await synthesize(fullText, {
+      for (let j = 0; j < sentences.length; j++) {
+        const sentence = sentences[j]
+        const segId = `${item.id || idx}_${j + 1}`
+        const { filePath, duration } = await synthesize(sentence, {
           outputDir: audioDir,
-          id: item.id || idx,
+          id: segId,
           voice: manifest.ttsVoice || undefined,
           instruction: manifest.ttsInstruction || undefined,
           rate: manifest.ttsRate || undefined,
         })
-        item.audio = path.relative(dir, filePath).replace(/\\/g, '/')
-        item.audioDuration = Math.round(duration * 1000) / 1000
-        log('tts', `[${idx}/${items.length}] ${duration.toFixed(1)}s: ${fullText.substring(0, 30)}...`)
-      } else {
-        // Multiple sentences: synthesize one by one and write to segments
-        const segments = []
-        let totalDuration = 0
-
-        for (let j = 0; j < sentences.length; j++) {
-          const sentence = sentences[j]
-          const segId = `${item.id || idx}_${j + 1}`
-          const { filePath, duration } = await synthesize(sentence, {
-            outputDir: audioDir,
-            id: segId,
-            voice: manifest.ttsVoice || undefined,
-            instruction: manifest.ttsInstruction || undefined,
-            rate: manifest.ttsRate || undefined,
-          })
-          segments.push({
-            text: sentence,
-            audio: path.relative(dir, filePath).replace(/\\/g, '/'),
-            duration: Math.round(duration * 1000) / 1000,
-          })
-          totalDuration += duration
-        }
-
-        item.segments = segments
-        item.audio = segments[0].audio
-        item.audioDuration = Math.round(totalDuration * 1000) / 1000
-        log('tts', `[${idx}/${items.length}] ${totalDuration.toFixed(1)}s (${segments.length} sentences): ${fullText.substring(0, 30)}...`)
+        segments.push({
+          text: sentence,
+          audio: path.relative(dir, filePath).replace(/\\/g, '/'),
+          duration: Math.round(duration * 1000) / 1000,
+        })
+        totalDuration += duration
       }
+
+      // Always use the segments array (one sentence = 1 element, several sentences = N elements)
+      item.segments = segments
+      item.audio = segments[0].audio
+      item.audioDuration = Math.round(totalDuration * 1000) / 1000
+      log('tts', `[${idx}/${items.length}] ${totalDuration.toFixed(1)}s (${segments.length} sentences): ${fullText.substring(0, 30)}...`)
     } catch (err) {
       item.status = 'failed'
       item.error = `TTS failed: ${err.message}`
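Because every item now carries `segments[]` with a per-sentence `duration`, subtitle timing can be derived by accumulating those durations. The following is only a sketch of that idea: `buildCues` and the `{ text, start, end }` cue shape are assumptions for illustration, not the subtitle code of this commit.

```javascript
// Hypothetical helper (not in the commit): turn per-sentence durations into subtitle cues.
function buildCues(segments) {
  let cursor = 0 // running offset in seconds from the start of the item
  return segments.map(seg => {
    const start = cursor
    // Round to milliseconds, mirroring the rounding used when segments are written.
    cursor = Math.round((cursor + seg.duration) * 1000) / 1000
    return { text: seg.text, start, end: cursor }
  })
}

const item = {
  segments: [
    { text: 'First sentence.', audio: 'audio/1_1.mp3', duration: 1.25 },
    { text: 'Second one.', audio: 'audio/1_2.mp3', duration: 2.5 },
  ],
}
const cues = buildCues(item.segments)
console.log(cues)
```

A single-sentence item needs no special case here, which is the practical benefit of always writing `segments[]`.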
@@ -1,7 +1,7 @@
 /**
  * Phase: upload - OSS upload
  *
- * Uploads the generated images (including first/last frames) to OSS and writes back url
+ * Uploads images (including first/last frames) and videos to OSS and writes back url / videoUrl
  */

 const path = require('path')
@@ -11,35 +11,64 @@ async function phaseUpload(manifest, manifestPath) {
   const dir = getManifestDir(manifestPath)
   const { uploadFile } = require('../oss-upload')

-  const items = manifest.items.filter(it =>
+  // Images (including the first frame in framePair mode)
+  const imageItems = manifest.items.filter(it =>
     it.status === 'done' && it.file && !it.url
   )
-  if (items.length === 0) { log('upload', 'no items to upload, skipping'); return }
+  // Videos
+  const videoItems = manifest.items.filter(it =>
+    it.status === 'done' && it.video && !it.videoUrl
+  )

-  log('upload', `${items.length} file(s) in total`)
+  if (imageItems.length === 0 && videoItems.length === 0) {
+    log('upload', 'no files to upload, skipping')
+    return
+  }

-  for (let i = 0; i < items.length; i++) {
-    const item = items[i]
-    const filePath = path.resolve(dir, item.file)
-    try {
-      const { url } = await uploadFile(filePath)
-      item.url = url
-      log('upload', `[${i + 1}/${items.length}] ${item.file} → ${url.substring(0, 60)}...`)
-    } catch (err) {
-      item.error = `upload failed: ${err.message}`
-      log('upload', `[${i + 1}/${items.length}] failed: ${err.message}`)
-    }
-    if (item.url && item.lastFrame && !item.lastFrameUrl) {
-      const lastPath = path.resolve(dir, item.lastFrame)
+  // Upload images
+  if (imageItems.length > 0) {
+    log('upload', `images: ${imageItems.length}`)
+    for (let i = 0; i < imageItems.length; i++) {
+      const item = imageItems[i]
+      const filePath = path.resolve(dir, item.file)
       try {
-        const { url } = await uploadFile(lastPath)
-        item.lastFrameUrl = url
-        log('upload', `[${i + 1}/${items.length}] lastFrame → OK`)
+        const { url } = await uploadFile(filePath)
+        item.url = url
+        log('upload', ` [${i + 1}/${imageItems.length}] ${item.file} → OK`)
       } catch (err) {
-        log('upload', `[${i + 1}/${items.length}] lastFrame upload failed: ${err.message}`)
+        item.error = `upload failed: ${err.message}`
+        log('upload', ` [${i + 1}/${imageItems.length}] failed: ${err.message}`)
       }
+      // framePair mode: upload lastFrame
+      if (item.url && item.lastFrame && !item.lastFrameUrl) {
+        const lastPath = path.resolve(dir, item.lastFrame)
+        try {
+          const { url } = await uploadFile(lastPath)
+          item.lastFrameUrl = url
+          log('upload', ` [${i + 1}/${imageItems.length}] lastFrame → OK`)
+        } catch (err) {
+          log('upload', ` [${i + 1}/${imageItems.length}] lastFrame upload failed: ${err.message}`)
+        }
+      }
+      saveManifest(manifestPath, manifest)
+    }
+  }
+
+  // Upload videos
+  if (videoItems.length > 0) {
+    log('upload', `videos: ${videoItems.length}`)
+    for (let i = 0; i < videoItems.length; i++) {
+      const item = videoItems[i]
+      const videoPath = path.resolve(dir, item.video)
+      try {
+        const { url } = await uploadFile(videoPath)
+        item.videoUrl = url
+        log('upload', ` [${i + 1}/${videoItems.length}] ${item.video} → OK`)
+      } catch (err) {
+        log('upload', ` [${i + 1}/${videoItems.length}] failed: ${err.message}`)
+      }
+      saveManifest(manifestPath, manifest)
     }
-    saveManifest(manifestPath, manifest)
   }
 }

@@ -112,13 +112,23 @@ function applyRetryFailed(manifest, phases) {
   for (const item of manifest.items) {
     if (item.status === 'failed' || item.status === 'partial') {
       if (item.url && item.videoPrompt && !item.video) {
+        // Image uploaded but video not generated → retry the video phase directly
         item.status = 'done'
         item.error = ''
         resetCount++
       } else if (!item.url && item.imagePrompt) {
-        item.status = 'pending'
-        item.error = ''
-        resetCount++
+        // Image not uploaded → retry the images phase
+        // If the first frame exists but lastFrame failed, reset only the lastFrame part
+        if (item.file && manifest.mode === 'framePair' && !item.lastFrame) {
+          item.status = 'done' // keep the first frame, only fill in lastFrame
+          item.error = ''
+          resetCount++
+        } else {
+          item.status = 'pending'
+          item.error = ''
+          delete item.file // drop the stale file reference to avoid duplicates
+          resetCount++
+        }
       }
     }
   }
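The retry branches above amount to a small decision table. As a rough sketch, the same decisions can be expressed as a pure function that only names the phase to retry and leaves the manifest untouched; `retryTarget` and its return labels are hypothetical and not part of the commit.

```javascript
// Hypothetical helper (not in the commit): decide which phase a failed item should retry.
function retryTarget(item, mode) {
  if (item.status !== 'failed' && item.status !== 'partial') return null
  // Image already uploaded but no video yet: only the video phase needs to rerun.
  if (item.url && item.videoPrompt && !item.video) return 'video'
  if (!item.url && item.imagePrompt) {
    // First frame exists in framePair mode: keep it and regenerate only the last frame.
    if (item.file && mode === 'framePair' && !item.lastFrame) return 'lastFrame'
    return 'image'
  }
  return null
}

console.log(retryTarget({ status: 'failed', url: 'https://example.invalid/a.jpeg', videoPrompt: 'pan left' }, 'single'))
console.log(retryTarget({ status: 'failed', imagePrompt: 'harbor', file: 'images/scene_01.jpeg' }, 'framePair'))
console.log(retryTarget({ status: 'failed', imagePrompt: 'harbor' }, 'single'))
```

Separating the decision from the mutation would let the three outcomes be unit-tested without building a full manifest.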
@@ -128,7 +138,7 @@ function applyRetryFailed(manifest, phases) {
     }
   }
   if (phases.includes('images')) {
-    if (manifest.items.some(it => !it.status || it.status === 'pending')) {
+    if (manifest.items.some(it => (!it.status || it.status === 'pending') || (it.status === 'done' && manifest.mode === 'framePair' && !it.lastFrame))) {
       manifest.pipeline.phases.images = 'pending'
     }
   }
@@ -159,7 +169,6 @@ function parseArgs(argv) {
     else if (argv[i] === '--image-model' && argv[i + 1]) args.imageModel = argv[++i]
     else if (argv[i] === '--video-model' && argv[i + 1]) args.videoModel = argv[++i]
     else if (argv[i] === '--references' && argv[i + 1]) args.references = argv[++i]
-    else if (argv[i] === '--style' && argv[i + 1]) args.style = argv[++i]
     else if (argv[i] === '--all') args.all = true
     else if (!args.command) args.command = argv[i]
   }
@@ -219,6 +228,7 @@ async function main() {
     console.log(' pipeline.js init --account <id> --mode <single|framePair> --items <JSON> [--items-file <path>] [--image-model gemini|mj] [--video-model veo3-fast|grok|kling] [--format 9:16]')
     console.log(' pipeline.js validate --manifest <path>')
     console.log(' pipeline.js confirm --manifest <path> --all')
+    console.log(' pipeline.js confirm --manifest <path> --items 1,3,5')
     console.log(' pipeline.js run --manifest <path> [--account id] [--phase p1,p2] [--resume] [--retry-failed]')
     console.log(' pipeline.js status --manifest <path>')
     console.log('')
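The new `confirm --items 1,3,5` usage line implies that a comma-separated id list is mapped onto manifest items. The diff does not show how `confirm` parses or applies that list, so the following is only an assumed sketch: `parseIdList` and `selectItems` are hypothetical names, and what "confirming" an item actually changes in the manifest is left out.

```javascript
// Hypothetical helpers (not in the commit): parse "1,3,5" and pick the matching manifest items.
function parseIdList(value) {
  return value
    .split(',')
    .map(part => Number.parseInt(part.trim(), 10))
    .filter(n => Number.isInteger(n) && n > 0) // drop empty, non-numeric, and non-positive entries
}

function selectItems(manifest, ids) {
  const wanted = new Set(ids)
  return manifest.items.filter(it => wanted.has(it.id))
}

const manifest = { items: [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }, { id: 5 }] }
const ids = parseIdList('1, 3,5')
console.log(ids)
console.log(selectItems(manifest, ids).map(it => it.id))
```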
2
.gitignore
vendored
@@ -2,7 +2,7 @@
|
||||
node_modules/
|
||||
|
||||
|
||||
|
||||
config.json
|
||||
|
||||
# Local settings
|
||||
.claude/settings.local.json
|
||||
|
||||
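@@ -2,9 +2,7 @@

 ## 1. Role Definition

-You are a prompt engineer specializing in image generation models, with deep skills in visual storytelling and lighting design.
-
-Your sole task: take the input shot description (shotDesc) as the core content basis, combine it with the narration semantics, the script context, and the director style specified upstream, and produce one complete imagePrompt that can be sent directly to the image generation model.
+You are a director of photography (DP) with 15 years of experience, skilled at turning written storyboards into highly expressive opening frames. You care not only about "what is drawn" but also about "spatial narration" and "the order of light and shadow".

 > **Important premise:** The image you generate is the opening frame of the downstream video clip. The composition and pose must capture a moment that is "about to happen", not a state that is "already completed".
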