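feat(agent): add video-creation workflow skill system and workflow tools
Adds a SKILL.md-based video-creation workflow system: the Agent loads structured director instructions from the skills directory. Implements three workflow tools, validate_storyboard, update_manifest_items, and confirm_images, to support storyboard validation, prompt updates, and image confirmation.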
@@ -33,19 +33,19 @@ ${accountList}
 
 ## Your capabilities
 1. **List accounts** - use list_accounts to list all available accounts and their configuration
-2. **Create an account** - use create_account to create a new short-video account and configure image/video models, aspect ratio, etc.
+2. **Create an account** - use create_account to create a new short-video account
 3. **View account configuration** - use get_account_config to get an account's detailed configuration
-4. **View pipeline progress** - use pipeline_status to check creation progress
-5. **Run a creation phase** - use run_pipeline_phase to execute a pipeline phase
-
-## Video creation workflow
-1. Confirm the user's intent (A. slideshow video / B. AI video)
-2. Select or create an account
-3. Plan the storyboard script
-4. Generate images (images phase)
-5. Generate video clips (videos phase, mode B only)
-6. Voice-over (tts phase)
-7. Assemble the final video (assemble phase)
+4. **Get prompt templates** - use get_account_prompts to get the account's storyboard/image/video templates
+5. **Validate the storyboard** - use validate_storyboard to check storyboard quality (TTS estimate, ratio pre-check)
+6. **Initialize the manifest** - use create_manifest to create the project skeleton
+7. **Update the manifest** - use update_manifest_items to update each shot's imagePrompt/videoPrompt
+8. **Generate images** - use generate_images or run_pipeline_phase --phase images
+9. **Confirm images** - use confirm_images to mark images as confirmed
+10. **Generate videos** - use generate_videos or run_pipeline_phase --phase upload,videos
+11. **TTS + final assembly** - use run_pipeline_phase --phase tts,assemble
+12. **Check progress** - use pipeline_status to check creation progress
+13. **View history** - use list_outputs to view past generation records
+14. **Read the manifest** - use get_manifest to view manifest details
 
 ## Behavior guidelines
 - Reply in Chinese; be friendly and professional
@@ -3,6 +3,9 @@ import type { AgentEvent } from '@earendil-works/pi-agent-core';
 import { streamSimple } from '@earendil-works/pi-ai';
 import type { AssistantMessage } from '@earendil-works/pi-ai';
 import { WebSocket } from 'ws';
+import fs from 'fs';
+import path from 'path';
+import { fileURLToPath } from 'url';
 import { createPiModel } from './pi-model';
 import { createPiTools } from './pi-tools';
 import { tools } from './tools/index';
@@ -10,6 +13,32 @@ import { videoAgent } from './index';
 import { dbToPiMessages, saveUserMessage, saveAssistantMessage, saveToolResult, type DbMessage } from './pi-persist';
 import { getDb } from '../db';
 
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+const SKILLS_DIR = path.join(__dirname, 'skills');
+
+function loadSkills(): string {
+  const parts: string[] = [];
+  if (!fs.existsSync(SKILLS_DIR)) return '';
+
+  const skillDirs = fs.readdirSync(SKILLS_DIR, { withFileTypes: true })
+    .filter((d) => d.isDirectory());
+
+  for (const dir of skillDirs) {
+    const skillFile = path.join(SKILLS_DIR, dir.name, 'SKILL.md');
+    if (fs.existsSync(skillFile)) {
+      let content = fs.readFileSync(skillFile, 'utf-8');
+      // Strip YAML frontmatter
+      content = content.replace(/^---[\s\S]*?---\n*/, '');
+      parts.push(content.trim());
+    }
+  }
+
+  return parts.join('\n\n---\n\n');
+}
+
+const cachedSkillContent = loadSkills();
+
 interface RunContext {
   currentAssistantMsgId: string | null;
 }
@@ -35,7 +64,7 @@ export async function runAgentChat(ws: WebSocket, convId: string, userContent: s
 
   const agent = new Agent({
     initialState: {
-      systemPrompt: videoAgent.getSystemPrompt(),
+      systemPrompt: videoAgent.getSystemPrompt() + (cachedSkillContent ? '\n\n' + cachedSkillContent : ''),
       model,
       thinkingLevel: 'off',
       tools: piTools,
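The frontmatter-stripping step in loadSkills can be exercised in isolation. The sketch below is not part of the commit: `stripFrontmatter` and `joinSkills` are hypothetical helper names, and the sample SKILL.md text is made up for illustration. It applies the same regular expression and the same `\n\n---\n\n` separator as the hunk above.

```typescript
// Hypothetical helpers mirroring the string handling in loadSkills (names are assumptions).
function stripFrontmatter(content: string): string {
  // Same pattern as the commit: a leading '---', the shortest run of text
  // up to the next '---', then any newlines that follow it.
  return content.replace(/^---[\s\S]*?---\n*/, '').trim();
}

function joinSkills(bodies: string[]): string {
  // Skills are concatenated with a Markdown horizontal rule between them.
  return bodies.join('\n\n---\n\n');
}

// Made-up sample file content, for illustration only.
const sample = [
  '---',
  'name: video-from-script',
  'description: demo',
  '---',
  '',
  '# Video creation workflow',
  '',
  'Body text.',
].join('\n');

const stripped = stripFrontmatter(sample);
console.log(stripped);
```

Because the match is lazy, a frontmatter value that itself contains `---` would end the match early; a file that does not start with `---` passes through unchanged.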
127
web/server/agent/skills/video-from-script/SKILL.md
Normal file
@@ -0,0 +1,127 @@
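+---
+name: video-from-script
+description: Video creation workflow. A. Slideshow video (images and text to video): image generation + voice-over + subtitles; B. AI video: image generation + AI video generation + assembly.
+---
+
+# Video creation workflow
+
+**You are the director.** Responsibilities: understand intent → orchestrate and schedule → enforce quality gates → communicate with the user.
+
+## Two kinds of output
+
+| Type | Flow | AI video |
+|------|------|--------|
+| **A. Slideshow video** | storyboard → manifest → images → TTS + assembly | ❌ |
+| **B. AI video** | storyboard → manifest → images → videos → TTS + assembly | ✅ |
+
+Mode B has two variants: **single image** (1 image → 1 video) / **first and last frame** (2 images → transition video)
+
+## Routing rules
+
+| User intent | Type |
+|---------|------|
+| "图文成片" (images and text to video), "幻灯片视频" (slideshow video) | A |
+| "图生视频" (image to video), "AI视频" (AI video) | B (single image) |
+| "首尾帧" (first and last frame), "关键帧" (keyframe) | B (first and last frame) |
+| Only says "做视频" (make a video) | **Ask a follow-up**: A or B? |
+
+## Core constraints
+
+1. **No skipping steps**: every phase must be reviewed before the next one begins
+2. **manifest.json is the single source of truth**: create it with create_manifest; all later operations read and write this file
+3. **The storyboard table is the backbone contract**: once confirmed, the number and order of shots must not change
+4. **Never hand-write manifest.json**: it must be modified through tools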
+
+## Execution flow
+
+### Step -1: Intent confirmation (item by item)
+
+```
+1. Output type: A/B? → if B: single image / first and last frame?
+2. Source material: existing copy/images, or AI-generated?
+3. Account: show list_accounts → user picks
+4. Parameters: aspect ratio, model; inherit from account.json where possible
+```
+
+→ After confirmation, output the execution plan; enter Step 0 only once the user says "开始" (start).
+
+### Step 0: Preflight checks
+
+- get_account_config reads the configuration
+- get_account_prompts checks that the templates exist
+- validate_account passes
+
+### Step 1: Storyboard script
+
+Fetch the templates with get_account_prompts → generate the storyboard JSON according to the template rules:
+
+```json
+[{"id":1,"shotDesc":"English shot description","script":"Chinese voice-over copy","duration":"TTS estimate (= character count ÷ 5)","directorRef":"fincher"}]
+```
+
+**Timeline iron rules**:
+- Speech rate is 5 characters/second, TTS at 1.15x (hard-coded)
+- Each shot's TTS estimate must be ≤ 6s; anything longer must be split at a semantic break
+- Concatenated script fields = the original text, character for character
+- ratio = videoDur(6s) / audioDur < 0.9 → forbidden; send back to be re-cut
+
+→ Validate with validate_storyboard → show to the user for confirmation → lock as the backbone contract
+
+### Step 2: Image prompts + image generation
+
+- Fetch the image prompt template → generate an imagePrompt for each shot
+- update_manifest_items writes the imagePrompt
+- run_pipeline_phase --phase images generates the images
+- confirm_images for manual confirmation (optional)
+
+### Step 3: Video prompts + video generation (mode B)
+
+- Fetch the video prompt template → generate a videoPrompt for each shot
+- update_manifest_items writes the videoPrompt
+- run_pipeline_phase --phase upload,videos generates the videos
+
+### Step 4: TTS + final assembly
+
+- run_pipeline_phase --phase tts,assemble
+- Check that subtitles are accurate and the BGM does not drown out the voice-over
+
+---
+
+## Quality gates
+
+### Storyboard quality gate
+
+| Check | Standard | On failure |
+|--------|------|--------|
+| Per-shot TTS estimate | ≤ 6s | Force a split |
+| Long sentences | TTS > 6s → split into semantic clauses | Send back for rewrite |
+| Concatenation check | All script fields concatenated = original text | Send back for rewrite |
+| ratio pre-check | videoDur/audioDur < 0.9 → forbidden | Send back to be re-cut |
+
+### assemble iron rules
+
+- Audio plays at its original 1.15x speed; there is no speed field
+- Video may only be sped up or truncated
+- **No slow motion, frozen frames, or audio speed changes**
+
+---
+
+## Video model reference
+
+| Model | Duration | Aspect ratio | Single image | First and last frame |
+|------|------|------|------|--------|
+| Grok | 6s | any | ✅ | ❌ |
+| Veo3-fast | ~8s | 16:9,9:16 | ✅ | ✅ |
+| Veo3-fast-frames | ~8s | 16:9,9:16 | ✅ | ✅ |
+| Kling | 6s | any | ✅ | ✅ |
+
+**Fallback chain**: Grok ↔ VEO ↔ Kling
+
+## Image model reference
+
+| Model | Text to image | Image to image | Style reference |
+|------|--------|--------|---------|
+| Gemini | ✅ | ✅ | local file |
+| GPT Image | ✅ | ✅ | multi-image input |
+| MJ | ✅ | ✅ | --sref URL |
+| Kling | ✅ | ❌ | style_image |
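The storyboard JSON and the concatenation rule from Step 1 of SKILL.md can be sketched as a type plus a check. This is an illustration, not code from the commit: the `StoryboardShot` interface is inferred from the one-line JSON example in SKILL.md, `scriptsMatchOriginal` is a hypothetical helper, and the sample text is made up.

```typescript
// Shape inferred from the JSON example in SKILL.md (an assumption, not a committed type).
interface StoryboardShot {
  id: number;
  shotDesc: string;      // English shot description
  script: string;        // Chinese voice-over copy
  duration?: number | string;
  directorRef?: string;
}

// Hypothetical helper: the rule says the concatenated script fields must
// reproduce the original text exactly, character for character.
function scriptsMatchOriginal(shots: StoryboardShot[], original: string): boolean {
  return shots.map((shot) => shot.script).join('') === original;
}

// Made-up sample text, split at a semantic break (the comma).
const original = '今天天气很好,我们去公园散步。';
const goodSplit: StoryboardShot[] = [
  { id: 1, shotDesc: 'Sunny street, wide shot', script: '今天天气很好,' },
  { id: 2, shotDesc: 'Park path, tracking shot', script: '我们去公园散步。' },
];
// Same split, but the comma was dropped from the first shot.
const badSplit: StoryboardShot[] = [
  { id: 1, shotDesc: 'Sunny street, wide shot', script: '今天天气很好' },
  { id: 2, shotDesc: 'Park path, tracking shot', script: '我们去公园散步。' },
];

console.log(scriptsMatchOriginal(goodSplit, original));
console.log(scriptsMatchOriginal(badSplit, original));
```

A check like this needs the original text as an input, which the committed validate_storyboard tool does not take; it reports totals instead.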
26
web/server/agent/tools/confirm-images.ts
Normal file
@@ -0,0 +1,26 @@
+import { execSync } from 'child_process';
+import { PIPELINE_SCRIPT, PROJECT_ROOT } from './shared';
+import type { ToolDefinition } from './types';
+
+export const confirmImages: ToolDefinition = {
+  name: 'confirm_images',
+  description: 'Confirm storyboard image quality and mark every image in the manifest as confirmed=true. Confirmation can also be skipped to confirm everything in bulk.',
+  input_schema: {
+    type: 'object',
+    properties: {
+      manifestPath: { type: 'string', description: 'Path to manifest.json' },
+      skip: { type: 'boolean', description: 'Skip manual confirmation and confirm everything directly (default false)' },
+    },
+    required: ['manifestPath'],
+  },
+  execute: async (params) => {
+    const { manifestPath, skip = false } = params as { manifestPath: string; skip?: boolean };
+    try {
+      const cmd = `node "${PIPELINE_SCRIPT}" confirm --manifest "${manifestPath}"${skip ? ' --all' : ''}`;
+      const output = execSync(cmd, { cwd: PROJECT_ROOT, encoding: 'utf-8' });
+      return `Image confirmation complete:\n${output}`;
+    } catch (err: any) {
+      return `Confirmation failed: ${err.message}`;
+    }
+  },
+};
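confirm_images builds a shell command by interpolating manifestPath into a string, so a path containing a double quote or `$(...)` would be interpreted by the shell. One possible alternative, sketched here under the assumption that the pipeline script accepts the same `confirm --manifest <path> [--all]` arguments shown in the diff, is to build an argument array and hand it to Node's `child_process.execFileSync`, which does not spawn a shell by default. `buildConfirmArgs` is a hypothetical helper name and `/opt/pipeline.js` is a placeholder path.

```typescript
// Hypothetical helper (not part of the commit): builds the argv for the
// pipeline's confirm subcommand instead of a single shell string.
function buildConfirmArgs(pipelineScript: string, manifestPath: string, skip = false): string[] {
  const args = [pipelineScript, 'confirm', '--manifest', manifestPath];
  if (skip) args.push('--all');
  return args;
}

// A path that would break out of the quoted string in a shell command
// stays a single argv entry here.
const hostilePath = 'out/"; rm -rf ~; ".json';
const confirmArgs = buildConfirmArgs('/opt/pipeline.js', hostilePath, true);
console.log(confirmArgs);

// Usage sketch, assuming the same constants as the committed tool:
//   import { execFileSync } from 'child_process';
//   const output = execFileSync('node', confirmArgs, { cwd: PROJECT_ROOT, encoding: 'utf-8' });
```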
@@ -12,6 +12,9 @@ import { generateVideos } from './generate-videos';
 import { listOutputs } from './list-outputs';
 import { getManifest } from './get-manifest';
 import { createManifest } from './create-manifest';
+import { validateStoryboard } from './validate-storyboard';
+import { confirmImages } from './confirm-images';
+import { updateManifestItems } from './update-manifest-items';
 
 export const tools: ToolDefinition[] = [
   listAccounts,
@@ -25,4 +28,7 @@ export const tools: ToolDefinition[] = [
   listOutputs,
   getManifest,
   createManifest,
+  validateStoryboard,
+  confirmImages,
+  updateManifestItems,
 ];
45
web/server/agent/tools/update-manifest-items.ts
Normal file
@@ -0,0 +1,45 @@
+import path from 'path';
+import fs from 'fs';
+import { PROJECT_ROOT, loadJSON } from './shared';
+import type { ToolDefinition } from './types';
+
+export const updateManifestItems: ToolDefinition = {
+  name: 'update_manifest_items',
+  description: 'Update fields (such as imagePrompt, videoPrompt) on the specified items in manifest.json. Only the supplied fields are updated; other fields are not overwritten.',
+  input_schema: {
+    type: 'object',
+    properties: {
+      manifestPath: { type: 'string', description: 'Path to manifest.json' },
+      updates: { type: 'string', description: 'JSON array; each element must include id (shot number) and the fields to update, e.g. [{"id":1,"imagePrompt":"..."},{"id":2,"imagePrompt":"..."}]' },
+    },
+    required: ['manifestPath', 'updates'],
+  },
+  execute: async (params) => {
+    const { manifestPath, updates } = params as { manifestPath: string; updates: string };
+    const resolved = path.isAbsolute(manifestPath)
+      ? manifestPath
+      : path.resolve(PROJECT_ROOT, manifestPath);
+
+    if (!fs.existsSync(resolved)) return `manifest not found: ${resolved}`;
+
+    let updateList: any[];
+    try { updateList = JSON.parse(updates); } catch { return 'Error: updates is not valid JSON'; }
+    if (!Array.isArray(updateList)) return 'Error: updates must be an array';
+
+    const manifest = loadJSON(resolved) as { items: any[] };
+    if (!manifest.items) return 'Error: manifest has no items array';
+
+    let updated = 0;
+    for (const upd of updateList) {
+      const idx = manifest.items.findIndex((item: any) => item.id === upd.id);
+      if (idx === -1) return `Error: no item with id=${upd.id}`;
+
+      const { id, ...fields } = upd;
+      Object.assign(manifest.items[idx], fields);
+      updated++;
+    }
+
+    fs.writeFileSync(resolved, JSON.stringify(manifest, null, 2), 'utf-8');
+    return `Updated ${updated}/${manifest.items.length} item(s)`;
+  },
+};
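The merge semantics of update_manifest_items (match by id, copy only the supplied fields) can be sketched as a pure function that is testable without touching the file system. `applyUpdates` and the `ManifestItem` type are hypothetical; unlike the committed code, this sketch returns a new array rather than mutating in place, and it reports an unknown id by throwing.

```typescript
// Hypothetical item shape: an id plus arbitrary other fields.
interface ManifestItem {
  id: number;
  [field: string]: unknown;
}

// Hypothetical pure version of the update loop in update_manifest_items.
function applyUpdates(items: ManifestItem[], updates: ManifestItem[]): ManifestItem[] {
  // Shallow-copy every item so the input array is left untouched.
  const next: ManifestItem[] = items.map((item) => ({ ...item }));
  for (const upd of updates) {
    // Strict equality, as in the commit: the string "1" does not match the number 1.
    const target = next.find((item) => item.id === upd.id);
    if (!target) throw new Error(`no item with id=${upd.id}`);
    for (const [key, value] of Object.entries(upd)) {
      if (key !== 'id') target[key] = value; // copy only the supplied fields
    }
  }
  return next;
}

const before: ManifestItem[] = [
  { id: 1, shotDesc: 'a', imagePrompt: 'old' },
  { id: 2, shotDesc: 'b' },
];
const after = applyUpdates(before, [{ id: 2, imagePrompt: 'new' }]);
console.log(after);
```

In the committed tool the early return on an unknown id happens before `fs.writeFileSync`, so a batch with one bad id leaves the file on disk unchanged.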
70
web/server/agent/tools/validate-storyboard.ts
Normal file
@@ -0,0 +1,70 @@
+import type { ToolDefinition } from './types';
+
+export const validateStoryboard: ToolDefinition = {
+  name: 'validate_storyboard',
+  description: 'Validate storyboard script quality: TTS estimate ≤ 6s, ratio pre-check, script concatenation check. Returns the validation result and a list of issues.',
+  input_schema: {
+    type: 'object',
+    properties: {
+      items: { type: 'string', description: 'Storyboard JSON array string; each element must include shotDesc and script, with optional duration and directorRef' },
+      videoModelDuration: { type: 'number', description: 'Fixed duration of the video model in seconds, default 6' },
+    },
+    required: ['items'],
+  },
+  execute: async (params) => {
+    const { items, videoModelDuration = 6 } = params as { items: string; videoModelDuration?: number };
+    let parsed: any[];
+    try { parsed = JSON.parse(items); } catch { return 'Error: items is not valid JSON'; }
+    if (!Array.isArray(parsed) || parsed.length === 0) return 'Error: items must be a non-empty array';
+
+    const errors: string[] = [];
+    const warnings: string[] = [];
+    const TTS_SPEED = 1.15;
+    const CHARS_PER_SEC = 5;
+
+    for (const item of parsed) {
+      const idx = item.id ?? parsed.indexOf(item) + 1;
+      const script: string = item.script || '';
+      const charCount = script.length;
+      const ttsEstimate = charCount / CHARS_PER_SEC;
+      const audioDur = ttsEstimate * TTS_SPEED;
+      const ratio = videoModelDuration / audioDur;
+
+      if (!item.shotDesc) errors.push(`Shot ${idx}: missing shotDesc`);
+      if (!script) errors.push(`Shot ${idx}: missing script`);
+
+      // TTS estimate check
+      if (ttsEstimate > 6) {
+        errors.push(`Shot ${idx}: TTS estimate ${ttsEstimate.toFixed(1)}s > 6s, must be split (script: ${script.slice(0, 30)}...)`);
+      }
+
+      // ratio pre-check
+      if (ratio < 0.9) {
+        errors.push(`Shot ${idx}: ratio ${ratio.toFixed(2)} < 0.9, audio too long, needs splitting (audio=${audioDur.toFixed(1)}s, video=${videoModelDuration}s)`);
+      }
+
+      if (!item.directorRef) {
+        warnings.push(`Shot ${idx}: consider filling in directorRef`);
+      }
+    }
+
+    // Script concatenation check: returns statistics rather than comparing against the original text (the original is supplied by the user)
+    const totalChars = parsed.reduce((sum: number, i: any) => sum + (i.script?.length || 0), 0);
+    const totalAudio = (totalChars / CHARS_PER_SEC) * TTS_SPEED;
+
+    const result = {
+      valid: errors.length === 0,
+      shotCount: parsed.length,
+      totalChars,
+      estimatedTotalAudio: `${totalAudio.toFixed(1)}s`,
+      errors,
+      warnings,
+    };
+
+    if (errors.length > 0) {
+      return `❌ Storyboard validation failed (${errors.length} issue(s)):\n\n${errors.map((e, i) => `${i + 1}. ${e}`).join('\n')}${warnings.length ? `\n\n⚠️ Warnings:\n${warnings.map((w, i) => `${i + 1}. ${w}`).join('\n')}` : ''}\n\nStats: ${parsed.length} shots, ${totalChars} characters total, estimated audio ${totalAudio.toFixed(1)}s`;
+    }
+
+    return `✅ Storyboard validation passed\n\nStats: ${parsed.length} shots, ${totalChars} characters total, estimated audio ${totalAudio.toFixed(1)}s${warnings.length ? `\n\n⚠️ Warnings:\n${warnings.map((w, i) => `${i + 1}. ${w}`).join('\n')}` : ''}`;
+  },
+};
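The per-shot arithmetic in validate_storyboard can be worked through by hand. The sketch below re-implements only the numbers, using the constants from the diff (5 characters per second, a 1.15 multiplier, a 6 s video duration, a 0.9 ratio floor); `estimateShot` and `ShotEstimate` are hypothetical names. With these constants the ratio check is the tighter of the two: a 30-character script passes the 6 s TTS limit (30 / 5 = 6.0) but fails the ratio check, because 6 / (6.0 × 1.15) ≈ 0.87.

```typescript
// Constants copied from the committed validate_storyboard tool.
const CHARS_PER_SEC = 5;
const TTS_SPEED = 1.15;
const MAX_TTS_SECONDS = 6;
const RATIO_FLOOR = 0.9;

interface ShotEstimate {
  ttsEstimate: number; // seconds, character count / 5
  audioDur: number;    // seconds, ttsEstimate * 1.15 (as the commit computes it)
  ratio: number;       // videoDur / audioDur
  ok: boolean;         // passes both the TTS limit and the ratio floor
}

// Hypothetical helper reproducing the two numeric checks for one shot.
function estimateShot(charCount: number, videoDur = 6): ShotEstimate {
  const ttsEstimate = charCount / CHARS_PER_SEC;
  const audioDur = ttsEstimate * TTS_SPEED;
  const ratio = videoDur / audioDur;
  const ok = ttsEstimate <= MAX_TTS_SECONDS && ratio >= RATIO_FLOOR;
  return { ttsEstimate, audioDur, ratio, ok };
}

// Print the numbers around the boundary.
for (const chars of [28, 29, 30]) {
  const e = estimateShot(chars);
  console.log(chars, e.ttsEstimate.toFixed(1), e.audioDur.toFixed(2), e.ratio.toFixed(2), e.ok);
}
```

Under the default 6 s video duration, 28 characters is the longest script that clears both checks; 29 characters already drops the ratio just below 0.9.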