AI Agent skill for narrator-ai-cli — CLI client for Narrator AI video narration API
```bash
# Add to your Claude Code skills
git clone https://github.com/NarratorAI-Studio/narrator-ai-cli-skill
```

```yaml
name: narrator-ai-cli
version: "1.0.4"
license: MIT
description: >-
  Automated AI movie / short-drama commentary video generation (AI 解说大师 CLI
  Skill). Triggers when the user wants to create movie commentary videos,
  short-drama commentary, film/TV re-creation, AI voice-over narration videos,
  film commentary, video narration, drama dubbing, or movie narration. Ships
  with a built-in movie material library, BGM, multilingual dubbing, and
  narration templates. Automates the full flow via the narrator-ai-cli command
  line: find footage → pick template → pick BGM → pick voice → generate script
  → compose video. CLI client for Narrator AI video narration API.
user-invocable: true
tags:
```
CLI client for Narrator AI video narration API. Designed for AI agents and developers.
This file covers decision flow, the common workflow, and pointers. Detailed lookups live in references/:
| Topic | File |
|---|---|
| Resource selection (material / BGM / dubbing / templates) — list commands, response formats, field mapping | references/resources.md |
| Full workflow steps with parameter tables and JSON examples (Fast Path + Standard Path) | references/workflows.md |
| Magic Video — optional visual template step (catalog, params, language rules) | references/magic-video.md |
| Polling pattern, task types, file ops, user account, error codes | references/operations.md |
```
                  ┌─── Fast Path (original script, cheaper) ────┐
                  │ fast-writing → fast-clip-data               │
Source material ──┤        ↓                                    ├──→ video-composing ──→ (magic-video)
(material list /  │ [video-composing keys off                   │         final MP4 URL      optional visual
 search-movie /   │  fast-clip-data.task_order_num]             │                            template pass
 file upload)     └─────────────────────────────────────────────┘
                  ┌─── Standard Path (adapted script) ──────────┐
                  │ popular-learning → generate-writing         │
                  │   → clip-data                               │
                  │        ↓                                    │
                  │ [video-composing keys off                   │
                  │  generate-writing.task_order_num]           │
                  └─────────────────────────────────────────────┘
```
Install this Skill in your AI agent (OpenClaw, Windsurf, WorkBuddy, etc.), then just say "create a movie narration video" — the AI handles the rest.
A machine-readable skill file (SKILL.md) that teaches AI agents how to use the narrator-ai-cli tool for automated video narration production.
You say: "Create a narration video for Pegasus in a comedy style"
AI executes: Search movie → Select template → Choose BGM → Pick voice → Generate script → Compose video → Return download link
| | CLI (command-line tool) | Skill (capability description) |
|---|---|---|
| What it is | A set of executable commands | Instructions that teach AI how to use those commands |
| Analogy | Kitchen tools | A recipe book |
| Works alone? | Yes, manually in a terminal | No, requires the CLI |
In short: CLI is the hands. Skill is the brain. Together, the AI agent can produce videos end-to-end.
```bash
pip install "narrator-ai-cli @ git+https://github.com/GridLtd-ProductDev/narrator-ai-cli.git"
```
See narrator-ai-cli for detailed installation options.
```bash
narrator-ai-cli config set app_key <your_app_key>
```
Always:
- Confirm before acting. Every resource (source, BGM, dubbing, template) and every `magic-video` submission requires explicit user approval. Never auto-select, never auto-submit.
- Source data, never invent. Construct `confirmed_movie_json` from `material list` fields or `task search-movie` output. If neither yields it, ask the user — do not fabricate.
- Honor the language chain. The dubbing voice's language defines the writing task `language` param AND every `magic-video` text param. All three must match. → references/magic-video.md § Language Awareness
- Paginate `material list` to exhaustion, search programmatically. Fetch all pages until `total` is consumed, then `grep -i` or `python3 -c` on the JSON. Never trust truncated terminal display.
- Poll with the canonical `while` loop at 5-second intervals. Never use a fixed-iteration `for` loop. → references/operations.md § Task Polling

Never:
- Submit `magic-video` without showing the full request body (templates + every `template_params` value) and getting user confirmation. The cost is 30 pts/minute and irreversible.
- Submit Chinese default values for `magic-video` text params when the narration language is non-Chinese. The defaults are hardcoded Chinese and will appear as Chinese text in a non-Chinese video.
- Submit `.task_id` (32-char hex) as `order_num`. Downstream tasks want `.task_order_num` (the prefixed string like `generate_writing_xxxxx`), not `.task_id`. Submitting the hex returns `10001 任务关联记录数据异常` (task association record data error). The other look-alike — `.results.order_info.order_num` (`script_xxxxx`) — is also wrong; see references/operations.md § Task Query Response Shape.
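The last rule can be enforced mechanically before any submission. A minimal sketch (a hypothetical helper, not part of narrator-ai-cli) that rejects the two documented look-alikes:

```python
import re

# 32 lowercase hex chars is the .task_id shape, never a valid order_num
_HEX32 = re.compile(r"^[0-9a-f]{32}$")

def validate_order_num(value: str) -> str:
    """Reject the two order_num look-alikes before a downstream submission.

    Illustrative guard: the .task_id and script_ shapes are taken from the
    rules above.
    """
    if _HEX32.match(value):
        raise ValueError("looks like a 32-char hex .task_id; pass .task_order_num instead")
    if value.startswith("script_"):
        raise ValueError("looks like .results.order_info.order_num; pass .task_order_num instead")
    return value
```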
This skill assumes the narrator-ai-cli binary is installed and configured with a valid NARRATOR_APP_KEY. See README.md for install / setup. Agents can verify with narrator-ai-cli user balance.
| Concept | Description |
|---|---|
| file_id | 32-char hex string for uploaded files. Via file upload or task results |
| task_id | 32-char hex string returned on task creation. Poll with task query |
| task_order_num | Assigned after task creation. Used as order_num for downstream tasks |
| files[] | Output files in the completed task response (flat, top-level array). Each entry has file_id, file_path, suffix. Read .files[0].file_id for the next step's input |
| learning_model_id | Narration style model — from a pre-built template (90+) or popular-learning result |
| learning_srt | Reference SRT file_id. Mutually exclusive with learning_model_id |
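The polling contract and the `files[]` shape above can be sketched as a pair of helpers. This is an illustrative sketch, not part of the CLI: the exact `task query <task_id> --json` invocation and the field names (`status`, `files`) are assumptions taken from the table and rules above.

```python
import json
import subprocess
import time

def poll_task(task_id: str, query=None, interval: float = 5.0,
              timeout: float = 1800.0) -> dict:
    """Poll until top-level status == 2 (completed), the canonical while-loop
    pattern. `query` is injectable for testing; the default shells out to the
    CLI (exact flag shape is an assumption)."""
    if query is None:
        def query(tid):
            out = subprocess.run(
                ["narrator-ai-cli", "task", "query", tid, "--json"],
                capture_output=True, text=True, check=True,
            ).stdout
            return json.loads(out)
    deadline = time.monotonic() + timeout
    while True:  # open-ended while, never a fixed-iteration for-loop
        task = query(task_id)
        if task.get("status") == 2:
            return task
        if time.monotonic() > deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        time.sleep(interval)

def next_input_file_id(task: dict) -> str:
    """Read .files[0].file_id from a completed task record for the next step."""
    files = task.get("files") or []
    if task.get("status") != 2 or not files:
        raise RuntimeError("task not completed or has no output files")
    return files[0]["file_id"]
```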
⚠️ Agent behavior — first message of a session: Before asking the user for a movie title or workflow path, proactively orient them about what the skill offers. Most users assume they need to upload their own video + SRT and don't realize a pre-built material library ships with the skill. Skipping this step often results in unnecessary uploads or aborted sessions.
Required opening (adapt to the conversation language):
Offer three ways to start:
1. User gives a title → check the built-in library first; fall back to `task search-movie` only if not found.
2. List built-in materials → `material list --json` and present 5–8 titles spanning varied genres; offer to filter by genre on request.
3. User brings their own video + SRT → guide them through `file upload`.

Example opening (originally in Chinese; English translation below — adapt to the conversation language):

Hi, welcome to AI 解说大师 (Narrator AI). This skill generates movie / short-drama commentary videos. Roughly 100 movies ship built in (video + subtitles ready to use), so in most cases you don't need to upload anything yourself.

How would you like to start?
- Just tell me a title — I'll check the built-in library first, and search externally only if it's not there
- Ask me to list some built-in materials — you can pick by genre (comedy / action / mystery / sci-fi…)
- Upload your own video + subtitles — I'll walk you through the upload flow
After source material is confirmed, walk the user through the decision sequence below — one question per turn, in order. Do NOT collapse multiple decisions into one message; users cannot reason about `target_mode` before they've picked a path.

Decision sequence (each step waits for explicit user confirmation):
1. Path — Fast or Standard, asked on its own (see "Workflow Paths" below).
2. `target_mode` — only ask if path = Fast. Choose mode 1 / 2 / 3 (see "Fast Path internal: target_mode" below). If path = Standard, skip this question entirely — Standard Path has no `target_mode`.

⚠️ Anti-pattern (do NOT do this): asking "① narration mode (纯解说 pure narration / 原声混剪 original mix) ② production path (Fast / Standard)" in the same message. 纯解说 and 原声混剪 are Fast Path internal modes (`target_mode` 1 vs 2). They do not exist in Standard Path. Asking them alongside the path choice forces the user to make decisions in the wrong order and conflates two layers of the decision tree.
Two end-to-end paths produce a finished narrated video. Choose with the user before starting.
| | Fast Path (原创文案 original script, recommended) | Standard Path (二创文案 adapted script) |
|---|---|---|
| Pipeline | material → fast-writing → fast-clip-data → video-composing → magic-video\* | material → popular-learning\*\* → generate-writing → clip-data → video-composing → magic-video\* |
| Cost / speed | Faster, cheaper | Higher quality narration |
| When to use | Default unless user wants adapted-style narration | When user wants narration learned from a reference style |

\* magic-video is optional; only on explicit user request.
\*\* popular-learning is skippable when using a pre-built template (recommended).
⚠️ Path is a standalone decision — ask the user "Fast or Standard?" by itself, in its own message. Do not auto-select. Do not bundle it with `target_mode` or any other follow-up question.

Fast Path internal: target_mode (ask only after path = Fast is confirmed). Skip this section entirely if the user picked Standard Path — `target_mode` only exists inside fast-writing.
| Mode | Use when | Required input |
|---|---|---|
| "1" 热门影视 (纯解说 pure narration) | Known movie, narration from plot only | confirmed_movie_json; no episodes_data |
| "2" 原声混剪 (Original Mix) | Known movie + you have its SRT | confirmed_movie_json + episodes_data[{srt_oss_key, num}] |
| "3" 冷门/新剧 (New Drama) | Obscure/new content | episodes_data[{srt_oss_key, num}]; confirmed_movie_json optional |
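The mode/input pairings in this table can be checked before submission. A hypothetical validator (the pairings are from the table above; the function itself is not part of the CLI):

```python
def validate_fast_writing_inputs(target_mode: str,
                                 confirmed_movie_json=None,
                                 episodes_data=None) -> None:
    """Raise ValueError unless the inputs match the target_mode table."""
    if target_mode == "1":
        # pure narration: movie JSON only, no episode SRTs
        if not confirmed_movie_json:
            raise ValueError("mode 1 requires confirmed_movie_json")
        if episodes_data:
            raise ValueError("mode 1 takes no episodes_data")
    elif target_mode == "2":
        # original mix: both required
        if not (confirmed_movie_json and episodes_data):
            raise ValueError("mode 2 requires confirmed_movie_json + episodes_data")
    elif target_mode == "3":
        # obscure/new content: episodes_data required, movie JSON optional
        if not episodes_data:
            raise ValueError("mode 3 requires episodes_data")
    else:
        raise ValueError(f"unknown target_mode {target_mode!r}")
```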
Before any task, gather these resources in this order, with explicit user confirmation at each step:
1. Source material — `material list` or via `file upload`
2. BGM — `bgm list`
3. Dubbing voice — `dubbing list`
4. Narration template — `task narration-styles`

Detailed list commands, response shapes, and field mappings live in references/resources.md.
⚠️ Universal rules — apply at every resource step:
- Pre-filter by context. Use the per-resource filter flag where supported: `bgm list --search`, `dubbing list --lang`, `task narration-styles --genre`. `material list` does NOT accept these flags — paginate the JSON and search programmatically with `grep -i` / `python3 -c`.
- Default presentation: 5–8 options with the resource ID and key descriptive fields.
- If the user has no preference: present 3 recommendations with a one-line reason for each. Still wait for confirmation.
- Confirm one resource at a time. Do not advance until the current one is confirmed.
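The paginate-to-exhaustion rule can be sketched in Python. The `material list` flags match Step 0 of the Fast Path; the response field names (`total`, `list`) are assumptions — verify them against references/resources.md:

```python
import json
import subprocess

def fetch_page(page: int, size: int = 100) -> dict:
    """Fetch one page of the material library (command shape assumed from the docs)."""
    out = subprocess.run(
        ["narrator-ai-cli", "material", "list", "--json",
         "--page", str(page), "--size", str(size)],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

def search_materials(keyword: str, fetch=fetch_page, size: int = 100) -> list:
    """Paginate until `total` is consumed, then filter in code (never by eyeballing
    truncated terminal output). `fetch` is injectable for testing."""
    items, page = [], 1
    while True:
        data = fetch(page, size)
        items.extend(data.get("list", []))  # "list" field name is an assumption
        if len(items) >= data.get("total", 0) or not data.get("list"):
            break
        page += 1
    kw = keyword.lower()
    return [m for m in items if kw in json.dumps(m, ensure_ascii=False).lower()]
```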
⚠️ Dubbing → writing `language` mismatch check: if the user pre-specified a `language` value that conflicts with the chosen voice, surface the mismatch and ask before proceeding. (The general language-chain rule lives in Agent Rules above.)

Detailed parameter tables, all `target_mode` cases, and full JSON examples live in references/workflows.md.
Step 0 — Find source material & determine target_mode:
- Run `narrator-ai-cli material list --json --page 1 --size 100`. Search programmatically with `grep -i` or `python3 -c` on the JSON output — do NOT rely on the terminal display (it may be truncated). Paginate (`--page 2`, etc.) until exhausted if `total` > 100.
- Found in the built-in library → pure narration (`target_mode=1`) or original mix (`target_mode=2`)? Construct `confirmed_movie_json` from material fields (mapping in references/resources.md).
- Not found → `task search-movie "<name>" --json` → `target_mode=1` (or `target_mode=2` if the user uploads an SRT). May take 60+ seconds (Gradio backend, results cached 24h).
- Obscure/new content → `target_mode=3` with the user's uploaded SRT; `confirmed_movie_json` optional.

Step 1 — fast-writing: pass `learning_model_id`, `target_mode`, `playlet_name`, `confirmed_movie_json` and/or `episodes_data`, and `model` (flash 5 pts/char or pro 15 pts/char). Save `task_id` from the creation response, then poll until top-level `.status=2` and save `.files[0].file_id` from the completed task.
Step 2 — fast-clip-data: pass task_id + file_id from Step 1, plus bgm, dubbing, dubbing_type, and episodes_data with video_oss_key / srt_oss_key / negative_oss_key. Poll until top-level .status=2; read top-level .task_order_num from the response.
Step 3 — video-composing: pass order_num: <.task_order_num from Step 2>, plus bgm, dubbing, dubbing_type (re-pass the same values from Step 2 — the API does not inherit them). All four are required; submitting only order_num returns 10001 查询解说工程任务结果失败 (failed to query the narration project task result). Poll → .results.tasks[0].video_url is the finished MP4.
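An illustrative Step 3 request body under these rules (the field names are from the step above; the placeholder values are assumptions — real IDs come from your confirmed resources and Step 2's polled task record):

```json
{
  "order_num": "fast_writing_clip_data_xxxxx",
  "bgm": "<bgm_id confirmed in Prerequisites>",
  "dubbing": "<voice_id confirmed in Prerequisites>",
  "dubbing_type": "<same value passed in Step 2>"
}
```

Submitting it as a file (`-d @compose.json`) avoids the shell-quoting issues noted in Important Notes.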
Step 4 (optional) — magic-video: only on explicit user request. See references/magic-video.md.
Detailed parameter tables and JSON examples live in references/workflows.md.
Step 0 — Source material: same material/upload flow as Fast Path. Use video_file_id as video_oss_key and negative_oss_key, and srt_file_id as srt_oss_key in episodes_data.
Step 1 — popular-learning (skip if using a pre-built template): pass video_srt_path, narrator_type, model_version. Poll until top-level .status=2, then parse .results.tasks[0].task_result JSON → agent_unique_code is the learning_model_id. Or use a pre-built template id from task narration-styles --json directly.
Step 2 — generate-writing: pass learning_model_id, playlet_name, playlet_num, episodes_data, plus three additional required fields — target_platform (e.g. "douyin"), vendor_requirements ("" if none), and target_character_name ("" if not applicable). Omitting any of these returns 10001 ... Field required. Full param table in references/workflows.md. Save task_id from the creation response.
Step 3 — clip-data: pass order_num (= top-level .task_order_num from Step 2's polled task record, e.g. generate_writing_xxxxx), plus bgm, dubbing, dubbing_type. ⚠️ Different from Fast Path's fast-clip-data, which takes task_id — clip-data takes order_num instead. Poll until top-level .status=2 (required prerequisite for Step 4) — but do not use clip-data's own task_order_num for video-composing; Step 4 keys off generate-writing's instead.
Step 4 — video-composing: pass order_num + bgm + dubbing + dubbing_type (all four required — re-pass the BGM/voice values from Step 3; the API does not inherit them, and submitting only order_num returns 10001 查询解说工程任务结果失败, i.e. failed to query the narration project task result). ⚠️ Standard Path keys off generate-writing's task_order_num (generate_writing_xxxxx), NOT clip-data's. clip-data must reach top-level .status=2 first as a prerequisite, but its own task_order_num (generate_clip_data_xxxxx) returns 10001 任务关联记录信息缺失 (task association record missing) when submitted. This is the opposite of Fast Path (where fast-clip-data is the right anchor) — see Important Notes #4. Poll → .results.tasks[0].video_url is the finished MP4.
Step 5 (optional) — magic-video: only on explicit user request. See references/magic-video.md.
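The path asymmetry between the two pipelines can be captured in a small guard. A hypothetical helper (the record shapes and prefixes are taken from the steps above; the function is not part of the CLI):

```python
def composing_order_num(path: str, fast_clip_data=None, generate_writing=None) -> str:
    """Pick which upstream task_order_num video-composing keys off.

    Fast Path anchors on fast-clip-data's record; Standard Path anchors on
    generate-writing's record, never clip-data's.
    """
    if path == "fast":
        rec, expected = fast_clip_data, "fast_writing_clip_data_"
    elif path == "standard":
        rec, expected = generate_writing, "generate_writing_"  # NOT clip-data's
    else:
        raise ValueError(f"unknown path {path!r}")
    num = (rec or {}).get("task_order_num", "")
    if not num.startswith(expected):
        raise ValueError(f"expected a {expected}xxxxx order number, got {num!r}")
    return num
```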
```bash
# Voice clone — input audio_file_id, returns voice_id
narrator-ai-cli task create voice-clone --json -d '{"audio_file_id": "<file_id>"}'

# Text to speech — input voice_id + audio_text
narrator-ai-cli task create tts --json -d '{"voice_id": "<voice_id>", "audio_text": "Text to speak"}'
```

Both accept optional `clone_model` (default: pro).
1. `confirmed_movie_json` is required for target_mode 1 and 2, optional for 3. Construct it from material fields when the title is found in the pre-built materials; use search-movie otherwise.
2. `file_id` always comes from `file list` or `material list`. Never guess.
3. `search-movie` may take 60+ seconds (Gradio backend, results cached 24h).
4. `video-composing`'s `order_num` is path-asymmetric — which upstream task's `task_order_num` to use differs by path (the field-name rule — use `task_order_num`, not the hex — is in Agent Rules above):
   - Fast Path: `fast-clip-data`'s `task_order_num` (format: `fast_writing_clip_data_xxxxx`).
   - Standard Path: `generate-writing`'s `task_order_num` (format: `generate_writing_xxxxx`). The clip-data step's own `task_order_num` (`generate_clip_data_xxxxx`) returns `10001 任务关联记录信息缺失` (task association record missing). clip-data must still complete first as a prerequisite — but its order is not what video-composing keys off.
5. Pre-built narration templates (90+) can replace `popular-learning`. List them with `task narration-styles --json`; preview at the resources URL above.
6. Use `-d @file.json` for large request bodies to avoid shell quoting issues.
7. Run `task verify` before expensive tasks to catch missing/invalid materials early; `task budget` to estimate point cost.
8. All requests go to https://openapi.jieshuo.cn. No third-party services.
9. `NARRATOR_APP_KEY` is stored at `~/.narrator-ai/config.yaml`. Keep it private; do not commit it.

📧 Need an API key? Email merlinyang@gridltd.com.
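The 10001 variants mentioned in this skill map cleanly to recovery actions. A sketch of that mapping (the message substrings are the ones documented in this file; the hint table is illustrative, not an official error catalog — the full 18-code list lives in references/operations.md):

```python
# Recovery hints keyed by the 10001 message substrings documented in this skill.
RECOVERY = {
    "任务关联记录数据异常": "a 32-char hex .task_id was submitted as order_num; resubmit with .task_order_num",
    "任务关联记录信息缺失": "the wrong upstream task_order_num was submitted; Standard Path keys off generate-writing's",
    "查询解说工程任务结果失败": "video-composing got only order_num; re-pass bgm, dubbing, dubbing_type",
    "Field required": "a required field is missing; check target_platform / vendor_requirements / target_character_name",
}

def recovery_hint(error_message: str) -> str:
    """Return a recovery hint for a 10001-class error message."""
    for needle, hint in RECOVERY.items():
        if needle in error_message:
            return hint
    return "unrecognized error; see references/operations.md error codes"
```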
Choose the method for your agent platform:
OpenClaw:
```bash
mkdir -p ~/.openclaw/skills/narrator-ai-cli
cp SKILL.md ~/.openclaw/skills/narrator-ai-cli/SKILL.md
```
WorkBuddy / QClaw (Tencent):
Upload SKILL.md through the skill management UI.
Windsurf:
```bash
cp SKILL.md /path/to/your/project/.skills/narrator-ai-cli/SKILL.md
```
Claude Code / Cursor:
```bash
cp SKILL.md /path/to/your/project/.skills/narrator-ai-cli/SKILL.md
```
Any markdown-reading agent:
```bash
cp SKILL.md /path/to/agent/skills/narrator-ai-cli/SKILL.md
```
💡 Tip: You can also just give the agent this repo URL — most agents can read the GitHub repo structure and auto-configure.
Once installed, use natural language:
| Platform | Setup | Status |
|----------|-------|--------|
| OpenClaw | Native skill loading | ✅ Verified |
| Windsurf | .skills directory | ✅ Verified |
| WorkBuddy (Tencent) | Upload SKILL.md | ✅ Verified |
| QClaw (Tencent) | Upload SKILL.md | ✅ Verified |
| Youdao Lobster | Skill loading | ✅ Verified |
| Yuanqi AI | Skill loading | ✅ Verified |
| Claude Code | SKILL.md in project root | ✅ Verified |
| Cursor | rules/skills directory | ✅ Verified |
| Any markdown-skill agent | Point to SKILL.md | ✅ Compatible |
| Feature | Details |
|---------|---------|
| Two workflow paths | Adapted Narration and Original Narration |
| Three creation modes | Hot Drama / Original Mix / New Drama |
| Built-in resources | 93 movies, 146 BGM tracks, 63 dubbing voices, 90+ narration templates |
| Full pipeline | Script → Clip data → Video composing → Visual template |
| Standalone tasks | Voice cloning, text-to-speech |
| Data flow mapping | Which output feeds into which input |
| Error handling | All 18 API error codes with recommended actions |
| Cost estimation | Budget verification before task creation |
| Section | Description |
|---------|-------------|
| Frontmatter | Skill metadata (name, description, requirements) |
| Architecture | CLI source structure and design choices |
| Core Concepts | Key terms: file_id, task_id, order_num, etc. |
| Workflow Paths | Two complete pipelines with step-by-step commands |
| Prerequisites | How to select resources (materials, BGM, dubbing, templates) |
| Fast Path | Recommended workflow: search → write → clip → compose → magic |
| Standard Path | Full workflow: learn → write → clip → compose → magic |
| Standalone Tasks | Voice clone and TTS |
| Task Management | Query, list, budget, verify, save |
| File Operations | Upload, download, list, delete |
| Error Handling | All 18 API error codes with actions |
| Data Flow | ASCII diagram of complete pipeline |
| Important Notes | 9 critical gotchas and best practices |
Need an API key or help?

MIT