// Video to Text CLI

Automate transcription from the terminal.

Install a video to text CLI starter for automation. Upload media, create transcription jobs, poll status, and download transcripts, subtitles, summaries, and JSON.

Click or drag to upload
Upload an audio or video file · ≤ 50MB
// runs in your agent, ships to your stack

Claude Code · Cursor · Gemini · MCP · CLI · REST API · n8n · Zapier · Make · GitHub Actions · Notion · HubSpot
// What you get

One upload. Every file the next step needs.

The same reliable Vocce pipeline, focused on this job. Free 3-minute preview, then pay only when the export matters.

vocce create
Job polling
Batch jobs
Transcript + subtitles
Agent JSON
// How it works

How to transcribe video from the terminal

01 Upload or paste a URL
Any format, any length. Vocce normalizes it.

02 One reliable call
Clean, compress, transcribe, diarize, summarize.

03 Export the pack
Transcript, subtitles, summary, and agent JSON.
// who uses video to text cli

Built for real workflows.

CI pipelines

Transcribe release demos and PR videos automatically with a GitHub Action or one CLI line.

Batch jobs

Loop a folder of recordings through `vocce create` and collect transcripts, subtitles, and JSON.
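A folder loop like the one described above could be sketched as follows. This is a minimal illustration, not Vocce's own tooling: the only CLI details taken from this page are `vocce create` and the `--ops` list, while `build_commands`, `run_batch`, and the `MEDIA_SUFFIXES` set are hypothetical helpers introduced for the example.

```python
import subprocess
from pathlib import Path

# Assumption: an illustrative set of extensions, not a list of formats Vocce documents.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav"}


def build_commands(folder, ops=("transcribe", "subtitles", "summary")):
    """Return one `npx vocce create` argument list per media file in `folder`."""
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in MEDIA_SUFFIXES
    )
    return [
        ["npx", "vocce", "create", str(p), "--ops", ",".join(ops)]
        for p in files
    ]


def run_batch(folder):
    """Run each command in turn and return {file path: exit code}."""
    results = {}
    for argv in build_commands(folder):
        # check=False so one failing file does not abort the rest of the batch.
        results[argv[3]] = subprocess.run(argv, check=False).returncode
    return results
```

Keeping command construction separate from execution makes it easy to print the commands first as a dry run before calling `run_batch("./recordings")`.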

Scripted workflows

Cron a watch folder, poll job status, and deliver results by webhook — no dashboard clicking.
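One way to cron a watch folder is to have each run submit only the files it has not seen before. The sketch below assumes a small JSON state file kept outside the watch folder, and it assumes that a zero exit code from the CLI means the job was created; `pending_files`, `mark_done`, `run_once`, and `submit_with_cli` are hypothetical names, not part of Vocce.

```python
import json
import subprocess
from pathlib import Path


def _load_seen(state_file):
    """Read the set of already-submitted file names (empty if no state yet)."""
    path = Path(state_file)
    return set(json.loads(path.read_text())) if path.exists() else set()


def pending_files(watch_dir, state_file):
    """Return files in `watch_dir` that are not recorded in `state_file`."""
    seen = _load_seen(state_file)
    return sorted(
        p for p in Path(watch_dir).iterdir()
        if p.is_file() and p.name not in seen
    )


def mark_done(path, state_file):
    """Record a file name so later cron runs skip it."""
    seen = _load_seen(state_file)
    seen.add(Path(path).name)
    Path(state_file).write_text(json.dumps(sorted(seen)))


def submit_with_cli(path):
    """Submit one file; assumes exit code 0 means the job was created."""
    argv = ["npx", "vocce", "create", str(path),
            "--ops", "transcribe,subtitles,summary"]
    return subprocess.run(argv, check=False).returncode == 0


def run_once(watch_dir, state_file, submit=submit_with_cli):
    """One cron tick: submit each new file, record it only on success."""
    done = []
    for path in pending_files(watch_dir, state_file):
        if submit(path):
            mark_done(path, state_file)
            done.append(path.name)
    return done
```

A crontab entry would then call a small script that runs `run_once` on a schedule. Files whose submission fails stay pending and are picked up again on the next tick.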

// faq

Video to Text CLI, answered.

How do I transcribe a video from the command line?

Run `npx vocce create ./video.mp4 --ops transcribe,subtitles,summary`. The CLI uploads the file, creates the job, and can poll until your exports are ready.

Can I batch-process many files?

Yes. Jobs are idempotent: the same input returns the same job_id, so you can safely loop over folders and retry without double-billing.
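Because a repeated submission of the same input maps to the same job_id, a retry wrapper does not need to track whether an earlier attempt partly succeeded. A minimal sketch with exponential backoff, assuming a hypothetical `submit` callable that returns a job id or raises on failure (how the id is actually obtained from the CLI or API is not specified here):

```python
import time


def submit_with_retry(submit, path, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `submit(path)` until it returns a job id or attempts run out.

    Retrying blindly is only safe because submissions are idempotent:
    a duplicate call resolves to the same job rather than a new one.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return submit(path)
        except Exception as exc:  # sketch: real code would catch narrower errors
            last_error = exc
            if attempt < attempts - 1:
                # Wait 1x, 2x, 4x ... the base delay between attempts.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up on {path} after {attempts} attempts") from last_error
```

The `sleep` parameter is injectable so the backoff can be tested or replaced without real waiting.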

How do I get results into my system?

Poll the job, download artifacts directly, or register a webhook and let results push to your queue or automation.
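The polling option can be sketched as a loop with an interval and a timeout. Everything specific here is an assumption: `get_status` is a hypothetical callable wrapping whatever status lookup you use, and the status names `"done"` and `"failed"` are placeholders rather than Vocce's documented values.

```python
import time

# Hypothetical terminal status names; substitute the real ones.
TERMINAL_STATES = {"done", "failed"}


def wait_for_job(get_status, job_id, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status(job_id)` until a terminal state or the timeout."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still '{status}' after {timeout}s")
        sleep(interval)
```

Polling suits short scripts and CI steps. For long recordings or high volume, the webhook route avoids holding a process open while the job runs.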

Is the output machine-readable?

Every job can emit agent.v1 JSON with speaker turns, timestamps, and artifact links — the same schema across CLI, API, and MCP.
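Consuming that JSON in a script might look like the sketch below. The field names (`turns`, `speaker`, `start`, `end`, `text`) are assumptions chosen for illustration; this page only states that agent.v1 carries speaker turns, timestamps, and artifact links, so check the real schema before relying on any key.

```python
import json
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    start: float  # seconds from the start of the media (assumed unit)
    end: float
    text: str


def parse_turns(raw):
    """Parse speaker turns from a JSON string using the assumed field names."""
    doc = json.loads(raw)
    return [
        Turn(t["speaker"], float(t["start"]), float(t["end"]), t["text"])
        for t in doc.get("turns", [])
    ]


def format_turn(turn):
    """Render one turn as '[MM:SS] speaker: text'."""
    minutes, seconds = divmod(int(turn.start), 60)
    return f"[{minutes:02d}:{seconds:02d}] {turn.speaker}: {turn.text}"
```

Since the schema is described as shared across CLI, API, and MCP, a parser like this would only need to be written once per consumer.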