Files
workspace/.claude/skills/remotion-best-practices/rules/transcribe-captions.md

71 lines
1.9 KiB
Markdown

---
name: transcribe-captions
description: Transcribing audio to generate captions in Remotion
metadata:
tags: captions, transcribe, whisper, audio, speech-to-text
---
# Transcribing audio
To transcribe audio to generate captions in Remotion, you can use the [`transcribe()`](https://www.remotion.dev/docs/install-whisper-cpp/transcribe) function from the [`@remotion/install-whisper-cpp`](https://www.remotion.dev/docs/install-whisper-cpp) package.
## Prerequisites
First, the @remotion/install-whisper-cpp package needs to be installed.
If it is not installed, use the following command:
```bash
npx remotion add @remotion/install-whisper-cpp
```
## Transcribing
Make a Node.js script to download Whisper.cpp and a model, and transcribe the audio.
```ts
import path from "path";
import {
downloadWhisperModel,
installWhisperCpp,
transcribe,
toCaptions,
} from "@remotion/install-whisper-cpp";
import fs from "fs";
const to = path.join(process.cwd(), "whisper.cpp");
await installWhisperCpp({
to,
version: "1.5.5",
});
await downloadWhisperModel({
model: "medium.en",
folder: to,
});
// Convert the audio to a 16KHz wav file first if needed:
// import {execSync} from 'child_process';
// execSync('ffmpeg -i /path/to/audio.mp4 -ar 16000 /path/to/audio.wav -y');
const whisperCppOutput = await transcribe({
model: "medium.en",
whisperPath: to,
whisperCppVersion: "1.5.5",
inputPath: "/path/to/audio123.wav",
tokenLevelTimestamps: true,
});
// Optional: Apply our recommended postprocessing
const { captions } = toCaptions({
whisperCppOutput,
});
// Write it to the public/ folder so it can be fetched from Remotion
fs.writeFileSync("captions123.json", JSON.stringify(captions, null, 2));
```
Transcribe each clip individually and create multiple JSON files.
See [Displaying captions](display-captions.md) for how to display the captions in Remotion.