71 lines
1.9 KiB
Markdown
71 lines
1.9 KiB
Markdown
---
|
|
name: transcribe-captions
|
|
description: Transcribing audio to generate captions in Remotion
|
|
metadata:
|
|
tags: captions, transcribe, whisper, audio, speech-to-text
|
|
---
|
|
|
|
# Transcribing audio
|
|
|
|
To transcribe audio to generate captions in Remotion, you can use the [`transcribe()`](https://www.remotion.dev/docs/install-whisper-cpp/transcribe) function from the [`@remotion/install-whisper-cpp`](https://www.remotion.dev/docs/install-whisper-cpp) package.
|
|
|
|
## Prerequisites
|
|
|
|
First, the @remotion/install-whisper-cpp package needs to be installed.
|
|
If it is not installed, use the following command:
|
|
|
|
```bash
|
|
npx remotion add @remotion/install-whisper-cpp
|
|
```
|
|
|
|
## Transcribing
|
|
|
|
Make a Node.js script to download Whisper.cpp and a model, and transcribe the audio.
|
|
|
|
```ts
|
|
import path from "path";
|
|
import {
|
|
downloadWhisperModel,
|
|
installWhisperCpp,
|
|
transcribe,
|
|
toCaptions,
|
|
} from "@remotion/install-whisper-cpp";
|
|
import fs from "fs";
|
|
|
|
const to = path.join(process.cwd(), "whisper.cpp");
|
|
|
|
await installWhisperCpp({
|
|
to,
|
|
version: "1.5.5",
|
|
});
|
|
|
|
await downloadWhisperModel({
|
|
model: "medium.en",
|
|
folder: to,
|
|
});
|
|
|
|
// Convert the audio to a 16KHz wav file first if needed:
|
|
// import {execSync} from 'child_process';
|
|
// execSync('ffmpeg -i /path/to/audio.mp4 -ar 16000 /path/to/audio.wav -y');
|
|
|
|
const whisperCppOutput = await transcribe({
|
|
model: "medium.en",
|
|
whisperPath: to,
|
|
whisperCppVersion: "1.5.5",
|
|
inputPath: "/path/to/audio123.wav",
|
|
tokenLevelTimestamps: true,
|
|
});
|
|
|
|
// Optional: Apply our recommended postprocessing
|
|
const { captions } = toCaptions({
|
|
whisperCppOutput,
|
|
});
|
|
|
|
// Write it to the public/ folder so it can be fetched from Remotion
|
|
fs.writeFileSync("captions123.json", JSON.stringify(captions, null, 2));
|
|
```
|
|
|
|
Transcribe each clip individually and create multiple JSON files.
|
|
|
|
See [Displaying captions](display-captions.md) for how to display the captions in Remotion.
|