72 lines
2.5 KiB
Markdown
72 lines
2.5 KiB
Markdown
---
|
|
name: silence-detection
|
|
description: Adaptive silence detection for video/audio files using FFmpeg loudnorm and silencedetect
|
|
metadata:
|
|
tags: silence, detection, trimming, ffmpeg, loudnorm, audio
|
|
---
|
|
|
|
# Adaptive Silence Detection
|
|
|
|
Detect silent segments in video or audio files.
|
|
|
|
Requires FFmpeg — see [ffmpeg.md](./ffmpeg.md) for how to invoke it in Remotion projects.
|
|
|
|
## Step 1: Measure loudness with `loudnorm`
|
|
|
|
Use the `loudnorm` filter in JSON mode to get the EBU R128 integrated loudness and gating threshold for each file:
|
|
|
|
```bash
|
|
npx remotion ffmpeg -i public/video.mov -map 0:a -af loudnorm=print_format=json -f null /dev/null
|
|
```
|
|
|
|
As output you will get:
|
|
- `input_i`: Integrated loudness (dB) — the overall perceived volume
|
|
- `input_thresh`: EBU R128 gating threshold (dB) — the level below which audio is considered too quiet to count toward loudness measurement
|
|
|
|
## Step 2: Detect silences using adaptive threshold
|
|
|
|
Pass the `input_thresh` value from step 1 as the `noise` parameter to `silencedetect`:
|
|
|
|
```bash
|
|
npx remotion ffmpeg -i public/video.mov -map 0:a -af "silencedetect=noise=${THRESH}dB:d=0.5" -f null /dev/null
|
|
```
|
|
|
|
Parameters:
|
|
- `noise`: The threshold below which audio is considered silent. Use `input_thresh` from step 1.
|
|
- `d`: Minimum silence duration in seconds. `0.5` is a good default.
|
|
|
|
## Interpreting the output
|
|
|
|
The filter outputs pairs of `silence_start` and `silence_end` timestamps:
|
|
|
|
```
|
|
[silencedetect] silence_start: 0
|
|
[silencedetect] silence_end: 2.241021 | silence_duration: 2.241021
|
|
[silencedetect] silence_start: 38.77425
|
|
[silencedetect] silence_end: 39.619604 | silence_duration: 0.845354
|
|
```
|
|
|
|
## Identifying leading and trailing silence
|
|
|
|
- **Leading silence**: Consecutive silence segments starting at or near 0. If the first `silence_start` is > 0.5s, there is no leading silence.
|
|
- **Trailing silence**: The last silence segment that extends to (or near) the end of the file. Compare the last `silence_end` with the file's total duration.
|
|
|
|
When multiple silences are nearly contiguous at the start or end (gap < 0.2s), treat them as a single leading/trailing silence block.
|
|
|
|
## Using with Remotion's `<Video>` component
|
|
|
|
Apply the detected trim points using `trimBefore` and `trimAfter` (values are in frames):
|
|
|
|
```tsx
|
|
import { Video } from "@remotion/media";
|
|
import { staticFile, useVideoConfig } from "remotion";
|
|
|
|
const { fps } = useVideoConfig();
|
|
|
|
<Video
|
|
src={staticFile("video.mov")}
|
|
trimBefore={Math.floor(leadingEnd * fps)}
|
|
trimAfter={Math.ceil(trailingStart * fps)}
|
|
/>
|
|
```
|