High Fidelity Neural Audio Compression - Samples

We present samples for EnCodec, our proposed neural audio codec. Read our paper for more details. All bandwidths are reported without entropy coding.
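As a rough illustration of what a bandwidth "without entropy coding" means for a codec built on residual vector quantization, the nominal bitrate is the frame rate times the number of codebooks times the bits needed to index one codebook entry. The sketch below assumes values that are not stated on this page (75 latent frames per second for the 24 kHz model and 1024-entry codebooks); the function name `nominal_kbps` is hypothetical and used only for illustration. Please check the paper for the actual configuration.

```python
import math


def nominal_kbps(frame_rate_hz: float, n_codebooks: int, codebook_size: int) -> float:
    """Nominal bitrate in kbps when every code index is stored at fixed width.

    No entropy coding is applied: each index costs log2(codebook_size) bits.
    """
    bits_per_frame = n_codebooks * math.log2(codebook_size)
    return frame_rate_hz * bits_per_frame / 1000.0


if __name__ == "__main__":
    # Assumed values (not from this page): 75 frames/s, 1024 entries per codebook.
    for n_codebooks in (2, 4, 8, 16):
        print(n_codebooks, "codebooks:", nominal_kbps(75, n_codebooks, 1024), "kbps")
```

Under these assumptions, 2, 4, 8, and 16 codebooks correspond to the 1.5, 3, 6, and 12 kbps settings listed for the 24 kHz samples.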

We first provide stereo music samples at 48 kHz for Opus at 24 kbps, MP3 at 64 kbps, Lyra-v2 at 6 and 12 kbps (mono), and EnCodec at 3, 6, and 12 kbps. The samples are taken from an internal proprietary dataset of music.

We then provide samples for mono audio at 24 kHz. Samples were taken from Common Voice and FSD50K, as well as a proprietary dataset of music. We compare to EVS, Opus, Lyra-v2, and our reimplementation of SoundStream with some improvements. We provide samples for EnCodec at 1.5, 3, 6, and 12 kbps. We refer the reader to Table A.1 of our paper for the licenses of the datasets used.
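To put the listed bitrates in perspective, the sketch below computes the compression ratio relative to uncompressed PCM. It assumes 16-bit samples for the uncompressed reference, which is an assumption on our side and not something stated on this page; the helper names `pcm_kbps` and `compression_ratio` are hypothetical.

```python
def pcm_kbps(sample_rate_hz: int, channels: int, bit_depth: int = 16) -> float:
    """Bitrate of uncompressed PCM audio in kbps (bit depth is an assumption)."""
    return sample_rate_hz * channels * bit_depth / 1000.0


def compression_ratio(sample_rate_hz: int, channels: int, codec_kbps: float,
                      bit_depth: int = 16) -> float:
    """How many times smaller the coded stream is than the raw PCM stream."""
    return pcm_kbps(sample_rate_hz, channels, bit_depth) / codec_kbps


if __name__ == "__main__":
    # 24 kHz mono settings listed on this page.
    for kbps in (1.5, 3, 6, 12):
        print(f"24 kHz mono at {kbps} kbps: {compression_ratio(24_000, 1, kbps):.0f}x")
    # 48 kHz stereo settings listed on this page.
    for kbps in (3, 6, 12):
        print(f"48 kHz stereo at {kbps} kbps: {compression_ratio(48_000, 2, kbps):.0f}x")
```

With a 16-bit reference, 24 kHz mono PCM is 384 kbps, so a 6 kbps stream is 64 times smaller; 48 kHz stereo PCM is 1536 kbps, so a 6 kbps stream is 256 times smaller.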


48 kHz stereophonic

Because Lyra-v2 does not support stereo input, we provide mono samples for it.

Music 1

Ground Truth
MP3 (64 kbps)
Opus (24 kbps)
Lyra-v2 (Mono, 12 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)

Music 2

Ground Truth
MP3 (64 kbps)
Opus (24 kbps)
Lyra-v2 (Mono, 12 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)

24 kHz monophonic

Clean Speech (Common Voice)

Ground Truth
Opus (6 kbps)
EVS (6 kbps)
SoundStream (6 kbps, reimplem.)
Lyra-v2 (3 kbps)
Lyra-v2 (6 kbps)
EnCodec (1.5 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)

Speech + Background (Common Voice + FSD50K)

Ground Truth
Opus (6 kbps)
EVS (6 kbps)
SoundStream (6 kbps, reimplem.)
Lyra-v2 (3 kbps)
Lyra-v2 (6 kbps)
EnCodec (1.5 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)

Speech + Background 2 (Common Voice + FSD50K)

Ground Truth
Opus (6 kbps)
EVS (6 kbps)
SoundStream (6 kbps, reimplem.)
Lyra-v2 (3 kbps)
Lyra-v2 (6 kbps)
EnCodec (1.5 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)

Music (proprietary dataset)

Ground Truth
Opus (6 kbps)
EVS (6 kbps)
SoundStream (6 kbps, reimplem.)
Lyra-v2 (3 kbps)
Lyra-v2 (6 kbps)
EnCodec (1.5 kbps)
EnCodec (3 kbps)
EnCodec (6 kbps)
EnCodec (12 kbps)