From Discrete Tokens to High-Fidelity Audio with Multi-Band Diffusion

Compression Samples

Here are some samples from our model for you to listen to:

  • Ground Truth: 24 kHz
  • MBD: Multi-Band Diffusion
  • Encodec: GitHub
  • Opus 6 kbps



Music

Columns: Ground Truth, MBD, Encodec, Opus
Rows (bit rate): 6 kbps, 3 kbps, 1.5 kbps




Speech

Speech samples from the Expresso dataset.

Columns: Ground Truth, Bit Rate (kbps), MBD, Encodec
Rows: four samples, each at 1.5, 3.0, and 6.0 kbps


Text to Audio Samples







Text-to-music samples from GitHub, using the medium-sized model.

MusicGen

Columns: Prompt, Encodec, Multi-Band Diffusion
Bluesy guitar instrumental with soulful licks and a driving rhythm section
80s pop track with bassy drums and synth
90s rock song with loud guitars and heavy drums.
African tribal groove with polyrhythmic drums and melodic kalimba
Classical orchestral piece with soaring strings and delicate piano
Jazz Funk with slap bass and saxophone.