Here are some samples from our model for you to listen to:
| Bit Rate | Ground Truth | MBD | Encodec | Opus |
|---|---|---|---|---|
| 6 kbps | | | | |
| 3 kbps | | | | |
| 1.5 kbps | | | | |
Speech Samples from the Expresso Dataset.
| Ground Truth | Bit Rate (kbps) | MBD | Encodec |
|---|---|---|---|
| | 1.5 | | |
| | 3.0 | | |
| | 6.0 | | |
| | 1.5 | | |
| | 3.0 | | |
| | 6.0 | | |
| | 1.5 | | |
| | 3.0 | | |
| | 6.0 | | |
| | 1.5 | | |
| | 3.0 | | |
| | 6.0 | | |
Text-to-music samples from GitHub using the medium-sized model.
| Prompt | Encodec | Multi Band Diffusion |
|---|---|---|
| Bluesy guitar instrumental with soulful licks and a driving rhythm section | ||
| 80s pop track with bassy drums and synth | ||
| 90s rock song with loud guitars and heavy drums. | ||
| African tribal groove with polyrhythmic drums and melodic kalimba | ||
| Classical orchestral piece with soaring strings and delicate piano | ||
| Jazz Funk with slap bass and saxophone. |