Codec-SUPERB sets a new standard for evaluating sound codec models. We provide a rigorous and transparent framework for assessing speech quality and information preservation across various downstream tasks. Our goal is to foster innovation and facilitate community collaboration in the field of neural audio coding.
- Out-of-the-Box Interface: Intuitive API for easy integration and rapid experimentation with diverse codec models.
- Multi-Perspective Leaderboard: Comprehensive assessment across multiple speech processing dimensions, with rankings for competitive transparency.
- Standardized Environment: Ensures fair and consistent comparisons by using uniform testing conditions for all models.
- Unified Datasets: Curated collection of datasets covering a wide range of real-world speech processing scenarios.
- Batch Processing Support: Highly optimized batch encoding/decoding for significant performance speedups.
```bash
# Clone the repository
git clone https://github.com/voidful/Codec-SUPERB.git
cd Codec-SUPERB

# Install dependencies
pip install -r requirements.txt
```

```python
from SoundCodec import codec

# List all available codecs
print(codec.list_codec())

# Load a specific codec
model = codec.load_codec('encodec_24k_6bps')
```
```python
import torchaudio

# Load audio
waveform, sample_rate = torchaudio.load('sample_audio.wav')
data_item = {'audio': {'array': waveform.numpy()[-1], 'sampling_rate': sample_rate}}

# Extract discrete units
sound_unit = model.extract_unit(data_item).unit

# Reconstruct audio
reconstructed = model.synth(data_item, local_save=False)['audio']['array']
```

Codec-SUPERB supports efficient batch operations, typically providing a 3-5x performance improvement on GPU.
```python
# Prepare multiple samples
data_list = [
    {'audio': {'array': wave1, 'sampling_rate': 16000}},
    {'audio': {'array': wave2, 'sampling_rate': 16000}}
]

# Option 1: Batch extraction and decoding (Recommended)
batch_extracted = model.batch_extract_unit(data_list)
batch_decoded = model.batch_decode_unit(batch_extracted)

# Option 2: Complete batch synthesis pipeline
results = model.batch_synth(data_list, local_save=False)
```

Tip: Grouping samples by similar lengths can further optimize batch processing efficiency (see the sketch below).
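As a minimal illustration of that tip, the sketch below sorts samples by waveform length before batching. `sort_by_length` is a hypothetical helper on the caller's side, not part of the Codec-SUPERB API, and the random arrays merely stand in for real audio; `model` is the codec loaded earlier.

```python
import numpy as np

# Hypothetical helper: sort samples by waveform length so each batch
# groups similarly sized inputs and padding overhead stays small.
def sort_by_length(data_list):
    return sorted(data_list, key=lambda item: len(item['audio']['array']))

# Toy samples of different lengths (random noise stands in for real audio).
data_list = [
    {'audio': {'array': np.random.randn(n).astype(np.float32), 'sampling_rate': 16000}}
    for n in (48000, 32000, 16000)
]

# Sort before the batch call so similarly sized samples end up together.
results = model.batch_synth(sort_by_length(data_list), local_save=False)
```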
Follow these steps to evaluate your codec and contribute to the Official Leaderboard.
Use the `voidful/codec-superb-tiny` dataset:

```bash
PYTHONPATH=. python3 scripts/dataset_creator.py --dataset voidful/codec-superb-tiny
```

Compute standard metrics (MEL, PESQ, STOI, F0Corr):
```bash
# Benchmark all codecs
PYTHONPATH=. python3 scripts/benchmarking.py --dataset datasets/voidful/codec-superb-tiny_synth

# Benchmark only specific codec(s)
PYTHONPATH=. python3 scripts/benchmarking.py \
    --dataset datasets/voidful/codec-superb-tiny_synth \
    --models llmcodec
```

- Locate the generated JSON file: `datasets_voidful_codec-superb-tiny_synth_evaluation_results_*.json`.
- Open a new Issue in this repository titled `New Benchmark Result: [Codec Name]`.
- Attach the JSON file or paste its content.
Certain codecs (e.g., s3tokenizer) focus on tokenization and do not support reconstruction. Codec-SUPERB handles these automatically:
- Benchmarking: such codecs are automatically skipped during reconstruction evaluation.
- API: calling `decode_unit` raises a `NotImplementedError` with a clear message (see the sketch after this list).
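A rough sketch of how a caller might guard against tokenization-only codecs; it assumes `decode_unit` accepts the object returned by `extract_unit` (mirroring how `batch_decode_unit` consumes `batch_extract_unit` above) and reuses the `data_item` built in the single-sample example.

```python
# Assumption: decode_unit takes the output of extract_unit,
# mirroring how batch_decode_unit consumes batch_extract_unit above.
extracted = model.extract_unit(data_item)

try:
    decoded = model.decode_unit(extracted)
except NotImplementedError as err:
    # Tokenization-only codecs (e.g., s3tokenizer) cannot reconstruct audio;
    # skip reconstruction-based evaluation for them.
    print(f"Reconstruction not supported by this codec: {err}")
```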
```bash
# Run all tests
python -m pytest SoundCodec/test/

# Verify all codecs (initialization & synthesis)
PYTHONPATH=. python3 scripts/check_all_codecs.py
```

If you use Codec-SUPERB in your research, please cite:
```bibtex
@inproceedings{wu-etal-2024-codec,
    title = "Codec-{SUPERB}: An In-Depth Analysis of Sound Codec Models",
    author = "Wu, Haibin and Chung, Ho-Lam and Lin, Yi-Cheng and Wu, Yuan-Kuei and Chen, Xuanjun and Pai, Yu-Chi and Wang, Hsiu-Hsuan and Chang, Kai-Wei and Liu, Alexander and Lee, Hung-yi",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
    year = "2024",
    url = "https://aclanthology.org/2024.findings-acl.616",
    doi = "10.18653/v1/2024.findings-acl.616",
    pages = "10330--10348",
}
```

Contributions are highly encouraged! See CONTRIBUTING.md for details. This project is licensed under the MIT License.
