MRHF-Codec (emcodec)
Research
Novel neural audio codec that achieves a 40+ dB improvement over Meta's DAC on high-frequency reconstruction. MS thesis research.

Spectrogram Comparison

Left: original signal. Right: emcodec reconstruction. High-frequency content above 6kHz is faithfully preserved, where competing codecs produce noise.
Overview
MRHF-Codec (Multi-Resolution High-Frequency Preserving Neural Audio Codec) is a dual-path encoder-decoder architecture that solves a critical failure in existing neural audio codecs: catastrophic high-frequency destruction. Meta's DAC produces **negative SI-SDR on frequencies above 6kHz**, meaning the reconstruction error in that band carries more energy than the recovered signal. MRHF-Codec achieves **positive SI-SDR (+11.6dB)** on the same content, a 40+ dB improvement, while using 11.8% less bitrate.
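To make the high-frequency SI-SDR figure concrete, here is a minimal sketch of how such a number could be computed. It is not the project's evaluation code: the function names, the brick-wall FFT high-pass, and the 44.1 kHz sample rate in the demo are assumptions chosen for illustration. Only the 6 kHz cutoff and the SI-SDR metric come from the text above.

```python
import numpy as np

def highpass_fft(x, sample_rate, cutoff_hz):
    """Zero every FFT bin below cutoff_hz (an idealized brick-wall high-pass)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to find the optimally scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

def hf_si_sdr(reference, estimate, sample_rate, cutoff_hz=6000.0):
    """SI-SDR restricted to content at or above cutoff_hz."""
    return si_sdr(
        highpass_fft(reference, sample_rate, cutoff_hz),
        highpass_fft(estimate, sample_rate, cutoff_hz),
    )

# Demo with synthetic signals (hypothetical data, not codec output).
sample_rate = 44100  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
reference = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 10000 * t)
rng = np.random.default_rng(0)
# "Preserved" case: the whole signal survives with a little added noise.
preserved = reference + 0.01 * rng.standard_normal(sample_rate)
# "Destroyed" case: the 10 kHz component is replaced by broadband noise.
destroyed = np.sin(2 * np.pi * 1000 * t) + 0.35 * rng.standard_normal(sample_rate)
print("HF SI-SDR, preserved:", hf_si_sdr(reference, preserved, sample_rate))
print("HF SI-SDR, destroyed:", hf_si_sdr(reference, destroyed, sample_rate))
```

In the second case the low band is intact, so a full-band metric would still look reasonable, while the band-limited SI-SDR goes negative. That is the kind of failure a band-restricted metric is meant to expose.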
Tech Stack
- Language: Python 3.9+
- Framework: PyTorch
- Architecture: Dual-path encoder-decoder with FSQ quantization
- Training: GAN-based (Multi-Period + Multi-Scale Discriminators)
- Losses: L1 reconstruction + multi-scale STFT + mel-spectrogram + adversarial + feature matching
- Analysis: librosa, pyloudnorm (ITU-R BS.1770), soundfile
- Evaluation: Custom MUSHRA interface (built in SoundPrivate), 12+ metrics
- Training Infra: RTX 4090, ~111 hours per 500K steps, WandB logging, mixed precision
- Testing: pytest, 240+ test cases across 9 modules
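As an illustration of the multi-scale STFT term listed under Losses, the sketch below compares magnitude spectrograms at several FFT sizes. The formulation (spectral convergence plus log-magnitude L1), the FFT sizes, and the hop of one quarter window are common choices assumed here, not values confirmed by the project. It uses NumPy for readability; a training loss would need a differentiable STFT in PyTorch.

```python
import numpy as np

def stft_magnitude(x, n_fft, hop):
    """Magnitude STFT with a Hann window; trailing samples that do not fill a frame are dropped."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def multi_scale_stft_loss(reference, estimate, fft_sizes=(512, 1024, 2048), eps=1e-7):
    """Average of (spectral convergence + log-magnitude L1) over several resolutions.

    fft_sizes is a hypothetical choice; inputs must be at least max(fft_sizes) samples long.
    """
    total = 0.0
    for n_fft in fft_sizes:
        hop = n_fft // 4
        ref_mag = stft_magnitude(reference, n_fft, hop)
        est_mag = stft_magnitude(estimate, n_fft, hop)
        # Spectral convergence: relative Frobenius-norm error of the magnitudes.
        spectral_convergence = np.linalg.norm(ref_mag - est_mag) / (np.linalg.norm(ref_mag) + eps)
        # Log-magnitude distance: emphasizes errors in low-energy regions.
        log_magnitude = np.mean(np.abs(np.log(ref_mag + eps) - np.log(est_mag + eps)))
        total += spectral_convergence + log_magnitude
    return total / len(fft_sizes)

rng = np.random.default_rng(0)
clean = rng.standard_normal(8192)
noisy = clean + 0.1 * rng.standard_normal(8192)
print("loss(clean, clean):", multi_scale_stft_loss(clean, clean))
print("loss(clean, noisy):", multi_scale_stft_loss(clean, noisy))
```

Short windows resolve transients and long windows resolve pitch, so summing over several sizes penalizes errors that a single resolution would miss.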
Engineering Highlights
- 01 Asymmetric temporal resolution: First neural audio codec to use different downsampling ratios per frequency band. 128x downsampling for the high-frequency band gives 4x finer temporal resolution than DAC's uniform 512x, which improves transient preservation.
- 02 FSQ over VQ: Finite Scalar Quantization eliminates codebook collapse (identified in Phase 1 as the root cause of HF degradation). No learnable codebooks, no EMA updates, no dead codes.
- 03 240+ tests with gradient diagnostics: The test suite includes gradient flow checks, energy conservation validation per frequency band, and end-to-end training pipeline tests. Not typical for research code.
- 04 MUSHRA evaluation pipeline: Built a standardized listening test interface (in SoundPrivate) specifically for codec comparison. Scientific evaluation methodology, not just metrics.
- 05 Comprehensive ablation studies: Validated each design decision: downsampling ratio choice, FSQ vs VQ, band split frequency, quantization layer allocation.
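The FSQ idea from the highlights above can be sketched in a few lines. This is a forward-pass illustration only, under stated assumptions: the level counts `(7, 5, 5, 5)` and the function names are hypothetical, only odd level counts are handled, and the straight-through gradient used during training is omitted. It shows why there is no learned codebook: each latent dimension is bounded and rounded to a fixed integer grid.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Bound each latent dimension with tanh, then round to an integer grid.

    z: array of shape (..., d); levels: d odd integers (levels per dimension).
    Returns integer codes in [-(L-1)/2, (L-1)/2] for each dimension.
    """
    levels = np.asarray(levels)
    if np.any(levels % 2 == 0):
        raise ValueError("this sketch handles odd level counts only")
    half = (levels - 1) / 2.0
    bounded = np.tanh(z) * half  # each dimension now lies in (-half, half)
    return np.round(bounded).astype(np.int64)

def codes_to_index(codes, levels):
    """Flatten per-dimension codes into one token id via mixed-radix encoding."""
    levels = np.asarray(levels)
    digits = codes + (levels - 1) // 2  # shift each code to the range 0..L-1
    bases = np.concatenate(([1], np.cumprod(levels[:-1])))
    return (digits * bases).sum(axis=-1)

levels = (7, 5, 5, 5)  # hypothetical; implicit codebook size is 7 * 5 * 5 * 5 = 875
latents = np.random.default_rng(0).standard_normal((6, 4)) * 2.0
codes = fsq_quantize(latents, levels)
print(codes)
print(codes_to_index(codes, levels))
```

Because the grid is fixed, every code is reachable by construction, which is the property behind the "no dead codes" claim; there are no codebook vectors to update or to collapse.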
Stats
- Commits: 135 over 3.5 months
- Tests: 240+ cases across 9 modules
- Parameters: 11.7M generator, 41.3M discriminator
- Training Data: 275,527 files (187.4 hours)
- Evaluation: 62-file test set, 12+ metrics
- Paper: In draft