Codec Training Code

#1
by yukiarimo - opened

Hello!

Please release the code for fine-tuning your Qwen Codec (audio tokenizer) and for training it from scratch. No WavLM or anything else, just a pure PyTorch audio-in, audio-out loss. I want to train from scratch on my 48 kHz, LJSpeech-sized dataset!

Thanks!

+1

Seconding this. I really want to learn how to train this tokenizer from scratch on my own acoustic datasets.

Updates?

Please look into the script. There's no gradient or loss calculation for the encoder or decoder.

I was also waiting for the official training code, but since it still hasn't been released, I made my own. Feel free to use it if you like. The script fine-tunes only the decoder, so codec compatibility is preserved. As an added bonus, you can tweak the upsampler settings for 48 kHz output.
https://github.com/takuma104/Qwen3-TTS-Tokenizer-12Hz-Trainer
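
For anyone wondering what "decoder-only fine-tuning with a pure PyTorch audio-in, audio-out loss" could look like in general, here is a minimal sketch. It is not taken from the linked repository or from any official Qwen code. `ToyCodec`, `stft_loss`, and `train_step` are hypothetical names, the tiny conv layers are stand-ins for a real codec, and the loss (waveform L1 plus a single-resolution STFT magnitude term) is just one common choice, assumed here for illustration.

```python
import torch
import torch.nn as nn


class ToyCodec(nn.Module):
    """Hypothetical stand-in for a codec; NOT the real Qwen tokenizer architecture."""

    def __init__(self, hop=4, dim=8):
        super().__init__()
        self.encoder = nn.Conv1d(1, dim, kernel_size=hop, stride=hop)
        self.decoder = nn.ConvTranspose1d(dim, 1, kernel_size=hop, stride=hop)

    def forward(self, wav):
        # The encoder is frozen, so its output is computed without gradients.
        with torch.no_grad():
            latents = self.encoder(wav)
        return self.decoder(latents)


def stft_loss(pred, target, n_fft=16, hop=4):
    """L1 distance between STFT magnitudes (single resolution, for brevity)."""
    window = torch.hann_window(n_fft, device=pred.device)

    def mag(x):
        spec = torch.stft(x.squeeze(1), n_fft, hop_length=hop,
                          window=window, return_complex=True)
        return spec.abs()

    return (mag(pred) - mag(target)).abs().mean()


def train_step(model, optimizer, wav):
    """One reconstruction step: audio in, audio out, loss on the waveform pair."""
    recon = model(wav)
    loss = (recon - wav).abs().mean() + stft_loss(recon, wav)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = ToyCodec()

# Freeze the encoder so the latent space (and thus compatibility) is unchanged.
for p in model.encoder.parameters():
    p.requires_grad_(False)

# Only decoder parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(model.decoder.parameters(), lr=1e-3)

wav = torch.randn(2, 1, 64)  # fake batch: (batch, channels, samples)
loss = train_step(model, optimizer, wav)
print(f"loss after one step: {loss:.4f}")
```

Because the encoder never receives gradients, the latents it produces stay the same before and after training, which is the sense in which a decoder-only setup keeps compatibility. Training from scratch would instead need gradients through the encoder and quantizer, which this sketch deliberately leaves out.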
