Codec Training Code

by yukiarimo - opened Jan 22

Jan 22

Hello!

Please release the code for fine-tuning your Qwen Codec (audio tokenizer) and for training it from scratch. No WavLM or anything else, just a pure PyTorch audio-in, audio-out loss. I want to train from scratch on my 48 kHz LJSpeech-sized dataset!

Thanks!
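
To make the request concrete, here is a minimal sketch of what a "pure PyTorch audio-in, audio-out loss" could look like: a waveform L1 term plus a multi-resolution log-magnitude STFT term. This is an assumption for illustration, not the loss Qwen uses; the function names `stft_magnitude` and `reconstruction_loss` and the FFT sizes are hypothetical.

```python
import torch
import torch.nn.functional as F


def stft_magnitude(wave: torch.Tensor, n_fft: int) -> torch.Tensor:
    """Magnitude spectrogram of a (batch, samples) waveform."""
    window = torch.hann_window(n_fft, device=wave.device, dtype=wave.dtype)
    spec = torch.stft(
        wave,
        n_fft=n_fft,
        hop_length=n_fft // 4,
        window=window,
        return_complex=True,
    )
    return spec.abs()


def reconstruction_loss(pred, target, fft_sizes=(512, 1024, 2048)):
    """Waveform L1 plus log-magnitude L1 at several STFT resolutions.

    Hypothetical loss for illustration; no pretrained model (e.g. WavLM)
    is involved, only the two waveforms.
    """
    loss = F.l1_loss(pred, target)
    for n_fft in fft_sizes:
        p = stft_magnitude(pred, n_fft)
        t = stft_magnitude(target, n_fft)
        # Small epsilon keeps the log finite for silent bins.
        loss = loss + F.l1_loss(torch.log(p + 1e-5), torch.log(t + 1e-5))
    return loss


# Usage: 0.1 s of random "audio" at 48 kHz, batch of 2.
target = torch.randn(2, 4800)
pred = torch.randn(2, 4800, requires_grad=True)
loss = reconstruction_loss(pred, target)
loss.backward()  # gradients flow back to the predicted waveform
```

In a real training loop `pred` would be the decoder output for `target`, and codec training recipes usually add adversarial and quantizer losses on top of a term like this.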

Ranjit

Jan 23

simusid

Jan 24

Seconding this. I really want to learn how to train this tokenizer from scratch on my own acoustic datasets.

yukiarimo

Feb 1

Updates?

neuralworm

Feb 6

https://github.com/QwenLM/Qwen3-TTS
(see the finetuning folder)

yukiarimo

Feb 6

Please look at the script. There are no gradients and no loss calculation for the encoder or the decoder.

takuma104

Mar 9

I was also waiting for the official training code, but since it still hasn't been released, I wrote my own. Feel free to use it if you like. The script fine-tunes only the decoder, so codec compatibility is preserved. As a bonus, you can tweak the upsampler settings for 48 kHz output.
https://github.com/takuma104/Qwen3-TTS-Tokenizer-12Hz-Trainer
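
For readers wondering what "fine-tuning only the decoder" means in practice, here is a minimal sketch of the general idea in PyTorch. It is not the linked script: `ToyCodec`, its `encoder`/`decoder` attribute names, and `freeze_all_but_decoder` are hypothetical stand-ins, and the real tokenizer also has a quantizer that is omitted here. Because the encoder weights never change, the codes it produces stay the same, which is why compatibility is preserved.

```python
import torch
from torch import nn


class ToyCodec(nn.Module):
    """Stand-in for a codec; the real model's attribute names may differ."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv1d(1, 8, kernel_size=4, stride=4)
        self.decoder = nn.ConvTranspose1d(8, 1, kernel_size=4, stride=4)

    def forward(self, wave):
        # The encoder is frozen, so no graph is needed for it.
        with torch.no_grad():
            codes = self.encoder(wave)
        return self.decoder(codes)


def freeze_all_but_decoder(model):
    """Disable gradients everywhere, then re-enable them for the decoder."""
    for p in model.parameters():
        p.requires_grad_(False)
    for p in model.decoder.parameters():
        p.requires_grad_(True)
    return [p for p in model.parameters() if p.requires_grad]


codec = ToyCodec()
trainable = freeze_all_but_decoder(codec)
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

# One training step on random data shaped (batch, channels, samples).
wave = torch.randn(2, 1, 64)
encoder_before = codec.encoder.weight.detach().clone()
decoder_before = codec.decoder.weight.detach().clone()

loss = nn.functional.l1_loss(codec(wave), wave)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Only the decoder moved; the encoder (and thus the codes) is untouched.
print(torch.equal(codec.encoder.weight, encoder_before))
print(torch.equal(codec.decoder.weight, decoder_before))
```

Passing only the trainable parameters to the optimizer also keeps weight decay from touching the frozen encoder.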
