MTP Implementation

#1
by kldzj - opened

The README mentions:

Initial quantized release without MTP implementation

Are you planning a release with MTP, and is that technically possible right now? If so, would it also be feasible to release a new GLM-4.5 Air with MTP layers?

Thank you for your hard work. :)

cyankiwi org

Hi @kldzj, thank you for your interest. Technically, it is possible to have MTP layers in quantized GLM models, but I am not sure whether they are directly compatible with vLLM and SGLang.

I am still working on it.

Interesting, could you keep us updated in this discussion? :)

This one supports MTP, but its hit rate is pretty low (still worth enabling, in my opinion):
https://huggingface.co/QuantTrio/GLM-4.7-AWQ
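
For a rough sense of why a low hit rate can still pay off, here is a minimal back-of-the-envelope sketch. It assumes each drafted token is accepted independently with the same probability, that verification stops at the first rejected token, and that the extra cost of running the MTP head is ignored. The function name `expected_tokens_per_step` is made up for illustration; it is not part of vLLM, SGLang, or the linked model.

```python
def expected_tokens_per_step(accept_rate: float, num_draft_tokens: int = 1) -> float:
    """Estimate tokens emitted per forward pass of the main model.

    Assumptions (illustrative only): every draft token is accepted
    independently with probability `accept_rate`, and drafting stops
    at the first rejection. MTP head overhead is not modeled.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if num_draft_tokens < 0:
        raise ValueError("num_draft_tokens must be non-negative")
    # The main model always contributes one token (the i = 0 term).
    # The i-th draft token only counts if all i drafts up to it were accepted,
    # which happens with probability accept_rate ** i.
    return sum(accept_rate ** i for i in range(num_draft_tokens + 1))


if __name__ == "__main__":
    # Compare a few hypothetical hit rates with a single draft token.
    for rate in (0.2, 0.4, 0.6):
        print(rate, expected_tokens_per_step(rate, num_draft_tokens=1))
```

Under these assumptions, a single draft token with a 40% hit rate still yields roughly 1.4 tokens per step instead of 1, so the gain stays positive as long as the drafting overhead is small. Real numbers depend on the model, the quantization, and the serving engine.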
