MTP or another speculative decoding method?

#34
by CHNtentes - opened

GLM 4.5 can decode 2-3x faster with MTP (multi-token prediction) enabled.
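For a rough sense of where a number like that could come from, here is a back-of-the-envelope sketch. It is not taken from GLM 4.5 or any particular inference engine: the function name `expected_tokens_per_step` and the acceptance rates are hypothetical, and it assumes each drafted token is accepted independently with the same probability and that drafting costs nothing. Under those assumptions, the expected number of tokens produced per verification pass of the main model is the standard geometric sum `(1 - a^(k+1)) / (1 - a)`.

```python
def expected_tokens_per_step(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per verification pass of the main model.

    Assumptions (illustrative only, not measured on any model):
    - each drafted token is accepted independently with probability accept_rate
    - draft_len tokens are drafted before every verification pass
    - the verification pass always contributes one token of its own
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_rate == 1.0:
        # Every draft is accepted, plus the one token from the verifier.
        return float(draft_len + 1)
    # Geometric sum 1 + a + a^2 + ... + a^draft_len
    return (1.0 - accept_rate ** (draft_len + 1)) / (1.0 - accept_rate)


if __name__ == "__main__":
    # Hypothetical acceptance rates and draft lengths, for illustration only.
    for accept_rate in (0.6, 0.8, 0.9):
        for draft_len in (1, 2, 3):
            tokens = expected_tokens_per_step(accept_rate, draft_len)
            print(f"accept={accept_rate:.1f} draft_len={draft_len} "
                  f"tokens/step={tokens:.2f}")
```

This is an upper bound on the speedup, since it ignores the cost of running the draft heads and any batching effects, so real gains would be somewhat lower.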
