Fix garbled inference output when use_cache = False
#85
by ShiJueXiaofei - opened
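When the original model is loaded with use_cache = False, the input_ids truncation for next_token prediction only checks is_first_forward. The truncation therefore still runs, and only the newest token is written to input_ids. Because no past_key_value is passed in this case, the model sees a single token with no context and produces garbled output.
The newest predicted token should be sliced off and passed to the model only when is_first_forward == False and self.config.use_cache == True. Otherwise, the original text sequence plus all tokens predicted so far must be passed in.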
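The condition described above can be sketched roughly as follows. This is a minimal illustration, not the merged patch: `select_step_inputs` is a hypothetical helper, and plain Python lists stand in for the tensors the model code actually uses.

```python
from typing import List


def select_step_inputs(
    input_ids: List[List[int]],
    is_first_forward: bool,
    use_cache: bool,
) -> List[List[int]]:
    """Hypothetical helper: choose the tokens to feed the model at one step.

    Only when a KV cache is in use AND this is not the first forward pass
    can the input be cut down to the newest token. In every other case the
    model has no cached context, so it needs the full sequence.
    """
    if not is_first_forward and use_cache:
        # Cached keys/values cover the earlier tokens; send only the last one.
        return [seq[-1:] for seq in input_ids]
    # No cache (or first pass): send the prompt plus all generated tokens.
    return input_ids


# One batch row: prompt tokens followed by tokens generated so far.
prompt_plus_generated = [[101, 7, 42, 9]]

cached_step = select_step_inputs(prompt_plus_generated, False, True)
uncached_step = select_step_inputs(prompt_plus_generated, False, False)
first_step = select_step_inputs(prompt_plus_generated, True, True)
```

With use_cache = False the full sequence is kept on every step, which is the behavior this PR asks for.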
zxdu20 changed pull request status to merged