Translation not working, outputting garbage tokens
I used the usage code provided in this repo:
from transformers import T5ForConditionalGeneration, T5Tokenizer
model_name = 'jbochi/madlad400-3b-mt'
model = T5ForConditionalGeneration.from_pretrained(model_name, device_map="auto")
tokenizer = T5Tokenizer.from_pretrained(model_name)
text = "<2pt> I love pizza!"
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids)
tokenizer.decode(outputs[0], skip_special_tokens=True)
and I got this output:
'1000000000000000000'
print(input_ids) gave: tensor([[ 805, 116, 908, 10108, 88792, 918, 2]], device='cuda:0')
And here is the output tensor before decoding: tensor([[ 0, 805, 808, 813, 813, 813, 813, 813, 813, 813, 813, 813, 813, 813,
813, 813, 813, 813, 813, 813, 813]], device='cuda:0')
Am I missing something here?
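For anyone trying to confirm they are hitting the same failure, here is a minimal sketch that flags this kind of output, where one token id repeats for most of the sequence. The helper names (longest_run, looks_degenerate) and the 0.5 threshold are illustrative assumptions, not part of transformers; the ids are the ones printed above.

```python
from itertools import groupby
from typing import Sequence, Tuple


def longest_run(ids: Sequence[int]) -> Tuple[int, int]:
    """Return (token_id, run_length) for the longest run of one repeated id."""
    best_id, best_len = -1, 0
    for token_id, group in groupby(ids):
        run = sum(1 for _ in group)
        if run > best_len:
            best_id, best_len = token_id, run
    return best_id, best_len


def looks_degenerate(ids: Sequence[int], threshold: float = 0.5) -> bool:
    """True if a single repeated id fills at least `threshold` of the sequence.

    The threshold is an arbitrary illustrative choice, not a tuned value.
    """
    if not ids:
        return False
    _, run = longest_run(ids)
    return run / len(ids) >= threshold


# Token ids copied from the tensors in this post.
input_ids = [805, 116, 908, 10108, 88792, 918, 2]
output_ids = [0, 805, 808] + [813] * 18

print(longest_run(output_ids))
print(looks_degenerate(input_ids), looks_degenerate(output_ids))
```

With a real model, the same check could be applied to outputs[0].tolist() before decoding.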
I'm the same—I thought it was just me or that my hardware was the problem :)
Yes, I thought the same, that it must be a hardware issue, but it's not. Something is broken here, but I haven't figured it out yet. Let me know if you find a solution, or even an alternative model.
Same for me. Curiously, the 10B model still works.
Hey, the issue is with the tokenizer. You can use a SentencePiece tokenizer model to resolve this. Here is a repo I used: Heng666/madlad400-3b-mt-ct2-int8
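If the tokenizer is the suspect, one way to check is to encode the same text with two tokenizers and see where the ids first differ. The sketch below is only an illustration under assumptions: first_mismatch and compare_tokenizers are hypothetical helpers, and the two toy encoders are stand-ins for the real ones (for example the Transformers tokenizer and a SentencePiece processor), which are not loaded here.

```python
from typing import Callable, List, Optional, Sequence

Encoder = Callable[[str], List[int]]


def first_mismatch(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """Return the first index where two id sequences differ, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One sequence is a strict prefix of the other.
        return min(len(a), len(b))
    return None


def compare_tokenizers(text: str, encode_a: Encoder, encode_b: Encoder) -> Optional[int]:
    """Encode `text` with both encoders and report the first differing position."""
    return first_mismatch(encode_a(text), encode_b(text))


# Toy stand-in encoders (hypothetical): word lengths as ids, one appends an EOS id of 2.
def toy_a(text: str) -> List[int]:
    return [len(word) for word in text.split()]


def toy_b(text: str) -> List[int]:
    return [len(word) for word in text.split()] + [2]


print(compare_tokenizers("<2pt> I love pizza!", toy_a, toy_a))  # None: identical ids
print(compare_tokenizers("<2pt> I love pizza!", toy_a, toy_b))  # 4: extra trailing id
```

In practice each encoder would be a small wrapper that returns a plain list of ints from whichever tokenizer is being tested; a mismatch near the start of the sequence would point at how the language tag such as <2pt> is being split.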