How to use sdadas/xlm-roberta-large-twitter with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="sdadas/xlm-roberta-large-twitter")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("sdadas/xlm-roberta-large-twitter")
    model = AutoModelForMaskedLM.from_pretrained("sdadas/xlm-roberta-large-twitter")
XLM-RoBERTa-large-twitter
This is an XLM-RoBERTa-large model tuned on a corpus of over 156 million tweets in ten languages: English, Spanish, Italian, Portuguese, French, Chinese, Hindi, Arabic, Dutch and Korean. Training started from the original XLM-RoBERTa-large checkpoint and ran for 2 epochs with a batch size of 1024.
For best results, preprocess the tweets using the following method before passing them to the model:
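    def preprocess(text):
        new_text = []
        for t in text.split(" "):
            # Replace user mentions with a generic '@user' placeholder
            t = '@user' if t.startswith('@') and len(t) > 1 else t
            # Replace links with a generic 'http' placeholder
            t = 'http' if t.startswith('http') else t
            new_text.append(t)
        return " ".join(new_text)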
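As a rough illustration of how the preprocessing step might be combined with the fill-mask pipeline, here is a minimal sketch. The helpers `build_fill_mask` and `predict_masked` and the sample tweet are hypothetical names introduced for this example only; they are not part of the model or of Transformers. The sketch assumes the tokenizer's mask token is `<mask>`, as in the original XLM-RoBERTa.

```python
def preprocess(text):
    """Normalize a tweet as recommended above: mentions -> '@user', links -> 'http'."""
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


def build_fill_mask():
    """Hypothetical helper: create the fill-mask pipeline (downloads the checkpoint on first call)."""
    from transformers import pipeline
    return pipeline("fill-mask", model="sdadas/xlm-roberta-large-twitter")


def predict_masked(text, fill_mask, top_k=3):
    """Hypothetical helper: preprocess a tweet with one mask, then return the top_k predicted tokens."""
    results = fill_mask(preprocess(text), top_k=top_k)
    return [r["token_str"] for r in results]


# Example tweet (made up for illustration); only the preprocessing is run here.
tweet = "@alice this new phone is <mask> https://example.com/review"
print(preprocess(tweet))
```

Calling `predict_masked(tweet, build_fill_mask())` would run the model itself. Model loading is kept in a separate function so the preprocessing can be checked without downloading the checkpoint.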