| language: | |
| - multilingual | |
| - en | |
| - fr | |
| - es | |
| - de | |
| - el | |
| - bg | |
| - ru | |
| - tr | |
| - ar | |
| - vi | |
| - th | |
| - zh | |
| - hi | |
| - sw | |
| - ur | |
| tags: | |
| - text-classification | |
| - pytorch | |
| - tensorflow | |
| - zero-shot-classification | |
| - xlm-roberta | |
| - multilingual | |
| - nli | |
| - natural-language-inference | |
| datasets: | |
| - multi_nli | |
| - xnli | |
| license: mit | |
| pipeline_tag: zero-shot-classification | |
| library_name: transformers | |
| model-index: | |
| - name: xlm-roberta-large-xnli | |
| results: | |
| - task: | |
| type: zero-shot-classification | |
| name: Zero-Shot Classification | |
| dataset: | |
| name: XNLI | |
| type: xnli | |
| metrics: | |
| - type: accuracy | |
| value: 0.834 | |
| name: Accuracy | |
| - type: f1 | |
| value: 0.833 | |
| name: F1 Score | |
| widget: | |
| - text: "За кого вы голосуете в 2020 году?" | |
| candidate_labels: "politique étrangère, Europe, élections, affaires, politique" | |
| multi_class: true | |
| example_title: "Russian Political Classification" | |
| - text: "لمن تصوت في 2020؟" | |
| candidate_labels: "السياسة الخارجية, أوروبا, الانتخابات, الأعمال, السياسة" | |
| multi_class: true | |
| example_title: "Arabic Political Classification" | |
| - text: "2020'de kime oy vereceksiniz?" | |
| candidate_labels: "dış politika, Avrupa, seçimler, ticaret, siyaset" | |
| multi_class: true | |
| example_title: "Turkish Political Classification" | |
| - text: "I love this movie" | |
| candidate_labels: "positive, negative, neutral" | |
| multi_class: false | |
| example_title: "English Sentiment Analysis" | |
| # XLM-RoBERTa Large for Zero-Shot Classification (XNLI) | |
| ## Model Description | |
| This model is based on the excellent work by [joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli). It takes [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) and fine-tunes it on a combination of NLI data in 15 languages. | |
| **Original Model Credit**: This model is a copy of [joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli) by Joe Davison. All credit for the training and development goes to the original author. | |
| This model is intended to be used for zero-shot text classification, such as with the Hugging Face [ZeroShotClassificationPipeline](https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.ZeroShotClassificationPipeline). | |
| ## Quick Start | |
| ```python | |
| from transformers import pipeline | |
| # Load the zero-shot classification pipeline | |
| classifier = pipeline("zero-shot-classification", | |
| model="YOUR_USERNAME/zero-shot-classification") | |
| # Example usage | |
| text = "I love this new smartphone, it's amazing!" | |
| candidate_labels = ["technology", "sports", "politics", "entertainment"] | |
| result = classifier(text, candidate_labels) | |
| print(result) | |
| ``` | |
| ## Intended Usage | |
| This model is intended to be used for zero-shot text classification, especially in languages other than English. It is fine-tuned on XNLI, which is a multilingual NLI dataset. The model can therefore be used with any of the languages in the XNLI corpus: | |
| - English | |
| - French | |
| - Spanish | |
| - German | |
| - Greek | |
| - Bulgarian | |
| - Russian | |
| - Turkish | |
| - Arabic | |
| - Vietnamese | |
| - Thai | |
| - Chinese | |
| - Hindi | |
| - Swahili | |
| - Urdu | |
| Since the base model was pre-trained on 100 different languages, the | |
| model has shown some effectiveness in languages beyond those listed above as | |
| well. See the full list of pre-trained languages in Appendix A of the | |
| [XLM-RoBERTa paper](https://arxiv.org/abs/1911.02116). | |
| For English-only classification, it is recommended to use | |
| [bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli) or | |
| [a distilled bart MNLI model](https://huggingface.co/models?filter=pipeline_tag%3Azero-shot-classification&search=valhalla). | |
| ### Using the zero-shot classification pipeline | |
| The model can be loaded with the `zero-shot-classification` pipeline like so: | |
| ```python | |
| from transformers import pipeline | |
| classifier = pipeline("zero-shot-classification", | |
| model="YOUR_USERNAME/zero-shot-classification") | |
| ``` | |
| You can then classify in any of the above languages. You can even pass the labels in one language and the sequence to | |
| classify in another: | |
| ```python | |
| # we will classify the Russian translation of, "Who are you voting for in 2020?" | |
| sequence_to_classify = "За кого вы голосуете в 2020 году?" | |
| # we can specify candidate labels in Russian or any other language above: | |
| candidate_labels = ["Europe", "public health", "politics"] | |
| classifier(sequence_to_classify, candidate_labels) | |
| # {'labels': ['politics', 'Europe', 'public health'], | |
| # 'scores': [0.9048484563827515, 0.05722189322113991, 0.03792969882488251], | |
| # 'sequence': 'За кого вы голосуете в 2020 году?'} | |
| ``` | |
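The returned dictionary lists labels and scores in matching order, so it can be post-processed with ordinary Python. The sketch below is only an illustration: the helper name `labels_above` and the `0.5` threshold are hypothetical choices, and the `result` value is copied from the output shown above rather than produced by running the model.

```python
def labels_above(result, threshold=0.5):
    """Return (label, score) pairs whose score meets the threshold, best first."""
    ranked = sorted(
        zip(result["labels"], result["scores"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [(label, score) for label, score in ranked if score >= threshold]


# Output copied from the example above (not recomputed here).
result = {
    "labels": ["politics", "Europe", "public health"],
    "scores": [0.9048484563827515, 0.05722189322113991, 0.03792969882488251],
    "sequence": "За кого вы голосуете в 2020 году?",
}

# A threshold of 0.0 keeps every label, so the first entry is the top prediction.
top_label, top_score = labels_above(result, threshold=0.0)[0]
print(top_label, round(top_score, 3))
print(labels_above(result))
```

A suitable threshold depends on the task and the label set, so treat `0.5` as a starting point rather than a recommendation.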
| The default hypothesis template is in English: `This example is {}.` If you are working strictly within one language, it | |
| may be worthwhile to translate the template into that language: | |
| ```python | |
| sequence_to_classify = "¿A quién vas a votar en 2020?" | |
| candidate_labels = ["Europa", "salud pública", "política"] | |
| hypothesis_template = "Este ejemplo es {}." | |
| classifier(sequence_to_classify, candidate_labels, hypothesis_template=hypothesis_template) | |
| # {'labels': ['política', 'Europa', 'salud pública'], | |
| # 'scores': [0.9109585881233215, 0.05954807624220848, 0.029493311420083046], | |
| # 'sequence': '¿A quién vas a votar en 2020?'} | |
| ``` | |
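To make the role of the template concrete, the sketch below shows how a sequence and its candidate labels can be expanded into NLI premise/hypothesis pairs before scoring. The helper name `build_nli_pairs` is hypothetical and is not part of Transformers. It only illustrates the string formatting, assuming one hypothesis is generated per label.

```python
def build_nli_pairs(sequence, candidate_labels, hypothesis_template="This example is {}."):
    """Pair the sequence (premise) with one formatted hypothesis per label."""
    if "{}" not in hypothesis_template:
        raise ValueError("hypothesis_template must contain a '{}' placeholder")
    return [(sequence, hypothesis_template.format(label)) for label in candidate_labels]


pairs = build_nli_pairs(
    "¿A quién vas a votar en 2020?",
    ["Europa", "salud pública", "política"],
    hypothesis_template="Este ejemplo es {}.",
)
for premise, hypothesis in pairs:
    print(premise, "->", hypothesis)
```

Each pair is then scored by the NLI model, so a template written in the same language as the sequence keeps premise and hypothesis monolingual.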
| ### Using with manual PyTorch | |
| ```python | |
| # pose the sequence as an NLI premise and the label as a hypothesis | |
| import torch | |
| from transformers import AutoModelForSequenceClassification, AutoTokenizer | |
| device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') | |
| nli_model = AutoModelForSequenceClassification.from_pretrained('YOUR_USERNAME/zero-shot-classification').to(device) | |
| tokenizer = AutoTokenizer.from_pretrained('YOUR_USERNAME/zero-shot-classification') | |
| # example inputs; replace with your own text and candidate label | |
| sequence = 'Who are you voting for in 2020?' | |
| label = 'politics' | |
| premise = sequence | |
| hypothesis = f'This example is {label}.' | |
| # run through the model fine-tuned on NLI, truncating only the premise if the pair is too long | |
| x = tokenizer.encode(premise, hypothesis, return_tensors='pt', | |
| truncation='only_first') | |
| logits = nli_model(x.to(device))[0] | |
| # we throw away "neutral" (dim 1), keep "contradiction" (0) and "entailment" (2), | |
| # and take the probability of "entailment" as the probability of the label being true | |
| entail_contradiction_logits = logits[:,[0,2]] | |
| probs = entail_contradiction_logits.softmax(dim=1) | |
| prob_label_is_true = probs[:,1] | |
| ``` | |
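The arithmetic in the snippet above can be checked without loading the model. The sketch below uses plain Python and made-up logits (not real model outputs), assuming the label order contradiction, neutral, entailment used above. The function names are hypothetical. It also shows one common way to obtain scores that sum to 1 across labels when exactly one label is assumed to be correct: a softmax over the entailment logits of all labels.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def label_probability(logits):
    """Probability that a label holds, from [contradiction, neutral, entailment] logits."""
    contradiction, _neutral, entailment = logits
    # Drop "neutral" and normalize entailment against contradiction only.
    return softmax([contradiction, entailment])[1]


def single_label_scores(logits_per_label):
    """Scores that sum to 1 across labels: softmax over each label's entailment logit."""
    return softmax([logits[2] for logits in logits_per_label])


# Illustrative logits only; real values come from the NLI model.
example_logits = {
    "politics": [-2.0, 0.5, 3.0],
    "Europe": [1.0, 0.2, -1.0],
}
independent = {label: label_probability(l) for label, l in example_logits.items()}
normalized = single_label_scores(list(example_logits.values()))
print(independent)
print(normalized)
```

Independent scores suit cases where several labels may apply at once, while normalized scores suit a forced choice among the candidates.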
| ## Training | |
| The base model was pre-trained on a set of 100 languages, as described in | |
| [the original paper](https://arxiv.org/abs/1911.02116). It was then fine-tuned on the task of NLI on the concatenated | |
| MNLI train set and the XNLI validation and test sets. Finally, it was trained for one additional epoch on only XNLI | |
| data in which the translations are shuffled so that the premise and hypothesis of each example come from the same | |
| original English example but are written in different languages. | |
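The final shuffling step can be pictured with a small sketch. It assumes each parallel example is stored as a dict mapping a language code to its `(premise, hypothesis)` translation; that data layout, the toy sentences, and the function name `mix_languages` are all hypothetical, and this is not the original training code.

```python
import random


def mix_languages(parallel_example, rng):
    """Pick the premise and hypothesis from two different languages of one example."""
    # sample() draws without replacement, so the two languages always differ.
    premise_lang, hypothesis_lang = rng.sample(sorted(parallel_example), 2)
    premise = parallel_example[premise_lang][0]
    hypothesis = parallel_example[hypothesis_lang][1]
    return premise_lang, premise, hypothesis_lang, hypothesis


# Toy translations of one English example (illustrative data, not taken from XNLI).
example = {
    "en": ("A man is playing a guitar.", "A person is making music."),
    "es": ("Un hombre está tocando una guitarra.", "Una persona está haciendo música."),
    "fr": ("Un homme joue de la guitare.", "Une personne fait de la musique."),
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(mix_languages(example, rng))
```

Because both sentences come from the same underlying example, the NLI label (entailment, neutral, or contradiction) carries over unchanged to the mixed-language pair.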
| ## Model Performance | |
| The model-index metadata of this card reports an accuracy of 0.834 and an F1 score of 0.833 on XNLI. For detailed performance metrics, please refer to the [original model](https://huggingface.co/joeddav/xlm-roberta-large-xnli). | |
| ## Limitations and Bias | |
| - The model may have biases inherited from the training data (MNLI and XNLI datasets) | |
| - Performance may vary across different languages and domains | |
| - The model works best with the 15 languages explicitly included in the XNLI training data | |
| - For English-only tasks, consider using specialized English models like `facebook/bart-large-mnli` | |
| ## Citation | |
| If you use this model, please cite the original work: | |
| ```bibtex | |
| @misc{davison2020zero, | |
| title={Zero-Shot Learning in Modern NLP}, | |
| author={Joe Davison}, | |
| year={2020}, | |
| howpublished={\url{https://joeddav.github.io/blog/2020/05/29/ZSL.html}}, | |
| } | |
| ``` | |
| ## License | |
| This model is released under the MIT License, following the original model's licensing. | |
| ## Contact | |
| This is a copy of the original model by Joe Davison. For questions about the model architecture and training, please refer to the [original repository](https://huggingface.co/joeddav/xlm-roberta-large-xnli). | |