---
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget: []
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
library_name: setfit
inference: true
license: mit
datasets:
- NLBSE/nlbse25-code-comment-classification
language:
- en
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---

# Python comment classifier

This is a [SetFit](https://github.com/huggingface/setfit) model for classifying Python code comments.

The model was trained with few-shot learning in two steps:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head on features from the fine-tuned model.
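
The contrastive step learns from pairs of examples: two comments with the same label should end up close together in embedding space, and two with different labels far apart. The sketch below shows one simple way such pairs can be built from a handful of labeled comments. It is an illustration only, not the authors' training code: `make_contrastive_pairs`, the toy comments, and the label names are hypothetical, and it enumerates every pair exhaustively, whereas SetFit samples pairs during training.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data; these labels are placeholders, not the dataset's categories.
comments = [
    "Return the sorted list.",
    "Returns the parsed configuration.",
    "TODO: handle empty input.",
    "FIXME: this breaks on Windows.",
]
labels = ["summary", "summary", "development_notes", "development_notes"]

pairs = make_contrastive_pairs(comments, labels)
for text_a, text_b, target in pairs:
    print(f"{target}\t{text_a!r}\t{text_b!r}")
```

Four labeled comments already yield six training pairs, which is why this kind of pairing suits few-shot settings.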

## Model Description

- **Model Type:** SetFit
- **Classification head:** [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)

## Sources

- **Repository:** [GitHub](https://github.com/fabiancpl/sbert-comment-classification/)
- **Paper:** [Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification](https://ieeexplore.ieee.org/document/11029440)
- **Dataset:** [HF Dataset](https://huggingface.co/datasets/NLBSE/nlbse25-code-comment-classification)

## How to use it

First, install the dependencies:

```bash
pip install setfit scikit-learn
```

Then, load the model and run inference:

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("fabiancpl/nlbse25_python")

# Run inference
preds = model("This function sorts a list of numbers.")
```
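
To classify several comments at once, a small helper can pair each input with its prediction. The following is a minimal sketch rather than part of the authors' repository: `classify_comments` and `stub_predict` are hypothetical names, and the helper assumes the predictor accepts a list of strings and returns one prediction per input. The stub stands in for the real model so the example runs without downloading anything.

```python
def classify_comments(predict, comments):
    """Run `predict` on a batch of comments and pair each comment with its prediction."""
    comments = list(comments)
    if not comments:
        return []
    predictions = list(predict(comments))
    if len(predictions) != len(comments):
        raise ValueError("predictor returned a different number of predictions than inputs")
    return list(zip(comments, predictions))


def stub_predict(batch):
    # Hypothetical stand-in for the real model: flags comments that contain "TODO".
    return [int("TODO" in text) for text in batch]


results = classify_comments(
    stub_predict,
    ["Sorts a list of numbers.", "TODO: support tuples."],
)
for comment, prediction in results:
    print(f"{prediction}\t{comment}")
```

With the real model, pass the loaded `model` object in place of `stub_predict`, for example `classify_comments(model, comments)`. The type and shape of each prediction depend on the trained classification head, so inspect the output before relying on a particular format.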

## Cite as

```bibtex
@inproceedings{11029440,
  author={Peña, Fabian C. and Herbold, Steffen},
  booktitle={2025 IEEE/ACM International Workshop on Natural Language-Based Software Engineering (NLBSE)},
  title={Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification},
  year={2025},
  pages={21-24},
  doi={10.1109/NLBSE66842.2025.00010}
}
```