---
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget: []
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
library_name: setfit
inference: true
license: mit
datasets:
- NLBSE/nlbse25-code-comment-classification
language:
- en
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# Python comment classifier
This is a [SetFit](https://github.com/huggingface/setfit) model for classifying Python code comments.
The model was trained with few-shot learning in two steps:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned model.
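To make step 1 more concrete, here is a minimal sketch of how contrastive training pairs could be built from a handful of labeled comments. The helper name `build_contrastive_pairs`, the example comments, and the labels `summary` and `usage` are hypothetical and chosen only for illustration; the SetFit trainer does its own pair sampling, so this is not the code used to train this model.

```python
from itertools import combinations


def build_contrastive_pairs(examples):
    """Turn (text, label) examples into (text_a, text_b, target) pairs.

    target is 1.0 when both texts share a label (pull embeddings together)
    and 0.0 otherwise (push embeddings apart).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical few-shot examples; the labels are illustrative only.
examples = [
    ("Returns the sorted list.", "summary"),
    ("Computes the mean of the values.", "summary"),
    ("Call fit() before predict().", "usage"),
    ("Pass a file path as the first argument.", "usage"),
]

pairs = build_contrastive_pairs(examples)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Pairs like these are what the Sentence Transformer sees during contrastive fine-tuning. In step 2, the fine-tuned embeddings of the individual comments become the input features for the classification head.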
## Model Description
- **Model Type:** SetFit
- **Classification head:** [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
## Sources
- **Repository:** [GitHub](https://github.com/fabiancpl/sbert-comment-classification/)
- **Paper:** [Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification](https://ieeexplore.ieee.org/document/11029440)
- **Dataset:** [HF Dataset](https://huggingface.co/datasets/NLBSE/nlbse25-code-comment-classification)
## How to use it
First, install the dependencies:
```bash
pip install setfit scikit-learn
```
Then, load the model and run inference:
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("fabiancpl/nlbse25_python")
# Run inference
preds = model("This function sorts a list of numbers.")
```
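If the comments still live inside a source file, something like the following sketch could pull them out before classification. It uses only the standard-library `tokenize` module and collects `#` comments; the helper names `extract_comments` and `classify_comments` are hypothetical, docstrings are not handled, and the source is assumed to be valid Python. Whether the training data was prepared this way is not stated here, so treat this as one possible preprocessing step rather than the authors' pipeline.

```python
import io
import tokenize


def extract_comments(source):
    """Return the text of every non-empty `#` comment in a Python source string."""
    comments = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # Drop the leading '#' characters and surrounding whitespace.
            text = tok.string.lstrip("#").strip()
            if text:
                comments.append(text)
    return comments


def classify_comments(source, model):
    """Pair each extracted comment with the prediction returned by `model`.

    `model` is any callable that maps a list of strings to one prediction
    per string, such as the SetFitModel loaded in the snippet above.
    """
    comments = extract_comments(source)
    if not comments:
        return []
    preds = model(comments)
    return list(zip(comments, preds))


example_source = "x = 1  # set x\n# note\n"
print(extract_comments(example_source))
```

With the model loaded as shown earlier, `classify_comments(example_source, model)` would return one `(comment, prediction)` pair per comment.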
## Cite as
```bibtex
@inproceedings{11029440,
author={Peña, Fabian C. and Herbold, Steffen},
booktitle={2025 IEEE/ACM International Workshop on Natural Language-Based Software Engineering (NLBSE)},
title={Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification},
year={2025},
pages={21-24},
doi={10.1109/NLBSE66842.2025.00010}}
```