How to use PietroSaveri/meme-cluster-classifier with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")

sentences = [
    "Mitochondria, often called 'powerhouses of the cell,' generate most of the cell's ATP through cellular respiration and have their own DNA.",
    "Plate tectonics theory explains that Earth's lithosphere is divided into plates that move, causing earthquakes, volcanoes, and mountain formation.",
    "The Titanic was intentionally sunk as part of an insurance scam by J.P. Morgan.",
    "Why can't you trust a statistician? They're always plotting something, and they have a mean personality.",
]
embeddings = model.encode(sentences)

# Pairwise similarity between every pair of sentences
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2. The main goal of this fine-tuned model is to assign memes to one of three clusters: Conspiracy, Wordplay & Nerd Humor, and Educational Science Humor.
Model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
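The Pooling module above has only `pooling_mode_mean_tokens` enabled and is followed by `Normalize()`, which amounts to a masked mean over token embeddings followed by L2 normalization. The following is a minimal NumPy sketch of that idea on toy data; the function name `mean_pool_and_normalize` and the random token embeddings are illustrative assumptions, not the library's internal code or the model's weights.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean over tokens, then L2-normalize each sentence vector."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy batch: 2 sentences, 3 token positions, 4 dimensions (the real model uses 768).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 3, 4))
mask = np.array([[1, 1, 1], [1, 1, 0]])  # second sentence ends with one padding token

sentence_vecs = mean_pool_and_normalize(tokens, mask)
print(sentence_vecs.shape)                    # (2, 4)
print(np.linalg.norm(sentence_vecs, axis=1))  # each row has (approximately) unit length
```

Because the output vectors have unit length, the dot product of two embeddings equals their cosine similarity.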
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
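```python
import numpy as np
from sentence_transformers import SentenceTransformer

# 1) Load the fine-tuned model
model_name = "PietroSaveri/meme-cluster-classifier"
fine_tuned_model = SentenceTransformer(model_name)

# 2) Seed texts per category. The card does not show the original seed lists;
#    these single-sentence placeholders are assumptions, so replace them with your own.
seed_texts = {
    "Conspiracy": ["The Titanic was intentionally sunk as part of an insurance scam by J.P. Morgan."],
    "Wordplay & Nerd Humor": ["Why can't you trust a statistician? They're always plotting something, and they have a mean personality."],
    "Educational Science Humor": ["Mitochondria, often called 'powerhouses of the cell,' generate most of the cell's ATP through cellular respiration and have their own DNA."],
}

# 3) Compute centroids just once
seed_centroids = {}
for cat, texts in seed_texts.items():
    embs = fine_tuned_model.encode(texts, convert_to_numpy=True)
    seed_centroids[cat] = embs.mean(axis=0)

# 4) Define a tiny helper for cosine similarity
def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# 5) Wrap it all up in a function
def predict(text: str):
    vec = fine_tuned_model.encode(text, convert_to_numpy=True)
    sims = {cat: cosine_sim(vec, centroid) for cat, centroid in seed_centroids.items()}
    # pick the category with the highest similarity
    assigned = max(sims, key=sims.get)
    return sims, assigned

# --- USAGE ---
text = "Why did the biologist go broke? Because his cells were division!"
scores, assigned = predict(text)
print("Raw scores:")
for cat, score in scores.items():
    print(f"  {cat:25s}: {score:.3f}")

# Output reported in the original card (exact values depend on the seed texts):
# Raw scores:
#   Conspiracy               : 0.700
#   Wordplay & Nerd Humor    : 0.907
#   Educational Science Humor: 0.903
```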
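When many texts need to be classified at once, the per-category loop can be replaced by a single matrix product between normalized embeddings and normalized centroids. The sketch below illustrates this with toy 3-dimensional vectors standing in for real embeddings; `assign_clusters` and the toy arrays are hypothetical names and values introduced for illustration only.

```python
import numpy as np

def assign_clusters(embeddings, centroids, labels):
    """Return the cosine similarity matrix and the best-matching label per row."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    sims = e @ c.T  # (num_texts, num_clusters) cosine similarities
    return sims, [labels[i] for i in sims.argmax(axis=1)]

labels = ["Conspiracy", "Wordplay & Nerd Humor", "Educational Science Humor"]

# Toy stand-ins: in practice these come from model.encode(...) and the seed centroids.
centroids = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
embeddings = np.array([[0.9, 0.1, 0.0],
                       [0.1, 0.2, 0.8]])

sims, predicted = assign_clusters(embeddings, centroids, labels)
print(predicted)  # ['Conspiracy', 'Educational Science Humor']
```

Note how close the two humor scores are in the reported output (0.907 vs 0.903); when the top two similarities are this close, it may be worth inspecting the full score row rather than only the argmax.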
Evaluation on the meme-dev-binary dataset with BinaryClassificationEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy | 1.0 |
| cosine_accuracy_threshold | 0.7175 |
| cosine_f1 | 1.0 |
| cosine_f1_threshold | 0.7175 |
| cosine_precision | 1.0 |
| cosine_recall | 1.0 |
| cosine_ap | 1.0 |
| cosine_mcc | 1.0 |
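
The thresholds in the table can be read as a decision rule: a pair of texts is predicted to belong together when their cosine similarity reaches the threshold. The following is a minimal sketch of that rule with toy 2-dimensional vectors; the helper `same_cluster` is a hypothetical name, and treating the threshold as inclusive is an assumption rather than the evaluator's documented behavior.

```python
import numpy as np

COSINE_THRESHOLD = 0.7175  # cosine_accuracy_threshold reported in the table

def same_cluster(vec_a, vec_b, threshold=COSINE_THRESHOLD):
    """Return (cosine similarity, decision) for one pair of embeddings."""
    cos = float(np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))
    return cos, cos >= threshold

# Toy vectors standing in for sentence embeddings.
a = np.array([1.0, 0.0])
b = np.array([0.8, 0.6])  # cosine with a is about 0.8
c = np.array([0.6, 0.8])  # cosine with a is about 0.6

print(same_cluster(a, b))  # similarity above the threshold -> True
print(same_cluster(a, c))  # similarity below the threshold -> False
```

The threshold was selected on the same dev set the metrics are reported on, so it may need re-tuning for other data.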
Training dataset columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |

Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| The cure for AIDS was discovered decades ago but suppressed to reduce world population. | Einstein’s theory of general relativity describes gravity not as a force, but as the curvature of spacetime caused by mass and energy. | 0.0 |
| 5G towers are designed to activate nanoparticles from vaccines for population control. | The Mandela Effect proves we've shifted into an alternate reality. | 1.0 |
| The Georgia Guidestones were a NWO manifesto, destroyed to hide the plans. | Elvis Presley faked his death and is still alive, living in secret. | 1.0 |
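
In the samples, pairs drawn from the same cluster appear to carry label 1.0 and pairs from different clusters 0.0 (an interpretation of the samples, not something the card states). The card names OnlineContrastiveLoss as the training loss. The sketch below is a simplified NumPy illustration of an online contrastive loss, not the library's implementation: cosine distance and a margin of 0.5 are assumptions, and the function name is hypothetical.

```python
import numpy as np

def online_contrastive_loss(emb_a, emb_b, labels, margin=0.5):
    """Contrastive loss computed only on the 'hard' pairs of a batch."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    dist = 1.0 - np.sum(a * b, axis=1)  # cosine distance per pair

    pos = dist[labels == 1]  # pairs that should be close
    neg = dist[labels == 0]  # pairs that should be far apart

    if len(pos) == 0 or len(neg) == 0:
        hard_pos, hard_neg = pos, neg
    else:
        hard_pos = pos[pos > neg.min()]  # positives farther than the closest negative
        hard_neg = neg[neg < pos.max()]  # negatives closer than the farthest positive

    positive_loss = np.sum(hard_pos ** 2)
    negative_loss = np.sum(np.maximum(margin - hard_neg, 0.0) ** 2)
    return float(positive_loss + negative_loss)

# Toy batch of four pairs in 2 dimensions.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
emb_b = np.array([[1.0, 0.0], [0.6, 0.8], [0.8, 0.6], [0.0, 1.0]])
labels = np.array([1.0, 1.0, 0.0, 0.0])

loss = online_contrastive_loss(emb_a, emb_b, labels)
print(round(loss, 4))  # 0.25: one hard positive (0.4^2) plus one hard negative ((0.5 - 0.2)^2)
```

Restricting the loss to hard pairs means that pairs the model already separates well contribute nothing to the gradient.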
Loss: OnlineContrastiveLoss

Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

Training logs:

| Epoch | Step | Training Loss | meme-dev-binary_cosine_ap |
|---|---|---|---|
| 0.5 | 190 | - | 0.9999 |
| 1.0 | 380 | - | 1.0000 |
| 1.3158 | 500 | 0.3125 | - |
| 1.5 | 570 | - | 1.0000 |
| 2.0 | 760 | - | 0.9999 |
| 2.5 | 950 | - | 1.0000 |
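
The training log also hints at the size of the training set, which the card does not state directly. Step 190 corresponds to epoch 0.5, so one epoch is 380 optimizer steps. The short calculation below derives a rough range for the number of training pairs, assuming a single device (not stated in the card) together with the listed batch size of 16, gradient_accumulation_steps of 1, and dataloader_drop_last of False.

```python
steps_at_half_epoch = 190  # from the training log (epoch 0.5)
batch_size = 16            # per_device_train_batch_size
num_devices = 1            # assumption: the card does not state the device count

steps_per_epoch = steps_at_half_epoch * 2
max_pairs = steps_per_epoch * batch_size * num_devices
# The last batch of an epoch may be partial because dataloader_drop_last is False.
min_pairs = (steps_per_epoch - 1) * batch_size * num_devices + 1

print(steps_per_epoch)                  # 380
print(min_pairs, max_pairs)             # 6065 6080
print(round(500 / steps_per_epoch, 4))  # 1.3158, matching the logged epoch at step 500
```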
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: sentence-transformers/all-mpnet-base-v2