Medical BGE-Base for Eldercare

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the medical-embeddings-training-data dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-base-en-v1.5
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Training Dataset:
- medical-embeddings-training-data
Language: en
License: apache-2.0

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
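
As a rough illustration of what modules (1) and (2) above do, the sketch below applies CLS-token pooling and L2 normalization to toy token vectors. It is plain Python with made-up numbers, not the library's implementation, and `cls_pool`, `l2_normalize`, and `dot` are hypothetical helper names.

```python
import math

def cls_pool(token_embeddings):
    """Return the first ([CLS]) token vector, as pooling_mode_cls_token does."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length, mirroring the Normalize() module."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy "token embeddings": 3 tokens x 4 dimensions (the real model uses 768).
tokens_a = [[3.0, 4.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [1.0, 0.0, 0.0, 0.0]]
tokens_b = [[4.0, 3.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))

# For unit-length vectors the dot product equals the cosine similarity.
print(dot(emb_a, emb_b))
```

Because the output is already unit length, cosine similarity and dot product give the same ranking, which is why either can be used downstream.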

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("epaulson2/medical-gte-base-eldercare")
# Run inference
sentences = [
    'Represent this sentence for searching relevant passages: Is Almond Oil Oral safe for elderly?',
    'Almond Oil Oral',
    '[minor] Minimal additive effects possible',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9264, 0.6171],
#         [0.9264, 1.0000, 0.5909],
#         [0.6171, 0.5909, 1.0000]])
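
The query in the example above carries the prefix "Represent this sentence for searching relevant passages: ", while the passages do not. A minimal retrieval sketch along those lines follows. The helper names `with_query_prompt` and `rank_passages` are hypothetical, and the embeddings are toy unit vectors standing in for `model.encode(...)` output so the snippet runs without downloading the model.

```python
QUERY_PROMPT = "Represent this sentence for searching relevant passages: "

def with_query_prompt(question):
    """Prefix a search query the same way the example query is prefixed."""
    if question.startswith(QUERY_PROMPT):
        return question
    return QUERY_PROMPT + question

def rank_passages(query_embedding, passage_embeddings, top_k=3):
    """Return (index, score) pairs sorted by dot product, highest first.

    Assumes all embeddings are unit length, so the dot product
    equals the cosine similarity.
    """
    scores = [
        (i, sum(q * p for q, p in zip(query_embedding, emb)))
        for i, emb in enumerate(passage_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]

query = with_query_prompt("Is Almond Oil Oral safe for elderly?")

# Toy unit vectors standing in for real embeddings (assumption for illustration).
query_embedding = [1.0, 0.0]
passage_embeddings = [[0.0, 1.0], [0.6, 0.8], [0.8, 0.6]]

print(query)
print(rank_passages(query_embedding, passage_embeddings, top_k=2))
```

In real use, the query string returned by `with_query_prompt` would be passed to `model.encode`, and passages would be encoded as-is.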

Evaluation

Metrics

Information Retrieval

Dataset: medical-eval
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.364
cosine_accuracy@3	0.4565
cosine_accuracy@5	0.513
cosine_accuracy@10	0.586
cosine_precision@1	0.364
cosine_precision@3	0.1522
cosine_precision@5	0.1026
cosine_precision@10	0.0586
cosine_recall@1	0.364
cosine_recall@3	0.4565
cosine_recall@5	0.513
cosine_recall@10	0.586
cosine_ndcg@10	0.4651
cosine_mrr@10	0.4277
cosine_map@100	0.437
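
The values above appear consistent with one relevant passage per query: accuracy@k equals recall@k at every k, and precision@k is accuracy@k divided by k (for example, 0.4565 / 3 ≈ 0.1522). The sketch below shows how these metrics relate under that assumption. It is an illustrative reimplementation with toy rankings, not the InformationRetrievalEvaluator code, and `retrieval_metrics` is a hypothetical name.

```python
def retrieval_metrics(ranked_ids_per_query, relevant_id_per_query, k):
    """Compute accuracy@k, precision@k, recall@k and MRR@k.

    Assumes exactly one relevant document per query.
    """
    n = len(ranked_ids_per_query)
    hits = 0
    reciprocal_ranks = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query):
        top_k = ranked[:k]
        if relevant in top_k:
            hits += 1
            reciprocal_ranks += 1.0 / (top_k.index(relevant) + 1)
    accuracy = hits / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one hit among k retrieved
        "recall": accuracy,         # one relevant doc, so recall == accuracy
        "mrr": reciprocal_ranks / n,
    }

# Toy rankings for four queries whose relevant document is always "d1".
rankings = [
    ["d1", "d2", "d3"],  # hit at rank 1
    ["d2", "d1", "d3"],  # hit at rank 2
    ["d3", "d2", "d1"],  # hit at rank 3
    ["d2", "d3", "d4"],  # miss
]
relevant = ["d1", "d1", "d1", "d1"]

metrics = retrieval_metrics(rankings, relevant, k=3)
print(metrics)
```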

Training Details

Training Dataset

medical-embeddings-training-data

Dataset: medical-embeddings-training-data
Size: 138,875 training samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:

	anchor	positive
type	string	string
details	min: 12 tokens mean: 27.02 tokens max: 69 tokens	min: 3 tokens mean: 56.27 tokens max: 379 tokens

Samples:

anchor	positive
`Represent this sentence for searching relevant passages: [CONTEXT: This chunk provides guidance on the importance of snacking for older adults to support their nutrition, health, independence, and quality of life, emphasizing professional and community support.] Snacking for Seniors - Part 10`	`Snacking is an important aspect of nutrition and diet for older adults. Understanding snacking helps seniors maintain health, independence, and quality of life. Professional guidance and community resources support snacking for seniors. Taking action on snacking makes a positive difference.`
`Represent this sentence for searching relevant passages: [CONTEXT: This chunk provides guidance on Medicare financial enrollment aimed at seniors and caregivers to enhance understanding and improve health outcomes through informed communication with healthcare providers.] Enrollment in Medicare Financial`	`Information about enrollment for medicare financial. This knowledge helps seniors and caregivers understand important aspects of enrollment. Regular communication with healthcare providers and staying informed about enrollment can improve quality of life and health outcomes.`
`Represent this sentence for searching relevant passages: What are the warnings for ca44f63d-749f-4062-8fc7-ac3158825310?`	`Warnings Stop use and ask a doctor , if symptoms persist or worsen. If pregnant or breast-feeding, take only on advice of a healthcare professional.`

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "CachedGISTEmbedLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
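
Training with the Matryoshka dimensions listed above is meant to keep the leading 768, 512, 256, 128, or 64 values of an embedding useful on their own. A minimal sketch of truncating and re-normalizing an embedding is shown below. `truncate_embedding` is a hypothetical helper operating on a toy vector. Sentence Transformers also accepts a `truncate_dim` argument when constructing `SentenceTransformer`, which may be the more convenient route in practice.

```python
import math

# Dimensions taken from the matryoshka_dims parameter above.
MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]

def truncate_embedding(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    if dim > len(embedding):
        raise ValueError("dim is larger than the embedding itself")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

# Toy unit-length vector standing in for a real 768-dimensional embedding.
full = [1.0 / math.sqrt(768)] * 768
small = truncate_embedding(full, 64)

print(len(small), sum(x * x for x in small))
```

Re-normalizing after truncation keeps the dot product equal to the cosine similarity at the smaller size.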

Evaluation Dataset

medical-embeddings-training-data

Dataset: medical-embeddings-training-data
Size: 15,431 evaluation samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:

	anchor	positive
type	string	string
details	min: 12 tokens mean: 27.07 tokens max: 79 tokens	min: 3 tokens mean: 53.67 tokens max: 441 tokens

Samples:

anchor	positive
`Represent this sentence for searching relevant passages: What is N-21 used for?`	`INDICATIONS Support for the eyes, inflammation, redness and dryness of the eyes.`
`Represent this sentence for searching relevant passages: What are the warnings for Venofye Orchard Bee Brilliance SPF 30?`	`Warnings For external use only. Do not use on damaged or broken skin. Keep out of eyes. If contact occurs, rinse with water. If rash or irritation develops, discontinue use and consult your physician.`
`Represent this sentence for searching relevant passages: What is Nymalize used for?`	`1 INDICATIONS AND USAGE NYMALIZE is indicated for the improvement of neurological outcome by reducing the incidence and severity of ischemic deficits in adult patients with subarachnoid hemorrhage (SAH) from ruptured intracranial berry aneurysms regardless of their post-ictus neurological condition (i.e., Hunt and Hess Grades I-V). NYMALIZE is a dihydropyridine calcium channel blocker indicated for the improvement of neurological outcome by reducing the incidence and severity of ischemic deficits in adult patients with subarachnoid hemorrhage (SAH) from ruptured intracranial berry aneurysms regardless of their post-ictus neurological condition (i.e., Hunt and Hess Grades I-V). ( 1 )`

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "CachedGISTEmbedLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 32
num_train_epochs: 1
learning_rate: 2e-05
warmup_steps: 0.1
weight_decay: 0.01
fp16: True
eval_strategy: steps
per_device_eval_batch_size: 32
load_best_model_at_end: True
warmup_ratio: 0.1
batch_sampler: no_duplicates
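
For a sense of scale, 138,875 training samples at a batch size of 32 give 4,340 optimizer steps for the single epoch, which matches the step and epoch columns in the training logs. The sketch below computes a linear warmup over 10% of those steps followed by linear decay, assuming a single device and no gradient accumulation. `linear_schedule_lr` is a hypothetical helper, not the scheduler code used by the trainer.

```python
import math

def linear_schedule_lr(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

train_samples = 138_875
batch_size = 32
base_lr = 2e-05

total_steps = math.ceil(train_samples / batch_size)  # one epoch, single device
warmup_steps = total_steps // 10                     # 10% warmup

print(total_steps, warmup_steps)
for step in (0, 217, 434, 2387, 4340):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, base_lr))
```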

All Hyperparameters

per_device_train_batch_size: 32
num_train_epochs: 1
max_steps: -1
learning_rate: 2e-05
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_steps: 0.1
optim: adamw_torch
optim_args: None
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
optim_target_modules: None
gradient_accumulation_steps: 1
average_tokens_across_devices: True
max_grad_norm: 1.0
label_smoothing_factor: 0.0
bf16: False
fp16: True
bf16_full_eval: False
fp16_full_eval: False
tf32: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
use_liger_kernel: False
liger_kernel_config: None
use_cache: False
neftune_noise_alpha: None
torch_empty_cache_steps: None
auto_find_batch_size: False
log_on_each_node: True
logging_nan_inf_filter: True
include_num_input_tokens_seen: no
log_level: passive
log_level_replica: warning
disable_tqdm: False
project: huggingface
trackio_space_id: trackio
eval_strategy: steps
per_device_eval_batch_size: 32
prediction_loss_only: True
eval_on_start: False
eval_do_concat_batches: True
eval_use_gather_object: False
eval_accumulation_steps: None
include_for_metrics: []
batch_eval_metrics: False
save_only_model: False
save_on_each_node: False
enable_jit_checkpoint: False
push_to_hub: False
hub_private_repo: None
hub_model_id: None
hub_strategy: every_save
hub_always_push: False
hub_revision: None
load_best_model_at_end: True
ignore_data_skip: False
restore_callback_states_from_checkpoint: False
full_determinism: False
seed: 42
data_seed: None
use_cpu: False
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_pin_memory: True
dataloader_persistent_workers: False
dataloader_prefetch_factor: None
remove_unused_columns: True
label_names: None
train_sampling_strategy: random
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
ddp_backend: None
ddp_timeout: 1800
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
deepspeed: None
debug: []
skip_memory_metrics: True
do_predict: False
resume_from_checkpoint: None
warmup_ratio: 0.1
local_rank: -1
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}

Training Logs

Epoch	Step	Training Loss	Validation Loss	medical-eval_cosine_ndcg@10
-1	-1	-	-	0.3526
0.0115	50	21.4644	-	-
0.0230	100	12.3693	-	-
0.0346	150	8.3857	-	-
0.0461	200	6.6197	-	-
0.0576	250	5.0731	-	-
0.0691	300	4.7315	-	-
0.0806	350	4.0549	-	-
0.0922	400	3.6729	-	-
0.1037	450	3.2577	-	-
0.1152	500	3.1758	2.0604	0.4118
0.1267	550	2.8301	-	-
0.1382	600	2.8569	-	-
0.1498	650	2.6782	-	-
0.1613	700	2.6447	-	-
0.1728	750	2.4327	-	-
0.1843	800	2.3657	-	-
0.1959	850	2.6314	-	-
0.2074	900	2.1208	-	-
0.2189	950	2.2966	-	-
0.2304	1000	2.2544	1.5063	0.4352
0.2419	1050	2.3016	-	-
0.2535	1100	2.0355	-	-
0.2650	1150	2.1095	-	-
0.2765	1200	2.1743	-	-
0.2880	1250	2.0971	-	-
0.2995	1300	1.9309	-	-
0.3111	1350	1.8261	-	-
0.3226	1400	1.8530	-	-
0.3341	1450	1.8746	-	-
0.3456	1500	2.0222	1.3728	0.4472
0.3571	1550	1.8223	-	-
0.3687	1600	1.9075	-	-
0.3802	1650	1.8989	-	-
0.3917	1700	1.9164	-	-
0.4032	1750	1.8066	-	-
0.4147	1800	1.6366	-	-
0.4263	1850	1.8221	-	-
0.4378	1900	1.8602	-	-
0.4493	1950	1.7117	-	-
0.4608	2000	1.5802	1.2166	0.4514
0.4724	2050	1.6800	-	-
0.4839	2100	1.7472	-	-
0.4954	2150	1.6999	-	-
0.5069	2200	1.7095	-	-
0.5184	2250	1.5913	-	-
0.5300	2300	1.7659	-	-
0.5415	2350	1.5958	-	-
0.5530	2400	1.7422	-	-
0.5645	2450	1.7618	-	-
0.5760	2500	1.7058	1.1509	0.4576
0.5876	2550	1.7298	-	-
0.5991	2600	1.6349	-	-
0.6106	2650	1.4472	-	-
0.6221	2700	1.5778	-	-
0.6336	2750	1.5168	-	-
0.6452	2800	1.5806	-	-
0.6567	2850	1.4689	-	-
0.6682	2900	1.4052	-	-
0.6797	2950	1.4711	-	-
0.6912	3000	1.4793	1.0594	0.4616
0.7028	3050	1.6667	-	-
0.7143	3100	1.6008	-	-
0.7258	3150	1.4364	-	-
0.7373	3200	1.5818	-	-
0.7488	3250	1.4065	-	-
0.7604	3300	1.4548	-	-
0.7719	3350	1.2939	-	-
0.7834	3400	1.6109	-	-
0.7949	3450	1.7005	-	-
0.8065	3500	1.4387	1.0056	0.4633
0.8180	3550	1.5827	-	-
0.8295	3600	1.5064	-	-
0.8410	3650	1.5197	-	-
0.8525	3700	1.5047	-	-
0.8641	3750	1.5071	-	-
0.8756	3800	1.4802	-	-
0.8871	3850	1.3452	-	-
0.8986	3900	1.5015	-	-
0.9101	3950	1.4390	-	-
0.9217	4000	1.2257	0.9738	0.4651
0.9332	4050	1.3540	-	-
0.9447	4100	1.3713	-	-
0.9562	4150	1.4461	-	-
0.9677	4200	1.5809	-	-
0.9793	4250	1.2528	-	-
0.9908	4300	1.3154	-	-

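The saved checkpoint corresponds to the step 4000 row (shown in bold in the original table), whose medical-eval_cosine_ndcg@10 of 0.4651 matches the reported evaluation result.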

Framework Versions

Python: 3.11.10
Sentence Transformers: 5.2.3
Transformers: 5.2.0
PyTorch: 2.4.1+cu124
Accelerate: 1.12.0
Datasets: 2.21.0
Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Safetensors

Model size: 0.1B params
Tensor type: F32

Model tree for epaulson2/medical-gte-base-eldercare

Base model: BAAI/bge-base-en-v1.5
Finetuned: this model
