How to use TeraflopAI/teraflopai-denseon-caselaw with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("TeraflopAI/teraflopai-denseon-caselaw")

sentences = [
"Under what specific evidentiary standard is evidence of a third party's motive or opportunity admissible to create reasonable doubt about a defendant's guilt?",
"In Walker v. State,\n353 Ark. 12, 17\n,\n110 S.W.3d 752, 755\n (2003) we explained our\n\nholding in Zinger. \n\n We have held that a defendant may introduce evidence tending to show that\n someone other than the defendant committed the crime charged, but such evidence\n is inadmissible unless it points directly to the guilt of the third party. Evidence\n which does no more than create an inference or conjecture as to another's guilt is\n inadmissible. [Burmingham v. State,\n342 Ark. 95\n,\n27 S.W.3d 351\n (2000)]; Zinger v.\n State,\n313 Ark. 70\n,\n852 S.W.2d 320\n (1993) (citing State v. Wilson,\n322 N.C. 117\n,\n367 S.E.2d 589\n (1988)). This rule does not require that any evidence, however remote,\n must be admitted to show a third party's possible culpability; evidence of mere\n\n\n 24\n\f motive or opportunity to commit the crime in another person, without more, will\n not suffice to raise a reasonable doubt about a defendant's guilt. There must be\n direct or circumstantial evidence linking the third person to the actual perpetration\n of the crime.",
"This court commented further on the doctrine of informed consent in Williams v. Menehan, 191 Kan. 6, 379 P. 2d 292. There it was held the parents of a small child who died while a team of physicians was performing a cardiac catheterization had given an informed consent to the procedure. In commenting on the rule as laid down iii Natanson, supra, it was said:\n\n\". . . [I]t is the duty of a doctor to make a reasonable disclosure to his patient of the nature and probable consequences of the suggested or recommended treatment, and to make a reasonable disclosure of the dangers within his knowledge which are incident or possible in the treatment he proposes to administer. But this does not mean that a doctor is under an obligation to describe in detail all of the possible consequences of treatment. To make a *532 complete disclosure of all facts, diagnoses and alternatives or possibilities which might occur to the doctor could so alarm the patient that it would, in fact, constitute bad medical practice.\" (p. 8.)",
"Commonwealth v. Melvin,\n103 A.3d 1, 40\n (Pa. Super. 2014). \n\n The trial court accurately summarized the facts presented at trial,\n\nviewed in the light most favorable to the Commonwealth as verdict winner:\n\n On July 18, 2013, Susan Riffle was a passenger\n on a motorcycle being operated by [Mangone]. The\n motorcycle hit some loose gravel and went down and\n Riffle, who sustained numerous injuries as a result of\n the accident, was \"Life Flighted.\"[] When Riffle saw\n [Mangone] leave the scene of the accident, her belief\n was that he was going for help because she told him\n that help was needed. \n\n Andrew Franko, a first responder, responded to\n the scene and observed a female, who was not in\n good condition, lying on the roadway. Observing\n [Mangone] going towards his motorcycle, Franko\n said to him that \"she is hurt. You can't go nowhere.\"\n Franko also advised [Mangone] that he was a first\n responder and could provide help. Nonetheless,\n [Mangone] picked up his motorcycle and left."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

SentenceTransformer

This is a sentence-transformers model trained on the parquet dataset of question and answer pairs described under Training Details. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

Model Type: Sentence Transformer
Maximum Sequence Length: 512 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Supported Modality: Text
Training Dataset:
- parquet
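The similarity function listed above can be illustrated without loading the model. The following is a minimal sketch in plain Python; the 4-dimensional vectors are made-up stand-ins for the model's 768-dimensional output, and `cosine_similarity` is a hypothetical helper, not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 4-dimensional vectors standing in for 768-dimensional embeddings.
query = [1.0, 0.0, 1.0, 0.0]
doc_close = [2.0, 0.0, 2.0, 0.0]   # same direction as the query, different length
doc_far = [0.0, 3.0, 0.0, -1.0]    # orthogonal to the query

print(cosine_similarity(query, doc_close))
print(cosine_similarity(query, doc_far))
```

Because cosine similarity ignores vector length, only the direction of an embedding affects the score.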

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'ModernBertModel'})
  (1): Pooling({'embedding_dimension': 768, 'pooling_mode': 'cls', 'include_prompt': True})
)
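The Pooling module above uses `'pooling_mode': 'cls'`, meaning the sequence embedding is taken from the first token position rather than averaged over all tokens. A minimal sketch of the difference, assuming a made-up 3-token sequence with 2-dimensional token embeddings (`cls_pool` and `mean_pool` are hypothetical helpers):

```python
def cls_pool(token_embeddings):
    """Return the first token's vector as the sequence embedding."""
    return list(token_embeddings[0])

def mean_pool(token_embeddings):
    """Average all token vectors (shown only for contrast)."""
    n = len(token_embeddings)
    return [sum(col) / n for col in zip(*token_embeddings)]

# Hypothetical 3-token sequence with 2-dimensional token embeddings.
tokens = [
    [0.5, -1.0],  # position 0: the [CLS] token
    [2.0, 4.0],
    [0.5, 0.0],
]

print(cls_pool(tokens))   # [0.5, -1.0]
print(mean_pool(tokens))  # [1.0, 1.0]
```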

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TeraflopAI/teraflopai-denseon-caselaw")
# Run inference
queries = [
    "Under what circumstances can a plaintiff recover damages for lost profits when a defendant's breach of contract involves the failure to provide telephone service?",
]
documents = [
    'UNITED STATES COURT OF APPEALS\n                            UNITED STATES COURT OF APPEALS\n                    FOR THE FIRST CIRCUIT\n                                FOR THE FIRST CIRCUIT\n                                         \nNo. 94-1711\n\n                  SAS OF PUERTO RICO, INC.,\n\n                    Plaintiff, Appellant,\n\n                              v.\n\n                PUERTO RICO TELEPHONE COMPANY,\n\n                     Defendant, Appellee. \n\n                                         \n\n         APPEAL FROM THE UNITED STATES DISTRICT COURT\n\n               FOR THE DISTRICT OF PUERTO RICO\n\n        [Hon. Jose Antonio Fuste, U.S. District Judge]\n                                                                 \n\n                                         \n\n                            Before\n\n                    Torruella, Chief Judge,\n                                                      \n\n                    Boudin, Circuit Judge,\n                                                     \n\n              and Boyle,* Senior District Judge.\n                                                           \n\n                                         \n\nLaurence  Z.  Shiekman  with  whom  M.  Duncan  Grant,  Frank   M.\n                                                                              \nRapoport,  Michael A. Ceramella and Pepper, Hamilton & Scheetz were on\n                                                                      \nbrief for appellant.\nPhilip J. Mause with whom Joaquin A. Marquez and Drinker Biddle  &\n                                                                              \nReath were on brief for appellee.\n             \n\n                                         \n\n                      February 21, 1995\n                                         \n\n                \n\n*Of the District of Rhode Island, sitting by designation.',
    'Nor did the BIA abuse its discretion by denying Chen\'s motion to reopen, which alleged that he suffered from the ineffective assistance of counsel. To prevail on such a claim, the alien must first comply with certain procedures set forth in Matter of Lozada, 19 I. & N. Dec. 637 (BIA 1988). Here, the BIA properly noted that besides filing a supporting affidavit, Chen made no effort to comply with the requirements enumerated in Lozada. Chen not only failed to notify his former counsel of the allegations of ineffective assistance and to allow him an opportunity to respond, he also failed to file a complaint with a disciplinary authority or provide an explanation for not doing so. See Twum v. INS, 411 F.3d 54, 59 (2d Cir.2005) (citing Lozada, 19 I. & N. Dec. at 639). \n\nBy failing to substantially comply with Lozada, Chen "forfeit[ed][his] ineffective assistance of counsel claim." Jian Yun Zheng v. U.S. Dep\'t of Justice, 409 F.3d 43, 47 (2d Cir.2005). While it is true that "slavish adherence" to Lozada\'s requirements is not necessary in certain circumstances, and while the BIA acknowledged in its June 2006 decision that the brief written by Chen\'s former counsel was deficient, this is not a case in which the facts supporting a "claim of ineffective assistance are clear on the face of the record," which may excuse the failure to comply with Lozada. Yi Long Yang, 478 F.3d at 142-43. The facts here are distinct from the circumstances presented in Yi Long Yang, in that Chen\'s former counsel was not disbarred, nor was there evidence that the agency explicitly assumed his competence. See id. at 142.',
    'We believe the same prerequisite should operate in this\ncase. The requirement that parties seeking Rule 60(b) relief\nshow some prospect of succeeding on the merits flows from\nthe basic principle that courts should revive previously-\ndismissed claims only if they have some reason to believe that\ndoing so will not ultimately waste judicial resources. See\nMurray,\n52 F.3d at 355\n. This principle holds true here:\nreviving Thomas\'s appeal will constitute an "empty exercise\nor futile gesture,"\nid.,\n unless Thomas has some possibility of\nprevailing. \n\n       Indeed, we see two especially good reasons to condition\nthe grant of Thomas\'s motion for reconsideration on his\ndemonstrating a chance of succeeding on the merits. First,\nThomas claims that his appeal should be reinstated because\nthe PLRA\'s three-strikes provision is unconstitutional as\napplied to him. For this court to reach out and decide this\ndifficult and important question simply to reinstate a pointless\nappeal would violate the norm of constitutional avoidance to\nwhich we generally adhere. See Kalka v. Hawk,\n215 F.3d 90, 97\n (D.C. Cir. 2000) ("Federal courts should not decide\nconstitutional questions unless it is necessary to do so.").\nSecond, the PLRA provides that a court "shall dismiss" an\nIFP litigant\'s case if the "appeal . . . is frivolous or malicious\n. . . [or] fails to state a claim on which relief may be granted."\n28 U.S.C. § 1915\n(e)(2). Thus, even were we to grant Thomas\nIFP status and reinstate his appeal, we would then have to\n\x0c                                7\npromptly dismiss the case if his claims lack merit. What could\nbe a more "futile gesture" than reinstating an appeal only to\nthen immediately dismiss it?',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.6819, -0.1021, -0.0287]])
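For retrieval, the similarity scores are typically turned into a ranking. The sketch below does this in plain Python using the three scores printed above, so it runs without downloading the model; with a real tensor one would first convert it to a list.

```python
# Scores copied from the printed tensor above: one query against three documents.
scores = [0.6819, -0.1021, -0.0287]

# Sort document indices from most to least similar.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

for rank, idx in enumerate(ranking, start=1):
    print(f"rank {rank}: document {idx} (score {scores[idx]:+.4f})")
```

Document 0, the telephone-service case, ranks first by a wide margin, which matches the query.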

Evaluation

Metrics

Information Retrieval

Dataset: DenseOn_lr6e-05_warmup0.1_bs8k-caselaw
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.8934
cosine_accuracy@3	0.957
cosine_accuracy@5	0.9722
cosine_accuracy@10	0.9854
cosine_precision@1	0.8934
cosine_precision@3	0.319
cosine_precision@5	0.1944
cosine_precision@10	0.0985
cosine_recall@1	0.8934
cosine_recall@3	0.957
cosine_recall@5	0.9722
cosine_recall@10	0.9854
cosine_ndcg@10	0.9419
cosine_mrr@10	0.9277
cosine_map@100	0.9284
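In the table, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k (for example 0.957 / 3 ≈ 0.319). That pattern is what one would expect if every query has exactly one relevant document; this is an inference from the numbers, not something the card states. The sketch below computes the metrics under that assumption on made-up rankings; the function names and document ids are hypothetical.

```python
def metrics_at_k(ranked_ids, relevant_id, k):
    """Accuracy, precision, and recall at k for one query with one relevant document."""
    hit = 1.0 if relevant_id in ranked_ids[:k] else 0.0
    return {"accuracy": hit, "precision": hit / k, "recall": hit}

def reciprocal_rank(ranked_ids, relevant_id, k=10):
    """1 / rank of the relevant document, or 0 if it is not in the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0

# Hypothetical rankings for three queries: (ranked document ids, relevant id).
queries = [
    (["d1", "d7", "d3"], "d1"),  # relevant document at rank 1
    (["d4", "d2", "d9"], "d2"),  # relevant document at rank 2
    (["d5", "d6", "d8"], "d0"),  # relevant document not retrieved
]

k = 3
per_query = [metrics_at_k(r, rel, k) for r, rel in queries]
accuracy = sum(m["accuracy"] for m in per_query) / len(queries)
precision = sum(m["precision"] for m in per_query) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)

print(accuracy, precision, mrr)
```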

Training Details

Training Dataset

parquet

Dataset: parquet
Size: 36,118,859 training samples
Columns: question and answer
Approximate statistics based on the first 1000 samples:
	question	answer
type	string	string
details	min: 16 tokens mean: 29.64 tokens max: 52 tokens	min: 41 tokens mean: 306.99 tokens max: 512 tokens

Samples:

question	answer
`What is the legal standard and procedure for granting a defendant's request for leave to withdraw as counsel when the attorney certifies that no nonfrivolous issues exist for appeal?`	Appeal by the defendant from two judgments of the Supreme Court, Queens County (Rosengarten, J.), both rendered May 5, 2003, convicting him of burglary in the first degree, robbery in the first degree, and burglary in the second degree under indictment No. 3417/01, and burglary in the first degree, robbery in the first degree, and burglary in the second degree under indictment No. 1182/02, upon his pleas of guilty, and imposing sentences. Ordered that the judgments are affirmed. We have reviewed the record and agree with the defendant's assigned counsel that there are no nonfrivolous issues which could be raised on appeal. Counsel's application for leave to withdraw as counsel is granted (see Anders v California, 386 US 738 [1967]; People v Paige, 54 AD2d 631 [1976]; cf. People v *606Gonzalez, 47 NY2d 606 [1979]). Adams, J.P., Cozier, Ritter and Skelos, JJ., concur.
`Are state-law tort claims alleging defective labeling of generic drugs preempted by federal law?`	ORDER JOSEPH N. LAPLANTE, District Judge. This case presents a question currently pending before three different federal courts of appeal: whether state-law tort claims alleging the defective labeling of generic drugs are preempted by federal law. See Morris v. Wyeth, Inc., No. 09-5509 (6th Cir. Apr. 27, 2009); Demahy v. Wyeth, Inc., No. 08-31204 (5th Cir. Dec. 16, 2008); Mensing v. Wyeth, Inc., No. 08-3850 (8th Cir. Dec. 10, 2008). The defendants, Mutual Pharmaceutical Company, Inc. and United Research Laboratories, Inc., move for judgment on the pleadings, see Fed.R.Civ.P. 12(c), on claims by the plaintiffs, Karen L. and Gregory S. Bartlett, alleging that Karen suffered serious injuries from Sulindac, a generic drug manufactured by the defendants. The defendants argue that all of the plaintiffs' state-law causes of action are pre-empted by Title I of the Drug Price Competition and Patent Term Restoration Act of 1984, 1 part of the Hatch-Waxman Amendments to the Federal Food, Drug,...
`Under what circumstances can a trial court's finding of competency to stand trial be challenged when multiple expert evaluations conclude the defendant is competent but exhibit bizarre conduct?`	Prior to trial, Card was examined by two court-appointed psychologists for the purpose of determining whether he was competent to stand trial. Following examinations, both psychologists concluded that Card was competent to stand trial pursuant to the criteria set forth in Rule 3.211, Florida Rule of Criminal Procedure. After the initial reports of the two court-appointed *1175 experts were filed, the defense filed a motion for the appointment of a forensic psychiatrist to examine Card. The court acquiesced to this request. Although the forensic psychiatrist did not file his report with the court until a few months after the court issued its order finding Card competent to stand trial, the forensic psychiatrist also concluded that Card was competent. Further, although the various reports filed by the experts indicate bizarre conduct and behavioral problems, the trial court was never presented with evidence providing reasonable grounds to believe that Card was not competent to stand tria...

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "CachedMultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
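The `matryoshka_dims` above suggest that the leading 512, 256, or 128 dimensions of an embedding are trained to remain useful on their own. A minimal sketch of how a shortened embedding could be derived, assuming the usual practice of truncating and then re-normalizing; the 8-dimensional vector and the helper `truncate_and_normalize` are hypothetical.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 8-dimensional vector standing in for a 768-dimensional embedding.
full = [0.6, -0.8, 0.3, 0.1, -0.2, 0.05, 0.4, -0.1]

for dim in (8, 4, 2):  # analogous to 768 -> 512 -> 256 -> 128 in the card
    short = truncate_and_normalize(full, dim)
    print(dim, [round(x, 3) for x in short])
```

In Sentence Transformers the same effect is usually obtained by passing `truncate_dim` when constructing `SentenceTransformer`; the retrieval quality at each smaller size is not reported in this card.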

Evaluation Dataset

parquet

Dataset: parquet
Size: 10,000 evaluation samples
Columns: question and answer
Approximate statistics based on the first 1000 samples:
	question	answer
type	string	string
details	min: 15 tokens mean: 29.87 tokens max: 55 tokens	min: 74 tokens mean: 311.66 tokens max: 512 tokens

Samples:

question	answer
`What specific factors do Texas courts consider when determining if terminating a parent's rights serves the child's best interest?`	In determining whether termination is in the child's best interest, we apply the following factors laid out in Holley v. Adams, 544 S.W.2d 367, 371–72 (Tex. 1976). Those factors include, but are not limited to: 1. The child's desires; 2. The child's physical and emotional needs, now and in the future; 3. The emotional and physical danger to the child, now and in the future; 4. The parental ability of the individuals seeking custody; 5. The programs available to assist these individuals in promoting the child's best interest; 6. The plans for the child by the individual or agency seeking custody; 7. The stability of the home or proposed placement; 8. The parent's act or omissions that may indicate the existing parent-child relationship is not the proper one; and 9. Any excuse for the parent's acts or omissions.
`Under what circumstances do separate criminal acts fail to constitute a single continuous transaction for the purpose of admitting evidence of one act to prove another?`	¶37 The case before us is far more analogous to Hildreth than to the others. Gallegos stabbed Victim in a park and was later apprehended. Then, while at the police station, Gallegos acted violently, resulting in additional charges. Gallegos's violent behavior at the police station did not "facilitate[ ] flight" from the earlier attack, nor could the later crimes be characterized as "a single [violent] spree," as we would characterize a string of robberies, for example. See Benson , 2014 UT App 92 , ¶¶ 13-14, 325 P.3d 855 . Neither do Gallegos's crimes demonstrate "a distinct behavioral arc of increasingly aggressive and opportunistic transgressions." Burke , 2011 UT App 168 , ¶ 24, 256 P.3d 1102 . Instead, this case is more like Hildreth , where the defendant committed a sequence of offenses, but those offenses were not otherwise related to each other. See 2010 UT App 209 , ¶ 32, 238 P.3d 444 . Here, the stabbing at the park and the violent behavior at the police station are so indepen...
`What level of culpability, such as actual knowledge or reckless disregard, must a plaintiff prove to establish an Eighth Amendment violation for deliberate indifference?`	Wilson v. Seiter, ___ U.S. at -, ___, 111 S.Ct. at 2324-25, 2327. The Seventh Circuit recently observed that "[i]n order to show `deliberate indifference,' a plaintiff is required to prove that the prison official's action was deliberate or reckless in the criminal sense." Santiago v. Lane, 894 F.2d 218 (7th Cir.1990) (emphasis added) (footnote omitted). The United States Supreme Court has cited the Seventh Circuit's criminal recklessness standard with approval. Whitley v. Albers, 475 U.S. 312, 321, 106 S.Ct. 1078, 1085, 89 L.Ed.2d 251 (1986), citing Duckworth v. Franzen, 780 F.2d 645, 653 (7th Cir.1985), cert. denied, 479 U.S. 816, 107 S.Ct. 71, 93 L.Ed.2d 28 (1986). In Franzen, the Seventh Circuit noted that punishment under the Eighth Amendment "implies at a minimum actual knowledge of impending harm easily preventable, so that a conscious, culpable refusal to prevent the harm can be inferred from the defendant's failure to prevent it." 780 F.2d at 653. See also Wilks v. You...

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "CachedMultipleNegativesRankingLoss",
    "matryoshka_dims": [
        768,
        512,
        256,
        128
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
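The inner loss named above belongs to the in-batch negatives family: for each question, its own answer is the positive and every other answer in the batch acts as a negative. The sketch below shows that objective on a made-up similarity matrix in plain Python. It is not the library's implementation; the `scale` value is an assumption for illustration, and the "Cached" part (gradient caching to fit large batches in memory) is omitted.

```python
import math

def in_batch_negatives_loss(sim_matrix, scale=20.0):
    """Mean cross-entropy where row i's correct column is i.

    sim_matrix[i][j] is the similarity of question i to answer j.
    `scale` is an assumed temperature-like multiplier for illustration.
    """
    total = 0.0
    for i, row in enumerate(sim_matrix):
        logits = [scale * s for s in row]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(sim_matrix)

# Hypothetical cosine similarities for a batch of three (question, answer) pairs.
good = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.0, 0.1, 0.7],
]
# Same batch, but question 0 is closer to answer 1 than to its own answer.
bad = [
    [0.3, 0.6, 0.0],
    [0.2, 0.8, 0.1],
    [0.0, 0.1, 0.7],
]

print(in_batch_negatives_loss(good))
print(in_batch_negatives_loss(bad))
```

The loss is lower when each question is most similar to its own answer, which is why larger batches (more negatives per question) make the task harder and more informative.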

Training Hyperparameters

Non-Default Hyperparameters

per_device_train_batch_size: 2048
num_train_epochs: 1
learning_rate: 6e-05
lr_scheduler_type: cosine
warmup_steps: 0.1
bf16: True
per_device_eval_batch_size: 512
prompts: {'question': 'query: ', 'answer': 'document: '}
batch_sampler: no_duplicates
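The `prompts` setting above means that, during training, text from the `question` column was prefixed with `query: ` and text from the `answer` column with `document: `. A minimal sketch of that prefixing follows; `apply_prompt` is a hypothetical helper and the example texts are made up. Whether `encode_query` and `encode_document` add these prefixes automatically depends on the prompts stored in the model's configuration, which this card does not show.

```python
# Prefixes copied from the `prompts` hyperparameter; the keys are the dataset columns.
PROMPTS = {"question": "query: ", "answer": "document: "}

def apply_prompt(text, column):
    """Prepend the prefix associated with a dataset column."""
    return PROMPTS[column] + text

# Made-up example texts, for illustration only.
print(apply_prompt("What is the standard for summary judgment?", "question"))
print(apply_prompt("Summary judgment is appropriate when ...", "answer"))
```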

All Hyperparameters

per_device_train_batch_size: 2048
num_train_epochs: 1
max_steps: -1
learning_rate: 6e-05
lr_scheduler_type: cosine
lr_scheduler_kwargs: None
warmup_steps: 0.1
optim: adamw_torch_fused
optim_args: None
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
optim_target_modules: None
gradient_accumulation_steps: 1
average_tokens_across_devices: True
max_grad_norm: 1.0
label_smoothing_factor: 0.0
bf16: True
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
use_liger_kernel: False
liger_kernel_config: None
use_cache: False
neftune_noise_alpha: None
torch_empty_cache_steps: None
auto_find_batch_size: False
log_on_each_node: True
logging_nan_inf_filter: True
include_num_input_tokens_seen: no
log_level: passive
log_level_replica: warning
disable_tqdm: False
project: huggingface
trackio_space_id: None
trackio_bucket_id: None
trackio_static_space_id: None
per_device_eval_batch_size: 512
prediction_loss_only: True
eval_on_start: False
eval_do_concat_batches: True
eval_use_gather_object: False
eval_accumulation_steps: None
include_for_metrics: []
batch_eval_metrics: False
save_only_model: False
save_on_each_node: False
enable_jit_checkpoint: False
push_to_hub: False
hub_private_repo: None
hub_model_id: None
hub_strategy: every_save
hub_always_push: False
hub_revision: None
load_best_model_at_end: False
ignore_data_skip: False
restore_callback_states_from_checkpoint: False
full_determinism: False
seed: 42
data_seed: None
use_cpu: False
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
dataloader_drop_last: True
dataloader_num_workers: 0
dataloader_pin_memory: True
dataloader_persistent_workers: False
dataloader_prefetch_factor: None
remove_unused_columns: True
label_names: None
train_sampling_strategy: random
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
ddp_static_graph: None
ddp_backend: None
ddp_timeout: 1800
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
deepspeed: None
debug: []
skip_memory_metrics: True
do_predict: False
resume_from_checkpoint: None
warmup_ratio: None
local_rank: -1
prompts: {'question': 'query: ', 'answer': 'document: '}
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
router_mapping: {}
learning_rate_mapping: {}
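The combination of `learning_rate: 6e-05`, `lr_scheduler_type: cosine`, and `warmup_steps: 0.1` can be pictured as a linear ramp followed by a half-cosine decay. The sketch below assumes that the fractional `warmup_steps` is treated as a share of total steps and that the cosine decays to zero over a single half period; the rounding of the warmup length is also an assumption, and `lr_at_step` is a hypothetical helper, not the trainer's code.

```python
import math

def lr_at_step(step, total_steps, base_lr=6e-05, warmup_fraction=0.1):
    """Linear warmup followed by a single half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_fraction)  # rounding is an assumption
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 4409  # final step in the training logs

for step in (0, 220, 440, 2425, 4409):
    print(step, f"{lr_at_step(step, TOTAL_STEPS):.2e}")
```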

Training Logs

Epoch	Step	Training Loss	Validation Loss	DenseOn_lr6e-05_warmup0.1_bs8k-caselaw_cosine_ndcg@10
0.7008	3090	0.3007	-	-
0.7031	3100	0.2991	-	-
0.7054	3110	0.3052	-	-
0.7076	3120	0.2941	-	-
0.7099	3130	0.2964	-	-
0.7122	3140	0.2958	-	-
0.7144	3150	0.2966	-	-
0.7167	3160	0.2959	-	-
0.7190	3170	0.2954	-	-
0.7213	3180	0.2982	-	-
0.7235	3190	0.2980	-	-
0.7258	3200	0.2955	-	-
0.7281	3210	0.2958	-	-
0.7303	3220	0.2988	-	-
0.7326	3230	0.2892	-	-
0.7349	3240	0.2973	-	-
0.7371	3250	0.2960	-	-
0.7394	3260	0.3030	-	-
0.7417	3270	0.2998	-	-
0.7439	3280	0.3021	-	-
0.7462	3290	0.3046	-	-
0.7485	3300	0.2922	-	-
0.7507	3310	0.2941	-	-
0.7530	3320	0.2953	-	-
0.7553	3330	0.2992	-	-
0.7575	3340	0.3015	-	-
0.7598	3350	0.2951	-	-
0.7621	3360	0.3021	-	-
0.7643	3370	0.3022	-	-
0.7666	3380	0.2923	-	-
0.7689	3390	0.2946	-	-
0.7711	3400	0.2986	-	-
0.7734	3410	0.2960	-	-
0.7757	3420	0.3006	-	-
0.7780	3430	0.3020	-	-
0.7802	3440	0.2894	-	-
0.7825	3450	0.2986	-	-
0.7848	3460	0.2912	-	-
0.7870	3470	0.2957	-	-
0.7893	3480	0.2954	-	-
0.7916	3490	0.2937	-	-
0.7938	3500	0.2989	-	-
0.7961	3510	0.2956	-	-
0.7984	3520	0.3020	-	-
0.8006	3530	0.2957	-	-
0.8029	3540	0.2873	-	-
0.8052	3550	0.2900	-	-
0.8074	3560	0.2885	-	-
0.8097	3570	0.2904	-	-
0.8120	3580	0.2857	-	-
0.8142	3590	0.2977	-	-
0.8165	3600	0.2891	-	-
0.8188	3610	0.2958	-	-
0.8210	3620	0.2985	-	-
0.8233	3630	0.2915	-	-
0.8256	3640	0.2910	-	-
0.8279	3650	0.2931	-	-
0.8301	3660	0.2983	-	-
0.8324	3670	0.2921	-	-
0.8347	3680	0.2804	-	-
0.8369	3690	0.3018	-	-
0.8392	3700	0.2920	-	-
0.8415	3710	0.2897	-	-
0.8437	3720	0.2896	-	-
0.8460	3730	0.2884	-	-
0.8483	3740	0.2919	-	-
0.8505	3750	0.2896	-	-
0.8528	3760	0.2971	-	-
0.8551	3770	0.2948	-	-
0.8573	3780	0.2869	-	-
0.8596	3790	0.2976	-	-
0.8619	3800	0.2924	-	-
0.8641	3810	0.2907	-	-
0.8664	3820	0.2973	-	-
0.8687	3830	0.2985	-	-
0.8709	3840	0.2909	-	-
0.8732	3850	0.2951	-	-
0.8755	3860	0.2851	-	-
0.8778	3870	0.2867	-	-
0.8800	3880	0.2950	-	-
0.8823	3890	0.2919	-	-
0.8846	3900	0.2978	-	-
0.8868	3910	0.2902	-	-
0.8891	3920	0.2953	-	-
0.8914	3930	0.2938	-	-
0.8936	3940	0.2922	-	-
0.8959	3950	0.2884	-	-
0.8982	3960	0.2881	-	-
0.9002	3969	-	0.0828	0.9418
0.9004	3970	0.2967	-	-
0.9027	3980	0.2941	-	-
0.9050	3990	0.2829	-	-
0.9072	4000	0.2907	-	-
0.9095	4010	0.2932	-	-
0.9118	4020	0.2961	-	-
0.9140	4030	0.2925	-	-
0.9163	4040	0.2916	-	-
0.9186	4050	0.2893	-	-
0.9208	4060	0.2908	-	-
0.9231	4070	0.2919	-	-
0.9254	4080	0.2923	-	-
0.9276	4090	0.2827	-	-
0.9299	4100	0.2862	-	-
0.9322	4110	0.2925	-	-
0.9345	4120	0.2913	-	-
0.9367	4130	0.2866	-	-
0.9390	4140	0.2914	-	-
0.9413	4150	0.2825	-	-
0.9435	4160	0.2991	-	-
0.9458	4170	0.2881	-	-
0.9481	4180	0.2853	-	-
0.9503	4190	0.2872	-	-
0.9526	4200	0.2900	-	-
0.9549	4210	0.2937	-	-
0.9571	4220	0.2852	-	-
0.9594	4230	0.2889	-	-
0.9617	4240	0.2873	-	-
0.9639	4250	0.2918	-	-
0.9662	4260	0.2880	-	-
0.9685	4270	0.2881	-	-
0.9707	4280	0.2915	-	-
0.9730	4290	0.2873	-	-
0.9753	4300	0.2897	-	-
0.9775	4310	0.2828	-	-
0.9798	4320	0.2877	-	-
0.9821	4330	0.2869	-	-
0.9844	4340	0.2883	-	-
0.9866	4350	0.2953	-	-
0.9889	4360	0.2911	-	-
0.9912	4370	0.2861	-	-
0.9934	4380	0.2954	-	-
0.9957	4390	0.2939	-	-
0.9980	4400	0.2890	-	-
1.0	4409	-	0.0825	0.9419
-1	-1	-	-	0.9419
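The Epoch and Step columns can be cross-checked against the dataset size. The arithmetic below suggests a global batch of 8192 samples per step, which would match "bs8k" in the run name and would correspond to four devices at `per_device_train_batch_size: 2048`; the device count is an inference and is not stated in the card.

```python
TRAIN_SAMPLES = 36_118_859   # "Size" of the training dataset
PER_DEVICE_BATCH = 2048      # per_device_train_batch_size
FINAL_STEP = 4409            # step logged at epoch 1.0

# Samples consumed per optimizer step if one epoch took FINAL_STEP steps.
implied_global_batch = TRAIN_SAMPLES // FINAL_STEP
print(implied_global_batch, implied_global_batch // PER_DEVICE_BATCH)

# Assumption: global batch of 8192, with the incomplete last batch dropped
# (dataloader_drop_last: True).
ASSUMED_GLOBAL_BATCH = 8192
steps_per_epoch = TRAIN_SAMPLES // ASSUMED_GLOBAL_BATCH
print(steps_per_epoch)

def epoch_fraction(step):
    """Epoch value for a given step, rounded as in the log table."""
    return round(step / steps_per_epoch, 4)

print(epoch_fraction(3090), epoch_fraction(3969))
```

Under these assumptions the computed values reproduce the logged rows for steps 3090 and 3969.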

Training Time

Training: 5.0 hours
Evaluation: 10.6 minutes
Total: 5.1 hours

Framework Versions

Python: 3.12.13
Sentence Transformers: 5.4.1
Transformers: 5.8.0
PyTorch: 2.11.0+cu130
Accelerate: 1.13.0
Datasets: 4.8.5
Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Safetensors

Model size: 0.1B params
Tensor type: F32

Collection including TeraflopAI/teraflopai-denseon-caselaw

Legal Encoders: A collection of SOTA legal embedding models trained for information retrieval on a large quantity of U.S. Case Law documents (9 items).
