BERTimbau Legal TJSC v2

Fine-tuned BERTimbau for Brazilian legal area classification across 5 legal domains and 5 courts.

Part of the LegalBench-BR benchmark.

Model Details

| Property | Value |
|---|---|
| Base model | neuralmind/bert-base-portuguese-cased (BERTimbau) |
| Fine-tuning method | LoRA (r=16, alpha=32, dropout=0.1) |
| Target modules | query, value |
| Trainable parameters | 593,669 (0.54% of total) |
| Training data | 50,000 samples (oversampled from 17,462 labeled) |
| Max sequence length | 128 tokens |
| Training epochs | 5 (with early stopping, patience=2) |
| GPU | Google Colab T4 (~35 min) |
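
The trainable-parameter figure can be reproduced from the LoRA settings listed here. The sketch below is a back-of-the-envelope check, not the author's script: it assumes the standard BERT-base dimensions (hidden size 768, 12 encoder layers) and assumes the 5-way classification head is trained alongside the adapters.

```python
HIDDEN = 768      # assumption: BERT-base hidden size
LAYERS = 12       # assumption: BERT-base encoder layers
RANK = 16         # LoRA r
TARGETS = 2       # query and value projections in each layer
NUM_LABELS = 5    # legal areas

# Each adapted 768x768 projection gains two low-rank matrices:
# A with shape (r, 768) and B with shape (768, r).
lora_per_projection = RANK * HIDDEN + HIDDEN * RANK
lora_total = lora_per_projection * TARGETS * LAYERS

# Assumption: the classification head (Linear 768 -> 5, with bias) is also trained.
classifier = HIDDEN * NUM_LABELS + NUM_LABELS

trainable = lora_total + classifier
print(f"{trainable:,}")  # 593,669
```

Under these assumptions the adapters contribute 589,824 parameters and the head 3,845, which together match the reported 593,669.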

Performance

Main Results (Test Set: 250 samples, balanced)

| Metric | Score |
|---|---|
| Accuracy | 99.60% |
| F1 Macro | 0.9960 |
| Inference latency | 86.4 ms (CPU) |
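
On a balanced 250-sample test set, 99.60% accuracy corresponds to 249 correct predictions and one error. The sketch below shows how accuracy and macro F1 follow from a confusion matrix in that situation. Which two classes the error involves is not stated in this card, so the pair chosen here is purely a placeholder.

```python
LABELS = ["civel", "consumidor", "tributario", "administrativo", "penal"]
PER_CLASS = 50  # 250 balanced test samples / 5 classes

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
confusion = [[PER_CLASS if i == j else 0 for j in range(5)] for i in range(5)]
# Placeholder assumption: one "civel" sample is predicted as "consumidor".
confusion[0][0] -= 1
confusion[0][1] += 1

def f1_for_class(matrix, k):
    """F1 for class k, computed as 2TP / (2TP + FP + FN)."""
    tp = matrix[k][k]
    fp = sum(matrix[i][k] for i in range(len(matrix))) - tp
    fn = sum(matrix[k]) - tp
    return 2 * tp / (2 * tp + fp + fn)

total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[k][k] for k in range(5)) / total
macro_f1 = sum(f1_for_class(confusion, k) for k in range(5)) / 5

print(f"accuracy = {accuracy:.2%}")  # accuracy = 99.60%
print(f"macro F1 = {macro_f1:.4f}")  # macro F1 = 0.9960
```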

Per-Class F1 Scores

| Legal Area | F1 Score |
|---|---|
| Administrativo | 0.995 |
| Civel | 0.995 |
| Consumidor | 0.995 |
| Penal | 0.995 |
| Tributario | 0.995 |

Comparison with LLMs (0-shot)

| Model | Accuracy | F1 Macro | Latency |
|---|---|---|---|
| BERTimbau v2 (this model) | 99.60% | 0.9960 | 86 ms |
| Legal-BERTimbau v2 | 98.00% | 0.9800 | 85 ms |
| Claude 3.5 Haiku | 77.20% | 0.7534 | 1,192 ms |
| GPT-4o | 74.80% | 0.7095 | 1,048 ms |
| GPT-4o-mini | 55.60% | 0.5243 | 969 ms |
| Llama 3.1 8B | 47.20% | 0.3811 | 811 ms |

Labels

LABEL2ID = {
    "civel": 0,
    "consumidor": 1,
    "tributario": 2,
    "administrativo": 3,
    "penal": 4
}

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "pedronettotrue/bertimbau-legal-tjsc-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

ID2LABEL = {0: "civel", 1: "consumidor", 2: "tributario", 3: "administrativo", 4: "penal"}

# Input is judicial metadata (procedural class + subjects), not full decision text
text = "Classe: Apelacao Civel. Assuntos: Responsabilidade Civil, Danos Morais"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits  # one row of 5 class scores

# Index of the highest-scoring class for the single input
predicted_class = ID2LABEL[logits.argmax(dim=-1).item()]

print(f"Predicted: {predicted_class}")
# Expected output: Predicted: civel
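
The usage snippet reports only the top class. When a confidence score per legal area is useful, the raw logits can be passed through a softmax. The sketch below is a minimal pure-Python version. `label_probabilities` is a hypothetical helper, and the logits are made-up values for illustration; in practice they would come from `logits[0].tolist()`.

```python
import math

ID2LABEL = {0: "civel", 1: "consumidor", 2: "tributario", 3: "administrativo", 4: "penal"}

def label_probabilities(logits):
    """Convert one row of raw logits into a {label: probability} dict via softmax."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {ID2LABEL[i]: e / total for i, e in enumerate(exps)}

# Made-up logits for illustration only (not real model output).
example_logits = [4.2, 0.3, -1.1, -0.7, -2.0]
probs = label_probabilities(example_logits)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With PyTorch available, `torch.softmax(logits, dim=-1)` gives the same probabilities directly on the tensor.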

Training Details

  • Dataset: LegalBench-BR v2 -- 50,000 training samples from 5 Brazilian courts (TJSC, TJSP, TJRJ, STJ, TJMG)
  • Labeling: Deterministic 3-layer CNJ taxonomy matching (subject, procedural class, court division) -- no LLM-based labeling
  • Optimizer: AdamW (lr=2e-4, weight_decay=0.01, warmup_ratio=0.1)
  • Precision: FP16
  • Framework: HuggingFace Transformers + PEFT (LoRA)
  • Post-training: LoRA weights merged into the base model via merge_and_unload(), so the checkpoint loads without the PEFT dependency

Dataset

The model was trained on metadata from Brazilian court decisions (procedural class + legal subjects), not on full-text judicial opinions. The distinction matters: the model classifies by patterns in structured judicial metadata, so inputs should be that kind of metadata and not free-form legal text.
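
Because the model expects metadata and not prose, it helps to assemble inputs in a consistent shape. The sketch below builds a string using the template from the Usage example. `build_input` is a hypothetical helper, and whether the training inputs used exactly this template is an assumption.

```python
def build_input(procedural_class, subjects):
    """Join a procedural class and its subjects into one metadata string."""
    subject_list = ", ".join(s.strip() for s in subjects if s.strip())
    return f"Classe: {procedural_class.strip()}. Assuntos: {subject_list}"

text = build_input("Apelacao Civel", ["Responsabilidade Civil", "Danos Morais"])
print(text)
# Classe: Apelacao Civel. Assuntos: Responsabilidade Civil, Danos Morais
```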

Courts covered in training:

  • TJSC (Tribunal de Justica de Santa Catarina)
  • TJSP (Tribunal de Justica de Sao Paulo)
  • TJRJ (Tribunal de Justica do Rio de Janeiro)
  • STJ (Superior Tribunal de Justica)
  • TJMG (Tribunal de Justica de Minas Gerais)

Limitations

  • Trained on judicial metadata (procedural class + subjects), not full-text legal reasoning
  • 5 legal areas only (does not cover labor, environmental, family law, etc.)
  • Performance on courts not in the training set has not been evaluated
  • Input is truncated at 128 tokens

Citation

@misc{legalbenchbr2026,
  title={LegalBench-BR v2: A Multi-Court Benchmark Demonstrating Fine-Tuned Encoder
         Superiority over LLMs for Brazilian Legal Area Classification},
  year={2026},
  url={https://huggingface.co/pedronettotrue/bertimbau-legal-tjsc-v2}
}

License

Apache 2.0
