OmniGene-4-SFT-v5 (LoRA + Classification Heads)
LoRA adapter + extended embedding + dual-head classifiers for OmniGene-4. Requires base Gemma-4-26B-A4B-it-bio.
This is the LoRA-only version. For a standalone BF16 model with the weights merged, see OmniGene-4-SFT-v5-merged.
What's in this repo (~1.9 GB)
| File | Size | Purpose |
|---|---|---|
| lora_weights.pt | 306 MB | LoRA delta (cumulative: CPT v2 + SFT v2/v3/v4/v5) |
| embedding_weights.pt | 1.6 GB | Extended embedding (290,048 tokens, including 28,028 bio tokens) |
| struct_heads.pt | 157 KB | 3Di (20-class) + DSSP (8-class) per-residue heads |
| tokenizer.json | 36 MB | Extended bio tokenizer |
| chat_template.jinja | 17 KB | Alpaca-style template |
| bio_sft_v5_meta.json | - | Training metadata |
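As a rough sanity check on the listed sizes, the sketch below recomputes the embedding and head sizes from their shapes. It assumes the tensors are stored as bfloat16 (2 bytes per value), that the embedding width equals the 2816 input width of the heads in the Quick Start, and that each head is a plain linear layer with bias; none of these is stated in the table itself.

```python
# Back-of-the-envelope size check for two files in this repo.
# Assumptions (not stated in the table): bfloat16 storage, embedding width
# equal to the head input width (2816), heads are nn.Linear with bias.
HIDDEN_SIZE = 2816       # input width of the heads in the Quick Start
VOCAB_SIZE = 290_048     # extended vocabulary size from the table
BYTES_PER_VALUE = 2      # bfloat16


def linear_params(in_features: int, out_features: int) -> int:
    """Parameter count of a linear layer with bias."""
    return in_features * out_features + out_features


embedding_bytes = VOCAB_SIZE * HIDDEN_SIZE * BYTES_PER_VALUE
heads_bytes = (
    linear_params(HIDDEN_SIZE, 20)   # 3Di head, 20 classes
    + linear_params(HIDDEN_SIZE, 8)  # DSSP head, 8 classes
) * BYTES_PER_VALUE

print(f"embedding_weights.pt ~ {embedding_bytes / 1e9:.2f} GB")  # 1.63 GB
print(f"struct_heads.pt      ~ {heads_bytes / 1e3:.1f} KB")      # 157.8 KB
```

Both estimates land close to the listed 1.6 GB and 157 KB, which is consistent with bfloat16 storage; file-format overhead is ignored.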
Performance (4-bit + Alpaca prompt)
| Benchmark | Accuracy |
|---|---|
| Standard Homology (6,000 pairs) | 99.40% |
| Remote Homology (2,000 pairs) | 82.60% |
| BixBench Knowledge (T/F) | 93.66% |
| 3Di per-residue (head, 20-class) | 78.6% |
| DSSP per-residue (head, 8-class) | 100.0% |
On an identical 500-pair remote homology set, ESM-2 (650M) reaches 50.5%, a gap of +32.1 pp in this model's favor.
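The scores above were measured with an Alpaca prompt. As a minimal sketch of what such a prompt looks like, the helper below builds the standard Stanford Alpaca layout. The `build_alpaca_prompt` name, the example instruction, and the two short sequences are hypothetical; the `chat_template.jinja` shipped in this repo is the authoritative format and may differ in wording or spacing.

```python
from typing import Optional

# Standard Stanford Alpaca preambles. Assumption: the repo's
# chat_template.jinja follows this layout; check it before relying on this.
PREAMBLE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)
PREAMBLE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request."
)


def build_alpaca_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Return an Alpaca-style prompt ending at the response header."""
    if input_text:
        return (
            f"{PREAMBLE_WITH_INPUT}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        f"{PREAMBLE_NO_INPUT}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


# Hypothetical instruction and placeholder sequences, for illustration only.
prompt = build_alpaca_prompt(
    "Are these two protein sequences homologous? Answer yes or no.",
    "Sequence A: MKTAYIAKQR\nSequence B: MKSAYLAKQR",
)
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` once the model from the Quick Start is loaded.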
Quick Start
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, inject_adapter_in_model
from huggingface_hub import hf_hub_download
BASE = "dnagpt/gemma-4-26B-A4B-it-bio"
ADAPTER = "dnagpt/OmniGene-4-SFT-v5"
bnb = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map={"": 0})
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
# Inject LoRA
lora_config = LoraConfig(
    r=64, lora_alpha=128, lora_dropout=0.0, bias="none",
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj',
                    'gate_proj', 'up_proj', 'down_proj', 'router.proj'],
)
inject_adapter_in_model(lora_config, model.model.language_model, adapter_name="default")
# Load v5 LoRA + embedding
lora_path = hf_hub_download(ADAPTER, "lora_weights.pt")
embed_path = hf_hub_download(ADAPTER, "embedding_weights.pt")
heads_path = hf_hub_download(ADAPTER, "struct_heads.pt")
ms = model.state_dict()
# Copy each saved tensor into the matching parameter; unmatched keys are skipped
for k, v in torch.load(lora_path, map_location="cpu").items():
    if k in ms:
        ms[k].copy_(v)
model.get_input_embeddings().weight.data.copy_(torch.load(embed_path, map_location="cpu"))
model.eval()
# Optional: classification heads
heads = torch.load(heads_path, map_location="cuda")
head_3di = nn.Linear(2816, 20).to(torch.bfloat16).cuda()
head_dssp = nn.Linear(2816, 8).to(torch.bfloat16).cuda()
head_3di.load_state_dict(heads["head_3di"])
head_dssp.load_state_dict(heads["head_dssp"])
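For reference, the `LoraConfig` in the code above (r=64, lora_alpha=128, lora_dropout=0.0) corresponds to the standard LoRA forward pass, sketched here assuming PEFT's default scaling of lora_alpha / r (that is, `use_rslora` left at its default). Here W_0 is the frozen, 4-bit quantized base weight of a targeted projection, and A and B are the low-rank factors that `lora_weights.pt` presumably supplies:

```latex
h = W_0 x + \frac{\alpha}{r}\, B A x,
\qquad
B \in \mathbb{R}^{d_{\text{out}} \times r},\;
A \in \mathbb{R}^{r \times d_{\text{in}}},
\qquad
\frac{\alpha}{r} = \frac{128}{64} = 2
```

Under these assumptions, each targeted module adds twice the low-rank product BA on top of the base projection.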
Training Lineage
Gemma-4-26B-A4B-Instruct-bio (vocab-extended)
↓ CPT v2 (32.5 GB, 0.6 ep, 100 GPU-h)
↓ Bio-SFT v2 (179K instr, 1 ep, 11.8 GPU-h)
↓ Bio-SFT v3 (+20K remote homology, 13.2 GPU-h)
↓ Bio-SFT v4 (Alpaca + loss masking + reweighting, 30 GPU-h)
↓ Bio-SFT v5 (dual-head: gen + 3Di + DSSP, 5 GPU-h)
OmniGene-4-SFT-v5 ← YOU ARE HERE
Citation
@article{wang2026omnigene4,
title={OmniGene-4: A Unified Bio-Language MoE Model with Router-Level Interpretability},
author={Wang, Liang},
journal={bioRxiv},
year={2026}
}
Contact
Liang Wang (wangliang.f@gmail.com), Huazhong University of Science and Technology