OmniGene-4-SFT-v4 (LoRA)

LoRA adapter and extended embedding weights for OmniGene-4 SFT v4. Requires the base model Gemma-4-26B-A4B-it-bio (dnagpt/gemma-4-26B-A4B-it-bio on the Hub).

This is the LoRA-only version. For the dual-head successor with classification heads, see OmniGene-4-SFT-v5.

What's new in v4

v4 addresses the failures seen in v3 with four changes:

Pure Alpaca template (### Instruction: / ### Answer:) — eliminates chat-tag collapse seen in v3 4-bit inference
Loss masking: prompt tokens set to -100, only answer tokens contribute to gradient
Task reweighting: Structure ×3, Mutation ×2 — addresses long-tail data scarcity
MAX_LENGTH 1024 → 1536 — fits longer Structure outputs
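
The Alpaca template and loss masking described above can be sketched in plain Python. This is a minimal illustration, not the training code: the exact whitespace of the template (the shipped `chat_template.jinja` is authoritative), the helper names `build_prompt` and `mask_prompt_labels`, and right-side truncation at MAX_LENGTH are all assumptions.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style template (assumed whitespace)."""
    return f"### Instruction:\n{instruction}\n\n### Answer:\n"


def mask_prompt_labels(prompt_ids, answer_ids, max_length=1536):
    """Concatenate prompt and answer token ids and mask the prompt in the labels.

    Prompt positions get IGNORE_INDEX so only answer tokens contribute to the loss.
    Sequences longer than max_length are truncated from the right (an assumption).
    """
    input_ids = (list(prompt_ids) + list(answer_ids))[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + list(answer_ids))[:max_length]
    return {"input_ids": input_ids, "labels": labels}


# Toy token ids stand in for real tokenizer output.
example = mask_prompt_labels(prompt_ids=[5, 6, 7], answer_ids=[8, 9])
print(example["labels"])
```

In a real pipeline the two id lists would come from tokenizing the prompt and the answer separately, so the boundary between masked and unmasked positions is known exactly.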

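Result: Remote Homology rises 22.5 pp (59.5% → 82.0%), and Structure goes from 0.0% to 25.7% char overlap. Standard Homology is essentially unchanged (99.4% → 99.5%), while BixBench dips slightly (91.7% → 90.4%).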
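
The task reweighting listed above (Structure ×3, Mutation ×2) could be implemented in several ways, and this card does not say which one was used. The sketch below assumes weighted sampling of training examples. The function names and the "Homology" task label are hypothetical, and scaling the per-example loss by the same factors would be an equally plausible reading.

```python
import random

# Multipliers from the list above; any other task defaults to 1.0.
TASK_WEIGHTS = {"Structure": 3.0, "Mutation": 2.0}


def sampling_weights(tasks):
    """Return one sampling weight per example, given each example's task label."""
    return [TASK_WEIGHTS.get(task, 1.0) for task in tasks]


def sample_indices(tasks, k, seed=0):
    """Draw k example indices with replacement, proportionally to the task weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(tasks)), weights=sampling_weights(tasks), k=k)


# Hypothetical task labels for a four-example dataset.
tasks = ["Structure", "Mutation", "Homology", "Homology"]
print(sampling_weights(tasks))
print(sample_indices(tasks, k=8))
```

With these weights a Structure example is drawn three times as often as an unweighted one, which raises its share of gradient updates without duplicating data on disk.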

What's in this repo (~1.9 GB)

File	Size	Purpose
`lora_weights.pt`	306 MB	LoRA delta (cumulative: CPT v2 + SFT v2/v3/v4)
`embedding_weights.pt`	1.6 GB	Extended embedding (290,048 tokens)
`tokenizer.json`	36 MB	Extended bio tokenizer
`chat_template.jinja`	17 KB	Alpaca-style template
`bio_sft_v4_meta.json`	—	Training metadata

Performance (4-bit + Alpaca prompt)

Benchmark	v3	v4
Standard Homology	99.4%	99.5%
Remote Homology	59.5%	82.0%
BixBench	91.7%	90.4%
Structure char-overlap	0.0%	25.7%

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, inject_adapter_in_model
from huggingface_hub import hf_hub_download

BASE = "dnagpt/gemma-4-26B-A4B-it-bio"
ADAPTER = "dnagpt/OmniGene-4-SFT-v4"

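# 4-bit NF4 quantization with bfloat16 compute and double quantization.
bnb = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map={"": 0})
# Load the extended bio tokenizer from the adapter repo.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Create empty LoRA slots on the language model so the saved weights have somewhere to go.
lora_config = LoraConfig(
    r=64, lora_alpha=128, lora_dropout=0.0, bias="none",
    target_modules=['q_proj','k_proj','v_proj','o_proj',
                    'gate_proj','up_proj','down_proj','router.proj'],
)
inject_adapter_in_model(lora_config, model.model.language_model, adapter_name="default")

# Copy the saved LoRA tensors into the matching slots and report how many keys matched,
# so a key-name mismatch does not pass silently.
ms = model.state_dict()
lora_sd = torch.load(hf_hub_download(ADAPTER, "lora_weights.pt"), map_location="cpu")
loaded = 0
with torch.no_grad():
    for k, v in lora_sd.items():
        if k in ms:
            ms[k].copy_(v)
            loaded += 1
print(f"Loaded {loaded}/{len(lora_sd)} LoRA tensors")

# Overwrite the input embedding with the extended-vocabulary weights (290,048 tokens).
emb = torch.load(hf_hub_download(ADAPTER, "embedding_weights.pt"), map_location="cpu")
model.get_input_embeddings().weight.data.copy_(emb)
model.eval()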
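
The Quick Start above stops once the model is loaded. A minimal sketch of a generation call with the Alpaca-style prompt follows. It assumes `model` and `tokenizer` come from the code above; `build_prompt` and `generate_answer` are hypothetical helpers, and the template whitespace is an assumption (the shipped `chat_template.jinja` is authoritative).

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in an Alpaca-style template (assumed whitespace)."""
    return f"### Instruction:\n{instruction}\n\n### Answer:\n"


def generate_answer(model, tokenizer, instruction: str, max_new_tokens: int = 256) -> str:
    """Greedy-decode an answer and return only the newly generated text."""
    prompt = build_prompt(instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Usage, once the Quick Start has run: `print(generate_answer(model, tokenizer, "your instruction here"))`. Greedy decoding is chosen here only to keep the output deterministic; the decoding settings behind the reported benchmarks are not stated in this card.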

Training Lineage

Gemma-4-26B-A4B-it-bio (vocab-extended)
  ↓ CPT v2 (32.5 GB, 0.6 ep, 100 GPU-h)
  ↓ Bio-SFT v2 (179K instr, 1 ep, 11.8 GPU-h)
  ↓ Bio-SFT v3 (+20K remote homology, 13.2 GPU-h)
  ↓ Bio-SFT v4 (Alpaca + loss masking + reweighting, 30 GPU-h)
OmniGene-4-SFT-v4  ← YOU ARE HERE

Citation

@article{wang2026omnigene4,
  title={OmniGene-4: A Unified Bio-Language MoE Model with Router-Level Interpretability},
  author={Wang, Liang},
  journal={bioRxiv},
  year={2026}
}

Contact

Liang Wang (wangliang.f@gmail.com) — Huazhong University of Science and Technology
