OmniGene-4-SFT-v4 (LoRA)

LoRA adapter + extended embedding for OmniGene-4 SFT v4. Requires base Gemma-4-26B-A4B-it-bio.

This is the LoRA-only version. For the dual-head successor with classification heads, see OmniGene-4-SFT-v5.

What's new in v4

v4 fixes the main v3 failure modes with four changes:

  • Pure Alpaca template (### Instruction: / ### Answer:), which eliminates the chat-tag collapse seen in v3 4-bit inference
  • Loss masking: prompt tokens set to -100, only answer tokens contribute to gradient
  • Task reweighting: Structure ×3, Mutation ×2, to address long-tail data scarcity
  • MAX_LENGTH 1024 → 1536, to fit longer Structure outputs
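The loss-masking step in the list above can be sketched as follows. This is a minimal illustration, not the training script: the function name `mask_prompt_labels` and the toy token ids are hypothetical, and truncation to 1536 tokens is assumed to be applied to the concatenated sequence.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default
MAX_LENGTH = 1536    # v4 sequence limit quoted in the list above


def mask_prompt_labels(prompt_ids, answer_ids, max_length=MAX_LENGTH):
    """Concatenate prompt and answer ids; hide the prompt from the loss.

    Hypothetical helper: labels copy the answer ids, while every prompt
    position is set to IGNORE_INDEX so it contributes no gradient.
    """
    input_ids = list(prompt_ids) + list(answer_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    # Assumption: sequences longer than max_length are cut from the right.
    return input_ids[:max_length], labels[:max_length]


# Toy ids standing in for a tokenized Alpaca prompt and its answer.
prompt_ids = [11, 12, 13, 14]
answer_ids = [21, 22, 2]
input_ids, labels = mask_prompt_labels(prompt_ids, answer_ids)
print(input_ids)
print(labels)
```

Only the last three positions carry real labels here, so only the answer tokens would drive the gradient.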

Result: Remote Homology +22.5 pp (59.5% → 82.0%); Structure unblocked from 0% to 25.7% char overlap.
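The task reweighting listed under "What's new in v4" (Structure ×3, Mutation ×2) could be realized in more than one way. The sketch below assumes oversampling by duplication; the actual training run may instead have used per-example loss weights. The `task` field, the task labels, and the `reweight` helper are hypothetical.

```python
# Multipliers from the list above; any other task keeps weight 1.
TASK_WEIGHTS = {"structure": 3, "mutation": 2}


def reweight(examples, weights=TASK_WEIGHTS):
    """Repeat each example according to its task weight (assumed scheme)."""
    out = []
    for ex in examples:
        out.extend([ex] * weights.get(ex["task"], 1))
    return out


# Toy dataset with one example per task.
data = [
    {"task": "structure", "text": "s1"},
    {"task": "mutation", "text": "m1"},
    {"task": "homology", "text": "h1"},
]
resampled = reweight(data)
counts = {}
for ex in resampled:
    counts[ex["task"]] = counts.get(ex["task"], 0) + 1
print(counts)
```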

What's in this repo (~1.9 GB)

File                   Size     Purpose
lora_weights.pt        306 MB   LoRA delta (cumulative: CPT v2 + SFT v2/v3/v4)
embedding_weights.pt   1.6 GB   Extended embedding (290,048 tokens)
tokenizer.json         36 MB    Extended bio tokenizer
chat_template.jinja    17 KB    Alpaca-style template
bio_sft_v4_meta.json   -        Training metadata

Performance (4-bit + Alpaca prompt)

Benchmark                v3       v4
Standard Homology        99.4%    99.5%
Remote Homology          59.5%    82.0%
BixBench                 91.7%    90.4%
Structure char-overlap   0.0%     25.7%
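For a quick read of the table, the per-benchmark change from v3 to v4 in percentage points can be computed directly from the numbers above. The dictionary below simply restates the table; nothing else is assumed.

```python
# (v3, v4) scores in percent, copied from the performance table.
scores = {
    "Standard Homology": (99.4, 99.5),
    "Remote Homology": (59.5, 82.0),
    "BixBench": (91.7, 90.4),
    "Structure char-overlap": (0.0, 25.7),
}

# Difference in percentage points, rounded to one decimal place.
deltas = {name: round(v4 - v3, 1) for name, (v3, v4) in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} pp")
```

The gains are concentrated in Remote Homology and Structure, while BixBench drops by 1.3 pp.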

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, inject_adapter_in_model
from huggingface_hub import hf_hub_download

BASE = "dnagpt/gemma-4-26B-A4B-it-bio"
ADAPTER = "dnagpt/OmniGene-4-SFT-v4"

# Load the base model in 4-bit NF4 with bfloat16 compute
bnb = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map={"": 0})
# The extended bio tokenizer ships with the adapter repo
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)

# Inject freshly initialized LoRA layers into the target projections
lora_config = LoraConfig(
    r=64, lora_alpha=128, lora_dropout=0.0, bias="none",
    target_modules=['q_proj','k_proj','v_proj','o_proj',
                    'gate_proj','up_proj','down_proj','router.proj'],
)
inject_adapter_in_model(lora_config, model.model.language_model, adapter_name="default")

# Copy the trained LoRA tensors over the freshly initialized ones
ms = model.state_dict()
lora_sd = torch.load(hf_hub_download(ADAPTER, "lora_weights.pt"), map_location="cpu")
loaded = 0
for k, v in lora_sd.items():
    if k in ms:
        ms[k].copy_(v)
        loaded += 1
# Keys missing from the model are skipped, so confirm that the count matches
print(f"Loaded {loaded}/{len(lora_sd)} LoRA tensors")

# Overwrite the input embedding with the extended embedding table
model.get_input_embeddings().weight.data.copy_(
    torch.load(hf_hub_download(ADAPTER, "embedding_weights.pt"), map_location="cpu")
)
model.eval()
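Once the model is loaded, inference uses the Alpaca-style prompt that the benchmarks were run with. The helpers below are a hedged sketch: `build_prompt` and `generate_answer` are hypothetical names, and the exact whitespace around the `### Instruction:` / `### Answer:` markers is an assumption, so check chat_template.jinja in the repo for the authoritative layout.

```python
def build_prompt(instruction: str) -> str:
    """Alpaca-style prompt; newline placement is an assumption."""
    return f"### Instruction:\n{instruction}\n\n### Answer:\n"


def generate_answer(model, tokenizer, instruction, max_new_tokens=256):
    """Greedy-decode an answer and return only the newly generated text.

    Expects the `model` and `tokenizer` objects created in the Quick Start.
    """
    prompt = build_prompt(instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


example_prompt = build_prompt("Classify this protein.")
print(example_prompt)
```

With the Quick Start objects in scope, a call would look like `generate_answer(model, tokenizer, "Classify this protein.")`; the instruction text here is only a placeholder.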

Training Lineage

Gemma-4-26B-A4B-Instruct-bio (vocab-extended)
  ↓ CPT v2 (32.5 GB, 0.6 ep, 100 GPU-h)
  ↓ Bio-SFT v2 (179K instr, 1 ep, 11.8 GPU-h)
  ↓ Bio-SFT v3 (+20K remote homology, 13.2 GPU-h)
  ↓ Bio-SFT v4 (Alpaca + loss masking + reweighting, 30 GPU-h)
OmniGene-4-SFT-v4  ← YOU ARE HERE

Citation

@article{wang2026omnigene4,
  title={OmniGene-4: A Unified Bio-Language MoE Model with Router-Level Interpretability},
  author={Wang, Liang},
  journal={bioRxiv},
  year={2026}
}

Contact

Liang Wang (wangliang.f@gmail.com), Huazhong University of Science and Technology
