Kai-30B-Instruct

A 30B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks, powered by our ADS (Adaptive Dual-Search Distillation) technique. The largest model in the Kai family.

Model Details

Model Kai-30B-Instruct
Architecture Qwen2ForCausalLM
Parameters ~30B
Hidden size 5120
Intermediate size 27648
Layers 64
Attention heads 40 (8 KV heads, GQA)
Context length 32768
Precision bfloat16
Vocab size 152064
Chat template ChatML (<|im_start|> / <|im_end|>)
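From the specs above, a rough memory budget follows directly; the arithmetic below is a back-of-envelope sketch (weights and KV cache only — it ignores activations and framework overhead):

```python
# Back-of-envelope memory estimate from the model details above.
params = 30e9
bytes_per_param = 2                               # bfloat16
weight_gb = params * bytes_per_param / 1024**3    # ~55.9 GiB of weights

layers, kv_heads, head_dim = 64, 8, 5120 // 40    # head_dim = 128
# K and V caches per token, bf16: 2 tensors * layers * kv_heads * head_dim * 2 bytes
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * 2   # 262,144 bytes
kv_gb_full_context = kv_bytes_per_token * 32768 / 1024**3   # 8.0 GiB at full context
```

Thanks to GQA (8 KV heads rather than 40), the full-context KV cache stays at about 8 GiB instead of roughly 40 GiB.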

Benchmark Results (5-shot, acc_norm)

| Benchmark  | Kai-30B-Instruct | Llama-3 70B | Qwen2.5 32B | Yi-34B | Llama-3 8B | Mistral 7B | Llama-2 7B |
|------------|------------------|-------------|-------------|--------|------------|------------|------------|
| ARC-C      | 64.0             | 83.0        | 70.5        | 65.3   | 60.1       | 55.5       | 53.0       |
| HellaSwag  | 74.4             | 89.0        | 85.2        | 83.1   | 78.6       | 81.3       | 78.6       |
| PIQA       | 84.8             | 85.0        | 84.1        | 82.5   | 79.8       | 82.1       | 78.1       |
| Winogrande | 86.4             | 83.0        | 78.2        | 76.4   | 73.0       | 74.0       | 69.1       |

What is ADS?

Adaptive Dual-Search Distillation (ADS) treats model fine-tuning as a constrained optimization problem inspired by operations research. Its core mechanism is a dynamic loss function with a stateful dual penalty factor that adapts to the entropy of the embedding space, pushing the model toward high-confidence predictions at difficult reasoning points without any change to the model architecture.
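The mechanism can be sketched as a penalized training loss whose multiplier is updated by dual ascent. Everything below — the function name, the entropy target, the dual learning rate, and the use of predictive entropy as a stand-in for embedding-space entropy — is an illustrative assumption, not NoesisLab's actual implementation:

```python
import torch
import torch.nn.functional as F

def ads_loss(logits, targets, dual_lambda, entropy_target=2.0, lr_dual=0.01):
    """Sketch of an entropy-constrained fine-tuning step (hypothetical).

    Cross-entropy plus a dual penalty on predictive entropy; the stateful
    multiplier `dual_lambda` rises while the entropy constraint is violated,
    so low-confidence positions are penalized progressively harder.
    """
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(-1).mean()
    loss = ce + dual_lambda * torch.clamp(entropy - entropy_target, min=0.0)
    # Dual ascent: increase the penalty factor when entropy exceeds the target.
    new_lambda = max(0.0, dual_lambda + lr_dual * (entropy - entropy_target).item())
    return loss, new_lambda
```

Because the penalty lives entirely in the loss, the trained checkpoint loads like any other Qwen2-architecture model.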

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "NoesisLab/Kai-30B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NoesisLab/Kai-30B-Instruct")

messages = [{"role": "user", "content": "What is 25 * 4?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.6,
    top_p=0.8,
    do_sample=True,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
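For the example messages, apply_chat_template renders a ChatML prompt along these lines (schematic — the exact system turn depends on the template bundled with the tokenizer):

```
<|im_start|>user
What is 25 * 4?<|im_end|>
<|im_start|>assistant
```

The trailing assistant header comes from add_generation_prompt=True and is where the model begins decoding.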

Citation

@misc{noesislab2026kai30b,
  title={Kai-30B-Instruct},
  author={NoesisLab},
  year={2026},
  url={https://huggingface.co/NoesisLab/Kai-30B-Instruct}
}

License

Apache 2.0
