Böri — Kazakh AI Grammar Tutor (bori-tutor)

QLoRA fine-tune of Sherkala-8B (inceptionai/Llama-3.1-Sherkala-8B-Chat). Takes a Kazakh (Cyrillic) sentence and returns a JSON object with four fields: corrected_text, explanation, next_question, used_words.
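
A minimal sketch of how a client might check a parsed reply against the four documented fields. The field types are assumptions (the card lists only the key names), and `TutorReply`, `REQUIRED_KEYS`, and `missing_keys` are hypothetical helper names, not part of the model or any library.

```python
from typing import List, TypedDict


class TutorReply(TypedDict):
    # Key names come from the model card; the types are assumptions.
    corrected_text: str
    explanation: str
    next_question: str
    used_words: List[str]  # assumed to be a list of strings


REQUIRED_KEYS = ("corrected_text", "explanation", "next_question", "used_words")


def missing_keys(reply: dict) -> list:
    """Return the documented keys that are absent from a parsed reply."""
    return [key for key in REQUIRED_KEYS if key not in reply]
```

An empty list from `missing_keys` means all four keys are present; it does not validate the values themselves.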

Eval

  • eval_loss: 0.531
  • perplexity: 1.70
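
The two figures agree if perplexity is taken as the exponential of eval_loss. This assumes eval_loss is the mean per-token cross-entropy in nats, which the card does not state explicitly:

```python
import math

eval_loss = 0.531  # value reported above
perplexity = math.exp(eval_loss)  # assumes loss is mean cross-entropy in nats
print(f"{perplexity:.2f}")  # prints 1.70
```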

Important serving notes

  • Tokenizer has no chat_template → build the Llama-3.1 prompt manually (see below).
  • Model may append text after the JSON → extract the first {...} object and parse it with json.loads.
  • System prompt is NOT baked into the adapter; pass it with every request at inference.
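
The first two notes can be sketched as two small helpers that need no model to run. `build_prompt` and `extract_first_json` are hypothetical names; the prompt layout follows the string used in the Usage code, and the extraction relies on the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value and ignores whatever follows it.

```python
import json


def build_prompt(system: str, user: str) -> str:
    """Assemble a Llama-3.1 style chat prompt by hand (no chat_template needed)."""
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def extract_first_json(text: str) -> dict:
    """Return the first decodable JSON object in text, or {'raw': text} if none."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode stops at the end of the first complete JSON value,
            # so trailing text after the object is ignored.
            obj, _end = decoder.raw_decode(text, pos)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        pos = text.find("{", pos + 1)
    return {"raw": text}
```

Unlike slicing from the first `{` to the last `}`, this approach still works when the trailing text itself contains braces, and it never raises on malformed output.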

Usage (base + adapter, 4-bit)

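import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

BASE = 'inceptionai/Llama-3.1-Sherkala-8B-Chat'
ADP = 'zhdokax/bori-tutor'
# 4-bit NF4 quantization with double quantization; compute runs in fp16.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type='nf4',
                         bnb_4bit_compute_dtype=torch.float16, bnb_4bit_use_double_quant=True)
tok = AutoTokenizer.from_pretrained(ADP)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
base = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map='auto')
model = PeftModel.from_pretrained(base, ADP).eval()
# Roughly: "You are Bori, an interactive Kazakh language teacher. ALWAYS answer in JSON format only."
SYS = 'Sen -- Bori, qazaq tilin uyiretetyn interaktyvti mugalimsin. ARQASHAN tek JSON formatynda zhauyap ber.'

def ask(u):
    # Llama-3.1 chat layout, built by hand because the tokenizer has no chat_template.
    pr = f'<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{SYS}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{u}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'
    # The prompt already contains <|begin_of_text|>; this avoids a second BOS if the tokenizer prepends one.
    i = tok(pr, return_tensors='pt', add_special_tokens=False).to(model.device)
    o = model.generate(**i, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9,
                       repetition_penalty=1.1, pad_token_id=tok.eos_token_id)
    # Decode only the newly generated tokens, then parse the first JSON object and ignore trailing text.
    t = tok.decode(o[0][i['input_ids'].shape[-1]:], skip_special_tokens=True)
    s = t.find('{')
    try:
        return json.JSONDecoder().raw_decode(t, s)[0] if s >= 0 else {'raw': t}
    except json.JSONDecodeError:
        return {'raw': t}  # malformed JSON: return the raw text instead of raising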