Akkadian-English DenseLLM 1B

Akkadian-English DenseLLM 1B is an experimental custom DenseLLM model focused on English ↔ Akkadian / Old Babylonian translation, transliteration, glossing, and grammatical analysis.

This model is a continuation of the earlier AlgoDriveAI/Sanskrit_Akkadian_LLM project. The previous version mixed English, Sanskrit, Akkadian, and some auxiliary material. For this version, the model size was increased to approximately 1B parameters, and the training focus was narrowed to English and Akkadian only.

The main reason for this change is that early testing suggested the model may be easier to evaluate and improve when language targets are separated. Instead of combining English, Akkadian, and Sanskrit in one testing version, this release focuses specifically on English/Akkadian behavior.

This version achieves higher accuracy on Akkadian-focused translation, glossing, and analysis tasks than the earlier mixed-language testing versions.

What Is This?

This is a research model for ancient-language experimentation, especially:

  • Akkadian-to-English translation
  • English-to-Akkadian style generation
  • Old Babylonian-style normalized transliteration
  • Literal word-by-word glossing
  • Grammatical explanation
  • Case-ending analysis
  • Verb-form explanation
  • Ancient-language prompt-following behavior

This is not a production translation system. Outputs should be checked against reliable Akkadian grammars, dictionaries, and primary sources.

Relationship to the Previous Version

Compared with the earlier Sanskrit + Akkadian model:

  • The model size was increased to approximately 1B parameters
  • Sanskrit was removed from the main training target
  • The focus was narrowed to English/Akkadian
  • The model is intended to achieve higher accuracy on Akkadian-specific tasks
  • The architecture remains a custom DenseLLM-style causal language model
  • Inference uses the custom model code shipped in the repository (modeling_dense_llm.py) rather than the standard AutoModelForCausalLM loader

Install

pip install torch transformers huggingface_hub einops gradio

Regular Python Inference

import os
import sys
import json
import torch
import torch.nn.functional as F

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "AlgoDriveAI/Akkadian_English_DenseLLM_1B"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

print("Downloading model files...")
modeling_path = hf_hub_download(repo_id=repo_id, filename="modeling_dense_llm.py")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")

sys.path.insert(0, os.path.dirname(modeling_path))

from modeling_dense_llm import DenseLLM, ModelConfig

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

model_cfg = ModelConfig(**config)

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.is_bf16_supported() else (
    torch.float16 if device == "cuda" else torch.float32
)

print(f"Loading model on {device} with dtype={dtype}...")
model = DenseLLM(model_cfg, use_gradient_checkpointing=False).to(device=device, dtype=dtype)

state_dict = torch.load(weights_path, map_location="cpu")
model.load_state_dict(state_dict, strict=True)
model.eval()

print("Model ready!")


@torch.inference_mode()
def generate_text(
    prompt: str,
    max_new_tokens: int = 256,
    temperature: float = 0.55,
    top_k: int = 35,
    top_p: float = 0.88,
    repetition_penalty: float = 1.1,
):
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    generated = input_ids.clone()

    eos_id = getattr(tokenizer, "doc_eos_token_id", None)
    if eos_id is None:
        eos_id = tokenizer.eos_token_id

    for _ in range(max_new_tokens):
        logits, _ = model(generated, None)
        next_logits = logits[:, -1, :].float()

        if temperature <= 0:
            next_token = torch.argmax(next_logits, dim=-1, keepdim=True)
        else:
            next_logits = next_logits / max(temperature, 1e-8)

            if repetition_penalty != 1.0:
                used_tokens = torch.unique(generated[0])
                token_scores = next_logits[0, used_tokens]
                next_logits[0, used_tokens] = torch.where(
                    token_scores > 0,
                    token_scores / repetition_penalty,
                    token_scores * repetition_penalty,
                )

            if top_k > 0:
                values, _ = torch.topk(next_logits, min(top_k, next_logits.size(-1)))
                next_logits[next_logits < values[:, [-1]]] = -float("inf")

            if top_p < 1.0:
                sorted_logits, sorted_indices = torch.sort(next_logits, descending=True)
                cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

                remove_mask = cumulative_probs > top_p
                remove_mask[..., 1:] = remove_mask[..., :-1].clone()
                remove_mask[..., 0] = False

                full_mask = torch.zeros_like(next_logits, dtype=torch.bool)
                full_mask.scatter_(1, sorted_indices, remove_mask)
                next_logits[full_mask] = -float("inf")

            probs = F.softmax(next_logits, dim=-1)
            next_token = torch.multinomial(probs, num_samples=1)

        generated = torch.cat([generated, next_token], dim=-1)

        if eos_id is not None and next_token.item() == eos_id:
            break

        if generated.shape[1] > config["max_seq_len"]:
            generated = generated[:, -config["max_seq_len"]:]

    return tokenizer.decode(generated[0], skip_special_tokens=True)


prompt = """Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:
šarrum bītam rabiam ana ilim ibni.
"""

print(generate_text(prompt))
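
To make the filtering steps in the sampling loop above concrete, here is a minimal pure-Python sketch on a toy five-token distribution. It follows the same order as generate_text (top-k first, then top-p, keeping the token that crosses the threshold). The helper name filter_top_k_top_p and the toy logits are assumptions made for illustration and are not part of the model code. Unlike the tensor version, this sketch keeps exactly k tokens when scores are tied.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(logits, top_k=0, top_p=1.0):
    """Return the token indices that survive top-k, then top-p filtering.

    Hypothetical helper for illustration only; it works on a plain list
    instead of a tensor.
    """
    # Rank token indices from most to least likely.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Top-k: keep only the k highest-scoring tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]

    # Top-p: renormalize over the survivors, then keep tokens until the
    # cumulative probability first exceeds top_p (that token is kept too).
    if top_p < 1.0:
        probs = softmax([logits[i] for i in order])
        kept, cumulative = [], 0.0
        for index, p in zip(order, probs):
            kept.append(index)
            cumulative += p
            if cumulative > top_p:
                break
        order = kept

    return set(order)


toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up scores
print(sorted(filter_top_k_top_p(toy_logits, top_k=3, top_p=0.88)))
print(sorted(filter_top_k_top_p(toy_logits, top_k=3, top_p=0.80)))
```

Lowering top_p shrinks the candidate set, which is one reason smaller values tend to give more conservative output.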

Gradio Demo

import os
import sys
import json
import torch
import torch.nn.functional as F
import gradio as gr

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "AlgoDriveAI/Akkadian_English_DenseLLM_1B"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

print("Downloading model files...")
modeling_path = hf_hub_download(repo_id=repo_id, filename="modeling_dense_llm.py")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")

sys.path.insert(0, os.path.dirname(modeling_path))

from modeling_dense_llm import DenseLLM, ModelConfig

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)

model_cfg = ModelConfig(**config)

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.is_bf16_supported() else (
    torch.float16 if device == "cuda" else torch.float32
)

print(f"Loading model on {device} with dtype={dtype}...")
model = DenseLLM(model_cfg, use_gradient_checkpointing=False).to(device=device, dtype=dtype)

state_dict = torch.load(weights_path, map_location="cpu")
model.load_state_dict(state_dict, strict=True)
model.eval()

print("Model ready!")


@torch.inference_mode()
def stream_generate(
    prompt: str,
    max_new_tokens: int = 256,
    temperature: float = 0.55,
    top_k: int = 35,
    top_p: float = 0.88,
    repetition_penalty: float = 1.1,
):
    if not prompt.strip():
        yield ""
        return

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    generated = input_ids.clone()

    eos_id = getattr(tokenizer, "doc_eos_token_id", None)
    if eos_id is None:
        eos_id = tokenizer.eos_token_id

    for _ in range(max_new_tokens):
        logits, _ = model(generated, None)
        next_logits = logits[:, -1, :].float()

        if temperature <= 0:
            next_token = torch.argmax(next_logits, dim=-1, keepdim=True)
        else:
            next_logits = next_logits / max(temperature, 1e-8)

            if repetition_penalty != 1.0:
                used_tokens = torch.unique(generated[0])
                token_scores = next_logits[0, used_tokens]
                next_logits[0, used_tokens] = torch.where(
                    token_scores > 0,
                    token_scores / repetition_penalty,
                    token_scores * repetition_penalty,
                )

            if top_k > 0:
                values, _ = torch.topk(next_logits, min(top_k, next_logits.size(-1)))
                next_logits[next_logits < values[:, [-1]]] = -float("inf")

            if top_p < 1.0:
                sorted_logits, sorted_indices = torch.sort(next_logits, descending=True)
                cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)

                remove_mask = cumulative_probs > top_p
                remove_mask[..., 1:] = remove_mask[..., :-1].clone()
                remove_mask[..., 0] = False

                full_mask = torch.zeros_like(next_logits, dtype=torch.bool)
                full_mask.scatter_(1, sorted_indices, remove_mask)
                next_logits[full_mask] = -float("inf")

            probs = F.softmax(next_logits, dim=-1)
            next_token = torch.multinomial(probs, num_samples=1)

        generated = torch.cat([generated, next_token], dim=-1)

        if eos_id is not None and next_token.item() == eos_id:
            break

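        if generated.shape[1] > config["max_seq_len"]:
            # Trim the oldest tokens and shrink the prompt slice by the same
            # amount, so the decode below still starts at the first new token.
            overflow = generated.shape[1] - config["max_seq_len"]
            generated = generated[:, overflow:]
            input_ids = input_ids[:, overflow:]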

        decoded = tokenizer.decode(
            generated[0, input_ids.shape[1]:],
            skip_special_tokens=True,
            clean_up_tokenization_spaces=False,
        )
        yield decoded


def respond(prompt, max_tokens, temperature, top_k, top_p, repetition_penalty):
    # Slider values may arrive as floats; range() and torch.topk need ints.
    yield from stream_generate(
        prompt=prompt,
        max_new_tokens=int(max_tokens),
        temperature=temperature,
        top_k=int(top_k),
        top_p=top_p,
        repetition_penalty=repetition_penalty,
    )


with gr.Blocks(
    title="Akkadian-English DenseLLM 1B",
    theme=gr.themes.Soft(),
) as demo:
    gr.Markdown(
        "# Akkadian-English DenseLLM 1B\n"
        "*AlgoDriveAI — custom DenseLLM architecture for Akkadian / Old Babylonian translation experiments*"
    )

    with gr.Row():
        with gr.Column(scale=3):
            prompt_box = gr.Textbox(
                label="Prompt",
                placeholder="Translate the following Akkadian transliteration into English...",
                lines=5,
                value="Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:\nšarrum bītam rabiam ana ilim ibni.",
            )
            output_box = gr.Textbox(
                label="Output",
                lines=14,
                interactive=False,
            )
            generate_btn = gr.Button("Generate", variant="primary")

        with gr.Column(scale=1):
            max_tokens = gr.Slider(16, 512, value=256, step=1, label="Max new tokens")
            temperature = gr.Slider(0.0, 2.0, value=0.55, step=0.05, label="Temperature")
            top_k = gr.Slider(0, 100, value=35, step=1, label="Top-K")
            top_p = gr.Slider(0.0, 1.0, value=0.88, step=0.01, label="Top-P")
            repetition_penalty = gr.Slider(1.0, 1.5, value=1.1, step=0.01, label="Repetition penalty")

    generate_btn.click(
        fn=respond,
        inputs=[prompt_box, max_tokens, temperature, top_k, top_p, repetition_penalty],
        outputs=output_box,
    )

    prompt_box.submit(
        fn=respond,
        inputs=[prompt_box, max_tokens, temperature, top_k, top_p, repetition_penalty],
        outputs=output_box,
    )

demo.queue()
demo.launch(server_name="0.0.0.0", server_port=7860, share=False)

Example Prompts

Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:
šarrum bītam rabiam ana ilim ibni.

Translate the following Akkadian transliteration into English. Include grammatical notes:
ṭupšarrum awātim ina ṭuppim išṭur.

Translate the following Old Babylonian-style Akkadian transliteration into English. Explain the case endings if possible:
awīlum dannum abul ālim ina mūšim iṣṣur.

Translate the following Akkadian transliteration into English. If uncertain, explain the possible alternatives:
lū ilum šarram u ālam liṣṣur.
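
The example prompts above all share one shape: an instruction line followed by the transliteration on its own line. A small helper along these lines could keep that shape consistent across runs. The task keys and the build_prompt name are hypothetical choices for this sketch; only the instruction wording comes from the examples.

```python
# Instruction lines copied from the example prompts; the dictionary keys are
# hypothetical labels chosen for this sketch.
PROMPT_TEMPLATES = {
    "gloss": (
        "Translate the following Akkadian transliteration into English. "
        "Include a literal word-by-word gloss:"
    ),
    "grammar": (
        "Translate the following Akkadian transliteration into English. "
        "Include grammatical notes:"
    ),
    "cases": (
        "Translate the following Old Babylonian-style Akkadian transliteration "
        "into English. Explain the case endings if possible:"
    ),
    "alternatives": (
        "Translate the following Akkadian transliteration into English. "
        "If uncertain, explain the possible alternatives:"
    ),
}


def build_prompt(task: str, transliteration: str) -> str:
    """Join an instruction line and a transliteration, one per line."""
    if task not in PROMPT_TEMPLATES:
        raise KeyError(f"unknown task {task!r}; expected one of {sorted(PROMPT_TEMPLATES)}")
    return f"{PROMPT_TEMPLATES[task]}\n{transliteration.strip()}\n"


print(build_prompt("gloss", "šarrum bītam rabiam ana ilim ibni."))
```

The returned string can be passed directly to generate_text or pasted into the Gradio prompt box.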

Architecture

| Component | Details |
|---|---|
| Type | Custom Dense Transformer / DenseLLM |
| Parameters | Approximately 1B |
| Attention | MLA-style attention |
| Positional Encoding | RoPE |
| Activation | SwiGLU |
| Normalization | RMSNorm |
| Task Type | Causal Language Modeling |

| Hyperparameter | Value |
|---|---|
| d_model | 1536 |
| n_layers | 36 |
| n_heads | 12 |
| q_lora_rank | 768 |
| kv_lora_rank | 384 |
| qk_nope_head_dim | 64 |
| qk_rope_head_dim | 64 |
| v_head_dim | 128 |
| ff_hidden_mult | 3.5 |
| max_seq_len | 4096 |
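
As a rough sanity check on the "approximately 1B" figure, the hyperparameters above can be turned into a back-of-the-envelope weight count. This is only a sketch: it assumes DeepSeek-V2-style MLA projection shapes, a three-matrix SwiGLU feed-forward block with hidden size d_model × ff_hidden_mult, no biases, and it ignores norms and embeddings. The actual layout in modeling_dense_llm.py may differ.

```python
# Hyperparameters copied from the architecture listing.
d_model = 1536
n_layers = 36
n_heads = 12
q_lora_rank = 768
kv_lora_rank = 384
qk_nope_head_dim = 64
qk_rope_head_dim = 64
v_head_dim = 128
ff_hidden_mult = 3.5

# ASSUMPTION: DeepSeek-V2-style MLA projection shapes, no bias terms.
qk_head_dim = qk_nope_head_dim + qk_rope_head_dim
attention_params = (
    d_model * q_lora_rank                          # query down-projection
    + q_lora_rank * n_heads * qk_head_dim          # query up-projection
    + d_model * (kv_lora_rank + qk_rope_head_dim)  # KV down-projection + shared RoPE key
    + kv_lora_rank * n_heads * (qk_nope_head_dim + v_head_dim)  # KV up-projection
    + n_heads * v_head_dim * d_model               # output projection
)

# ASSUMPTION: SwiGLU with gate, up, and down matrices.
ff_hidden = int(d_model * ff_hidden_mult)
ffn_params = 3 * d_model * ff_hidden

per_layer = attention_params + ffn_params
total_without_embeddings = n_layers * per_layer

print(f"attention per layer:    {attention_params:,}")
print(f"feed-forward per layer: {ffn_params:,}")
print(f"all layers:             {total_without_embeddings:,}")
```

Under these assumptions the transformer layers alone hold about 1.12B weights, which is in line with the stated size. Token embeddings add more, depending on the vocabulary size, which is not listed here.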

Training Focus

This version focuses on:

  • English
  • Akkadian
  • Old Babylonian-style normalized transliteration
  • Translation examples
  • Literal glosses
  • Grammatical explanations
  • Prompt-following examples for Akkadian translation

Unlike the earlier mixed-language version, this release intentionally does not center Sanskrit as a training target.

Known Limitations

  • Not a scholarly authority: Outputs should be checked against reliable Akkadian grammars, dictionaries, and corpora.
  • Hallucinated forms: The model may invent plausible-looking Akkadian words or endings.
  • Translation uncertainty: Akkadian is highly context-dependent, and short isolated sentences may have multiple possible readings.
  • Inconsistent transliteration: The model may mix normalized forms, ASCII approximations, or sign-like conventions.
  • Repetition: Long generations may become repetitive.
  • Prompt sensitivity: The model may behave differently depending on how explicitly the prompt is written.
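
The repetition issue in the list above can be screened for automatically before a human reviews the output. The sketch below measures how many word trigrams in a generation duplicate an earlier trigram. The function name and the choice of trigrams are assumptions for illustration; any cutoff for "too repetitive" would need to be tuned on real outputs.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that duplicate an earlier n-gram in the text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence after the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


looping = "šarrum bītam ibni šarrum bītam ibni šarrum bītam ibni"
clean = "šarrum bītam rabiam ana ilim ibni."

print(round(repeated_ngram_ratio(looping), 2))
print(round(repeated_ngram_ratio(clean), 2))
```

A high ratio does not prove the output is wrong, and a low ratio does not prove it is right; the score only flags generations worth a closer look.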

Recommended Generation Settings

| Setting | Suggested Value |
|---|---|
| temperature | 0.3–0.7 |
| top_k | 30–50 |
| top_p | 0.75–0.9 |
| repetition_penalty | 1.05–1.15 |
| max_new_tokens | 100–300 |

For translation tasks, lower temperature usually gives more stable output.
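
The effect of temperature can be seen on a toy example. The sketch below divides made-up logits by the temperature before the softmax, as the sampling code does, and prints the resulting probabilities. The logits and the helper name are illustrative assumptions, not values taken from the model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.3, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature drops, more probability mass moves onto the top-scoring token, so repeated runs of the same prompt diverge less.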

Feedback Welcome

Feedback is especially useful on:

  • Akkadian translation accuracy
  • Gloss quality
  • Verb parsing
  • Case ending interpretation
  • Old Babylonian-style grammar
  • Failure cases where the model produces plausible but wrong analysis

Contact

Organization: AlgoDriveAI
Author: Christopher Smith
Base / previous version: AlgoDriveAI/Sanskrit_Akkadian_LLM

Citation

@misc{algodrive2026akkadian_english_dense_1b,
  author = {{AlgoDriveAI} and Smith, Christopher},
  title = {Akkadian-English DenseLLM 1B},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/AlgoDriveAI/Akkadian_English_DenseLLM_1B}
}

License

MIT

