Akkadian-English DenseLLM 1B
Akkadian-English DenseLLM 1B is an experimental custom DenseLLM model focused on English ↔ Akkadian / Old Babylonian translation, transliteration, glossing, and grammatical analysis.
This model is a continuation of the earlier AlgoDriveAI/Sanskrit_Akkadian_LLM project. The previous version mixed English, Sanskrit, Akkadian, and some auxiliary material. For this version, the model size was increased to approximately 1B parameters, and the training focus was narrowed to English and Akkadian only.
The main reason for this change is that early testing suggested the model is easier to evaluate and improve when language targets are separated. Instead of combining English, Akkadian, and Sanskrit in one testing version, this release focuses specifically on English/Akkadian behavior.
This version is intended to achieve higher accuracy on Akkadian-focused translation, glossing, and analysis tasks than the earlier mixed-language testing versions.
What Is This?
This is a research model for ancient-language experimentation, especially:
- Akkadian-to-English translation
- English-to-Akkadian style generation
- Old Babylonian-style normalized transliteration
- Literal word-by-word glossing
- Grammatical explanation
- Case-ending analysis
- Verb-form explanation
- Ancient-language prompt-following behavior
This is not a production translation system. Outputs should be checked against reliable Akkadian grammars, dictionaries, and primary sources.
Relationship to the Previous Version
Compared with the earlier Sanskrit + Akkadian model:
- The model size was increased to approximately 1B parameters
- Sanskrit was removed from the main training target
- The focus was narrowed to English/Akkadian
- The model is intended to achieve higher accuracy on Akkadian-specific tasks
- The architecture remains a custom DenseLLM-style causal language model
- Inference is handled with custom model code rather than standard AutoModelForCausalLM
Install
pip install torch transformers huggingface_hub einops gradio
Regular Python Inference
import os
import sys
import json

import torch
import torch.nn.functional as F
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "AlgoDriveAI/Akkadian_English_DenseLLM_1B"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

print("Downloading model files...")
modeling_path = hf_hub_download(repo_id=repo_id, filename="modeling_dense_llm.py")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")

# Make the downloaded custom model code importable.
sys.path.insert(0, os.path.dirname(modeling_path))
from modeling_dense_llm import DenseLLM, ModelConfig

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)
model_cfg = ModelConfig(**config)

# Prefer bfloat16 on GPUs that support it, then float16, then float32 on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.is_bf16_supported() else (
    torch.float16 if device == "cuda" else torch.float32
)

print(f"Loading model on {device} with dtype={dtype}...")
model = DenseLLM(model_cfg, use_gradient_checkpointing=False).to(device=device, dtype=dtype)
state_dict = torch.load(weights_path, map_location="cpu")
model.load_state_dict(state_dict, strict=True)
model.eval()
print("Model ready!")


@torch.inference_mode()
def generate_text(
    prompt: str,
    max_new_tokens: int = 256,
    temperature: float = 0.55,
    top_k: int = 35,
    top_p: float = 0.88,
    repetition_penalty: float = 1.1,
):
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    generated = input_ids.clone()

    # Use the tokenizer's doc_eos_token_id if it has one, else the regular EOS id.
    eos_id = getattr(tokenizer, "doc_eos_token_id", None)
    if eos_id is None:
        eos_id = tokenizer.eos_token_id

    for _ in range(max_new_tokens):
        logits, _ = model(generated, None)
        next_logits = logits[:, -1, :].float()

        if temperature <= 0:
            # Greedy decoding.
            next_token = torch.argmax(next_logits, dim=-1, keepdim=True)
        else:
            next_logits = next_logits / max(temperature, 1e-8)

            # Penalize tokens that already appear in the sequence.
            if repetition_penalty != 1.0:
                used_tokens = torch.unique(generated[0])
                token_scores = next_logits[0, used_tokens]
                next_logits[0, used_tokens] = torch.where(
                    token_scores > 0,
                    token_scores / repetition_penalty,
                    token_scores * repetition_penalty,
                )

            # Top-k: mask everything below the k-th largest logit.
            if top_k > 0:
                values, _ = torch.topk(next_logits, min(top_k, next_logits.size(-1)))
                next_logits[next_logits < values[:, [-1]]] = -float("inf")

            # Top-p: keep the smallest prefix whose cumulative probability exceeds top_p.
            if top_p < 1.0:
                sorted_logits, sorted_indices = torch.sort(next_logits, descending=True)
                cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
                remove_mask = cumulative_probs > top_p
                remove_mask[..., 1:] = remove_mask[..., :-1].clone()
                remove_mask[..., 0] = False
                full_mask = torch.zeros_like(next_logits, dtype=torch.bool)
                full_mask.scatter_(1, sorted_indices, remove_mask)
                next_logits[full_mask] = -float("inf")

            probs = F.softmax(next_logits, dim=-1)
            next_token = torch.multinomial(probs, num_samples=1)

        generated = torch.cat([generated, next_token], dim=-1)

        if eos_id is not None and next_token.item() == eos_id:
            break

        # Keep only the most recent max_seq_len tokens as model context.
        if generated.shape[1] > config["max_seq_len"]:
            generated = generated[:, -config["max_seq_len"]:]

    return tokenizer.decode(generated[0], skip_special_tokens=True)


prompt = """Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:
šarrum bītam rabiam ana ilim ibni.
"""
print(generate_text(prompt))
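
The top-k and top-p steps inside the sampling loop can be hard to follow in tensor form. The following is a minimal pure-Python sketch of the same filtering idea on a toy list of logits. The helper name `filter_top_k_top_p` and the toy values are illustrative assumptions, not part of the model code, and unlike the tensor version this sketch keeps exactly `top_k` entries when logits are tied.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(logits, top_k=35, top_p=0.88):
    """Return the token indices that survive top-k, then top-p filtering.

    Hypothetical helper for illustration only.
    """
    # Sort token indices from highest to lowest logit.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Top-k: keep only the k highest-scoring tokens.
    if top_k > 0:
        order = order[:top_k]

    # Top-p: keep tokens until the cumulative probability exceeds top_p.
    # The token that crosses the threshold is kept.
    if top_p < 1.0:
        probs = softmax([logits[i] for i in order])
        kept, cumulative = [], 0.0
        for idx, p in zip(order, probs):
            kept.append(idx)
            cumulative += p
            if cumulative > top_p:
                break
        order = kept

    return order


toy_logits = [4.0, 3.0, 2.0, 1.0, 0.0, -1.0]
kept = filter_top_k_top_p(toy_logits, top_k=4, top_p=0.9)
print("surviving token indices:", kept)
```

Sampling then draws the next token only from the surviving indices, which is why tighter `top_k` and `top_p` values tend to give more conservative output.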
Gradio Demo
import os
import sys
import json

import torch
import torch.nn.functional as F
import gradio as gr
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "AlgoDriveAI/Akkadian_English_DenseLLM_1B"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

print("Downloading model files...")
modeling_path = hf_hub_download(repo_id=repo_id, filename="modeling_dense_llm.py")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")

# Make the downloaded custom model code importable.
sys.path.insert(0, os.path.dirname(modeling_path))
from modeling_dense_llm import DenseLLM, ModelConfig

with open(config_path, "r", encoding="utf-8") as f:
    config = json.load(f)
model_cfg = ModelConfig(**config)

# Prefer bfloat16 on GPUs that support it, then float16, then float32 on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" and torch.cuda.is_bf16_supported() else (
    torch.float16 if device == "cuda" else torch.float32
)

print(f"Loading model on {device} with dtype={dtype}...")
model = DenseLLM(model_cfg, use_gradient_checkpointing=False).to(device=device, dtype=dtype)
state_dict = torch.load(weights_path, map_location="cpu")
model.load_state_dict(state_dict, strict=True)
model.eval()
print("Model ready!")


@torch.inference_mode()
def stream_generate(
    prompt: str,
    max_new_tokens: int = 256,
    temperature: float = 0.55,
    top_k: int = 35,
    top_p: float = 0.88,
    repetition_penalty: float = 1.1,
):
    if not prompt.strip():
        yield ""
        return

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    generated = input_ids.clone()
    # Number of prompt tokens still present at the start of `generated`.
    prompt_len = input_ids.shape[1]

    # Use the tokenizer's doc_eos_token_id if it has one, else the regular EOS id.
    eos_id = getattr(tokenizer, "doc_eos_token_id", None)
    if eos_id is None:
        eos_id = tokenizer.eos_token_id

    for _ in range(max_new_tokens):
        logits, _ = model(generated, None)
        next_logits = logits[:, -1, :].float()

        if temperature <= 0:
            # Greedy decoding.
            next_token = torch.argmax(next_logits, dim=-1, keepdim=True)
        else:
            next_logits = next_logits / max(temperature, 1e-8)

            # Penalize tokens that already appear in the sequence.
            if repetition_penalty != 1.0:
                used_tokens = torch.unique(generated[0])
                token_scores = next_logits[0, used_tokens]
                next_logits[0, used_tokens] = torch.where(
                    token_scores > 0,
                    token_scores / repetition_penalty,
                    token_scores * repetition_penalty,
                )

            # Top-k: mask everything below the k-th largest logit.
            if top_k > 0:
                values, _ = torch.topk(next_logits, min(top_k, next_logits.size(-1)))
                next_logits[next_logits < values[:, [-1]]] = -float("inf")

            # Top-p: keep the smallest prefix whose cumulative probability exceeds top_p.
            if top_p < 1.0:
                sorted_logits, sorted_indices = torch.sort(next_logits, descending=True)
                cumulative_probs = torch.cumsum(F.softmax(sorted_logits, dim=-1), dim=-1)
                remove_mask = cumulative_probs > top_p
                remove_mask[..., 1:] = remove_mask[..., :-1].clone()
                remove_mask[..., 0] = False
                full_mask = torch.zeros_like(next_logits, dtype=torch.bool)
                full_mask.scatter_(1, sorted_indices, remove_mask)
                next_logits[full_mask] = -float("inf")

            probs = F.softmax(next_logits, dim=-1)
            next_token = torch.multinomial(probs, num_samples=1)

        generated = torch.cat([generated, next_token], dim=-1)

        if eos_id is not None and next_token.item() == eos_id:
            break

        # Keep only the most recent max_seq_len tokens, and shift the prompt
        # boundary so the decoded slice below still starts after the prompt.
        overflow = generated.shape[1] - config["max_seq_len"]
        if overflow > 0:
            generated = generated[:, overflow:]
            prompt_len = max(0, prompt_len - overflow)

        # Yield everything generated so far, without the prompt.
        decoded = tokenizer.decode(
            generated[0, prompt_len:],
            skip_special_tokens=True,
            clean_up_tokenization_spaces=False,
        )
        yield decoded


def respond(prompt, max_tokens, temperature, top_k, top_p, repetition_penalty):
    # Slider values may arrive as floats, so cast the integer settings.
    for partial in stream_generate(
        prompt=prompt,
        max_new_tokens=int(max_tokens),
        temperature=temperature,
        top_k=int(top_k),
        top_p=top_p,
        repetition_penalty=repetition_penalty,
    ):
        yield partial


with gr.Blocks(
    title="Akkadian-English DenseLLM 1B",
    theme=gr.themes.Soft(),
) as demo:
    gr.Markdown(
        "# Akkadian-English DenseLLM 1B\n"
        "*AlgoDriveAI — custom DenseLLM architecture for Akkadian / Old Babylonian translation experiments*"
    )

    with gr.Row():
        with gr.Column(scale=3):
            prompt_box = gr.Textbox(
                label="Prompt",
                placeholder="Translate the following Akkadian transliteration into English...",
                lines=5,
                value="Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:\nšarrum bītam rabiam ana ilim ibni.",
            )
            output_box = gr.Textbox(
                label="Output",
                lines=14,
                interactive=False,
            )
            generate_btn = gr.Button("Generate", variant="primary")

        with gr.Column(scale=1):
            max_tokens = gr.Slider(16, 512, value=256, step=1, label="Max new tokens")
            temperature = gr.Slider(0.0, 2.0, value=0.55, step=0.05, label="Temperature")
            top_k = gr.Slider(0, 100, value=35, step=1, label="Top-K")
            top_p = gr.Slider(0.0, 1.0, value=0.88, step=0.01, label="Top-P")
            repetition_penalty = gr.Slider(1.0, 1.5, value=1.1, step=0.01, label="Repetition penalty")

    generate_btn.click(
        fn=respond,
        inputs=[prompt_box, max_tokens, temperature, top_k, top_p, repetition_penalty],
        outputs=output_box,
    )
    prompt_box.submit(
        fn=respond,
        inputs=[prompt_box, max_tokens, temperature, top_k, top_p, repetition_penalty],
        outputs=output_box,
    )

demo.queue()
demo.launch(server_name="0.0.0.0", server_port=7860, share=False)
Example Prompts
Translate the following Akkadian transliteration into English. Include a literal word-by-word gloss:
šarrum bītam rabiam ana ilim ibni.

Translate the following Akkadian transliteration into English. Include grammatical notes:
ṭupšarrum awātim ina ṭuppim išṭur.

Translate the following Old Babylonian-style Akkadian transliteration into English. Explain the case endings if possible:
awīlum dannum abul ālim ina mūšim iṣṣur.

Translate the following Akkadian transliteration into English. If uncertain, explain the possible alternatives:
lū ilum šarram u ālam liṣṣur.
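
Most of the prompts above share one shape: an instruction sentence, an optional extra request ending in a colon, then the transliteration on its own line. A small helper can keep that shape consistent across experiments. This is only a sketch: `build_prompt` and its parameters are hypothetical names, not part of the released code, and the model is not known to require this exact template.

```python
def build_prompt(
    transliteration: str,
    extra: str = "Include a literal word-by-word gloss",
) -> str:
    """Assemble a prompt in the same shape as the examples above.

    Hypothetical helper: instruction, extra request, colon, newline,
    then the transliteration followed by a trailing newline.
    """
    instruction = "Translate the following Akkadian transliteration into English."
    return f"{instruction} {extra}:\n{transliteration.strip()}\n"


gloss_prompt = build_prompt("šarrum bītam rabiam ana ilim ibni.")
notes_prompt = build_prompt(
    "ṭupšarrum awātim ina ṭuppim išṭur.",
    extra="Include grammatical notes",
)
print(gloss_prompt)
print(notes_prompt)
```

The resulting strings can be passed directly as the `prompt` argument of the inference function.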
Architecture
| Component | Details |
|---|---|
| Type | Custom Dense Transformer / DenseLLM |
| Parameters | Approximately 1B |
| Attention | MLA-style attention |
| Positional Encoding | RoPE |
| Activation | SwiGLU |
| Normalization | RMSNorm |
| Task Type | Causal Language Modeling |

| Hyperparameter | Value |
|---|---|
| d_model | 1536 |
| n_layers | 36 |
| n_heads | 12 |
| q_lora_rank | 768 |
| kv_lora_rank | 384 |
| qk_nope_head_dim | 64 |
| qk_rope_head_dim | 64 |
| v_head_dim | 128 |
| ff_hidden_mult | 3.5 |
| max_seq_len | 4096 |
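
As a rough sanity check, the hyperparameters above can be turned into an approximate parameter count. The sketch below assumes the attention projections follow a DeepSeek-V2-style MLA layout (low-rank query and key/value projections with a separate RoPE key slice) and that the feed-forward block is a three-matrix SwiGLU. Both are assumptions about `modeling_dense_llm.py`, not confirmed details. Embeddings, norms, and biases are left out because the vocabulary size is not listed here.

```python
# Values copied from the hyperparameter table.
d_model = 1536
n_layers = 36
n_heads = 12
q_lora_rank = 768
kv_lora_rank = 384
qk_nope_head_dim = 64
qk_rope_head_dim = 64
v_head_dim = 128
ff_hidden_mult = 3.5

# Assumed MLA-style attention projections (weights only, no biases).
q_down = d_model * q_lora_rank
q_up = q_lora_rank * n_heads * (qk_nope_head_dim + qk_rope_head_dim)
kv_down = d_model * (kv_lora_rank + qk_rope_head_dim)
kv_up = kv_lora_rank * n_heads * (qk_nope_head_dim + v_head_dim)
out_proj = n_heads * v_head_dim * d_model
attn_params = q_down + q_up + kv_down + kv_up + out_proj

# Assumed SwiGLU feed-forward: gate, up, and down matrices.
ff_hidden = int(ff_hidden_mult * d_model)
ffn_params = 3 * d_model * ff_hidden

per_layer = attn_params + ffn_params
total_blocks = per_layer * n_layers

print(f"attention per layer: {attn_params:,}")
print(f"feed-forward per layer: {ffn_params:,}")
print(f"all transformer blocks: {total_blocks:,}")
```

Under these assumptions the transformer blocks alone come to roughly 1.1B weights, which is in line with the "approximately 1B" figure. The true count depends on the actual layer layout and the vocabulary size.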
Training Focus
This version focuses on:
- English
- Akkadian
- Old Babylonian-style normalized transliteration
- Translation examples
- Literal glosses
- Grammatical explanations
- Prompt-following examples for Akkadian translation
Unlike the earlier mixed-language version, this release intentionally leaves Sanskrit out of the main training target.
Known Limitations
- Not a scholarly authority: Outputs should be checked against reliable Akkadian grammars, dictionaries, and corpora.
- Hallucinated forms: The model may invent plausible-looking Akkadian words or endings.
- Translation uncertainty: Akkadian is highly context-dependent, and short isolated sentences may have multiple possible readings.
- Inconsistent transliteration: The model may mix normalized forms, ASCII approximations, or sign-like conventions.
- Repetition: Long generations may become repetitive.
- Prompt sensitivity: The model may behave differently depending on how explicitly the prompt is written.
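
The repetition issue listed above is what the `repetition_penalty` setting in the inference code is meant to reduce. The following pure-Python sketch mirrors the rule used there on a toy list of logits: positive scores of already-used tokens are divided by the penalty and negative scores are multiplied by it, so both move toward "less likely". The function name and the toy numbers are illustrative assumptions.

```python
def apply_repetition_penalty(logits, used_token_ids, penalty=1.1):
    """Return a copy of `logits` with already-used tokens made less likely.

    Hypothetical helper mirroring the rule in the inference loop:
    positive logits are divided by the penalty, others are multiplied by it.
    """
    out = list(logits)
    for token_id in set(used_token_ids):
        if out[token_id] > 0:
            out[token_id] = out[token_id] / penalty
        else:
            out[token_id] = out[token_id] * penalty
    return out


toy_logits = [2.0, -2.0, 1.0]
# Tokens 0 and 1 have already been generated; token 2 has not.
penalized = apply_repetition_penalty(toy_logits, used_token_ids=[0, 1, 1], penalty=2.0)
print(penalized)
```

A penalty of 1.0 leaves the logits unchanged. Larger values suppress repeats more strongly, but can also suppress words that legitimately recur, such as common particles or case endings.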
Recommended Generation Settings
| Setting | Suggested Value |
|---|---|
| temperature | 0.3–0.7 |
| top_k | 30–50 |
| top_p | 0.75–0.9 |
| repetition_penalty | 1.05–1.15 |
| max_new_tokens | 100–300 |
For translation tasks, lower temperature usually gives more stable output.
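
The effect of temperature can be seen on a toy distribution. Dividing the logits by a temperature below 1 sharpens the softmax, so the top candidate takes more of the probability mass and sampling becomes more repeatable. The sketch below uses made-up logits and a hypothetical helper name. It illustrates the general mechanism and is not a measurement of this model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by `temperature` (must be > 0)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]
top_prob = {}
for t in (0.3, 0.7, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    top_prob[t] = probs[0]
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

The probability of the highest-scoring token grows as the temperature drops, which matches the suggestion to use values near the low end of the range for translation.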
Feedback Welcome
Feedback is especially useful on:
- Akkadian translation accuracy
- Gloss quality
- Verb parsing
- Case ending interpretation
- Old Babylonian-style grammar
- Failure cases where the model produces plausible but wrong analysis
Contact
- Organization: AlgoDriveAI
- Author: Christopher Smith
- Base / previous version: AlgoDriveAI/Sanskrit_Akkadian_LLM
Citation
@misc{algodrive2026akkadian_english_dense_1b,
  author    = {{AlgoDriveAI} and Smith, Christopher},
  title     = {Akkadian-English DenseLLM 1B},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/AlgoDriveAI/Akkadian_English_DenseLLM_1B}
}
License
MIT