MagistrTheOne committed on Oct 9, 2025

Commit 248e1b1 (verified) · 1 parent: cd4cebf

Update MagistrTheOne/RadonSAI-Small with safetensors weights

Files changed (9)

README.md +42 -124
config.json +35 -14
config.yaml +9 -0
generation_config.json +1 -1
model.safetensors +2 -2
model_card.yaml +25 -0
special_tokens_map.json +3 -21
tokenizer.json +0 -0
tokenizer_config.json +2 -5

README.md CHANGED

@@ -1,126 +1,44 @@
----
-license: apache-2.0
-language:
-- ru
-- en
-tags:
-- mistral
-- russian
-- english
-- code
-- machine-learning
-- nlp
-- transformer
-- gpt2
-- small-model
-pipeline_tag: text-generation
-model-index:
-- name: RadonSAI-Small
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      type: custom
-      name: RADON Datasets
-    metrics:
-    - type: perplexity
-      value: "TBD"
-      name: Perplexity
-size_categories: 22M
----
-# RadonSAI-Small - 22M Parameter GPT2-based Russian-English Transformer
-## Model Description
-RadonSAI-Small is a 22M parameter transformer model based on GPT2 architecture, optimized for Russian-English machine learning applications and development/testing purposes.
-### Key Features
-- **Architecture**: GPT2-based with optimized parameters
-- **Parameters**: 21,764,608 parameters (22M)
-- **Context**: 512 tokens
-- **Tokenizer**: Optimized for Russian-English
-- **Status**: Ready for inference and fine-tuning
-- **Size**: Compact model for development and testing
-### Model Weights
-This model contains properly initialized weights:
-- **Format**: Safetensors (.safetensors) + PyTorch (.bin)
-- **Dtype**: float32
-- **Initialization**: Random weights
-- **Size**: 86MB (22M parameters)
-- **Status**: Ready for inference and fine-tuning
-### Usage
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-# Load RadonSAI-Small
-model = AutoModelForCausalLM.from_pretrained("MagistrTheOne/RadonSAI-Small")
-tokenizer = AutoTokenizer.from_pretrained("MagistrTheOne/RadonSAI-Small")
-# Generate text
-prompt = "Машинное обучение - это"
-inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(
-    **inputs,
-    max_length=100,
-    temperature=0.7,
-    do_sample=True,
-    pad_token_id=tokenizer.eos_token_id
-)
-result = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(result)
-```
-### Model Architecture
 ```
-RadonSAI-Small:
-- Hidden size: 256
-- Layers: 6
-- Attention heads: 8
-- Intermediate size: 1,024
-- Vocabulary: 32,000
-- Context window: 512 tokens
-- Architecture: GPT2LMHeadModel
-```
-### Performance
-- **Speed**: Fast inference on CPU/GPU
-- **Memory**: 86MB memory usage
-- **Quality**: Development/testing model
-- **Languages**: English + Russian support
-### Use Cases
-- **Development**: Quick prototyping and testing
-- **Learning**: Educational purposes
-- **Experimentation**: Model architecture research
-- **Resource-constrained**: Low-memory environments
-### Citation
-```bibtex
-@misc{radonsaismall2025,
-  title={RadonSAI-Small: 22M Parameter GPT2-based Russian-English Transformer},
-  author={MagistrTheOne},
-  year={2025},
-  url={https://huggingface.co/MagistrTheOne/RadonSAI-Small}
-}
-```
-### License
-Apache 2.0 License
-### Contact
-- GitHub: [MagistrTheOne/Radon2BMistral](https://github.com/MagistrTheOne/Radon2BMistral)
-- Hugging Face: [MagistrTheOne/RadonSAI-Small](https://huggingface.co/MagistrTheOne/RadonSAI-Small)

+# RadonSAI-Small
+## Overview
+RadonSAI-Small is a lightweight variant of the Radon model family, based on the GPT-2 architecture.
+## Source Model
+- **Source**: gpt2
+- **Model Class**: GPT2LMHeadModel
+- **Parameters**: 124M (actual size from source)
+- **Architecture**: GPT-2 Small
+## Artifacts
+- `model.safetensors` - Model weights in safetensors format (~480MB)
+- `tokenizer.json` - Tokenizer configuration
+- `tokenizer_config.json` - Tokenizer metadata
+- `vocab.json` - Vocabulary file
+- `merges.txt` - BPE merge rules
+- `config.json` - Model configuration (normalized)
+## How to Verify
+```bash
+# Run inference test
+python3 tests/test_inference_small.py
 ```
+## Conversion Steps
+1. Download gpt2 from Hugging Face
+2. Convert weights to safetensors format
+3. Save tokenizer files
+4. Normalize config JSON with correct architectures and model_type
+5. Validate with inference test
+## Notes
+- This variant uses the original parameter count of the source model (124M)
+- Target label suggests 21M parameters, but actual size is 124M from gpt2
+- To achieve the target 21M parameters, consider:
+  - Knowledge distillation from a larger model
+  - Pruning techniques
+  - Training from scratch with reduced architecture
+## File Sizes
+- Total folder size: ~500MB
+- Model weights: ~480MB
+- Tokenizer files: ~20MB
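
The conversion steps listed in the new README could be sketched roughly as follows. This is only an illustration, not the author's actual script: the function name `convert_to_safetensors`, the output directory, and the smoke-test prompt are assumptions, and it presumes `transformers` and `torch` are installed and the `gpt2` checkpoint can be downloaded.

```python
from pathlib import Path


def convert_to_safetensors(source: str = "gpt2", out_dir: str = "RadonSAI-Small") -> str:
    """Download `source`, re-save it as safetensors, and run a short generation."""
    # Imported lazily so that defining the function needs no heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    out = Path(out_dir)  # hypothetical output folder
    out.mkdir(parents=True, exist_ok=True)

    # Step 1: download the source checkpoint and its tokenizer.
    model = AutoModelForCausalLM.from_pretrained(source)
    tokenizer = AutoTokenizer.from_pretrained(source)

    # Steps 2-4: write model.safetensors, config.json, and the tokenizer files.
    model.save_pretrained(out, safe_serialization=True)
    tokenizer.save_pretrained(out)

    # Step 5: reload from disk and generate a few tokens as a smoke test.
    reloaded = AutoModelForCausalLM.from_pretrained(out)
    inputs = tokenizer("Hello, world", return_tensors="pt")
    output_ids = reloaded.generate(
        **inputs, max_new_tokens=10, pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `convert_to_safetensors()` would download roughly 500MB of weights, so it is best run once in a scratch directory.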

config.json CHANGED

@@ -1,17 +1,38 @@
 {
-  "model_name": "radon",
-  "model_type": "gpt2",
-  "vocab_size": 32000,
-  "hidden_size": 256,
-  "num_layers": 6,
-  "num_attention_heads": 8,
-  "intermediate_size": 1024,
-  "max_position_embeddings": 512,
-  "dropout": 0.1,
-  "attention_dropout": 0.1,
-  "activation_function": "gelu",
-  "layer_norm_eps": 1e-05,
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "dtype": "float32",
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
   "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "transformers_version": "4.57.0",
   "use_cache": true,
-  "torch_dtype": "float32"
-}
+  "vocab_size": 50257
+}
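
As a sanity check on the "124M" figure, the parameter count implied by the new config.json can be worked out by hand. The sketch below assumes the standard GPT-2 layout (tied input/output embeddings, a fused QKV projection, and an MLP width of 4 × `n_embd` when `n_inner` is null); the function name is illustrative.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Count trainable parameters of a GPT-2 style model with tied embeddings."""
    d = n_embd
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d + n_positions * d
    # Fused QKV projection and attention output projection (weights + biases).
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer MLP with hidden width 4 * d (weights + biases).
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d)
    per_layer = attention + mlp + layer_norms
    # Final LayerNorm after the last block.
    final_norm = 2 * d
    return embeddings + n_layer * per_layer + final_norm


total = gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12)
print(f"{total:,}")  # 124,439,808
```

Note that `n_head` does not enter the count: it only changes how the `n_embd` channels are split across attention heads.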

config.yaml ADDED

@@ -0,0 +1,9 @@
+architecture: GPT2LMHeadModel
+conversion_date: '2025-01-09'
+format: safetensors
+max_position_embeddings: 1024
+model_name: RadonSAI-Small
+model_type: gpt2
+parameters: 124M
+source_model: gpt2
+vocab_size: 50257

generation_config.json CHANGED

@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 50256,
   "eos_token_id": 50256,
-  "transformers_version": "4.36.2"
+  "transformers_version": "4.57.0"
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a60b1808195c50cee30d3cfaf84aab26feb751225939e0a2da8f36af23eb7f5
-size 42515920
+oid sha256:c7d00560d8910fbed77ffad4065dee5011c41ba401b1064e749c498ba9e20373
+size 497774208
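
The LFS pointer records only a hash and a byte size, but the size is enough to cross-check the parameter count. The helper below is a hypothetical sketch that parses the new pointer text from this diff; it assumes every weight is stored as a 4-byte float32, as the `dtype` entry in config.json suggests.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its space-separated key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c7d00560d8910fbed77ffad4065dee5011c41ba401b1064e749c498ba9e20373
size 497774208
"""

pointer = parse_lfs_pointer(NEW_POINTER)
size_bytes = int(pointer["size"])
approx_params = size_bytes // 4  # assumption: 4 bytes per float32 weight

print(f"{size_bytes / 1e6:.1f} MB, {size_bytes / 2**20:.1f} MiB")
print(f"about {approx_params / 1e6:.1f}M float32 values")
```

The estimate lands slightly above 124M, which would be consistent with the file also carrying a small metadata header in addition to the raw tensors.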

model_card.yaml ADDED

@@ -0,0 +1,25 @@
+base_model: gpt2
+inference:
+  parameters:
+    do_sample: true
+    max_new_tokens: 256
+    temperature: 0.7
+    top_p: 0.9
+language:
+- en
+- ru
+library_name: transformers
+license: apache-2.0
+model_type: gpt2
+pipeline_tag: text-generation
+tags:
+- safetensors
+- text-generation
+- conversational
+- machine-learning
+- nlp
+- transformer
+- russian
+- english
+- small-model
+- gpt2
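
For readers unfamiliar with the `temperature` and `top_p` values set under `inference.parameters`, the following pure-Python sketch shows what the two knobs do to a next-token distribution. It illustrates the general technique only and is not the code path `transformers` runs; the function names and the toy inputs are assumptions.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Turn logits into probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs: list[float], top_p: float) -> dict[int, float]:
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalise so the surviving probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


sharpened = softmax([2.0, 1.0, 0.0], temperature=0.7)
nucleus = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9)
print(sorted(nucleus))  # [0, 1, 2]
```

With `top_p=0.9` the least likely token in the toy distribution is dropped before sampling, and with `temperature=0.7` the most likely token gains probability relative to `temperature=1.0`.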

special_tokens_map.json CHANGED

@@ -1,23 +1,5 @@
 {
-  "bos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "unk_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
 }

tokenizer.json CHANGED

The diff for this file is too large to render. See raw diff

tokenizer_config.json CHANGED

@@ -1,5 +1,4 @@
 {
-  "add_bos_token": false,
   "add_prefix_space": false,
   "added_tokens_decoder": {
     "50256": {
@@ -12,12 +11,10 @@
     }
   },
   "bos_token": "<|endoftext|>",
-  "chat_template": "{% for message in messages %}{{ message.content }}{{ eos_token }}{% endfor %}",
-  "clean_up_tokenization_spaces": true,
+  "clean_up_tokenization_spaces": false,
   "eos_token": "<|endoftext|>",
-  "errors": "replace",
+  "extra_special_tokens": {},
   "model_max_length": 1024,
-  "pad_token": null,
   "tokenizer_class": "GPT2Tokenizer",
   "unk_token": "<|endoftext|>"
 }
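
This change drops the `chat_template` entry, so code that passes chat-style messages through the tokenizer may need to supply a template of its own. For reference, the removed Jinja template simply concatenated each message's content followed by the EOS token and ignored the roles. A plain-Python equivalent (the function name is illustrative) looks like this:

```python
EOS_TOKEN = "<|endoftext|>"


def render_removed_template(messages: list[dict], eos_token: str = EOS_TOKEN) -> str:
    """Mimic the removed template: each message's content followed by the EOS token."""
    # The "role" key is deliberately unused, matching the original template.
    return "".join(message["content"] + eos_token for message in messages)


messages = [{"role": "user", "content": "Who are you?"}]
print(render_removed_template(messages))  # Who are you?<|endoftext|>
```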