Libraries

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Brain2nd/NeuronSpark-V4-1.16B-Pretrain", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Brain2nd/NeuronSpark-V4-1.16B-Pretrain", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Brain2nd/NeuronSpark-V4-1.16B-Pretrain"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Brain2nd/NeuronSpark-V4-1.16B-Pretrain

SGLang

How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Brain2nd/NeuronSpark-V4-1.16B-Pretrain" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Brain2nd/NeuronSpark-V4-1.16B-Pretrain" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Brain2nd/NeuronSpark-V4-1.16B-Pretrain",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
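The curl calls to the vLLM server (port 8000) and the SGLang server (port 30000) send the same OpenAI-compatible request body, so a single small Python client can cover both. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat_request` are illustrative and are not part of vLLM or SGLang, and the sketch assumes one of the servers above is already running locally.

```python
import json
import urllib.request

MODEL_ID = "Brain2nd/NeuronSpark-V4-1.16B-Pretrain"


def build_chat_request(base_url: str, prompt: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build the same POST request that the curl examples send (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response. Requires a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; only send_chat_request does.
request = build_chat_request("http://localhost:8000", "What is the capital of France?")
print(request.full_url)
```

To target the SGLang server instead, pass `"http://localhost:30000"` as `base_url`, then call `send_chat_request(request)` once the server is up.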

Docker Model Runner
How to use Brain2nd/NeuronSpark-V4-1.16B-Pretrain with Docker Model Runner:
```
docker model run hf.co/Brain2nd/NeuronSpark-V4-1.16B-Pretrain
```

NeuronSpark-V4-1.16B-Pretrain

NeuronSpark V4 autoregressive pretraining checkpoint.

This repository contains a complete training checkpoint for continued pretraining (model weights plus the saved training state and DeepSpeed checkpoint state), not just inference weights.

Checkpoint

Architecture: NeuronSpark V4 causal language model
Scale: 1.16B parameters
Checkpoint step: 10500
Tokens seen: 2,063,372,760 supervised tokens
Sequence length: 2048
Training mode: autoregressive pretraining
Optimizer: Muon + Adam + Lion
DeepSpeed: ZeRO-0
Precision: bf16 training path
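
As a sanity check on the numbers above, the token count can be compared with the number of token positions processed in 10500 steps. This is a rough sketch that assumes the run used the settings from the Continue Training command (8 GPUs, `--batch_size 12`, `--accumulation_steps 1`, `--max_length 2048`) and that `--batch_size` is a per-GPU value. Neither assumption is stated in this card.

```python
# Values taken from the Checkpoint list.
CHECKPOINT_STEP = 10_500
TOKENS_SEEN = 2_063_372_760

# Assumed values, borrowed from the Continue Training command.
NUM_GPUS = 8
BATCH_SIZE_PER_GPU = 12  # assumption: --batch_size is per GPU
ACCUMULATION_STEPS = 1
SEQUENCE_LENGTH = 2048


def token_positions_per_step(num_gpus: int, batch_size: int, accumulation: int, seq_len: int) -> int:
    """Token positions processed in one optimizer step across all GPUs."""
    return num_gpus * batch_size * accumulation * seq_len


per_step = token_positions_per_step(NUM_GPUS, BATCH_SIZE_PER_GPU, ACCUMULATION_STEPS, SEQUENCE_LENGTH)
upper_bound = per_step * CHECKPOINT_STEP
supervised_fraction = TOKENS_SEEN / upper_bound

print(per_step)      # 196608
print(upper_bound)   # 2064384000
print(f"{supervised_fraction:.4f}")
```

Under these assumptions the reported supervised token count sits just below the upper bound, which would be consistent with a small share of positions (for example padding) not contributing to the loss.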

Included Files

model.safetensors: Hugging Face model weights for loading/evaluation.
config.json, configuration_neuronspark.py, modeling_neuronspark.py: self-contained custom model code/config.
tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets.
training_state.pth: saved training step and token counter.
deepspeed/: DeepSpeed checkpoint state for continued training.
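
Before resuming training, it can help to confirm that a local copy of the repository contains everything listed above. The sketch below checks only for the presence of each entry, not its contents. The function name `missing_entries` is hypothetical, and the internal layout of `deepspeed/` is not inspected because it is not described here.

```python
from pathlib import Path

# File and directory names as listed under Included Files.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "configuration_neuronspark.py",
    "modeling_neuronspark.py",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
    "training_state.pth",
]
EXPECTED_DIRS = ["deepspeed"]


def missing_entries(checkpoint_dir: str) -> list[str]:
    """Return the expected files and directories that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty return value means every listed entry is present; anything else names what still needs to be downloaded.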

Continue Training

Download or snapshot this repository, then resume with the original training script:

deepspeed --num_gpus=8 train_pretrain.py \
  --config_json configs/smoke_1p16b.json \
  --data_path <pretokenized_data_dir> \
  --tokenizer_path tokenizer_v3 \
  --out_dir <new_output_dir> \
  --deepspeed_config configs/ds_zero0_v4.json \
  --max_length 2048 \
  --batch_size 12 \
  --accumulation_steps 1 \
  --optimizer muon_adam_lion \
  --learning_rate 2e-4 \
  --muon_lr 0.005 \
  --lion_lr 1e-4 \
  --warmup_iters 500 \
  --grad_clip 0.5 \
  --resume <downloaded_checkpoint_dir>
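
The download and resume steps can also be scripted. The following is a sketch under some assumptions: it uses `snapshot_download` from the `huggingface_hub` package to fetch the repository, and a hypothetical helper `build_resume_command` to assemble the same arguments as the command above. The paths `ckpt`, `data`, and `out` are placeholders, and `train_pretrain.py` and the `configs/` files must come from the original training code, which is not part of this repository listing.

```python
import shlex

REPO_ID = "Brain2nd/NeuronSpark-V4-1.16B-Pretrain"


def download_checkpoint(local_dir: str) -> str:
    """Download the whole repository, including deepspeed/ state, into local_dir."""
    # Imported lazily so the rest of this sketch runs without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


def build_resume_command(checkpoint_dir: str, data_path: str, out_dir: str, num_gpus: int = 8) -> list[str]:
    """Assemble the resume command shown above as an argument list (hypothetical helper)."""
    return [
        "deepspeed", f"--num_gpus={num_gpus}", "train_pretrain.py",
        "--config_json", "configs/smoke_1p16b.json",
        "--data_path", data_path,
        "--tokenizer_path", "tokenizer_v3",
        "--out_dir", out_dir,
        "--deepspeed_config", "configs/ds_zero0_v4.json",
        "--max_length", "2048",
        "--batch_size", "12",
        "--accumulation_steps", "1",
        "--optimizer", "muon_adam_lion",
        "--learning_rate", "2e-4",
        "--muon_lr", "0.005",
        "--lion_lr", "1e-4",
        "--warmup_iters", "500",
        "--grad_clip", "0.5",
        "--resume", checkpoint_dir,
    ]


# Print the command for inspection; nothing is downloaded or launched here.
command = build_resume_command("ckpt", "data", "out")
print(shlex.join(command))
```

To run it for real, call `download_checkpoint("ckpt")` first and pass the resulting argument list to `subprocess.run(command, check=True)` from the directory that holds the training script.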

Provenance

This is a V4 pretraining checkpoint from the current NeuronSpark V4 branch. It does not belong to the historical V2.5/V3 checkpoint family.
