How to use ondevicellm/tinyllama_moe_sft_routeraux_ep3 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="ondevicellm/tinyllama_moe_sft_routeraux_ep3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("ondevicellm/tinyllama_moe_sft_routeraux_ep3")
model = AutoModelForCausalLM.from_pretrained("ondevicellm/tinyllama_moe_sft_routeraux_ep3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use ondevicellm/tinyllama_moe_sft_routeraux_ep3 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ondevicellm/tinyllama_moe_sft_routeraux_ep3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ondevicellm/tinyllama_moe_sft_routeraux_ep3",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Use Docker:
docker model run hf.co/ondevicellm/tinyllama_moe_sft_routeraux_ep3
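The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style chat-completion JSON (`choices[0].message.content`), which this card does not show.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_message, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the response follows the OpenAI chat-completion schema.
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("http://localhost:8000",
#            "ondevicellm/tinyllama_moe_sft_routeraux_ep3",
#            "What is the capital of France?"))
```

The same helper should work against the SGLang server described in this card by passing `http://localhost:30000` as `base_url`, since both commands expose the same `/v1/chat/completions` path.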
How to use ondevicellm/tinyllama_moe_sft_routeraux_ep3 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "ondevicellm/tinyllama_moe_sft_routeraux_ep3" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ondevicellm/tinyllama_moe_sft_routeraux_ep3",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Use Docker to start the SGLang server:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "ondevicellm/tinyllama_moe_sft_routeraux_ep3" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ondevicellm/tinyllama_moe_sft_routeraux_ep3",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use ondevicellm/tinyllama_moe_sft_routeraux_ep3 with Docker Model Runner:
docker model run hf.co/ondevicellm/tinyllama_moe_sft_routeraux_ep3
This model is a fine-tuned version of ondevicellm/tinyllama_moe_v2 on the HuggingFaceH4/ultrachat_200k dataset. The last evaluation in the training results table (step 3400, epoch 2.98) reports a validation loss of 1.2954.
More information needed
The hyperparameters used during training are not listed. Training and validation loss, logged every 100 steps:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.7200 | 0.09 | 100 | 1.6775 |
| 1.4985 | 0.18 | 200 | 1.4900 |
| 1.4482 | 0.26 | 300 | 1.4473 |
| 1.4152 | 0.35 | 400 | 1.4215 |
| 1.3777 | 0.44 | 500 | 1.4031 |
| 1.3932 | 0.53 | 600 | 1.3886 |
| 1.3750 | 0.61 | 700 | 1.3762 |
| 1.3574 | 0.70 | 800 | 1.3657 |
| 1.3490 | 0.79 | 900 | 1.3563 |
| 1.3276 | 0.88 | 1000 | 1.3481 |
| 1.3491 | 0.96 | 1100 | 1.3409 |
| 1.2812 | 1.05 | 1200 | 1.3358 |
| 1.2831 | 1.14 | 1300 | 1.3308 |
| 1.2917 | 1.23 | 1400 | 1.3258 |
| 1.2812 | 1.31 | 1500 | 1.3219 |
| 1.2819 | 1.40 | 1600 | 1.3178 |
| 1.2756 | 1.49 | 1700 | 1.3145 |
| 1.2584 | 1.58 | 1800 | 1.3107 |
| 1.2806 | 1.66 | 1900 | 1.3083 |
| 1.2815 | 1.75 | 2000 | 1.3054 |
| 1.2676 | 1.84 | 2100 | 1.3031 |
| 1.2388 | 1.93 | 2200 | 1.3011 |
| 1.2385 | 2.01 | 2300 | 1.3015 |
| 1.2459 | 2.10 | 2400 | 1.3000 |
| 1.2349 | 2.19 | 2500 | 1.2989 |
| 1.2277 | 2.28 | 2600 | 1.2981 |
| 1.2243 | 2.37 | 2700 | 1.2973 |
| 1.2298 | 2.45 | 2800 | 1.2967 |
| 1.2362 | 2.54 | 2900 | 1.2961 |
| 1.2160 | 2.63 | 3000 | 1.2958 |
| 1.2381 | 2.72 | 3100 | 1.2957 |
| 1.2274 | 2.80 | 3200 | 1.2955 |
| 1.2235 | 2.89 | 3300 | 1.2954 |
| 1.2438 | 2.98 | 3400 | 1.2954 |
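To make the validation losses above easier to interpret, they can be converted to perplexity. This is a rough sketch that assumes the reported loss is the mean per-token cross-entropy in nats, as is typical for causal language model fine-tuning; the card does not state this explicitly. The `perplexity` helper is hypothetical, and the four rows are copied from the table (first logged step, then the last step before each epoch boundary).

```python
import math

# (step, validation loss) pairs copied from the training results table.
validation_log = [
    (100, 1.6775),
    (1100, 1.3409),
    (2200, 1.3011),
    (3400, 1.2954),
]


def perplexity(mean_nll):
    """Convert a mean negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll)


for step, loss in validation_log:
    print(f"step {step:>4}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the small drop in loss between steps 2200 and 3400 corresponds to an equally small change in perplexity, consistent with the validation curve flattening during the third epoch.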
Base model: ondevicellm/tinyllama_moe_v2