How to use FrontiersMind/Nandi-Mini-150M-Tool-Calling with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="FrontiersMind/Nandi-Mini-150M-Tool-Calling", trust_remote_code=True)
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("FrontiersMind/Nandi-Mini-150M-Tool-Calling", trust_remote_code=True, dtype="auto")
How to use FrontiersMind/Nandi-Mini-150M-Tool-Calling with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FrontiersMind/Nandi-Mini-150M-Tool-Calling"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FrontiersMind/Nandi-Mini-150M-Tool-Calling",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
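The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and it assumes the server returns the usual OpenAI-style chat completion body with a `choices` list.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_content):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_content, timeout=60):
    """Send the request and return the assistant message text.

    Assumes an OpenAI-style response: body["choices"][0]["message"]["content"].
    """
    request = build_chat_request(base_url, model, user_content)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


# Example (requires a running server, so it is left commented out):
# print(chat("http://localhost:8000",
#            "FrontiersMind/Nandi-Mini-150M-Tool-Calling",
#            "What is the capital of France?"))
```

For the SGLang server described in this card, the same helper should work with the base URL changed to `http://localhost:30000`.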
How to use FrontiersMind/Nandi-Mini-150M-Tool-Calling with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "FrontiersMind/Nandi-Mini-150M-Tool-Calling" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FrontiersMind/Nandi-Mini-150M-Tool-Calling",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "FrontiersMind/Nandi-Mini-150M-Tool-Calling" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FrontiersMind/Nandi-Mini-150M-Tool-Calling",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use FrontiersMind/Nandi-Mini-150M-Tool-Calling with Docker Model Runner:
docker model run hf.co/FrontiersMind/Nandi-Mini-150M-Tool-Calling
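The server examples above send a plain question, which does not exercise tool calling. The sketch below builds a chat payload that carries tool definitions in the system prompt, reusing the prompt wording from this card's own inference example. The function name `build_tool_payload` is hypothetical, and it is an assumption that the served chat template treats the system message the same way as the local Transformers path. Only standard OpenAI-style sampling fields are included.

```python
import json

# System prompt wording copied from this card's Transformers example.
SYSTEM_TEMPLATE = (
    "You are a helpful assistant with access to the following tools - "
    "You need to choose appropriate tool for given query, you also need to add "
    "appropriate parameters. Do not choose wrong tools, if user query does not "
    "belong to a tool. <|tools_start|>\n{tools}\n<|tools_end|>"
)


def build_tool_payload(model, user_prompt, tools, max_tokens=500):
    """Return a chat-completions payload with the tools embedded in the system prompt."""
    system_prompt = SYSTEM_TEMPLATE.format(tools=json.dumps(tools, indent=4))
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.3,
        "top_p": 0.9,
    }


example_tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {"city": {"type": "str", "description": "City name"}},
    }
]
payload = build_tool_payload(
    "FrontiersMind/Nandi-Mini-150M-Tool-Calling", "Get weather in Delhi", example_tools
)
# json.dumps(payload) can be passed as the --data body of the curl calls above.
```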
Nandi-Mini-150M-Tool-Calling is a lightweight model specialized for single-turn tool calling: it interprets a user query and generates the matching tool call, with its parameters, in one step, enabling efficient and reliable function execution.
We're just getting started with the Nandi series.
Blogs & technical deep-dives coming soon.
Stay tuned!
!pip install transformers=='5.4.0'
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json
model_name = "FrontiersMind/Nandi-Mini-150M-Tool-Calling"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16
).to(device).eval()
def call_nandi_tool_calling(user_prompt, tools):
    # Serialize the tool definitions and embed them in the system prompt.
    tools = json.dumps(tools, indent=4)
    system_prompt = f"You are a helpful assistant with access to the following tools - You need to choose appropriate tool for given query, you also need to add appropriate parameters. Do not choose wrong tools, if user query does not belong to a tool. <|tools_start|>\n{tools}\n<|tools_end|>"
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.3,
        top_p=0.90,
        top_k=20,
        repetition_penalty=1.1,
    )
    # Drop the prompt tokens so that only the newly generated tokens are decoded.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response
# Put your query here
user_prompt = "Get weather in Delhi"
# Update the tools according to your use case
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "city": {
                "type": "str",
                "description": "City name"
            }
        }
    },
    {
        "name": "get_time",
        "description": "Get current time for a city",
        "parameters": {
            "city": {
                "type": "str",
                "description": "City name"
            }
        }
    }
]
print(call_nandi_tool_calling(user_prompt,tools))
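Once the model returns a response, the caller still has to turn it into an actual function call. This card does not document the output format, so the sketch below rests on a loud assumption: that the response contains a JSON object with a `name` key and an `arguments` (or `parameters`) key. The helpers `extract_first_json` and `dispatch_tool_call`, the stub `get_weather`, and the sample response string are all hypothetical; adjust the parsing to whatever the model actually emits.

```python
import json


def extract_first_json(text):
    """Return the first JSON object or array embedded in text, or None."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                value, _ = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue  # Not valid JSON from this position; keep scanning.
    return None


def dispatch_tool_call(response_text, registry):
    """Parse a tool call from the response and invoke the matching local function.

    ASSUMPTION: the call looks like {"name": ..., "arguments": {...}}.
    Returns None when no usable call is found.
    """
    call = extract_first_json(response_text)
    if isinstance(call, list):  # Tolerate a list of calls; use the first one.
        call = call[0] if call else None
    if not isinstance(call, dict) or "name" not in call:
        return None
    func = registry.get(call["name"])
    arguments = call.get("arguments", call.get("parameters", {}))
    if func is None or not isinstance(arguments, dict):
        return None
    return func(**arguments)


def get_weather(city):
    """Hypothetical local implementation of the get_weather tool."""
    return f"(stub) weather report for {city}"


# Hypothetical response text, used only to illustrate the parsing step.
sample_response = '{"name": "get_weather", "arguments": {"city": "Delhi"}}'
result = dispatch_tool_call(sample_response, {"get_weather": get_weather})
```

Looking the tool name up in an explicit registry, rather than calling whatever name the model produces, keeps the set of callable functions limited to the ones the caller chose to expose.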
We'd love to hear your thoughts, feedback, and ideas!
Base model: FrontiersMind/Nandi-Mini-150M