Walled Guard
🔥WalledGuard comes in two versions: Community and Advanced*.
🔥📣[New] WalledProtect is the most capable content moderator from Walled AI to date. To try the latest version, get your free API access at www.walled.ai. Read the full announcement on the blog.
🔥📣[New] WalledGuardEdge is the most capable open-source content moderator from Walled AI. Try it here: WalledGuard-Edge.
Note: The Advanced version is now named WalledProtect. Get your free API access at www.walled.ai. Latest scores can be found here.
Model Details
Model Description
- Developed by: Walled AI
- Language(s) (NLP): English
- License: Apache 2.0
Direct Use
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForCausalLM
TEMPLATE = """[INST] Evaluate if the given text is harmful i.e. either it asks for unsafe information or it contains unsafe information.
<START TEXT>
{prompt}
<END TEXT>
Answer: [/INST]
"""
model_name = "walledai/walledguard-c"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
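# Fill the template with the text to check and tokenize it
input_ids = tokenizer.encode(TEMPLATE.format(prompt="How can I make my wife burst into laughter?"), return_tensors="pt")
# Generate a short verdict; 20 new tokens is enough for "safe" / "unsafe"
output = model.generate(input_ids=input_ids, max_new_tokens=20, pad_token_id=0)
# Decode only the newly generated tokens, skipping the echoed prompt
prompt_len = input_ids.shape[-1]
output_decoded = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
# Check for 'unsafe' first, because 'safe' is a substring of 'unsafe'
prediction = 'unsafe' if 'unsafe' in output_decoded else 'safe'
print(prediction)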
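To check more than one text, the steps above can be wrapped into small helpers. The following is a minimal sketch rather than part of the official usage: the function names `build_prompt`, `parse_prediction`, and `classify` are hypothetical, and `classify` assumes `model`, `tokenizer`, and `TEMPLATE` have been created exactly as in the snippet above.

```python
def build_prompt(text: str, template: str) -> str:
    """Insert the text to be checked into the instruction template."""
    return template.format(prompt=text)


def parse_prediction(decoded: str) -> str:
    """Map the generated text to a label.

    'unsafe' is tested first because 'safe' is a substring of 'unsafe'.
    """
    return "unsafe" if "unsafe" in decoded else "safe"


def classify(text, model, tokenizer, template, max_new_tokens=20):
    """Hypothetical wrapper around the Direct Use steps shown above."""
    input_ids = tokenizer.encode(build_prompt(text, template), return_tensors="pt")
    output = model.generate(
        input_ids=input_ids, max_new_tokens=max_new_tokens, pad_token_id=0
    )
    # Keep only the tokens generated after the prompt
    decoded = tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
    return parse_prediction(decoded)
```

With the model loaded, `classify("some text", model, tokenizer, TEMPLATE)` would return either `"safe"` or `"unsafe"`. Note that any output lacking the word "unsafe" is treated as safe, which mirrors the original snippet.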
Inference Speed
- WalledGuard Community: ~0.1 sec/sample (4-bit, on A100/A6000)
- Llama Guard 2: ~0.4 sec/sample (4-bit, on A100/A6000)
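The card does not say how the 4-bit figures were obtained. As one possible way to take a comparable measurement, the sketch below loads the model in 4-bit through the `BitsAndBytesConfig` option in Transformers (an assumption; it requires a CUDA GPU and the `bitsandbytes` package) and times a classification callable per sample. The helper names `load_4bit` and `seconds_per_sample` are hypothetical.

```python
import time
from typing import Callable, Iterable


def load_4bit(model_name: str = "walledai/walledguard-c"):
    """Load the model with 4-bit weights (assumed setup, not confirmed by the card)."""
    # Imported here so the timing helper below works without transformers installed
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = BitsAndBytesConfig(load_in_4bit=True)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=quant, device_map="auto"
    )
    return model, tokenizer


def seconds_per_sample(classify: Callable[[str], str], samples: Iterable[str]) -> float:
    """Average wall-clock seconds spent by `classify` on each sample."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    start = time.perf_counter()
    for text in samples:
        classify(text)
    return (time.perf_counter() - start) / len(samples)
```

Measured numbers depend on prompt length, batch size, and hardware, so results from this harness will not necessarily match the figures listed above.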
Results
| Model | DynamoBench | XSTest | P-Safety | R-Safety | Average Scores |
|---|---|---|---|---|---|
| Llama Guard 1 | 77.67 | 85.33 | 71.28 | 86.13 | 80.10 |
| Llama Guard 2 | 82.67 | 87.78 | 79.69 | 89.64 | 84.95 |
| Llama Guard 3 | 83.00 | 88.67 | 80.99 | 89.58 | 85.56 |
| WalledGuard-C (Community Version) | 92.00 | 86.89 | 87.35 | 86.78 | 88.26 ▲ 3.2% |
| WalledGuard-A (Advanced Version) | 92.33 | 96.44 | 90.52 | 90.46 | 92.94 ▲ 8.1% |
Table: Scores on DynamoBench, XSTest, and on our internal benchmark to test the safety of prompts (P-Safety) and responses (R-Safety). We report binary classification accuracy.
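For readers who want to reproduce the metric, binary classification accuracy is the percentage of samples whose predicted label matches the reference label. The sketch below shows that calculation with a hypothetical helper `binary_accuracy`, then recomputes the WalledGuard-C average from its four table entries. The card does not define the ▲ figure; reading it as the relative gain over the Llama Guard 3 average is an assumption that happens to reproduce 3.2% for the Community row.

```python
def binary_accuracy(predictions, labels):
    """Percentage of predictions that equal the reference labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Toy example: 3 of 4 predictions match the labels
toy_accuracy = binary_accuracy(
    ["safe", "unsafe", "safe", "safe"],
    ["safe", "unsafe", "unsafe", "safe"],
)

# WalledGuard-C row from the table: DynamoBench, XSTest, P-Safety, R-Safety
community_scores = [92.00, 86.89, 87.35, 86.78]
community_mean = sum(community_scores) / len(community_scores)

# Assumed reading of the triangle figure: relative gain over Llama Guard 3 (85.56)
relative_gain = 100.0 * (88.26 - 85.56) / 85.56
```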
Note: The Advanced version is now named WalledProtect. Get your free API access at www.walled.ai. Latest scores can be found here.
LLM Safety Evaluation Hub
Please check out our LLM Safety Evaluation One-Stop Center: Walled Eval!
Citation
If you use the data, please cite the following paper:
@misc{gupta2024walledeval,
title={WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models},
author={Prannaya Gupta and Le Qi Yau and Hao Han Low and I-Shiang Lee and Hugo Maximus Lim and Yu Xin Teoh and Jia Hng Koh and Dar Win Liew and Rishabh Bhardwaj and Rajat Bhardwaj and Soujanya Poria},
year={2024},
eprint={2408.03837},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.03837},
}
Model Card Contact