Instructions for using Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, LlamaWithIntervention

tokenizer = AutoTokenizer.from_pretrained("Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct")
model = LlamaWithIntervention.from_pretrained("Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
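Note that `LlamaWithIntervention` is not a class in the standard `transformers` package; it appears to be custom modeling code for this model. If the import above fails in your environment, one hedged fallback, assuming the repository ships its custom modeling code (an assumption, not confirmed here), is to load through the auto classes with `trust_remote_code=True`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the model repo provides custom modeling code that
# trust_remote_code=True will download and run. If it does not,
# this loads plain Llama weights without any intervention logic.
model_id = "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
```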
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
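Because the vLLM server exposes an OpenAI-compatible API, you can also call it from Python with the `openai` client instead of curl. A minimal sketch, assuming the server started above is running on localhost:8000 (vLLM ignores the API key by default, so any placeholder works):

```python
from openai import OpenAI  # pip install openai

# Point the client at the local vLLM server rather than api.openai.com.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```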
Use Docker

```shell
docker model run hf.co/Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct
```
- SGLang
How to use Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images
```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
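Since SGLang also exposes an OpenAI-compatible endpoint, the same `openai` client sketch shown above for vLLM works here as well; just change `base_url` to `http://localhost:30000/v1` and keep the rest unchanged.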
- Docker Model Runner
How to use Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct with Docker Model Runner:
```shell
docker model run hf.co/Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct
```
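Docker Model Runner ships with recent Docker Desktop releases; if `docker model` is not recognized, it likely needs to be enabled in Docker Desktop's settings first. The `hf.co/...` reference tells it to pull the weights directly from Hugging Face.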
Training Language Models To Explain Their Own Computations
This is a Llama-3.1-8B-Instruct explainer model fine-tuned for the input ablations task, with Llama-3.1-8B-Instruct as the target model, as described in this paper. In the input ablations task, the explainer model is trained to predict how removing "hint" tokens from a hinted MMLU prompt changes the output of Llama-3.1-8B-Instruct. This helps characterize the causal relationships between input components and model behavior.
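To make the task concrete, the sketch below shows what an input ablation looks like in spirit: a hinted MMLU-style prompt and its hint-free counterpart. The hint and question formats here are illustrative assumptions, not the exact templates from the paper:

```python
# Illustrative only: the hint/question wording below is an assumption,
# not the exact template used in the paper or evaluation code.
hint = "Hint: a student thinks the answer is (B).\n\n"
question = (
    "Which planet is known as the Red Planet?\n"
    "(A) Venus (B) Mars (C) Jupiter (D) Saturn\n"
    "Answer:"
)

hinted_prompt = hint + question   # the prompt the target model actually sees
ablated_prompt = question         # the same prompt with the hint tokens removed

# The explainer model is trained to predict how the target model's output
# changes between hinted_prompt and ablated_prompt, i.e. whether (and how)
# the hint causally influenced the answer.
```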
Sample Usage
To evaluate the explainer model on the input ablation task, you can use the evaluation script provided in the GitHub repository.
```shell
uv run --env-file .env evaluate.py \
    --config config/input_ablation/instruct_instruct_hint.yaml \
    --target_model_path meta-llama/Llama-3.1-8B-Instruct \
    --task hint_attribution \
    --model_path Transluce/input_ablation_llama3.1_8b_instruct_llama3.1_8b_instruct \
    --output_dir /PATH/TO/RESULTS/ \
    --batch_size 64
```
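The `--env-file .env` flag loads environment variables before the script runs. Since `meta-llama/Llama-3.1-8B-Instruct` is a gated checkpoint, the `.env` file presumably needs to provide a Hugging Face token, along the lines of:

```shell
# Hypothetical .env contents; replace <secret> with your own token.
HF_TOKEN=<secret>
```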
Citation

```bibtex
@misc{li2025traininglanguagemodelsexplain,
      title={Training Language Models to Explain Their Own Computations},
      author={Belinda Z. Li and Zifan Carl Guo and Vincent Huang and Jacob Steinhardt and Jacob Andreas},
      year={2025},
      eprint={2511.08579},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2511.08579},
}
```