GRACE: Discriminator-Guided Chain-of-Thought Reasoning
This model is part of the work presented in the paper GRACE: Discriminator-Guided Chain-of-Thought Reasoning.
GRACE (Guiding chain-of-thought ReAsoning with a CorrectnEss Discriminator) is a stepwise decoding approach that steers the decoding process towards producing correct reasoning steps. It employs a step-level verifier or discriminator trained with a contrastive loss over correct and incorrect steps, which is used during decoding to score next-step candidates based on their correctness.
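The scoring idea described above can be illustrated with a minimal sketch. This is not the repository's implementation: the `Candidate` class, the `select_step` helper, and the toy discriminator are hypothetical names introduced here, and the linear mix of the language-model log-probability with the discriminator score, weighted by `beta`, is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A sampled next reasoning step and its language-model log-probability."""
    text: str
    lm_logprob: float


def select_step(
    candidates: List[Candidate],
    disc_score: Callable[[str], float],
    beta: float = 0.1,
) -> Candidate:
    """Pick the candidate with the highest combined score.

    Assumption: the combined score is a linear mix of the LM log-probability
    and the discriminator score, weighted by beta.
    """
    def combined(c: Candidate) -> float:
        return (1.0 - beta) * c.lm_logprob + beta * disc_score(c.text)

    return max(candidates, key=combined)


def toy_discriminator(step: str) -> float:
    """Hypothetical stand-in for a trained step-level discriminator."""
    return 1.0 if "5 + 3 = 8" in step else -1.0


candidates = [
    Candidate("5 + 3 = 9", lm_logprob=-0.5),  # more likely under the LM, but wrong
    Candidate("5 + 3 = 8", lm_logprob=-0.7),  # less likely under the LM, but correct
]
best = select_step(candidates, toy_discriminator, beta=0.5)
print(best.text)
```

With `beta=0.0` the discriminator is ignored and the higher-probability (incorrect) step wins; raising `beta` lets the discriminator's preference override the language model.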
Resources
- Paper: GRACE: Discriminator-Guided Chain-of-Thought Reasoning
- GitHub Repository: https://github.com/mukhal/grace
- Authors: Muhammad Khalifa, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Lu Wang
Sample Usage
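The official implementation of guided decoding is available in the GitHub repository. The example command below runs GRACE decoding with the GSM8K checkpoint, data, and discriminator; to use this SVAMP model, the model name, input file, task, and discriminator path would need to be changed accordingly: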
WANDB_MODE=disabled python run_grace.py \
--model_name_or_path mkhalifa/flan-t5-large-gsm8k \
--in_file data/gsm8k/dev.jsonl \
--task gsm8k \
--disc_path ckpts/discrim/flan-t5-gsm8k/ \
--beta 0.1 --n_candidate_steps 20 --generation_type step-score \
--step_sampling_method top_p --device2 cuda:0 --top_p .95 --sample_calc true \
--max_steps 6 --max_step_length 60 --step_delimiter '|' --temperature .8 --n_self_consistency 1 --seed 42
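The `--step_delimiter` and `--max_steps` flags suggest that a solution is treated as a sequence of delimiter-separated steps. A minimal sketch of that splitting, assuming the `'|'` delimiter from the command above, might look as follows. The `split_steps` helper and the example solution string are hypothetical and are not taken from the repository.

```python
from typing import List


def split_steps(solution: str, delimiter: str = "|", max_steps: int = 6) -> List[str]:
    """Split a delimiter-separated solution into at most max_steps steps.

    Hypothetical helper: it only illustrates the step structure implied by
    the --step_delimiter and --max_steps flags.
    """
    steps = [s.strip() for s in solution.split(delimiter)]
    steps = [s for s in steps if s]  # drop empty fragments
    return steps[:max_steps]


example = "5 + 3 = 8 | 8 * 2 = 16 | The answer is 16"
print(split_steps(example))
```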
Citation
If you use this work, please cite the following paper:
@article{khalifa2023grace,
title={Grace: Discriminator-guided chain-of-thought reasoning},
author={Khalifa, Muhammad and Logeswaran, Lajanugen and Lee, Moontae and Lee, Honglak and Wang, Lu},
journal={arXiv preprint arXiv:2305.14934},
year={2023}
}