Libraries

How to use uukuguy/speechless-coding-7b-16k-tora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="uukuguy/speechless-coding-7b-16k-tora")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("uukuguy/speechless-coding-7b-16k-tora")
model = AutoModelForCausalLM.from_pretrained("uukuguy/speechless-coding-7b-16k-tora")

Local Apps

vLLM

How to use uukuguy/speechless-coding-7b-16k-tora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "uukuguy/speechless-coding-7b-16k-tora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uukuguy/speechless-coding-7b-16k-tora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
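The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the base URL assumes the vLLM server started above is listening on its default port 8000.

```python
import json
import urllib.request

MODEL_ID = "uukuguy/speechless-coding-7b-16k-tora"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]

# Example (requires a running server):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should point the same helper at a server on another port, since only the host and port change.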

Use Docker

docker model run hf.co/uukuguy/speechless-coding-7b-16k-tora

SGLang

How to use uukuguy/speechless-coding-7b-16k-tora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "uukuguy/speechless-coding-7b-16k-tora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uukuguy/speechless-coding-7b-16k-tora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "uukuguy/speechless-coding-7b-16k-tora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "uukuguy/speechless-coding-7b-16k-tora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use uukuguy/speechless-coding-7b-16k-tora with Docker Model Runner:
```
docker model run hf.co/uukuguy/speechless-coding-7b-16k-tora
```

speechless-coding-7b-16k-tora

This model is llm_agents/tora-code-7b-v1.0 fine-tuned on the datasets listed below to improve its reasoning and planning abilities.

context window length: 16,384
prompt_type = "alpaca"
max_tokens > 128 && < 16384

Total: 177,333 samples, 316 MB

jondurbin/airoboros-2.2: Filtered to categories related to coding, reasoning, and planning. 21,923 samples.
Open-Orca/OpenOrca: Filtered to the 'cot' category of the 1M GPT-4 dataset. 62,973 samples.
garage-bAInd/Open-Platypus: 100%, 22,760 samples.
WizardLM/WizardLM_evol_instruct_V2_196k: Coding conversation part. 30,081 samples.
TokenBender/python_eval_instruct_51k: Samples with "python" in the output. 39,596 samples.
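The `max_tokens > 128 && < 16384` condition reads like a length filter applied to the samples. A minimal sketch of such a filter follows. It is an assumption that the bounds apply to the token count of each whole sample; the card does not say which tokenizer or which part of the sample was measured. `keep_sample` and `whitespace_token_count` are hypothetical names, and the whitespace counter is only a stand-in for a real tokenizer.

```python
def keep_sample(text, count_tokens, min_tokens=128, max_tokens=16384):
    """Return True if the token count lies strictly between the two bounds."""
    n = count_tokens(text)
    return min_tokens < n < max_tokens


def whitespace_token_count(text):
    # Stand-in for a real tokenizer, e.g. len(tokenizer(text)["input_ids"]).
    return len(text.split())


samples = ["short answer", "word " * 200, "word " * 20000]
kept = [s for s in samples if keep_sample(s, whitespace_token_count)]
print(len(kept))  # 1 (only the 200-token sample is inside the bounds)
```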

Generation settings: 50 samples, T=0.2, MaxTokens=512, Top_P=0.95

Code: https://github.com/uukuguy/speechless

How to Prompt the Model

This model accepts the Alpaca instruction format.

For example:

You are an intelligent programming assistant.

### Instruction:
Implement a linked list in C++

### Response:
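The template above can be assembled programmatically. This is a minimal sketch: `build_alpaca_prompt` is a hypothetical helper, and the exact whitespace (a blank line between sections and a newline after `### Response:`) is an assumption based on the example rather than something the card specifies.

```python
DEFAULT_SYSTEM = "You are an intelligent programming assistant."


def build_alpaca_prompt(instruction, system=DEFAULT_SYSTEM):
    """Wrap an instruction in the Alpaca-style layout shown in the example."""
    return (
        f"{system}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = build_alpaca_prompt("Implement a linked list in C++")
print(prompt)
```

The resulting string can be passed as the input of a text-generation pipeline or as the `prompt` field of a completions request.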

HumanEval

Metric	Value
humaneval-python	52.44
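For context, HumanEval scores are commonly reported as pass@k, estimated from n generated samples per problem of which c pass the unit tests. The sketch below implements the standard unbiased estimator, 1 - C(n-c, k) / C(n, k). Whether the figure above was computed exactly this way is an assumption; the card only states the 50-sample setting and the score.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which pass the tests."""
    if n - c < k:
        # Every size-k subset contains at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# With 50 samples per problem and 25 passing, pass@1 is the pass fraction.
print(pass_at_k(n=50, c=25, k=1))  # 0.5
```

For k = 1 the estimator reduces to c / n, and the benchmark score is the mean of this value over all problems.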

Big Code Models Leaderboard

CodeLlama-34B-Python: 53.29

CodeLlama-34B-Instruct: 50.79

CodeLlama-13B-Instruct: 50.6

CodeLlama-34B: 45.11

CodeLlama-13B-Python: 42.89

CodeLlama-13B: 35.07

MultiPL-E

Metric	Value
python	55.96
java	37.84
javascript	46.93
cpp	37.48
rust	29.01
go	28.99
sh	12.11
julia	31.47
typescript	47.80

LMEval

Open LLM Leaderboard

Metric	Value
ARC
HellaSwag
MMLU
TruthfulQA
Average

Parameters


lr	2e-4
lr_scheduler_type	cosine
weight_decay	0.0
optim	paged_adamw_8bit
flash_attention	True
rerope	False
max_new_tokens	16384
num_train_epochs	2
bits	4
lora_r	64
lora_alpha	256
lora_dropout	0.05
double_quant	True
quant_type	nf4
dataset_format	sharegpt
mini_batch_size	2
gradient_accumulation_steps	32
bf16	True

A100-40G x 4
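As a rough worked example, the effective batch size and number of optimizer steps can be derived from these values. The sketch assumes that the mini batch size of 2 is per device, that all four GPUs run data-parallel, and that all 177,333 samples are used for training; none of these is stated explicitly in the card.

```python
import math

mini_batch_size = 2               # assumed to be per GPU
gradient_accumulation_steps = 32
num_gpus = 4                      # A100-40G x 4, assumed data-parallel
num_train_epochs = 2
total_samples = 177_333           # assumed to be all training data

effective_batch_size = mini_batch_size * gradient_accumulation_steps * num_gpus
steps_per_epoch = math.ceil(total_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size)  # 256
print(steps_per_epoch)       # 693
print(total_steps)           # 1386
```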

Evaluation results

pass@1 on HumanEval (self-reported): 52.439