Instructions for using TwinDoc/RedWhale-tv-10.8B-sft-s with libraries and local apps.

Libraries

How to use TwinDoc/RedWhale-tv-10.8B-sft-s with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TwinDoc/RedWhale-tv-10.8B-sft-s")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-tv-10.8B-sft-s")
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-tv-10.8B-sft-s")

Local Apps

vLLM

How to use TwinDoc/RedWhale-tv-10.8B-sft-s with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TwinDoc/RedWhale-tv-10.8B-sft-s"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-s",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
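The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style completions response. The helper names `build_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "TwinDoc/RedWhale-tv-10.8B-sft-s"


def build_request(prompt, base_url="http://localhost:8000",
                  max_tokens=512, temperature=0.5):
    """Build the POST request that mirrors the curl call above."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes an OpenAI-compatible response with a `choices` list.
    """
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Example (requires a running server):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` would target a server on port 30000 instead, since the endpoint path is the same.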

Use Docker

docker model run hf.co/TwinDoc/RedWhale-tv-10.8B-sft-s

SGLang

How to use TwinDoc/RedWhale-tv-10.8B-sft-s with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "TwinDoc/RedWhale-tv-10.8B-sft-s" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-s",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "TwinDoc/RedWhale-tv-10.8B-sft-s" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TwinDoc/RedWhale-tv-10.8B-sft-s",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use TwinDoc/RedWhale-tv-10.8B-sft-s with Docker Model Runner:
```
docker model run hf.co/TwinDoc/RedWhale-tv-10.8B-sft-s
```

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Model Description

This model was trained with Supervised Fine-Tuning (SFT) on a RAG dataset created during a project for customer S-B. The training dataset is not released for security reasons.

About the Model

Name: TwinDoc/RedWhale-tv-10.8B-sft-s
Finetuned from model: TwinDoc/RedWhale-tv-10.8B-v1.0
Train Datasets: private
Developed by: 애자일소다 (AGILESODA)
Model type: llama
Language(s) (NLP): Korean
License: cc-by-nc-sa-4.0
Training settings
- LoRA r, alpha: 4, 16
- Dtype: bf16
- Epochs: 7
- Learning rate: 1e-4
- Global batch size: 4
- Context length: 4096
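
As a rough illustration, the values above can be collected in a small settings object. This is a sketch, not the authors' training script: the `TrainSettings` class and `to_peft_config` helper are hypothetical, the LoRA target modules are not stated in this card and must be supplied by the caller, and the use of the `peft` library is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSettings:
    """Training values as listed in the model card."""
    lora_r: int = 4
    lora_alpha: int = 16
    dtype: str = "bf16"
    epochs: int = 7
    learning_rate: float = 1e-4
    global_batch: int = 4
    context_length: int = 4096

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r


def to_peft_config(settings, target_modules):
    """Build a peft.LoraConfig from the settings.

    `target_modules` is NOT given in the model card; the caller must choose it.
    peft is imported lazily so the rest of the sketch runs without it.
    """
    from peft import LoraConfig

    return LoraConfig(
        r=settings.lora_r,
        lora_alpha=settings.lora_alpha,
        target_modules=target_modules,
        task_type="CAUSAL_LM",
    )


settings = TrainSettings()
print(settings.lora_scaling)  # 16 / 4 = 4.0
```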
Inference settings
- BOS id: 1
- EOS id: 2
- Top-p: 0.95
- Temperature: 0.01
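
The inference settings above could be passed to Transformers roughly as follows. This is a sketch under stated assumptions: the card does not say whether sampling is enabled, so `do_sample=True` is assumed (top-p and temperature only take effect when sampling); `max_new_tokens=512`, bf16 loading, and `device_map="auto"` (which needs the `accelerate` package) are also choices made here, not values from the card. The helper names are illustrative.

```python
MODEL_ID = "TwinDoc/RedWhale-tv-10.8B-sft-s"


def generation_kwargs(max_new_tokens: int = 512) -> dict:
    """Return generate() arguments matching the listed inference settings."""
    return {
        "do_sample": True,          # assumption: sampling is enabled
        "top_p": 0.95,
        "temperature": 0.01,        # near-greedy decoding
        "bos_token_id": 1,
        "eos_token_id": 2,
        "max_new_tokens": max_new_tokens,  # assumption: not stated in the card
    }


def generate(prompt: str, model_id: str = MODEL_ID) -> str:
    """Load the model and generate a continuation for `prompt`."""
    # Heavy imports are kept inside the function so the helper above
    # can be used without torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **generation_kwargs())
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```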

prompt template

Human: ##원문##과 ##질문##이 주어지면, ##원문##에 있는 정보를 바탕으로 고품질의 ##답변##을 만들어주세요. ##원문##에서 ##질문##에 대한 명확한 답을 찾을 수 없을 경우 "답변을 찾을 수 없습니다."로 ##답변##을 작성해야하며 ##원문##에 없는 내용은 ##답변##에 포함하지 않아야 합니다.
    
##원문##
{CONTEXT}
##질문##
{QUESTION}
 Assistant: {ANSWER}
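
In English, the instruction in the template reads roughly: "Given a ##source text## and a ##question##, write a high-quality ##answer## based on the information in the source text. If no clear answer to the question can be found in the source text, answer with '답변을 찾을 수 없습니다.' (no answer found), and do not include anything that is not in the source text." The Korean text should be sent to the model unchanged.

A minimal sketch of filling the template for inference follows. The `build_prompt` helper is illustrative; the exact whitespace (the blank line after the instruction and the leading space before `Assistant:`) is taken from the template as displayed and may not match the training data byte for byte. Ending the prompt at `Assistant:` so that the model produces the answer is an assumption.

```python
# Instruction text copied verbatim from the prompt template.
INSTRUCTION = (
    '##원문##과 ##질문##이 주어지면, ##원문##에 있는 정보를 바탕으로 '
    '고품질의 ##답변##을 만들어주세요. ##원문##에서 ##질문##에 대한 명확한 답을 '
    '찾을 수 없을 경우 "답변을 찾을 수 없습니다."로 ##답변##을 작성해야하며 '
    '##원문##에 없는 내용은 ##답변##에 포함하지 않아야 합니다.'
)


def build_prompt(context: str, question: str) -> str:
    """Fill {CONTEXT} and {QUESTION}; the model is expected to supply {ANSWER}."""
    return (
        f"Human: {INSTRUCTION}\n"
        "\n"
        "##원문##\n"
        f"{context}\n"
        "##질문##\n"
        f"{question}\n"
        " Assistant:"
    )


prompt = build_prompt("RedWhale is a Korean LLM.", "What is RedWhale?")
print(prompt)
```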

License

The content of this project, created by AGILESODA, is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).

Citation

@misc{vo2024redwhaleadaptedkoreanllm,
      title={RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining}, 
      author={Anh-Dung Vo and Minseong Jung and Wonbeen Lee and Daewoo Choi},
      year={2024},
      eprint={2408.11294},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.11294}, 
}

Model size: 11B params (Safetensors)
Tensor type: BF16

Paper: RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining (arXiv 2408.11294, published Aug 21, 2024)