---
tags:
- text-generation-inference
- transformers
- unsloth
- olmo2
license: apache-2.0
language:
- en
datasets:
- Pinkstack/roblox-luau-corpus-text
- Roblox/luau_corpus
- boatbomber/roblox-info-dump
- wikimedia/wikipedia
pipeline_tag: text-generation
base_model:
- allenai/OLMo-2-0425-1B
---
Note: this is a base model intended for further fine-tuning, not a chat model. A chat model is coming soon and will be published under a different repo; this page will be updated once it is out.
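Because this is a base model, it is best prompted with text to continue rather than with chat messages. The sketch below shows one way to do plain completion with Transformers. The helper names `build_prompt` and `complete`, and the choice of framing the task as the start of a Luau file, are illustrative assumptions and not something the model was specifically trained to expect.

```python
MODEL_ID = "Pinkstack/Luau-coder-v2-3B-base-32k"


def build_prompt(task: str) -> str:
    """Frame a task as the opening of a Luau file for the base model to continue.

    The comment-then-function layout is an assumption made for illustration.
    """
    return f"-- {task}\nlocal function "


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Return the model's continuation of `prompt` (downloads weights on first use)."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Plain tokenization: no chat template, since this is not a chat model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the prompt itself.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `complete(build_prompt("Returns the sum of all numbers in a table"))` would download the weights and return a continuation; output quality will vary, as with any base model.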
# print("Before we start")
We are not affiliated with Roblox in any way; any mention of Roblox is purely to help people understand what the model is about.
According to the [Roblox website](https://create.roblox.com/docs/assistant/guide), Roblox uses Meta's Llama 3 (we assume the 70B variant) for its AI assistant. This model, while powerful for its size, cannot come close to the performance of a 70B model.
Unlike Llama 3, however, this model (luau-coder-v2-3b-32k, or luaucoder for short) is released under the open Apache 2.0 license.
# print("Stages of pre-training")
This model was continually pre-trained in 3 stages. (Note: allenai states that OLMo 2 1B, the model this one is based on, was pre-trained on roughly 4 trillion tokens.)
- Stage 1: Pre-training on Pinkstack/roblox-luau-corpus-text and Roblox/luau_corpus at a context length of 4096 (the maximum OLMo 2 can usually reach).
- Stage 2: Pre-training on boatbomber/roblox-info-dump with rope scaling set to 4, which expands the model's context to **16384** tokens.
- Stage 3: Training on a mix of Pinkstack/roblox-luau-corpus-text, Roblox/luau_corpus, and wikimedia/wikipedia with rope scaling set to 8, for **32768** tokens of context. We mixed in wikimedia/wikipedia to hopefully improve the model's general text quality and knowledge.

Stage 3 onwards used added layers: the model started with 16 layers, and we then merged in another 20 to make it bigger and deeper.

In total, the model was continually pre-trained on up to 1.3B tokens, with a final loss of **1.916400**.
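The context lengths quoted for each stage follow from multiplying the 4096-token base window by the rope scaling factor. The snippet below is a minimal arithmetic check, assuming the factor scales the window linearly (which matches the numbers above); the names `BASE_CONTEXT`, `scaled_context`, and `context_by_stage` are illustrative and not taken from the training code.

```python
BASE_CONTEXT = 4096  # stage 1 context length, as stated above


def scaled_context(base_context: int, rope_factor: int) -> int:
    """Context length reached when the base window is stretched by rope_factor.

    Assumes linear scaling, i.e. new_context = base_context * rope_factor.
    """
    return base_context * rope_factor


# Rope scaling factor per stage; stage 1 uses no scaling, treated here as a factor of 1.
rope_factor_by_stage = {"stage 1": 1, "stage 2": 4, "stage 3": 8}

context_by_stage = {
    stage: scaled_context(BASE_CONTEXT, factor)
    for stage, factor in rope_factor_by_stage.items()
}

for stage, context in context_by_stage.items():
    print(f"{stage}: {context} tokens")
```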
# print("Use cases")
As this is a base model, it has few direct uses for now. You can, however, fine-tune it on your own datasets to turn it into an instruct or chat model.
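As a rough illustration of preparing your own dataset for such a fine-tune, the sketch below turns instruction/response pairs into plain training text. The `Example` class, the `TEMPLATE` layout, and the default `eos_token` value are all hypothetical choices made for this example; they are not the format of the upcoming chat model, and in practice you would take the end-of-sequence token from the tokenizer rather than hard-coding it.

```python
from dataclasses import dataclass


@dataclass
class Example:
    instruction: str
    response: str


# Hypothetical prompt layout, chosen only for illustration.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"


def to_training_text(example: Example, eos_token: str = "<|endoftext|>") -> str:
    """Render one pair as a single training string ending in an EOS marker.

    The default eos_token is an assumption; prefer tokenizer.eos_token in real use.
    """
    body = TEMPLATE.format(
        instruction=example.instruction.strip(),
        response=example.response.strip(),
    )
    # Appending EOS teaches the model where a response should stop.
    return body + eos_token


examples = [
    Example(
        instruction="Write a Luau function that doubles a number.",
        response="local function double(n)\n    return n * 2\nend",
    ),
]
training_texts = [to_training_text(e) for e in examples]
```

The resulting strings can then be fed to whichever supervised fine-tuning setup you prefer.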
# print("Notice")
This stage 3 base model did not undergo safety alignment by us, so it can generate unethical content. You are responsible for any outputs the LLM generates.
# print("Additional information")
This repo contains the stage 3 pre-trained/base model.
Unsloth was used for training (https://unsloth.ai/).