The Dataset
Hello PygmalionAI Team,
Amazing work! I tried Pygmalion6B (GPT-J). It's already pretty good, and even better when fine-tuned on a specific domain!
I have the following two questions:
- Is it possible to get the data you trained Pygmalion6B (GPT-J) on, so I can try it on other models myself?
- Do you plan to train MPT-7B with this data?
For the data, sure! Shoot me a message on Discord - I'm 0x000011b#4223 there. As for MPT-7B, I don't have any plans to train it at the moment. I'm not a fan of the current 7B (XOR + license + model itself isn't as good as I'd like), so I'm keeping an eye on all the new foundational models that are coming out, but my current thoughts are:
- MPT-7B looks strong performance-wise, but the fact that it's a custom architecture full of NotImplementedErrors when training doesn't inspire confidence for me to use it just yet.
- RedPajama's 7B looks great! However, for whatever reason, LLaMA is about 40% faster than NeoX (the architecture that RedPajama used), so this is also not 100% ideal.
- OpenLLaMA seems to be the most promising: it will use the normal LLaMA architecture (so it won't fall victim to the two pitfalls above), plus they're training on the same data as RedPajama, so once done, they should all be competitive when it comes to model quality. However, it's not done training yet, and I'd rather not rush anything, since the current checkpoints underperform LLaMA quite strongly in some tasks.
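For anyone who wants to check a claim like "about 40% faster" on their own hardware, here is a minimal sketch of how such a comparison could be timed and expressed. It is not the method used in the post above: the helper names `mean_step_seconds` and `percent_faster` are hypothetical, the example step times are made up for illustration, and "faster" is assumed to mean higher throughput (steps per second).

```python
import time
from typing import Callable


def mean_step_seconds(step: Callable[[], None], warmup: int = 2, iters: int = 10) -> float:
    """Average wall-clock seconds per call of `step`, after a few warmup calls."""
    for _ in range(warmup):
        step()
    start = time.perf_counter()
    for _ in range(iters):
        step()
    return (time.perf_counter() - start) / iters


def percent_faster(baseline_seconds: float, candidate_seconds: float) -> float:
    """How much higher the candidate's throughput is than the baseline's, in percent."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("step times must be positive")
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


# Hypothetical step times (seconds per training step), not measurements from this thread.
print(round(percent_faster(baseline_seconds=1.4, candidate_seconds=1.0), 1))
```

In practice `step` would wrap one forward/backward pass of each model at the same batch size and sequence length; on a GPU the device would also need to be synchronized before reading the clock, which this sketch leaves out.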
can you give me the dataset, please?
.
.
done please check
@alpindale I am eagerly waiting for your dataset to be released. Has an expected release date been finalized?
Does anybody have access to the dataset in 2024?