Model Card
This model is Mistral-7B-Instruct-v0.2 fine-tuned on OpenShift 4.15 documentation using 45212 Q&A pairs.
- Fine-tuned by: William Caban
- License: Apache 2.0
- Context length: 32K (base model)
- OpenShift 4.15 knowledge cutoff date: April 12, 2024
Method
The Q&A corpus was generated using the following methodology:
- Generated 5 Q&A pairs for each page of the OpenShift (OCP) 4.15 PDFs longer than 1500 characters. The length threshold was chosen to exclude title pages and pages without much content.
- Mistral-7B-Instruct-v0.2 was used to generate the questions for each page.
- Mixtral-8x22B-Instruct-v0.1 was used to generate each answer from the content of the page.
- A voting evaluation between Mixtral-8x22B and Llama3-7B assessed the quality of each Q&A pair in relation to the page content, and low-quality entries were removed.
- Removed Q&A pairs whose questions contained phrases or words like "this example", "this context", "this document", "trademark", and "copyright".
The resulting corpus contains 45212 Q&A pairs. The corpus was divided into training (42951 Q&A pairs) and eval (2261 Q&A pairs) sets.
The model was trained for 3000 iterations.
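The page-length filter, the phrase filter, and the train/eval split described above can be sketched roughly as follows. This is a minimal illustration, not the author's pipeline: the function names, the `seed` parameter, the case-insensitive matching, and the 5% eval fraction (inferred from 2261 / 45212) are all assumptions.

```python
import random

MIN_PAGE_CHARS = 1500  # pages must be longer than this (threshold from the card)
BANNED_PHRASES = ("this example", "this context", "this document", "trademark", "copyright")


def page_is_usable(page_text: str) -> bool:
    """Keep only pages longer than the length threshold."""
    return len(page_text) > MIN_PAGE_CHARS


def question_is_clean(question: str) -> bool:
    """Reject questions containing a banned phrase (case-insensitive: an assumption)."""
    lowered = question.lower()
    return not any(phrase in lowered for phrase in BANNED_PHRASES)


def split_corpus(pairs, eval_fraction=0.05, seed=0):
    """Shuffle Q&A pairs and return (train, eval) lists. Ratio and seed are assumed."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    return shuffled[n_eval:], shuffled[:n_eval]
```

With 45212 pairs and `eval_fraction=0.05`, `split_corpus` returns 42951 training pairs and 2261 eval pairs, which matches the counts reported above.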
KNOWN LIMITATIONS: There is a significant drop in accuracy and performance when using a quantized version of this model.
Using the model
When used in combination with RAG, the model prefers a prompt with a CONTEXT section from which to augment its knowledge:
## INSTRUCTIONS
<your_instructions_here>
## TASK
<what_you_want_the_model_to_achieve>
## CONTEXT
<any_new_or_additional_context_for_answering_question>
## QUESTION
<question_from_user>
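As a rough sketch, the sectioned prompt above could be assembled with a small helper like the one below. The name `build_prompt` and the choice to omit empty sections are assumptions for illustration; the card does not say how the model behaves when a section is left out.

```python
def build_prompt(question: str, context: str = "", instructions: str = "", task: str = "") -> str:
    """Assemble the sectioned prompt; empty sections are omitted (an assumption)."""
    sections = [
        ("INSTRUCTIONS", instructions),
        ("TASK", task),
        ("CONTEXT", context),
        ("QUESTION", question),
    ]
    return "\n".join(
        f"## {name}\n{body.strip()}" for name, body in sections if body.strip()
    )


prompt = build_prompt(
    question="What is OpenShift?",
    context="<retrieved passages go here>",
    instructions="Answer using only the context.",
)
print(prompt)
```

The resulting string would typically be sent as the content of the user message, with the retrieved RAG passages placed in `context`.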
Examples of questions and answers when running in greedy mode:
>>> What is OpenShift?
According to the context, OpenShift is a platform for deploying and managing container applications. It provides an integrated environment for developing, deploying, and managing containerized
applications, allowing users to take advantage of the benefits of containerization while also providing the necessary tools and infrastructure for managing these applications in production
environments.
>>> who is the president?
According to the context, I'm not designed to answer questions about politics or other topics beyond my expertise. I'm specifically trained to help with Kubernetes-related questions.
>>> what is the default network CIDR in OpenShift?
According to the context, the default network CIDR for OpenShift is 172.30.0.0/16, which is specified in the `networkCidr` field of the `machineConfig` section.
>>> Provide an example for baremetal installer to use a local RHCOS image
According to the context, here is an example configuration for using a local Red Hat OpenShift Container Service (RHCOS) image as baremetal installer:
'''yaml
apiVersion: core.ocp.openshift.io/v1
kind: BareMetalInstaller
metadata:
  name: local-rhcos-image
spec:
  image:
    localImage:
      localImage: <path_to_RHCOS_image>
      containerName: <RHCOS_container_name>
  installerConfig:
    installerType: bareMetalInstaller
'''
>>> how to upgrade openshift using CLI?
According to the context, you can upgrade OpenShift using the OpenShift CLI (oc) with the following command:
`oc upgrade openshift --version=<new_version>`
Replace `<new_version>` with the desired version number for your upgrade.
>>> who trained you?
I'm based on the provided context, and I'm specifically trained to answer Kubernetes-related questions. The context does not provide information about who trained me or how I was trained.
Intended Use
- This model is a quick proof of concept (POC) for fine-tuning a base model with domain expertise and basic guardrails, reducing the reliance on prompts and multiple filtering mechanisms to moderate the results.
- The model improves the quality of responses about OpenShift topics without RAG content while further improving responses when RAG context is provided.
- The model was created as a POC in a lab environment and, as such, is not intended for production use.
Bias, Risks, and Limitations
The model was trained with basic instructions to refuse to answer questions unrelated to Kubernetes, OpenShift, and related topics. Due to the strict instructions used during training, the model may refuse to answer valid Kubernetes or OpenShift questions when the topics in the context were not present during training.
The model has not been aligned to human social preferences, so the model might produce problematic output. The model might also maintain the limitations and constraints that arise from the base model.
The model was trained on synthetic data, so it may inherit both the advantages and the limitations of the underlying data generation methods.
In the absence of adequate safeguards and RLHF, there exists a risk of malicious utilization of these models for generating disinformation or harmful content. Caution is urged against complete reliance on a specific language model for crucial decisions or impactful information, as preventing these models from fabricating content is not straightforward. Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in ungrounded generation scenarios due to their reduced sizes and memorization capacities. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain.