Instructions for using boricua/shiftdocs-7b-ocp4.15-v0.3 with libraries and local apps.

Libraries

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="boricua/shiftdocs-7b-ocp4.15-v0.3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("boricua/shiftdocs-7b-ocp4.15-v0.3")
model = AutoModelForCausalLM.from_pretrained("boricua/shiftdocs-7b-ocp4.15-v0.3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

llama-cpp-python

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="boricua/shiftdocs-7b-ocp4.15-v0.3",
	filename="ggml-model-f16.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Local Apps

llama.cpp

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16
# Run inference directly in the terminal:
llama-cli -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16
# Run inference directly in the terminal:
llama-cli -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16
# Run inference directly in the terminal:
./llama-cli -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf boricua/shiftdocs-7b-ocp4.15-v0.3:F16

Use Docker

docker model run hf.co/boricua/shiftdocs-7b-ocp4.15-v0.3:F16

vLLM

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "boricua/shiftdocs-7b-ocp4.15-v0.3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "boricua/shiftdocs-7b-ocp4.15-v0.3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
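
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request`, `extract_reply`, and `ask` are illustrative and not part of vLLM. It assumes the server started above is listening on `localhost:8000` and returns the usual OpenAI-style chat completion body.

```python
import json
import urllib.request

MODEL_ID = "boricua/shiftdocs-7b-ocp4.15-v0.3"


def build_chat_request(question, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    return response_body["choices"][0]["message"]["content"]


def ask(question, base_url="http://localhost:8000"):
    """Send the question to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(question, base_url)) as response:
        return extract_reply(json.load(response))
```

Calling `ask("What is OpenShift?")` requires the server to be running. Passing `base_url="http://localhost:30000"` should work the same way against an OpenAI-compatible server on that port.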

Use Docker

docker model run hf.co/boricua/shiftdocs-7b-ocp4.15-v0.3:F16

SGLang

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "boricua/shiftdocs-7b-ocp4.15-v0.3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "boricua/shiftdocs-7b-ocp4.15-v0.3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "boricua/shiftdocs-7b-ocp4.15-v0.3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "boricua/shiftdocs-7b-ocp4.15-v0.3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with Ollama:
```
ollama run hf.co/boricua/shiftdocs-7b-ocp4.15-v0.3:F16
```

Unsloth Studio

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for boricua/shiftdocs-7b-ocp4.15-v0.3 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for boricua/shiftdocs-7b-ocp4.15-v0.3 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for boricua/shiftdocs-7b-ocp4.15-v0.3 to start chatting

Docker Model Runner
How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with Docker Model Runner:
```
docker model run hf.co/boricua/shiftdocs-7b-ocp4.15-v0.3:F16
```

Lemonade

How to use boricua/shiftdocs-7b-ocp4.15-v0.3 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull boricua/shiftdocs-7b-ocp4.15-v0.3:F16

Run and chat with the model

lemonade run user.shiftdocs-7b-ocp4.15-v0.3-F16

List all available models

lemonade list

Model Card

This model is Mistral-7B-Instruct-v0.2 fine-tuned on OpenShift 4.15 documentation using 45212 Q&A pairs.

Fine tuned by: William Caban
License: Apache 2.0
Context length: 32K (base model)
OpenShift 4.15 Knowledge cutoff date: April 12 2024

Method

The Q&A corpus was generated using the following methodology:

Generated 5 Q&A pairs for each page of the OpenShift (OCP) 4.15 PDFs longer than 1500 characters. The length threshold was chosen to exclude the title page and pages without much content.
Mistral-7B-Instruct-v0.2 was used to generate the questions for each page.
Mixtral-8x22B-Instruct-v0.1 was used to generate the answer from the content of the page.
A voting evaluation between Mixtral-8x22B and Llama3-7B was used to assess the quality of each Q&A pair in relation to the page content and to remove low-quality entries.
Removed Q&A pairs whose questions contained phrases or words like "this example", "this context", "this document", "trademark" and "copyright".
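
The two rule-based steps of this methodology (the page-length threshold and the phrase filter) can be sketched as follows. This is an illustration, not the author's pipeline: the function names, the `"question"` dictionary key, and the case-insensitive substring match are assumptions. The question generation, answer generation, and voting steps are left out because they depend on model calls and prompts that the description does not give.

```python
MIN_PAGE_CHARS = 1500  # threshold stated in the method description
BANNED_PHRASES = (
    "this example",
    "this context",
    "this document",
    "trademark",
    "copyright",
)


def keep_page(page_text):
    """Keep only pages longer than the 1500-character threshold."""
    return len(page_text) > MIN_PAGE_CHARS


def keep_question(question):
    """Reject questions containing a banned phrase (case-insensitive: an assumption)."""
    lowered = question.lower()
    return not any(phrase in lowered for phrase in BANNED_PHRASES)


def filter_pairs(pairs):
    """Return the Q&A pairs whose question passes the phrase filter.

    Each pair is assumed to be a dict with a "question" key (hypothetical shape).
    """
    return [pair for pair in pairs if keep_question(pair["question"])]
```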

The resulting corpus contains 45212 Q&A pairs. The corpus was divided into training (42951 Q&A pairs) and eval (2261 Q&A pairs) sets.
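
The stated counts are consistent with holding out 5% of the corpus for evaluation (45212 × 0.05 = 2260.6, which rounds to 2261). The ratio, the shuffling, and the seed in the sketch below are assumptions inferred from those numbers; the actual split procedure is not described.

```python
import random


def split_corpus(pairs, eval_fraction=0.05, seed=0):
    """Shuffle the pairs and hold out a fraction of them for evaluation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    eval_size = round(len(shuffled) * eval_fraction)
    return shuffled[eval_size:], shuffled[:eval_size]


# Stand-in corpus of 45212 items; real entries would be Q&A pairs.
train, eval_set = split_corpus(range(45212))
print(len(train), len(eval_set))  # 42951 2261
```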

The model was trained for 3000 iterations.

KNOWN LIMITATIONS: There is a significant drop in accuracy and performance when using a quantized version of this model.

Using the model

When used in combination with RAG, the model prefers a CONTEXT section from which to augment its knowledge:

## INSTRUCTIONS
<your_instructions_here>

## TASK
<what_you_want_the_model_to_achieve>

## CONTEXT
<any_new_or_additional_context_for_answering_question>

## QUESTION
<question_from_user>
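
A prompt in this layout can be assembled with a small helper. The sketch below is one possible way to do it: the function name `build_prompt` and its parameters are illustrative, and the choice to always emit all four sections, separated by blank lines, is an assumption based on the template above.

```python
def build_prompt(question, instructions, task, context):
    """Assemble the four-section prompt in the order shown in the template."""
    sections = [
        ("INSTRUCTIONS", instructions),
        ("TASK", task),
        ("CONTEXT", context),
        ("QUESTION", question),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)


prompt = build_prompt(
    question="What is a Route?",
    instructions="Answer briefly.",
    task="Explain the concept to a new cluster administrator.",
    context="A Route exposes a service at a host name.",
)
# The prompt is then sent as the content of a single user message.
messages = [{"role": "user", "content": prompt}]
```

The resulting `messages` list has the same shape as the one used in the Transformers and llama-cpp-python snippets earlier on this page.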

Examples of questions and answers when running in greedy mode:

>>> What is OpenShift?
  According to the context, OpenShift is a platform for deploying and managing container applications. It provides an integrated environment for developing, deploying, and managing containerized applications, allowing users to take advantage of the benefits of containerization while also providing the necessary tools and infrastructure for managing these applications in production environments.

>>> who is the president?                                                                                                                                                                  
  According to the context, I'm not designed to answer questions about politics or other topics beyond my expertise. I'm specifically trained to help with Kubernetes-related questions.

>>> what is the default network CIDR in OpenShift? 
  According to the context, the default network CIDR for OpenShift is 172.30.0.0/16, which is specified in the `networkCidr` field of the `machineConfig` section.

>>> Provide an example for baremetal installer to use a local RHCOS image
  According to the context, here is an example configuration for using a local Red Hat OpenShift Container Service (RHCOS) image as baremetal installer:                                               
                                                                                                                                                                                                       
  '''yaml                                                                                                                                                                                              
  apiVersion: core.ocp.openshift.io/v1                                                                                                                                                                 
  kind: BareMetalInstaller                                                                                                                                                                             
  metadata:                                                                                                                                                                                            
  name: local-rhcos-image                                                                                                                                                                              
  spec:                                                                                                                                                                                                
    image:                                                                                                                                                                                             
      localImage:                                                                                                                                                                                      
        localImage: <path_to_RHCOS_image>                                                                                                                                                              
      containerName: <RHCOS_container_name>                                                                                                                                                            
    installerConfig:                                                                                                                                                                                   
      installerType: bareMetalInstaller                                                                                                                                                                
  ''''                                                                                                                                                                                                  

>>> how to upgrade openshift using CLI?
 According to the context, you can upgrade OpenShift using the OpenShift CLI (oc) with the following command:                                                                                          
                                                                                                                                                                                                       
  `oc upgrade openshift --version=<new_version>`                                                                                                                                                        
                                                                                                                                                                                                        
  Replace `<new_version>` with the desired version number for your upgrade.                                                                                                                             

>>> who trained you?
  I'm based on the provided context, and I'm specifically trained to answer Kubernetes-related questions. The context does not provide information about who trained me or how I was trained.
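
Greedy mode means the highest-probability token is chosen at every step, with no sampling. The runtime used for the transcripts above is not stated; with the Transformers snippet from the top of this page, greedy decoding can be requested roughly as sketched below. The helper name is illustrative, while `do_sample`, `num_beams`, and `max_new_tokens` are standard `generate` arguments.

```python
def greedy_generation_kwargs(max_new_tokens=256):
    """Arguments for `model.generate` that disable sampling and beam search."""
    return {
        "do_sample": False,  # always pick the most likely next token
        "num_beams": 1,      # no beam search
        "max_new_tokens": max_new_tokens,
    }


# Usage with the model and inputs loaded as in the Transformers example:
# outputs = model.generate(**inputs, **greedy_generation_kwargs())
```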

Intended Use

This model is a quick proof of concept (POC) of fine-tuning a base model with domain expertise and basic guardrails, to reduce the reliance on prompts and on multiple filtering mechanisms to moderate the results.
The model improves the quality of responses about OpenShift topics without RAG content while further improving responses when RAG context is provided.
The model was created as a POC in a lab environment and as such it is not intended for production use.

Bias, Risks, and Limitations

The model was trained with basic instructions to refuse to answer questions unrelated to Kubernetes, OpenShift, and related topics. Because of these strict instructions during training, the model may refuse to answer valid Kubernetes or OpenShift questions when the topics in the context were not present during training.
The model has not been aligned to human social preferences, so the model might produce problematic output. The model might also maintain the limitations and constraints that arise from the base model.
The model was trained on synthetic data, so it may inherit both the advantages and the limitations of the underlying data generation methods.
In the absence of adequate safeguards and RLHF, there exists a risk of malicious utilization of these models for generating disinformation or harmful content. Caution is urged against complete reliance on a specific language model for crucial decisions or impactful information, as preventing these models from fabricating content is not straightforward. Additionally, it remains uncertain whether smaller models might exhibit increased susceptibility to hallucination in ungrounded generation scenarios due to their reduced sizes and memorization capacities. This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain.

Safetensors

Model size: 7B params
Tensor type: F16