Instructions to use Vickstester/PV-BioMistral-1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Vickstester/PV-BioMistral-1 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Vickstester/PV-BioMistral-1",
	filename="pv-biomistral-7b-Q4_K_M.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use Vickstester/PV-BioMistral-1 with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Vickstester/PV-BioMistral-1:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Vickstester/PV-BioMistral-1:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Vickstester/PV-BioMistral-1:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Vickstester/PV-BioMistral-1:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Vickstester/PV-BioMistral-1:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Vickstester/PV-BioMistral-1:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Vickstester/PV-BioMistral-1:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Vickstester/PV-BioMistral-1:Q4_K_M

Use Docker

docker model run hf.co/Vickstester/PV-BioMistral-1:Q4_K_M

LM Studio
Jan
Ollama
How to use Vickstester/PV-BioMistral-1 with Ollama:
```
ollama run hf.co/Vickstester/PV-BioMistral-1:Q4_K_M
```

Unsloth Studio new

How to use Vickstester/PV-BioMistral-1 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Vickstester/PV-BioMistral-1 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Vickstester/PV-BioMistral-1 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Vickstester/PV-BioMistral-1 to start chatting

Docker Model Runner
How to use Vickstester/PV-BioMistral-1 with Docker Model Runner:
```
docker model run hf.co/Vickstester/PV-BioMistral-1:Q4_K_M
```

Lemonade

How to use Vickstester/PV-BioMistral-1 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Vickstester/PV-BioMistral-1:Q4_K_M

Run and chat with the model

lemonade run user.PV-BioMistral-1-Q4_K_M

List all available models

lemonade list

pv-biomistral-7b

A pharmacovigilance-specialised language model fine-tuned from Mistral-7B-Instruct-v0.3 on 100,000 FAERS-derived training examples across five structured PV tasks.

This is the community testing release. It contains only the Q4_K_M quantized GGUF for local inference via Ollama or llama-cpp-python.

⚠️ Important Disclaimer

This model is a research prototype intended for pharmacovigilance professionals to evaluate and provide feedback on. It is not a validated system and must not be used for:

Autonomous pharmacovigilance decision-making
Generating or contributing to regulatory submissions
Replacing qualified pharmacovigilance assessor judgment
Clinical or safety-critical decisions of any kind

All model outputs require review by a qualified pharmacovigilance professional. This tool is for exploratory and research purposes only.

Model Details

Property	Value
Base model	mistralai/Mistral-7B-Instruct-v0.3
Fine-tuning method	QLoRA (4-bit NF4, LoRA r=16)
Training records	100,000
Training epochs	3
Data source	FAERS public database (FDA)
Quantization	Q4_K_M (GGUF)
Model size	4.37 GB
Context window	8192 tokens
Framework	TRL 1.0.0, Transformers, PEFT

Setup — Ollama (Recommended)

Requirements

Ollama installed
~5 GB free disk space
8 GB RAM minimum, 16 GB recommended
GPU optional but recommended for faster inference

Installation

Step 1 — Download both files from this repository:

pv-biomistral-7b-Q4_K_M.gguf (4.37 GB)
Modelfile

Place both in the same folder.

Step 2 — Create the Ollama model

cd /path/to/downloaded/files
ollama create pv-mistral-v2 -f Modelfile

Step 3 — Run

ollama run pv-mistral-v2

Windows users: Use the full path e.g. cd C:\Users\YourName\Downloads\pv-model\

Setup — llama-cpp-python (Alternative)

pip install llama-cpp-python[server]

python -m llama_cpp.server \
  --model pv-biomistral-7b-Q4_K_M.gguf \
  --chat_format mistral-instruct \
  --n_gpu_layers -1 \
  --n_ctx 8192

Then open http://localhost:8000/docs for the Swagger UI.

Setup — Jan App (Windows/Mac)

Download Jan
Import Model → select the GGUF file
Set temperature to 0.1 in chat settings
Add system prompt from the Modelfile SYSTEM field

Expected Performance by Hardware

Hardware	Speed	Response Time
Mac Mini M4 / Apple Silicon	25-35 tokens/sec	2-5 sec/case
Windows + NVIDIA GPU (8GB+ VRAM)	25-40 tokens/sec	2-4 sec/case
Snapdragon X Elite (16GB)	8-15 tokens/sec	5-12 sec/case
Windows CPU only (16-24GB RAM)	3-6 tokens/sec	15-30 sec/case

Known Limitations

Probable causality underrepresented: Training data contained only 70 Probable causality examples out of 100,000 records, reflecting real-world FAERS spontaneous reporting patterns. The model may default to Possible even for cases with confirmed positive dechallenge and no confounders.
Spontaneous reports only: Trained exclusively on FAERS spontaneous adverse event reports. Performance on clinical trial safety data, EHR-derived cases, or non-English source material is untested.
Not formally validated: The model has not been validated against any regulatory standard including ICH E2D, ICH E2A, or WHO-UMC guidelines.
Short context optimised: Designed for single-case inputs under 512 tokens.

CIOMS WG XIV Alignment

This model is designed to operate within a Human-in-the-Loop (HITL) framework consistent with CIOMS Working Group XIV recommendations for AI in drug safety. All outputs are decision-support signals requiring human adjudication by a qualified pharmacovigilance professional.

Feedback

This is a community testing release. Please evaluate the model on real cases from your practice area and share findings. Particular interest in:

Causality outputs where you would classify Probable
Cases with unusual drug combinations or rare reactions
Narrative quality from a safety database entry perspective
Therapeutic areas where performance appears weaker

Training Data

Trained on 10,000 cases from the FDA Adverse Event Reporting System (FAERS), accessed via public database export. No proprietary, confidential, or patient-identifiable data beyond what is publicly available in FAERS was used.

License

Base model (Mistral-7B-Instruct-v0.3): Apache 2.0 Fine-tuned weights: CC BY-NC 4.0 (non-commercial research use only)

By downloading this model you agree to use it for research purposes only and not for any commercial application or regulatory submission.

Downloads last month: 2

GGUF

Model size

7B params

Architecture

llama

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Vickstester/PV-BioMistral-1

Base model

mistralai/Mistral-7B-v0.3

Finetuned

mistralai/Mistral-7B-Instruct-v0.3

Quantized

(248)

this model