GGUF
English
text-deidentification
privacy
pii-removal
text2text-generation
medical
legal
hr
llama
minibase
small-model
2048-context
Eval Results (legacy)
#!/usr/bin/env python3
"""
DeId-Small Inference Client

A Python client for running inference with the Minibase-DeId-Small model.
Sends text de-identification requests to a local llama.cpp server.
"""

import time
from typing import Any, Dict, Optional, Tuple

import requests


class DeIdClient:
    """
    Client for the DeId-Small de-identification model.

    This client communicates with a local llama.cpp server running the
    Minibase-DeId-Small model for text de-identification tasks.
    """

    def __init__(self, base_url: str = "http://127.0.0.1:8000", timeout: int = 30):
        """
        Initialize the DeId client.

        Args:
            base_url: Base URL of the llama.cpp server
            timeout: Request timeout in seconds
        """
        self.base_url = base_url.rstrip('/')
        self.timeout = timeout
        self.default_instruction = "De-identify this text by replacing all personal information with placeholders."

    def _make_request(self, prompt: str, max_tokens: int = 256,
                      temperature: float = 0.1) -> Tuple[str, float]:
        """
        Make a completion request to the model.

        Failures are not raised: they are returned as a string that
        starts with "Error: " in place of the response text.

        Args:
            prompt: The input prompt
            max_tokens: Maximum tokens to generate
            temperature: Sampling temperature

        Returns:
            Tuple of (response_text, latency_ms)
        """
        payload = {
            "prompt": prompt,
            "max_tokens": max_tokens,
            "temperature": temperature
        }
        headers = {'Content-Type': 'application/json'}

        # perf_counter() is monotonic, so the measured interval is not
        # affected by system clock adjustments.
        start_time = time.perf_counter()
        try:
            response = requests.post(
                f"{self.base_url}/completion",
                json=payload,
                headers=headers,
                timeout=self.timeout
            )
            latency = (time.perf_counter() - start_time) * 1000  # Convert to ms
            if response.status_code == 200:
                result = response.json()
                return result.get('content', ''), latency
            else:
                return f"Error: Server returned status {response.status_code}", latency
        except requests.exceptions.RequestException as e:
            latency = (time.perf_counter() - start_time) * 1000
            return f"Error: {e}", latency

    def deidentify_text(self, text: str, instruction: Optional[str] = None,
                        max_tokens: int = 256, temperature: float = 0.1) -> str:
        """
        De-identify a text by replacing personal identifiers with placeholders.

        Args:
            text: The text to de-identify
            instruction: Custom instruction (uses default if None)
            max_tokens: Maximum tokens to generate
            temperature: Sampling temperature (lower = more consistent)

        Returns:
            De-identified text with placeholders
        """
        if instruction is None:
            instruction = self.default_instruction

        prompt = f"Instruction: {instruction}\n\nInput: {text}\n\nResponse: "
        response, _ = self._make_request(prompt, max_tokens, temperature)
        return response

    def deidentify_batch(self, texts: list, instruction: Optional[str] = None,
                         max_tokens: int = 256, temperature: float = 0.1) -> list:
        """
        De-identify multiple texts, one request per text, in order.

        Args:
            texts: List of texts to de-identify
            instruction: Custom instruction for all texts
            max_tokens: Maximum tokens per response
            temperature: Sampling temperature

        Returns:
            List of de-identified texts
        """
        results = []
        for text in texts:
            result = self.deidentify_text(text, instruction, max_tokens, temperature)
            results.append(result)
        return results

    def health_check(self) -> bool:
        """
        Check if the model server is healthy and responding.

        Returns:
            True if server is healthy, False otherwise
        """
        try:
            # Probe the completion endpoint with a one-token request
            response = requests.post(
                f"{self.base_url}/completion",
                json={"prompt": "Hello", "max_tokens": 1},
                timeout=5
            )
            return response.status_code == 200
        except requests.exceptions.RequestException:
            # Connection refused, timeout, and similar network failures
            return False

    def get_server_info(self) -> Optional[Dict[str, Any]]:
        """
        Get server information if available.

        Returns:
            Server info dict or None if unavailable
        """
        try:
            response = requests.get(f"{self.base_url}/props", timeout=5)
            if response.status_code == 200:
                return response.json()
        except (requests.exceptions.RequestException, ValueError):
            # Network failure, or a 200 response whose body is not valid JSON
            pass
        return None


def main():
    """Example usage of the DeId client."""
    client = DeIdClient()

    # Check server health
    if not client.health_check():
        print("❌ Error: DeId-Small server not responding. Please start the server first.")
        print("   Run: ./Minibase-personal-id-masking-small.app/Contents/MacOS/run_server")
        return

    print("✅ DeId-Small server is running!")

    # Example texts to de-identify
    examples = [
        "Patient John Smith, born 1985-03-15, lives at 123 Main Street, Boston MA.",
        "Dr. Sarah Johnson called from (555) 123-4567 about the appointment.",
        "Employee Jane Doe earns $75,000 annually at TechCorp Inc.",
        "Customer Michael Brown reported issue with Order #CUST-12345."
    ]

    print("\n🔒 De-identification Examples:")
    print("=" * 50)

    for i, text in enumerate(examples, 1):
        print(f"\n📝 Example {i}:")
        print(f"Input: {text}")
        clean_text = client.deidentify_text(text)
        print(f"Output: {clean_text}")

    print("\n✨ De-identification complete!")


if __name__ == "__main__":
    main()
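
The client above reports failures as ordinary strings beginning with `Error: ` (see `_make_request`), so a caller of `deidentify_batch` cannot tell a failed request from model output without inspecting the text. The sketch below shows one possible way to separate the two, and reproduces the prompt template from `deidentify_text` as a standalone function so it can be checked without a running server. The names `build_prompt` and `partition_results` are hypothetical helpers, not part of the original client, and the prefix check is only a heuristic: the model could in principle produce text that starts with `Error: `.

```python
from typing import List, Tuple

# Same default instruction string as DeIdClient.default_instruction.
DEFAULT_INSTRUCTION = (
    "De-identify this text by replacing all personal information with placeholders."
)

Pair = Tuple[str, str]


def build_prompt(text: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Reproduce the prompt template used by DeIdClient.deidentify_text."""
    return f"Instruction: {instruction}\n\nInput: {text}\n\nResponse: "


def partition_results(inputs: List[str], outputs: List[str]) -> Tuple[List[Pair], List[Pair]]:
    """Split (input, output) pairs into successes and failures.

    Assumption: a failure is any output starting with "Error: ", which is
    the prefix DeIdClient._make_request uses. This is a heuristic only.
    """
    succeeded: List[Pair] = []
    failed: List[Pair] = []
    for source, result in zip(inputs, outputs):
        target = failed if result.startswith("Error: ") else succeeded
        target.append((source, result))
    return succeeded, failed


if __name__ == "__main__":
    print(repr(build_prompt("Dr. Sarah Johnson called about the appointment.")))

    # The outputs below are made up for illustration; they are not real
    # model output.
    succeeded, failed = partition_results(
        ["first text", "second text"],
        ["placeholder output", "Error: Server returned status 503"],
    )
    print(f"{len(succeeded)} succeeded, {len(failed)} failed")
```

In practice the `outputs` list would come from `client.deidentify_batch(inputs)`, and the failed inputs could then be retried or logged rather than stored as if they were de-identified text.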