---
license: apache-2.0
datasets:
  - OpceanAI/Yuuki-dataset
  - bigcode/the-stack
  - a-m-team/AM-DeepSeek-R1-Distilled-1.4M
  - OpceanAI/Yuuki-Personality
language:
  - en
  - es
base_model:
  - Qwen/Qwen2.5-3B
pipeline_tag: text-generation
library_name: transformers
tags:
  - conversation
  - pytorch
  - companion
  - personality
  - fine-tuned
metrics:
  - perplexity
widget:
  - text: Hello, how are you?
    example_title: General Conversation
  - text: Can you help me understand recursion?
    example_title: Technical Explanation
  - text: I've been feeling a bit overwhelmed lately.
    example_title: Emotional Support
---
<div align="center">
<br>
<img src="https://img.shields.io/badge/%E2%9C%A6-YUUKI--NxG-0D1117?style=for-the-badge&labelColor=0D1117" alt="Yuuki NxG" height="50">
<br><br>

# A 3B Companion Model Fine-Tuned on a MacBook Pro

**Personality-aligned language model trained with zero cloud compute budget.**<br>
**Qwen2.5 architecture. 3 billion parameters. MacBook Pro (2020). $0.00.**

<br>

<a href="#benchmark-results"><img src="https://img.shields.io/badge/BENCHMARKS-0D1117?style=for-the-badge" alt="Benchmarks"></a>
<a href="#usage"><img src="https://img.shields.io/badge/USAGE-0D1117?style=for-the-badge" alt="Usage"></a>
<a href="https://github.com/sponsors/aguitauwu"><img src="https://img.shields.io/badge/SPONSOR-0D1117?style=for-the-badge" alt="Sponsor"></a>

<br><br>

[License: Apache 2.0](LICENSE) · [Base model: Qwen/Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) · [Transformers](https://huggingface.co/docs/transformers) · [Trained on a MacBook Pro](https://www.apple.com/macbook-pro/) · [Evaluated with lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)

<br>

---

<br>

</div>
## What is Yuuki NxG?

**Yuuki NxG** is a 3-billion parameter language model fine-tuned from [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B) for open-ended conversation, emotional support, and general-purpose reasoning. It is the flagship release of the NxG model family developed by OpceanAI.

The model was trained entirely on a **MacBook Pro (2020)** with no external compute budget and no cloud GPU infrastructure. All benchmark evaluations were conducted on a Kaggle P100 GPU using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).

Despite being fine-tuned, which typically lowers base-model benchmark scores, and despite being evaluated strictly **0-shot** while competitor results use 5–25-shot prompting, Yuuki NxG achieves the **highest TruthfulQA score** among all compared 3B-scale models, including the Qwen2.5-3B base model from which it was derived.
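
The evaluation setup can be reproduced with lm-evaluation-harness. The snippet below is only a minimal sketch of such a 0-shot run using the harness's Python API; the task names, dtype, and batch size are assumptions, not the exact configuration behind the published scores.

```python
# Minimal sketch of a 0-shot evaluation run with lm-evaluation-harness.
# Task names, dtype, and batch size are assumptions, not the published configuration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=OpceanAI/Yuuki-NxG,dtype=float16",
    tasks=["mmlu", "arc_challenge", "hellaswag", "winogrande", "truthfulqa_mc2"],
    num_fewshot=0,   # all Yuuki NxG numbers in this card are reported 0-shot
    batch_size=8,
)

# Print per-task metrics as returned by the harness
for task, metrics in results["results"].items():
    print(task, metrics)
```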
<br>

---

<br>
<div align="center">

## Model Summary

</div>
<br>
<table>
<tr>
<td width="50%" valign="top">

**Architecture**

| Property | Value |
|:---------|:------|
| Base Model | Qwen2.5-3B |
| Parameters | 3B |
| Fine-tuning | Supervised fine-tuning (SFT) |
| Training Examples | ~5,000 |
| Training Hardware | MacBook Pro (2020) |
| Context Length | 32,768 tokens |

</td>
<td width="50%" valign="top">

**Release**

| Property | Value |
|:---------|:------|
| Organization | OpceanAI |
| Release Date | February 2026 |
| Languages | English, Spanish |
| License | Apache 2.0 |
| Evaluation | lm-evaluation-harness |
| Compute Budget | $0.00 |

</td>
</tr>
</table>
<br>

---

<br>
<div align="center">

## Benchmark Results

</div>
<br>
All Yuuki NxG results are evaluated **0-shot**. Competitor scores are sourced from their official technical reports and use few-shot prompting (5–25 shots depending on the benchmark), so direct numerical comparison systematically favors the few-shot-evaluated base models.

<br>

| Model | MMLU | ARC-C | HellaSwag | WinoGrande | TruthfulQA | Eval |
|:------|:----:|:-----:|:---------:|:----------:|:----------:|:----:|
| **Yuuki NxG** | **60.65** | 45.31 | 52.25 | 63.14 | **50.87** | 0-shot |
| Qwen2.5-3B | 65.6 | 56.5 | 74.6 | 71.1 | 48.9 | 5–25 shot |
| Llama-3.2-3B | 58.0 | 43.0 | 71.0 | 67.0 | 44.0 | 5–25 shot |
| Phi-3-mini (3.8B) | 68.8 | 60.0 | 76.7 | 73.0 | 45.0 | 5–25 shot |
| Gemma-2-2B | 52.0 | 42.0 | 71.0 | 65.0 | 39.0 | 5–25 shot |

<br>

Yuuki NxG posts the highest TruthfulQA score among the compared models, including the base model it was fine-tuned from, despite being the only one evaluated 0-shot. This indicates that alignment fine-tuning improved factual honesty rather than degrading it, an outcome that runs counter to the typical fine-tuning tradeoff.

HellaSwag degradation is an expected tradeoff for personality-aligned models, as sentence-completion benchmarks are sensitive to conversational fine-tuning.
<br>

### MMLU Category Breakdown

<table>
<tr>
<td width="50%" valign="top">

**Strongest Domains**

| Category | Score |
|:---------|:-----:|
| Marketing | 87.18% |
| High School Psychology | 83.67% |
| Sociology | 80.60% |
| World Religions | 80.12% |
| US Foreign Policy | 79.00% |
| Logical Fallacies | 76.69% |
| HS Computer Science | 76.00% |

</td>
<td width="50%" valign="top">

**Domain Averages**

| Domain | Score |
|:-------|:-----:|
| Social Sciences | 71.56% |
| Other | 66.08% |
| STEM | 56.17% |
| Humanities | 52.92% |
| **Overall** | **60.65%** |

</td>
</tr>
</table>

The performance profile is consistent with a model optimized for conversation: strongest in the social sciences and psychology, weaker in formal STEM domains and in the humanities average. This is the expected and intended tradeoff for a companion-purpose model.
<br>

---

<br>
<div align="center">

## NxG Model Family

</div>
<br>
<table>
<tr>
<td width="50%" valign="top">

**Released Models**

| Model | Parameters | Description |
|:------|:----------:|:------------|
| Yuuki NxG | 3B | Full model, general conversation |
| Yuuki NxG Nano | 81M | Lightweight, constrained environments |

</td>
<td width="50%" valign="top">

**Community GGUF (via mradermacher)**

Quantized independently and without solicitation, reflecting organic community adoption prior to any formal announcement.

| Format | Size |
|:-------|:----:|
| Q4_K_M | 2.0 GB |
| Q8_0 | 3.4 GB |
| F16 | 6.3 GB |

Available at [mradermacher/Yuuki-NxG-GGUF](https://huggingface.co/mradermacher/Yuuki-NxG-GGUF); a download-and-run sketch follows this table.

</td>
</tr>
</table>
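
The community quants can be fetched and run directly from Python with `huggingface_hub` and `llama-cpp-python`. The sketch below is illustrative only: the GGUF filename is an assumption (check the repository's file listing for the exact name), and the sampling values mirror the recommended parameters in the Usage section.

```python
# Sketch: download a community GGUF quant and chat with it via llama-cpp-python.
# The filename below is an assumption; check the repo's file listing for the exact name.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="mradermacher/Yuuki-NxG-GGUF",
    filename="Yuuki-NxG.Q4_K_M.gguf",  # hypothetical filename
)

llm = Llama(model_path=gguf_path, n_ctx=4096)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    temperature=0.7,
    top_p=0.9,
    repeat_penalty=1.1,
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```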
<br>

---

<br>
<div align="center">

## Usage

</div>
<br>

### With Transformers (PyTorch)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "OpceanAI/Yuuki-NxG"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Hello, how are you?"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        repetition_penalty=1.1
    )

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```
<br>

### With llama.cpp (GGUF)

```bash
# Note: newer llama.cpp builds name this binary `llama-cli` instead of `main`
./llama.cpp/main -m yuuki-nxg-q4_k_m.gguf \
  -p "Hello, how are you?" \
  -n 256 \
  -t 4 \
  --temp 0.7 \
  --repeat-penalty 1.1
```

<br>

### With Ollama

```bash
cat > Modelfile << EOF
FROM ./yuuki-nxg-q4_k_m.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
EOF

ollama create yuuki-nxg -f Modelfile
ollama run yuuki-nxg "Hello, how are you?"
```

<br>
### Recommended Parameters

| Parameter | Value |
|:----------|:-----:|
| Temperature | 0.7 |
| Top-p | 0.9 |
| Max new tokens | 512–2048 |
| Repetition penalty | 1.1 |
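
One way to apply these defaults consistently with Transformers is to attach them to the model as a `GenerationConfig`, so every `generate()` call inherits them. A small sketch, reusing the `model` and `inputs` objects from the Transformers example above:

```python
# Attach the recommended sampling defaults so generate() inherits them.
from transformers import GenerationConfig

model.generation_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    max_new_tokens=512,
)

# Subsequent calls no longer need the sampling arguments repeated:
outputs = model.generate(inputs)
```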
<br>

---

<br>
<div align="center">

## Training Details

</div>
<br>
<table>
<tr>
<td width="50%" valign="top">

**Hardware**

| Component | Specification |
|:----------|:-------------|
| Device | MacBook Pro (2020) |
| Chip | Intel Core i5 |
| RAM | 16GB LPDDR4X |
| GPU | Intel Iris Plus |
| Cloud Compute | None |
| Cost | $0.00 |

</td>
<td width="50%" valign="top">

**Training Configuration**

| Parameter | Value |
|:----------|:-----:|
| Base Model | Qwen2.5-3B |
| Method | Supervised Fine-Tuning |
| Training Examples | ~5,000 |
| Optimizer | AdamW |
| Learning Rate | 2e-5 |
| Max Sequence Length | 2,048 tokens |

</td>
</tr>
</table>
<br>

Yuuki NxG was produced through supervised fine-tuning on a curated conversational dataset. The training objective was to produce a model with consistent personality, high factual honesty, and broad general-knowledge retention from the Qwen2.5 base.

Training without GPU-accelerated cloud infrastructure imposes constraints on batch size and total training duration relative to commercially produced models. The resulting benchmark profile reflects these constraints: strong performance in domains well-represented in the training data, with expected degradation in areas requiring dense technical knowledge such as formal mathematics and physics.
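
The exact training script is not published in this card, so the following is only an illustrative sketch of what a supervised fine-tuning loop matching the hyperparameters above (AdamW, learning rate 2e-5, 2,048-token sequences) might look like with the plain Transformers `Trainer`. The dataset schema, batch size, and epoch count are assumptions.

```python
# Illustrative SFT sketch only. Hyperparameters mirror the table above;
# the dataset schema ("messages"), batch size, and epoch count are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "Qwen/Qwen2.5-3B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Hypothetical conversational dataset with a "messages" column of chat turns
dataset = load_dataset("OpceanAI/Yuuki-dataset", split="train")

def to_features(example):
    # Render the conversation with the chat template, then tokenize to <= 2,048 tokens
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=2048)

tokenized = dataset.map(to_features, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="yuuki-nxg-sft",
    learning_rate=2e-5,               # from the training configuration table
    optim="adamw_torch",              # AdamW
    per_device_train_batch_size=1,    # assumption: small batch for consumer hardware
    gradient_accumulation_steps=8,    # assumption
    num_train_epochs=3,               # assumption
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```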
<br>

---

<br>
<div align="center">

## Features

</div>
<br>
<table>
<tr>
<td width="50%" valign="top">

**Personality Alignment**

Fine-tuned for consistent, context-aware conversation. The model maintains a coherent identity across extended dialogues, with particular strength in emotional support and casual Q&A.

**Factual Honesty**

Achieves the highest TruthfulQA score (50.87%) among all compared 3B-scale models, including its own base model. Fine-tuning improved factual calibration rather than degrading it.

**Multilingual**

Functional in both English and Spanish. Primary evaluation is in English; Spanish capability is inherited from Qwen2.5 pretraining.

</td>
<td width="50%" valign="top">

**Zero-Budget Training**

Trained entirely on owned hardware with no cloud compute expenditure. Demonstrates that meaningful alignment fine-tuning is accessible without data center infrastructure.

**Community Adoption**

Independently quantized and distributed by mradermacher before any formal announcement, reflecting organic community interest in the model's capabilities.

**Open Source**

Apache 2.0. Use commercially, modify, distribute. Full transparency on training methodology and evaluation protocol.

</td>
</tr>
</table>
<br>

---

<br>
<div align="center">

## Limitations

</div>
<br>

- **Mathematical reasoning** performance is below the Qwen2.5-3B base. Users requiring quantitative precision should use tool augmentation or a specialized model.
- **HellaSwag degradation** reflects the standard tradeoff of personality fine-tuning on sentence-completion benchmarks.
- **Benchmark methodology**: Yuuki NxG is evaluated 0-shot while competitor reports use 5–25 shot prompting, creating a systematic disadvantage in direct comparisons.
- **Safety alignment** has not been formally evaluated. Not recommended for adversarial or high-stakes deployment without additional safety filtering.
- **Training scale**: ~5,000 examples on consumer hardware impose generalization limits relative to commercially scaled models.
<br>

---

<br>
<div align="center">

## Intended Use

</div>
<br>
<table>
<tr>
<td width="50%" valign="top">

**Intended For**

- General-purpose conversational assistance
- Emotional support and companionship applications
- Educational Q&A in humanities and social sciences
- Research into small-scale fine-tuning and personality alignment
- Local deployment on consumer hardware

</td>
<td width="50%" valign="top">

**Not Intended For**

- Medical, legal, or financial advice
- Tasks requiring high-precision mathematical reasoning
- Applications requiring certified safety alignment
- Production systems without additional safety review

</td>
</tr>
</table>
<br>

---

<br>
<div align="center">

## Philosophy

</div>
<br>
| > **"Meaningful AI development does not require a data center. It requires patience, clarity of purpose, and time."** | |
| Yuuki NxG was built to demonstrate that a fine-tuned 3B model trained by one person on owned hardware can compete with base models from large organizations on key benchmarks — and surpass them where it matters most. | |
<br>

---

<br>
<div align="center">

## Related Projects

</div>
<br>

| Project | Description |
|:--------|:------------|
| [Yuuki-NxG-Nano](https://huggingface.co/OpceanAI/Yuuki-NxG-Nano) | 81M lightweight variant |
| [Yuuki-3.7](https://huggingface.co/OpceanAI/Yuuki-3.7) | Earlier code generation checkpoint |
| [Yuuki-best](https://huggingface.co/OpceanAI/Yuuki-best) | Best checkpoint of the v0.1 series |
| [yuy](https://github.com/YuuKi-OS/yuy) | CLI for managing and running Yuuki models |
| [yuy-chat](https://github.com/YuuKi-OS/yuy-chat) | TUI chat interface |
| [Yuuki-chat](https://github.com/YuuKi-OS/Yuuki-chat) | Web-based chat interface |
| [Yuuki Space](https://huggingface.co/spaces/OpceanAI/Yuuki) | Interactive demo |

<br>

---

<br>
<div align="center">

## Links

</div>
<br>
<div align="center">

[Model on Hugging Face](https://huggingface.co/OpceanAI/Yuuki-NxG) · [Interactive Demo (Space)](https://huggingface.co/spaces/OpceanAI/Yuuki) · [Community GGUF](https://huggingface.co/mradermacher/Yuuki-NxG-GGUF)
<br>
[yuy CLI](https://github.com/YuuKi-OS/yuy) · [GitHub Sponsors](https://github.com/sponsors/aguitauwu) · [Discord](https://discord.gg/j8zV2u8k)

</div>
<br>

---

<br>
<div align="center">

## Community

</div>
<br>

- [Discord Server](https://discord.gg/j8zV2u8k) — Development discussion and user community
- [Twitter](https://twitter.com/aguitauwu) — Updates and announcements
- [GitHub](https://github.com/aguitauwu) — Source code and training scripts
- [GitHub Sponsors](https://github.com/sponsors/aguitauwu) — Support the project
- [Ollama](https://ollama.com/aguitachan3/yuuki-nxg) — Run locally with Ollama

<br>

---

<br>
<div align="center">

## Citation

</div>
<br>

```bibtex
@misc{awa_omg_2026,
  author    = { awa_omg },
  title     = { Yuuki-NxG (Revision 9a924f0) },
  year      = 2026,
  url       = { https://huggingface.co/OpceanAI/Yuuki-NxG },
  doi       = { 10.57967/hf/7915 },
  publisher = { Hugging Face }
}
```
<br>

---

<br>
<div align="center">

## License

</div>
<br>

```
Apache License 2.0
Copyright (c) 2026 OpceanAI

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

Use commercially, modify, distribute. Attribution required.

<br>

---

<br>
<div align="center">

## Updates

</div>
<br>

| Date | Milestone |
|:-----|:----------|
| **2026-02-27** | Benchmark evaluation completed (Kaggle P100) |
| **2026-02-27** | TruthfulQA: 50.87% — best among all compared 3B models |
| **2026-02-27** | Community GGUF quantization by mradermacher |
| **2026-02-27** | Yuuki NxG released on HuggingFace |

**Last updated:** 2026-02-27
<br>

---

<br>
<div align="center">

**Built on a MacBook Pro. Trained on ~5,000 examples. Competitive with models from teams of hundreds.**

<br>

[OpceanAI on Hugging Face](https://huggingface.co/OpceanAI)

<br>

*The NxG family. More releases coming.*

</div>