Instructions for using StrangeSX/Saraa-8B-ORPO-AUNQA with libraries and local apps.

Libraries

How to use StrangeSX/Saraa-8B-ORPO-AUNQA with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="StrangeSX/Saraa-8B-ORPO-AUNQA")

# Load model directly
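from transformers import AutoModelForCausalLM
# AutoModelForCausalLM keeps the language-modeling head that generate() needs
model = AutoModelForCausalLM.from_pretrained("StrangeSX/Saraa-8B-ORPO-AUNQA", dtype="auto")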

Local Apps

vLLM

How to use StrangeSX/Saraa-8B-ORPO-AUNQA with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "StrangeSX/Saraa-8B-ORPO-AUNQA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StrangeSX/Saraa-8B-ORPO-AUNQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
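
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_completion_payload` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "StrangeSX/Saraa-8B-ORPO-AUNQA"


def build_completion_payload(prompt, max_tokens=512, temperature=0.5):
    """Build the JSON body for the OpenAI-compatible /v1/completions route."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", **kwargs):
    """POST a completion request and return the generated text.

    Assumes a server is already running at base_url.
    """
    body = json.dumps(build_completion_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # The completions route returns the generated text under choices[0].text
    return result["choices"][0]["text"]


# Example (requires a running server):
# print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` should target an SGLang server launched as described in the SGLang section, since both expose the same OpenAI-compatible route.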

Use Docker

docker model run hf.co/StrangeSX/Saraa-8B-ORPO-AUNQA

SGLang

How to use StrangeSX/Saraa-8B-ORPO-AUNQA with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "StrangeSX/Saraa-8B-ORPO-AUNQA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StrangeSX/Saraa-8B-ORPO-AUNQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "StrangeSX/Saraa-8B-ORPO-AUNQA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StrangeSX/Saraa-8B-ORPO-AUNQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Unsloth Studio

How to use StrangeSX/Saraa-8B-ORPO-AUNQA with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for StrangeSX/Saraa-8B-ORPO-AUNQA to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for StrangeSX/Saraa-8B-ORPO-AUNQA to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for StrangeSX/Saraa-8B-ORPO-AUNQA to start chatting

Load model with FastModel

# Install Unsloth first (run in a shell, not in Python): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="StrangeSX/Saraa-8B-ORPO-AUNQA",
    max_seq_length=2048,
)

Docker Model Runner
How to use StrangeSX/Saraa-8B-ORPO-AUNQA with Docker Model Runner:
```
docker model run hf.co/StrangeSX/Saraa-8B-ORPO-AUNQA
```

SARAA-8B-ORPO-AUNQA: Self-Assessment Report Analysis Assistant

📋 Model Description

SARAA-8B-ORPO-AUNQA is a specialized large language model fine-tuned for analyzing Self-Assessment Reports according to ASEAN University Network Quality Assurance (AUN-QA) standards. This model is designed to assist educational institutions in evaluating and improving their quality assurance processes through intelligent document analysis and interactive Q&A capabilities.

Developed by: StrangeSX
Model Type: Causal Language Model (Fine-tuned Llama-3-8B)
Language(s): English, Thai
License: Apache 2.0
Finetuned from: unsloth/llama-3-8b-bnb-4bit
Training Framework: Unsloth 🦥

🎯 Intended Use

Primary Use Cases

Document Analysis: Analyze self-assessment reports for AUN-QA compliance
Quality Assurance: Provide insights on educational quality standards
Interactive Q&A: Answer questions about report content and recommendations
Educational Assessment: Support institutional evaluation processes

Target Users

Educational institutions in ASEAN region
Quality assurance officers
Academic administrators
Educational consultants

🚀 Model Performance

| Metric | Score | Description |
|---|---|---|
| Accuracy | 94.2% | Overall response accuracy on AUN-QA dataset |
| BLEU Score | 0.847 | Text generation quality |
| ROUGE-L | 0.892 | Summary and analysis quality |
| Response Time | <2s | Average inference time |
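
For readers unfamiliar with ROUGE-L, the sketch below shows how such a score is typically derived from the longest common subsequence (LCS) between a reference and a generated text. This card does not state which implementation or tokenization produced the reported value, so the whitespace tokenization and the F1 variant used here are assumptions. Whitespace splitting also does not segment Thai text, which would need a dedicated tokenizer.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    previous = [0] * (len(b) + 1)
    for token_a in a:
        current = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 over lowercase, whitespace-split tokens (an assumption)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of candidate tokens in the LCS
    recall = lcs / len(ref_tokens)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```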

🛠️ Technical Details

Training Configuration

Base Model: Llama-3-8B (4-bit quantized)
Training Method: ORPO (Odds Ratio Preference Optimization)
Training Framework: Unsloth + TRL
Hardware: NVIDIA GPU with 24GB VRAM
Training Time: ~6 hours (2x faster with Unsloth)
Memory Usage: 70% less VRAM compared to standard training
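
To make the training method concrete, here is a small sketch of the ORPO objective for a single preference pair: a supervised negative log-likelihood term on the chosen response plus a weighted odds-ratio penalty that favors the chosen response over the rejected one. The weight `lam=0.1` and the function names are illustrative assumptions; this card does not report the hyperparameters actually used, and real training computes these terms over token-level log-probabilities in batches.

```python
import math


def log_odds(avg_logp):
    """log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """ORPO loss for one pair, given length-averaged log-probabilities."""
    # Supervised term: negative log-likelihood of the chosen response
    nll = -chosen_avg_logp
    # Odds-ratio term: -log sigmoid(log odds(chosen) - log odds(rejected))
    diff = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    odds_ratio_term = math.log1p(math.exp(-diff))
    return nll + lam * odds_ratio_term
```

The penalty shrinks as the model assigns higher odds to the chosen response than to the rejected one, so no separate reference model is needed.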

Model Architecture

Parameters: ~8 billion
Context Length: 8,192 tokens
Vocabulary Size: 128,256
Attention Heads: 32
Hidden Size: 4,096
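
A few derived quantities follow directly from the figures above. The short calculation below is a sketch: the per-head dimension and embedding size use only the listed values, while the fp16 memory estimate assumes a nominal 8 billion parameters at 2 bytes each and ignores activations and the KV cache.

```python
hidden_size = 4096
attention_heads = 32
vocab_size = 128_256

# Each attention head operates on an equal slice of the hidden state
head_dim = hidden_size // attention_heads

# One embedding row of hidden_size values per vocabulary entry
embedding_params = vocab_size * hidden_size

# Assumption: nominal 8e9 parameters stored as 2-byte fp16 values
fp16_weight_gib = 8e9 * 2 / 2**30

print(head_dim)                   # 128
print(embedding_params)           # 525336576
print(round(fp16_weight_gib, 1))  # 14.9
```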

📊 Training Data

The model was fine-tuned on a curated dataset containing:

AUN-QA standard documents and guidelines
Self-assessment report examples
Quality assurance best practices
Educational evaluation criteria
Multi-turn conversation data for Q&A scenarios

🔧 Usage

Quick Start with Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "StrangeSX/Saraa-8B-ORPO-AUNQA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Example usage
prompt = "Analyze this self-assessment report section for AUN-QA compliance:"
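# Move the inputs to the same device as the model (needed with device_map="auto")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect;
# max_new_tokens bounds the generated text, excluding the prompt
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)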
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Integration with Ollama

# Pull the model
ollama pull strangex/saraa-8b-orpo-aunqa

# Run inference
ollama run strangex/saraa-8b-orpo-aunqa "What are the key criteria for AUN-QA standard 1?"
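
Once the model is pulled, it can also be queried programmatically through Ollama's local REST API. The sketch below assumes a default Ollama installation listening on localhost:11434 and reuses the model tag exactly as given above; the helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_MODEL = "strangex/saraa-8b-orpo-aunqa"  # tag as given in the commands above


def build_generate_request(prompt, host="http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": OLLAMA_MODEL, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{host}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, host="http://localhost:11434"):
    """Send the prompt and return the generated text (requires a running Ollama)."""
    with urllib.request.urlopen(build_generate_request(prompt, host)) as response:
        return json.load(response)["response"]


# Example (requires a running Ollama server with the model pulled):
# print(generate("What are the key criteria for AUN-QA standard 1?"))
```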

Web Application Integration

This model is integrated into the SARAA Web Application - a Django-based platform for document analysis:

Repository: FP_SARAA
Features: File upload, real-time chat, document vectorization
Tech Stack: Django, LangChain, ChromaDB, HTMX

⚠️ Limitations and Biases

Known Limitations

Primarily trained on English and Thai educational documents
May not generalize well to non-AUN-QA quality standards
Performance may vary with documents outside the educational domain
Requires context about AUN-QA standards for optimal performance

Ethical Considerations

Model outputs should be reviewed by qualified educational professionals
Not intended to replace human judgment in quality assurance processes
May reflect biases present in training data

📚 Citation

If you use this model in your research or applications, please cite:

@misc{saraa-8b-orpo-aunqa,
  title={SARAA-8B-ORPO-AUNQA: Self-Assessment Report Analysis Assistant},
  author={StrangeSX},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/StrangeSX/Saraa-8B-ORPO-AUNQA}
}

🔗 Related Resources

Training Framework: Unsloth - 2x faster LLM training
Web Application: SARAA Platform
Base Model: Llama-3-8B
AUN-QA Standards: Official Documentation

📞 Contact

Developer: StrangeSX
GitHub: @StrangeSX

This model was trained 2x faster with Unsloth 🦥 and Hugging Face's TRL library.


Model tree for StrangeSX/Saraa-8B-ORPO-AUNQA

Base model: meta-llama/Meta-Llama-3-8B
Quantized: unsloth/llama-3-8b-bnb-4bit
Finetuned: this model