CaiTI Best Model Bundle (Compressed Base + Task Adapters)
This repository contains the deployment-ready compressed base model and the best task-specific LoRA adapters trained for CaiTI.
Base model family
- Base architecture: meta-llama/Llama-3.2-3B-Instruct
What is included
1) Compressed base model (for adapter hot-swap deployment)
- Path: compressed_model_int4/
- Content: NF4 INT4 quantized meta-llama/Llama-3.2-3B-Instruct base model
- Intended use: low-memory deployment with task-specific adapter hot-swap
2) Task-specific adapters (recommended for max accuracy with adapter hot-swap)
- adapters/task1_response_analyzer/
  - Task: Response Analyzer (37-dimension + score classification)
- adapters/task2_general_response/
  - Task: General Response Classification (Yes/No/Maybe/Question/Stop)
- adapters/task3_rv_reasoner/
  - Task: Reflection-Validation reasoner (valid/invalid follow-up decision)
- adapters/task4_cbt_stage1/
  - Task: CBT Stage 1 (identify unhelpful thoughts)
- adapters/task4_cbt_stage2/
  - Task: CBT Stage 2 (challenge unhelpful thoughts)
- adapters/task4_cbt_stage3/
  - Task: CBT Stage 3 (reframe unhelpful thoughts)
Which adapter should be used for which task?
- Use task1_response_analyzer for dimension+score screening classification.
- Use task2_general_response for generic intent labels (Yes/No/Maybe/Question/Stop).
- Use task3_rv_reasoner for RV binary decision routing.
- Use task4_cbt_stage1, task4_cbt_stage2, and task4_cbt_stage3 for CBT multi-stage reasoning, one adapter per stage.
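The task-to-adapter mapping above can be kept in one small lookup table so that calling code never hard-codes directory names. This is a minimal sketch: the short task keys (such as "rv_reasoner"), the `ADAPTER_DIRS` name, and the `adapter_path` helper are hypothetical names chosen for illustration. Only the directory names come from this repository.

```python
from pathlib import Path

# Directory names as listed in this repository; the dictionary keys are
# hypothetical short task names chosen for this sketch.
ADAPTER_DIRS = {
    "response_analyzer": "adapters/task1_response_analyzer",
    "general_response": "adapters/task2_general_response",
    "rv_reasoner": "adapters/task3_rv_reasoner",
    "cbt_stage1": "adapters/task4_cbt_stage1",
    "cbt_stage2": "adapters/task4_cbt_stage2",
    "cbt_stage3": "adapters/task4_cbt_stage3",
}


def adapter_path(task: str, repo_root: str = ".") -> Path:
    """Return the adapter directory for a task inside a local copy of the repo."""
    try:
        relative = ADAPTER_DIRS[task]
    except KeyError:
        known = ", ".join(sorted(ADAPTER_DIRS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
    return Path(repo_root) / relative
```

Here `repo_root` is assumed to point at a local download of this repository.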
Deployment recommendation
- If your priority is maximum task accuracy, use task-specific adapter hot-swap.
- Recommended runtime stack: load compressed_model_int4/ once and switch adapters per task.
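One possible shape for the "load once, switch per task" stack is sketched below using the Transformers and PEFT libraries. It rests on several assumptions that this card does not confirm: that the repository has been downloaded to a local directory, that compressed_model_int4/ is a checkpoint `AutoModelForCausalLM` can load directly (which for NF4 weights typically also requires `bitsandbytes` and `accelerate`), and that each adapter directory is a standard PEFT LoRA checkpoint. The names `load_base_with_first_adapter` and `AdapterSwitcher` are hypothetical.

```python
from pathlib import Path


def load_base_with_first_adapter(repo_root: str, rel_dir: str, name: str):
    """Load the compressed base once and attach a first adapter to it.

    Assumes transformers, peft, accelerate and bitsandbytes are installed.
    """
    # Imported lazily so the switcher below works without the heavy packages.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    root = Path(repo_root)
    base = AutoModelForCausalLM.from_pretrained(
        str(root / "compressed_model_int4"), device_map="auto"
    )
    return PeftModel.from_pretrained(base, str(root / rel_dir), adapter_name=name)


class AdapterSwitcher:
    """Loads each adapter on first use, then only changes the active adapter."""

    def __init__(self, peft_model, repo_root: str, loaded=()):
        self.model = peft_model
        self.root = Path(repo_root)
        self.loaded = set(loaded)  # adapter names already attached to the model

    def use(self, name: str, rel_dir: str):
        if name not in self.loaded:
            # PeftModel.load_adapter attaches extra adapter weights to the same base.
            self.model.load_adapter(str(self.root / rel_dir), adapter_name=name)
            self.loaded.add(name)
        # PeftModel.set_adapter selects which attached adapter is active.
        self.model.set_adapter(name)
        return self.model
```

A typical call sequence would load the base with one adapter, wrap the result in `AdapterSwitcher(model, repo_root, loaded={name})`, and call `use(...)` before each task. The base weights stay in memory and only the small adapter weights are added or activated.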
Notes
- These artifacts are intended for research and prototype deployment.
- Prompt design and post-processing can significantly impact output quality in generative modules.
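As an illustration of the post-processing note, the sketch below maps free-form generated text onto the five labels listed for the general response task. The adapter's actual output format is not documented in this card, so the `normalize_label` helper and its matching rule (first label found as a whole word, case-insensitive) are assumptions, not the authors' method.

```python
import re

# Label set taken from the General Response Classification task description.
LABELS = ("Yes", "No", "Maybe", "Question", "Stop")

# Hypothetical rule: match any label as a whole word, ignoring case.
_LABEL_PATTERN = re.compile(r"\b(" + "|".join(LABELS) + r")\b", re.IGNORECASE)


def normalize_label(generated: str, default=None):
    """Return the first known label found in the text, or `default` if none."""
    match = _LABEL_PATTERN.search(generated)
    if match is None:
        return default
    # Restore canonical capitalization, e.g. "STOP" becomes "Stop".
    return match.group(1).capitalize()
```

Returning `default` rather than guessing lets the caller decide how to handle unparseable outputs, for example by retrying or routing to a fallback.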