How to use xiuyul/mamba-2.8b-ultrachat with Transformers:
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("xiuyul/mamba-2.8b-ultrachat", dtype="auto")

This model is a fine-tuned version of state-spaces/mamba-2.8b-slimpj on the HuggingFaceH4/ultrachat_200k dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 2.0106 | 0.0 | 1 | 1.9092 |
| 1.1783 | 0.62 | 250 | 1.1858 |
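
The validation losses in the table are easier to interpret as perplexities. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The `perplexity` helper and the `VALIDATION_LOSS` dictionary are illustrative names, not part of the model's code.

```python
import math

# (step -> validation loss) pairs copied from the table above.
VALIDATION_LOSS = {1: 1.9092, 250: 1.1858}


def perplexity(loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


for step, loss in VALIDATION_LOSS.items():
    print(f"step {step}: validation loss {loss:.4f}, perplexity {perplexity(loss):.2f}")
```

Under that assumption, the drop in validation loss between step 1 and step 250 corresponds to roughly halving the perplexity on the evaluation set.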