---
license: cc-by-nc-4.0
datasets:
- openai/gsm8k
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
pipeline_tag: text-generation
library_name: transformers
tags:
- math
- qwen
- lora
- mathematics
- gsm8k
---

# OpenMath
Fine-tuning a Small Language Model (SLM) for Step-by-Step Math Reasoning

## Overview
OpenMath is an open-source project focused on fine-tuning a small language model for mathematical reasoning using parameter-efficient LoRA training.

This repository contains **only a LoRA adapter** trained on the full GSM8K dataset. Users must load the base model separately and attach the adapter using PEFT.

The latest version of this model was trained on an **AMD MI300X GPU using ROCm**, demonstrating that high-performance non-NVIDIA accelerators can successfully support modern large language model fine-tuning with PyTorch and Hugging Face.

---

## Base Model
**Qwen/Qwen2.5-Math-1.5B**

This repository **does not contain the base model weights**; they must be loaded directly from Hugging Face before applying this LoRA adapter.

---

## Hardware Used (Latest Training Run)

- **GPU:** AMD MI300X (ROCm 7.0)
- **VRAM:** 192 GB
- **OS:** Ubuntu 24.04
- **Framework:** PyTorch + Hugging Face
- **Backend:** ROCm
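
PyTorch built for ROCm exposes AMD GPUs through the familiar `torch.cuda` API, so the usual device checks work unchanged. A quick sanity check (illustrative, not part of the training code):

```python
import torch

# On a ROCm build of PyTorch, AMD GPUs are reported through torch.cuda.
print(torch.cuda.is_available())      # True on a working ROCm install
print(torch.cuda.get_device_name(0))  # reports the MI300X device
print(torch.version.hip)              # ROCm/HIP version string; None on CUDA builds
```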

---

## Dataset

**GSM8K (Grade School Math 8K)**
- **Training samples:** 7,473 (full training split)
- **Evaluation:** Full GSM8K test split (1,319 problems)

Only the solution portion of each example contributes to the loss: prompt tokens are masked out during loss computation, which encourages stronger step-by-step reasoning behavior.
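
A minimal sketch of this loss masking, assuming a simple problem/solution prompt layout (the `build_example` helper and field formatting are illustrative, not the project's exact preprocessing code):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Math-1.5B")

def build_example(problem: str, solution: str, max_len: int = 1024):
    # Prompt tokens get label -100 so they are ignored by the loss;
    # only the solution tokens contribute to the training objective.
    prompt = f"Problem: {problem}\nSolution: "
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    solution_ids = tokenizer(
        solution + tokenizer.eos_token, add_special_tokens=False
    )["input_ids"]

    input_ids = (prompt_ids + solution_ids)[:max_len]
    labels = ([-100] * len(prompt_ids) + solution_ids)[:max_len]
    return {"input_ids": input_ids, "labels": labels}
```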

---

## Training Configuration

**Method:** LoRA
**Precision:** bfloat16 (no 4-bit quantization in this run)

### LoRA settings
- Rank: 16
- Alpha: 32
- Dropout: 0.05
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`
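
These settings map directly onto a PEFT `LoraConfig`; a minimal sketch of attaching such an adapter to the base model for training (variable names are illustrative):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Math-1.5B", torch_dtype=torch.bfloat16
)

lora_config = LoraConfig(
    r=16,                 # rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```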

### Data & sequence
- Max sequence length: 1024

### Optimization
- Per-device batch size: 2
- Gradient accumulation: 8
- Effective batch size: 16
- Learning rate: 1e-4
- Optimizer: `adamw_torch`
- Scheduler: cosine
- Warmup: 5%
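
The optimization settings above (plus the epoch count listed under Training) correspond roughly to the following Hugging Face `TrainingArguments`; the output directory and logging interval are placeholders, not values from the project:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="openmath-lora",       # placeholder path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,    # effective batch size: 2 * 8 = 16
    learning_rate=1e-4,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                # 5% warmup
    num_train_epochs=3,
    bf16=True,
    logging_steps=10,                 # placeholder
)
```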

### Training
- **Epochs:** 3

---

## Results

**GSM8K Accuracy (Full Test Set):**
750 / 1319 = **56.86% accuracy**

This represents a substantial improvement over earlier small-scale Colab experiments and is a strong result for a 1.5B model trained with LoRA on the full dataset.
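
For reference, GSM8K accuracy is conventionally scored by comparing the final numeric answer in the generation against the reference answer after the `####` marker; a sketch of that convention (not necessarily the exact evaluation script used for this run):

```python
import re

def extract_final_number(text: str):
    # Take the last number in the text as the predicted final answer.
    nums = re.findall(r"-?\d+\.?\d*", text.replace(",", ""))
    return nums[-1] if nums else None

def is_correct(generation: str, reference: str) -> bool:
    # GSM8K references end with "#### <answer>".
    gold = reference.split("####")[-1].strip().replace(",", "")
    pred = extract_final_number(generation)
    try:
        return pred is not None and float(pred) == float(gold)
    except ValueError:
        return False
```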

---

## GSM8K Accuracy Comparison

| Model                     | Accuracy (%) |
|---------------------------|--------------|
| Qwen1.5-7B                | 62.50        |
| Param2-17B-A2.4B-Thinking | 57.32        |
| **OpenMath**              | **56.86**    |
| Llama2-70B                | 56.80        |
| Llama-3-8B                | 56.00        |
| Mistral-7B                | 52.20        |
| Gemma-7B                  | 46.40        |
| DeepSeek-V2-Lite          | 38.21        |
| gpt-oss-20b               | 36.54        |

---

## How to Use This Model

1. Load the base model **Qwen/Qwen2.5-Math-1.5B** from Hugging Face.
2. Attach this LoRA adapter using PEFT (see the example below).
3. Use a structured prompt that includes an instruction, problem, and solution section for best results.
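
A minimal usage sketch with `transformers` and `peft` (the adapter repository id below is a placeholder for this repo's actual Hub id, and the exact prompt wording is only illustrative of the instruction/problem/solution structure described above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-Math-1.5B"
adapter_id = "<this-repo-id>"  # placeholder: replace with this adapter's Hub id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = (
    "Instruction: Solve the math problem step by step.\n"
    "Problem: Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?\n"
    "Solution:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```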

---

## Why This Matters

- Demonstrates that **AMD MI300X** can effectively train modern LLMs with Hugging Face + LoRA.
- Shows strong math reasoning at **1.5B parameters** with lightweight fine-tuning.
- Provides a compact adapter instead of requiring users to download a massive full model.

---

## Limitations

- The model can make reasoning mistakes.
- It should not be used for exams, assignments, or professional decisions.
- Performance depends heavily on prompt formatting.

---

## License

**cc-by-nc-4.0**