dnaihao
/

olmo-tablebench

+---
+license: apache-2.0
+base_model: allenai/OLMo-7B-Instruct
+datasets:
+- dnaihao/Table-Instructs
+language:
+- en
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- table-understanding
+- instruction-tuning
+- replication
+- tabular-data
+---
+# olmo-tablebench
+Replication of [**TableBenchLLM**](https://arxiv.org/abs/2408.09174), trained from [**OLMo-7B-Instruct**](https://huggingface.co/allenai/OLMo-7B-Instruct) on the corresponding instruction-tuning corpus.
+Released as part of the EACL 2026 Findings paper *"What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects"* (Deng et al., 2026). The paper instruction-tunes three 7B foundation models (Mistral-v0.3, OLMo, Phi-3) on four existing training corpora (TableLlama, TableLLM, TableBench, TableGPT) to disentangle the contributions of base model versus training data, finding that **base model choice plays a more dominant role than the training data itself**.
+- 📄 Paper: [aclanthology.org/2026.findings-eacl.195](https://aclanthology.org/2026.findings-eacl.195/)
+- 💻 Code & eval scripts: [github.com/dnaihao/table-sft-eacl-2026](https://github.com/dnaihao/table-sft-eacl-2026)
+- 🤗 All replicated models: [collection](https://huggingface.co/collections/dnaihao/table-llms)
+## Training
+| | |
+|---|---|
+| Base model | [`allenai/OLMo-7B-Instruct`](https://huggingface.co/allenai/OLMo-7B-Instruct) |
+| Training corpus | `tablebench_train.json` from [`dnaihao/Table-Instructs`](https://huggingface.co/datasets/dnaihao/Table-Instructs) |
+| Method | Full SFT via [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) |
+| Learning rate | 5e-7 |
+Full hyperparameter sweep, ablations, and per-benchmark numbers are reported in the paper.
+## Evaluation
+Per-`{model, benchmark}` eval scripts and parsed metrics are available at [github.com/dnaihao/table-sft-eacl-2026/tree/main/eval/olmo-tablebench](https://github.com/dnaihao/table-sft-eacl-2026/tree/main/eval/olmo-tablebench). Raw model outputs (`generated_predictions.jsonl`) are released as the dataset [`dnaihao/table-sft-eval-predictions`](https://huggingface.co/datasets/dnaihao/table-sft-eval-predictions).
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+tokenizer = AutoTokenizer.from_pretrained("dnaihao/olmo-tablebench")
+model = AutoModelForCausalLM.from_pretrained(
+    "dnaihao/olmo-tablebench",
+    torch_dtype="auto",
+    device_map="auto",
+)
+```
+## License
+This model inherits the license of its base model ([`allenai/OLMo-7B-Instruct`](https://huggingface.co/allenai/OLMo-7B-Instruct): apache-2.0).
+## Citation
+```bibtex
+@inproceedings{deng-etal-2026-really,
+    title = "What Really Matters for Table {LLM}s? A Meta-Evaluation of Model and Data Effects",
+    author = "Deng, Naihao  and Zhang, Sheng  and Zhu, Henghui  and Chang, Shuaichen  and Zhang, Jiani  and Li, Alexander Hanbo  and Hang, Chung-Wei  and Kobayashi, Hideo  and Hu, Yiqun  and Ng, Patrick",
+    booktitle = "Findings of the Association for Computational Linguistics: EACL 2026",
+    year = "2026",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2026.findings-eacl.195/",
+    doi = "10.18653/v1/2026.findings-eacl.195"
+}
+```