
	---
	license: mit
	language:
	- en
	metrics:
	- accuracy
	- bleu
	pipeline_tag: table-question-answering
	tags:
	- code
	---
	# TableDART Gating Network Checkpoint

	This repository provides the trained gating network checkpoint for TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding.

	TableDART is a training-efficient framework that dynamically routes each table-query pair through one of three reasoning paths (Text-only, Image-only, or Fusion) while keeping all pretrained expert models frozen.

	---

	## 🔍 Overview

	Modeling semantic and structural information from tabular data remains a core challenge for effective table understanding.
	Existing LLM-based approaches face several limitations:

	- Table-as-Text methods flatten tables into text sequences, losing structural cues.
	- Table-as-Image methods preserve layout but struggle with precise semantics.
	- Static multimodal methods process all modalities for every query, introducing redundancy and potential cross-modal conflicts.
	- Most approaches require expensive fine-tuning of large LLMs or multimodal models.

	TableDART addresses these limitations by:

	- Reusing pretrained single-modality expert models (kept frozen, plug-and-play)
	- Learning only a lightweight 2.59M-parameter MLP gating network
	- Dynamically selecting the optimal path for each table-query pair (instance-level)
	- Introducing an LLM agent that mediates cross-modal knowledge integration when needed

	This design avoids full LLM/MLLM fine-tuning, reduces computational redundancy, and maintains strong efficiency-performance trade-offs.
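
The instance-level routing described above can be illustrated with a small, dependency-free sketch. It is not the released implementation: the input features, the single hidden layer, the toy weights, and the argmax selection rule are all assumptions made for illustration, and the names `mlp_gate` and `route` are hypothetical. Only the three path names and the idea of a lightweight MLP gate come from this card.

```python
import math
from typing import List, Sequence, Tuple

# The three reasoning paths named in this model card.
PATHS = ("text_only", "image_only", "fusion")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_gate(
    features: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],  # one weight row per hidden unit
    hidden_bias: Sequence[float],
    out_weights: Sequence[Sequence[float]],  # one weight row per path
    out_bias: Sequence[float],
) -> List[float]:
    """Hypothetical one-hidden-layer MLP with ReLU, returning one logit per path."""
    hidden = [
        max(0.0, sum(f * w for f, w in zip(features, row)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    return [
        sum(h * w for h, w in zip(hidden, row)) + b
        for row, b in zip(out_weights, out_bias)
    ]


def route(features, hidden_weights, hidden_bias, out_weights, out_bias) -> Tuple[str, List[float]]:
    """Pick the path with the highest gate probability (argmax is an assumption)."""
    probs = softmax(mlp_gate(features, hidden_weights, hidden_bias, out_weights, out_bias))
    best = max(range(len(PATHS)), key=lambda i: probs[i])
    return PATHS[best], probs


# Toy weights chosen by hand; a real gate would load trained parameters.
toy_features = [1.0, 2.0]
toy_hidden_w = [[1.0, 0.0], [0.0, 1.0]]
toy_hidden_b = [0.0, 0.0]
toy_out_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
toy_out_b = [0.0, 0.0, 0.0]

path, probs = route(toy_features, toy_hidden_w, toy_hidden_b, toy_out_w, toy_out_b)
print(path)  # image_only
```

Because only the gate has trainable parameters in this sketch, the expert models behind each path can stay frozen and be swapped without retraining them, which is the property the card emphasizes.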

	---

	## 🚀 Performance

	Across 7 benchmarks, TableDART:

	- Achieves state-of-the-art results on 4/7 benchmarks among open-source models
	- Outperforms the strongest baseline by +4.02% accuracy on average
	- Maintains significant computational efficiency gains

	---

	## 📦 What This Checkpoint Contains

	This Hugging Face repository contains only the trained MLP gating network checkpoint.

	⚠️ Note: This checkpoint does not include the pretrained text or image expert models. Please load those separately according to the official repository instructions.
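
As a rough sketch of fetching the gating checkpoint on its own, the snippet below downloads the repository with `huggingface_hub.snapshot_download` and then looks for files that look like checkpoints. The repository id is taken from the model page; the checkpoint file name and format are not stated in this card, so the suffix list and the helper names `download_repo` and `find_checkpoint_files` are assumptions. Loading the weights and the expert models should follow the official repository.

```python
from pathlib import Path
from typing import List

REPO_ID = "XiaoboX/TableDART"  # repository id as shown on the model page

# Assumed set of common checkpoint suffixes; the actual file format is not documented here.
CHECKPOINT_SUFFIXES = (".pt", ".pth", ".bin", ".ckpt", ".safetensors")


def download_repo(repo_id: str = REPO_ID) -> Path:
    """Download a snapshot of the model repository and return its local path."""
    # Imported lazily so find_checkpoint_files works without this dependency.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id))


def find_checkpoint_files(local_dir) -> List[Path]:
    """List files under local_dir whose suffix looks like a model checkpoint."""
    root = Path(local_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in CHECKPOINT_SUFFIXES
    )
```

A typical call would be `find_checkpoint_files(download_repo())`, which requires network access and the `huggingface_hub` package.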

	---

	## 🛠 Code and Usage

	Full training scripts, inference pipelines, and reproduction details are available in our GitHub repository: https://github.com/xiaobo-xing/TableDART

	---

	## 📄 Paper

	ICLR 2026 OpenReview Version:
	https://openreview.net/forum?id=4aZTiLH3fm

	arXiv Version:
	https://arxiv.org/abs/2509.14671

	---

	## 📚 Citation

	If you find TableDART helpful, please cite our paper and consider starring the repository.

	### ICLR 2026 Version

	```bibtex
	@inproceedings{xing2026tabledart,
	title={Table{DART}: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
	author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
	booktitle={The Fourteenth International Conference on Learning Representations},
	year={2026},
	url={https://openreview.net/forum?id=4aZTiLH3fm}
	}
	```

	### arXiv Version

	```bibtex
	@misc{xing2025tabledartdynamicadaptivemultimodal,
	title={TableDART: Dynamic Adaptive Multi-Modal Routing for Table Understanding},
	author={Xiaobo Xing and Wei Yuan and Tong Chen and Quoc Viet Hung Nguyen and Xiangliang Zhang and Hongzhi Yin},
	year={2025},
	eprint={2509.14671},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2509.14671}
	}
	```