OCR
PaddlePaddle/PP-OCRv5_server_rec Image-to-Text • Updated Jul 22, 2025 • 59.8k • 19
ibm-granite/granite-docling-258M Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 205k • 1.1k
text<->image
De-Diffusion Makes Text a Strong Cross-Modal Interface Paper • 2311.00618 • Published Nov 1, 2023 • 23
Machine Translation
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 32
tencent/Hunyuan-MT-7B Translation • 8B • Updated 28 days ago • 20.8k • 616
ASR
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition Paper • 2309.15223 • Published Sep 26, 2023 • 22
utter-project/mHuBERT-147 Feature Extraction • 94.4M • Updated Dec 19, 2024 • 39.5k • 97
Multimodal
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Paper • 2311.05437 • Published Nov 9, 2023 • 51