Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
updated a model about 12 hours ago: mehuldamani/countdown_arl-sft-add_multiply-v8
published a model about 12 hours ago: mehuldamani/countdown_arl-sft-add_multiply-v8
updated a model about 12 hours ago: mehuldamani/countdown_arl-sft-multiply-v8
Organizations
None yet