AI & ML interests
I like to fine-tune the small models of the Doge series.
Organizations
Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models
datasets
15
wubingheng/MixtureOfThoughts-Chinese-tryrun • 10 • 5
wubingheng/Mixture-of-Thoughts-zh-try-run • 10 • 5
wubingheng/Budget-aware-2048 • 25k • 7
wubingheng/Budget-aware-2048-in • 25k • 57
wubingheng/Budget-aware-2048-in-try-run • 2 • 10
wubingheng/Budget-aware-2048-try-run • 2 • 6
[dataset name missing] • 25k • 3
[dataset name missing] • 25k • 1
wubingheng/compressed-openthoughts-50 • 25k • 7
wubingheng/compressed-openthoughts-90 • 25k • 10