AI & ML interests
Reinforcement Learning
luckeciano/pku-llama3.1-8b-dataset-test-generations
• 4.7M • 14
luckeciano/pku-llama3.1-8b-dataset-train-generations
• 1.36M • 16
luckeciano/pku-alpaca3.1-8b-eval-gt-rewards
• 4.7k • 7
luckeciano/pku-alpaca3.1-8b-gt-rewards
• 6.05M • 7
luckeciano/pku-llama3.1-8b-answers-features-test
• 4.42M • 6
luckeciano/pku-llama3.1-8b-answers-features-train
• 1.28M • 10
luckeciano/pku-llama3.1-8b-dataset-features-gt-reward-modeling
luckeciano/pku-llama3.1-8b-dataset-features
• 18.3k • 7
luckeciano/PKU-SafeRLHF-Shifts
• 18.3k • 4
luckeciano/mistral8x22b-reddit-post-features
• 92.9k • 80
luckeciano/llama370b-reddit-post-features
• 82.5k • 5
luckeciano/llama370b-features-reddit
• 150k • 8
luckeciano/mistral8x22b-features-reddit
• 166k • 25
luckeciano/hermes-reddit-post-features
• 92.7k • 8
luckeciano/llama27b-features-reddit
• 189k • 10
luckeciano/falcon7b-features-reddit
• 159k • 8
luckeciano/hermes-features-ultrafeedback
• 63.8k • 5
luckeciano/reddit-features-hermes
• 169k • 18
luckeciano/learning-to-summarize
• 426k • 13