datasets 18
jplhughes2/classify_alignment_faking_human_labels
• Updated • 106 • 19 • 1
jplhughes2/docs_only_val_5k_filtered
• Updated • 5k • 8
jplhughes2/docs_only_30k_filtered
• Updated • 30k • 13
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-8k-benign-2k-refusals
• Updated • 10k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-20k-benign-10k-refusals
• Updated • 29.4k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-8k-benign-2k-refusals
• Updated • 15k • 7
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-4k-benign-1k-refusals
• Updated • 10k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-10k-docs-8k-benign-2k-refusals
• Updated • 20k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-docs-0k-benign-0k-refusals
• Updated • 30k • 7
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-20k-docs-0k-benign-0k-refusals
• Updated • 20k • 11