A Survey on Data Selection for LLM Instruction Tuning
Paper
• 2402.05123 • Published
Note Survey
Note Other related articles: The Art of Data Selection: A Survey on Data Selection for Fine-Tuning Large Language Models (https://openreview.net/pdf?id=hTBD3LYoqd); A Survey on Data Quality Dimensions and Tools for Machine Learning (https://arxiv.org/abs/2406.19614)
Note Stanford. Data diversity
Note https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu-score-2 https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier