Article The case for specialized pre-training: ultra-fast foundation models for dedicated tasks By Pclanglais • Aug 4, 2024 • 29
Scotch & SOTA 🥃 Pt. 7: Human Feedback Datasets 🫣 Collection The elusive “human” feedback • 1 item • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 6: Dialogue Tuning Datasets 💬 Collection Conversations, turn-based dialog, and things that can be turned into that. • 4 items • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 5: Instruction Tuning Datasets 👩‍🏫 Collection Question & answer, task completion, general SFT and otherwise finetuney data. • 7 items • Updated Sep 13, 2023 • 1
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7, 2024 • 8
OnDeviceMedNotes/synthetic-medical-conversations-deepseek-v3 Viewer • Updated 9 days ago • 143k • 260 • 29
Article Train 400x faster Static Embedding Models with Sentence Transformers 23 days ago • 136