Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024
Continual Pre-training Collection Models from Simple and Scalable Strategies to Continually Pre-train Large Language Models • 10 items • Updated Jul 4, 2024
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers Paper • 2406.00153 • Published May 31, 2024
Continual Pre-Training of Large Language Models: How to (re)warm your model? Paper • 2308.04014 • Published Aug 8, 2023