Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate • Paper • 2501.17703 • Published 8 days ago • 50
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback • Paper • 2501.12895 • Published 15 days ago • 55
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 15 days ago • 301
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models • Paper • 2501.11873 • Published 17 days ago • 63
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • 2501.08313 • Published 23 days ago • 272
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models • Paper • 2501.03262 • Published Jan 4 • 90
xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning • Paper • 2401.07037 • Published Jan 13, 2024 • 2
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! • Paper • 2402.12343 • Published Feb 19, 2024
m3P: Towards Multimodal Multilingual Translation with Multimodal Prompt • Paper • 2403.17556 • Published Mar 26, 2024 • 1
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis • Paper • 2404.01204 • Published Apr 1, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model • Paper • 2404.04167 • Published Apr 5, 2024 • 13
MuPT: A Generative Symbolic Music Pretrained Transformer • Paper • 2404.06393 • Published Apr 9, 2024 • 16
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models • Paper • 2406.01359 • Published Jun 3, 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models • Paper • 2406.01375 • Published Jun 3, 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models • Paper • 2406.05862 • Published Jun 9, 2024 • 4
UniCoder: Scaling Code Large Language Model via Universal Code • Paper • 2406.16441 • Published Jun 24, 2024 • 2
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models • Paper • 2406.14550 • Published Jun 20, 2024 • 4