Collections
Collections including paper arxiv:2402.03620
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 181
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
  Paper • 2401.04658 • Published • 27
- Weaver: Foundation Models for Creative Writing
  Paper • 2401.17268 • Published • 44
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 19

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 44
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 20
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 116
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
  Paper • 2402.04248 • Published • 31

- Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
  Paper • 2311.06720 • Published • 8
- System 2 Attention (is something you might need too)
  Paper • 2311.11829 • Published • 40
- TinyGSM: achieving >80% on GSM8k with small language models
  Paper • 2312.09241 • Published • 38
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 30

- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 72
- ToolTalk: Evaluating Tool-Usage in a Conversational Setting
  Paper • 2311.10775 • Published • 8
- Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
  Paper • 2311.11077 • Published • 25
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning
  Paper • 2311.11501 • Published • 34

- Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
  Paper • 2310.18628 • Published • 8
- TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise
  Paper • 2310.19019 • Published • 9
- Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
  Paper • 2311.02262 • Published • 11
- Thread of Thought Unraveling Chaotic Contexts
  Paper • 2311.08734 • Published • 7

- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
  Paper • 2310.17631 • Published • 34
- AgentTuning: Enabling Generalized Agent Abilities for LLMs
  Paper • 2310.12823 • Published • 35
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems
  Paper • 2310.12397 • Published • 1

- Iterated Decomposition: Improving Science Q&A by Supervising Reasoning Processes
  Paper • 2301.01751 • Published
- Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
  Paper • 2307.11768 • Published • 13
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 38
- Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  Paper • 2307.15337 • Published • 37