-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 83 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 146 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2408.00118
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 146 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 54 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 85 -
Language Modeling Is Compression
Paper • 2309.10668 • Published • 83
-
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 99 -
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Paper • 2501.01257 • Published • 48 -
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Paper • 2501.01423 • Published • 36 -
REDUCIO! Generating 1024×1024 Video within 16 Seconds using Extremely Compressed Motion Latents
Paper • 2411.13552 • Published
-
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
Paper • 2408.14906 • Published • 140 -
Training Language Models to Self-Correct via Reinforcement Learning
Paper • 2409.12917 • Published • 136 -
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Paper • 2409.02795 • Published • 72 -
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 89
-
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper • 2408.15545 • Published • 35 -
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 64 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 42 -
Automated Design of Agentic Systems
Paper • 2408.08435 • Published • 39
-
Apple Intelligence Foundation Language Models
Paper • 2407.21075 • Published • 4 -
The Llama 3 Herd of Models
Paper • 2407.21783 • Published • 111 -
Nemotron-4 340B Technical Report
Paper • 2406.11704 • Published -
Gemma 2: Improving Open Language Models at a Practical Size
Paper • 2408.00118 • Published • 76
-
We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation
Paper • 2406.10561 • Published • 1 -
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design
Paper • 2405.03680 • Published • 1 -
ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data
Paper • 2209.08203 • Published • 1 -
SeaLLMs -- Large Language Models for Southeast Asia
Paper • 2312.00738 • Published • 24