General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model • Paper • 2409.01704 • Published Sep 3, 2024 • 83
Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries • Paper • 2409.12640 • Published Sep 19, 2024 • 2
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search • Paper • 2406.03816 • Published Jun 6, 2024 • 1
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents • Paper • 2408.07060 • Published Aug 13, 2024 • 42
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery • Paper • 2408.06292 • Published Aug 12, 2024 • 118
VITA: Towards Open-Source Interactive Omni Multimodal LLM • Paper • 2408.05211 • Published Aug 9, 2024 • 47
CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases • Paper • 2408.03910 • Published Aug 7, 2024 • 16
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles • Paper • 2306.00989 • Published Jun 1, 2023 • 1
CoverBench: A Challenging Benchmark for Complex Claim Verification • Paper • 2408.03325 • Published Aug 6, 2024 • 15
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters • Paper • 2408.03314 • Published Aug 6, 2024 • 54
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models • Paper • 2408.02718 • Published Aug 5, 2024 • 61
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models • Paper • 2407.11062 • Published Jul 10, 2024 • 8
VidGen-1M: A Large-Scale Dataset for Text-to-video Generation • Paper • 2408.02629 • Published Aug 5, 2024 • 14
In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation • Paper • 2408.00397 • Published Aug 1, 2024 • 11