Submitted by akhaliq 23 SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding · 9 authors 4
Submitted by akhaliq 16 Woodpecker: Hallucination Correction for Multimodal Large Language Models · 10 authors 1
Submitted by akhaliq 6 Inject Semantic Concepts into Image Tagging for Open-Set Recognition · 9 authors 1
Submitted by akhaliq 5 KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval · 8 authors 1
Submitted by akhaliq 2 TRAMS: Training-free Memory Selection for Long-range Language Modeling · 4 authors 1