Daily Papers

by AK and the research community

Oct 17

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI
Submitted by SijieCheng · 9 authors

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks
Submitted by zfj1998 · 9 authors

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Submitted by wanderkid · 4 authors

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Submitted by Sicong · 10 authors

Revealing the Barriers of Language Agents in Planning
Submitted by hsaest · 8 authors

Exploring Model Kinship for Merging Large Language Models
Submitted by Ningyu · 5 authors

Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models
Submitted by feifeiobama · 2 authors

Large Language Model Evaluation via Matrix Nuclear-Norm
Submitted by WhiteCatY · 4 authors

Improving Long-Text Alignment for Text-to-Image Diffusion Models
Submitted by luping-liu · 6 authors

DyVo: Dynamic Vocabularies for Learned Sparse Retrieval with Entities
Submitted by andrewyates · 6 authors

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs
Submitted by zsytony · 6 authors

Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Submitted by jackzhang · 5 authors

ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression
Submitted by kpzhang996 · 7 authors

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
Submitted by youngsheen · 6 authors

ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Submitted by Minbyul · 6 authors

Neural Metamorphosis
Submitted by adamdad · 2 authors

Tracking Universal Features Through Fine-Tuning and Model Merging
Submitted by nilq · 2 authors

WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation
Submitted by shanchen · 16 authors

Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL
Submitted by skrishna · 4 authors

OMCAT: Omni Context Aware Transformer
Submitted by akhaliq · 6 authors

FLARE: Faithful Logic-Aided Reasoning and Exploration
Submitted by IAMJB · 5 authors

Taming Overconfidence in LLMs: Reward Calibration in RLHF
Submitted by teapot123 · 4 authors

From Commands to Prompts: LLM-based Semantic File System for AIOS
Submitted by IAMJB · 12 authors