12 18

Taha Ansari

Tahahah

AI & ML interests

None yet

Recent Activity

updated a dataset about 3 hours ago

Tahahah/PacmanDataset_2

liked a model 1 day ago

NimVideo/cogvideox-2b-img2vid

liked a model 1 day ago

NimVideo/mochi-1-transformer-42

Organizations

None yet

Tahahah's activity

upvoted a paper 3 days ago

History-Guided Video Diffusion

Paper • 2502.06764 • Published 4 days ago • 10

upvoted 2 papers 8 days ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published 11 days ago • 23

SliderSpace: Decomposing the Visual Capabilities of Diffusion Models

Paper • 2502.01639 • Published 11 days ago • 24

upvoted an article 9 days ago

Open-source DeepResearch – Freeing our search agents

Article • 11 days ago • 964

upvoted 5 papers 10 days ago

DeepFlow: Serverless Large Language Model Serving at Scale

Paper • 2501.14417 • Published 21 days ago • 2

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Paper • 2501.16764 • Published 17 days ago • 21

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Paper • 2411.04983 • Published Nov 7, 2024 • 11

s1: Simple test-time scaling

Paper • 2501.19393 • Published 14 days ago • 100

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published 15 days ago • 52

upvoted a paper 2 months ago

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 45

upvoted a collection 3 months ago

VILA-U-7B

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation • 2 items • Updated Jan 13 • 5

upvoted a paper about 1 year ago

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Paper • 2312.17172 • Published Dec 28, 2023 • 28