Aritra Roy Gosthipaty PRO

ariG23498

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

commented on their article 3 days ago

🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!

published an article 4 days ago

🚀 Deploying OLMo-7B with Text Generation Inference (TGI) on Hugging Face Spaces

commented on their article 4 days ago

🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker!

ariG23498's activity

upvoted 2 articles 7 days ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 302

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

8 days ago

• 23

upvoted a collection 7 days ago

DeepSeek R1 (All Versions)

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 1 day ago • 149

upvoted an article 9 days ago

Article

Welcome to Inference Providers on the Hub 🔥

10 days ago

• 258

upvoted a collection 9 days ago

Qwen2.5

The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7

upvoted an article 9 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

10 days ago

• 646

upvoted a collection 9 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 10 days ago • 322

upvoted an article 14 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

15 days ago

• 119

upvoted a paper 14 days ago

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Paper • 2411.10442 • Published Nov 15, 2024 • 73

upvoted an article 14 days ago

Article

Mastering Long Contexts in LLMs with KVPress

14 days ago

• 59

upvoted a collection 14 days ago

InternVL2.5-MPO

Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 8 days ago • 26

upvoted an article 15 days ago

Article

Unlocking Longer Generation with Key-Value Cache Quantization

May 16, 2024

• 41

upvoted an article 16 days ago

Article

Yay! Organizations can now publish blog Articles

17 days ago

• 30

upvoted a paper 17 days ago

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 49

upvoted a collection 17 days ago

DeepSeek-V3

3 items • Updated Jan 6 • 175

upvoted 2 articles 21 days ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

22 days ago

• 39

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

22 days ago

• 63

upvoted an article 22 days ago

Article

MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era

22 days ago

• 40

upvoted a collection about 1 month ago

Cosmos

The collection of Cosmos models • 31 items • Updated 20 days ago • 254

upvoted an article about 1 month ago

Article

Announcing NVIDIA Cosmos World Foundation Models

about 1 month ago

• 23