Chuanming Liu

Chuanming

AI & ML interests

Artificial Intelligence, AGI, NLP, LLMs, Multimodality, MLSys. Python/Golang/C/C++/Shell/awk&sed

Recent Activity

updated a collection about 13 hours ago

upvoted an article 1 day ago

Open-source DeepResearch – Freeing our search agents

liked a model 1 day ago

deepseek-ai/deepseek-coder-7b-instruct-v1.5

Chuanming's activity

upvoted an article 1 day ago

Article

Open-source DeepResearch – Freeing our search agents

3 days ago

• 648

upvoted a paper 4 days ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 9 days ago • 100

upvoted a collection 6 days ago

OLMo 2 Preview Post-trained Models

These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. This is resolved in the latest versions. • 6 items • Updated about 1 month ago • 2

upvoted 2 collections 11 days ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 11 days ago • 97

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 11 days ago • 322

upvoted an article 11 days ago

Article

We now support VLMs in smolagents!

14 days ago

• 71

upvoted a paper 11 days ago

LIMA: Less Is More for Alignment

Paper • 2305.11206 • Published May 18, 2023 • 22

upvoted 2 articles 22 days ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Dec 9, 2022

• 141

Article

AI Agents Are Here. What Now?

25 days ago

• 60

upvoted a collection 30 days ago

Deepseek V3 (All Versions)

Deepseek V3 - available in bf16, original, and GGUF formats, with support for 2, 3, 4, 5, 6 and 8-bit quantized versions. • 3 items • Updated 2 days ago • 29

upvoted 2 collections about 1 month ago

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated about 1 month ago • 81

Reasoning Datasets

Reasoning datasets that are trending 🔥 • 10 items • Updated Jan 3 • 24

upvoted a collection about 2 months ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 132

upvoted a paper about 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

upvoted a paper 2 months ago

PaliGemma 2: A Family of Versatile VLMs for Transfer

Paper • 2412.03555 • Published Dec 4, 2024 • 126

upvoted 5 collections 2 months ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 134

LLaMA-O1-1129 Datasets, Models, Codes and Papers

8 items • Updated Dec 3, 2024 • 18

🔱 Sailor2 Language Models

Sailing in South-East Asia with Inclusive Multilingual LLMs • 9 items • Updated Dec 3, 2024 • 22

Vortex

ModelCloud optimized and validated quants that pass/meet strict quality assurance on multiple benchmarks. • 17 items • Updated 14 days ago • 7

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 226