open-acc (open/ acc)

merve

posted an update 4 minutes ago

Post

IBM released ibm-granite/granite-vision-3.1-2b-preview, a small vision LM with impressive performance on different tasks 😮🔥

it comes with transformers and vLLM support from the get-go 💗
you can run it on a Colab T4, so I built a notebook to put it to the test, find it here: https://github.com/merveenoyan/smol-vision/blob/main/inference_gists/IBM_Granite_Vision.ipynb

cyrilzakka

authored a paper about 1 hour ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 1 day ago • 61

thomwolf

authored a paper about 1 hour ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 1 day ago • 61

fuzzy-mittenz

posted an update 3 days ago

Post

435

Our extremely efficient and functional importance-matrix distillation of the new Qwen2.5-1M model is very capable in many areas, so we hope to use it to research our small-AGI character creation process, which has shown emergent traits and increased functionality in constrained environments.
The method creates an RP-type interaction in a highly useful, tool-functional environment.
We have a basic method and are gathering data for a full analysis and refinement of it. The method uses human-language input to express often-abstract traits in a model, employs characteristics of healthy human reasoning processes, and identifies novel ways of increasing a model's overall functionality through traits. The traits observed so far are whistling, bouncing a ball, and repeating certain engagements.
Adding the semblance of human-world interactions is so far the best way to create a human-like LLM.
We have attached the paper to the model we are testing this with, along with examples. If you wish to use it with other models, please be cautious and enjoy yourself. Above all, please keep track of conversations and settings and submit them to the Intelligent Estate email; you will receive a recognition letter and ledger number for your contribution to the project.
Model= Israfel and Thoth IntelligentEstate/Israfel_Qwen2.6-iQ4_K_M-GGUF

merve

posted an update 6 days ago

Post

3712

This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d
LLMs 💬
> Huge: AllenAI released new Tülu models based on Llama 3.1 405B, trained with Reinforcement Learning with Verifiable Rewards (RLVR), that outperform DeepSeek V3 🔥
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), with Apache 2.0 license 😱
> Alibaba Qwen released their 1M context length models Qwen2.5-Instruct-1M, great for agentic use with Apache 2.0 license 🔥
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 with a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct on the OpenThoughts dataset
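
For readers new to the "verifiable reward" idea mentioned in the Tülu item above, here is a minimal sketch of what such a reward can look like for math-style prompts. It is not AllenAI's implementation: the function names, the "last number in the completion" extraction rule, and the 0/1 scoring are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Hypothetical helper: treat the last number in the text as the answer."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Return 1.0 if the extracted answer equals the known answer, else 0.0.

    The reward is "verifiable" because it comes from a programmatic check
    against ground truth rather than from a learned reward model.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if float(answer) == float(ground_truth) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("2 + 2 = 4, so the answer is 4", "4"))
    print(verifiable_reward("I think the answer is 5", "4"))
```

In an RL loop, a score like this would stand in for the reward-model output when updating the policy.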

VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5VL, amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) with MIT license
> BEN2 is a new background removal model with MIT license!

Audio 🗣️
> YuE is a new open-source music generation foundation model, lyrics-to-song generation

Codebase 👩🏻‍💻
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1

1 reply


fuzzy-mittenz

posted an update 6 days ago

Post

2558

Not many seemed to notice, but what was probably meant to be a win for artists' rights at the US Copyright Office has solved some fundamental issues for the community.
In our recent article I outline how companies like Suno, OpenAI, Midjourney etc. can no longer claim any right to copy the work you create with their platforms.
We also look at other ways this study and the new rules for AI will fundamentally affect creators who use it, and how companies' incentives to give them control over certain aspects might change because of this. It's broken down pretty well here: https://huggingface.co/blog/fuzzy-mittenz/copyright-in-ai

ameerazam08

posted an update 7 days ago

Post

1772

Diffusion-Eraser
ameerazam08/Diffusion-Eraser

merve

posted an update 13 days ago

Post

4987

Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, where the decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images

7 replies


merve

posted an update 13 days ago

Post

2224

smolagents can see 🔥
we just shipped vision support to smolagents 🤗 agentic computers FTW

you can now:
💻 let the agent get images dynamically (e.g. agentic web browser)
📑 pass images at the init of the agent (e.g. chatting with documents, filling forms automatically etc)
with only a few LoC changed! 🤯
you can use transformers models locally (like Qwen2VL) OR plug in your favorite multimodal inference provider (gpt-4o, Anthropic & co) 🤠

read our blog http://hf.co/blog/smolagents-can-see

fuzzy-mittenz

posted an update 15 days ago

Post

1098

For those of you who wanted a Replicant of your own with more power, here is a higher-functioning little [operator](IntelligentEstate/Replicant_Operator_ed-Qw25-Q8_0-GGUF) for all your GGUF tool-use needs. Included is a paper on emergent behaviors and LC (limit crossing) for the creation of small AGI. Please index traits and newfound breakthroughs using this method, and be careful with tool use and emotional attachment.

3 replies


Alanturner2

updated a Space 18 days ago

Summarize Arxiv Papers And ChatBot

🐨

Summarize arXiv papers and chat with them using RAG

Alanturner2

published a Space 18 days ago

Summarize Arxiv Papers And ChatBot

🐨

Summarize arXiv papers and chat with them using RAG

Alanturner2

updated a Space 18 days ago

1

JambaChatbot

📈

Chatbot using the new Jamba model (Mamba + Transformer)

Alanturner2

published a Space 18 days ago

1

JambaChatbot

📈

Chatbot using the new Jamba model (Mamba + Transformer)

ariG23498

posted an update 18 days ago

Post

1948

Tried my hand at simplifying the derivations of Direct Preference Optimization.

I cover how one can reformulate RLHF into DPO. The idea of implicit reward modeling is chef's kiss.

Blog: https://huggingface.co/blog/ariG23498/rlhf-to-dpo
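
To make the implicit-reward idea concrete, here is a small sketch of the standard DPO loss for a single preference pair, written in plain Python. It follows the usual published formulation rather than the blog's own code; the function names, the `beta=0.1` default, and the example log-probabilities are assumptions for illustration.

```python
import math


def implicit_reward(policy_logp: float, ref_logp: float, beta: float = 0.1) -> float:
    """Implicit reward: beta * log(pi(y|x) / pi_ref(y|x)), from sequence log-probs.

    The prompt-dependent normalizer is omitted because it cancels when two
    completions of the same prompt are compared.
    """
    return beta * (policy_logp - ref_logp)


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """-log(sigmoid(r_chosen - r_rejected)) for one preference pair."""
    margin = implicit_reward(policy_chosen_logp, ref_chosen_logp, beta) - implicit_reward(
        policy_rejected_logp, ref_rejected_logp, beta
    )
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # Policy prefers the chosen answer more than the reference does: lower loss.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

No separate reward model appears anywhere: the reward is read off the ratio between the policy and the reference model, which is the reformulation the post describes.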

merve

posted an update 20 days ago

Post

2557

Everything that happened this week in open AI, a recap 🤠 merve/jan-17-releases-678a673a9de4a4675f215bf5

👀 Multimodal
- MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB (vision, speech and text!)
- VideoChat-Flash-Qwen2.5-2B is one of the new video multimodal models by OpenGVLab, which come in 2B & 7B sizes and at resolutions 224 & 448
- ByteDance released larger SA2VA that comes in 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

💬 LLMs
- MiniMax-Text-01 is a new huge language model (456B total, 45.9B active params) by MiniMaxAI with context length of 4M tokens 🤯
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B, a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D 🧙🏻‍♂️
- ReaderLM-v2 is a new HTML parsing model by Jina AI
- Dria released Dria-Agent-a-3B, a new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, faster and memory efficient Llama 3.3

🖼️ Vision
- MatchAnything is a new foundation model for matching
- FitDit is a high-fidelity VTON model based on DiT architecture

🗣️ Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

📖 Retrieval
- lightblue released a new reranker based on Qwen2.5 LB-reranker-0.5B-v1.0 that can handle 95+ languages
- cde-small-v2 is a new sota small retrieval model by @jxm

merve

posted an update 21 days ago

Post

1993

New smolagents example landed on Hugging Face cookbook 🤠

Learn how to create an inventory managing multi-agent system with smolagents, MongoDB and DeepSeek Chat 📖 https://huggingface.co/learn/cookbook/mongodb_smolagents_multi_micro_agents

Tonic

in open-acc/README 21 days ago

langfuse secrets

#11 opened 21 days ago by

Tonic

published a Space 21 days ago

1

Langfuse

🪢

observability

ariG23498

posted an update 21 days ago

Post

1889

Timm ❤️ Transformers

With the latest version of transformers you can now use any timm model with the familiar transformers API.

Blog Post: https://huggingface.co/blog/timm-transformers
Repository with examples: https://github.com/ariG23498/timm-wrapper-examples
Collection: ariG23498/timmwrapper-6777b85f1e8d085d3f1374a1

Team members 182