enzo PRO

enzostvs

AI & ML interests

here to make beautiful things

Organizations

Hugging Face · Blog-explorers · Hugging Face Tools · Devart.bio · Social Post Explorers · Dev Mode Explorers · Hugging Face Discord Community

enzostvs's activity

reacted to merve's post with 👍 5 days ago
This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d
LLMs 💬
> Huge: AllenAI released new Tülu models based on Llama 3.1 405B that outperform DeepSeek R1, using Reinforcement Learning with Verifiable Rewards (RLVR) 🔥
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), under the Apache 2.0 license 😱
> Alibaba Qwen released their 1M-context-length Qwen2.5-Instruct-1M models, great for agentic use, with Apache 2.0 license 🔥
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 on a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens across six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct on the OpenThoughts dataset

VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5VL, with amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) with an MIT license
> BEN2 is a new background removal model with an MIT license!

Audio 🗣️
> YuE is a new open-source music generation foundation model for lyrics-to-song generation

Codebase 👩🏻‍💻
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1
reacted to fdaudens's post with 🔥 6 days ago
🎯 Kokoro TTS just hit v1.0! 🚀

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M
replied to their post 20 days ago
reacted to julien-c's post with 🔥 about 2 months ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
reacted to fdaudens's post with ❤️ 3 months ago
🦋 Hug the butterfly! You can now add your Bluesky handle to your Hugging Face profile! ✨
reacted to merve's post with 🔥 3 months ago
What a week! A recap for everything you missed ❄️
merve/nov-22-releases-673fbbcfc1c97c4f411def07
Multimodal ✨
> Mistral AI released Pixtral 124B, a gigantic open vision language model
> Llava-CoT (formerly known as Llava-o1) was released, a multimodal reproduction of the o1 model by PKU
> OpenGVLab released MMPR: a new multimodal reasoning dataset
> Jina has released Jina-CLIP-v2 0.98B multilingual multimodal embeddings
> Apple released new SotA vision encoders AIMv2

LLMs 🦙
> AllenAI dropped a huge release of models, datasets and scripts for TΓΌlu, a family of models based on Llama 3.1 aligned with SFT, DPO and a new technique they have developed called RLVR
> Jina has released embeddings-v3: new multilingual embeddings with longer context
> Hugging Face released SmolTalk: synthetic dataset used to align SmolLM2 using supervised fine-tuning
> Microsoft released orca-agentinstruct-1M-v1: a gigantic instruction dataset of 1M synthetic instruction pairs

Image Generation 🖼️
> Black Forest Labs released Flux.1 Tools: four new models for different image modifications and two LoRAs for image conditioning and better steering of generations

Lastly, Hugging Face released a new library, Observers: a lightweight SDK for monitoring interactions with AI APIs and easily storing and browsing them 📚
$ pip install observers
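The general pattern behind an SDK like this can be sketched in a few lines of plain Python. Note that this is a generic illustration of request/response logging, not the Observers API itself: `InteractionLog`, `record`, and `fake_completion` are made-up names for this sketch.

```python
# Generic sketch of the API-monitoring wrapper pattern (NOT the Observers API;
# all names here are hypothetical, for illustration only).
import time
from dataclasses import dataclass, field

@dataclass
class InteractionLog:
    """Collects one record per wrapped API call, with latency metadata."""
    records: list = field(default_factory=list)

    def record(self, fn, **kwargs):
        start = time.time()
        result = fn(**kwargs)  # forward the call to the real API client
        self.records.append({
            "function": fn.__name__,
            "kwargs": kwargs,
            "result": result,
            "latency_s": round(time.time() - start, 4),
        })
        return result

def fake_completion(prompt: str) -> str:
    """Stand-in for a real model API call."""
    return prompt.upper()

log = InteractionLog()
reply = log.record(fake_completion, prompt="hello")
print(reply)             # HELLO
print(len(log.records))  # 1
```

The wrapper returns the underlying result unchanged, so logging stays transparent to the caller; a real SDK would additionally persist the records to a store you can browse later.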
reacted to AdinaY's post with 🚀 3 months ago
reacted to merve's post with 🚀 4 months ago
reacted to nroggendorff's post with 🔥 5 months ago
posted an update 5 months ago
Looking for a logo idea 👀?
I made a new cool space enzostvs/Logo.Ai to help you design a great logo in seconds!

Here are some examples of what you can do, feel free to share yours too! 🚀
reacted to cbensimon's post with ❤️ 5 months ago
Hello everybody,

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.

Major improvements:

1. GPU cold starts are about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
4. Improved compatibility and PyTorch integration, increasing the number of ZeroGPU-compatible Spaces without requiring any modifications!

Feel free to answer in the post if you have any questions

🤗 Best regards,
Charles
posted an update 5 months ago
What if we asked the AI what it thought of our Hugging Face profile? 👹
I've released a new space capable of doing it... watch out, it hits hard! 🥊

Try it now ➡️ enzostvs/hugger-roaster

Share your roast below 👇
reacted to victor's post with 🔥 5 months ago
🙋 Calling all Hugging Face users! We want to hear from YOU!

What feature or improvement would make the biggest impact on Hugging Face?

Whether it's the Hub, better documentation, new integrations, or something completely different – we're all ears!

Your feedback shapes the future of Hugging Face. Drop your ideas in the comments below! 👇
reacted to alvdansen's post with 🚀 7 months ago
New model drop... 🥁

FROSTING LANE REDUX

The v1 of this model was released during a big model push, so I think it got lost in the shuffle. I revisited it for a project and realized it wasn't inventive enough around certain concepts, so I decided to retrain.

alvdansen/frosting_lane_redux

I think the original model was really strong on its own, but because it was trained on fewer images, I found it was producing a very lackluster range of facial expressions, so I wanted to improve that.

The hardest part of creating models like this, I find, is maintaining the detailed linework without overfitting. It takes a really balanced dataset; I repeat the data 12 times during the process, stopping at the last 10-20 epochs.

It is very difficult to predict the exact amount of time needed, so for me it is crucial to do epoch stops. Every model has a different threshold for ideal success.
reacted to not-lain's post with 🤗 7 months ago
I am now a Hugging Face fellow 🥳
reacted to clem's post with 🚀 8 months ago
Who said you couldn't build a big business based on open-source AI? Congrats Mistral team: https://huggingface.co/mistralai
reacted to fdaudens's post with 🔥 10 months ago
reacted to DmitryRyumin's post with 🚀❤️ 11 months ago
🚀🗣️🌟 New Research Alert - ICASSP 2024! 🌟🗣️🚀
📄 Title: AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement 🌟🚀

📝 Description: Diffusion-based resynthesis and HuBERT-based speech quality enhancement.

👥 Authors: Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu

📅 Conference: ICASSP, 14-19 April 2024 | Seoul, Korea 🇰🇷

🔗 Paper: AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement (2309.08030)

🌐 Web Page: https://home.ttic.edu/~jcchou/demo/avse/avse_demo.html

📚 More Papers: more cutting-edge research presented at other conferences is available in DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin

🚀 Added to the Speech Enhancement Collection: DmitryRyumin/speech-enhancement-65de31e1b6d9a040c151702e

πŸ” Keywords: #AV2Wav #SpeechEnhancement #SpeechProcessing #AudioVisual #Diffusion #ICASSP2024 #Innovation
reacted to Xenova's post with ❀️ 12 months ago
Introducing Remove Background Web: in-browser background removal, powered by @briaai's new RMBG-v1.4 model and 🤗 Transformers.js!

Everything runs 100% locally, meaning none of your images are uploaded to a server! 🤯 At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).

Check it out! 👇
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
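Conceptually, a background-removal model predicts a per-pixel foreground matte that is multiplied into the image's alpha channel. A toy, stdlib-only sketch of that final compositing step (in a real pipeline the matte comes from the RMBG-1.4 network; the pixel values and function name here are made up for illustration):

```python
# Toy sketch of alpha-matte compositing, the last step of background removal.
# A real pipeline predicts `matte` with a neural network (e.g. RMBG-1.4);
# here it is hard-coded for illustration.
def apply_matte(rgba_pixels, matte):
    """Scale each pixel's alpha by its predicted foreground probability."""
    return [(r, g, b, int(a * fg))
            for (r, g, b, a), fg in zip(rgba_pixels, matte)]

pixels = [(255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255)]
matte = [1.0, 0.0, 0.5]  # 1 = keep (foreground), 0 = drop (background)

print(apply_matte(pixels, matte))
# [(255, 0, 0, 255), (0, 255, 0, 0), (0, 0, 255, 127)]
```

Background pixels end up fully transparent (alpha 0) while foreground pixels keep their original alpha, which is why the output can be saved directly as a transparent PNG.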