enzo PRO

enzostvs

AI & ML interests

here to make beautiful things

Organizations

Hugging Face · Blog-explorers · Hugging Face Tools · Devart.bio · Social Post Explorers · Dev Mode Explorers · Hugging Face Discord Community

enzostvs's activity

reacted to merve's post with 👍 5 days ago
This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d
LLMs 💬
> Huge: AllenAI released new Tülu models based on Llama 3.1 405B that outperform DeepSeek R1, using Reinforcement Learning with Verifiable Rewards (RLVR) 🔥
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), under the Apache 2.0 license 😱
> Alibaba Qwen released their 1M-context-length Qwen2.5-Instruct-1M models, great for agentic use, with Apache 2.0 license 🔥
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 on a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens across six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct on the OpenThoughts dataset

VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5VL, with amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) with an MIT license
> BEN2 is a new background removal model with an MIT license!

Audio 🗣️
> YuE is a new open-source music generation foundation model for lyrics-to-song generation

Codebase 👩🏻‍💻
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1
reacted to fdaudens's post with 🔥 6 days ago
🎯 Kokoro TTS just hit v1.0! 🚀

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M
replied to their post 20 days ago
reacted to julien-c's post with 🔥 about 2 months ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
reacted to fdaudens's post with ❤️ 3 months ago
🦋 Hug the butterfly! You can now add your Bluesky handle to your Hugging Face profile! ✨
reacted to merve's post with 🔥 3 months ago
What a week! A recap for everything you missed ❄️
merve/nov-22-releases-673fbbcfc1c97c4f411def07
Multimodal ✨
> Mistral AI released Pixtral 124B, a gigantic open vision language model
> Llava-CoT (formerly known as Llava-o1) was released, a multimodal reproduction of the o1 model by PKU
> OpenGVLab released MMPR: a new multimodal reasoning dataset
> Jina has released Jina-CLIP-v2 0.98B multilingual multimodal embeddings
> Apple released new SotA vision encoders AIMv2

LLMs 🦙
> AllenAI dropped a huge release of models, datasets and scripts for TΓΌlu, a family of models based on Llama 3.1 aligned with SFT, DPO and a new technique they have developed called RLVR
> Jina has released embeddings-v3: new multilingual embeddings with longer context
> Hugging Face released SmolTalk: synthetic dataset used to align SmolLM2 using supervised fine-tuning
> Microsoft released orca-agentinstruct-1M-v1: a gigantic instruction dataset of 1M synthetic instruction pairs

Image Generation 🖼️
> Black Forest Labs released Flux.1 Tools: four new models for different image modifications and two LoRAs for image conditioning and better steering of generations

Lastly, Hugging Face released a new library, Observers: a lightweight SDK for monitoring interactions with AI APIs and easily storing and browsing them 📚
$ pip install observers
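The general pattern behind an SDK like this can be sketched in a few lines of plain Python. Note that this is a generic illustration of request/response logging, not the Observers API itself: `InteractionLog`, `record`, and `fake_completion` are made-up names for this sketch.

```python
# Generic sketch of the API-monitoring wrapper pattern (NOT the Observers API;
# all names here are hypothetical, for illustration only).
import time
from dataclasses import dataclass, field

@dataclass
class InteractionLog:
    """Collects one record per wrapped API call, with latency metadata."""
    records: list = field(default_factory=list)

    def record(self, fn, **kwargs):
        start = time.time()
        result = fn(**kwargs)  # forward the call to the real API client
        self.records.append({
            "function": fn.__name__,
            "kwargs": kwargs,
            "result": result,
            "latency_s": round(time.time() - start, 4),
        })
        return result

def fake_completion(prompt: str) -> str:
    """Stand-in for a real model API call."""
    return prompt.upper()

log = InteractionLog()
reply = log.record(fake_completion, prompt="hello")
print(reply)             # HELLO
print(len(log.records))  # 1
```

The wrapper returns the underlying result unchanged, so logging stays transparent to the caller; a real SDK would additionally persist the records to a store you can browse later.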
reacted to AdinaY's post with 🚀 3 months ago
reacted to merve's post with 🚀 4 months ago
reacted to nroggendorff's post with 🔥 5 months ago
posted an update 5 months ago
Looking for a logo idea 👀?
I made a new cool space enzostvs/Logo.Ai to help you design a great logo in seconds!

Here are some examples of what you can do, feel free to share yours too! 🚀
reacted to cbensimon's post with ❤️ 5 months ago
Hello everybody,

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.

Major improvements:

1. GPU cold starts are about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True))
4. Improved compatibility and PyTorch integration, increasing the number of ZeroGPU-compatible Spaces without requiring any modifications!

Feel free to answer in the post if you have any questions

🤗 Best regards,
Charles
posted an update 5 months ago
What if we asked the AI what it thought of our Hugging Face profile? 👹
I've released a new space capable of doing it... watch out, it hits hard! 🥊

Try it now ➡️ enzostvs/hugger-roaster

Share your roast below 👇
reacted to victor's post with 🔥 5 months ago
🙋 Calling all Hugging Face users! We want to hear from YOU!

What feature or improvement would make the biggest impact on Hugging Face?

Whether it's the Hub, better documentation, new integrations, or something completely different – we're all ears!

Your feedback shapes the future of Hugging Face. Drop your ideas in the comments below! 👇
reacted to alvdansen's post with 🚀 7 months ago
New model drop... 🥁

FROSTING LANE REDUX

The v1 of this model was released during a big model push, so I think it got lost in the shuffle. I revisited it for a project and realized it wasn't inventive enough around certain concepts, so I decided to retrain.

alvdansen/frosting_lane_redux

I think the original model was really strong on its own, but because it was trained on fewer images, I found it was producing a very lackluster range of facial expressions, so I wanted to improve that.

The hardest part of creating models like this, I find, is maintaining the detailed linework without overfitting. It takes a really balanced dataset; I repeat the data 12 times during the process, stopping at the last 10-20 epochs.

It is very difficult to predict the exact amount of time needed, so for me it is crucial to do epoch stops. Every model has a different threshold for ideal success.
reacted to not-lain's post with 🤗 7 months ago
I am now a Hugging Face fellow 🥳
reacted to clem's post with 🚀 8 months ago
Who said you couldn't build a big business based on open-source AI? Congrats Mistral team: https://huggingface.co/mistralai
reacted to fdaudens's post with 🔥 10 months ago
reacted to DmitryRyumin's post with 🚀❤️ 11 months ago
🚀🗣️🌟 New Research Alert - ICASSP 2024! 🌟🗣️🚀
📄 Title: AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement 🌟🚀

📝 Description: Diffusion-based resynthesis and HuBERT-based speech quality enhancement.

👥 Authors: Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu

📅 Conference: ICASSP, 14-19 April 2024 | Seoul, Korea 🇰🇷

🔗 Paper: AV2Wav: Diffusion-Based Re-synthesis from Continuous Self-supervised Features for Audio-Visual Speech Enhancement (2309.08030)

🌐 Web Page: https://home.ttic.edu/~jcchou/demo/avse/avse_demo.html

📚 More Papers: more cutting-edge research presented at other conferences is available in DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin

🚀 Added to the Speech Enhancement Collection: DmitryRyumin/speech-enhancement-65de31e1b6d9a040c151702e

πŸ” Keywords: #AV2Wav #SpeechEnhancement #SpeechProcessing #AudioVisual #Diffusion #ICASSP2024 #Innovation
reacted to Xenova's post with ❀️ 12 months ago
Introducing Remove Background Web: in-browser background removal, powered by @briaai's new RMBG-v1.4 model and 🤗 Transformers.js!

Everything runs 100% locally, meaning none of your images are uploaded to a server! 🤯 At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).

Check it out! 👇
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
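Conceptually, a background-removal model predicts a per-pixel foreground matte that is multiplied into the image's alpha channel. A toy, stdlib-only sketch of that final compositing step (in a real pipeline the matte comes from the RMBG-1.4 network; the pixel values and function name here are made up for illustration):

```python
# Toy sketch of alpha-matte compositing, the last step of background removal.
# A real pipeline predicts `matte` with a neural network (e.g. RMBG-1.4);
# here it is hard-coded for illustration.
def apply_matte(rgba_pixels, matte):
    """Scale each pixel's alpha by its predicted foreground probability."""
    return [(r, g, b, int(a * fg))
            for (r, g, b, a), fg in zip(rgba_pixels, matte)]

pixels = [(255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255)]
matte = [1.0, 0.0, 0.5]  # 1 = keep (foreground), 0 = drop (background)

print(apply_matte(pixels, matte))
# [(255, 0, 0, 255), (0, 255, 0, 0), (0, 0, 255, 127)]
```

Background pixels end up fully transparent (alpha 0) while foreground pixels keep their original alpha, which is why the output can be saved directly as a transparent PNG.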