Violette

AI & ML interests

None yet

Organizations

Hugging Face, BigScience Catalogue Data, BigScience Data, How to teach Hugging Face?, Enterprise Explorers, Women on Hugging Face, Social Post Explorers, Nerdy Face, Self-serve FTW, Inference Explorers

Violette's activity

reacted to clem's post with πŸ”₯ 3 months ago
This is no Woodstock AI but will be fun nonetheless haha. I’ll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.

1,000 spots available, first come, first served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
reacted to jeffboudier's post with πŸ”₯ 5 months ago
Pro tip: if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text. Handy!

In this short video I show how to set it up.
reacted to thomwolf's post with πŸ§ πŸš€πŸ”₯ 10 months ago
Is it time for the open-source AI robot revolution πŸš€?

With @haixuantao and @Leyo we’ve been playing with a low-cost DJI robot controlled by three local open-source AI models (Whisper, Idefics2, Parler-TTS - all Apache 2.0) and orchestrated by dora-rs.

Links to find all the hardware/software we used in the demo:
- robot control framework – dora-rs: https://github.com/dora-rs/dora
- speech-to-text model – whisper: openai/whisper-base
- vision-text model – Idefics2: HuggingFaceM4/idefics2-8b-AWQ
- text-to-speech model – ParlerTTS mini: parler-tts/parler_tts_mini_v0.1
- robot: https://dji.com/robomaster-s1
- code gist: https://gist.github.com/haixuanTao/860e1740245dc2c8dd85b496150a9320
- Larger codebase: dora-rs/dora-idefics2
- laptop/pc: any with a recent GPU card (ours has an RTX 4090)
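As a purely illustrative sketch of the listen-look-speak loop the demo wires together (the real orchestration runs as a dora-rs dataflow graph; these stub functions, and their names, are hypothetical stand-ins for the three models):

```python
# Hypothetical stubs standing in for the three local models in the demo.
def transcribe(audio):
    """Whisper stand-in: microphone audio -> text command."""
    return "what do you see?"

def vision_answer(command, frame):
    """Idefics2 stand-in: (text command, camera frame) -> text reply."""
    return "I see a small robot in front of me."

def speak(text):
    """Parler-TTS stand-in: text reply -> audio samples."""
    return [0.0] * len(text)

def robot_step(audio, frame):
    # One tick of the loop: listen, look, answer out loud.
    return speak(vision_answer(transcribe(audio), frame))
```

In the actual demo each stage is a node in the dora-rs graph rather than a direct function call, which is what lets the three models run and exchange messages concurrently.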

Enjoy!
reacted to nisten's post with πŸ”₯ 10 months ago
posted an update 10 months ago
πŸ”₯ Next Thursday 4/25 at 8am PT / 11am ET / 17h CET, join our live Hugging Cast to learn how to deploy open models on Google Cloud.

Register ➑️ https://streamyard.com/watch/xz2nxp85Pi6e

@philschmid, @tengomucho and @jeffboudier will show you brand-new Hub integrations built with GCP:
πŸ”₯ with HF Inference Endpoints
🌎 with Vertex and GKE
πŸš€ on TPU
reacted to HugoLaurencon's post with πŸ€―πŸš€πŸ§ πŸ€πŸ‘ 10 months ago
We release Idefics2-8B, a foundation vision-language model with SOTA results for its size on many benchmarks.

For Idefics2, we adopted a simple architecture:
- Images are fed to a vision encoder, then to a modality projection to match the input dimension of the LLM, and finally to a perceiver resampler for efficient pooling.
- Interleaved image-text data are then passed to the LLM.
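A minimal shape-bookkeeping sketch of that pipeline, in plain Python. This is not the Idefics2 code; the patch size (14), vision dimension (1152, SigLIP-style) and LLM dimension (4096, Mistral-7B-style) are illustrative assumptions:

```python
# Hypothetical shape bookkeeping for the pipeline described above.
# All dimensions are illustrative assumptions, not the real Idefics2 config.

def vision_encoder(image_hw, patch=14, vision_dim=1152):
    """One feature vector per image patch: (num_patches, vision_dim)."""
    h, w = image_hw
    return ((h // patch) * (w // patch), vision_dim)

def modality_projection(feats, llm_dim=4096):
    """Match the LLM input dimension; sequence length is unchanged."""
    num_patches, _ = feats
    return (num_patches, llm_dim)

def perceiver_resampler(feats, num_latents=64):
    """Pool the patch sequence down to a fixed number of latent tokens."""
    _, dim = feats
    return (num_latents, dim)

# A 980x980 image yields 70*70 = 4900 patches, pooled down to 64 LLM tokens.
tokens = perceiver_resampler(modality_projection(vision_encoder((980, 980))))
```

However many patches the encoder produces, the resampler emits a fixed number of latent tokens, which is where the inference-time efficiency discussed below comes from.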

During pre-training:
- The modality projection and perceiver resampler weights are newly initialized.
- We start with pre-trained models for the vision encoder and the LLM, and continue the training with LoRA.
- In total, we see 1.5T images!

We pre-train on 3 types of data, all publicly available:
- Interleaved image-text documents: our dataset OBELICS HuggingFaceM4/OBELICS
- Image-caption pairs: only synthetic captions!
- PDF documents: IDL and PDFA

We kept the aspect ratio of the images with the Patch n' Pack strategy, with a resolution of up to 980x980.
At inference, it's also more efficient for lower-resolution images.

For the SFT, we build The Cauldron, a collection of 50 high-quality datasets in the user/assistant format.
It is a ready-to-use dataset for the fine-tuning of any VLM.
HuggingFaceM4/the_cauldron

Most current models, like LLaVA-NeXT, encode images with an excessive number of tokens, like 2880.
Instead, we put a focus on being efficient at inference by training on a mix of images encoded with 64 tokens, and 320 tokens.
The result is that we perform favorably compared to the best models in our size class, while being efficient at inference.
reacted to zolicsaki's post with πŸš€ 10 months ago
reacted to Smooke's post with πŸ”₯ 10 months ago
reacted to davanstrien's post with πŸš€ 10 months ago
Could more DPO-style preference data be crucial for enhancing open LLMs across different languages?

Leveraging a 7k-example preference dataset, Argilla (@alvarobartt), Hugging Face (@lewtun) and Kaist AI (@JW17 & @nlee-208) used Kaist AI's recently introduced ORPO technique, ORPO: Monolithic Preference Optimization without Reference Model (2403.07691), with the latest Mistral AI MoE model to create a very high-performing open LLM: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1

Since ORPO doesn't require a separate SFT stage, all that is needed is a strong base model + high-quality DPO-style datasets.
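A rough sketch of the odds-ratio penalty at the heart of ORPO, per the paper's formulation (the lambda weighting and per-token sequence-probability details are omitted; `p_chosen`/`p_rejected` stand in for the model's sequence probabilities):

```python
import math

def odds(p):
    # Odds of generating a sequence the model assigns probability p.
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # -log sigmoid(log odds ratio): small when the model already favors
    # the chosen answer over the rejected one, large otherwise.
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
```

The full ORPO objective adds this penalty, scaled by a small lambda, to the standard NLL loss on the chosen answer, which is why no reference model and no separate SFT stage are needed.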

Currently, there is a significant lack of non-English DPO datasets. Filling this gap could significantly improve open LLMs in various languages.

You can get an overview of the current state of DPO datasets across different languages here: https://huggingface.co/spaces/DIBT/preference_data_by_language
reacted to andrewyng's post with β€οΈπŸ€―πŸ‘ 11 months ago
DeepLearning.AI just announced a new short course: Open Source Models with Hugging Face πŸ€—, taught by Hugging Face's own Maria Khalusova, Marc Sun and Younes Belkada!

As many of you already know, Hugging Face has been a game changer by letting developers quickly grab any of hundreds of thousands of already-trained open source models to assemble into new applications. This course teaches you best practices for building this way, including how to search and choose among models.

You'll learn to use the Transformers library and walk through multiple models for text, audio, and image processing, including zero-shot image segmentation, zero-shot audio classification, and speech recognition. You'll also learn to use multimodal models for visual question answering, image search, and image captioning. Finally, you’ll learn how to demo what you build locally, on the cloud, or via an API using Gradio and Hugging Face Spaces.

Thank you very much to Hugging Face's wonderful team for working with us on this.

You can sign up for the course here: https://www.deeplearning.ai/short-courses/open-source-models-hugging-face/