Violette

AI & ML interests

None yet

Organizations

Hugging Face, BigScience Catalogue Data, BigScience Data, How to teach Hugging Face?, Enterprise Explorers, Women on Hugging Face, Social Post Explorers, Nerdy Face, Self-serve FTW, Inference Explorers

Violette's activity

reacted to clem's post with πŸ”₯ 3 months ago
This is no Woodstock AI but will be fun nonetheless haha. I’ll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.

1,000 spots available, first come, first served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
reacted to jeffboudier's post with πŸ”₯ 5 months ago
Pro tip: if you're a Firefox user, you can set up Hugging Chat as an integrated AI assistant, with contextual links to summarize or simplify any text. Handy!

In this short video I show how to set it up.
reacted to thomwolf's post with πŸ§ πŸš€πŸ”₯ 10 months ago
Is it time for the open-source AI robot revolution πŸš€?

With @haixuantao and @Leyo we’ve been playing with a low-cost DJI robot controlled by three local open-source AI models (Whisper, Idefics2, Parler-TTS - all Apache 2.0) and orchestrated by dora-rs.

Links to find all the hardware/software we used in the demo:
- robot control framework – dora-rs: https://github.com/dora-rs/dora
- speech-to-text model – whisper: openai/whisper-base
- vision-text model – Idefics2: HuggingFaceM4/idefics2-8b-AWQ
- text-to-speech model – ParlerTTS mini: parler-tts/parler_tts_mini_v0.1
- robot: https://dji.com/robomaster-s1
- code gist: https://gist.github.com/haixuanTao/860e1740245dc2c8dd85b496150a9320
- Larger codebase: dora-rs/dora-idefics2
- laptop/pc: any with a recent GPU card (ours has an RTX 4090)
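As a purely illustrative sketch of the listen-look-speak loop the demo wires together (the real orchestration runs as a dora-rs dataflow graph; these stub functions, and their names, are hypothetical stand-ins for the three models):

```python
# Hypothetical stubs standing in for the three local models in the demo.
def transcribe(audio):
    """Whisper stand-in: microphone audio -> text command."""
    return "what do you see?"

def vision_answer(command, frame):
    """Idefics2 stand-in: (text command, camera frame) -> text reply."""
    return "I see a small robot in front of me."

def speak(text):
    """Parler-TTS stand-in: text reply -> audio samples."""
    return [0.0] * len(text)

def robot_step(audio, frame):
    # One tick of the loop: listen, look, answer out loud.
    return speak(vision_answer(transcribe(audio), frame))
```

In the actual demo each stage is a node in the dora-rs graph rather than a direct function call, which is what lets the three models run and exchange messages concurrently.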

Enjoy!
reacted to nisten's post with πŸ”₯ 10 months ago
posted an update 10 months ago
πŸ”₯ Next Thursday 4/25 at 8am PT / 11am ET / 17h CET, join our live Hugging Cast to learn how to deploy open models on Google Cloud.

Register ➑️ https://streamyard.com/watch/xz2nxp85Pi6e

@philschmid, @tengomucho and @jeffboudier will show you brand-new Hub integrations built with GCP:
πŸ”₯ with HF Inference Endpoints
🌎 with Vertex and GKE
πŸš€ on TPU
reacted to HugoLaurencon's post with πŸ€―πŸš€πŸ§ πŸ€πŸ‘ 10 months ago
We release Idefics2-8B, a foundation vision-language model with SOTA results for its size on many benchmarks.

For Idefics2, we adopted a simple architecture:
- Images are fed to a vision encoder, then to a modality projection to match the input dimension of the LLM, and finally to a perceiver resampler for efficient pooling.
- Interleaved image-text data are then passed to the LLM.
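A minimal shape-bookkeeping sketch of that pipeline, in plain Python. This is not the Idefics2 code; the patch size (14), vision dimension (1152, SigLIP-style) and LLM dimension (4096, Mistral-7B-style) are illustrative assumptions:

```python
# Hypothetical shape bookkeeping for the pipeline described above.
# All dimensions are illustrative assumptions, not the real Idefics2 config.

def vision_encoder(image_hw, patch=14, vision_dim=1152):
    """One feature vector per image patch: (num_patches, vision_dim)."""
    h, w = image_hw
    return ((h // patch) * (w // patch), vision_dim)

def modality_projection(feats, llm_dim=4096):
    """Match the LLM input dimension; sequence length is unchanged."""
    num_patches, _ = feats
    return (num_patches, llm_dim)

def perceiver_resampler(feats, num_latents=64):
    """Pool the patch sequence down to a fixed number of latent tokens."""
    _, dim = feats
    return (num_latents, dim)

# A 980x980 image yields 70*70 = 4900 patches, pooled down to 64 LLM tokens.
tokens = perceiver_resampler(modality_projection(vision_encoder((980, 980))))
```

However many patches the encoder produces, the resampler emits a fixed number of latent tokens, which is where the inference-time efficiency discussed below comes from.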

During pre-training:
- The modality projection and perceiver resampler weights are newly initialized.
- We start with pre-trained models for the vision encoder and the LLM, and continue the training with LoRA.
- In total, we see 1.5T images!

We pre-train on 3 types of data, all publicly available:
- Interleaved image-text documents: our dataset OBELICS HuggingFaceM4/OBELICS
- Image-caption pairs: only synthetic captions!
- PDF documents: IDL and PDFA

We kept the aspect ratio of the images with the Patch n' Pack strategy, with a resolution of up to 980x980.
At inference, it's also more efficient for lower-resolution images.

For the SFT, we build The Cauldron, a collection of 50 high-quality datasets in the user/assistant format.
It is a ready-to-use dataset for the fine-tuning of any VLM.
HuggingFaceM4/the_cauldron

Most current models, like LLaVA-NeXT, encode images with an excessive number of tokens, like 2880.
Instead, we put a focus on being efficient at inference by training on a mix of images encoded with 64 tokens, and 320 tokens.
The result is that we perform favorably compared to the best models in our size class, while being efficient at inference.
reacted to zolicsaki's post with πŸš€ 10 months ago
reacted to Smooke's post with πŸ”₯ 10 months ago
reacted to davanstrien's post with πŸš€ 10 months ago
Could more DPO-style preference data be crucial for enhancing open LLMs across different languages?

Leveraging a 7k-example preference dataset, Argilla (@alvarobartt), Hugging Face (@lewtun) and Kaist AI (@JW17 & @nlee-208) used Kaist AI's recently introduced ORPO technique, ORPO: Monolithic Preference Optimization without Reference Model (2403.07691), with the latest Mistral AI MoE model to create a very high-performing open LLM: HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1

Since ORPO doesn't require a separate SFT stage, all that is needed is a strong base model + high-quality DPO-style datasets.
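A rough sketch of the odds-ratio penalty at the heart of ORPO, per the paper's formulation (the lambda weighting and per-token sequence-probability details are omitted; `p_chosen`/`p_rejected` stand in for the model's sequence probabilities):

```python
import math

def odds(p):
    # Odds of generating a sequence the model assigns probability p.
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # -log sigmoid(log odds ratio): small when the model already favors
    # the chosen answer over the rejected one, large otherwise.
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
```

The full ORPO objective adds this penalty, scaled by a small lambda, to the standard NLL loss on the chosen answer, which is why no reference model and no separate SFT stage are needed.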

Currently, there is a significant lack of non-English DPO datasets. Filling this gap could significantly improve open LLMs in various languages.

You can get an overview of the current state of DPO datasets across different languages here: https://huggingface.co/spaces/DIBT/preference_data_by_language
reacted to andrewyng's post with β€οΈπŸ€―πŸ‘ 11 months ago
DeepLearning.AI just announced a new short course: Open Source Models with Hugging Face πŸ€—, taught by Hugging Face's own Maria Khalusova, Marc Sun and Younes Belkada!

As many of you already know, Hugging Face has been a game changer by letting developers quickly grab any of hundreds of thousands of already-trained open source models to assemble into new applications. This course teaches you best practices for building this way, including how to search and choose among models.

You'll learn to use the Transformers library and walk through multiple models for text, audio, and image processing, including zero-shot image segmentation, zero-shot audio classification, and speech recognition. You'll also learn to use multimodal models for visual question answering, image search, and image captioning. Finally, you’ll learn how to demo what you build locally, on the cloud, or via an API using Gradio and Hugging Face Spaces.

Thank you very much to Hugging Face's wonderful team for working with us on this.

You can sign up for the course here: https://www.deeplearning.ai/short-courses/open-source-models-hugging-face/