Thank you very much for sharing, @alielfilali01!
María Grandury
mariagrandury
AI & ML interests
Responsible AI | NLP in Spanish @ hf.co/SomosNLP
Recent Activity
updated
a collection
2 days ago
LLM Eval
updated
a collection
2 days ago
Corpus: Evaluation datasets for ES & LATAM
upvoted
an
article
2 days ago
Open-R1: a fully open reproduction of DeepSeek-R1
Organizations
mariagrandury's activity
![](https://cdn-avatars.huggingface.co/v1/production/uploads/1665073337782-5f9c00a5777efc07d7f1e4be.png)
replied to
alielfilali01's
post
4 months ago
reacted to
alielfilali01's
post with ❤️👀
4 months ago
Post
1212
@mariagrandury
(SomosNLP) and team release the Spanish leaderboard!!!
It is impressive how they chose to design this leaderboard and how it supports 4 languages (all spoken in Spain, of course).
Check it out at this link:
la-leaderboard/la-leaderboard
reacted to
clefourrier's
post with 🔥❤️
10 months ago
Post
5785
In a basic chatbot, errors are annoyances. In medical LLMs, errors can have life-threatening consequences 🩸
It's therefore vital to benchmark/follow advances in medical LLMs before even thinking about deployment.
This is why a small research team introduced a medical LLM leaderboard, to get reproducible and comparable results between LLMs, and allow everyone to follow advances in the field.
openlifescienceai/open_medical_llm_leaderboard
Congrats to @aaditya and @pminervini !
Learn more in the blog: https://huggingface.co/blog/leaderboard-medicalllm
reacted to
clefourrier's
post with 🤯
about 1 year ago
Post
🏅 New top model on the GAIA benchmark!
Called FRIDAY, it's a mysterious new autonomous agent, which achieved quite good performance on both the public validation set *and* the private test set.
It notably passed 10 points on the validation set and 5 points on the test set for our hardest questions (level 3): they require taking arbitrarily long sequences of actions, using any number of tools, and accessing the world in general! ✨
The GAIA benchmark evaluates next-generation LLMs (LLMs with augmented capabilities due to added tooling, efficient prompting, access to search, etc.) and was co-authored by @gregmialz @ThomasNLG @ylecun @thomwolf and myself: gaia-benchmark/leaderboard
replied to
their
post
about 1 year ago
Glad you find it useful! Thank you very much :)
reacted to
chansung's
post with ❤️
about 1 year ago
Post
Want to read a curated list of papers by @akhaliq in your mailbox?
Thanks to the API provided by Hugging Face, I made a simple GitHub Actions-based newsletter bot to send out 🤗 Daily Papers. Check out the attached video clip to get a sense of what it is!
Internally, it leverages the Gemini API to assign tags to each paper, and all papers are archived by tags and batches. Of course, you can go directly to the papers' pages from your mailbox to check out the full paper!
Since everything is automated, GitHub Actions and the Gemini API are free, and subscription management is free via Google Groups, this newsletter bot is entirely free. Furthermore, if you wish, you can fork the project for your own newsletter service.
subscription: https://groups.google.com/g/hf-daily-paper-newsletter
project repo: https://github.com/deep-diver/hf-daily-paper-newsletter
As a next step, I will experimentally add an automatic translation (to Korean) feature for every paper.
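The tag-and-archive step the post describes can be sketched as follows. This is a minimal illustration, not the bot's actual code: `assign_tags` stands in for the Gemini API call, and the field names (`title`) are assumptions for the sake of the example.

```python
from collections import defaultdict


def archive_by_tag(papers, assign_tags):
    """Group daily papers into a {tag: [titles]} archive.

    `assign_tags` is any callable mapping a paper dict to a list of
    tags; in the real bot this is where the Gemini API call would go.
    """
    archive = defaultdict(list)
    for paper in papers:
        for tag in assign_tags(paper):
            archive[tag].append(paper["title"])
    return dict(archive)


# Toy rule-based tagger standing in for the LLM tagger.
def toy_tagger(paper):
    return ["3D"] if "Gaussian" in paper["title"] else ["LLM"]


archive = archive_by_tag(
    [{"title": "DreamGaussian4D"}, {"title": "The LLM Surgeon"}],
    toy_tagger,
)
```

Each batch's archive can then be rendered into the newsletter email body, one section per tag.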
reacted to
BramVanroy's
post with ❤️
about 1 year ago
Post
💡We recently launched a Discord server for #Dutch #NLP and #LLMs. We have more than 50 users from _very_ varying backgrounds! 🧙👩🔬🧑🎨🧑🏫🧑💼 We've already had discussions on eval, tokenizers, RAG, data... a bit of everything. Everyone is welcome to work together on Dutch NLP and LLMs! https://discord.gg/YUUXVZkZJ9
reacted to
dvilasuero's
post with 🤗❤️
about 1 year ago
Post
👋 Hi there!
This is my very first post.
I'll use it to share some old news: a math preference dataset for DPO!
I created this dataset some time ago while we were developing distilabel (https://github.com/argilla-io/distilabel).
Some days ago we found out people are actually using it! So I'll use this post to explain how I built it in case it's useful for the community.
1. I used distilabel's SelfInstruct-inspired task to generate instructions about different math topics. I curated the instructions with Argilla (on Spaces!).
2. Then I used a distilabel Pipeline to build a preference dataset using gpt3.5 as generator and gpt4 as labeller. If I recall correctly I used our JudgeLM implementation (see https://distilabel.argilla.io/latest/technical-reference/tasks/#judgelmtask)
(see the screenshot with the dataset in the Argilla UI)
3. Then I just binarized into chosen, rejected pairs and voilà:
argilla/distilabel-math-preference-dpo
The funny thing is that I used this to do a second DPO run over Notus-7B. I hoped to see an improvement on math/reasoning skills but it actually improved in STEM and Humanities and did worse on Math 🤣 .
In conclusion, this dataset was only a quick experiment. I'm happy to see the community found it useful. Data for DPO and fine-tuning are still a mystery; let's unveil these mysteries in 2024 together!
Follow me for the most exciting datasets for LLMs (and maybe some great, small, efficient models). I plan to announce all Argilla open-source work here!
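Step 3 above (binarizing judge-rated generations into chosen/rejected pairs) can be sketched like this. It is a minimal illustration under assumed field names (`instruction`, `generations`, `ratings`), not the actual distilabel schema:

```python
def binarize(example):
    """Turn one rated-generations example into a DPO preference pair.

    The highest-rated generation becomes `chosen` and the lowest-rated
    one becomes `rejected`.
    """
    # Rank (rating, generation) tuples from best to worst.
    ranked = sorted(
        zip(example["ratings"], example["generations"]), reverse=True
    )
    return {
        "prompt": example["instruction"],
        "chosen": ranked[0][1],
        "rejected": ranked[-1][1],
    }


pair = binarize({
    "instruction": "What is the derivative of x^2?",
    "generations": ["2x", "x"],
    "ratings": [9.0, 3.0],
})
```

Mapping this over a rated dataset yields the prompt/chosen/rejected columns that DPO trainers expect.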
replied to
their
post
about 1 year ago
Thanks Dani!
replied to
their
post
about 1 year ago
Merci Julien!
posted
an
update
about 1 year ago
Post
✅ Ever wondered how to measure transparency in model development?
My last open-source contribution for 2023 is a Space that allows you to self-assess the transparency of your model based on the 100 indicators of the Foundation Model Transparency Index (FMTI).
The original study evaluated the developers of 10 top LLMs. Curious about how yours measures up? 👀
mariagrandury/fmti-transparency-self-assessment
Let's commit to a 2024 with greater transparency in the AI ecosystem! 🚀
reacted to
akhaliq's
post with ❤️
about 1 year ago
Post
Here is my selection of papers for today
https://huggingface.co/papers
Compact Neural Graphics Primitives with Learned Hash Probing
Restoration by Generation with Constrained Priors
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks
InsActor: Instruction-driven Physics-based Characters
Unsupervised Universal Image Segmentation
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis
DreamGaussian4D: Generative 4D Gaussian Splatting
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web
DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision
DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
Prompt Expansion for Adaptive Text-to-Image Generation
PanGu-Draw
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models
The LLM Surgeon
MathPile: A Billion-Token-Scale Pretraining Corpus for Math
MobileVLM: A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
reacted to
akhaliq's
post with ❤️
about 1 year ago
Post
Here is my selection of papers for today (27 Dec) on Hugging Face daily papers newsletter
daily papers feed: https://huggingface.co/papers
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
https://huggingface.co/papers/2312.15715
LangSplat: 3D Language Gaussian Splatting
https://huggingface.co/papers/2312.16084
Human101: Training 100+FPS Human Gaussians in 100s from 1 View
https://huggingface.co/papers/2312.15258
Audiobox: Unified Audio Generation with Natural Language Prompts
https://huggingface.co/papers/2312.15821
HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D
https://huggingface.co/papers/2312.15980
One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
https://huggingface.co/papers/2312.16145
Make-A-Character: High Quality Text-to-3D Character Generation within Minutes
https://huggingface.co/papers/2312.15430
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
https://huggingface.co/papers/2312.15770
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
https://huggingface.co/papers/2312.16171
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
https://huggingface.co/papers/2312.15166
Supervised Knowledge Makes Large Language Models Better In-context Learners
https://huggingface.co/papers/2312.15918
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases
https://huggingface.co/papers/2312.15011
Post
Holiday talk about AI taking over? Let's shift the narrative!
🌟 There is no reason to believe that just because AI systems are intelligent they will want to dominate us. Yann LeCun reminds us that AI systems won't have the same motivations as humans; we'll design them not to.
🌍 Instead of getting distracted by future existential risks, we must address AI’s more pressing risks — like emitting carbon, infringing copyrights and spreading bias. Sasha Luccioni urges us to create tools and legislation that promote transparency and diversity.
💡 Dive deeper into these perspectives:
- Yann's ( @ylecun ) WIRED interview (12'): https://www.wired.com/story/artificial-intelligence-meta-yann-lecun-interview/
- Sasha's ( @sasha ) TED Talk (10'): https://www.ted.com/talks/sasha_luccioni_ai_is_dangerous_but_not_for_the_reasons_you_think
P.S.: Love this new "Posts" feature, big thanks to 🤗 for letting me try it!
What are your go-to citations for AI risks? 👇