Consent in Crisis: The Rapid Decline of the AI Data Commons Paper • 2407.14933 • Published Jul 20, 2024 • 12
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 46
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Paper • 2404.00399 • Published Mar 30, 2024 • 42
Post: Introducing Indic Chat! Try out the best open-source Indic LLMs now at https://www.indic.chat/
Models available:
• Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0
• GenVRadmin/AryaBhatta-GemmaOrca
• BhabhaAI/Gajendra-v0.1
• ai4bharat/Airavata
Additionally:
1. We are opening our Discord to everyone to collaborate on and accelerate Indic LLMs: https://bhabha.ai/discord
2. We are releasing a filtered, Hindi-translated version (~600K rows) of the OpenHermes-2.5 instruction dataset: BhabhaAI/openhermes-2.5-hindi
Also, thanks to our compute sponsors, Telugu LLM Labs and Bhabha AI, for helping us serve models for Indic Chat. If you'd like to be a sponsor too, check out https://www.indic.chat/sponsor
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data Paper • 2403.11207 • Published Mar 17, 2024 • 15
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning Paper • 2402.06619 • Published Feb 9, 2024 • 55
Post: Introducing Gajendra! An early release of our 7B Hindi-Hinglish-English instruction fine-tuned language model.
Model: BhabhaAI/Gajendra-v0.1
We additionally explore ways to filter examples that can be translated from English to Hindi, and are releasing initial versions of both the dataset and the model for it.
Model: BhabhaAI/Mistral-translation-classify
Dataset: BhabhaAI/translation-classify
We look forward to collaborating with the open-source community to accelerate and release Hindi LLMs.
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22, 2024 • 22
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing Paper • 2206.15076 • Published Jun 30, 2022 • 4
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 28
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 7
A Deep Neural Network for SSVEP-based Brain-Computer Interfaces Paper • 2011.08562 • Published Nov 17, 2020 • 2
Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors Paper • 2305.18274 • Published May 29, 2023 • 4