Google

company

Verified

Activity Feed

AI & ML interests

Google ❤️ Open Source AI

Articles

Welcome PaliGemma 2 – New vision language models by Google

Dec 5, 2024

• 130

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14, 2024

• 237

google's activity

GopiUppari

in google/gemma-2-2b-it about 7 hours ago

TypeError: GemmaModel.forward() got an unexpected keyword argument 'num_items_in_batch'

#61 opened 4 days ago by

smkhant

lkv

in google/recurrentgemma-9b-it about 8 hours ago

Evaluation Result

#15 opened 6 months ago by

tanliboy

selamw

in google/gemma-2-2b-it about 18 hours ago

Request: DOI

#63 opened 1 day ago by

VibhaB

Request: DOI

#62 opened 1 day ago by

Vibha10

merve

posted an update 6 days ago

Post

3697

This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d
LLMs 💬
> Huge: AllenAI released new Tülu models, based on Llama 3.1 405B and trained with Reinforcement Learning with Verifiable Rewards (RLVR), that outperform DeepSeek R1 🔥
> Mistral AI is back to open-source with their "small" 24B models (base & SFT), with Apache 2.0 license 😱
> Alibaba Qwen released their 1M context length models Qwen2.5-Instruct-1M, great for agentic use with Apache 2.0 license 🔥
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 with a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct on the OpenThoughts dataset

VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5VL, amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) with MIT license
> BEN2 is a new background removal model with MIT license!

Audio 🗣️
> YuE is a new open-source music generation foundation model for lyrics-to-song generation

Codebase 👩🏻‍💻
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1

1 reply

lkv

in google/gemma-2b-it 7 days ago

Pretraining Time Cost?

#30 opened 11 months ago by

fov223

lkv

in google/gemma-2-2b-it 7 days ago

SLERP merge example code?

#20 opened 6 months ago by

grimjim

GopiUppari

in google/gemma-2b 7 days ago

Problems with running on CPU

#79 opened 8 days ago by

IExist999

lkv

in google/paligemma-3b-mix-224 7 days ago

Inference via sagemaker endpoint

#12 opened 6 months ago by

shum123

lkv

in google/gemma-2-27b-it 7 days ago

Can't replicate MMLU results for 27b...

#39 opened 4 months ago by

cinjonr

selamw

in google/gemma-2b 8 days ago

Problems with running on CPU

#79 opened 8 days ago by

IExist999

andsteing

in google/paligemma-3b-pt-448-jax 8 days ago

Create README.md

#5 opened 2 months ago by

ariG23498

andsteing

in google/paligemma-3b-pt-224-jax 8 days ago

Create README.md

#3 opened 2 months ago by

ariG23498

pcuenq

in google/sdxl 9 days ago

Remove use_auth_token

#2064 opened 9 days ago by

pcuenq

updated a Space 9 days ago

1.85k

Stable Diffusion XL on TPUv5e

🏋

Generate images from text prompts with various styles

GopiUppari

in google/gemma-2-2b-it 9 days ago

GGUF version

#59 opened 10 days ago by

appsforbd

lkv

in google/recurrentgemma-9b 9 days ago

CUDA out of memory | Need help

#11 opened 8 months ago by

IlyaCorneli

merve

posted an update 13 days ago

Post

4984

Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- the tiniest VLMs, which come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, where the decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, a new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, a new model for 3D asset generation from images