Vedat Baday

badayvedat

AI & ML interests

None yet

Organizations

fal · Social Post Explorers

badayvedat's activity

reacted to isidentical's post with πŸ”₯ 7 months ago
It is time for some Aura.

The first in our series of fully open-source, commercially available models from @fal-ai: AuraSR, a 600M-parameter upscaler based on GigaGAN.

Blog: https://blog.fal.ai/introducing-aurasr-an-open-reproduction-of-the-gigagan-upscaler-2/

HF: https://huggingface.co/fal-ai/AuraSR

Code: https://github.com/fal-ai/aura-sr

Playground: https://fal.ai/models/fal-ai/aura-sr/playground

What other models would you like to see open-sourced and commercially available? :)
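Usage is only a few lines once the weights are downloaded. A minimal sketch following the aura-sr README (`AuraSR.from_pretrained` and `upscale_4x` are the API documented there; the input URL below is a placeholder):

```python
import requests
from io import BytesIO
from PIL import Image
from aura_sr import AuraSR  # pip install aura-sr

# Pull the ~600M-parameter GigaGAN-based upscaler from the Hub.
aura_sr = AuraSR.from_pretrained("fal-ai/AuraSR")

def load_image_from_url(url: str) -> Image.Image:
    response = requests.get(url)
    return Image.open(BytesIO(response.content))

# Hypothetical input URL; substitute any low-resolution image.
image = load_image_from_url("https://example.com/low_res.png")
upscaled = aura_sr.upscale_4x(image)  # returns a 4x-upscaled PIL image
upscaled.save("upscaled.png")
```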
reacted to yushun0410's post with πŸ”₯ 7 months ago
Hi Huggingfacers!

Thrilled to introduce Adam-mini, an optimizer that achieves on-par or better performance than AdamW with a 45–50% smaller memory footprint. Adam-mini also achieves 49.5% higher throughput than AdamW on Llama2-7B pre-training.

The design of Adam-mini is inspired by certain Hessian structures we observed on Transformers.

Feel free to try it out! Switch to Adam-mini with the same hyperparameters as AdamW, and it works with only about half the memory (see the sketch below the links). Hope Adam-mini can help save time, cost, and energy in your tasks!

Paper: "Adam-mini: Use Fewer Learning Rates To Gain More" https://arxiv.org/abs/2406.16793

Code: https://github.com/zyushun/Adam-mini
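A minimal sketch of the drop-in swap, assuming the `Adam_mini` constructor from the repo README (argument names such as `named_parameters`, `dim`, and `n_heads` may differ across versions):

```python
import torch
import torch.nn as nn
from adam_mini import Adam_mini  # pip install adam-mini

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

# Same hyperparameters you would hand to torch.optim.AdamW; the extra
# dim/n_heads arguments let Adam-mini partition parameters along the
# Hessian block structure it exploits (argument names follow the
# README and may vary by version).
optimizer = Adam_mini(
    named_parameters=model.named_parameters(),
    lr=1e-4,
    betas=(0.9, 0.95),
    weight_decay=0.1,
    dim=512,
    n_heads=8,
)

x = torch.randn(16, 10, 512)
loss = model(x).pow(2).mean()  # dummy objective for the sketch
loss.backward()
optimizer.step()
optimizer.zero_grad()
```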

reacted to Xenova's post with πŸ”₯ 7 months ago
Florence-2, the new vision foundation model by Microsoft, can now run 100% locally in your browser on WebGPU, thanks to Transformers.js! πŸ€—πŸ€―

It supports tasks like image captioning, optical character recognition, object detection, and many more! 😍 WOW!
- Demo: Xenova/florence2-webgpu
- Models: https://huggingface.co/models?library=transformers.js&other=florence2
- Source code: https://github.com/xenova/transformers.js/tree/v3/examples/florence2-webgpu
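The demo runs fully in-browser; for a server-side reference, here is a minimal Python sketch following the usage pattern on the Microsoft model card (requires `trust_remote_code`; task tokens such as `<OD>` select the task, and `<OCR>` or `<CAPTION>` work the same way):

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

prompt = "<OD>"  # object detection
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    do_sample=False,
    num_beams=3,
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
result = processor.post_process_generation(
    generated_text, task=prompt, image_size=(image.width, image.height)
)
print(result)
```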
reacted to isidentical's post with ❀️ 10 months ago
reacted to wanghaofan's post with πŸ”₯ 10 months ago
reacted to Jaward's post with ❀️ 10 months ago
After giving GPU Programming a hands-on try, I have come to appreciate the level of complexity in AI compute:

- Existing/leading frameworks (CUDA, OpenCL, DSLs, even Triton) still leave you at the mercy of low-level compute details that demand deep understanding and experience.
- Ambiguous optimization methods that will literally drive you mad 🤯
- Triton is cool but not cool enough (high-level abstractions that fall back to low-level compute issues as you build more specialized kernels; see the kernel sketch below)
- As for CUDA, optimization requires considering all the major components of the GPU (DRAM, SRAM, ALUs) 🤕
- Models today require expertly hand-written GPU kernels to reduce storage and compute costs.
- GPTQ was a big save πŸ‘πŸΌ

@karpathy is right: expertise in this area is scarce, and the reason is quite obvious: uncertainty. We are still struggling to get peak performance from multi-connected GPUs while maintaining precision and reducing cost.

May the Scaling Laws favor us lol.
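To make the Triton point concrete, here is essentially the canonical vector-add kernel from the Triton tutorials. Even this "hello world" already surfaces the program-id/offset/mask bookkeeping you must get right before any real optimization starts (a sketch; needs a CUDA GPU):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.randn(10_000, device="cuda")
y = torch.randn(10_000, device="cuda")
assert torch.allclose(add(x, y), x + y)
```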
Β·
New activity in ByteDance/AnimateDiff-Lightning 11 months ago