ZP

community
Activity Feed

AI & ML interests

None defined yet.

Recent Activity

sayakpaulΒ  authored a paper 2 months ago
A Noise is Worth Diffusion Guidance
zRzRzRzRzRzRzRΒ  updated a model 2 months ago
THUDM/glm-edge-v-2b
zRzRzRzRzRzRzRΒ  updated a model 2 months ago
THUDM/glm-edge-v-5b
View all activity

ZP2HF's activity

sayakpaulΒ 
posted an update 7 days ago
view post
Post
1751
We have been cooking a couple of fine-tuning runs on CogVideoX with finetrainers, smol datasets, and LoRA to generate cool video effects like crushing, dissolving, etc.

We are also releasing a LoRA extraction utility from a fully fine-tuned checkpoint. I know that kind of stuff has existed since eternity, but the quality on video models was nothing short of spectacular. Below are some links:

* Models and datasets: https://huggingface.co/finetrainers
* finetrainers: https://github.com/a-r-r-o-w/finetrainers
* LoRA extraction: https://github.com/huggingface/diffusers/blob/main/scripts/extract_lora_from_model.py
  • 1 reply
Β·
sayakpaulΒ 
posted an update 10 days ago
view post
Post
1898
We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the models supported, the knobs of optims our users can fire, fine-tuning, and more πŸ”₯

5-6GBs for HunyuanVideo, sky is the limit 🌌 πŸ€—
https://huggingface.co/blog/video_gen
sayakpaulΒ 
posted an update about 1 month ago
view post
Post
4327
Commits speak louder than words πŸ€ͺ

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release πŸ€—
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0
sayakpaulΒ 
posted an update about 2 months ago
view post
Post
2151
In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.
  • 1 reply
Β·
sayakpaulΒ 
posted an update about 2 months ago
view post
Post
2116
Introducing a high-quality open-preference dataset to further this line of research for image generation.

Despite being such an inseparable component for modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences
  • 7 replies
Β·
sayakpaulΒ 
posted an update about 2 months ago
view post
Post
2151
The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script πŸ€—

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130
sayakpaulΒ 
posted an update 2 months ago
sayakpaulΒ 
posted an update 3 months ago
view post
Post
2669
It's been a while we shipped native quantization support in diffusers 🧨

We currently support bistandbytes as the official backend but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes
  • 1 reply
Β·
sayakpaulΒ 
posted an update 4 months ago
view post
Post
2772
Did some little experimentation to resize pre-trained LoRAs on Flux. I explored two themes:

* Decrease the rank of a LoRA
* Increase the rank of a LoRA

The first one is helpful in reducing memory requirements if the LoRA is of a high rank, while the second one is merely an experiment. Another implication of this study is in the unification of LoRA ranks when you would like to torch.compile() them.

Check it out here:
sayakpaul/flux-lora-resizing
  • 1 reply
Β·
dn6Β 
posted an update 5 months ago
view post
Post
2763
Sharing for anyone using Diffusers from_single_file loading and affected by the Runway SD 1.5 issue.

If you have runwayml/stable-diffusion-v1-5 saved locally in your HF cache then loading single file checkpoints in the following way should still work.

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file("<url or path to single file checkpoint>")


If you do not have the model repo saved in your cache, then automatically inferring the pipeline config will not work since the reference repo runwayml/stable-diffusion-v1-5 doesn't exist anymore.

You can use an alternative SD1.5 repo id to still configure your pipeline.

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file("<url or path to single file checkpoint>", config="Lykon/DreamShaper")


We're working on resolving the issue ASAP.
  • 2 replies
Β·
xianbaoΒ 
posted an update 5 months ago
view post
Post
1931
With the open-weight release of CogVideoX-5B from THUDM, i.e. GLM team, the Video Generation Model (how about calling it VGM) field has officially became the next booming "LLM"

What does the landscape look like? What are other video generation models? This collection below is all your need.

xianbao/video-generation-models-66c350163c74f60f5c412af6

The above video is generated by @a-r-r-o-w with CogVideoX-5B, taken from a nice lookout for the field!
sayakpaulΒ 
posted an update 6 months ago
sayakpaulΒ 
posted an update 6 months ago
view post
Post
4505
Flux.1-Dev like images but in fewer steps.

Merging code (very simple), inference code, merged params: sayakpaul/FLUX.1-merged

Enjoy the Monday πŸ€—
Β·