ChenDY committed · Commit 7cd79b5 · verified · 1 Parent(s): 203579f

update README.md

Files changed (1)
  1. README.md +119 -119
README.md CHANGED
@@ -1,119 +1,119 @@
- ---
- base_model:
- - tianweiy/DMD2
- - ByteDance/Hyper-SD
- - stabilityai/stable-diffusion-xl-base-1.0
- pipeline_tag: text-to-image
- library_name: diffusers
- tags:
- - text-to-image
- - stable-diffusion
- - sdxl
- - adversarial diffusion distillation
- ---
- # NitroDiffusion
- <!-- > [**NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training**](), -->
- > **NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training**
- >
- > Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song
-
- ![](./assets/banner.jpg)
-
- <!-- arXiv Paper: []()
-
- Official GitHub Repository: []()
-
- Project Page: []() -->
-
- ## News
- * 29 Nov 2024: Released two checkpoints: **NitroSD-Realism** and **NitroSD-Vibrant**.
-
-
- <!-- ## Online Demos
- NitroFusion single-step Text-to-Image demo hosted on [🤗 Hugging Face]() -->
-
- ## Model Overview
- - `nitrosd-realism_unet.safetensors`: Produces photorealistic images with fine details.
- - `nitrosd-vibrant_unet.safetensors`: Offers vibrant, saturated color characteristics.
- - Both models support 1 to 4 inference steps.
-
-
- ## Usage
-
- First, we need to implement the scheduler with timestep shift for multi-step inference:
- ```python
- from diffusers import LCMScheduler
- class TimestepShiftLCMScheduler(LCMScheduler):
-     def __init__(self, *args, shifted_timestep=250, **kwargs):
-         super().__init__(*args, **kwargs)
-         self.register_to_config(shifted_timestep=shifted_timestep)
-     def set_timesteps(self, *args, **kwargs):
-         super().set_timesteps(*args, **kwargs)
-         self.origin_timesteps = self.timesteps.clone()
-         self.shifted_timesteps = (self.timesteps * self.config.shifted_timestep / self.config.num_train_timesteps).long()
-         self.timesteps = self.shifted_timesteps
-     def step(self, model_output, timestep, sample, generator=None, return_dict=True):
-         if self.step_index is None:
-             self._init_step_index(timestep)
-         self.timesteps = self.origin_timesteps
-         output = super().step(model_output, timestep, sample, generator, return_dict)
-         self.timesteps = self.shifted_timesteps
-         return output
- ```
-
-
- We can then utilize the diffusers pipeline:
- ```python
- import torch
- from diffusers import DiffusionPipeline, UNet2DConditionModel
- from huggingface_hub import hf_hub_download
- from safetensors.torch import load_file
- # Load model.
- base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
- repo = "ChenDY/NitroFusion"
- # NitroSD-Realism
- ckpt = "nitrosd-realism_unet.safetensors"
- unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
- unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
- scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=250)
- scheduler.config.original_inference_steps = 4
- # # NitroSD-Vibrant
- # ckpt = "nitrosd-vibrant_unet.safetensors"
- # unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
- # unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
- # scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=500)
- # scheduler.config.original_inference_steps = 4
- pipe = DiffusionPipeline.from_pretrained(
-     base_model_id,
-     unet=unet,
-     scheduler=scheduler,
-     torch_dtype=torch.float16,
-     variant="fp16",
- ).to("cuda")
- prompt = "a photo of a cat"
- image = pipe(
-     prompt=prompt,
-     num_inference_steps=1,  # NitroSD-Realism and -Vibrant both support 1 - 4 inference steps.
-     guidance_scale=0,
- ).images[0]
- ```
-
- ## License
-
- NitroSD-Realism is released under [cc-by-nc-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en), following its base model *DMD2*.
-
- NitroSD-Vibrant is released under [openrail++](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md).
-
- <!-- ## Contact
-
- Feel free to contact us if you have any questions about the paper!
-
- Dar-Yen Chen [@surrey.ac.uk](mailto:@surrey.ac.uk)
-
- ## Citation
-
- If you find NitroFusion useful or relevant to your research, please kindly cite our papers:
-
- ```bib
-
- ``` -->
 
+ ---
+ base_model:
+ - tianweiy/DMD2
+ - ByteDance/Hyper-SD
+ - stabilityai/stable-diffusion-xl-base-1.0
+ pipeline_tag: text-to-image
+ library_name: diffusers
+ tags:
+ - text-to-image
+ - stable-diffusion
+ - sdxl
+ - adversarial diffusion distillation
+ ---
+ # NitroFusion
+ <!-- > [**NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training**](), -->
+ > **NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training**
+ >
+ > Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou, Yi-Zhe Song
+
+ ![](./assets/banner.jpg)
+
+ <!-- arXiv Paper: []()
+
+ Official GitHub Repository: []()
+
+ Project Page: []() -->
+
+ ## News
+ * 29 Nov 2024: Released two checkpoints: **NitroSD-Realism** and **NitroSD-Vibrant**.
+
+
+ <!-- ## Online Demos
+ NitroFusion single-step Text-to-Image demo hosted on [🤗 Hugging Face]() -->
+
+ ## Model Overview
+ - `nitrosd-realism_unet.safetensors`: Produces photorealistic images with fine details.
+ - `nitrosd-vibrant_unet.safetensors`: Offers vibrant, saturated color characteristics.
+ - Both models support 1 to 4 inference steps.
+
+
+ ## Usage
+
+ First, we need to implement the scheduler with timestep shift for multi-step inference:
+ ```python
+ from diffusers import LCMScheduler
+ class TimestepShiftLCMScheduler(LCMScheduler):
+     def __init__(self, *args, shifted_timestep=250, **kwargs):
+         super().__init__(*args, **kwargs)
+         self.register_to_config(shifted_timestep=shifted_timestep)
+     def set_timesteps(self, *args, **kwargs):
+         super().set_timesteps(*args, **kwargs)
+         self.origin_timesteps = self.timesteps.clone()
+         self.shifted_timesteps = (self.timesteps * self.config.shifted_timestep / self.config.num_train_timesteps).long()
+         self.timesteps = self.shifted_timesteps
+     def step(self, model_output, timestep, sample, generator=None, return_dict=True):
+         if self.step_index is None:
+             self._init_step_index(timestep)
+         self.timesteps = self.origin_timesteps
+         output = super().step(model_output, timestep, sample, generator, return_dict)
+         self.timesteps = self.shifted_timesteps
+         return output
+ ```
+
+
+ We can then utilize the diffusers pipeline:
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline, UNet2DConditionModel
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+ # Load model.
+ base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ repo = "ChenDY/NitroFusion"
+ # NitroSD-Realism
+ ckpt = "nitrosd-realism_unet.safetensors"
+ unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
+ unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
+ scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=250)
+ scheduler.config.original_inference_steps = 4
+ # # NitroSD-Vibrant
+ # ckpt = "nitrosd-vibrant_unet.safetensors"
+ # unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
+ # unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
+ # scheduler = TimestepShiftLCMScheduler.from_pretrained(base_model_id, subfolder="scheduler", shifted_timestep=500)
+ # scheduler.config.original_inference_steps = 4
+ pipe = DiffusionPipeline.from_pretrained(
+     base_model_id,
+     unet=unet,
+     scheduler=scheduler,
+     torch_dtype=torch.float16,
+     variant="fp16",
+ ).to("cuda")
+ prompt = "a photo of a cat"
+ image = pipe(
+     prompt=prompt,
+     num_inference_steps=1,  # NitroSD-Realism and -Vibrant both support 1 - 4 inference steps.
+     guidance_scale=0,
+ ).images[0]
+ ```
+
+ ## License
+
+ NitroSD-Realism is released under [cc-by-nc-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en), following its base model *DMD2*.
+
+ NitroSD-Vibrant is released under [openrail++](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md).
+
+ <!-- ## Contact
+
+ Feel free to contact us if you have any questions about the paper!
+
+ Dar-Yen Chen [@surrey.ac.uk](mailto:@surrey.ac.uk)
+
+ ## Citation
+
+ If you find NitroFusion useful or relevant to your research, please kindly cite our papers:
+
+ ```bib
+
+ ``` -->
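
The timestep shift in the README's `TimestepShiftLCMScheduler` can be hard to picture from the class alone, so here is a minimal pure-Python sketch of the arithmetic done in `set_timesteps`. It is an illustration and not part of the commit. The helper name `shift_timesteps`, the `SHIFTED_TIMESTEP` lookup, the value `num_train_timesteps = 1000`, and the four-step base schedule `[999, 749, 499, 249]` are all assumptions made for this example. Only the shift values 250 (NitroSD-Realism) and 500 (NitroSD-Vibrant) come from the README.

```python
# Illustrative sketch only: pure-Python version of the timestep-shift arithmetic.
# The base schedule and NUM_TRAIN_TIMESTEPS below are assumptions, not README content.

NUM_TRAIN_TIMESTEPS = 1000  # assumed scheduler config value

# shifted_timestep values the README passes for each checkpoint
SHIFTED_TIMESTEP = {"realism": 250, "vibrant": 500}


def shift_timesteps(timesteps, shifted_timestep, num_train_timesteps=NUM_TRAIN_TIMESTEPS):
    """Rescale timesteps from [0, num_train_timesteps) into [0, shifted_timestep)."""
    # Integer floor division gives the same result as `(t * shift / N).long()`
    # for non-negative timesteps.
    return [t * shifted_timestep // num_train_timesteps for t in timesteps]


base_schedule = [999, 749, 499, 249]  # assumed 4-step schedule

for name, shift in SHIFTED_TIMESTEP.items():
    print(name, shift_timesteps(base_schedule, shift))
```

Under these assumptions, the schedule maps to `[249, 187, 124, 62]` with a shift of 250 and to `[499, 374, 249, 124]` with a shift of 500. In the README's class, the shifted values are the ones exposed as `self.timesteps`, while `step` temporarily restores the original timesteps before calling the parent `step`.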