	---
	license: other
	license_name: stabilityai-ai-community
	license_link: LICENSE.md
	tags:
	- text-to-image
	- stable-diffusion
	- diffusers
	inference: true
	language:
	- en
	pipeline_tag: text-to-image
	---

	# Stable Diffusion 3.5 Large BF16
	![3.5 Large Demo Image](sd3.5_large_demo.png)

	## Model

	![MMDiT](mmdit.png)


	[Stable Diffusion 3.5 Large](https://stability.ai/news/introducing-stable-diffusion-3-5) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.

	Please note: This model is released under the [Stability Community License](https://stability.ai/community-license-agreement). Visit [Stability AI](https://stability.ai/license) to learn more, or [contact us](https://stability.ai/enterprise) for commercial licensing details.


	### Model Description

	- Developed by: Stability AI
	- Model type: MMDiT text-to-image generative model
	- Model Description: This model generates images based on text prompts. It is a [Multimodal Diffusion Transformer](https://arxiv.org/abs/2403.03206) that uses three fixed, pretrained text encoders and QK-normalization to improve training stability.

	### License

	- Community License: Free for research, non-commercial, and commercial use for organizations or individuals with less than $1M in total annual revenue. More details can be found in the [Community License Agreement](https://stability.ai/community-license-agreement). Read more at https://stability.ai/license.
	- For individuals and organizations with annual revenue above $1M: please [contact us](https://stability.ai/enterprise) to get an Enterprise License.

	### Model Sources

	For local or self-hosted use, we recommend [ComfyUI](https://github.com/comfyanonymous/ComfyUI) for node-based UI inference, and either [diffusers](https://github.com/huggingface/diffusers) or the [GitHub repository](https://github.com/Stability-AI/sd3.5) for programmatic use.

	- ComfyUI: [GitHub](https://github.com/comfyanonymous/ComfyUI), [Example Workflow](https://comfyanonymous.github.io/ComfyUI_examples/sd3/)
	- Hugging Face Space: [Space](https://huggingface.co/spaces/stabilityai/stable-diffusion-3.5-large)
	- Diffusers: [See below](#using-with-diffusers).
	- GitHub: [GitHub](https://github.com/Stability-AI/sd3.5).

	- API Endpoints:
	  - [Stability AI API](https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post)
	  - [Replicate](https://replicate.com/stability-ai/stable-diffusion-3.5-large)
	  - [Deepinfra](https://deepinfra.com/stabilityai/sd3.5)


	### Implementation Details

	- QK Normalization: Implements the QK normalization technique to improve training stability.

	- Text Encoders:
	  - CLIPs: [OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip), [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main), context length 77 tokens
	  - T5: [T5-xxl](https://huggingface.co/google/t5-v1_1-xxl), context length 77/256 tokens at different stages of training

	- Training Data and Strategy: This model was trained on a wide variety of data, including synthetic data and filtered publicly available data.

	For more technical details of the original MMDiT architecture, please refer to the [Research paper](https://stability.ai/news/stable-diffusion-3-research-paper).
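The encoder context lengths listed above affect how prompts are passed in through diffusers. The following is a minimal sketch, not an official recipe: it assumes the `StableDiffusion3Pipeline` call accepts `prompt` (CLIP encoders), `prompt_3` (T5 encoder), and `max_sequence_length` (T5 token limit), which is the case in recent diffusers releases but should be checked against your installed version. The helper name `build_prompt_kwargs` and the choice to cap the T5 limit at 256 tokens are illustrative assumptions, not part of the model or the library.

```python
# Context lengths quoted from the model card above.
CLIP_CONTEXT_TOKENS = 77
T5_TRAINING_CONTEXT_TOKENS = (77, 256)


def build_prompt_kwargs(short_prompt, detailed_prompt=None, t5_max_tokens=256):
    """Hypothetical helper: route a short prompt to CLIP and a longer one to T5.

    Assumption: staying within the T5 lengths used during training (at most
    256 tokens) is a reasonable default, so larger values are rejected here.
    """
    if not 1 <= t5_max_tokens <= max(T5_TRAINING_CONTEXT_TOKENS):
        raise ValueError(
            f"t5_max_tokens must be between 1 and {max(T5_TRAINING_CONTEXT_TOKENS)}"
        )
    return {
        # CLIP truncates at 77 tokens, so keep this prompt concise.
        "prompt": short_prompt,
        # T5 receives the detailed prompt, or the short one if none is given.
        "prompt_3": detailed_prompt or short_prompt,
        # Token limit applied when the T5 prompt is tokenized.
        "max_sequence_length": t5_max_tokens,
    }
```

With a loaded pipeline, the result can be unpacked into the call, for example `pipe(**build_prompt_kwargs("A capybara holding a sign", "A detailed scene description ..."), num_inference_steps=28, guidance_scale=3.5)`.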


	### Model Performance

	See the [blog post](https://stability.ai/news/introducing-stable-diffusion-3-5) for our study of comparative performance in prompt adherence and aesthetic quality.

	## Using with Diffusers
	Upgrade to the latest version of the [🧨 diffusers library](https://github.com/huggingface/diffusers):
	```
	pip install -U diffusers
	```

	Then run:
	```py
	import torch
	from diffusers import StableDiffusion3Pipeline

	# Load the pipeline in bfloat16 and move it to the GPU.
	pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)
	pipe = pipe.to("cuda")

	# Generate one image from a text prompt and save it to disk.
	image = pipe(
	    "A capybara holding a sign that reads Hello World",
	    num_inference_steps=28,
	    guidance_scale=3.5,
	).images[0]
	image.save("capybara.png")
	```
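If the full pipeline does not fit in GPU memory, or you want repeatable outputs, a variation along the following lines may help. This is a sketch under stated assumptions rather than a recommended configuration: it relies on `enable_model_cpu_offload()` (available on diffusers pipelines when `accelerate` is installed) and on a seeded `torch.Generator`. The names `DEFAULT_SETTINGS`, `sampling_settings`, and `generate` are hypothetical helpers introduced only for illustration.

```python
# Sampling settings copied from the example above.
DEFAULT_SETTINGS = {"num_inference_steps": 28, "guidance_scale": 3.5}


def sampling_settings(**overrides):
    """Hypothetical helper: return the defaults with caller overrides applied."""
    return {**DEFAULT_SETTINGS, **overrides}


def generate(prompt, seed=0, offload=True, **overrides):
    """Hypothetical helper: generate one image with a fixed seed."""
    # Imported lazily so the helpers above can be used without a GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        "stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16
    )
    if offload:
        # Keeps submodules on the CPU until needed; requires `accelerate`.
        pipe.enable_model_cpu_offload()
    else:
        pipe = pipe.to("cuda")

    # A seeded generator makes repeated runs start from the same noise.
    generator = torch.Generator("cpu").manual_seed(seed)
    return pipe(prompt, generator=generator, **sampling_settings(**overrides)).images[0]


if __name__ == "__main__":
    image = generate("A capybara holding a sign that reads Hello World", seed=42)
    image.save("capybara_seed42.png")
```

Offloading trades generation speed for lower peak GPU memory, so pass `offload=False` when the model fits on the device.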

	### Contact

	Please report any issues with the model or contact us:

	* Safety issues: [email protected]
	* Security issues: [email protected]
	* Privacy issues: [email protected]
	* License and general: https://stability.ai/license
	* Enterprise license: https://stability.ai/enterprise