---
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
datasets:
- yashvoladoddi37/kanjienglish
language:
- en
- ja
library_name: diffusers
pipeline_tag: text-to-image
---
# Kanji Diffusion v1-4 Model Card
Kanji Diffusion is a latent text-to-image diffusion model capable of hallucinating Kanji characters given any English prompt.
## Fine-tuned Model Details
- **Developed by:** Yashpreet Voladoddi
- **Model type:** Diffusion-based text-to-image generation model, fine-tuned from Stable Diffusion v1.4 with LoRA.
### Colab
To run the pipeline and see how the model generates Kanji characters, run the code below in Colab. Use a T4 GPU runtime; otherwise each image takes a long time to generate.
You will need a Hugging Face access token to log in.
```python
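# Colab notebook cell: lines starting with "!" are shell commands.
import os
from google.colab import drive

# Mount Google Drive and work from it so generated images are saved there.
drive.mount('/content/drive')
os.chdir("/content/drive/MyDrive")

!pip install diffusers
# The repo clone is only needed for the training script (examples/text_to_image).
!git clone https://github.com/huggingface/diffusers
!huggingface-cli login  # paste your Hugging Face access token when prompted

import torch
from diffusers import StableDiffusionPipeline

torch.cuda.empty_cache()
model_path = "yashvoladoddi37/kanji-diffusion-v1-4"

# Load the base Stable Diffusion v1.4 pipeline in half precision on the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")

# Load the fine-tuned attention weights from this repository into the UNet.
pipe.unet.load_attn_procs(model_path)

prompt = "A Kanji meaning baby robot"
image = pipe(prompt).images[0]
image.save("baby-robot-kanji-v1-4.png")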
```
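To try several concepts in one session, it can help to build the prompt and the output filename from a single string. The sketch below follows the prompt pattern and file naming used in the example above; the helper names `kanji_prompt` and `output_filename` are hypothetical and not part of the model or of `diffusers`.

```python
import re

# Prompt pattern taken from the example above ("A Kanji meaning baby robot").
PROMPT_TEMPLATE = "A Kanji meaning {concept}"


def kanji_prompt(concept: str) -> str:
    """Build the English prompt for a concept (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(concept=concept.strip())


def output_filename(concept: str, suffix: str = "kanji-v1-4") -> str:
    """Turn a concept into a file name such as 'baby-robot-kanji-v1-4.png'."""
    # Lowercase, then collapse every run of non-alphanumeric characters to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", concept.lower()).strip("-")
    return f"{slug}-{suffix}.png"


concepts = ["baby robot", "Elon Musk"]
jobs = [(kanji_prompt(c), output_filename(c)) for c in concepts]

for prompt, filename in jobs:
    print(prompt, "->", filename)
    # With `pipe` loaded as in the cell above, each image could be saved with:
    # pipe(prompt).images[0].save(filename)
```

The generation call is left commented out so the snippet runs without a GPU; uncomment it after the pipeline cell has been executed.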
### Limitations
## Training
**Training Data:** the `yashvoladoddi37/kanjienglish` dataset (`image` and `text` columns)
**Hardware:** NVIDIA GTX 1650 (4 GB VRAM, 8 GB system RAM) and a T4 GPU on Colab
**Training Script:**
```python
!accelerate launch train_text_to_image_lora.py \
--pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
--dataset_name="yashvoladoddi37/kanjienglish" \
--image_column="image" \
--caption_column="text" \
--resolution=512 \
--random_flip \
--train_batch_size=1 \
--num_train_epochs=1 \
--checkpointing_steps=500 \
--learning_rate=1e-04 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--seed=42 \
--output_dir="kanji-diffusion-v1-4" \
--validation_prompt="A kanji meaning Elon Musk" \
--push_to_hub
```
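The flags above determine how many optimizer steps the run takes and when checkpoints are written. The sketch below works through that arithmetic under stated assumptions: a single process, a gradient accumulation of 1, and a checkpoint saved every `checkpointing_steps` optimizer updates. The dataset size of 6000 is a hypothetical placeholder, not the real size of the dataset, and `training_schedule` is an illustrative helper, not a function from the training script.

```python
import math


def training_schedule(num_examples, train_batch_size=1, num_train_epochs=1,
                      gradient_accumulation_steps=1, checkpointing_steps=500):
    """Estimate total optimizer steps and the steps at which checkpoints land."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    total_steps = updates_per_epoch * num_train_epochs
    # Assumption: one checkpoint every `checkpointing_steps` optimizer updates.
    checkpoints = list(range(checkpointing_steps, total_steps + 1, checkpointing_steps))
    return total_steps, checkpoints


# Hypothetical dataset size, used only for illustration.
total, ckpts = training_schedule(num_examples=6000)
print("optimizer steps:", total)
print("number of checkpoints:", len(ckpts))
```

With a batch size of 1 and a single epoch, the step count equals the number of training examples, so raising `--train_batch_size` or `--num_train_epochs` changes the count proportionally.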